4

I have an M4A file that has also been converted to FLAC. I'd like to check whether the conversion was lossless, that is, whether decoding the M4A to PCM produces exactly the same output as decoding the FLAC.

I assume there's a way to use FFmpeg or Libav to produce some "raw" output and compare them?

slhck
Determinant

3 Answers

15

You can use the hash muxer to generate a checksum of the decoded media. No need to convert files, and it is unaffected by metadata or other factors that can cause a standalone sum tool to report false differences.

Example to compare WAV → FLAC. Because FLAC is lossless the hashes should be the same:

$ ffmpeg -loglevel error -i input.wav output.flac

$ ffmpeg -loglevel error -i input.wav -map 0 -f hash -
  SHA256=c1acb198952f5c341190ffb62eeafe4f10c8f48c67a188e25087471a74eaa957

$ ffmpeg -loglevel error -i output.flac -map 0 -f hash -
  SHA256=c1acb198952f5c341190ffb62eeafe4f10c8f48c67a188e25087471a74eaa957
  • There are many available hash algorithms to choose from. Some are faster than others. You can select an algorithm with the -hash option, such as -hash md5.

  • -map 0 is used in the examples to include all streams into the checksum. Without it the default stream selection behavior will only choose one stream per stream type. If you want to exclude/include specific streams then do so with the -map option with stream specifiers. For example, to exclude all video use negative mapping with -map -0:v, or to only include audio use -map 0:a, or to only include the third audio stream use -map 0:a:2.

  • The streamhash muxer is similar to hash, but it will output a hash per stream, such as one for video and one for audio. Again, it also will use the default stream selection behavior unless you add -map.

  • If you want to compare each individual frame/packet then use the framehash muxer.
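
The options above can be combined into a small helper. This is only a sketch: the function names `audio_hash` and `same_audio` and the file names are made up for illustration, and it assumes a POSIX shell with `ffmpeg` on the PATH. Note that, according to the FFmpeg documentation, the hash muxer converts audio to signed 16-bit PCM by default, so for sources with a higher bit depth you would probably want to add something like `-c:a pcm_s24le` before `-f hash`.

```shell
#!/bin/sh
# audio_hash FILE: print the MD5 of the decoded audio streams in FILE.
# -map 0:a restricts the hash to audio streams; -hash md5 selects the algorithm.
audio_hash() {
    ffmpeg -loglevel error -i "$1" -map 0:a -f hash -hash md5 -
}

# same_audio A B: succeed only if both files decode to the same audio hash.
# Returns 2 if ffmpeg fails on either file, 1 if the hashes differ.
same_audio() {
    hash_a=$(audio_hash "$1") || return 2
    hash_b=$(audio_hash "$2") || return 2
    [ -n "$hash_a" ] && [ "$hash_a" = "$hash_b" ]
}
```

Example use: `same_audio input.m4a output.flac && echo "decoded audio is identical"`.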

llogan
  • 2
    +1 This is good also because it completely avoids the issue of metadata in the uncompressed file, which otherwise could make identical-audio files differ. – user Jan 10 '13 at 21:28
4

I'd try converting them both to WAV and comparing their checksums.

ffmpeg -i file1.m4a file1.wav
ffmpeg -i file2.flac file2.wav
md5sum file1.wav
md5sum file2.wav
rm file?.wav

Compare the MD5 sums produced. If they match, congratulations! Your files contain the same data. If they don't match, post the output of those commands here, and I'll take a look. Potentially there is a bitrate difference or something similar (there ought not to be, but I can't be sure).

Note that the ffmpeg commands will generate comparatively large intermediate files.
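
To avoid the intermediate files (and the WAV header) altogether, the decoded audio can be streamed as headerless PCM straight into `md5sum`. A minimal sketch, assuming 16-bit audio and an `md5sum` binary such as the one in GNU coreutils; `pcm_md5` is a hypothetical helper name:

```shell
#!/bin/sh
# pcm_md5 FILE: decode the first audio stream of FILE to raw (headerless)
# signed 16-bit little-endian PCM on stdout and print its MD5 sum.
# Assumption: the source is 16-bit; for deeper audio use
# -f s24le -c:a pcm_s24le or -f s32le -c:a pcm_s32le instead.
pcm_md5() {
    ffmpeg -loglevel error -i "$1" -map 0:a:0 -f s16le -c:a pcm_s16le - \
        | md5sum | cut -d ' ' -f 1
}
```

Example use: `[ "$(pcm_md5 file1.m4a)" = "$(pcm_md5 file2.flac)" ] && echo "same audio"`. If ffmpeg fails on both files, both sums are the MD5 of empty input and will match, so check that no error messages were printed.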

thirtythreeforty
  • It seems that the output size of `ffmpeg -y -i in.m4a -ac 2 -ar 48000 -acodec flac out.flac` differs from that of `ffmpeg -y -i in.m4a -acodec flac out.flac`. I don't understand what happens during the conversion or what these subtle parameters do. Could you explain a little bit? – Determinant Jan 10 '13 at 01:39
  • With the latter command, md5sum is the same. – Determinant Jan 10 '13 at 01:42
  • And the former command is copied from a forum, I guess the file size has something to do with the number "48000", right? – Determinant Jan 10 '13 at 01:45
  • 1
    Yup. See, the `-ar 48000` says to use 48000 samples per second. If that is different from the source's number of samples per second, ffmpeg interpolates (sticks additional values in between), and that makes the resulting file different. If you just let ffmpeg autodetect everything, it tries to change as little as it can. – thirtythreeforty Jan 10 '13 at 01:58
  • One more question, is wav file the raw file standard? I mean are there any other alternative raw formats besides wav in PC area? – Determinant Jan 10 '13 at 02:00
  • I suppose there are two well-known uncompressed standards: WAV and AIFF. AIFF is used a lot by Apple's systems; the rest of everybody uses WAV. – thirtythreeforty Jan 10 '13 at 02:05
  • Thx! Fast and clear answer. – Determinant Jan 10 '13 at 02:06
  • 7
    @ymfoi WAV is **not** a raw file standard per se. WAV files are just containers and therefore can contain different audio codecs. In this case it will be PCM audio (pulse-code modulated), which is lossless. But there can also be compressed codecs inside a WAV file: http://en.wikipedia.org/wiki/Wav#WAV_file_compression_codecs_compared – slhck Jan 10 '13 at 07:51
  • @slhck so, are there any methods to extract raw PCM data from the decoder? – Determinant Jan 10 '13 at 09:18
  • 1
    @ymfoi FFmpeg will choose 16-bit PCM by default, so you already get uncompressed, "unaltered" audio (unless your source used more bit depth like 32 bit; in that case you could specify `-c:a pcm_s32le`, for example). – slhck Jan 10 '13 at 09:22
  • @slhck I see. Can I say that the WAV file I've got from FFmpeg consists of several chunks of raw PCM data with some additional information, and that there are other similar formats like WAV, say AIFF, etc.? – Determinant Jan 10 '13 at 09:28
  • @ymfoi That is correct. WAV, like AIFF, are just containers that indeed store so-called "chunks" of audio data. – slhck Jan 10 '13 at 09:30
  • @slhck Thx~ Now I have a clearer understanding of WAV, which has been a mystery to me for years. – Determinant Jan 10 '13 at 09:35
  • @slhck thanks. I had forgotten to make that distinction. In 99% of the cases you'll see, however, PCM is the only stream format you'll see in a WAV container. – thirtythreeforty Jan 10 '13 at 15:19
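
As the comments above point out, a different sample rate or bit depth makes the decoded audio differ, so it can help to inspect both files before comparing checksums. A rough sketch using `ffprobe`, which ships with FFmpeg; the helper name `audio_params` and the file names are hypothetical:

```shell
#!/bin/sh
# audio_params FILE: print codec, sample format, sample rate, channel count
# and raw bit depth of the first audio stream, one key=value pair per line.
audio_params() {
    ffprobe -v error -select_streams a:0 \
        -show_entries stream=codec_name,sample_fmt,sample_rate,channels,bits_per_raw_sample \
        -of default=noprint_wrappers=1 "$1"
}
```

Example use: run `audio_params in.m4a` and `audio_params out.flac` and compare the `sample_rate` and `channels` lines. If they differ, the audio was resampled or remixed during conversion and the checksums cannot match.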
0

You could try visualizing the audio's spectrum and then compare the two side by side.

Here's a way to do that with the showcqt filter.

ffmpeg -i "video_file_with_audio.mkv" -i "alternative_audio.m4a" \
        -filter_complex "[0:a]asplit[a1][ao];
              [a1]volume=1.2,showcqt=fps=23.976:s=480x1080:count=3:axis=0:axis_h=30:bar_h=100:basefreq=200:endfreq=12495[vfreq1];
              [1:a]showcqt=fps=23.976:s=480x1080:count=3:axis=0:axis_h=30:bar_h=100:basefreq=200:endfreq=12495[vfreq2];
              [0:v]fps=23.976,scale=-1:1080[v1];[v1][vfreq1][vfreq2]hstack=3[vo]" \
        -map "[vo]" -map "[ao]" output.mp4

This assumes you have a video with audio and then an alternative audio for that video.

See this screenshot of showcqt side by side

Chris