John Fremlin's blog: The truth about MP3 files

Posted 2009-03-04 23:00:00 GMT

It is rather complicated to merge multiple MP3 files to play sequentially, due to the profusion of undocumented frame formats in widespread use. Here I present a Ruby library that can be used to clean up the input files and add the index afterwards, so that concatenating is as simple as using the + operator [provided the files have the same sample rate]. Fixing the VBR index is easy too.

Despite being the most popular way of storing electronic music, the MP3 file format wasn't actually thought through and specified by any committee. It grew out of the rather sketchy practice of simply shoving raw MPEG-1 or MPEG-2 Audio Layer 3 frames into a file. Each frame is logically independent and specifies whether it is stereo or mono, the bitrate, the sample rate (frequency), and the actual waveform for a fraction of a second of music. However, many players will get confused if the sample rate changes between frames (e.g. from 48 kHz to 44.1 kHz).

In between the motley collection of independent frames, it's quite possible to interject any sort of information at all. At worst (if the interjected data happens to contain something that looks like a frame start header) an MP3 player might emit an annoying squeak before skipping to the next frame start header. [Unfortunately, the audio data inside a frame may itself contain byte sequences that look like frame start headers, so to parse the MP3 file you have to actually figure out each frame's length (annoyingly complicated, as it depends on the Layer, the MPEG version, the sample rate, the bitrate and the padding field) and jump from one frame to the next.]
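To give an idea of what figuring out the frame length involves, here is a minimal sketch based on the commonly documented MPEG audio header layout. It is not taken from the library; the names `BITRATES`, `SAMPLE_RATES` and `frame_length` are made up for illustration, and free-format frames are simply treated as unparseable.

```ruby
# Bitrates in kbit/s, keyed by [table version, layer] and indexed by the
# 4-bit bitrate field. Index 0 (free format) and 15 (invalid) map to nil.
BITRATES = {
  [1, 1] => [nil, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448, nil],
  [1, 2] => [nil, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, nil],
  [1, 3] => [nil, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, nil],
  [2, 1] => [nil, 32, 48, 56, 64, 80, 96, 112, 128, 144, 160, 176, 192, 224, 256, nil],
  [2, 2] => [nil, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, nil],
  [2, 3] => [nil, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, nil]
}.freeze

# Sample rates in Hz, keyed by the 2-bit version field
# (3 = MPEG-1, 2 = MPEG-2, 0 = MPEG-2.5; 1 is reserved).
SAMPLE_RATES = {
  3 => [44_100, 48_000, 32_000],
  2 => [22_050, 24_000, 16_000],
  0 => [11_025, 12_000, 8_000]
}.freeze

# Takes the 4 header bytes as a big-endian integer and returns the frame
# length in bytes (header included), or nil if this is not a usable header.
def frame_length(header)
  return nil unless ((header >> 21) & 0x7FF) == 0x7FF # 11-bit sync word

  version_bits  = (header >> 19) & 0x3
  layer_bits    = (header >> 17) & 0x3
  bitrate_index = (header >> 12) & 0xF
  rate_index    = (header >> 10) & 0x3
  padding       = (header >> 9) & 0x1
  return nil if version_bits == 1 || layer_bits == 0 || rate_index == 3

  layer = 4 - layer_bits # field value 3 means Layer 1, 1 means Layer 3
  table_version = version_bits == 3 ? 1 : 2
  kbps = BITRATES[[table_version, layer]][bitrate_index]
  return nil if kbps.nil?

  bitrate = kbps * 1000
  sample_rate = SAMPLE_RATES[version_bits][rate_index]
  case layer
  when 1 then (12 * bitrate / sample_rate + padding) * 4
  when 2 then 144 * bitrate / sample_rate + padding
  else
    # Layer 3 frames hold half as many samples in MPEG-2 and MPEG-2.5.
    (version_bits == 3 ? 144 : 72) * bitrate / sample_rate + padding
  end
end
```

For instance, `frame_length("\xFF\xFB\x90\x00".b.unpack1("N"))` describes an MPEG-1 Layer 3 frame at 128 kbit/s and 44.1 kHz without padding. A parser would read a header, skip that many bytes, and expect the next header right there.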

The promiscuous generosity of most MP3 players allows people to squeeze in any number of strange and poorly specified fragments of information: for example indices for seeking, tags and so on.

The MP3 format does not have any way to find out how long a file is, in terms of play time, except to run through and add up the length of each frame, which requires reading in the whole file. This is very slow and clearly can be improved. [If the file has a fixed bitrate then it is of course easy to interpolate from the file size.] In fact, two competing index formats sprang up: VBRI (from the Fraunhofer institute, famously the people who try to extort money from implementors of MP3 players), and the Xing header, the better known of the two, from Real Networks. Both are widely supported by MP3 players, which use them to quickly find out the length of an MP3 file and to seek to a certain point.
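The arithmetic behind both approaches is short. The sketch below is illustrative only (the function names are made up, not part of the library) and assumes you already know the size of the audio data, or the frame count together with the number of samples per frame (1152 for MPEG-1 Layer 3, 576 for MPEG-2 Layer 3).

```ruby
# Fixed bitrate: play time follows directly from the size of the audio data.
def cbr_duration_seconds(audio_bytes, bitrate_kbps)
  audio_bytes * 8.0 / (bitrate_kbps * 1000)
end

# Variable bitrate: every frame carries the same number of samples, so the
# play time is the frame count times the duration of one frame. Getting the
# frame count is the slow part, as it means walking the whole file.
def duration_from_frames(frame_count, samples_per_frame, sample_rate)
  frame_count * samples_per_frame.to_f / sample_rate
end
```

The index formats exist precisely so that a player can skip the frame walk: they store the totals up front.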

The Xing header contains the play time and an index to 100 equally spaced (by play time) byte positions, so you can quickly seek to approximately 42% of the way through the file. This is unfortunately only approximate. However, it is simple, small and widely supported.
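Here is a rough sketch of how a player might turn that index into a byte offset. It assumes the commonly described layout, in which each of the 100 entries is a single byte giving the position as a fraction of the file size in units of 1/256; the function name `xing_seek_offset` is made up and this is not code from the library.

```ruby
# toc:       array of 100 integers in 0..255 read from the Xing header
# percent:   desired position in play time, 0..100
# file_size: size in bytes of the data the index refers to
def xing_seek_offset(toc, percent, file_size)
  raise ArgumentError, "TOC needs 100 entries" unless toc.length == 100

  percent = [[percent.to_f, 0.0].max, 100.0].min
  index = [percent.floor, 99].min
  lower = toc[index]
  upper = index < 99 ? toc[index + 1] : 256 # the end of the file is 256/256
  # Interpolate linearly between the two neighbouring entries.
  fraction = lower + (upper - lower) * (percent - index)
  (fraction / 256.0 * file_size).floor
end
```

With only one byte per entry, the offset can never be more precise than 1/256 of the file, which is one reason the seek is only approximate. The player still has to resynchronise on the next frame start header after jumping.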

The problem with the profusion of tag formats and index formats comes when you are trying to modify or concatenate MP3 files. Logically it should be possible to simply concatenate the contents (provided the sample rate is the same) and have them play fine, and it is, provided they don't contain either Xing or VBRI indices. If they do, then the player might find one of them and get a very wrong idea of how long the file is.

Additionally, ID3 and other tags for storing the song, artist and other meta-information about the file come in a range of strange formats. They need to be updated or removed.
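As an illustration of the removal step, here is a minimal sketch that strips just the two most common cases: an ID3v2 tag at the start of the file (its size is stored as four 7-bit "synchsafe" bytes) and a 128-byte ID3v1 tag at the end. The name `strip_id3` is made up; this is not the library's code and it does not handle the other tag formats or the Xing and VBRI frames.

```ruby
# Returns a binary copy of data with a leading ID3v2 tag and a trailing
# ID3v1 tag removed, if present.
def strip_id3(data)
  data = data.b # work on bytes, not characters

  if data.bytesize >= 10 && data.byteslice(0, 3) == "ID3".b
    flags = data.getbyte(5)
    # Tag size excludes the 10-byte header; each size byte holds 7 bits.
    size = data.byteslice(6, 4).bytes.inject(0) { |acc, b| (acc << 7) | (b & 0x7F) }
    total = 10 + size
    total += 10 if (flags & 0x10) != 0 # optional 10-byte footer
    data = data.byteslice(total, data.bytesize - total) || "".b
  end

  if data.bytesize >= 128 && data.byteslice(-128, 3) == "TAG".b
    data = data.byteslice(0, data.bytesize - 128)
  end

  data
end
```

Something like `strip_id3(File.binread("song.mp3"))` (the file name is hypothetical) would then leave the frames plus whatever other tags the file contains.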

Unfortunately, there is no free program that correctly removes all the tags. The best that I know of is vbrfix, but it does not remove or fix VBRI tags. So when we had problems with merging MP3 files we developed our own system.

Here is the small Ruby library. Use

data =
to get just the frames out of an MP3 file [this function will remove the Xing, ID3, VBRI and other miscellaneous tags at the start or end of the file]. Concatenate as many of these together as you like then use
to add an index. There are plenty of ID3 libraries already for writing out tags.

(Thanks to Cerego for supporting the development and allowing it to be released publicly.)

Hey John,

Thanks for putting this Ruby library up. This solved my problem, where I was trying to concatenate mp3 files without having to recode/transcode them with ffmpeg/lame. I searched everywhere, tried everything, nothing worked until I stumbled across your code! Now my mp3 players get the correct song length and don't stop too early. Yay!

-- Anon mp3 guy

Posted 2015-01-30 18:10:55 GMT by Anonymous from
