An Introduction to MP3 Encoding

Post by **StormChaser** » Sat Jul 24, 2010 11:38 pm

Why bother?

With the ever-dropping price of storage media, it's a perfectly reasonable question to ask. However cheap storage becomes, though, it's cheaper still if you can fit far more audio content into the same space, especially if the loss in quality is too small for you to notice. Even at fairly high compression levels, the LAME MP3 encoder can produce results that are perceptually transparent (i.e., indistinguishable from the original) to all but a tiny percentage of people. Extensive public group listening tests over the years have shown this time and again.

Why MP3?

Whilst it's true to say that several other audio compression standards have become available since the launch of the MP3 standard, none has achieved such widespread compatibility across both the computer and domestic hardware markets. This is particularly true for those of us who can't always afford to buy the latest and greatest hardware, and let's face it, that's going to be most of us in reality. If your hardware supports any of the compressed audio formats, you can be all but certain that MP3 will be one of them. Purists may still decide to encode using a lossless compression format such as FLAC for true 1:1 backups, but the compression ratio is seldom much more than 2:1 and compatibility with standalone commercial media players is still next to nil.

Won't MP3 'kill' my music?

The short answer is "used sensibly, no." Read on for further technical detail as to why...

The popular claim that MP3 "throws sounds away" isn't entirely true. MP3 analyses the signal in order to determine what you are and aren't going to hear due to the natural masking effects of simultaneous sounds before deciding upon the resolution to use to store that particular part of the sound. MP3 encoding effectively uses more bits to store important detail, and fewer bits to store less important detail that you're incredibly unlikely to hear anyway due to these masking effects. Sounds that fall below the absolute threshold of audibility will disappear altogether, but as they were inaudible in the first place, nothing audible has been discarded. This does, of course, rely on the masking and audibility threshold detection modelling the human ear with great accuracy and, through very careful design, it almost always does.

By default, the LAME MP3 encoder also uses a method known as Joint Stereo to save even more space. The name is a little misleading as the channels aren't actually joined. What the method entails is determining whether it's more efficient to store the left and right channels as discrete left and right channels or as the sum of both channels and the difference between both channels. The encoder can switch between the two modes on a frame-by-frame basis (roughly every 26 milliseconds), using whichever method makes the most efficient use of the available bitrate. The decoder is told which mode each frame uses, so it switches seamlessly on the fly during playback. The result is increased quality in CBR and ABR, or decreased file size for the same quality in VBR. The sum and difference conversion is fully reversible in itself and has no effect on channel separation whatsoever, so there is no logical reason to avoid it.
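
To make the sum and difference idea concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: it works on whole-number time-domain samples with made-up values, whereas a real encoder applies a scaled version of the same idea to frequency-domain data before quantising it. The function names are my own, not anything from LAME.

```python
def to_sum_diff(left, right):
    """Convert discrete left/right samples into sum and difference form."""
    total = [l + r for l, r in zip(left, right)]
    diff = [l - r for l, r in zip(left, right)]
    return total, diff


def to_left_right(total, diff):
    """Undo the conversion. total + diff is always 2 * left, so the
    integer division below is exact and nothing is lost."""
    left = [(t + d) // 2 for t, d in zip(total, diff)]
    right = [(t - d) // 2 for t, d in zip(total, diff)]
    return left, right


# Hypothetical samples from a recording whose channels are nearly identical.
left = [1000, 1200, -500, 300]
right = [990, 1210, -510, 300]

total, diff = to_sum_diff(left, right)
restored_left, restored_right = to_left_right(total, diff)

print("difference channel:", diff)
print("round trip exact:", (restored_left, restored_right) == (left, right))
```

Notice how small the numbers in the difference channel are when the two channels are similar. Small values need fewer bits, which is where the saving comes from.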

On top of this, MP3 also uses a method known as Huffman coding. This is similar to part of the compression used by archivers such as PKZIP and WinRAR, and this phase of the compression process is also entirely lossless. This can reduce encoded file size by up to a further 20%. If you've ever wondered why a zipped MP3 file is still almost exactly the same size as the unzipped original, the fact that Huffman coding has already been applied to it once is the reason why.
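
For the curious, the principle behind Huffman coding can be sketched in a few lines of Python. This is a generic textbook version that builds its code from the data it is given, and the input string is made up. MP3 itself chooses from tables predefined in the standard rather than building a fresh one per file, so treat this purely as an illustration of why frequent values end up costing fewer bits.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for a Huffman code built from data."""
    counts = Counter(data)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {symbol: 1 for symbol in counts}
    # Heap entries are (weight, tie_breaker, {symbol: depth so far}).
    heap = [(weight, i, {symbol: 0})
            for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest groups; every symbol in them gets one bit longer.
        w1, _, group_a = heapq.heappop(heap)
        w2, _, group_b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in group_a.items()}
        merged.update({s: d + 1 for s, d in group_b.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


data = "aaaaaaaabbbbccdd"          # 'a' is common, 'c' and 'd' are rare
lengths = huffman_code_lengths(data)
huffman_bits = sum(lengths[symbol] for symbol in data)
fixed_bits = 2 * len(data)         # four symbols would need 2 bits each

print(lengths)
print(huffman_bits, "bits with Huffman versus", fixed_bits, "bits fixed-length")
```

The common symbol gets a one-bit code and the rare ones get three bits, so the total shrinks without losing anything. Once data has been packed like this, a second pass by a zip tool has very little left to squeeze.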

Given sufficient bitrate to work with, MP3 will provide perceptually transparent results for almost everyone almost all of the time.

Why LAME in particular?

LAME is under constant development and has long been considered by those with 'golden ears' to be the best MP3 encoder available for producing the best possible quality at the lowest possible bitrate. Although MP3 is technically a fixed standard, there's still room for improvement within the confines of the specification, and the LAME developers never give up on making a good thing better if at all possible. Every developer involved in the LAME project has been a dedicated and highly skilled freelancer doing it for the love of it, not because a boss is telling them to do it, so that makes it kind of special in my estimation too.

CBR? ABR? VBR? What are they?

Let's take a look at the available encoding methods, highlighting their relative strengths and weaknesses. There are three basic encoding methods available with LAME: CBR (constant bitrate), ABR (average bitrate), and VBR (variable bitrate).

CBR

With CBR, the bitrate remains constant. Unless you always encode at 320kbps, CBR cannot give demanding passages all of the data that the MP3 standard would otherwise allow them. Forcing MP3 to encode in CBR at anything less than 320kbps will almost always lose you something on hard-to-encode musical passages, although these losses are minor and may not be noticed by many until the bitrate drops to around 128kbps. Only you know what bitrate sounds good to you, so experiment.

In summary, CBR = constant bitrate but constantly varying quality. The file size will always work out at a fixed number of megabytes per hour.
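
The "fixed number of megabytes per hour" is simple arithmetic, sketched below in Python. It assumes 1 kbps = 1000 bits per second and 1 megabyte = 1,000,000 bytes, and it ignores tags and other small overheads, so real files will differ slightly.

```python
def cbr_megabytes_per_hour(bitrate_kbps):
    """Approximate size in megabytes (10**6 bytes) of one hour of CBR audio."""
    bits_per_hour = bitrate_kbps * 1000 * 3600   # kbps -> bits over 3600 seconds
    return bits_per_hour / 8 / 1_000_000         # bits -> bytes -> megabytes


for kbps in (128, 192, 320):
    print(kbps, "kbps is roughly", cbr_megabytes_per_hour(kbps), "MB per hour")
```

So an hour of 128kbps CBR comes to a little under 58 megabytes no matter what the music is doing.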

ABR

ABR allows you to choose a target bitrate, but has the ability to increase or decrease the bitrate dynamically within internally defined limits. Its efficiency and quality exceed those of any sub-320kbps CBR encoding when compared at the same average bitrate. It is technically better than CBR encoding, as more bitrate can be given to complex passages and less to easier-to-encode passages, but it's still a compromise compared to VBR.

In summary, ABR = constantly varying quality but to a lesser degree than with CBR. The file size will be very close to a fixed number of megabytes per hour.

VBR

VBR gives the encoder access to any bitrate that complies with the MP3 standard (32 to 320kbps) and dynamically selects an appropriate bitrate to suit the complexity of the content for every encoded frame (roughly every 26 milliseconds) during the encoding process. The person carrying out the encoding selects a target quality level and the encoder does its best to maintain a fixed level of quality throughout the encoding.

In summary, VBR = constantly varying bitrate but almost constant quality. The file size will vary depending upon the quality level chosen and the complexity of the music being encoded.
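
The "roughly every 26 milliseconds" figure falls out of the frame size. The sketch below assumes 1152 samples per frame, which is the MPEG-1 Layer III frame size, and the CD sample rate of 44100Hz. Other sample rates give slightly different frame lengths.

```python
SAMPLES_PER_FRAME = 1152   # MPEG-1 Layer III frame size (assumed for this sketch)


def frame_duration_ms(sample_rate_hz):
    """Length of one MP3 frame in milliseconds at the given sample rate."""
    return SAMPLES_PER_FRAME * 1000 / sample_rate_hz


def frames_per_second(sample_rate_hz):
    """How many times per second a VBR encoder gets to pick a bitrate."""
    return sample_rate_hz / SAMPLES_PER_FRAME


print(round(frame_duration_ms(44100), 2), "ms per frame at 44100 Hz")
print(round(frames_per_second(44100), 1), "bitrate decisions per second")
```

In other words, at CD sample rates a VBR encode is free to change bitrate about 38 times every second.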

Which do I choose?

For CDs...

The LAME developers have been favouring VBR for some time now with their fine-tuning, and this is the mode most recommended for encoding high-quality sources (CD audio for example) as it offers the smallest files possible for the level of quality chosen by the end-user. As MP3 encoding is all about balancing quality retention against saving space, this makes perfectly logical sense.

Selecting a quality level of '3' in MediaCoder should provide perceptually transparent results for almost everyone. If you think you can detect some loss in quality, try '2'. Very few people need to go higher in quality than this, but the option is there to set it to '1' or '0' if you need to. The lower the number, the higher the quality, but the larger the output file will be.

Leave all other encoder related settings on defaults to generate MP3 files with the highest possible quality.
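
If you ever drive the command-line LAME encoder directly rather than through MediaCoder, the equivalent is its `-V` option. The Python sketch below only builds the command. It assumes the `lame` program is installed and on your path, and that MediaCoder's quality number corresponds directly to `-V`, which I haven't verified. The function name and file names are made up for illustration.

```python
def lame_vbr_command(source_wav, target_mp3, quality=3):
    """Build a LAME command line for VBR encoding with everything else on defaults.

    quality follows LAME's -V scale: 0 is highest quality, 9 is lowest.
    """
    if quality not in range(0, 10):
        raise ValueError("VBR quality must be a whole number from 0 to 9")
    return ["lame", "-V", str(quality), source_wav, target_mp3]


command = lame_vbr_command("track01.wav", "track01.mp3", quality=3)
print(" ".join(command))

# To actually run it (requires LAME to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```

Nothing else is passed to the encoder, in keeping with the advice to leave the remaining settings alone.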

For Movies...

There seems to be some debate over which is the best encoding method to use when muxing audio with video, but I've almost always had lip-sync problems when using VBR or ABR. In fact, some reauthoring software won't even allow the use of anything but CBR. I therefore always go for CBR to be on the safe side, at a minimum of 128kbps if quality matters at all, giving more critical soundtracks a bitrate of either 160kbps or 192kbps. Your ears may demand more, or you may get away with less, so experiment to find out what suits you.

Remember that for the sake of highest compatibility, the bitrate should always be one of the following values (in kbps), and nothing in between...

32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256 and 320.
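
If a tool lets you type in any number you like, a small helper can snap it to the nearest value from the list above. This is just a sketch in Python. The function name is made up, and rounding ties upwards is my own choice on the grounds that the higher bitrate is the safer one for quality.

```python
STANDARD_BITRATES_KBPS = (32, 40, 48, 56, 64, 80, 96, 112,
                          128, 160, 192, 224, 256, 320)


def nearest_standard_bitrate(requested_kbps):
    """Return the listed bitrate closest to the request.

    When two listed values are equally close, the higher one wins.
    """
    return min(STANDARD_BITRATES_KBPS,
               key=lambda rate: (abs(rate - requested_kbps), -rate))


for wanted in (150, 144, 128, 500):
    print(wanted, "->", nearest_standard_bitrate(wanted))
```

A request for 150 lands on 160, and anything above the top of the list is capped at 320.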

As above, leave all other encoder related settings on defaults for the highest possible quality.

Why leave everything else on defaults? I want to play!

Play with any other encoder related settings on the firm understanding that encoding quality will almost certainly suffer. The LAME developers have spent 12 years fine-tuning the encoder to provide the best possible performance with default settings. It automatically changes any other parameters that need changing internally depending upon the encoding method and bitrate or quality level you set.

The only possible exception to this rule is if you have to change the sample rate to suit a specific hardware player. Otherwise, you're best off leaving everything else well alone. Trust the developers. They know the encoder much better than we do.

Conclusion

Hopefully, this quick guide has given a few important answers to those unfamiliar with MP3 encoding, but if you have a question to ask concerning any aspect of MP3 encoding with LAME, feel free to ask in the appropriate sub-forum. Thank you, and happy encoding!