SoundCD
===========

A real-time 32-bit player of Layer I,II encoded CD-quality audio
Coded by Thanassis Tsiodras

Requirements:
    486DX-33MHz and above
    Math Coprocessor
    16 Bit sound card
    32 bit Windows platform (Windows 95, Windows 3.1 with Win32s)

This software is released as shareware. It can be freely distributed,
as long as it is kept in its original form, with the accompanying text
files. The code in this software is Copyright (c) 1996 by Thanassis
Tsiodras.

Contents.
=========

0. Disclaimer
1. What is this program?
2. How is SoundCD used?
3. Registration information.
4. FAQ about MPEG audio.
5. Win32s
6. Contacting me.

0. Disclaimer
=============

I will not be held responsible for any kind of damage this software
causes to your machine. You are using it at your own risk.

Having said that, I must admit that as a computer engineer I believe
there is no way this software can harm your system. I have tested it
on more than 6 computers, and even in crash situations it doesn't
cause lost clusters by itself (all the disk activity it performs is
reading, not writing).

1. What is this program?
========================

This program is a real time player of MPEG encoded CD-quality audio.
CD quality means a 44.1kHz sampling rate, two channels (left & right)
and 16 bits in each sample. This adds up to 176KB per second, which
means that even a short CD-quality sampled sound takes up a lot of
space. MPEG Audio Layers I and II compress this space by up to 11:1.
You can check out some .mp2 files at ftp.crs4.it/mpeg/demos.

2. How is SoundCD used?
=======================

SoundCD offers the following functionality:

File/Open
Opens an MPEG encoded sound file, of Layers I or II.

File/Exit
Exits the program.

Options/Output quality...
Prompts you with a dialog box, where you define the sound quality you
want the player to produce. You can specify the channels that will be
decoded (left, right or both) as well as the sample rate of the output
(11, 22 or 44KHz).
The more quality you want, the more horsepower is asked from your CPU
when decoding. Full CD-quality (44KHz, stereo) is produced with a
Pentium 90MHz or a faster machine. With 486 machines (from 66MHz up to
100MHz) stereo decoding at 22KHz is possible. Weaker machines need a
reduction in the number of channels, in the sampling rate, or in both.

Help/About...
Help/Register...
Help/11 Hours of CD sound
Display informative dialog boxes on what you can do to register your
copy of SoundCD, or to license it for a production (if you are a
developer/company).

3. Registration information.
============================

The shareware version of SoundCD is only capable of playing files
smaller than 500KB (approximately 30 seconds of audio). If you want to
be able to play back a bigger file, you have to register your copy.
Your registered copy will be sent to you via e-mail (as uuencoded
data).

Registration is done by sending me an envelope containing a cheque as
well as your e-mail address (DON'T FORGET IT! IF YOU FORGET IT I CAN
NOT SEND YOU YOUR COPY - EVEN IF YOU E-MAIL ME AFTERWARDS). Make sure
your e-mail system can handle uuencoded data (if you are not sure, ask
your e-mail provider).

My address is:

    Thanassis Tsiodras
    Aghiou Charalampous 30
    Gizi 114 74, Athens
    Greece

Registration fee:

For single users:
    Cheque of 5000 Greek drx. or 25 US $, payable to
    "Thanassis Tsiodras".

For multimedia developers:
    One product License ( = name the product! =):
        Cheque of 50000 Greek drx. or 250 US $, payable to
        "Thanassis Tsiodras".
    Life-time license:
        Cheque of 170000 Greek drx. or 750 US $, payable to
        "Thanassis Tsiodras".

Note that registering the software doesn't mean you gain a bug-free
version. In that respect the shareware and the registered versions are
identical. The only drawbacks of the shareware one are the dialog box
at the start, the 500KB limitation, and the fact that SoundCD ignores
any run-time parameters (so it can't be spawned as a player just by
association).
These three drawbacks are removed from the registered version, but all
the other code is the same.

4. FAQ about MPEG audio.
========================

This is a part of the MPEG FAQ.

Q. So how does MPEG audio work?

A. Well, first you need to know how sound is stored in a computer.
Sound is pressure differences in air. When picked up by a microphone
and fed through an amplifier this becomes voltage levels. The voltage
is sampled by the computer a number of times per second. For CD audio
quality you need to sample 44100 times per second and each sample has
a resolution of 16 bits. In stereo this gives you 1.4Mbit per second
and you can probably see the need for compression.

To compress audio, MPEG tries to remove the irrelevant parts of the
signal and the redundant parts of the signal. Parts of the sound that
we do not hear can be thrown away. To do this MPEG Audio uses
psychoacoustic principles.

Q. So how does MPEG achieve this compression ratio?

A. Well, with audio you basically have two alternatives. Either you
sample less often or you sample with less resolution (less than 16
bits per sample). If you want quality you can't do much with the
sample frequency. Humans can hear sounds with frequencies from about
20Hz to 20kHz. According to the Nyquist theorem you must sample at
a rate of at least twice the highest frequency you want to reproduce.
Allowing for imperfect filters, a 44.1kHz sampling rate is a fair
minimum. So you either set out to prove the Nyquist theorem is wrong
or go to work on reducing the resolution. The MPEG committee chose the
latter.

Now, the real reason for using 16 bits is to get a good
signal-to-noise (s/n) ratio. The noise we're talking about here is
quantization noise from the digitizing process. For each bit you add,
you get 6dB better s/n. (6dB corresponds to a doubling of the signal
amplitude.) CD-audio achieves about 90dB s/n. This matches the dynamic
range of the ear fairly well.
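
The figures quoted so far can be checked with a few lines of Python.
This is only a sketch and not part of SoundCD: the function names are
made up for illustration, and the s/n function assumes nothing more
than the 6dB-per-bit rule of thumb for an ideal quantizer.

```python
import math

SAMPLE_RATE_HZ = 44_100   # CD sampling rate quoted in the FAQ
CHANNELS = 2              # stereo
BITS_PER_SAMPLE = 16


def pcm_bit_rate(sample_rate_hz, channels, bits_per_sample):
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return sample_rate_hz * channels * bits_per_sample


def nyquist_min_rate_hz(highest_freq_hz):
    """Lowest sampling rate that can still represent highest_freq_hz."""
    return 2 * highest_freq_hz


def quantization_snr_db(bits):
    """Rule-of-thumb s/n of an ideal quantizer: 20*log10(2) dB per bit."""
    return 20 * math.log10(2) * bits


bit_rate = pcm_bit_rate(SAMPLE_RATE_HZ, CHANNELS, BITS_PER_SAMPLE)
print(bit_rate)                            # 1411200 bits/s, about 1.4Mbit/s
print(bit_rate // 8)                       # 176400 bytes/s, about 176KB/s
print(nyquist_min_rate_hz(20_000))         # 40000 Hz, below the 44.1kHz of CD
print(round(quantization_snr_db(16), 1))   # 96.3 dB for 16 bits
print(round(quantization_snr_db(8), 1))    # 48.2 dB for 8 bits
```

The 96.3dB result for 16 bits is an upper bound for an ideal
quantizer, so a practical figure of about 90dB, as quoted above, is
plausible.
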
At that s/n ratio you will not hear any noise coming from the system
itself (well, there are still some people arguing about that, but
let's not worry about them for the moment).

So what happens when you sample to 8 bit resolution? You get a very
noticeable noise floor in your recording. You can easily hear this in
silent moments in the music, or between words or sentences if your
recording is a human voice. Waitaminnit. You don't notice any noise in
loud passages, right? This is the masking effect and is the key to
MPEG Audio coding. Stuff like the masking effect belongs to a science
called psychoacoustics that deals with the way the human brain
perceives sound. And MPEG uses psychoacoustic principles when it does
its thing.

Q. Explain this masking effect.

A. OK, say you have a strong tone with a frequency of 1000Hz. You also
have a tone nearby of say 1100Hz. This second tone is 18dB lower. You
are not going to hear this second tone. It is completely masked by the
first 1000Hz tone. As a matter of fact, any relatively weak sound near
a strong sound is masked. If you introduce another tone at 2000Hz,
also 18dB below the first 1000Hz tone, you will hear this. You will
have to turn down the 2000Hz tone to something like 45dB below the
1000Hz tone before it will be masked by the first tone. So the further
you get from a sound the less masking effect it has.

The masking effect means that you can raise the noise floor around a
strong sound because the noise will be masked anyway. And raising the
noise floor is the same as using fewer bits, and using fewer bits is
the same as compression. Do you get it?

Q. I don't get it.

A. Well, let me try to explain how the MPEG Audio Layer-2 encoder goes
about its thing. It divides the frequency spectrum (20Hz to 20kHz)
into 32 subbands. Each subband holds a little slice of the audio
spectrum. Say, in the upper region of subband 8, a 6500Hz tone with a
level of 60dB is present.
OK, the coder calculates the masking effect of this sound and finds
that there is a masking threshold for the entire 8th subband (all
sounds with a frequency...) 35dB below this tone. The acceptable s/n
ratio is thus 60 - 35 = 25dB. At roughly 6dB per bit, this equals
4 bit resolution. In addition there are masking effects on bands 9-13
and on bands 5-7, the effect decreasing with the distance from band 8.

In a real-life situation you have sounds in most bands and the masking
effects are additive. In addition the coder considers the sensitivity
of the ear for various frequencies. The ear is a lot less sensitive in
the high and low frequencies. Peak sensitivity is around 2 - 4kHz, the
same region that the human voice occupies.

The subbands should match the ear, that is, each subband should
consist of frequencies that have the same psychoacoustic properties.
In MPEG Layer 2, each subband is 750Hz wide (with 48kHz sampling
frequency). It would have been better if the subbands were narrower in
the low frequency range and wider in the high frequency range. That is
the trade-off Layer-2 took in favour of a simpler approach.

Layer-3 has a much higher frequency resolution (18 times more) - and
that is one of the reasons why Layer-3 has a much better low bitrate
performance than Layer-2.

But there is more to it. I have explained concurrent masking, but the
masking effect also occurs before and after a strong sound (pre- and
postmasking).

Q. Before?

A. Yes, if there is a significant (30 - 40dB) shift in level. The
reason is believed to be that the brain needs some processing time.
Premasking lasts only about 2 to 5ms. Postmasking can last up to
100ms.

5. Win32s.
==========

If you don't have Windows 95 you need to install the Win32s extensions
over Windows 3.1 or 3.11 in order to run SoundCD. These can be found
at ftp.microsoft.com.
A search I did also turned up the following sites:

    nic.switch.ch:/mirror/python/wpy/win32s12.exe
    ftp.sunet.se:/pub/lang/python/wpy/win32s12.exe
    ftp.ibp.fr:/pub11/python/wpy/win32s12.exe
    n.ruf.uni-freiburg.de:/pub/pc/msdos/windows/system/win32s12.zip
    scss3.cl.msu.edu:/pub/pc/win/win32s120.exe
    scss3.cl.msu.edu:/pub/pc/win/win32s125a.zip

Win32s is a set of 32 bit extensions that, when applied to Windows
3.1/3.11, allows it to execute 32-bit Windows programs like SoundCD.

If for some reason you later wish to remove Win32s:

(1) exit to DOS.
(2) delete the Win32s directory under windows\system and all its
    files.
(3) edit the system.ini file in the Windows directory and remove the
    line
        device=C:\WINDOWS\SYSTEM\WIN32S\W32S.386
(4) return to Windows.
(5) remove the Win32 Applications Progman group.

6. Contacting me.
=================

My e-mail account is ttsiod@softlab.ntua.gr
You can contact me for any comments or suggestions on SoundCD.

Hope you enjoy it.