96 kHz sample rate WAV files

After studying sampling rates in one of my labs, I’m wondering why/if 96 kHz recording offers any benefits. From the academic viewpoint, at least as I understood it, 96 kHz is overkill, unless it offers a more accurate reproduction of the original material by sampling at more than 4 times the bandwidth of the signal…

It’s the same overkill as 24 bit!

It’s not the same overkill: 24 bits can provide some extra dynamics on test recordings if you listen at 110 dB. Given that the average dynamic range of pop music is nowadays around 6 dB, it won’t change anything in this domain.

96 kHz provides frequency response up to 48 kHz (and 1 extra bit of dynamics as a side effect, because of possible dithering).

Listening tests :

Con: as the added frequencies are inaudible, the only way they can affect what we hear is through intermodulation distortion. However, tests conducted by feeding a high (inaudible) frequency into one speaker and an audible one into another, which should intermodulate, show that this is not the case. I did it with 6 + 18 kHz and confirmed that, though the intermodulation product (12 kHz) is audible when both tones are in the same speaker, I can’t hear it, even at about 100 dB in the room, when separate speakers are used.
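For anyone wanting to check where the 12 kHz comes from, here is a rough sketch. It assumes the speaker behaves like a simple second-order (x squared) nonlinearity, which is a textbook simplification and not a model of any real driver. The function name is made up for illustration.

```python
def second_order_products(f1_hz, f2_hz):
    """Frequencies generated when two sines pass through an x**2 nonlinearity."""
    return {
        "difference": abs(f2_hz - f1_hz),
        "sum": f1_hz + f2_hz,
        "second_harmonic_f1": 2 * f1_hz,
        "second_harmonic_f2": 2 * f2_hz,
    }


products = second_order_products(6_000, 18_000)
# Keep only the products that fall inside the nominal 20 kHz hearing range.
audible = {name: f for name, f in products.items() if f <= 20_000}
print(audible)
```

With this particular tone pair the second harmonic of the 6 kHz tone also lands on 12 kHz, so under this simplified model a 12 kHz component alone does not say whether it came from intermodulation or from harmonic distortion of the lower tone.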

Another con : any speaker should distort when fed with >20 kHz frequencies.

Pro : http://jn.physiology.org/cgi/content/full/83/6/3548

Other Links :

http://www.musicgearnetwork.com/ubb/ultimatebb.php?ubb=get_topic;f=3;t=000822#000000 , abstract in page 33

http://www.hydrogenaudio.org/index.php?act=ST&f=1&t=3390 (down at this time, should be back online soon)

Originally posted by ckin2001
After studying sampling rates in one of my labs, I’m wondering why/if 96 kHz recording offers any benefits. From the academic viewpoint, at least as I understood it, 96 kHz is overkill, unless it offers a more accurate reproduction of the original material by sampling at more than 4 times the bandwidth of the signal…

No it’s not overkill. It’s not 4x the bandwidth.

A single period of a sinusoidal wave has a positive section and a negative section. You need to sample at least once in each section: once for the positive half and once for the negative half. More if possible (the reason comes later).

In theory, using the right kind of higher order filters, we can reconstruct the waveform exactly, assuming that the sampling rate is slightly more than double the highest frequency in the signal.

The audible range of human hearing is approx. 20 Hz to 20 kHz. (44.1 kHz is what audio CDs are sampled at.)

Unfortunately we can get folded or aliased signals the closer the sampling rate is to (or lower than) 2x the actual signal frequency. These aliased and folded signals will appear in the reconstructed waveform and produce distortion.

If the sampling frequency is much greater than the highest audible frequency, the folded and aliased signals are pushed much higher than the audible frequencies.
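To put numbers on the folding, here is a minimal sketch of where a tone ends up after sampling when nothing filters it out first. The helper name and the 30 kHz test tone are just illustrative choices.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone of f_hz shows up after sampling at fs_hz
    with no anti-alias filter in front of the converter."""
    f = f_hz % fs_hz
    # Anything above fs/2 folds back down around fs/2.
    return fs_hz - f if f > fs_hz / 2 else f


for fs in (44_100, 96_000):
    print(fs, alias_frequency(30_000, fs))
```

At 44.1 kHz the 30 kHz tone folds down to 14.1 kHz, well inside the audible band. At 96 kHz it is below half the sample rate, so it stays at 30 kHz.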

So actually sampling at over 4 samples / period gives a much more accurate reconstruction using the $hitty filters that are usually implemented in hardware.

In the real world, the majority of filters are crap, and if you get 2x / 4x oversampling without a filter, or with a linear filter, you are doing well. Getting a higher order filter is near impossible, except in the most expensive audiophile gear.

Lower sampling rates will give a muffled sort of sound. The higher frequencies will be discarded, leaving only the lower frequencies (which is one reason music sounds really bad across a phone).

Don’t confuse the sampling rate with the sampling resolution.

Increasing the sampling resolution (the number of bits for a single sample) gives a higher accuracy on the reconstruction of the sine wave. Lower sampling resolutions will usually suffer from noise, because the approximation of the original sound cannot be as accurate and the reconstructed signal has a larger error.
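As a rough guide to how resolution maps to noise, the usual textbook figure for an ideal converter with a full-scale sine is about 6.02 dB per bit plus 1.76 dB. A small sketch of that formula follows (the function name is illustrative, and real converters fall short of the ideal number):

```python
import math


def ideal_dynamic_range_db(bits):
    """Signal-to-quantization-noise ratio of an ideal converter, full-scale sine."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (8, 16, 20, 24):
    print(bits, round(ideal_dynamic_range_db(bits), 1))
```

Each extra bit halves the quantization step, which is where the roughly 6 dB per bit comes from.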

Depending on the application, we can choose whether we want to drop the resolution, or the sampling frequency.

For music, dropping neither is suitable.
For voice, sampling rate can be dropped (voice is usually less than 2KHz).

Dropping the sampling resolution is usually shocking in both cases (listen to mobile phones)

@Pio: 24 bit is an overkill, because currently affordable technology cannot convert signals of a precision higher than 20 bits to analogue signals…so the lowest 4 bits don’t contain any useful information.

Originally posted by debro
If the sampling frequency is much greater than the highest audible frequency, the folded and aliased signals are pushed much higher than the audible frequencies.

The aliases are filtered out. None should remain in the output.

Here are the levels of the highest aliases I’ve measured in the analog output of the Yamaha CDX860 CD player, playing -12 dB sines burned on CD (whole setup here):

14 kHz sine @ -12 dB: -96 dB
16 kHz sine @ -12 dB: -96 dB
18 kHz sine @ -12 dB: -95 dB
19 kHz sine @ -12 dB: -95 dB
20 kHz sine @ -12 dB: -86 dB
21 kHz sine @ -12 dB: -43 dB
22 kHz sine: 22 kHz original and 22.1 kHz alias both at -19 dB.

Therefore at 44.1 kHz, the aliases are filtered enough; no need to switch to 96 kHz.
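The 22.1 kHz figure in the last measurement matches the first image of the tone around the sample rate, at fs minus f. A quick sketch of that arithmetic for the tones in the list (the helper name is made up here):

```python
def first_image_hz(f_hz, fs_hz):
    """First image of a tone in a DAC output before reconstruction filtering."""
    return fs_hz - f_hz


FS_CD = 44_100
for f in (14_000, 20_000, 21_000, 22_000):
    image = first_image_hz(f, FS_CD)
    # The gap shows how much room the output filter has to separate the two.
    print(f, image, image - f)
```

The closer the tone gets to half the sample rate (22.05 kHz), the closer its image sits, which fits the levels getting worse above 20 kHz in the measurements.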

Originally posted by debro

In the real world, the majority of filters are crap, and if you get 2x / 4x oversampling without a filter, or with a linear filter, you are doing well. Getting a higher order filter is near impossible, except in the most expensive audiophile gear.

My Sony DAT55ES (1000 €) oversamples 64x on input and output. Maybe computer soundcards are crap, but can you hear the difference in an ABX test? Try KikeG’s challenge: http://www.kikeg.arrakis.es/stest/
A track is passed three times through the analog output and input (output-input-output-input-output-input) of the M-Audio Audiophile soundcard.
With a Marian Marc 2 soundcard (bit-exact SPDIF output), the Sony DTC 55ES DAT as converter, and AKG K-400 headphones, I can’t hear the difference between the WAV ripped directly and the same track after three analog re-recordings.

I “surely” can tell the difference between the converter of my 450 € Yamaha CDX860 (crap) and 1000 € Sony DAT55ES fed with the Marian Marc 2 for CD Playback… until I try in a blind test (and fail).
I recorded the analog output of the Yamaha (Sony analog input -> Marian digital input in slave mode). The copy can’t sound better than the original. The computer CD playback (SPDIF to Sony DTC55ES) being already better than the direct analog Yamaha CD player, it is even better than the copy of the Yamaha.
And yes, the difference is pretty easy to hear with an Arcam Diva A85 amplifier, Sennheiser HD-600 headphones, and Dynaudio Gemini speakers.
So I started the ABX blind test. Not so easy, after all, but I can still hear the difference, and I’m sure I had all answers correct… until I click the “test over” button and see the actual results: 4/8!
It means that I answered randomly, and that in fact, there is no audible difference. It gave me the sessions where, sure of myself, I said the opposite of what was played.
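For reference, here is a small sketch of how likely a given ABX score is from pure guessing, using the binomial distribution with a 50% chance per trial. The function name is just for illustration.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of `trials`
    when every answer is a coin flip."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


for correct, trials in ((4, 8), (8, 8), (16, 16)):
    print(f"{correct}/{trials}: {abx_p_value(correct, trials):.5f}")
```

A 4/8 score or better comes up about 64% of the time by guessing, while 8/8 happens by chance only about 0.4% of the time.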

I’ve fooled myself for years believing that this DAT, used as an external converter for the CD player, sounded better. I was wrong all along. There is no audible difference with the analog output of the CD player.

I’ve also tried to record a vinyl at 44.1 kHz 16 bits, and 96 kHz 24 bits in the Marian Marc 2 soundcard analog input (Technics SL-3100 turntable, Trackmaster EL cartridge, CyrusOne phono input). The 96 kHz sampling rate should keep all the warm analog sound of vinyl. Playback with AKG K-400 headphones in the analog output of the Marian.

Same as always : incredible, spectacular, impressive audible difference when I see what I play, and no audible difference at all when the samples are hidden. No ABX test, this time. I just listened to the samples in the ABX program, and realized they were in fact the same before even starting the test.

And the program is not broken, I would have heard it, I would have thought: “hey, what’s happening, it only plays me the 44.1 (or 96) file”! No, I simply couldn’t tell if the sound played was good or bad, it was just “as good as it always had been”. I couldn’t tell if there was “something analog” in it. And with some other samples (MP3, LP vs CD…), the ABX program has already proven to work properly (8/8, 16/16 correct results when all samples are properly recognized).

To fight this so called “placebo” effect, that plagues even professional listeners, the solution is simple : ABX blind test. If a difference is audible between two sources, a successful ABX blind test is the proof.

So far, nobody has succeeded in telling the difference between 44.1 kHz and 96 kHz sample rates in ABX.
The absence of proof is not the proof of absence, but, with all the numbers far beyond human hearing abilities (frequency response, noise, dynamics, distortion, wow…), we are still waiting for proof that there is an audible difference, in spite of all the scientific measurements of distortion and frequency response, which already say that there shouldn’t be any audible improvement.

The above test is already evidence that on today’s consumer soundcards, the difference between 16/44.1 and 24/96 is small, if audible at all.

FYI, the reason music sounds poor over a phone line is that the spec for an ordinary phone service is 0.3 to 3.4 kHz. Analogue and digital transmission systems designed for carrying telephone traffic only allow a 4 kHz bandwidth for each telephone channel.
This is why radio and TV stations specify 10 kHz or 15 kHz for their programme lines when doing outside broadcasts (instead of using ordinary phone lines).

Just thought you’d like to know:)

Originally posted by debro
No it’s not overkill. It’s not 4x the bandwidth.

But you also say it’s over 4 samples per period, which is exactly the same thing…

I’ve listened to various signals where the sample rate was more than 2x the bandwidth, and others where it was less than 2x. I can’t imagine a difference between the more than 2x and the more than 4x, though. I’ll have to confirm it to myself: take some signal that’s under 5 kHz, and sample it at 11025 and 22050. That’s the easiest way for me to hear it, anyways :wink:

Originally posted by ckin2001
But you also say it’s over 4 samples per period, which is exactly the same thing…

The standard is two samples per period. When you use 4 samples per period instead, you double the sample rate.
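To make the samples-per-period arithmetic concrete, here is a tiny sketch. The function name and the tone choices are only for illustration.

```python
def samples_per_period(fs_hz, tone_hz):
    """How many samples land inside one cycle of a tone at a given sample rate."""
    return fs_hz / tone_hz


print(samples_per_period(44_100, 22_050))  # the 2-per-period limit at CD rate
print(samples_per_period(44_100, 20_000))  # top of the nominal audible range on CD
print(samples_per_period(96_000, 20_000))  # same tone at 96 kHz
```

So a 20 kHz tone gets a little over 2 samples per period at 44.1 kHz and a little under 5 at 96 kHz.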

Originally posted by ckin2001
take some signal that’s under 5 kHz, and sample it at 11025 and 22050. That’s the easiest way for me to hear it, anyways :wink:

Use a sinusoidal signal, that’s very important, because a square or a triangle wave, for example, is made up of several sines of different frequencies.

lol, yaya, i know i know, no squares / triangles

stupid time domain / frequency domain :wink:

Well, it’s not only that. Say you’re running a spectrum analyzer and doing effects in the frequency domain.

More samples per second means you can figure out frequency content with a smaller chunk of time.

Obviously you can’t figure out the frequency content of a signal that’s only 1 sample long. If you try to do it on a 5 sample window, results won’t be very good at all.

Say you use a 44100 sample window. Then you can get some pretty fine-grained results about exactly what frequency is in there. Sampling at 44100 Hz, that’s 1 full second of sound. But sampling at 192 kHz means that’s only about 0.2 seconds of sound.
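Here is a quick sketch of the window arithmetic, assuming a plain DFT/FFT over the window. The helper names are invented for this example.

```python
def window_duration_s(n_samples, fs_hz):
    """How much time a window of n_samples covers."""
    return n_samples / fs_hz


def bin_spacing_hz(n_samples, fs_hz):
    """Spacing between adjacent DFT bins for that window."""
    return fs_hz / n_samples


N = 44_100
for fs in (44_100, 192_000):
    print(fs, window_duration_s(N, fs), round(bin_spacing_hz(N, fs), 2))
```

The same 44100-sample window covers 1 second at 44.1 kHz and about 0.23 seconds at 192 kHz, so the lag is shorter. The flip side is that the bins are about 4.35 Hz apart instead of 1 Hz, since the resolution in Hz follows the window duration rather than the sample count.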

I guess it means your frequency domain effects will sound much better, and you won’t have as much lag when adding effects in real-time, and your real-time spectrum analyzer will not have as much lag either.

See http://en.wikipedia.org/wiki/Nyquist-Shannon_sampling_theorem.

Thanks billybaloop!.. For the 2002 revival!.. Anyways, while some case studies have shown ‘measurable’ differences, most have shown no ‘audible’ differences via (ABX) test results…

"Limits of perception
The human ear can nominally hear sounds in the range [B]20 Hz to 20,000 Hz (20 kHz)[/B]. This upper limit tends to decrease with age, most adults being unable to hear above [B]16 kHz[/B]. The ear itself does not respond to frequencies below 20 Hz…"

IMHO, why bother with anything other than the Red Book standard (44.1 kHz, 16 bits per sample)?.. But hey, if you can [I]hear[/I] a difference while [I]looking[/I] at a spectrum, have at it…
Cheers!..:cool:

Going from 44.1KHz to 96KHz when using 16bit probably won’t make a huge difference. Going to 24bit makes a huge difference to the dynamic range.

A hobby of mine is multi-track recording using real instruments. I sample at 24 bit 192 kHz. When it comes to mixing it down to two tracks, in 24 bit it sounds great, but in order to let someone else hear what I’ve done, it has to be converted to 16 bit 44.1 kHz so it can be made into an audio CD, and quite frankly, at least to my ears it now sounds crap.

[QUOTE=t0nee1;2519843]
"Limits of perception
The human ear can [B]nominally [/B]hear sounds in the range 20 Hz to 20,000 Hz (20 kHz). This upper limit tends to decrease with age, most adults being unable to hear above 16 kHz. The ear itself does not respond to frequencies below 20 Hz…"


[/QUOTE]
What I get out of that is [B][I]nominal[/I][/B].
Nominal is defined as:

[ul]
[li]performing or achieved [B]within expected, acceptable limits[/B]; [B]normal[/B], [B]satisfactory[/B]; and[/li]
[li]named as a mere matter of form, being trifling in comparison with the actual value; [B]minimal[/B].[/li]
[/ul]
I translate that as normal / acceptable.
Most people aren’t normal. Look at your family and friends.
Extrapolate their behaviour to the behaviour of their ears… and hence many people will benefit from a higher frequency range. On the other hand, some can’t even hear the nominal 20 Hz to 20 kHz frequency range… especially those who listen to iPhones at high volumes, people who attended many concerts in their youth, and people who work with noisy equipment.

The Nyquist rate specifies that reconstructing a waveform with acceptable accuracy requires a sample rate of AT LEAST TWICE the maximum frequency.
However, accurately recreating a waveform correctly from only two samples (assumed one positive, one negative) per cycle can probably be summed up best from this video.

Engineers prefer 10x samples per cycle, probably because we’re cynical about the capability of products that will be sampling & recreating the sound later :stuck_out_tongue:
Although there has been much advance in waveform reproduction with digital systems :wink:

At any rate, human vocal range is generally limited between 100Hz and 1KHz.

From memory, most instruments produce sounds at or below 4 kHz, which is why 40 kHz was chosen. 44.1 kHz was finalised due to equipment constraints.

CD Audio sounds crap. Mp3 and/or heavily compressed audio is listening to noise with music playing in the background :stuck_out_tongue:

At any rate, sampling at 96 kHz will increase your storage requirements by a factor of 96/44.1 (2.2ish) and will give no quality increase for likely 95% of the population, and a negligible quality increase for the other 5%.
Resampling to 44.1 kHz for distribution will be a major task, and will result in sound whose quality depends on the reconstruction and resampling algorithm (I guess they’re probably pretty good by now).
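For a feel of the storage numbers, here is a sketch for uncompressed PCM, where size scales linearly with sample rate, bit depth and channel count. Stereo is assumed, and the function name is only illustrative.

```python
def pcm_bytes_per_second(fs_hz, bits, channels=2):
    """Data rate of uncompressed PCM audio."""
    return fs_hz * bits * channels // 8


cd = pcm_bytes_per_second(44_100, 16)
for fs, bits in ((44_100, 16), (96_000, 16), (44_100, 24), (96_000, 24)):
    rate = pcm_bytes_per_second(fs, bits)
    print(f"{fs} Hz / {bits} bit: {rate} B/s, {rate / cd:.2f}x CD")
```

Going to 96 kHz alone is about 2.18x the CD data rate, and 24 bit at 96 kHz together is about 3.27x.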

Increasing the sampling accuracy from 16 bit to 24 bit will increase your storage requirements by a factor of 1.5, but will have a noticeable quality increase for all people, except the deaf.
Downgrading the audio to 16 bit is a minor task (just drop the least significant 8 bits), which is quick.
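Dropping the low 8 bits can be sketched as a plain right shift on signed integer samples. This is straight truncation as described above; mastering tools commonly add dither before reducing bit depth, which this sketch does not do. The function name is hypothetical.

```python
def truncate_24_to_16(sample_24):
    """Drop the 8 least significant bits of a signed 24-bit sample."""
    # Python's >> on ints is an arithmetic shift, so the sign is preserved.
    return sample_24 >> 8


for s in (0x7FFFFF, -0x800000, 255, -1):
    print(s, truncate_24_to_16(s))
```

Anything smaller than one 16-bit step (like the 255 above) simply disappears, which is the detail that gets lost in the conversion.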

Remember - Record the audio in the frequency (or multiple of the frequency) that you intend to distribute it as.

If you intend to distribute audio at 48 kHz, record at 48 / 96 kHz. If you intend to distribute on CD, record at 44.1 / 88.2 kHz. It’ll make your life easier.

This is a link to a hearing test. I didn’t dig out my old enclosed headphones like I should have and went ahead with the kind that are open to room noise. I think I will try again with the enclosed headphones.
http://www.phys.unsw.edu.au/jw/hearing.html

Dee wrote,

A hobby of mine is multi-track recording using real instruments. I sample at 24 bit 192 kHz. When it comes to mixing it down to two tracks, in 24 bit it [I]sounds great[/I], but in order to let someone else hear what I’ve done, it has to be converted to 16 bit 44.1 kHz so it can be made into an audio CD, and quite frankly, at least to my ears it now [I]sounds crap[/I].

Were you able to ABX both versions?..Else may I suggest ‘placebo effect’, since you knew which was which…I’m not arguing/doubting you could hear the difference, just curious is all!..And just to be clear, don’t let the earphones fool ya, I’m no audio expert, just a hobbyist, like yourself…:wink:

debro wrote,

CD Audio sounds crap

The CD audio sounds like crap debate has been discussed ad nauseam…Yawnnn!!..:rolleyes:
http://www.hydrogenaudio.org/forums/index.php?showtopic=8909&st=0

[QUOTE=t0nee1;2520021]The CD audio sounds like crap debate has been discussed ad nauseam…Yawnnn!!..:rolleyes:
http://www.hydrogenaudio.org/forums/index.php?showtopic=8909&st=0[/QUOTE]
Yep :slight_smile:
And the conclusion is generally that CD audio is good enough for general listening, but not good enough as a permanent storage :slight_smile:

IMHO

The whole “[B][I]lossless audiophile[/I][/B]” scene is flawed from its very foundation.

Basically it’s people who when they started downloading MP3s were using bad codecs and lower bitrates (128 w/ music match jukebox) and it was certainly not CDDA quality…since then they went crazy.

Truth be told, there is NO such thing as “lossless audio” in the digital domain; it’s all sampling and reproducing the sound based on samples, and choosing from x amount of samples. Only analog is lossless (in a sense), as it’s continuous and does not take samples.

96 may be better depending on the source (i.e. if the source is higher than 96). As for whether most of us can hear any difference: maybe, but it’s usually in dB and not quality.

How do I convert a 16 bit CD to a 24 bit CD?
TIA :slight_smile:

[QUOTE=Burnsama;2556780]How do I convert a 16 bit CD to a 24 bit CD?
TIA :)[/QUOTE]

You can’t, CD only supports 16 bit (not anything else). If you are asking about upsampling to 24 bit, it’s a waste of space as the file is already 16 bit… I am sure foobar 2k will do it; give it a try, the sound won’t change.

If you want to make a playable disc w/ 24bit you’ll need to use DTS or DVDA which your player has to support, you can’t do CDDA. As I said upsampling is a waste of space.
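To illustrate why padding 16 bit audio out to 24 bit adds nothing, here is a minimal sketch that assumes the conversion just shifts each sample left by 8 bits. That is one straightforward way to do it; what any given tool does internally is not something this sketch claims to know.

```python
def pad_16_to_24(sample_16):
    """Place a signed 16-bit sample in the top 16 bits of a 24-bit word."""
    return sample_16 << 8


samples_16 = [0, 1, -1, 12345, -32768, 32767]
samples_24 = [pad_16_to_24(s) for s in samples_16]

# The added low byte is always zero, so it carries no new information.
low_bytes = [s & 0xFF for s in samples_24]
# Shifting back recovers exactly the original 16-bit samples.
round_trip = [s >> 8 for s in samples_24]
print(low_bytes, round_trip == samples_16)
```

The file gets 1.5x bigger, and the extra byte per sample is all zeros.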