fft & windowing

For a visual representation of sound energy in different frequency bands, the digital time-domain audio signal has to be transformed into the frequency domain. The usual mathematical method for this is the Fast Fourier Transform (FFT). We will not explain the details here: the algorithm is well known and is covered in many publications and on Wikipedia. We will consider only a few aspects of the processes involved.

Digital audio data is represented in samples. The sample rate describes the number of samples which represent 1 second of audio, e.g. a sample rate of 44100 Hz means that you need 44100 consecutive samples to represent 1 second of your audio signal. The FFT works on blocks of samples whose size is a power of 2, e.g. 256, 1024 or 2048 samples, which correspond to 5.8 ms, 23.2 ms and 46.4 ms at a sample rate of 44100 Hz. Each “point” of the FFT represents the energy of a frequency range. Those ranges are linearly distributed over the entire range (0 – 22050 Hz @ 44100 Hz sample rate), and each one is as wide as the sample rate divided by the block size. In our example with a block of 256 samples this would mean: the first FFT point represents all the energy between 0 and 172.3 Hz, whereas the last FFT point represents the range between 21878 Hz and 22050 Hz.

The human perception of sound energy is logarithmic on the frequency axis. This means that the resolution of the lower frequencies computed with a small FFT block size is very unsatisfying because it leaves great gaps in the spectrum! One could argue that the solution to this problem is a larger block size like 65536 samples. Indeed this leads to an FFT resolution of 0.67 Hz per point, which is quite satisfying. But 65536 samples represent almost 1.5 seconds of audio! Quite apart from the computational effort, there is no chance to see details of fast changes in the signal, such as transients!
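The numbers above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not code from the Analyzer; the function names are made up for illustration.

```python
SAMPLE_RATE = 44100  # Hz, the sample rate used in the text


def block_duration_ms(block_size, sample_rate=SAMPLE_RATE):
    """Length of one FFT block in milliseconds."""
    return 1000.0 * block_size / sample_rate


def point_width_hz(block_size, sample_rate=SAMPLE_RATE):
    """Width of the frequency range covered by one FFT point."""
    return sample_rate / block_size


for block_size in (256, 1024, 2048, 65536):
    print(
        f"{block_size:>6} samples: "
        f"{block_duration_ms(block_size):8.1f} ms per block, "
        f"{point_width_hz(block_size):7.2f} Hz per point"
    )
```

Doubling the block size halves the width of each point and doubles the time one block spans, which is the trade-off between frequency resolution and time resolution.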

This relation between FFT block size and frequency resolution is the reason for the gaps in the Analyzer’s frequency view at smaller block sizes. The smaller the block size, the wider the gaps in the lower frequencies. We decided to leave those gaps as they are so you can always see how precise your analysis can be.
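To see where such gaps come from, one can count how many FFT points fall into each octave of a logarithmic frequency axis. The sketch below is an illustration only: the octave bands starting at 20 Hz, the function name, and the simplification of treating each point as a single frequency at k * sample rate / block size are assumptions, not a description of how the Analyzer draws its display.

```python
SAMPLE_RATE = 44100  # Hz


def points_in_band(low_hz, high_hz, block_size, sample_rate=SAMPLE_RATE):
    """Count FFT points whose frequency lies in [low_hz, high_hz)."""
    count = 0
    for k in range(block_size // 2 + 1):
        freq = k * sample_rate / block_size
        if low_hz <= freq < high_hz:
            count += 1
    return count


# Octave bands 20-40 Hz, 40-80 Hz, ... up to 10240-20480 Hz (assumed layout).
bands = [(20 * 2 ** i, 20 * 2 ** (i + 1)) for i in range(10)]

for block_size in (256, 2048, 65536):
    counts = [points_in_band(lo, hi, block_size) for lo, hi in bands]
    print(f"{block_size:>6} samples: {counts}")
```

With 256 samples the three lowest octaves contain no point at all, while the highest octave contains dozens. An octave without any point is exactly a gap on a logarithmic frequency axis.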

For further insight into the relation between block size and frequency resolution we recommend this: http://en.wikipedia.org/wiki/Short-time_Fourier_transform

The FFT algorithm assumes that the block it analyses repeats periodically. A block cut out of a signal, even a periodic one, usually does not contain a whole number of periods, so the repetition has a jump at the block boundaries and the FFT produces spectral leakage: the energy of one frequency is smeared over neighbouring points. To reduce this problem a window function is applied to the block before calculating the FFT. This window “sharpens” the spectrum but may result in amplitude errors. Depending on the characteristics of the signal and the desired focus in the spectrum there are many different window functions. The Analyzer offers 6 of the most commonly used ones.
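The effect of a window on leakage can be sketched in plain Python. The example below is not the Analyzer's implementation: it uses a slow direct DFT instead of a real FFT so that it needs no external library, it compares only the rectangular and Hann windows, and the block size of 256 and the comparison point (FAR_BIN) are chosen arbitrarily for illustration.

```python
import cmath
import math

SAMPLE_RATE = 44100   # Hz
BLOCK_SIZE = 256      # samples per block (illustrative choice)
TONE_HZ = 1000.0      # sine frequency; not a whole number of periods per block
FAR_BIN = 60          # a point far away from the tone (illustrative choice)


def hann(n_samples):
    """Hann window (periodic form)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_samples)
            for n in range(n_samples)]


def magnitude_spectrum(block):
    """Magnitudes of points 0..N/2, computed with a direct O(N^2) DFT."""
    n_samples = len(block)
    mags = []
    for k in range(n_samples // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n, x in enumerate(block))
        mags.append(abs(acc))
    return mags


def level_db_relative_to_peak(mags, k):
    """Level of point k in dB relative to the strongest point."""
    return 20.0 * math.log10(max(mags[k], 1e-12) / max(mags))


sine = [math.sin(2.0 * math.pi * TONE_HZ * n / SAMPLE_RATE)
        for n in range(BLOCK_SIZE)]

# Rectangular window: the block is used as it is.
rect_mags = magnitude_spectrum(sine)

# Hann window: each sample is multiplied by the window before the transform.
hann_mags = magnitude_spectrum([w * x for w, x in zip(hann(BLOCK_SIZE), sine)])

print("peak point (rectangular):", rect_mags.index(max(rect_mags)))
print("peak point (Hann):       ", hann_mags.index(max(hann_mags)))
print(f"leakage at point {FAR_BIN}, rectangular: "
      f"{level_db_relative_to_peak(rect_mags, FAR_BIN):.1f} dB")
print(f"leakage at point {FAR_BIN}, Hann:        "
      f"{level_db_relative_to_peak(hann_mags, FAR_BIN):.1f} dB")
```

Both spectra peak at the same point, but far away from the tone the Hann-windowed block shows a much lower level than the unwindowed one. The price is a wider main peak, which is one reason why different windows suit different measurements.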

Here you can see the effects of different window functions on a 1000Hz sine:

snapshots show from left to right: flat top – Blackman-Harris – Blackman – Hann – Hamming – rectangular

left to right: rectangular – Hamming – Hann – Blackman – Blackman-Harris – flat top

zoomed in from left to right: flat top – Blackman-Harris – Blackman – Hann – Hamming – rectangular

6 Responses to fft & windowing

  • Iain Campbell

    Replied on: 02/02/2013, 15:28

    Wish the app worked better on iPhone. Menus need more explanation.
    No max or average values.
    Flash opening screen shows 2 displays, iPhone can only access 1.
    Menus purpose unknown:–
    #1. Response – Imp.
    #2. Weighting ABCoff
    #4. Leq on off

    • dan

      Replied on: 02/02/2013, 16:36

      hi Ian,

      To get access to the frequency view on iPhone you need to turn it into landscape mode (make sure the autorotation is not locked). The max SPL and frequency (only in frequency view) values are shown when you use the hold function. An average measurement is made when you use the Leq function. Response times are for measuring SPL. They define how long the interval of analysis is: imp = 35 msec, fast = 125 msec, slow = 1 sec. A, B, C and off are the weighting curves. You can see what they are for here: http://en.wikipedia.org/wiki/A-weighting

      Leq is an average measurement method. It stands for equivalent continuous sound level. You start it by turning it on and stop it by turning it off.

      Daniel

      • SJ

        Replied on: 15/04/2014, 17:00

        Reading the available material, it seems the limit of measurable dB is 120. What we are trying to accomplish with Analyzer and a MicW i436 is to measure the report of a gun, which is going to be over 130 dB. Is Analyzer able to help us with this task?

        • dan

          Replied on: 16/04/2014, 10:51

          Hello,

          I am not sure whether the i436 can measure that much SPL. If it can, I am quite sure that the amplifier of the iPhone will distort. The only way I can imagine is an (electric) pad for the mic which lowers the level (-20 dB). Then you would just have to add the pad's attenuation to your measurement.

          Daniel

  • Michael G

    Replied on: 01/05/2013, 00:01

    For the last couple weeks there are missing vertical bars in the low frequency display on my iPhone. The update of 29 April did not change that. What is the problem? Am I doing something wrong?

  • Michael G

    Replied on: 24/06/2013, 09:35

    It has been almost TWO MONTHS and you haven’t answered my question. Please refund my monies.
