Background
Included here is a simple Python script to analyze and manipulate an audio signal by working with individual frequency bands. By implementing this method of tone control (adjusting the amplitudes of different bands), we can improve perceived sound quality, as an audio engineer would with a raw sound sample.
Setup
Libraries that may be needed on Debian-based GNU+Linux
sudo apt-get install libportaudio2
Install python libraries
pip install -r requirnments.txt
Run
python3 main.py
#or
python3 main.py --alien.wav
View Source
The overall flow of this objective is roughly as follows:
- Load the wave file into memory, handling errors or stereo signals.
- Figure out how many 1024-sample window frames fit in the signal. Perform an FFT to get a general idea of the energy level.
- Iterate over all of the frames and:
  - Apply a Hann window to the frame
  - Calculate the FFT values and frequencies of the frame
  - Use those to get the three desired band_energies (TODO)
  - Perform adjustments on each frame
  - Reconstruct the frames with inverse_fft
- Reconstruct the signal
- Write the audio
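The per-frame steps above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the band edges, the gains, the 50% overlap, and the names `to_mono` and `process` are all assumptions made for the example, and reading and writing the WAV file is left out.

```python
import numpy as np

FRAME = 1024  # window size named in the steps above

# Hypothetical band edges (Hz) and per-band gains; the README does not state them.
BANDS = [(0, 300), (300, 3000), (3000, None)]
GAINS = [1.5, 1.0, 0.5]


def to_mono(samples):
    """Average the channels of a (n_samples, n_channels) array into one signal."""
    samples = np.asarray(samples, dtype=np.float64)
    return samples.mean(axis=1) if samples.ndim == 2 else samples


def process(signal, rate, gains=GAINS, frame=FRAME):
    """Window each frame, scale its bands, and overlap-add the frames back."""
    hop = frame // 2  # 50% overlap (an assumption, not taken from main.py)
    # Periodic Hann window: copies shifted by half a frame sum to exactly 1,
    # so unmodified frames add back up to the original signal.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame) / frame)
    freqs = np.fft.rfftfreq(frame, d=1.0 / rate)
    n_frames = max(0, 1 + (len(signal) - frame) // hop)
    out = np.zeros(len(signal))
    energies = np.zeros((n_frames, len(BANDS)))
    for i in range(n_frames):
        start = i * hop
        spectrum = np.fft.rfft(signal[start:start + frame] * window)
        for b, (lo, hi) in enumerate(BANDS):
            mask = freqs >= lo if hi is None else (freqs >= lo) & (freqs < hi)
            # Band energy before adjustment: sum of squared magnitudes.
            energies[i, b] = np.sum(np.abs(spectrum[mask]) ** 2)
            spectrum[mask] *= gains[b]
        # Inverse FFT of the adjusted frame, added into the output signal.
        out[start:start + frame] += np.fft.irfft(spectrum, n=frame)
    return out, energies
```

With all gains set to 1.0 the output matches the input everywhere except the first and last half frame, which only one window covers. That makes a quick sanity check before trying other gain values.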
Included Audio Sources
airplane.wav: Royalty-free WAV from Daniel Simion on soundbible.com. My first impression of this one is that lower frequencies dominate.
alien.wav: Royalty-free WAV from Daniel Simion on soundbible.com. Higher frequencies seem to dominate here.
Reflections, Results, Analysis
This objective at first sounded slightly easier than it turned out to be. To be more specific, the biggest challenge was understanding the use of windowing and window size. While all other steps can be defined in an algorithm, the choice of window size and window function is less deterministic, and its effectiveness varies with the input. The best window is often found by trying a few with different sizes until the frequencies you are interested in show up in the result.
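One way to make that trial and error more concrete is to compare windows on a known test tone. The sketch below assumes a 44100 Hz sample rate and a 1 kHz sine (neither taken from the included audio) and uses hypothetical helper names. For each window it reports the width of one FFT bin and how much energy leaks into bins far from the tone.

```python
import numpy as np

RATE = 44100  # assumed sample rate; in practice use the rate from the WAV header


def spectrum_db(tone_hz, size, use_hann):
    """Magnitude spectrum (dB relative to its peak) of one frame of a sine."""
    n = np.arange(size)
    frame = np.sin(2 * np.pi * tone_hz * n / RATE)
    if use_hann:
        frame = frame * np.hanning(size)
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(size, d=1.0 / RATE)
    return freqs, 20 * np.log10(mag / mag.max() + 1e-12)


def leakage_above(tone_hz, size, use_hann, offset_hz=2000.0):
    """Highest level (dB) of any bin more than offset_hz away from the tone."""
    freqs, db = spectrum_db(tone_hz, size, use_hann)
    return db[np.abs(freqs - tone_hz) > offset_hz].max()


for size, use_hann in [(1024, True), (256, False)]:
    name = "Hann" if use_hann else "rectangular"
    print(f"{name:11s} size={size:4d}  bin width={RATE / size:6.1f} Hz  "
          f"leakage beyond 2 kHz={leakage_above(1000.0, size, use_hann):6.1f} dB")
```

A larger window gives narrower bins (finer frequency resolution) at the cost of time resolution, and the Hann taper keeps a tone's energy from spreading into distant bins. A rectangular window does nothing to limit that spread.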
A Hann window of size 1024 vs. a rectangular window of size 256, on a small sample of airplane.wav in Audacity.