Created November 15, 2010 20:43
<las> pi3r: PCM audio data is really very, very simple
<las> pi3r: the simplest case is a mono (single channel) data buffer. to "mix" two such buffers you really do just add the values in each buffer
<pi3r> but it's going to "overflow"
<las> pi3r: interleaved data is slightly more complex, but not for mixing purposes if they are both interleaved: you still just add them
<pi3r> or a thing la that?
<pi3r> s/la/like
<las> pi3r: the overflow issue is why JACK and most other sensible APIs use floating point
<las> pi3r: for integer format samples, yes, overflow is potentially an issue, and there is no simple, trivial "always do this" rule that can be used
<pi3r> so if i have 2 totally different buffers coming from 2 different inputs (but with the same format)
<las> pi3r: digital audio engineering evolved a variety of different solutions, none of which are compelling. float is compelling so that is what we use
<pi3r> i can do finalBuf[i] = buf1[i] + buf2[i]
<pi3r> ?
<las> pi3r: if it's floating point data, then certainly, yes, you can do that
<pi3r> las, it seems so easy, i'm reading about that and i can't understand how it works
<las> pi3r: you don't understand enough about time domain and frequency domain :)
<pi3r> hum, clearly not
<pi3r> :D
<las> pi3r: then you should just accept for now that it Just Works That Way
<pi3r> but it's hard :D
<pi3r> i want to understand ! maybe it's why i don't
<pi3r> :D
<las> pi3r: i can try to explain it quickly
<las> pi3r: lets start with two actual acoustic pressure waves approaching your ear from sources
<pi3r> i don't want to bother you but if you have time
<las> pi3r: the air pressure is varying, becoming greater and lesser
<pi3r> yeah
<las> pi3r: (actually, lets just start with 1 acoustic pressure wave)
<pi3r> ok
<las> pi3r: now, there is only one wave, only one set of variations in air pressure approaching your ear
<las> pi3r: but, as represented by the elegance of Fourier's theorem, this single wave can represent the summation of all the individual waves used to create the recording
<las> pi3r: that is, within that one acoustic pressure wave's variations are *ALL* the variations of the (say) bass guitar, piano, singer and theremin
<pi3r> hum
<las> pi3r: your ear is able to carry out an analog equivalent of fourier analysis, extract all the frequency components (well, not all, but thats a physiological issue) and thus you can hear individual distinct sounds
<las> pi3r: even though only a single pressure wave arrived at your ear
<las> pi3r: so, lets step back to a simple setup where there are two sources making sounds. if you want, make them nice and simple. the first generates an acoustic pressure wave that varies at 100Hz
<pi3r> hum it's like i can visualize me ear doing that :)
<las> pi3r: the second does the same at 200Hz
<pi3r> s/me/my
<las> pi3r: do some magic so that we can ignore directional complications
<las> pi3r: now the waves are both flowing toward your ear
<las> pi3r: what happens? you end up with a single acoustic pressure wave containing variations in air pressure that correspond to *both* sound sources
<pi3r> hum seems logical
<las> pi3r: ok
<pi3r> now
<las> pi3r: that is "mixing" in the analog domain. it consists of simply adding the two pressures together
<pi3r> my ears are not threaded
<las> pi3r: they run a large number of frequency-dependent sensors in parallel
<las> pi3r: so, when the air pressure in one wave was "low" and the other one was "high", if they were the same amplitude, we add them together and they cancel out
<las> pi3r: if they were both "high pressure" at a particular point, we'd add them and get an even higher pressure point
<las> pi3r: if they were both low, we'd add them and get an even lower pressure point
<nedko> if jack_lsp says that it cannot connect to jack server and you are sure that jack server is running, then your jack setup is broken :P
<pi3r> hum
<nedko> _nusse: ^^^
<las> pi3r: so, now convert the pressure wave to digital with a converter that measures the pressure value at any point in time
<las> pi3r: you get a series of numbers
<las> pi3r: if you do this with each individual sound source, you will get two series of numbers that represent the air pressure of each sound over time
* deronnax ([email protected]) has joined #jack | |
<las> pi3r: fundamentally, this is what the signal chain of "microphone -> mic-preamp -> analog-to-digital converter" is doing
<las> pi3r: when the pressure is high, it will be represented by a number closer to +1.0
<las> pi3r: when the pressure is low, it will be represented by a number closer to -1.0
<las> pi3r: when the pressure is the same as that of the static air, it will be represented by a number close to 0.0
<pi3r> hum i was missing the link between time and air pressure
<las> pi3r: take the two digital representations of the two different sounds and mix them
<las> pi3r: all you have to do is add up the numbers
<las> pi3r: the result is a new set of numbers representing the time domain variation in air pressure
<pi3r> i understand why working with float is so important
<pi3r> now
<pi3r> ~logs
<pi3r> so if i hit an overflow, it's really bad luck
<pi3r> ss/it/if
<las> pi3r: correct.
<las> pi3r: the most common approach with integers involves dividing each signal by the number of signals
<las> pi3r: this more or less ensures no overflow, but it doesn't actually produce the desired effect most of the time (both signals become notably quieter)
<pi3r> then i'll need to resample
<pi3r> right?
<las> pi3r: eh? you mean convert to float?
<pi3r> not resample sorry
<pi3r> normalize
<JackWinter> pi3r: since you want to send the signal to jack and jack expects floating point, you might as well do that before mixing :)
<las> pi3r: yes
<las> pi3r: JACK expects normalized floating point
<pi3r> right
<las> pi3r: there are some contentious minor details about normalization, but the basic approach is to divide by 2^(nbits-1) where nbits is your integer sample bit width (8, 16, 24 typically)
<pi3r> JackWinter, but the buffers are coming from jack, so they are already normalized, right?
<pi3r> las, duly noted