The C64 Digi ~ C=Hacking #20
=============
The C64 Digi
=============

Robin Harbron
Levente Harsfalvi
Stephen Judd

Introduction
------------
Digis -- digitally sampled audio -- are fairly common on the 64. This is
meant to be a comprehensive article on digis: how they work, examples,
different playback methods on the 64 (volume register and Pulse Width
Modulation), and some tricks. We'll even show you how to play 6-bit and
even 8-bit digis in high quality on a 64, which is really pretty neat to
hear.

The first part discusses digis from a fundamental point of view -- just
what a digi is, acoustic signals, and things like that. The most common
method of playing digis is via the volume register at $d418, and the next
two sections are devoted to this technique. Section two discusses some
SID fundamentals, and the reason why $d418 may be used for digis (and why
later-model SIDs don't play digis correctly); Section three discusses
$d418-digis from a software perspective: how to play them, tricks for
improving them, how to boost digis on 8580 SIDs, and how to detect what
kind of SID (6581 or 8580) is in the machine. The fourth and final part
of this article discusses pulse width modulation, and includes example source
code and a binary that plays a true 7-bit digi at around 16KHz -- something
which, we think, has never been done before.

Without further ado...
===============
Digis: Overview
===============

The whole point of playing a digi on a 64 is to provide something
for your ear to hear. So let's begin by discussing just what an acoustic
signal is and how that relates to digis.

Probably everyone knows that "sound" is how your ear responds to
changes in air pressure -- that is, when you clap your hands together,
it compresses the air between your hands in a special way, and that
higher pressure moves outwards into the surrounding air (since it's at
lower pressure). That pressure change propagates along and when it
encounters your ear it causes the ear drums to move, causing three little
bones to move, causing some fluid to move, causing tiny, exquisitely
sensitive hairs to move, transmitting a signal that your brain converts
to "sound".

An audio speaker also changes the air pressure in response to a
signal. If you take a coil of wire and change the voltage on it, it
generates a magnetic field; if a magnet is placed inside the coil, the
changing magnetic field will place a force on the magnet, causing it to
move, causing some air to be pushed along, causing a change in pressure,
causing a signal to propagate to your ear which your brain interprets as
Van Halen. All a stereo (CD player, etc.) does is send a varying voltage
signal to the speaker. As that voltage level goes up and down the magnet
moves back and forth, and so the speaker converts that electrical energy
into an acoustic wave.

For us, the trick is to coax SID into sending a specific voltage
signal to the speaker, the way a stereo or CD player might. And a CD player
is of course a very apt comparison, since it is itself a digi player.

Just for reference, a really good pair of ears can hear signals from
around 20Hz to 22KHz, with the sensitivity dropping considerably outside
of around 100Hz to 10KHz. A CD player has a playback rate of 44.1KHz, and
the highest frequency SID can generate from the frequency registers is
around 4KHz. If you've ever set SID to maximum frequency and heard just
how high 4KHz is, you can appreciate that even 10KHz is _really_ high, and
actually quite difficult to hear. In human speech, most of the information
content of vowel sounds is contained in the range 300Hz - 3KHz, and above
around 1KHz for consonant sounds; most information in musical sounds is in
the range 100Hz - 3KHz.
Discrete Sampling
-----------------

To understand digis a little better, consider the more general
case of a discretely sampled signal -- a continuous signal sampled at
discrete time intervals. Let's say we had some device producing a
_continuous_ sinusoidal signal in time:

         *  *
       *      *
      *        *
     *          *
    *            *
   *              *                  *
  *                *                *
                    *              *
                     *            *
                      *          *
                       *        *
                        *      *
                          *  *
  -----------------------------------------------> time

(yes, I did miss my calling as an ASCII artist)
To turn the signal into a _discrete_ signal, we simply sample the
signal at discrete intervals of time. For example, let's say the above
signal lasts one second, and is input into a device which measures the
value every 1/4-second. The device will spit out four numbers: 0, 1, 0,
and -1:

            *
  *                   *
                                *

The sampling frequency here is four samples per second -- 4 Hz. If we were
to then play back this signal at the sampling frequency, we'd get a signal
like

            **********
  **********          **********
                                **********

So one thing sampling does is to "staircase" a signal -- the sample becomes
some sort of "average" value over the sample period. Increasing the sample
rate -- taking more samples per second -- will smooth things out, and the
sampled signal will look (and sound!) more like the original signal.
Now let's say we just took two samples in that one second -- 2 Hz sampling
rate -- and just happened to catch the signal at its maximum and minimum
values (the peak and trough). Upon playback, the signal would look like

  *********************
                      *
                      *
                      *
                      *
                      *
                      *
                      *********************

That is, a square (pulse) wave. If you're on the ball, you've noticed
that the frequency of the new signal is 1 Hz -- exactly half the sampling
frequency. This is also called the Nyquist frequency. In general, the
_maximum_ frequency that can be captured in a discrete sample (called the
Nyquist critical frequency) is half the sampling frequency -- as you can
see above, it takes two data points to get a single (nonzero) frequency.
So, for example, the highest frequency a CD player -- which has a sampling/
playback rate of 44.1KHz -- can capture is about 22KHz, well above the range
of normal human hearing.

Thus, increasing the sample rate increases the frequency range captured
in the discrete signal. This is why a digi at a high sample rate in general
sounds better than a digi sampled at a low sample rate.
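
To make the numbers above concrete, here is a minimal Python sketch (not part of the original article; the names sample_sine and nyquist are made up for illustration). It samples a 1 Hz sine at 4 Hz starting from t=0, then at 2 Hz starting from the peak, and computes the Nyquist frequency for a given sample rate.

```python
import math

def sample_sine(freq_hz, rate_hz, duration_s, phase_s=0.0):
    """Sample sin(2*pi*freq*t) every 1/rate seconds, starting at t = phase_s."""
    count = int(round(duration_s * rate_hz))
    return [math.sin(2.0 * math.pi * freq_hz * (phase_s + n / rate_hz))
            for n in range(count)]

def nyquist(rate_hz):
    """Highest frequency that a given sample rate can capture."""
    return rate_hz / 2.0

# 1 Hz sine, four samples per second, starting at t = 0.
# Values are rounded so that sin(pi), which is not exactly 0 in floating
# point, shows up as 0.
four_hz = [round(v, 6) for v in sample_sine(1.0, 4.0, 1.0)]

# 1 Hz sine, two samples per second, starting at the peak (t = 1/4 s).
two_hz = [round(v, 6) for v in sample_sine(1.0, 2.0, 1.0, phase_s=0.25)]

print(four_hz)        # the four samples from the text
print(two_hz)         # peak and trough only: a square wave on playback
print(nyquist(2.0))   # frequency of that square wave
```

Starting the 2 Hz sampling at t=0 instead of the peak would catch the sine at its zero crossings and produce silence, which is why the text says we "just happened" to catch the peak and trough.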
BUT -- there is more to life than sample rate: there is also sample
resolution. The sample resolution -- 4-bit samples, 8-bit samples, etc. --
determines how accurately the sample measures the actual signal. For
example, let's say we sample sin(x) when x=0.5:

        sin(0.5) = 0.4794255...

No matter what sample resolution we use, there will always be some error
in the measurement, and the _true_ value of the sample will be the
_measured_ value plus some error.

In general the sampling errors are random and uniformly distributed, so
the sampled signal corresponds to the original signal plus some noise (the
random errors). That is why you almost always hear some sort of hiss on
a normal C64 digi, which uses a resolution of 4 bits per sample.

So, increasing the sample _resolution_ decreases the amount of noise introduced
into the sampled signal (and increases the dynamic range), and increasing the
sample _rate_ increases the frequency range.
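
As a rough illustration of the resolution point, the following Python sketch measures sin(0.5) with 4-bit and 8-bit resolution and looks at the error. The quantize function and its mapping of [-1, 1] onto evenly spaced levels are assumptions made for this example, not something taken from a particular sampler.

```python
import math

def quantize(value, bits):
    """Round a value in [-1, 1] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)                 # spacing between adjacent levels
    index = round((value + 1.0) / step)       # nearest level number
    index = max(0, min(levels - 1, index))    # clamp to the valid range
    return index * step - 1.0

true_value = math.sin(0.5)                    # 0.4794255...

for bits in (4, 8):
    measured = quantize(true_value, bits)
    error = true_value - measured             # true = measured + error
    print(bits, "bits: measured", measured, "error", error)
```

The error can never exceed half a step, and the step shrinks by a factor of roughly 16 when going from 4 to 8 bits, which is where the drop in hiss comes from.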
If you're _really_ on the ball, you've noticed that the 1-Hz square pulse
above actually contains frequencies higher than 1Hz, simply because a
square pulse contains higher harmonics in addition to the 1Hz fundamental
frequency. And you've also no doubt realized that the sampled pulse wave
would sound different than the original sine wave (due, of course, to the
added harmonics) -- it's at the right frequency, but it will sound like a
pulse wave instead of a sinusoid.

Have we somehow broken the Nyquist limit?

The answer is no, because of a nifty thing called the Discrete Sampling
Theorem, which says that, given the samples h_n of a bandwidth-limited
function h(t), the original function h(t) is given by

        h(t) = dt * Sum{ h_n * sin(2*pi*f_c*(t-n*dt)) / (pi*(t-n*dt)) }

where dt is the sampling period and f_c is the cutoff/critical frequency.
What this means is that the original signal can be _reconstructed_ from the
discrete samples, not that it is _equivalent_ to the discrete samples.
The Nyquist limit is the highest frequency that can be _reconstructed_ from
the discrete samples, not the highest frequency that will be produced if you
"staircase" the discrete samples through a speaker. If the original
signal is bandwidth-limited, and there are at least two samples for the
highest frequency, then the signal can be completely reconstructed.
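
The reconstruction formula can be tried out numerically. The sketch below is only an approximation under stated assumptions: it takes f_c = 1/(2*dt), and it truncates the infinite sum to the samples n = -500..500, so the result is close to the original signal but not exact. The function name reconstruct is made up for this example.

```python
import math

def reconstruct(samples, first_n, dt, t):
    """Evaluate dt * Sum{ h_n * sin(2*pi*f_c*(t-n*dt)) / (pi*(t-n*dt)) }.

    samples[k] holds h_n for n = first_n + k.  Assumes f_c = 1/(2*dt).
    """
    f_c = 1.0 / (2.0 * dt)
    total = 0.0
    for k, h_n in enumerate(samples):
        x = t - (first_n + k) * dt
        if abs(x) < 1e-12:
            kernel = 2.0 * f_c        # limit of the kernel as x -> 0
        else:
            kernel = math.sin(2.0 * math.pi * f_c * x) / (math.pi * x)
        total += h_n * kernel
    return dt * total

# A 1 Hz sine sampled at 8 Hz, keeping samples n = -500 .. 500.
dt = 1.0 / 8.0
first_n = -500
samples = [math.sin(2.0 * math.pi * n * dt) for n in range(first_n, 501)]

between = reconstruct(samples, first_n, dt, 0.3)   # t lies between two samples
exact = math.sin(2.0 * math.pi * 0.3)
print(between, exact)
```

Note that the value at t=0.3 is recovered even though no sample was taken there, and that the result is a smooth sine rather than the staircase a simple playback routine would produce.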
Since a "normal" digi contains all these extra frequencies, shouldn't a digi
sound "different" than a "true" analog signal? Sure. On the other hand, many
of the extra frequencies are beyond the range of human hearing, and the rest
can often be removed using a filter -- all CD players filter the output, for
example. So sometimes it is worthwhile to turn on a low/band pass filter
when playing a C64 digi, especially at lower sample rates.

And that more or less summarizes basic discrete sampling theory.
=========================
D418 Playback -- Hardware
=========================

The SID contains both analog and digital subparts on one silicon plate -- in
other words, it is a mixed signal device.

At the time, the SID was certainly the best of the microcomputer sound chips.
This may be mostly due to its mixed signal design, which the designers used
to solve certain problems.

The hard thing in a sound generator design is to implement waveforms, volume
control, and mixing. Things like that don't really fit into the digital
'either 0 or 1' philosophy, unless a lot of data bits and arithmetic functions
are involved. In a fully digital sound chip, the waveforms could be generated
by ROM lookup tables. The mixing function could be derived from binary
addition, while the volume control from division or multiplication. Unless the
sound functionality is greatly simplified, the arithmetic functions must be
present and they must be implemented in hardware. Finally, the D/A conversion
could be done by (fast) pulse width modulation just at the output stage.
(Most of today's wavetable sound cards operate like this).

This method implies heavy arithmetic hardware, which was not an option for
designers back then. Still, most sound chips were fully digital, and all
suffer from the required compromises (e.g. generating square waves only,
no dedicated channel volume control, etc. - both TED and the VIC-I are obvious
examples).

The solution that one finds in the SID design is very straightforward: mixing
and variable volume level is problematic in a digital circuit when dealing
with waveforms, so simply avoid doing it. In the SID, only the microcomputer
interface, the registers, the oscillators (phase accumulating oscillators),
and other controller logic are digital; the mixing and volume control parts
are fully analog. There are digital to analog converters providing analog
voltage levels from the digital state variables. The SID D/As are in fact
'multiplying' D/As, having an analog input (AIN), an input base voltage
(IBASE), and a digital input. They operate by amplifying the input voltage
offset (AIN-IBASE) by a factor proportional to the number on the digital input
and adding this offset back to the base level.

This mixed signal design also allowed some other features to be implemented.
The most important one is the analog filter (that is, a two integrator loop,
bi-quadratic filter, according to Yannes). With that, the SID points beyond a
home computer sound chip - it is a true analog subtractive synth (marketing as
such was cancelled because of manufacturing capacity reasons).
Here is a detailed map of the SID internals (analog path; probably my most
beautiful ASCII ever :-D). Info can be found in the SID patents (US 4,677,890;
1986), the MOS 6581 technical document (can be found somewhere on the Net), or
the back of the Programmer's Reference Guide (PRG).

[Figure: block diagram of the SID analog signal path, listed block by block]

  Voice 1:  OSC1 + wave sel. ($d400-03, $d404.[1-7]) --12bit--> wave D/A
            ADSR cnt + env. log. ($d405-06, $d404.0)  --8bit--> env. D/A
            wave D/A ---> env. D/A --> routing switch $d417.0

  Voice 2:  OSC2 + wave sel. ($d407-0a, $d40b.[1-7]) --12bit--> wave D/A
            ADSR cnt + env. log. ($d40c-0d, $d40b.0)  --8bit--> env. D/A
            wave D/A ---> env. D/A --> routing switch $d417.1

  Voice 3:  OSC3 + wave sel. ($d40e-11, $d412.[1-7]) --12bit--> wave D/A
            ADSR cnt + env. log. ($d413-14, $d412.0)  --8bit--> env. D/A
            wave D/A ---> env. D/A --> routing switch $d417.2
            The direct path to the mixer has one more switch, $d418.7
            (=0 connected, =1 disconnected).

  EXT IN:   routing switch $d417.3

  Routing switches: =0 sends the signal directly to the analog mixer,
                    =1 sends it to the filter input.

  FILTER:   Cutoff freq reg ($d415-16)   --11bit--> Cutoff D/A
            Resonance reg. ($d417.[4-7]) --4bit-->  Reson. D/A
            Outputs LP, BP, HP, each through a filter type select switch
            ($d418.[4-6] respectively; =1 connects the output to the mixer).

  Analog mixing ---> Master volume D/A (4bit, $d418.[0-3]) ---> AUDIO OUT

$d418 digis
-----------

The most common method of playing a digi is to use the register at $d418.

When someone plays a digi using the master volume register, the situation is
similar to the waveform D/A converters. Both D/As are multiplying D/As --
signal amplifiers whose amplification is proportional to the input digital
number. If there is a nonzero signal offset on the D/A input it will be
multiplied proportionally by this number.

Playing digis with $d418 is possible because there is indeed a relatively
large DC voltage offset on the master volume D/A. This offset is present
right from the moment when the SID is powered up.

Where can this DC offset come from?

There is a mixer before the master volume D/A (see figure). If there's a DC
offset on the D/A input, it must come from there. ...And going further,
the DC offset on the mixer must also come from somewhere. But where?

Signals come from the three ADSR volume D/As, the EXTIN line, and the three
outputs of the filter. Fortunately, all paths that go to the mixer have analog
switches (all paths can be disconnected from the mixer individually, if that's
needed).

The above analog switches are driven by the filter selection bits ($d417 bits
0-3), the voice 3 off bit ($d418 bit 7) and the filter type selection bits
($d418 bits 4-6).

After a reset, the filter selector bits are all 0 (all signals are routed
towards the master mixer), the 'voice 3 off' bit is 0 (so voice 3 is still
connected to the mixer), and the filter type selector bits are 0 (filter
outputs are unconnected). In this state, only EXTIN and the three SID voice
signals are present on the mixer. EXTIN can be eliminated as the source since
it has no DC offset (as long as the computer was not hacked, see notes on the
8580).

The ADSR volume D/A is similar to the previously mentioned multiplying D/As.
If the digital number on the input is 0, the input analog signal offset can't
pass through (as measurements verify). This is the case when SID is reset,
setting the envelope counters to zero. Therefore, nothing behind the ADSR
multiplying D/As can have any effect on the DC offset of the mixer.

So, the DC offset must come from the ADSR multiplying D/As. Another
measurement shows that even the mixer itself has a small DC offset.
Tests and results
-----------------

I did some tests that support this theory. They were done 'by hand', by simply
using a digital voltmeter + the FC3 monitor.

The chip was a 6581(R1), 0883, Hong Kong (an early 6581).

When turned on the voltage on the AUDIO OUT was about 5.5 volts (slowly
decreasing as it warmed up, stopping at about 5.43 after some 10 mins - all
subsequent tests were done after this time period).

Writing $0f to $d418 raised the output voltage to 6.15 volts. Therefore, the
maximum output amplitude that can be achieved when playing digis is 0.72 volts
in this 'mode' (without wiggling any other SID settings to achieve higher
voltage levels) -- remember that what counts is the maximum voltage
_difference_, not the maximum absolute voltage.

The next test is to determine if the mixer has its own DC offset (with all
possible paths disconnected). This can be done as follows. With the volume at
maximum (to maximize any effect), all voices are routed towards the filter
($d417 = $0f), while making sure that the filter outputs are not routed to the
mixer ($d418 = $0f). In this state no paths can drive the mixer. The result
is 5.39 volts. When the volume changes, the output also changes towards the
previous 5.43 volts, so there is a (very small) DC offset from just the mixer.

What could be the DC offset value of each individual SID voice (i.e. the base
level difference of the multiplying D/As)? Doing the above, but leaving one
voice routed to the mixer ($d417 = $0e, $0d or $0b) gives 5.69 volts.
5.69-5.43 = 0.26 volts, and 5.43 + 3*0.26 = 6.21, almost 6.15 volts.
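
For anyone who wants to repeat the bookkeeping with their own voltmeter readings, the arithmetic above is sketched below in Python. The constant names are made up for this example; the voltages are the ones quoted in the text, and the calculation follows the same steps (one voice's contribution is taken relative to the 5.43 volt level, then multiplied by three).

```python
# Readings from AUDIO OUT quoted in the text (volts).
BASELINE = 5.43      # level after warm-up, before touching $d418
ONE_VOICE = 5.69     # volume = $0f, only one voice routed to the mixer
ALL_VOICES = 6.15    # volume = $0f, all three voices routed to the mixer

per_voice = ONE_VOICE - BASELINE           # offset contributed by one voice
predicted = BASELINE + 3 * per_voice       # what three equal voices should give
digi_swing = ALL_VOICES - BASELINE         # usable $d418 digi amplitude

print(round(per_voice, 2), round(predicted, 2), round(digi_swing, 2))
```

The prediction from a single voice lands within a few hundredths of a volt of the measured three-voice level, which is about as good as can be expected from readings taken by hand.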

To determine if the ADSR multiplying D/As act as expected, I used pulse
waves with zero frequency and 0 or $fff pulse width (two cases), to make the
input signal of the ADSR multiplying D/A the minimum and maximum possible
level. After careful checking, the output changed by a few hundredths of a
volt (about 0.01 volt per voice). So the D/A doesn't close up completely, but
it's still O.K.

To prove that these offsets are equal for all voices, I did another test. Some
people know that the filter inverts phase (multiplies the input signal by -1).
Machine is reset, $d417 = $01, $d418 = $9f. (Voice 1 is routed through the
filter, voice 3 is cut off from the mixer completely ($d418.7), low pass
filter is selected, volume = $0f). The output voltage was 5.41 volts, just
very slightly below the "default" output level. This means that the DC of
voice 2 + (-1*) DC of voice 1 resulted in about 0 relative offset. Doing
similar tests proved that the DC offsets for the voices match each other
almost exactly (within a few hundredths of a volt).

These measurements all support the idea that the DC offset comes from the
ADSR multiplying D/As, that the offset is mostly independent from the waveform
D/A converters (as long as sustain levels are 0), and that the offsets are
equal for all voices. In addition, a small DC offset is supplied by the master
signal mixer itself.

What if we try different sustain settings? For this test, set the volume to
maximum, as usual. Set the sustain level to $0f for all voices ($d406, $d40d,
$d414 = $f0). Start the attack, but with no waveform selected ($d404, $d40b,
$d412 = $01). The output level is now 5.21 volts, a little bit below the '0'
offset of the audio output! (Doing the test with just one voice (all
others disconnected), the output is 5.29 volts).

Finally, we can do some experiments with the pulse waveform. The pulse
waveform is useful for these tests, since at zero frequency we can set both
the minimum and the maximum constant DC levels at the voice D/A just by using
the pulse width registers. Reset the computer. Set voice 1 to zero frequency,
pulse width $0fff, sustain level 15, and $d404=$41 (pulse waveform + gate on).
Route only voice 1 through the mixer ($d417 = $0e). The output voltage is
similar to the test when no waveform was selected -- 5.29 volts! This seems to
show that "waveform accu = $0fff" is the same as when no waveform is selected
(i.e. the waveform D/A digital input pins are pulled high when they're not
driven, as seen in most other NMOS chips).

When the pulse width is 0 in the above test the output changes to 6.34 volts.
This seems to be strange (a multiplying D/A giving higher signal level for
multiplying something by 0).

Now, when the ADSR multiplying D/A is closed, the output is 5.70 volts. When
it's fully open, the output changes from 5.29 volts (wave acc = $fff) to 6.34
volts (wave acc = 0). One reasonable answer is that the base voltage of the
waveform D/A is higher, and the analog input is tied lower than the base
voltage of the ADSR D/A -- the effect is that the SID waveforms will lie
'around' the ADSR multiplying D/A base voltage, more or less symmetrically.
This was surely done intentionally, to reduce absolute voltage levels (for
linearity). In the 6581, the big DC offset is probably a result of having the
ADSR D/As and the master volume D/A at different base levels (the difference
appears as true DC offset on the master volume D/A). If both were the same
(presumably at VDD/2), and the waveform D/A parameters were selected similarly
(operation is symmetric to VDD/2), there would be no final DC offset at all.
Rather like the 8580...

Other issues
------------

So now we know why $d418 digis are possible - but still, there are some things
to note.

The DC offset on the master volume D/A changes with different SID settings,
and whatever affects the DC offset on the mixer will affect the digi
volume. For example, even the filter output signals have a small DC offset.
Just do a test - set the volume to $0f, then simply turn one filter route on
(for example, $d418 = $1f). You'll hear a small click (i.e. a small DC offset
change on the mixer), even if the filter has no input.

Moreover, as seen above, the DC offset can be eliminated completely (just by
SID register settings), leading to no audible digi sound at the output. In
other words, whatever affects the DC offset on the mixer _will_ affect the
digi volume.

One place where this is important is playing a digi with a tune: there's a
constantly changing signal going to the mixer instead of a constant DC offset,
so playing a digi on the master volume also causes distortion for both the SID
voices and the digi sound (since they're cross modulated). To reduce this
effect most '3+1' style SID + digi players play samples by writing 8-offset
sample values to $d418 (i.e. adding 8 to 3-bit sample values and writing this
to $d418 - see players used by Jeroen Tel and other famous composers using
digi). This trick reduces the modulating effect while still maintaining good
digi volume.
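
The 8-offset trick amounts to a simple remapping of the sample data. Here is a small Python sketch of it; the function name is made up, and dropping the lowest bit of a 4-bit sample to get the 3-bit value is an assumption made for this example (a player could just as well start from 3-bit data).

```python
def to_offset_sample(sample4):
    """Map a 4-bit sample (0-15) to a 3-bit sample plus 8 (8-15)."""
    if not 0 <= sample4 <= 15:
        raise ValueError("expected a 4-bit sample")
    three_bit = sample4 >> 1        # assumption: discard the lowest bit
    return three_bit + 8            # master volume stays in the range 8..15

original = [0, 3, 7, 8, 12, 15]
converted = [to_offset_sample(s) for s in original]
print(converted)
```

Since the volume never drops below 8, the SID voices are never scaled by less than about half of full volume, at the cost of one bit of digi resolution.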

The DC offsets can also cause awful clicking at times. For example, the
filter inverts phase. If the filter is currently routed to the mixer, there'll
be a large 'click' (2 times the DC offset) when a voice is on and its routing
is changed to or from the filter.

The 8580
--------

This is a completely redesigned chip. I don't know details, but it was
probably redesigned by the time all other chips in the C64 were redone for
CSG's new manufacturing technology and the C64c. It is a 'better' chip from the
technical side (but in my opinion it sounds crude in comparison to the 6581,
at least the R4 series). The 6581 was designed in months. Bob Yannes had to do
everything from scratch and use the manufacturing technology MOS currently had
(NMOS). And it shows. First, it has high background noise. The DC offsets are
really also a misfeature. The D/A converters are sometimes non-monotonic (at
least, the waveform D/As and the filter cutoff D/A have some drops at the
change of the most significant bits). The op-amps in the active (resonant)
filter are simple, linearized NMOS inverters ;-) (loopbacked, they act like
more or less linear op-amplifiers around VDD/2). And I still haven't mentioned
bugs in the digital side (ADSR envelope bugs). Because of the above, one
probably won't find two identical 6581 chips -- each sounds a little bit
different (mostly due to the filter). Since the active components of the
filter are far from ideal, the filter is strongly nonlinear (the cutoff curve
changes with signal amplitude). On the other hand, these things are what make
the SID sound so unique.

Most of the problems were fixed in the 8580. It has much less background
noise. The chips sound the same (there are hardly any differences between
different 8580s). Most of the DC offset issues (the clicks) were eliminated. It
needs less power, and a lower VDD level. Something was changed also in the
digital logic, but the ADSR part was not touched. The 'combined' waveforms are
a bit different (and more useable from the musician's point of view).

The clicks were reduced, which means that there is no (or no significant) DC
offset on the master volume D/A in the 8580.

(I have not done any measurements, but after listening to a lot of 3+1 channel
type music, I have a strong suspicion that even if sounds are turned on, the
average DC offset on the master volume D/A is still minimal).

To fix this in software, you'll have to wait until the next section of this
article.

To fix this in hardware, people use a simple hack: take a resistor of about
330k and tie the SID EXTIN line to GND through that (directly, beside the
chip, on the mainboard).

The EXTIN line goes directly to the mixer, and thus the master volume D/A, or
can also be routed through the filter. In either case, unless the filter is
disconnected, the above hack will give a pretty large DC offset, similar to
the original 6581s. So, digi sounds can be played :-) (even with SID music
playing simultaneously, similar to the 6581).

This solution is good as a work around, but there's one thing to note: this is
not completely the same as the 6581 ADSR D/A offset voltage. At least, this
offset is negative (should that pin rather be tied to VDD?). Programs that
depend on the 6581's way of DC offsets will not work correctly (but I know of
very few such programs, so at worst you'll experience slightly different digi
sound only occasionally -- but hey, the 8580 sounds different anyway). Another
problem is that when EXTIN is routed through the filter the DC offset may
cause strong distortion since the DC operating point of the filter is changed
-- bad news if the 'semi-linear' amplifiers in the filter are picky about
absolute DC level. Some tunes (not necessarily involving digis) do indeed
route EXTIN through the filter, for noise reduction on older C64s (with the
earlier C64 mainboards that pick up lots of 'digital' background noise from
EXTIN). DC distortion can also occur occasionally for the same reason but on
the master volume D/A (the higher the difference from VDD/2, the greater the
risk of experiencing nonlinearity and clipping distortion).

Some final words
----------------

A lot of this information comes from Dag Lem, who is certainly the No. 1 SID
hacker for me ;-). Take a look at reSID, his SID emulator library (the sources
can be downloaded from somewhere). reSID contains so much reverse engineered
information of the real SID that you won't believe it -- check it out if
you're interested.

=========================
D418 Playback -- Software
=========================

$D418 digis are by far the most common playback method. The volume register
gives 16 different amplitudes (0-15), and so can provide 4-bit digi playback.
In its most basic form, this is an extremely easy routine to code. Simply
load each 4-bit sample, and store it in the volume register ($d418).
Assuming $fd/$fe are pointing to the beginning of a series of samples, the
following code will play it back:

         ldy #0
:loop    lda ($fd),y    ;fetch the next 4-bit sample
         sta $d418      ;write it to the volume register
         ldx #5         ;some delay value
:delay   dex
         bne :delay     ;wait until the next sample is due
         iny
         bne :loop      ;next sample in this page
         inc $fe        ;move on to the next page
         jmp :loop

The ldx #5 would have to be adjusted depending on the speed of the
sample - the lower this number (not including zero) the faster the
sample will play back.
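
To pick a delay value it helps to know how many cycles one pass through the loop takes. The Python sketch below adds up the standard 6502 cycle counts for the loop above. It is only an estimate under stated assumptions: no page crossings in the indexed load or the branches, no cycles stolen by the VIC or by interrupts, the occasional inc $fe / jmp ignored, and a round 1 MHz clock as the default. The function names are made up for this example.

```python
def cycles_per_sample(delay):
    """Cycles for one pass through :loop when the delay value is 'delay'."""
    if not 1 <= delay <= 255:
        raise ValueError("delay must be in the range 1-255")
    fetch_and_store = 5 + 4            # lda ($fd),y + sta $d418
    # ldx #n (2), then dex (2) + bne taken (3) per pass; the final bne
    # falls through and costs 2 instead of 3.
    delay_loop = 2 + 5 * delay - 1
    advance = 2 + 3                    # iny + bne :loop (taken)
    return fetch_and_store + delay_loop + advance

def sample_rate(delay, clock_hz=1_000_000):
    """Approximate playback rate in Hz for a given delay value."""
    return clock_hz / cycles_per_sample(delay)

for delay in (1, 5, 20):
    print(delay, cycles_per_sample(delay), round(sample_rate(delay)))
```

Each step of the delay value changes the sample period by 5 cycles, so this loop can only hit a coarse set of playback rates; finer tuning needs extra padding instructions or a timer.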

There are a number of improvements we could make to this code - first
of all, this method takes twice as much RAM to store the sample as is
necessary. Because we're dealing with 4-bit samples, we can store 2
samples in each byte. This can be handled simply by alternately
masking out the high bits (with AND #15) to play the sample stored in
the low nybble, and by shifting the high nybble down to the low nybble
to play the high nybble (LSR : LSR : LSR : LSR). A lookup table may also
be used to save processor cycles (but use more RAM).
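
The packing scheme is easy to prototype off the 64, for example when converting sample data on another machine. The following Python sketch packs two 4-bit samples per byte and unpacks them again with the same two operations the player uses. Which sample of a pair goes in the low nybble is an assumption made here (first sample low, second sample high); a given player may expect the opposite order.

```python
def pack(samples):
    """Pack 4-bit samples two per byte (first of each pair in the low nybble)."""
    samples = list(samples)
    if len(samples) % 2:
        samples.append(0)                      # pad to an even count
    return bytes((samples[i] & 15) | ((samples[i + 1] & 15) << 4)
                 for i in range(0, len(samples), 2))

def unpack(packed):
    """Recover the 4-bit samples in playback order."""
    out = []
    for byte in packed:
        out.append(byte & 15)                  # AND #15
        out.append(byte >> 4)                  # LSR : LSR : LSR : LSR
    return out

digi = [1, 2, 15, 0, 7, 8]
packed = pack(digi)
print(packed.hex(), unpack(packed))
```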

Another improvement is to move the routine to zero-page, and use
self-modifying code. In general, this results in the fastest digi players.

We should of course have the routine check for the end of the sample --
typically just checking the high byte of the zero-page pointer is enough
(in this case, checking $fe). Typically digis are page aligned anyway, so
just zeroing out the unused part (if any) of the last page is fine.

Finally, it is often important that each sample of a digi is played back
at regular intervals. If the samples aren't played at a steady speed,
extra distortion is audible. In the example above, playback is steady
for a full page (256) of samples - but several extra cycles are added
by incrementing the zero-page pointer to the digi. The situation
worsens when we start adding extra code to check for the end of the
digi, and even the main loop starts getting irregular when we add the
code for the simple form of packing discussed earlier (2 4-bit samples
per byte).

These problems can be solved by careful cycle counting and adding NOP
and harmless BIT instructions in strategic places to make each
iteration the same number of cycles, regardless of which branch is
taken - people who have written a stable raster routine, or done some
Atari 2600 coding have likely done this sort of painstaking work
before.
NMI-driven digis | |
---------------- | |
More commonly, however, we enlist the help of CIA #2 and have it | |
generate regular Non-Maskable Interrupts which we use to call our digi | |
player. This has two important advantages - first, it makes timing
much simpler. Second, it frees your main program to do other
things while the digi is playing "in the background". | |
To experiment, I pulled a 4-bit packed digi from the extras disk | |
included with Super Snapshot 5.22. It's the beginning seconds of the | |
introduction to Classic Star Trek (Space, the Final Frontier). | |
Here's the source for a fairly "frills-free" NMI based digi player, | |
with my comments after blocks of code: | |
start = $1400 | |
end = $7cff | |
freq = 141 | |
ptr = $fd | |
Labels start and end simply point to the beginning and end of the
digi. Freq isn't actually the frequency - it's the number of
processor cycles between interrupts necessary to play the digi at the
desired speed/pitch. If you know the frequency (in Hz) of the digi,
simply divide your CIA clock speed (approximately 1000000 Hz) by the
digi frequency. In this case, the digi runs at approximately 7100 Hz.
We use two zero page locations to form a 16-bit pointer to the current | |
sample in the digi to play. | |
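The division described above is easy to check with a few lines of Python. This is only a sketch of the arithmetic; the function names are hypothetical, and the PAL and NTSC clock figures in the comments are the commonly quoted values rather than anything stated in this article.

```python
def timer_value(sample_rate_hz, clock_hz=1_000_000):
    """Cycles between NMIs for a given sample rate (the 'freq' label)."""
    return round(clock_hz / sample_rate_hz)


def playback_rate(timer, clock_hz=1_000_000):
    """Sample rate that results from a given timer value."""
    return clock_hz / timer


# Assumed clock speeds: about 985248 Hz on PAL, about 1022727 Hz on NTSC.
freq = timer_value(7100)
```

With the article's round figure of 1000000 Hz, a 7100 Hz digi gives a timer value of 141, and a timer value of 141 plays back at roughly 7092 Hz. Passing the real clock speed of your machine shows how far the pitch drifts between PAL and NTSC.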
*= $1000 | |
;disable interrupts | |
lda #$7f | |
sta $dc0d | |
sta $dd0d | |
lda $dc0d | |
lda $dd0d | |
sei | |
This code disables all interrupt sources on both CIAs, reads both
interrupt control registers to acknowledge anything still pending, and
then masks IRQs with SEI.
;blank screen | |
lda $d011 | |
and #255-16 | |
sta $d011 | |
Just like erratically timed code can introduce distortion when a digi is | |
played back, the VIC steals cycles from the processor that can cause | |
interrupts to not occur precisely when you'd like them to. This routine will | |
work without the screen blanked, but the extra noise introduced when the | |
screen is on is noticeable when the time between samples is less than around | |
2.5-3 times the time the processor is stopped. Another option is to use some | |
multiple of the raster timing as the sampling rate, and start the routine on a | |
non-badline, to ensure that the interrupts never occur on a badline. (A final | |
option is to use a raster-driven interrupt for the digi; with the SCPU, it is | |
actually possible to drive an IFLI display and play a digi at the same time, | |
badlines and all -- email Robin for more info, or maybe wait for a future | |
article!). But the simplest thing to do is to blank the screen :). | |
;switch out roms | |
lda #$35 | |
sta 1 | |
;point to our player routine | |
lda #<nmi | |
sta $fffa | |
lda #>nmi | |
sta $fffb | |
Unless using the KERNAL routines is necessary in my program, I always | |
switch out the ROMs. One of the biggest benefits is that our NMI | |
routine will be immediately called, rather than using $0318/$0319 and | |
waiting for the KERNAL to indirectly call your routine. | |
;initialize player | |
lda #<start | |
sta ptr | |
lda #>start | |
sta ptr+1 | |
ldy #0 | |
sty flag | |
lda (ptr),y | |
sta sample | |
This section simply initializes the various memory locations that the | |
player uses - sets ptr/ptr+1 to point to the beginning of the digi, | |
loads the first sample, and clears the flag that handles the | |
alternating between the lower and upper nybble of the packed samples. | |
;setup CIA #2 | |
lda #<freq | |
sta $dd04 | |
lda #>freq | |
sta $dd05 | |
Sets Timer A on CIA #2 to freq. | |
lda #%10000001 | |
sta $dd0d | |
Enables Timer A interrupts on CIA #2. | |
lda #%00010001 | |
sta $dd0e | |
Force-loads Timer A from its latch and starts it in continuous mode.
Whenever Timer A counts down to zero, it is automatically reloaded with
the value last written to $dd04/$dd05 and begins counting down again.
endless jmp endless | |
For this example, we just put the computer in an endless loop. | |
nmi | |
pha | |
txa | |
pha | |
tya | |
pha | |
;play 4-bit sample | |
lda sample | |
and #15 | |
sta $d418 | |
We play the sample while all the code is still linear - before any | |
branches have occurred. This is to minimize the distorting effects I | |
mentioned earlier. The AND #15 is used so we don't inadvertently | |
enable the filter bits in $d418 with the high nybble packed into | |
sample. | |
;clear NMI source | |
lda $dd0d | |
By reading $dd0d, we are acknowledging the source of the interrupt, | |
and the CIA will now generate another interrupt next time Timer A | |
counts down to zero. | |
;just something to look at | |
inc $d020 | |
;every other NMI do 1) or 2): | |
lda flag | |
bne lower | |
Now we deal with "unpacking" the samples. | |
;1) shift upper nybble down | |
upper lda sample | |
lsr a | |
lsr a | |
lsr a | |
lsr a | |
sta sample | |
jmp exit | |
When flag is set to zero, we shift the high nybble of sample down to | |
the low nybble so it's ready to be played next NMI. | |
;2) get a new packed sample | |
; then point to next | |
lower ldy #0 | |
lda (ptr),y | |
sta sample | |
inc ptr | |
bne checkend | |
inc ptr+1 | |
When flag is set to one, we load a new packed sample into sample, and | |
point ptr at the next packed sample. | |
;if end of sample, point to | |
;beginning again | |
checkend lda ptr | |
cmp #<end | |
bne exit | |
lda ptr+1 | |
cmp #>end | |
bne exit | |
lda #<start | |
sta ptr | |
lda #>start | |
sta ptr+1 | |
Simply check for the end of the digi, and if we've reached it, loop | |
back to the beginning of the digi. | |
;toggle flag and exit NMI | |
exit lda flag | |
eor #1 | |
sta flag | |
pla | |
tay | |
pla | |
tax | |
pla | |
rti | |
;sample's lower nybble holds | |
;the 4-bit sample to be played
;next NMI - the upper nybble | |
;holds the next nybble to be | |
;played on "odd" NMIs, and is | |
;undefined on "even" NMIs. | |
sample .byte 0 | |
;flag simply toggles between 0 | |
;and 1 - used to decide whether | |
;to play upper or lower nybble | |
flag .byte 0 | |
Improving D418 Digis | |
-------------------- | |
D418 digis tend to generate a lot of noise because of the 4-bit sample
resolution. Over the years people have come up with numerous tricks to
improve the sound of a d418 digi; here are some that we know of and have tried.
The first, and most obvious, thing to do is to use the low-pass filter, since | |
a lot of the noise is at higher frequencies. Unfortunately this won't work, | |
since the filters occur in SID before the volume amplifier -- all the filters | |
can do is change the DC offset that makes the digi possible. This trick | |
will work for methods that use SID voices, however (such as Pulse Width | |
Modulation, discussed in the next section). | |
Another trick is to "dither" the sound, as discussed in C=Hacking #11. The | |
idea here is to generate an intermediate "average" value by toggling between | |
two values. For example, if d418 is set to '8' half of the time, and '9' the | |
other half, its 'average' value will be 8.5. So this is somewhat like adding | |
an extra bit of resolution. In principle, you can extend this further: if it | |
is '8' one-third of the time and '9' for the remaining two-thirds, the average | |
value will be 8.67. And so on.
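The averaging argument can be checked numerically. The sketch below is a hypothetical helper, not code from C=Hacking #11: it builds the sequence of $d418 values for one dithered sample and computes the resulting average.

```python
def dither_pattern(low, high_slots, total_slots):
    """Values to write to $d418: low+1 for high_slots slots, low for the rest."""
    return [low + 1] * high_slots + [low] * (total_slots - high_slots)


def average(values):
    return sum(values) / len(values)


half = dither_pattern(8, 1, 2)         # [9, 8]
two_thirds = dither_pattern(8, 2, 3)   # [9, 9, 8]
```

Here `average(half)` is 8.5 and `average(two_thirds)` is about 8.67, matching the figures in the text. A real player would probably interleave the two values rather than group them, which this sketch does not attempt.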
Now, we aren't _really_ increasing the sample resolution here, but are instead | |
increasing the sample playback rate -- we're playing two samples ('8' and '9' | |
for example) where before we played just one. Don't get too carried away | |
thinking about "average" voltage levels (after all, there is an average | |
voltage for the entire digi but that's not what you hear!) -- what's important | |
is how well the sampled signal represents the original signal. If the | |
original signal is rising from 8 to 9 during the sample interval, this type | |
of trick will work well. | |
Which leads us to another trick: interpolation. This is really a compression | |
trick, more than a 'resolution' trick. Let's say that one sample value is 5, | |
and the next value is 9. It might be reasonable to expect an 'intermediate' | |
value of 7, to play right after the 5. Once again, the idea is to increase | |
the playback rate to better-represent the original signal. This type of trick | |
increases the playback rate without increasing the amount of data -- and as | |
always, your mileage may vary. Many modern soundcards and CD-players use | |
interpolation. | |
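As a rough illustration of the interpolation idea, the following Python sketch inserts the midpoint between each pair of neighbouring samples. It assumes simple linear interpolation with integer rounding down; the function name is invented for this example.

```python
def interpolate_midpoints(samples):
    """Double the playback rate by inserting the midpoint between neighbours."""
    out = []
    for current, following in zip(samples, samples[1:]):
        out.append(current)
        out.append((current + following) // 2)   # intermediate value
    if samples:
        out.append(samples[-1])                  # last sample has no successor
    return out
```

For the example in the text, `interpolate_midpoints([5, 9])` yields `[5, 7, 9]`. On the 64 the midpoint would be computed on the fly during playback, so the stored data does not grow.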
Another curious trick is to add noise to the signal -- that is, the 4-bit | |
sample corresponds to the original signal plus noise. Sometimes, by adding | |
noise to the signal playback the noise can actually cancel! The 'dithering' | |
trick above can be viewed in this way. | |
Boosting 8580 Digis | |
------------------- | |
As most people know, there are 'old' SIDs (6581) and 'new' SIDs (8580), and | |
$d418 digis do not work right on 8580 SIDs (such as in the 128D, most 128s,
and the 64C) for the reasons discussed earlier -- the 8580 does not have a | |
residual voltage leading into the amplitude modulator. | |
The software fix for this is pretty simple: have SID generate a signal, and | |
hence a voltage, for the volume register to modify. You can actually use | |
pretty much any waveform to do this, but a pulse is the simplest, since a | |
pulse wave just toggles between two voltage levels. Moreover, page 463 of | |
the PRG says, "The TEST bit, when set to a one, resets and locks Oscillator 1 | |
at zero until the TEST bit is cleared. The Noise waveform output of | |
Oscillator 1 is also reset and the Pulse waveform output is held at a DC | |
level." So it's not really necessary to worry about the frequency or pulse | |
width, by using the test bit. | |
BUT -- it is very important to set the sustain level to $f. The ADSR envelope
generators generate the voltage. A sustain level of 0 gives no improvement. | |
So, to 'boost' a digi on a later-model SID, you can just turn on a pulse with | |
the test bit set: | |
LDA #$FF | |
STA $D406 | |
LDA #$49 | |
STA $D404 | |
Setting more voices gives the digi a substantial extra boost: | |
LDA #$FF | |
STA $D406 | |
STA $D406+7 | |
STA $D406+14 | |
LDA #$49 | |
STA $D404 | |
STA $D404+7 | |
STA $D404+14 | |
The moral is: if you're writing a digi routine, and want it to work on all | |
computers, be sure to boost the digi. | |
And for completeness, using more channels is a commonly used trick to enhance | |
digi resolution on the Plus/4. The TED digi resolution (the volume register) | |
is 3 bits. Fortunately, all channel on/off bits + the volume level are in the | |
same register ($ff11). If one source is on, the output DC is about half of the | |
level when both are turned on. This trick can be extended further to result
in a 'semi 4-bit' or 5-bit digi table (the dynamic range is enhanced, but
there are larger steps at the table end than at the start). This trick could | |
also be used in SID if the sound sources were accurately preset, but runs into | |
problems due to the non-matching SID-versions and having the control bits in | |
multiple registers. | |
SID Type Auto-Detect | |
-------------------- | |
The following routine will detect what type of SID is in use. I've | |
tested it on a fair cross-section of my collection of computers - my | |
NTSC 128D, two 64Cs, two "breadbox" C-64s, and my PAL breadbox 64. In | |
all cases the code performed 100% accurately - but still, there may be | |
cases where it fails. I'd be interested to know if anyone finds any | |
faults in the routine, so I can improve it! | |
How does the routine work? I was told that the old SID (6581) and the | |
new SID (8580) behave differently when set to play combined | |
waveforms. I coded a fairly simple routine to use the REU to sample | |
$d41b (the upper 8 bits of Oscillator 3's waveform output) for a full | |
64k bank. Then I experimented with various frequencies and | |
combinations of waveforms on Oscillator 3 until I found consistently | |
different results with the two different SIDs. | |
When I combined the triangle and sawtooth waveforms and then sampled | |
$d41b I found that most of the time the oscillator was just putting | |
out zeros, with occasional bursts of numbers. These "bursts" were | |
consistently near $ff on the 8580, while the 6581 was always well | |
below $80 - often $3f was the highest it would get. | |
So, the detection code ended up being quite simple - I'll explain each | |
block of code: | |
*= $4000 | |
start sei | |
lda #11 | |
sta $d011 | |
Disable bad-lines (by blanking the screen). This prevents badlines | |
from interfering with the detection process. | |
;sid setup here! | |
lda #$20 | |
sta $d40e | |
sta $d40f | |
Set Oscillator 3's Frequency Control to $2020. I just randomly chose | |
this value when experimenting, and it worked, so I kept it. The trick | |
here is to set a value fast enough that the oscillator will make a | |
number of cycles (so we can get a good sample of the values coming | |
out) but not so fast that it might miss any of the "bursts" I was | |
mentioning earlier. | |
lda #%00110001 | |
sta $d412 | |
Combine the triangle and sawtooth waveforms and start the ADSR cycle. | |
ldx #0 | |
stx high | |
loop lda $d41b | |
cmp high | |
bcc ahead | |
sta high | |
ahead dex | |
bne loop | |
This loop takes 256 samples of Oscillator 3's output, saving the | |
highest value in location high. | |
lda #%00110000 | |
sta $d412 | |
Clear the gate bit to end the note on voice 3.
cli | |
lda #27 | |
sta $d011 | |
Turn the screen back on. | |
lda high | |
rts | |
high .byte 0 | |
Return from the routine with the highest value sampled from Oscillator | |
3 in the accumulator. This allows you to branch based on the high | |
bit: | |
bmi SID8580 | |
bpl SID6581 | |
Voila! | |
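The decision logic of the routine is small enough to model in a few lines. This Python sketch only mirrors the loop and the final branch; the sample lists used with it are made-up stand-ins for what $d41b might return, not measurements.

```python
def classify_sid(osc3_samples):
    """Keep the highest value read from $d41b, then test bit 7 of it."""
    high = 0
    for value in osc3_samples:
        if value >= high:          # cmp high / bcc ahead / sta high
            high = value
    return "8580" if high & 0x80 else "6581"
```

With hypothetical readings such as `[0, 0, 0x3f, 0]` the function reports a 6581, and with `[0, 0xfe, 0]` it reports an 8580, which is the same split the BMI/BPL pair makes on the accumulator.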
====================== | |
Pulse Width Modulation | |
====================== | |
The primary limitation of using the volume register is, of course, | |
that it is only 4-bits. Pulse width modulation (PWM) allows us to get | |
around that limitation. | |
In general, there are lots of ways of transmitting information. | |
If you've ever used a radio you've encountered both amplitude modulation, | |
where the signal is encoded as the amplitude of some carrier wave, and | |
frequency modulation, where the signal is encoded by changing the frequency | |
of the carrier wave. In both cases, the idea is to strip out the encoded | |
information and throw away the carrier. | |
Yet another possibility is pulse width modulation: use a pulse | |
wave at some carrier frequency, and modulate the pulse width. Pulse width | |
modulation has several nice properties for transmitting signals; we can | |
take advantage of it to play digis. | |
Pulse waves, of course, take on only two possible values: zero and | |
one (low and high, etc.). Over a single period, a pulse wave will in general | |
be low for some amount of time and then high for some amount of time. | |
The _duty cycle_ of a pulse wave is the amount of time it spends in the high | |
state compared to the total period. For example, a square wave, which is | |
low exactly half the time and high the other half, has a duty cycle of 50%: | |
______ ______ | |
| | | | | |
| | | | | |
_____| |____| |____ ... | |
Remember that, regarding SID, a signal like the above is simply a voltage | |
level. What is the _average_ voltage over a single period? Since a square | |
wave is zero half the time and one the other half the average value is | |
just 1/2. If instead the pulse had a duty cycle of 75%, it would be low | |
for 1/4 the cycle and high for 3/4, giving an average value of 3/4. | |
So the _average_ value of a single pulse is simply the duty cycle. So if | |
we change the duty cycle for each pulse we can essentially generate a | |
series of average voltage values -- and since a digi is nothing more than | |
a series of average signal values, we can use PWM to play a digi. | |
To make this more precise, let's say we had a digi sampled at 1KHz -- one | |
thousand samples per second. Since each sample value will be approximated | |
by a pulse, we need one thousand pulses per second. The duty cycle of | |
the first pulse will be the first sample value, the duty cycle of the | |
second pulse will be the second sample value, and so on. Note that | |
the sample rate is the carrier frequency -- the frequency of the modulated | |
pulse train, 1KHz in this case. | |
(Actually, to be more accurate, we need _at least_ 1000 pulses per second -- | |
for example, we could use 2000 pulses per second, and represent each sample | |
value using two pulses. So the more correct statement is that the pulse | |
carrier frequency is the maximum sample playback frequency.)
The advantage for playing C64 digis is that we have much more resolution | |
for the pulse width, and probably not in the way you think! Because you | |
are probably thinking that SID has this nice 12-bit pulse width that | |
we can use here. The problem is that the absolute highest frequency SID | |
can produce, using the frequency registers, is about 4KHz, which would | |
be the maximum playback rate. | |
There's still another catch -- the carrier wave is still there! Imagine | |
trying to encode a signal that was constant, say 1/2 everywhere. To | |
generate a "digi" value of 1/2, you'd use a square wave, half down and | |
half up. So while the _average_ value of each pulse would be 1/2, the | |
actual signal would be a square wave at the carrier frequency (look at | |
the little picture above if you don't see it -- its average value is 1/2). | |
Trying to modulate a 4KHz carrier wave results in a piercing 4KHz tone, | |
and a _maximum_ sample rate of 4KHz (and this assumes that you can sync your | |
code up exactly with SID). So that's pretty worthless for digis. | |
- BUT - | |
What if we could change the voltage level manually? Let's say some | |
hypothetical machine language program toggled the voltage level | |
on each machine cycle -- the result would be a square wave of | |
frequency 0.5 _mega_ hertz. Okay, let's say it changed the voltage | |
level every 10 machine cycles -- the result would be a carrier | |
frequency of around 50 KHz. The point here is that a machine language | |
program can generate its own pulse waveform, and do so at much higher | |
frequencies than SID can produce. | |
Toggling the voltage levels turns out to be very simple. As was | |
described earlier, the way to "boost" digis on later SIDs is to use | |
a pulse waveform at frequency zero. Depending on the value of the | |
pulse width register, SID will set the output voltage to either high | |
or low. So all a program has to do is set up a pulse waveform at zero | |
frequency and use the pulse width registers to toggle the voltage -- | |
set $d403 to either $00 or $ff to toggle low/high. (You could also use | |
$d418 to toggle low/hi, but this method should produce more uniform | |
results, and unlike $d418 can be filtered). | |
So now we're cooking -- we've got a program that can generate a pulse | |
train. The next step is to change the width of each pulse to represent | |
the sample values in our digi. Remember that the duty cycle -- the | |
percentage of time the pulse spends high -- is the average value for that | |
pulse. But also remember that each digi sample represents an average
value over the sample period. If the pulse period is equal to the sample | |
period, then _the duty cycle is exactly the sample value_! | |
Example: let's say that we have an 8-bit sampled digi, so that values go
from 0-255, and our program generates pulses with a period of 256 "ticks".
Now pick a sample value, say 56. All the program has to do is hold the
pulse high for 56 "ticks", and low for the remaining 256-56 = 200 "ticks",
and it will have the correct average value: 56/256. So a program to play
8-bit samples might look like
1 - Load .X with next sample value | |
2 - Load .Y with 256-.X | |
3 - Set pulse high | |
4 - Loop for .X iterations (each loop iteration is one "tick") | |
5 - Set pulse low | |
6 - Loop for .Y iterations | |
7 - Loop back to step 1 | |
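The steps above can be sketched in Python to confirm that the duty cycle really equals the sample value. This is a model of the idea only, with invented names; it represents each "tick" as one list entry rather than timing anything.

```python
def pwm_period(sample, ticks=256):
    """One pulse: high for `sample` ticks, low for the remaining ticks."""
    high_ticks = sample              # step 1: .X = sample value
    low_ticks = ticks - sample       # step 2: .Y = 256 - .X
    return [1] * high_ticks + [0] * low_ticks


def duty_cycle(wave):
    """Average level over one period."""
    return sum(wave) / len(wave)


wave = pwm_period(56)
```

For a sample value of 56 the pulse is high for 56 ticks and low for 200, and `duty_cycle(wave)` comes out as 56/256.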
Let's say that each "tick" takes m cycles, and the sample size is 2^n, so | |
that there are 2^n ticks per sample. A stock machine runs at around | |
10^6 cycles/second, so... | |
(10^6 cycles/second) / (2^n ticks/sample * m cycles/tick) | |
= 10^6 cycles/second / (m * 2^n cycles/sample) | |
= 10^6 / (m * 2^n) samples/second | |
So, for example, let's say we had n=6-bit samples -- 2^6 = 64 -- and could | |
generate pulses with a resolution of one machine cycle -- m=1. Then | |
we could play that 6-bit sample at 10^6/64 = 15.6KHz. That is _really | |
very good_! In principle -- possibly using the CIA timers, possibly using | |
fixed delay loops, possibly using a massively unrolled loop -- this can | |
be done on a stock machine. (I did try using the CIA timers, but the | |
number of cycles to set up the timers was too big, and made it sound poor; | |
I've included the code below though.) | |
At this point it becomes a numbers game. As we increase the sample size | |
(increase m or n above), we _decrease_ the sampling rate -- if, in the | |
above example, we instead use 8-bit samples, the sampling frequency drops | |
by a factor of four to around 4 KHz. So there's a tradeoff between | |
resolution and sampling frequency. | |
AND... we still have this issue of the carrier frequency. You should be | |
able to convince yourself that the sampling frequency above is exactly | |
the carrier frequency. So with the 8-bit resolution example there | |
would be an awful 4KHz tone running through the playback. There are | |
only two ways to beat the carrier frequency: push it high enough that | |
you no longer hear it, or else push it high enough that you can use the | |
filters to dampen it down. | |
How high is high enough? You can judge for yourself, but 15 KHz is | |
pretty tough to hear, unless you have good ears and the volume is really | |
loud -- so 6-bit samples are within reach on a stock machine. | |
But add a SuperCPU into the picture, and the numbers get _really_ nice. | |
Everyone knows that a SCPU can interact with the C64 at 1MHz, and | |
hence generate pulses with 1MHz resolution, using code like | |
lda #$ff | |
sta $d403 ;Set level high | |
:loop lda $d011 ;wait for C64 cycle | |
dex | |
bne :loop | |
where .X contains the sample value. But what happens if we try to move | |
beyond that 1MHz? What if we put some NOPs into the above delay loop, | |
in place of the lda $d011? Well, in principle it means that the duty | |
cycles won't always be right, which corresponds to some sampling error. | |
In practice, however, it works _really well_! Consider what happens when | |
the above code is changed to: | |
:loop | |
nop | |
nop | |
dex | |
bne :loop | |
The earlier formula still applies, but now using 20MHz cycles: | |
20 * 10^6 / (m * 2^n) samples/second | |
In this example each loop iteration -- each "tick" -- is nine 20MHz cycles, | |
giving a playback rate of approximately 17KHz for 7-bit samples. Which
is TOTALLY COOL! | |
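The playback-rate formula is worth having as a small function, since the whole design is a numbers game. A minimal Python sketch follows; the function name is made up, and the clock defaults to the round 10^6 figure used in the text.

```python
def pwm_sample_rate(bits, cycles_per_tick, clock_hz=1_000_000):
    """clock / (m * 2^n) samples per second, with n = bits, m = cycles_per_tick."""
    return clock_hz / (cycles_per_tick * 2 ** bits)


stock_6bit = pwm_sample_rate(6, 1)               # 15625.0
stock_8bit = pwm_sample_rate(8, 1)               # 3906.25
scpu_7bit = pwm_sample_rate(7, 9, 20_000_000)    # about 17361
```

These reproduce the figures quoted above: about 15.6KHz for 6-bit samples and about 4KHz for 8-bit samples on a stock machine, and about 17KHz for 7-bit samples with nine 20MHz cycles per tick.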
And it can even be pushed to 8-bit samples (although I personally don't think | |
they sound any better, at least with the code I've tried; maybe the code can | |
be improved). Using loops like | |
:loop | |
dex | |
beq :done | |
dex | |
beq :done | |
... | |
dex | |
bne :loop | |
:done | |
it is possible to "fine-tune" the loop tick to somewhere between 4-5 cycles, | |
giving a playback rate between 15KHz and 19KHz, for an 8-bit sample. Pretty | |
cool. The code is also a little more involved (with 7-bit samples we can | |
use BMI for the loop branches; not so with 8-bits). But it really is | |
possible to play 8-bit samples at 19KHz on a C64 (plus SuperCPU). | |
Using two voices | |
---------------- | |
You may be thinking, Hey, we've got three pulse waves to work with, can | |
we improve the performance by using multiple pulses? | |
Let's say we have two pulses, P1 and P2, with the same period. When both | |
are activated, the pulses simply add together -- that is, the total voltage | |
is just the sum of the individual voltages, and therefore the _average_ | |
voltage is the sum of the individual pulse averages: | |
avg voltage = D1 + D2 | |
where D1 and D2 are the duty cycles of pulses P1 and P2. In the simplest | |
case, this gives us an extra bit of resolution -- if D1 and D2 are both | |
7-bit values, say, then D1+D2 is an 8-bit value. | |
-BUT- | |
Consider, for a moment, what would happen if we were to change the amplitude | |
of the second pulse -- that is, let's say the maximum voltage it took on | |
was 1/16 of the maximum voltage of the first pulse. The average voltage | |
would then be | |
avg = D1 + D2/16 | |
This then gives us _four_ extra bits of resolution, with each bit to the | |
_right_ of the decimal place. For example, if D1 and D2 are 4-bit numbers, | |
with D1=xxxx and D2=yyyy, then the avg will be a number like xxxx.yyyy | |
(four bits to the left of the decimal place and four to the right). | |
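The xxxx.yyyy idea can be written out directly. The Python sketch below assumes an ideal second pulse at exactly 1/16 amplitude, which, as discussed next, the real SID does not deliver; the function names are illustrative only.

```python
def split_sample(sample8):
    """Split an 8-bit sample into D1 (high nybble) and D2 (low nybble)."""
    return sample8 >> 4, sample8 & 15


def combined_average(d1, d2, ratio=16):
    """Average level when pulse 2 has 1/ratio the amplitude of pulse 1."""
    return d1 + d2 / ratio


d1, d2 = split_sample(0xA7)
level = combined_average(d1, d2)
```

For the sample $A7 this gives D1 = 10 and D2 = 7, and a combined level of 10.4375, which is exactly $A7 / 16. Changing `ratio` shows how quickly the low bits become wrong when the amplitude ratio is off.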
Of course, we can change the pulse amplitude by changing the sustain | |
setting, so in principle this gives a very easy and efficient way of | |
playing high-resolution digis. In practice, I have not been able to | |
make it work very well. I used a sustain setting of 1 and split an | |
8-bit sample into two 4-bit pulses; I believe the result sounds better | |
than 4-bits, but certainly doesn't sound anywhere near 8-bits. My | |
suspicion is that it is because the second pulse voltage is not really | |
1/16 of the first pulse, which corresponds once again to adding noise | |
to the sample value. | |
To find out, we can just measure the output at different sustain levels. | |
The following table gives the voltage output for voice 1 using a pulse | |
waveform at zero frequency and volume 15: | |
          Pulse Width        Diff
    SU    000     fff     000    fff
    0f    6.34    5.29    .08    .07
    0e    6.26    5.36    .02    .01
    0d    6.24    5.37    .06    .05
    0c    6.18    5.42    .03    .02
    0b    6.15    5.44    .05    .03
    0a    6.10    5.47    .03    .02
    09    6.07    5.49    .04    .02
    08    6.03    5.51    .03    .02
    07    6.00    5.53    .05    .03
    06    5.95    5.56    .03    .02
    05    5.92    5.58    .05    .02
    04    5.87    5.60    .04    .03
    03    5.83    5.63    .04    .02
    02    5.79    5.65    .06    .02
    01    5.75    5.67    .05    .02
    00    5.70    5.69
Voice 2 is identical within a few hundredths of a volt. If this test is | |
repeated using voices 1 and 2 simultaneously, the result is: | |
          Pulse Width
    SU    000     fff
    0f    7.30    5.25
    0e    7.12    5.36
    0d    7.09    5.37   (!)
    0c    6.95    5.46
    0b    6.88    5.49
    0a    6.78    5.54
    09    6.72    5.58
    08    6.62    5.62
    07    6.58    5.65
    06    6.47    5.70
    05    6.40    5.73
    04    6.31    5.78
    03    6.22    5.82
    02    6.13    5.87
    01    6.07    5.90
    00    5.97    5.95
Note the weird step at $0d -- the response is definitely not linear! | |
Now, to summarize, when using one voice, the "positive" amplitude (about the | |
mean 5.70V) is .64V and the "negative" amplitude is .41V, giving a spread of | |
1.05V. With two voices together, the amplitudes are 1.33V, 0.72V, and 2.05V | |
respectively. If the two signals were simply added together, the numbers | |
should be 1.28V, 0.82V, and 2.1V. | |
What we originally wanted was a signal like | |
D1 + D2/16 | |
that is, another pulse that is 1/16 the value of the 'full' pulse. 1/16 of | |
the positive amplitude is .64V/16 = .04V, and 1/16 of the negative amplitude | |
is .41V/16 = .026V. A setting of sustain level 1, on the other hand, gives | |
voltage offsets of 0.05 and 0.02, giving approximately | |
.64V / .05V = 12.8, i.e. D2 / 12.8
.41V / .02V = 20.5, i.e. D2 / 20.5
So, in summary, whereas I wanted D1 + D2/16, I was actually getting something | |
that varied from D2/12.8 to D2/20.5, even if the two voices summed together | |
correctly. | |
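The ratios above follow directly from the measured voltages. The short Python sketch below simply redoes that arithmetic, taking the four amplitude figures quoted in the text as its inputs; the constant names are invented for this example.

```python
FULL_POSITIVE = 0.64       # volts above rest at sustain $f
FULL_NEGATIVE = 0.41       # volts below rest at sustain $f
SUSTAIN1_POSITIVE = 0.05   # volts above rest at sustain $1
SUSTAIN1_NEGATIVE = 0.02   # volts below rest at sustain $1

# What a true 1/16 pulse would need to produce
ideal_positive = FULL_POSITIVE / 16
ideal_negative = FULL_NEGATIVE / 16

# What sustain level 1 actually scales the pulse by
positive_ratio = FULL_POSITIVE / SUSTAIN1_POSITIVE
negative_ratio = FULL_NEGATIVE / SUSTAIN1_NEGATIVE
```

The ideal offsets come out at 0.04V and about 0.026V, while the measured offsets correspond to dividing by about 12.8 on the positive side and about 20.5 on the negative side, so the second pulse is neither 1/16 of the first nor symmetric.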
There may still be a way to make all this work right, which would be great, | |
but I'm tired :). The code from my attempts is below. | |
I also could not get two 7-bit pulses to sound like an 8-bit pulse. I took | |
an 8-bit pulse and divided it in half, assigning each half to a pulse
(and giving the extra bit to pulse 2, if an extra bit was present). | |
I suspect that another issue is that it is impossible to update both | |
pulses simultaneously, meaning some delay between pulses, which translates | |
to adding -- surprise! -- noise to the signal. Perhaps it would be | |
more effective at lower resolutions, however. | |
If someone has some success using these techniques I'd be interested in | |
hearing it. | |
SID lockups | |
----------- | |
Blindly applying these PWM algorithms has a way of locking up SID -- like, | |
locking him up hard. To be honest, I don't have a good explanation for why | |
this happens, and I haven't yet found a good method of prevention -- toggling | |
the test bit, playing a real sound for a short time, toggling the gate bit, | |
and so on, just don't seem to "initialize" SID reliably enough. Sometimes the | |
code works, and sometimes it doesn't -- it's the same code both times. Often | |
resetting the machine will make things work; I'm not sure what hardware resets | |
take place within SID, but the KERNAL certainly zeros him out so that's a
possibility. The other observation is that playing a tune seems to 'clear | |
out' whatever is blocking SID. So there _must_ be some kind of software | |
solution to the problem. | |
In the example code pressing RESTORE restarts the code, which will usually | |
clear the 'blockage' after a tap or two, if it happens. | |
If anyone has some thoughts on this issue (or even better, an explanation | |
of what is going on!) I'd love to hear them. | |
Pulse Width Modulation, continued | |
--------------------------------- from various | |
The digi article in issue #20 of C=Hacking left a few loose ends, and | |
generated some followups. | |
First, Otto Jarvinen (sounddemon) emailed to say that the SID detection | |
routine occasionally reported incorrect results for him, and suggested that | |
a workaround was to do the detect several times. YMMV! | |
Second, a day or two after issue #20 was released, Levente discovered a | |
brilliant way to play 6-bit PWM digis on a stock machine: | |
-- | |
I couldn't resist, and tried something out (see attachment). It works!!! :-) | |
In fact, when I wrote the last letter I didn't know that I found something | |
useable, just had some ideas - I felt that I'm at the right place. When I read | |
C=H 20 this morning and read your comment about the Test bit (from the PRG), I | |
knew that it must work. All I had to do is then to put this idea into code. | |
The whole idea is about starting the pulse by software, and then having the | |
SID turn it back to 0 after a time. | |
Is it possible? ...The keys are the Test bit (the SID wave counter can be | |
reset anytime), the pulse width register, the wave counter and the SIDs way
of generating pulse wave. (Ie. the pulse wave is high, as long as the wave | |
counter is less than the value in the pulse width register). | |
Check this algorithm: | |
- Init: volume at max, voice 1 sustain level max, start attack. Freq is | |
selected well (=$4000), so the wave counter is incremented by 4 every | |
processor clock cycles. | |
Loop: | |
- load next sample value, and put it to the pulse width low register ($d402; | |
ensure that $d403 is 0). | |
- Set test bit, and clear test bit (counter reset). | |
- Increase sample pointer, some delay, then loop. The delay must be 64 clock | |
cycles + the time while the Test bit is kept set (4 cycles if using STA $d404 | |
: STX $d404 immediately with pre-loaded values). | |
What will happen? The 8-bit sample value is put directly to the pulse width | |
register (MSBs of the pulse width register are cleared!...). The wave counter | |
is started (release test bit), and it increases 4 by every CPU cycles (= | |
counts 256 in 64 cycles). After some time, the counter will reach the value in | |
the pulse width register. This happens in exactly after (8-bit sample value / | |
4) cycles, because of the above. In this cycle (or the next?...) the SID turns | |
its pulse output to 0. Voilà!
One must just make sure that the loop length in cycles matches the above | |
conditions, and then it runs like hell... Since it does exactly the same on | |
the SID as the other (bit-banging) way, it just does it with some hardware | |
help, there's also no problem with the 4khz maximum barrier (since the | |
oscillator is reset every loop). | |
With a little enhancement, it's possible to write a roughly 7.5-bit player for a
stock C64 by this method. This is what you find in the attachment... The idea | |
is using all the 3 channels simultaneously. A slightly increased sample value | |
is written to the three pulse width registers, so the oscillators will finish | |
the duty cycle one processor cycle later, when there's a carry between | |
bits(0,1) to the MSBs. | |
The replay freq is the CPU clk / 68 (~15khz). 64 cycles (variable duty cycle) | |
+ 4 cycles (constant duty cycle because of the reset time - no problems with | |
that, it doesn't change (just gives a small constant DC...)). | |
By similar methods, it should be possible to write a sample player with higher | |
PWM freq (with less resolution of course, but eliminating this still audible | |
whistling). | |
(I tried using the filter to reduce it, but it sounded so bad that I left it | |
out. It clicked like hell. The FETs got saturated.) | |
[Richard Atkinson suggested turning down the sustain volumes to avoid this] | |
See the attachment, and the binary. I think the sample sounds pretty good :-). | |
(The cut is from 'Greece 2000' by Three drives on a vinyl). | |
(Another idea that popped up in my mind: since the TED sound generator can | |
also be reset, I could probably translate this idea to the Plus/4 :-O ). | |
Best regards, | |
Levente | |
-- | |
The binary is available at http://www.ffd2.com/fridge/chacking/ towards the | |
bottom of the page. | |
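To make the timing in Levente's letter concrete, here is a rough Python model
of a single voice under the test-bit trick. It is only a sketch under stated
assumptions: the oscillator is modeled as a 24-bit accumulator whose upper 12
bits are compared with the pulse width register, the 4 reset cycles are simply
left out, and the names (high_cycles, replay_rate) and the nominal PAL clock
value are ours, not taken from the attached player.

```python
FREQ = 0x4000          # frequency register value given in the letter
VARIABLE_CYCLES = 64   # cycles per loop pass with the oscillator running
RESET_CYCLES = 4       # cycles with the test bit held (not modeled here)
PAL_CLOCK_HZ = 985248  # nominal PAL C64 CPU clock (assumed for illustration)


def high_cycles(sample):
    """Cycles in one loop pass for which the pulse output is high."""
    acc = 0  # assumed: releasing the test bit leaves the accumulator at zero
    high = 0
    for _ in range(VARIABLE_CYCLES):
        # Pulse is high while the upper 12 bits are below the pulse width.
        if (acc >> 12) < sample:
            high += 1
        acc = (acc + FREQ) & 0xFFFFFF
    return high


def replay_rate(clock_hz):
    """Samples per second for a loop of 64 + 4 cycles."""
    return clock_hz / (VARIABLE_CYCLES + RESET_CYCLES)


if __name__ == "__main__":
    for s in (0, 1, 4, 5, 128, 255):
        print(s, high_cycles(s))
    print(round(replay_rate(PAL_CLOCK_HZ)))
```

In this model the high time works out to the sample value divided by 4,
rounded up, so a single voice distinguishes 65 pulse widths. The three-voice
refinement from the letter is not modeled.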
Third, I received a very interesting email from an Apple-II guy, which I'd | |
like to pass on: | |
-- | |
Hi! | |
I found your page as I was searching for something else 6502-related, | |
and was very interested. Although I have always been aware of the | |
C64, I have never really been a user--I have used Apple II's since 1980. | |
I was particularly interested in the article on playing "digis" on the | |
C64. I became interested in playing digitized sounds on the Apple II | |
in 1993, after hearing a 3-bit, 11.025 KHz PWM player. At 3 bits, you | |
can imagine how noisy speech samples were, but the overall effect | |
for a 1 MHz machine with a 1-bit speaker "toggle" was amazing. It | |
made me wonder how far this PWM technique could be pushed on a | |
stock, 1 MHz Apple II (not the somewhat faster, 65816-based IIgs). | |
The short answer is, much farther than I expected! Robin and Stephen | |
accurately describe the theoretical PWM limit as 6 bit samples at | |
about 16 KHz for a stock 1 MHz machine, but, as they point out, | |
that is not practically realizable for a number of reasons, unless the | |
play loop is completely unrolled! | |
Furthermore, in the Apple II world, sampled sounds have acquired a | |
few standardized sampling rates--mostly as a result of Mac influence, | |
which was in turn influenced by CD's. The most common rate in the | |
Apple II world is 11.025 KHz, or one-fourth of the audio CD sampling | |
rate. This is commonly considered to be "AM radio quality", with a | |
Nyquist bandwidth of about 5.5 KHz and a practical bandwidth of | |
4+ KHz, given practical anti-aliasing filters (at the sampling end, not | |
the playback end). | |
A frequency of 11.025 KHz is, though high, still painfully audible to | |
people whose ears are not zonked--a piercing "squeal" running | |
through every sound. So even though it is possible to write a | |
practical 6-bit 11.025 KHz PWM player (usually called a SoftDAC | |
in the Apple II world), the resulting listening experience is disappointing. | |
So I went to work on a way to do 2x oversampling, and built a 5-bit | |
22.050 KHz PWM player. It was sad to lose a bit, but the absence | |
of any audible "carrier" more than compensated for it! | |
If you have access to an 8-bit Apple II (preferably with lower case, | |
like a //e), and also preferably with a way of attaching an external | |
speaker or headphones in place of the miserable 2.75" internal | |
speaker, then you can easily give it a try and judge for yourself. | |
I'm pretty proud of the novel design of the code, which I would | |
characterize as "vectored" unrolled loops, one for every two | |
pulse duty cycles, which I wrote a BASIC program to write | |
for me--much less painful for counting cycles! | |
The package is available on the web at: | |
http://members.aol.com/MJMahon/index.html | |
and is called Sound Editor v2.2 (http://members.aol.com/MJMahon/sound22.shk),
since I had to "dress up" the player
into something fun to play with. ;-) An earlier version of Sound Editor | |
was published on SoftDisk in 1994, IIRC, but this one is a little more | |
evolved. It also introduced 2:1 ADPCM compression of 8-bit sampled | |
sounds, to save disk space. It is a lossy compression, but not very | |
noticeably. The editor package also includes those routines, in 6502 | |
assembly code. | |
All of this should be trivially adaptable to the stock, 1 MHz C64, with | |
very good results. By using the filters, you could probably filter out | |
the 11.025 KHz carrier and return to 6-bit accuracy! | |
I should note that in the Apple world, sampled sounds are usually | |
represented as "excess-128" codes, which means that the sign bit | |
is inverted. This actually simplifies things, since the sample value | |
is within a few shifts of being the pulse width in cycles. | |
Let me know what you think! | |
-michael | |
-- | |
(Always great to hear from Atari and Apple ][ folks!) | |
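As a rough illustration of the excess-128 remark and of the trade-off between
sample rate and resolution described in the letter, here is a small Python
sketch. The function names and the round 1 MHz clock are assumptions for
illustration, and the cycle budget ignores all loop overhead, so it gives only
an upper bound on what a real player can do.

```python
CLOCK_HZ = 1_000_000  # nominal "1 MHz" machine (assumed round figure)


def to_excess_128(signed_sample):
    """Convert a signed 8-bit sample (-128..127) to excess-128 (0..255)."""
    return (signed_sample & 0xFF) ^ 0x80  # invert the sign bit


def pulse_width(excess_sample, bits):
    """Keep the top `bits` bits of an excess-128 sample as a pulse width."""
    return excess_sample >> (8 - bits)


def fits(bits, rate_hz, clock_hz=CLOCK_HZ):
    """True if 2**bits one-cycle pulse widths fit in one sample period."""
    cycles_per_sample = clock_hz // rate_hz
    return (1 << bits) <= cycles_per_sample


if __name__ == "__main__":
    print(pulse_width(to_excess_128(0), 5))
    for bits, rate in ((6, 11025), (5, 22050), (6, 22050)):
        print(bits, rate, fits(bits, rate))
```

With these assumptions, 6 bits fit at 11.025 KHz and 5 bits fit at 22.050 KHz,
but 6 bits at 22.050 KHz do not, which matches the bit given up in the letter.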
And finally, I have a little mathematical analysis of PWM and how it compares | |
to a "straight" digi. Basically, I found some of the PWM explanations a | |
little unconvincing in issue #20 (even though I wrote them!). For example, | |
the idea of "average voltage" seems a little funny, since every two samples | |
has an "average voltage", as does every four, etc. but that set of average | |
voltages would give a different sounding signal than the original (or | |
more dramatically, there is an average voltage over a full second of digi | |
playback, but that's not what you hear!). So I wanted to know how a | |
PWM signal _really_ compares to a straight digi playback. | |
Another issue is changing the amplitude of a PWM digi, i.e. using two | |
pulse waveforms, with one 1/16 the value of the other, to get higher | |
resolution. If you recall the discussion of digis, the resolution of a PWM | |
digi depends on the number of pulse widths available, not the amplitude. | |
Adding two PWM waveforms together does not change the number of pulse widths | |
available, so I wanted to figure out what changing the amplitude _really_ | |
does to a PWM digi, and if it can really be exploited. | |
And finally, I wanted to know about the carrier wave (that is so piercing | |
at lower playback frequencies) -- and once again, how it compares with a | |
standard digi (which, after all, is stair-stepping the voltages at the | |
playback rate). | |
Since the rest of this article is some Fourier analysis that 99% of people | |
will have zero interest in, I'll put the conclusions here. The first is: | |
PWM digis and standard digis are essentially identical except at higher | |
frequencies (except for a phase shift, which doesn't make any difference to | |
your ear). The second is: changing the amplitude of a PWM changes the | |
resolution. More specifically, the amplitude of the pulse multiplies the | |
digi sample value. If two pulses can be synced close enough, it should | |
indeed be possible to use two pulses to get a higher resolution. Moreover, | |
by modulating the amplitude of a single PWM digi, using the $d418 volume | |
register -- that is, using PWM _and_ $d418 -- it should be possible to get a | |
higher dynamic range, something that should be a little more achievable using | |
SID (but maybe not that useful, so I didn't try it out). And finally, a | |
standard digi has zero amplitude at the carrier frequency. | |
In other words, after a lot of effort I was able to demonstrate what everyone | |
already knows. | |
The analysis doesn't change anything from the previous articles (except | |
possibly the idea for changing the PWM amplitude to get more dynamic range). | |
And now, some Fourier analysis. A standard digi just sets the voltage to | |
the sample value s_j, for a length of time dt (dt = 1/sample rate). The | |
Fourier transform of a single sample s_j (occurring at time t_j) is
s_j [e^(-iw dt) - 1] * [e^(-iw t_j) / -iw] | |
where w = angular frequency. Since the above is a little hard to read, I'll | |
say it in words. The first term is the sample value s_j, which scales | |
amplitudes at all frequencies. The second term is due to the finite length | |
of the pulse (evaluating the Fourier integral at the boundaries), and | |
basically changes the phase of the transform. The third term is like | |
sin(w)/w -- a sinusoid with decreasing amplitude as frequency increases. | |
So: the transform goes like sin(w)/w times the sample value, with some phase | |
effects thrown in (we'll get back to these in a moment). | |
A PWM digi sets the duty cycle of a pulse to the sample value s_j, giving | |
a Fourier transform of | |
[e^(-iw s_j dt) - 1] * [e^(-iw t_j) / -iw] | |
Compare this with the earlier expression, and you'll see that the sample | |
value s_j has moved up in to the exponent of the "phase term" but that | |
they're otherwise the same. | |
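The two closed-form expressions can be checked against a direct numerical
integration. The following Python sketch does that for a single sample; the
function names, the midpoint-rule integrator, and the test values are our own
choices, and the sample value is assumed to be scaled to the range 0..1 so
that it can serve as a duty cycle.

```python
import cmath


def standard_ft(s, w, dt, t):
    """Closed form for a level s held from t to t + dt."""
    return s * (cmath.exp(-1j * w * dt) - 1) * cmath.exp(-1j * w * t) / (-1j * w)


def pwm_ft(s, w, dt, t):
    """Closed form for a unit-height pulse from t to t + s*dt."""
    return (cmath.exp(-1j * w * s * dt) - 1) * cmath.exp(-1j * w * t) / (-1j * w)


def numeric_ft(height, width, w, t, steps=2000):
    """Midpoint-rule estimate of the Fourier integral of a rectangular pulse."""
    h = width / steps
    total = sum(cmath.exp(-1j * w * (t + (k + 0.5) * h)) for k in range(steps))
    return height * total * h


if __name__ == "__main__":
    s, dt, t = 0.3, 1e-4, 5e-4   # arbitrary test values
    w = 2 * cmath.pi * 1000      # 1 KHz, as an angular frequency
    print(abs(standard_ft(s, w, dt, t) - numeric_ft(s, dt, w, t)))
    print(abs(pwm_ft(s, w, dt, t) - numeric_ft(1.0, s * dt, w, t)))
```

Both printed differences should be tiny compared with the transform itself,
whose magnitude here is on the order of s*dt.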
The first thing to do is to show that both expressions, PWM and standard, | |
reduce to the same thing -- that is, that a PWM and a standard digi sound | |
the same! The expressions both decrease as 1/frequency, due to the | |
sin(w)/w term. This means that at large frequencies the values become | |
negligible. (How large? For example, if the sample frequency is just 1KHz,
then sin(w)/w is roughly 1000 times smaller near w=1KHz (i.e. the sample
frequency, which is twice the Nyquist limit) than it is near w=0).
So now consider the phase terms for small w. The Taylor expansion for e^x is | |
1 + x + x^2/2 + ... | |
We can therefore expand the "phase terms" as | |
regular: e^(-iw dt) - 1 = (1 - iw*dt - w^2 dt^2/2 + ...) - 1
= -iw*dt + O(w^2 dt^2) | |
pwm: e^(-iw s_j dt) - 1 = -iw*s_j*dt + O(w^2 dt^2) | |
where O(w^2 dt^2) is considered very small since w and dt are both small. | |
Substituting the above into the original expressions gives | |
s_j*iw*dt [e^(-iw t_j) / iw] | |
in both cases. That is, we have shown that for "small" frequencies -- more | |
specifically, for frequencies where (w^2*dt^2) is much smaller than (w*dt), | |
which is where w*dt<1, which is frequencies less than the sample frequency, | |
which is all frequencies of interest! -- PWM and standard digis are the same. | |
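To get a feel for how quickly the two phase terms converge, this Python sketch
compares them as a function of x = w*dt. The helper name and the normalization
(dividing by the shared leading term s_j*w*dt) are our own choices, and s_j is
assumed to be scaled to 0..1.

```python
import cmath


def relative_difference(s, x):
    """Difference of the two phase terms, relative to the leading term s*x.

    x stands for w*dt and s for a sample value scaled to 0..1.
    """
    regular = s * (cmath.exp(-1j * x) - 1)
    pwm = cmath.exp(-1j * s * x) - 1
    return abs(regular - pwm) / (s * x)


if __name__ == "__main__":
    for x in (0.001, 0.01, 0.1, 1.0):
        print(x, relative_difference(0.5, x))
```

To leading order the result is (1 - s_j)*w*dt/2, so the mismatch shrinks in
proportion to the frequency.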
The explanation lies in the phase terms. Those "phase terms" | |
[e^(-iw dt) - 1] (regular)
and
[e^(-iw s_j dt) - 1] (PWM)
do more than just change the phase. When they multiply the sin(w)/w signal, | |
they take the sin(w)/w signal, change the phase, and then subtract the | |
sin(w)/w signal again. It's this difference of signals that makes things | |
work out at the frequencies we care about. PWM and standard digis are _not_ | |
the same, but the main differences are at higher frequencies, where the | |
amplitudes are in general much smaller. | |
But... but... what about the PWM carrier frequency? If we take a constant | |
digi, say with sample values = 1/2, the standard digi gives a constant | |
voltage, whereas a PWM digi gives a square wave at the sample frequency. | |
The answer comes from the "phase terms" above. The sample frequency is | |
w = 2*pi/dt. | |
Substituting this into the phase terms gives | |
[e^(-i*2*pi) - 1] (regular)
and
[e^(-i s_j 2*pi) - 1] (PWM)
The regular expression is exactly zero -- there is _nothing_ at the | |
sample frequency of a regular digi. But that's not the case for the PWM | |
term, because of the s_j up in the exponent. PWM digis have a _finite_ | |
amplitude at the carrier frequency. Note that because of the sin(w)/w | |
term it gets smaller as the sample frequency increases -- but it isn't zero. | |
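The same comparison can be made numerically at the sample frequency itself. In
this Python sketch (function names are ours, s_j again assumed scaled to 0..1)
the regular term vanishes up to rounding error, while the PWM term has
magnitude 2*|sin(pi*s_j)|, which is largest for a 50% duty cycle.

```python
import cmath
import math


def regular_term_at_carrier():
    """Phase term of a standard digi at w = 2*pi/dt."""
    return cmath.exp(-1j * 2 * math.pi) - 1


def pwm_term_at_carrier(s):
    """Phase term of a PWM digi at w = 2*pi/dt for duty cycle s (0..1)."""
    return cmath.exp(-1j * s * 2 * math.pi) - 1


if __name__ == "__main__":
    print(abs(regular_term_at_carrier()))
    for s in (0.0, 0.25, 0.5, 0.75):
        print(s, abs(pwm_term_at_carrier(s)), 2 * abs(math.sin(math.pi * s)))
```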
Finally, the phase term expansions give some insight into what happens | |
when both the pulse width _and_ height are varied. If the pulse width | |
is s_j, and the height is set to h_j, then the Fourier transform becomes | |
h_j*s_j *iw*dt [e^(-iw t_j) / iw] | |
That is, the amplitude multiplies the width. For the case of adding two
PWM waves together, then, the amplitude really does effectively scale the | |
sample value, and it should be possible to add one PWM value at 1/16 the | |
amplitude of another to get an effective 8-bit value. | |
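As a quick sanity check of the counting argument, the following Python sketch
splits an 8-bit sample into two 4-bit pulse widths and recombines them with a
16:1 amplitude ratio. It assumes ideal, perfectly synced pulses and uses
made-up function names; it says nothing about whether SID can actually produce
such a ratio.

```python
def split_sample(sample):
    """Split an 8-bit sample into (high, low) 4-bit pulse widths."""
    return sample >> 4, sample & 0x0F


def combined_level(high, low):
    """Sum of two pulses, in units of 1/16 of the full-amplitude pulse.

    The high nibble drives the full-amplitude pulse (weight 16) and the
    low nibble drives the pulse at 1/16 the amplitude (weight 1).
    """
    return 16 * high + low


if __name__ == "__main__":
    levels = {combined_level(*split_sample(s)) for s in range(256)}
    print(len(levels))
```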
What about _varying_ the amplitude of a single PWM sequence? For a 6-bit PWM | |
digi, say, the sample values s_j can go from 0 to 63. If this is then | |
multiplied by h_j=2 say, then the values become 0 2 4 ... 126 -- a 7-bit | |
number where the lowest bit is always 0. What use is that? Well, we still | |
have the h_j=1 values of 0..63, which do include the lowest bit. So we | |
can effectively change the dynamic range from 0..63 to 0..126 using just two | |
amplitude values. | |
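The set of values reachable this way is easy to enumerate. This Python sketch
(names are ours) lists the products h_j*s_j for a 6-bit PWM digi and a given
set of amplitude values.

```python
def reachable_levels(heights, widths=64):
    """Sorted distinct products h*s for s in 0..widths-1 and h in heights."""
    return sorted({h * s for h in heights for s in range(widths)})


if __name__ == "__main__":
    one = reachable_levels((1,))
    two = reachable_levels((1, 2))
    print(len(one), max(one))
    print(len(two), max(two))
```

With heights 1 and 2 the range doubles to 0..126, but only 96 distinct levels
are reachable: every value up to 63, and only the even ones above that.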
As a practical matter, then, it might be possible to use all 15 $d418 values
available to get a big dynamic range, and hence a better sounding digi, | |
using fewer CPU cycles. Well, ok, we're only _sort of_ changing the dynamic | |
range, so I pretty much doubt the usefulness of it. But maybe someone out | |
there would like to give it a shot. | |
All right, let's hope this closes the book on pulse width modulation for | |
digi playback! | |
....... | |
.... | |
.. | |
. C=H 20