Musical Blackboard Video
♫ Tuesday, February 20th, 2007
By the power of HRTF, I present an HRTF panner… in VST form! Most of the code was originally created for a PortAudio implementation, and then adapted for the VST specification. The code to read the HRTF directory was taken and adapted from code in the MAT 240C codebase, by an anonymous author.
The interface is nothing to write home about, but looks can be deceiving. This plugin will eat your mono signal’s soul and spit it back out as a glob of spatial HRTF audio.
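The core idea behind an HRTF panner can be sketched as filtering the mono input through one impulse response per ear. The following is a minimal Python illustration, not the plugin's code: it uses direct time-domain convolution instead of FFTW3, and the two short impulse responses are made-up placeholders rather than IRCAM measurements.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution; output length is len(signal) + len(ir) - 1."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def hrtf_pan(mono, hrir_left, hrir_right):
    """Return (left, right) channels by filtering mono through each ear's HRIR."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Placeholder impulse responses (hypothetical values, NOT measured HRIRs):
# the right ear receives a delayed, quieter copy, as if the source were on the left.
HRIR_LEFT = [1.0, 0.0, 0.0]
HRIR_RIGHT = [0.0, 0.0, 0.5]

left, right = hrtf_pan([1.0, 2.0], HRIR_LEFT, HRIR_RIGHT)
print(left)   # [1.0, 2.0, 0.0, 0.0]
print(right)  # [0.0, 0.0, 0.5, 1.0]
```

A real-time plugin would more likely use FFT-based block convolution, since measured HRIRs are much longer than these placeholders.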
Download the project file (Microsoft Visual Studio .Net). It should be cross-compatible, but may require some tweaking on your system to compile properly.
You will also need FFTW3 (compiled as a DLL), LibSndFile (compiled as a LIB), and the IRC_1026_R IRCAM HRTF files. Upon downloading the IRCAM files, put the “files” file included in the zip archive into the directory. This allows the code to load up all of the HRIR impulse responses. (Note: the use of another IRCAM data set is possible, but modification of both the C code and the “files” index file will be required.)
To run the plugin on your machine, you will require:
Though this plugin should be cross-platform and compatible with all VST hosts, this is not guaranteed. As of this writing, the only host this has been tested in is H. Seib’s VSTHost.
There are many things that need to be improved. Let’s list them, shall we?
And that’s that. Happy? What, you mean you’re not? Then contact me and let me know about it.
The virtual musical blackboard is an alternative musical interface device. Specifically, the blackboard serves as a “glitch interface” to a Casio MT-401 synthesizer.

The Musical Blackboard schematics (Note: The PIC Microcontroller includes all of the helper circuitry given by the CUI design by Dan Overholt)
The original concept for the system came from an initial exploration of the circuitry of the Casio MT-401 keyboard. It was initially thought that because the keyboard is digital in its sound production, there would be limited opportunities for “circuit bending”, a procedure traditionally reserved for analog systems.
But in the process of probing the system, it was discovered that shorting various pins on the memory chip together produced a variety of digital noise effects.
From this discovery, the abstract blackboard interface was conceived. Because the digital noises produced by shorting the memory pins were loud and abrasive, they were reminiscent of the sound of fingernails on a blackboard.
Now, it just so happens that the main keyboard PCB contains an unused 40-pin chip slot (presumably for debugging early in-house system prototypes). Using this, we will be able to interface with the keyboard in whatever fashion we want.
Shorting various pins on the memory chip together produces a variety of digital noise effects. There are several ways to replicate the effect of shorting memory chip pins. One of the simplest would be to connect all 40 pins to an external microcontroller, which could pull individual pins high or low. However, this method does not allow us to short individual memory pins together, nor does it replicate the analog nature of physically attaching electrodes to memory pins.
Thus, we will construct a 40-pin parallel analog switching circuit. The logic is simple: A serial stream of digital bits gets converted to a 40-bit parallel signal. These lines each control the enabling of an IC analog switch. The inputs to each of these switches are wired to a common point, so that an enabled switch will connect the corresponding memory pin with all other pins currently switched on.
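The switching logic described above can be modeled in a few lines of Python. This is only a behavioral sketch: the `SwitchBank` class, the bit order, and the separate latch step are assumptions made for illustration, not details taken from the schematic.

```python
NUM_PINS = 40  # pins on the unused chip slot, per the text


class SwitchBank:
    """Behavioral model: a serial-in register whose latched outputs enable switches."""

    def __init__(self, num_pins=NUM_PINS):
        self.shift_register = [0] * num_pins
        self.latched = [0] * num_pins

    def shift_bit(self, bit):
        # The new bit enters at position 0; the oldest bit falls off the end.
        self.shift_register = [1 if bit else 0] + self.shift_register[:-1]

    def latch(self):
        # Copy the shifted bits to the outputs that drive the analog switches.
        self.latched = list(self.shift_register)

    def write_pattern(self, pattern):
        # Send the last pin's bit first so pattern[i] ends up at position i.
        for bit in reversed(pattern):
            self.shift_bit(bit)
        self.latch()

    def shorted_pins(self):
        # Every enabled switch ties its pin to the common point, so all
        # enabled pins are connected to one another.
        return [pin for pin, on in enumerate(self.latched) if on]


bank = SwitchBank()
pattern = [0] * NUM_PINS
pattern[3] = pattern[17] = 1
bank.write_pattern(pattern)
print(bank.shorted_pins())  # [3, 17]
```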
In order to realize the blackboard metaphor, several different types of sensors were considered. The original goal was to construct a sensor that could detect a range of different gestures from a large surface. However, the complexity of the interface needed to remain low because of the limited time available for the project.
After much research, it was decided that a capacitive sensor would be the ideal method to gather input from the blackboard. Capacitive sensors can be found in many touch-sensitive lamps and radios, because they are cheap and relatively simple to produce.
However, most touch-sensitive switches are of the simple on-off variety - definitely not enough input variation for the kind of interface we need. So a pressure- and velocity-sensitive version was adapted from a circuit by John Simonton.
The construction of the physical blackboard interface proved to be fairly simple. A wooden frame was constructed to hold a piece of scrap sheet metal. The unit was then soldered to a lead attached to the capacitive sensing circuit.
The construction of the interface between the keyboard and the microcontroller proved to be a bit more labor-intensive. The sheer number of pins on the keyboard memory chip meant that a large Serial-In Parallel-Out (SIPO) latch circuit was needed to provide the 40 necessary outputs from the microcontroller.
With a large circuit comes a large debugging task. Though the initial design was theoretically simple, all circuit projects have unforeseen problems.
The first problem occurred when attempting to power the interface from the keyboard. For some unknown reason, the keyboard would not start up correctly when it was also powering the interface. Thus, the interface needed its own power supply, in the form of a USB cable from the PIC microcontroller board.
With so much circuitry, it was difficult to find a way to integrate the project into the keyboard. Ideally, the entire interface save for the physical blackboard would be contained inside the keyboard. But this has two problems:
So, the PIC microcontroller and the sensing circuitry were fitted into an external housing, while the interface between the PIC and the keyboard was fitted into the keyboard housing. A 5-pin ribbon cable is all that connects the two units.
Through much trial and error with the code, an algorithm was devised that produced a fairly extensive range of glitch sounds when the blackboard was touched. Much of the experimentation concerned how the pressure/velocity data from the blackboard was translated to the memory pins, and how fast the system updated its readings. But with suitable values, the system retains much of the “analog pin-swiping” sound it exhibited originally.
It was initially thought that the capacitive sensing circuit would produce unstable results, causing glitches in the system when not being touched and not activating when touched. However, the sensor is surprisingly stable. This has quite a bit to do with the “noise gate” coded into the software: No input was transferred to the memory pins until the pressure of the sensor climbed above a certain threshold.
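A noise gate of this kind can be sketched in Python as follows. The threshold value and the sample readings are hypothetical; the actual firmware values are not given here.

```python
GATE_THRESHOLD = 10  # hypothetical value; the real threshold is not stated


def gate(readings, threshold=GATE_THRESHOLD):
    """Zero out every sensor reading that does not climb above the threshold."""
    return [r if r > threshold else 0 for r in readings]


# Small readings (sensor idle, picking up noise) are suppressed;
# larger ones (an actual touch) pass through unchanged.
print(gate([3, 12, 40, 9]))  # [0, 12, 40, 0]
```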
Though the actual design changed quite a bit from initial concept to final prototype, the end product was remarkably similar to what was envisioned. Next time around, more care needs to be taken to ensure all construction is resistant against damage. Things such as proper connectors, proper housing, and proper power need to be addressed in order to build a second, better prototype.
Linear Predictive Coding is a technique used to model speech and other similar systems. It has applications in the following areas:
LPC is the process of predicting future samples in a sequence given a set of its N past values. Thus, for an LPC resynthesis of order N, the following equation is used:
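In conventional notation, an order-N linear predictor and its residual can be written as shown here. The symbols s for the input signal and ŝ for its prediction are choices made for this sketch; a and epsilon follow the names used in the text.

```latex
\hat{s}(n) = \sum_{k=1}^{N} a_k \, s(n-k),
\qquad
\epsilon(n) = s(n) - \hat{s}(n)
```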
To determine the proper set of coefficients “a”, we need to predict what values would produce a resynthesis closest to the original signal. This prediction can be accomplished by various means, but the minimization of the mean-square error of the prediction is a common method.
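One common way to carry out this minimization is the autocorrelation method solved with the Levinson-Durbin recursion. The Python sketch below illustrates that approach under this assumption; it is not the code used in this project, and the function names are invented for the example.

```python
def autocorrelation(frame, max_lag):
    """r[lag] = sum of frame[n] * frame[n - lag] over the frame."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients.

    Returns (coeffs, error) where coeffs[k - 1] multiplies s(n - k) and
    error is the remaining mean-square prediction error (unnormalized).
    """
    a = [0.0] * (order + 1)  # a[0] is unused padding
    error = r[0]
    for i in range(1, order + 1):
        if error == 0.0:
            break  # silent frame: nothing left to predict
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error


# A decaying exponential is almost perfectly predicted by s(n) = 0.5 * s(n - 1).
frame = [0.5 ** n for n in range(32)]
coeffs, error = levinson_durbin(autocorrelation(frame, 1), 1)
print(coeffs)
```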
For speech coding and synthesis, a window of samples is analyzed according to the above equation. Upon calculating the linear predictor coefficients, the residual signal (error signal) epsilon is calculated. The coefficients are then saved and the next window is analyzed. Upon analyzing the entire signal, the LPC coefficients are saved for later resynthesis, or transmitted across a channel for remote resynthesis.
Because only coefficients are saved, the coded signal has a much lower bitrate and thus is useful for applications such as cellular phones, internet audio and voice prompting, where high bitrates are not available.
It is interesting to note that if the residual signal is added to the resynthesized signal, the original signal can be perfectly reconstructed. This is because the residual is by its very nature the difference of the input and resynthesized signals. Though at first this sounds like lossless compression, the residual has a bitrate equal to the original signal, so transmitting the residual does not result in any data compression. Usually, the residual is simply thrown out after analysis.
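The reconstruction property can be checked with a small Python sketch. The coefficients and sample values below are arbitrary illustrative numbers, and the decoder shown feeds the residual through the prediction filter, which is one way of adding it back to the prediction.

```python
def analyze(signal, coeffs):
    """Residual = input minus its prediction from the previous input samples."""
    residual = []
    for n, s in enumerate(signal):
        predicted = sum(a * signal[n - k]
                        for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(s - predicted)
    return residual


def resynthesize(residual, coeffs):
    """Rebuild the signal from the residual and the coefficients alone."""
    out = []
    for n, e in enumerate(residual):
        predicted = sum(a * out[n - k]
                        for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(predicted + e)
    return out


signal = [0.0, 1.0, 0.5, -0.25, -0.75, 0.1]
coeffs = [0.9, -0.2]  # arbitrary illustrative predictor
residual = analyze(signal, coeffs)
rebuilt = resynthesize(residual, coeffs)
max_error = max(abs(s - r) for s, r in zip(signal, rebuilt))
print(max_error)  # zero up to floating-point rounding
```

Note that the residual has exactly as many samples as the input, which is why sending it saves nothing.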
During signal resynthesis, the speech is modeled as a periodic impulse (glottal pulses) filtered through a vocal tract modeled by the linear prediction coefficients. Though this approximates the fundamentals of speech well, it does not model any of the complexities of the human voice, such as inharmonic frequencies and voiced/unvoiced-combination vocal utterances.
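The source-filter model described above can be sketched in Python as an impulse train driving an all-pole filter. The pitch period, the single coefficient, and the function names are illustrative assumptions, not values from an actual analysis.

```python
def impulse_train(length, period):
    """One unit impulse every `period` samples, standing in for glottal pulses."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def all_pole_filter(excitation, coeffs, gain=1.0):
    """Vocal-tract model: each output adds a weighted sum of past outputs."""
    out = []
    for n, x in enumerate(excitation):
        feedback = sum(a * out[n - k]
                       for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(gain * x + feedback)
    return out


excitation = impulse_train(6, period=4)
voiced = all_pole_filter(excitation, [0.5])
print(voiced)  # [1.0, 0.5, 0.25, 0.125, 1.0625, 0.53125]
```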
To investigate LPC synthesis in detail, we will use a computer routine originally developed by Perry R. Cook. The code has been updated to use LibSndFile for sound file I/O, and signal statistics have been added in order to qualitatively measure resynthesis quality.
The source code has been compiled on Microsoft Visual C++ .Net, but should be compiler- and platform-independent. (Note: the source requires LibSndFile to compile.)
Effect of LPC order on resynthesis quality
As the order of the LPC analysis increases, the resynthesized signal approximates the original signal more closely. We can hear this from the following audio samples:
From these audio samples, it is clear that as order increases, the resynthesized LPC signal approximates the original file better. We can also hear the robotic nature of LPC, with its simplistic vocal tract model. However, we want to objectively measure the effect of LPC order on analysis quality.
We will look at plots of several signal statistics versus LPC order, to measure signal quality.
From the plots above, we can make several observations.
We can also view the time- and frequency-domain plots of the output signals for various LPC orders. We will see how LPC resynthesis of increasing order affects the quality of signal synthesis.
A few observations can be made about this time-domain animation.
From the above plot, we notice several things.
Careful observers might notice that for the time-domain and frequency-domain graphs above, orders N=2 and N=3 produce odd plots. Indeed, the Cook implementation used sometimes produces unstable responses.
Looking into the source of the unstable responses, we note that using autocorrelation to minimize the mean-square of the error-signal (which the Cook implementation uses) should result in guaranteed stability. However, further investigation reveals that precision and roundoff errors for coefficients near 0 or greater than 1 may cause the actual response to deviate from the required response, generating instabilities in the process.
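One way to test a computed coefficient set for this kind of instability is to convert it back to reflection coefficients with the step-down recursion: the synthesis filter is stable only if every reflection coefficient has magnitude below 1. The Python sketch below assumes the predictor convention in which each coefficient multiplies a past sample; the function names are invented for the example.

```python
def reflection_coefficients(coeffs):
    """Step-down recursion from predictor coefficients to reflection coefficients.

    Stops early once a magnitude of 1 or more is found, because the filter
    is then already known to be unstable and the next division is unsafe.
    """
    a = list(coeffs)
    ks = []
    while a:
        k = a[-1]
        ks.append(k)
        if abs(k) >= 1.0:
            break
        order = len(a)
        a = [(a[j] + k * a[order - 2 - j]) / (1.0 - k * k)
             for j in range(order - 1)]
    return ks[::-1]


def is_stable(coeffs):
    """True if the all-pole synthesis filter has all poles inside the unit circle."""
    return all(abs(k) < 1.0 for k in reflection_coefficients(coeffs))


print(is_stable([1.8, -0.81]))  # True: double pole at 0.9
print(is_stable([2.1, -0.2]))   # False: poles at 2.0 and 0.1
```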
These errors result in improper frequency and time responses for the orders in question, as seen in the above animations.
LPC Synthesis is a powerful technique for speech analysis, coding, and resynthesis. By investigating various statistics, parameters and outputs related to the technique, we can better understand the effects of LPC on an input signal.
We have shown that LPC is not designed for reconstructing audio signals, and as a result does not work well for non-speech signal coding. But to transmit intelligible audio across a low bandwidth channel, LPC is a very useful technique.
We have shown that beyond 9 or 10 coefficients, the increase in quality of the LPC reconstruction does not justify the added computation. For this reason, LPC-10 (with -10 denoting the 10 prediction coefficients, and 180 samples per analysis frame) became an industry-standard codec for low-bandwidth speech transmission. We can calculate the estimated bitrate of the coded signal as follows:
It is quite evident from the above calculations that LPC coding produces an extraordinarily low bitrate compared with sampled audio, roughly 3.5% of the bitrate of the original signal. Of course, the output will sound robotic and will be highly susceptible to other noise in the signal. But if transmitting intelligible speech reproduction is all that is needed, LPC is a wonderful tool.
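A bitrate estimate of this kind can be sketched in Python. The 54 bits per frame and 8 kHz sample rate are the parameters of the standard 2400 bit/s LPC-10 codec; the 8-bit PCM baseline is an assumption made here for comparison, and a different baseline gives a different ratio.

```python
def coded_bitrate(bits_per_frame, samples_per_frame, sample_rate):
    """Bits per second when each analysis frame is sent as a fixed-size packet."""
    return bits_per_frame * sample_rate / samples_per_frame


def pcm_bitrate(sample_rate, bits_per_sample):
    """Bits per second of plain sampled audio."""
    return sample_rate * bits_per_sample


lpc = coded_bitrate(bits_per_frame=54, samples_per_frame=180, sample_rate=8000)
pcm = pcm_bitrate(sample_rate=8000, bits_per_sample=8)  # assumed baseline
print(lpc)              # 2400.0
print(100 * lpc / pcm)  # about 3.75 percent of the PCM bitrate
```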
This program was designed as a demonstration of how to integrate PortMIDI with OpenGL 3D visualizations. It takes MIDI input from the default MIDI Input port, outputs that data to the default MIDI output, and translates the note and controller data into various OpenGL parameters. It serves as an example for implementing PortMIDI, and basic OpenGL concepts such as GLU Quadrics and lighting. The OpenGL code is heavily referenced from the great tutorials at NeHe Productions.
The APIs implemented are as follows:
The program was designed to be used with an Oxygen8 keyboard controller, so the notes and parameters used are as follows:
| Input | Effect |
| --- | --- |
| Note C2 .. C4 | Individual size of corresponding sphere |
| Controller 1 | X-coord of spotlight |
| Controller 7 | Y-coord of spotlight |
| Controller 11 | Diffuse light: Red value |
| Controller 12 | Diffuse light: Green value |
| Controller 13 | Diffuse light: Blue value |
| Controller 15 | Ambient light: Red value |
| Controller 16 | Ambient light: Green value |
| Controller 17 | Ambient light: Blue value |
For Oxygen8 users, these parameters correspond to the Modulation wheel, the data slider, and the 3 right-most rotary knobs in both rows.
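The controller mapping above amounts to a lookup from controller number to a lighting parameter. The Python sketch below illustrates that idea only: the scene dictionary, the function names, and the scaling of the 0 to 127 MIDI range onto 0.0 to 1.0 are assumptions, not details of the actual program.

```python
# Controller number -> (parameter group, field), following the table above.
CONTROLLER_TARGETS = {
    1: ("spotlight", "x"),
    7: ("spotlight", "y"),
    11: ("diffuse", "red"),
    12: ("diffuse", "green"),
    13: ("diffuse", "blue"),
    15: ("ambient", "red"),
    16: ("ambient", "green"),
    17: ("ambient", "blue"),
}


def new_scene():
    """Hypothetical container for the lighting parameters."""
    return {
        "spotlight": {"x": 0.0, "y": 0.0},
        "diffuse": {"red": 0.0, "green": 0.0, "blue": 0.0},
        "ambient": {"red": 0.0, "green": 0.0, "blue": 0.0},
    }


def apply_controller(scene, controller, value):
    """Store a 0..127 controller value as 0.0..1.0; ignore unmapped controllers."""
    target = CONTROLLER_TARGETS.get(controller)
    if target is None:
        return False
    group, field = target
    scene[group][field] = value / 127.0
    return True


scene = new_scene()
apply_controller(scene, 11, 127)  # diffuse red to full
apply_controller(scene, 99, 64)   # unmapped controller, no effect
print(scene["diffuse"]["red"])    # 1.0
```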
Run midi2opengl.exe. The program assumes that your default MIDI input and output ports are the ones you intend to use, so other setups will require re-coding in the openMidi function. To trigger the spheres, press C2 .. C4 on your MIDI input controller. The lighting characteristics are modified by the controller parameters listed above.
As this is only a quick technology/interfacing demo, some re-coding might be necessary for this to run on your machine. As every MIDI I/O situation is different, you MAY need to change the IN and OUT parameters in openMidi to suit your specific setup. Run the test application packaged with PortMidi to determine the proper parameters to use with your system.
IMPORTANT NOTE! This program does not use threads for video and MIDI updates, and instead runs in an endless loop. Though this makes the code easy to read, it has the side effect of maxing your CPU at 100% usage. Take note of this fact when running the program – it won’t harm your computer, but it will heat up the processor quickly!
A PC running Windows 98+, a sound card or external audio device with MIDI inputs and outputs