This page is a brief introduction to the topics utilized by the Big Mouth Billy Bass project. Links are provided to the directly relevant projects and tutorials. Please send any related links this way.

Acoustic Phonetics

One of the first things a linguistics student learns is that there are a finite number of sounds the human voice can make. There are on the order of 100 of these sounds, also known as phonemes. Babies experiment with different phonemes as they learn to talk. Soon they learn to use the phonemes particular to their first language. As adults we can identify foreign languages merely by which phonemes are used. One of the things a phonetician does is look at or listen to sound samples to identify phonemes. If you can identify phonemes, then you can know a great deal about how the speaker's lips are held without needing to know what language that person is speaking.

These references are ordered by their information content: most general first, most detailed last.

Marcus Filipsson's Tutorial. A nice introduction.

Center for Spoken Language at the Oregon Graduate Institute. Shows the basics of reading a spectrogram--the translation of sound into a time-varying spectrum.

Notes from Professor Russell's General Phonetics course at the University of Manitoba. Introduces the source-filter concept and applies it to voiced sound.

Wavesurfer is an elegant, GPL'd research sound processing program that I use to create crude transcriptions. It is easy to install, easy to use. It runs in Linux and in Windows. These transcriptions are used to animate Billy Bass's mouth.
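
As a rough illustration of how a transcription could drive the mouth, the sketch below treats a transcription as a list of timed, labeled segments and looks up whether the mouth should be open at a given moment. The `Segment` struct, the label names, and the vowels-open rule are all assumptions made for this example; they are not Wavesurfer's file format or the phoneme set the project actually uses.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical transcription segment: a label that holds from
// start_sec (inclusive) to end_sec (exclusive).
struct Segment {
    double start_sec;
    double end_sec;
    std::string label;
};

// Assumed rule for this sketch: vowel labels open the mouth; everything
// else (consonants, silence, unknown labels) keeps it closed.
bool mouth_open_for(const std::string& label) {
    static const std::set<std::string> open_labels = {"a", "e", "i", "o", "u"};
    return open_labels.count(label) > 0;
}

// Returns true if the mouth should be open at time t_sec.
// Times not covered by any segment are treated as silence (closed).
bool mouth_open_at(const std::vector<Segment>& transcription, double t_sec) {
    for (const Segment& seg : transcription) {
        assert(seg.start_sec <= seg.end_sec);
        if (t_sec >= seg.start_sec && t_sec < seg.end_sec) {
            return mouth_open_for(seg.label);
        }
    }
    return false;
}
```

A playback loop could call `mouth_open_at` with the current playback time and open or close the mouth whenever the answer changes.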

Fourier Transforms

Sound is a series of pressure disturbances detected by your ear. You can look at those pressure levels over time, or you can look at their spectrum, or frequency content. A graphic equalizer, for example, alters and maybe displays the spectral content of sounds produced by your stereo. By looking at the spectrum associated with the human voice you can see patterns that indicate that the lips are open and the sound of the vocal cords is allowed to resonate freely. The patterns are called formants. They resemble harmonics, but strictly speaking they are resonances of the vocal tract that emphasize some of the harmonics produced by the vocal cords.
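
To make the idea of a spectrum concrete, here is a minimal sketch that turns a block of samples into the strength of each frequency bin. It uses a naive O(N^2) discrete Fourier transform written out by hand, not an FFT library, so it is only suitable for small demonstrations. The function names `magnitude_spectrum`, `strongest_bin`, and `sine_wave` are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive discrete Fourier transform. Returns the magnitude of each
// frequency bin from 0 up to N/2 for N real-valued samples.
std::vector<double> magnitude_spectrum(const std::vector<double>& samples) {
    assert(!samples.empty());
    const std::size_t n = samples.size();
    const double pi = std::acos(-1.0);
    std::vector<double> magnitudes(n / 2 + 1);
    for (std::size_t k = 0; k < magnitudes.size(); ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += samples[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        magnitudes[k] = std::abs(sum);
    }
    return magnitudes;
}

// Index of the strongest bin. Bin k corresponds to k * sample_rate / N hertz.
std::size_t strongest_bin(const std::vector<double>& magnitudes) {
    assert(!magnitudes.empty());
    std::size_t best = 0;
    for (std::size_t k = 1; k < magnitudes.size(); ++k) {
        if (magnitudes[k] > magnitudes[best]) best = k;
    }
    return best;
}

// Test signal: n samples of a sine wave that completes `cycles` periods.
std::vector<double> sine_wave(std::size_t n, double cycles) {
    const double pi = std::acos(-1.0);
    std::vector<double> samples(n);
    for (std::size_t t = 0; t < n; ++t) {
        samples[t] = std::sin(2.0 * pi * cycles * static_cast<double>(t) /
                              static_cast<double>(n));
    }
    return samples;
}
```

Feeding `sine_wave(64, 5.0)` through `magnitude_spectrum` puts nearly all of the energy in bin 5, which is the kind of peak you look for when reading a spectrum. Real speech shows several such peaks at once.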

Fastest Fourier Transform in the West (FFTW). Perhaps the best-of-class library for performing the FFT. The creators do not actually claim it is the fastest; they claim it is the fastest cross-platform FFT library. It has served me well before: I have used this code to transform sampled sound in real time on my trusty little P-100.

OSALP (the Open Source Audio Library Project, listed under Sound Files and Streaming) also provides this functionality. It is intended for sound visualization, but Billy Bass could use it to recognize phonemes.

Threads

Threads are a way to make a single program do more than one thing at the same time. Long ago I remember being impressed that I could enter a URL into Netscape at the same time it was downloading another page. Rather than run several programs at once, the programmer can launch several threads, each performing a different task but sharing the same memory. In this project threads are used to move the lips while feeding the sound card.

Here are some introductions to thread programming.

  1. An informative and attractive introduction to POSIX threads programming from LLNL.
  2. Another POSIX threads guide.

Pavel Krauz has written a delightful threads library for C++ called QpThreads. This library is used in the Billy Bass control software. Anyone who has used a commercial library such as Threads.h++ by RogueWave will feel extremely comfortable with this library. This library wraps POSIX threads.
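
For readers who have not seen threads in code, here is a minimal sketch of two threads sharing memory: one stands in for the loop that feeds the sound card, the other for the loop that moves the lips. It uses `std::thread` and `std::atomic` from the C++ standard library rather than QpThreads or raw POSIX threads, and it is not the Billy Bass control software. `PlaybackResult` and `play_with_lips` are names invented for this example, and no real audio or motor hardware is touched.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct PlaybackResult {
    int frames_played;    // frames handed to the pretend sound card
    int last_frame_seen;  // last playback position the lip thread observed
};

PlaybackResult play_with_lips(int total_frames) {
    assert(total_frames >= 0);
    std::atomic<int> position{0};       // shared by both threads
    std::atomic<bool> finished{false};  // set once the audio thread is done
    int last_seen = 0;                  // written only by the lip thread

    // Audio thread: stands in for the loop that feeds the sound card.
    std::thread audio([&] {
        for (int i = 0; i < total_frames; ++i) {
            position.store(i + 1);
        }
        finished.store(true);
    });

    // Lip thread: stands in for the loop that drives the mouth,
    // following the shared playback position.
    std::thread lips([&] {
        bool done = false;
        while (!done) {
            done = finished.load();       // read the flag first...
            last_seen = position.load();  // ...so the last read sees the final position
            std::this_thread::yield();
        }
    });

    audio.join();
    lips.join();
    return {position.load(), last_seen};
}
```

The atomics are the important design choice: both threads touch `position`, so it has to be a type that is safe to read and write concurrently. On many systems a program using `std::thread` must be built with the compiler's threading option (for example `-pthread` with GCC).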

Sound Files and Streaming

Open H.323 Project. The goal of this project is an open source library that implements the H.323 standards for packet-based videoconferencing.

Open Source Audio Library Project (OSALP). This project aims to implement a world-class set of C++ classes that will handle all of the audio functions one would like. OSALP allows Big Mouth Billy Bass to interpret a tremendous array of sound files: WAV, MP3, AU, and AIFF.