
From Funny Fish to Videoconferencing Station
I propose to use the MZ104 to control the popular animatronic toy Big Mouth Billy Bass. In the early stages of this project, the MZ104 will dramatically enhance the current behavior of the toy with user-selected audio clips and better motion coordination. Later, by adding a microphone and CCD camera to the system, the toy will be transformed into a webcam or a videoconferencing station.
We will make the following improvements to Big Mouth Billy Bass.
· User defined audio clips
· Lip syncing
· Video recording
· Audio recording
By adding this functionality to the bass, in addition to networking protocols, the bass will be transformed into an H.323-compliant video teleconferencing host. It will be possible to use Microsoft NetMeeting or CU-SeeMe to connect to your bass at home and talk with your loved ones!
For the unfamiliar, Big Mouth Billy Bass is an animatronic toy in the form of a mounted fish. When stimulated by a button press or proximity of a human, the fish comes to life, singing and wiggling to one of two preprogrammed tunes.
The secret of Billy’s success is his behavior—certainly not his appearance nor singing ability! The essence or personality of the toy resides in a small CMOS chip, no larger than a fingernail.

By replacing that preprogrammed chip with a fully programmable embedded controller, perhaps we can create a Billy whose novelty never wears off.
The project will use an incremental approach. At each stage in the development, a new layer of capability will be added and tested. The following stages are planned.
Here we simply reproduce Billy’s original behaviors. Billy will be able to move his head, tail or mouth. At this stage basic lip-syncing will be implemented. Sound coordination algorithms for the head and tail will be created as well.
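As a rough illustration of the basic lip-syncing mentioned above, the sketch below opens and closes the mouth by amplitude thresholding with hysteresis. This is only one simple possibility, not a settled design. It assumes signed 16-bit audio samples; the threshold values and the function names frame_level and mouth_state are hypothetical and would need tuning against the real toy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical thresholds on the mean absolute amplitude of a frame.
 * Two levels give hysteresis so the mouth does not chatter. */
#define MOUTH_OPEN_LEVEL  2000
#define MOUTH_CLOSE_LEVEL 1000

/* Mean absolute amplitude of one frame of signed 16-bit samples. */
int32_t frame_level(const int16_t *samples, size_t count)
{
    int64_t sum = 0;
    size_t i;

    assert(count > 0);
    for (i = 0; i < count; i++) {
        int32_t s = samples[i];
        sum += (s < 0) ? -s : s;
    }
    return (int32_t)(sum / (int64_t)count);
}

/* Returns the new mouth state (1 = open, 0 = closed).
 * An open mouth stays open until the level drops to the close level;
 * a closed mouth opens only once the level reaches the open level. */
int mouth_state(int was_open, int32_t level)
{
    if (was_open)
        return level > MOUTH_CLOSE_LEVEL;
    return level >= MOUTH_OPEN_LEVEL;
}
```

In use, the audio would be cut into short frames, each frame passed through frame_level, and the result fed to mouth_state together with the previous state.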
Stage 2. Polly Wants a Cracker.
At this stage sound sampling is added. Billy will be able to record sound clips and repeat them on his own.
Stage 3. Smile, You’re on Candid Camera.
Video sampling and storage will be added at this stage. When Billy performs an entertaining function, the embedded camera and microphone will record the viewer’s reaction.
Network based video and audio streaming will be added to Billy. These streams will comply with H.323 standards. In lieu of presenting the remote speaker’s video image, Billy will interpret audio output using mouth, head and tail motion.
On the surface Billy will remain unchanged, save for an Ethernet plug in the frame. All the new hardware will be mounted within the fake plastic frame. As shown in the image below, the frame is more than sufficient for a PC/104. Lots of other goodies, including a stripped-down USB camera and microphone, will be mounted within the frame.

As much of the original control circuitry as possible will be reused. The only part to be replaced is the CMOS chip; the original power switching circuitry will stay in place.
One assumption has been made with respect to the MZ104 board: its USB interface is a master/host and not a slave. Hence this interface can support a USB camera. If not, then there are reasonable alternatives. For example, we can use the parallel port for the camera interface and directly use other lines (serial, IrDA, USB) for motor control.
The bill of materials will include the following:
· The Big Mouth Billy Bass
· The MZ104 Embedded Board
· PC/104 sound card
· Piezoelectric mike
· USB based CCD camera
· Various ICs to create a homebrew circuit to interface the parallel port to CMOS logic
Software development will be the majority of the effort for this project. It is the software that will provide Big Mouth Billy Bass with a new personality. It is the software that will allow the bass to be possessed by a remote user.
Fortunately, most of the basic functionality exists as open source libraries. This project will use these libraries to:
· Generate the spectrum associated with audio clips
· Retrieve images from the USB camera
· Implement the H.323 protocol
· Encode video and audio
In addition, code will be reused from a previous personal project: a Linux-based, interactive light show that interpreted the sounds of the audience. That project produced code to:
· Communicate with devices over the parallel and game ports
· Recognize notes
· Sample and play sounds
This code has dependencies on other libraries. Notable among these are the Fastest Fourier Transform in the West (FFTW) and the Open Sound System (OSS). A key software assumption is that BlueCat Linux supports the OSS.
A significant amount of new code will be required for this project. This software will not only “glue” together the previous functions but also provide new ones, such as:
· Motor control signal generation
· Sound coordination code for the mouth, head and tail.
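To make the motor control item more concrete, here is a minimal sketch of how motor states might be packed into a single byte for the parallel port data lines. The wiring (mouth on D0, head on D1, tail on D2) and the names motor_byte, drive_motors and port_write_fn are assumptions for illustration only; the real assignment depends on the interface circuit that gets built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wiring: one parallel port data line per motor. */
enum {
    MOTOR_MOUTH = 1u << 0,  /* data line D0 */
    MOTOR_HEAD  = 1u << 1,  /* data line D1 */
    MOTOR_TAIL  = 1u << 2   /* data line D2 */
};

/* A routine that writes one byte to the port. On x86 Linux this could
 * wrap outb() from <sys/io.h> after a successful ioperm() call, but the
 * actual port access is left to the caller in this sketch. */
typedef void (*port_write_fn)(uint8_t value);

/* Pack the three on/off motor states into one data byte. */
uint8_t motor_byte(int mouth_on, int head_on, int tail_on)
{
    uint8_t bits = 0;

    if (mouth_on) bits |= MOTOR_MOUTH;
    if (head_on)  bits |= MOTOR_HEAD;
    if (tail_on)  bits |= MOTOR_TAIL;
    return bits;
}

/* Send the packed motor states through the supplied write routine. */
void drive_motors(port_write_fn write_port,
                  int mouth_on, int head_on, int tail_on)
{
    assert(write_port != NULL);
    write_port(motor_byte(mouth_on, head_on, tail_on));
}
```

Keeping the bit packing separate from the port write means the coordination code can be tested on a desktop machine without the hardware attached.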
Every project has risks. Hopefully these risks can be minimized by anticipating them and coming up with work-arounds. Here are the risks as I see them for this project.
· Interpretive algorithms for the mouth, head, tail. I will try to keep this simple. For example, lip syncing can be implemented by opening the mouth when vowels are spoken. Vowels can be inferred when a voice is speaking and when a particular spectrum appears.
· Processing limitations. The FFT, video collection and video streaming are processor-intensive activities. However, the FFT load can be reduced significantly by subsampling the data. Video handling can be simplified by reducing the frame rate. As a general note, it is possible to perform these activities simultaneously on a desktop Pentium 100 MHz system.
· Memory limitations. Again, audio subsampling can help here. Also, replacing the supplied 32 MB DRAM module with a larger and/or faster 64 MB one might solve the memory limitations.
· USB and camera driver support. I do not have much experience with these issues in Linux. However, by using a camera with proven Linux drivers this risk will be reduced.
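The audio subsampling mentioned in the processing and memory items above could be as simple as averaging blocks of consecutive samples. The sketch below is one assumed approach, not a tuned design: the function name subsample is hypothetical, samples are taken to be signed 16-bit, and a proper low-pass filter may be needed if aliasing turns out to be a problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduce the sample rate by an integer factor by averaging each block
 * of `factor` input samples into one output sample. The averaging acts
 * as a crude low-pass filter. Leftover samples at the end are dropped.
 * `out` must have room for in_count / factor samples.
 * Returns the number of output samples written. */
size_t subsample(const int16_t *in, size_t in_count,
                 size_t factor, int16_t *out)
{
    size_t out_count, i, j;

    assert(factor > 0);
    out_count = in_count / factor;
    for (i = 0; i < out_count; i++) {
        int64_t sum = 0;
        for (j = 0; j < factor; j++)
            sum += in[i * factor + j];
        out[i] = (int16_t)(sum / (int64_t)factor);
    }
    return out_count;
}
```

Subsampling by a factor of 4, for example, cuts both the audio buffer size and the FFT input length to a quarter, at the cost of losing the upper part of the spectrum.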
· Currently I am a Research Scientist Associate at Applied Research Laboratories at the University of Texas at Austin. My research focuses on use of the Global Positioning System. As part of this work I have written interfaces to devices such as atomic clocks, time interval counters and GPS receivers in C and C++.
· While in grad school at Stanford I performed research at the Stanford Learning Laboratory. My task was to evaluate and integrate software/hardware systems for collaboration over the Internet. It was here I developed an appreciation for the H.323 standard.
· At Hughes Space and Communication in El Segundo, CA, I wrote software to control telecommunication satellites. For a year and a half I was the technical lead for one major software development. Our team ranged in size from 8 to 12 developers during that period.
· Hobby robotics. Inspired by the works of Marvin Minsky and Isaac Asimov.
· Recently used Linux to create an interactive, music-recognition light show. We used Linux in real time to control a light show based on music made by the audience.
· Digital Video. Firewire/IEEE 1394. Creation of video CDs. Animation using MPEG.
· MS, Aerospace Engineering, Stanford, 2000. Focus in controls. Classes and labs focusing on digital control of dynamic physical systems.
· BS, Aerospace Engineering, University of Texas at Austin, 1994. Focus in controls. Classes in robotics, dynamics.
· Talented and Gifted Magnet High School, Dallas, Texas, 1989. Valedictorian, National Merit Finalist. For my senior project I created a programmable, user-friendly interface to a Puma trainer (6 DOF arm with pressure-sensitive gripper) for the IBM PC.
These are the books and websites that I plan to utilize for this project.
Bergsman, Paul. Controlling the World with Your PC. HighText: Solana Beach, CA, 1994.
This excellent reference describes the electrical characteristics of the PC printer, serial and game ports. The book comes with circuit diagrams and source code to control simple devices. The source code is supplied with versions in BASIC, Pascal and C. I have adapted some of the C code for use with gcc in the Linux x86 environment.
McComb, Gordon. The Robot Builder’s Bonanza. 2nd ed. McGraw-Hill: New York, 2000.
Web site: http://www.robotoid.com/
This handbook of circuits and mechanical designs has been my favorite robotics reference since its first edition in 1987.
Linux I/O port programming mini-HOWTO by Riku Saikkonen.
http://www.ibiblio.org/mdw/HOWTO/mini/IO-Port-Programming.html
This site has code snippets showing how to access all the major I/O ports in Linux: parallel, serial, and gameport.
Fastest Fourier Transform in the West. http://www.fftw.org/
The Fastest Fourier Transform in the West is a world class implementation of the Fast Fourier Transform (FFT). This C library is open source. It is one of the cleanest interfaces to a math library that I have ever used.
The OpenH323 Project. http://www.openh323.org/
An all-encompassing source for the software architecture to support the H.323 standard. Most importantly, this project has created classes that implement audio and video codecs.
The Linux USB project. http://www.linux-usb.org/
This site is the hub (please pardon the pun) for all Linux related USB activities. Drivers for many inexpensive CCD based cameras can be found here.