April 1999


By Fred Henning

Some more thoughts on the future, and the future is today.

I went to CompUSA the other day in hopes of finding an idea for this issue of Bits & Bytes. After looking around, I decided that live audio and video are the hot items. If you follow the financial news, you may have seen articles on two of the companies involved: RealAudio and Broadcast.com. RealAudio has set the standards for live audio and video,(1) while Broadcast.com has created a virtual network of off-air, cable and Internet stations including WFMT, PBS, BBC, CNN, etc. You can even listen to the Dallas, Texas Fire Department live or watch heart surgery from the 'Virtual-Operating Room'.

RealAudio's software has made the sound and picture quality of radio and TV received over the Internet very good. You can even get stereo sound! The player software you need comes in a free version, and most sites that provide sound include a download link to it.

Another side of audio is the ability to communicate directly with your computer. You can have a computer read back a Word document or an Excel spreadsheet. You can get software that lets you command your computer by voice and even dictate your Word document.

Speech technology is very intriguing to me, since developing speech recognition technology was my first computer job. I was an Experimental Engineer at the IIT Research Institute, assigned to the Cognitive Systems Simulation Laboratory. My job was to develop the hardware side of interfaces between the 'real world' and the UNIVAC computer. One of our goals was to develop a computer program that could recognize spoken words.

When tours came to see the computer, we would have a visitor ask the computer to perform simple math, and the computer would then type out the answer. This pales in comparison to the continuous speech recognition available today with $9.99 software and a $650 computer. My only consolation is that someone had to take the first step. In the 1961-63 era we were presenting various methods of capturing, digitally recording and digitally processing speech and music to professional organizations. Yes, I recorded digital music decades before the CD-ROM.

We therefore have three ways the computer interacts with speech, besides just having the PC play back sound.

1. The computer can take commands from us to control the computer. It simulates command keys, mouse actions and keyboard input. Some of this was developed to make the computer friendlier, but it has also become important in military and industrial settings as well as for those with disabilities.

2. The computer can read back to us text from most word processing documents, Internet pages, spreadsheets, etc. I like to have it read me the 'Read Me' files and instructions for a new program.

3. Continuous speech recognition software allows us to transcribe our documents. The most powerful of this type of software will provide the same type of spelling, grammatical and syntactical help that you would get from typing into a word processor.

Live broadcast video, video e-mail, video teleconferencing and video demonstrations are some of the ways in which live or streaming video is used. Way back in 1992, Cornell University started an NSF project that would allow classroom video conferencing over the Internet. This evolved to a student being able to walk around campus with a television camera attached to his head. We saw what the camera saw, and it saw what he saw. We know this as CU-SeeMe. CU-SeeMe was the first to introduce the concept of Reflectors to the Internet, which allowed more people to view and/or join a CU-SeeMe session.

We are all familiar with the Mars Rover pictures and the fact that NASA needed to find hundreds of Internet server sites to Reflect (Rebroadcast) the Mars pictures because so many people wanted simultaneous access. If all we had was text, an Internet server might support 100 simultaneous users, but with pictures it might support only 10. The alternative is to send the picture out at a slower rate and have it rebuilt at the receiving end. This is where RealAudio and similar companies have come up with new technologies that provide excellent video compression. These technologies also interleave the audio with the video to provide lip sync.(2)

You and I are more likely to buy a $65 - $150 camera to set on top of the computer monitor and send video as part of an e-mail or talk to our grandchildren in a virtual teleconference. Because we have simple modem connections, we usually have a simple choice: a small picture that looks almost live (30 frames per second with lip sync), or a larger picture that is jerky (10-15 frames per second), not as good as an old silent movie, with lip and face movement that does not match the sound. Seeing a grandchild live from Boston, even though the image is small, is all that it takes to convince you to buy one for your distant family.
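The trade-off between picture size and frame rate can be sketched with a little arithmetic. This is only a rough illustration: the modem speed, the audio rate and the compression figure below are assumed values chosen for the example, not measurements of any particular camera or software, and the function names are made up for this sketch.

```python
# Rough per-frame bit budget on a dial-up link.
# Every number here is an assumption chosen for illustration.

def bits_per_frame(link_bps: float, audio_bps: float, fps: float) -> float:
    """Bits left for each video frame after the audio takes its share."""
    video_bps = link_bps - audio_bps
    if video_bps <= 0:
        raise ValueError("audio alone fills the link")
    return video_bps / fps


def max_pixels(link_bps: float, audio_bps: float, fps: float,
               compressed_bits_per_pixel: float) -> int:
    """Largest frame, in pixels, that fits the per-frame budget."""
    budget = bits_per_frame(link_bps, audio_bps, fps)
    return int(budget / compressed_bits_per_pixel)


LINK_BPS = 33_600      # assumed modem speed
AUDIO_BPS = 8_000      # assumed rate for compressed speech
BITS_PER_PIXEL = 0.1   # assumed average after video compression

for fps in (30, 15, 10):
    pixels = max_pixels(LINK_BPS, AUDIO_BPS, fps, BITS_PER_PIXEL)
    print(f"{fps:2d} frames per second -> roughly {pixels} pixels per frame")
```

Whatever values you plug in, the pattern is the same: the link carries a fixed number of bits each second, so tripling the frame rate leaves about one third as many bits, and so about one third as many pixels, for each frame.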

CAUTION: Using Real Audio/Video, streaming video, video e-mail, etc. means that you are using much larger portions of the total Internet bandwidth. Adding audio and video clips increases the total bandwidth consumed by 10 to 100 times or more. Where 1000 users could once share one Internet connection at a University, the University now needs 10 to 100 such connections or the equivalent channel bandwidth. Yes, we get more and better content. Yes, one picture is worth a thousand words, but it comes at a price!
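The University arithmetic can be written out as a small sketch. It assumes, as the example above does, that one connection comfortably serves 1000 text-only users and that audio and video multiply each user's traffic by a fixed factor; the function name is invented for this illustration.

```python
import math


def connections_needed(users: int, users_per_connection: int,
                       media_multiplier: float) -> int:
    """Connections required when each user's traffic grows by media_multiplier."""
    effective_load = users * media_multiplier
    return math.ceil(effective_load / users_per_connection)


# 1000 users on a connection sized for 1000 text-only users
for multiplier in (1, 10, 100):
    n = connections_needed(1000, 1000, multiplier)
    print(f"{multiplier:3d}x traffic per user -> {n} connection(s)")
```

Real networks do not scale quite this neatly, since not every user is active at once, but the sketch shows why a 10 to 100 times jump in traffic per user turns into a 10 to 100 times jump in the capacity someone has to pay for.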

(1) Never let it be said that Microsoft would let anyone else create a standard regardless of how good it might be. You guessed it! Microsoft has a standard - Windows Media Player.

(2) The basic TCP/IP protocol of the Internet does not guarantee that packets of information will arrive at evenly spaced intervals. Because of this, most video and sound techniques store some number of seconds of data before playing it, so there is always a built-in delay. It's similar to watching CNN and having the news desk say the reporter's name and then waiting a second or so before the reporter starts talking, only that delay is due to satellites being about 22,300 miles out in space.
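A minimal sketch of why players store data before they start: packets are sent at a steady pace but arrive with uneven delays, and any packet that shows up after its turn to play causes a glitch. The delay values below are hypothetical, and the function is a simplified illustration, not how any particular player works.

```python
def late_packets(arrival_ms, interval_ms, buffer_ms):
    """Count packets that arrive after their scheduled play time.

    Packet i is sent at i * interval_ms. Playback starts buffer_ms after
    the first packet arrives, and packet i is played i * interval_ms later.
    """
    start = arrival_ms[0] + buffer_ms
    late = 0
    for i, arrived in enumerate(arrival_ms):
        play_at = start + i * interval_ms
        if arrived > play_at:
            late += 1
    return late


INTERVAL_MS = 100  # one packet sent every tenth of a second (assumed)

# Hypothetical network delays, in milliseconds, for ten packets
delays_ms = [80, 95, 240, 90, 400, 85, 120, 300, 90, 100]
arrivals = [i * INTERVAL_MS + d for i, d in enumerate(delays_ms)]

for buffer_ms in (0, 200, 500):
    missed = late_packets(arrivals, INTERVAL_MS, buffer_ms)
    print(f"buffer of {buffer_ms:3d} ms -> {missed} late packet(s)")
```

With no stored data, almost every packet misses its turn; storing half a second first lets all ten play on time. The price is that everything you see and hear is half a second behind.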

In Washington, the concept of the government being in the business of making a profit, and making sure that the public pays, goes on. The FCC is trying to define Internet access as if it were interstate commerce and a long distance telephone call. Inexpensive access to the Internet, its world of knowledge and its ability to provide inexpensive and equal access to knowledge is in danger!