AVICAR Project: Audio-Visual Speech Recognition in a Car
Contents
- Announcements
- About the Project
- Downloads
- Subject Information
- Contact Information
- Introductory Video: Windows AVI Format (20 MB)
- Introductory Video: Quicktime Format (16 MB)
Announcements
April 2, 2012: The full database is now available to any interested speech or language researcher via HTTP or wget from http://ifp-08.ifp.uiuc.edu/protected/AVICAR_DIST/. To download audio and video data, please contact avicar@gmail.com for a username and password. Documentation can be browsed at http://ifp-08.ifp.uiuc.edu/public/AVICAR_DIST/documents/; the movie shows the data collection procedure. Note that audiovisual alignment problems persist in the distributed data; see the Queensland University page for a solution.
September 5, 2010: Rajitha Navarathna and Sridhar Sridharan of the Queensland University of Technology have re-aligned the audio and video recordings of 3000 phone-number utterances in 150 video files (30 videos per noise condition), so that it is now possible to use the phone-number portion of the database for synchronized audiovisual speech recognition experiments. This alignment was produced by researchers in the Speech and Audio Research Laboratory of the SAIVT Research program at Queensland University of Technology (QUT), Brisbane, Australia (http://www.bee.qut.edu.au/research/projects/saivt/). Alignment files are currently available only by writing to researchers at QUT.
May 2006: The AVICAR database is now available for public release. This database was originally intended to contain recordings of a total of 100 subjects (50 male and 50 female). This release contains post-processed audio recordings of 87 subjects and video recordings of 86 subjects.
Introductory Videos
- Windows AVI Format (20 MB)
- Quicktime Format (16 MB)
DTMF Segmentation Software
- Software to segment audio files automatically at DTMF tones is available here; the SVN repository should allow anonymous login.
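For readers unfamiliar with the idea, the following is a minimal sketch of how audio might be segmented at DTMF tones. It is not the project's segmentation software: it swaps in a plain Goertzel filter bank over the eight standard DTMF frequencies, and it assumes mono audio already loaded as a list of float samples. The function names (`goertzel_power`, `detect_key`, `find_tone_segments`, `dtmf_tone`), the 20 ms frame length, and the `min_fraction` threshold are all hypothetical choices made for illustration.

```python
import math

# Standard DTMF frequencies (Hz): one low-group and one high-group tone per key.
LOW_FREQS = (697, 770, 852, 941)
HIGH_FREQS = (1209, 1336, 1477, 1633)
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(frame, sample_rate, freq):
    """Squared DFT magnitude of `frame` at `freq`, via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def detect_key(frame, sample_rate, min_fraction=0.5):
    """Return the DTMF key dominating `frame`, or None (assumed threshold)."""
    total = sum(x * x for x in frame)
    if total == 0.0:
        return None  # pure silence
    low = [goertzel_power(frame, sample_rate, f) for f in LOW_FREQS]
    high = [goertzel_power(frame, sample_rate, f) for f in HIGH_FREQS]
    row = max(range(4), key=low.__getitem__)
    col = max(range(4), key=high.__getitem__)
    # For a clean two-tone frame this ratio is close to 1.
    fraction = 2.0 * (low[row] + high[col]) / (len(frame) * total)
    return KEYS[row][col] if fraction >= min_fraction else None


def find_tone_segments(samples, sample_rate, frame_ms=20):
    """Return (start_sec, end_sec, key) for each run of frames with one key."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    segments = []
    current_key, start = None, 0
    for i in range(n_frames + 1):
        key = None  # the extra final pass closes a tone that runs to the end
        if i < n_frames:
            frame = samples[i * frame_len:(i + 1) * frame_len]
            key = detect_key(frame, sample_rate)
        if key != current_key:
            if current_key is not None:
                segments.append((start * frame_len / sample_rate,
                                 i * frame_len / sample_rate,
                                 current_key))
            current_key, start = key, i
    return segments


def dtmf_tone(key, duration, sample_rate):
    """Synthesize a test tone for `key` (helper for trying out the sketch)."""
    row = next(r for r, keys in enumerate(KEYS) if key in keys)
    col = KEYS[row].index(key)
    n = round(duration * sample_rate)
    return [0.5 * math.sin(2.0 * math.pi * LOW_FREQS[row] * t / sample_rate)
            + 0.5 * math.sin(2.0 * math.pi * HIGH_FREQS[col] * t / sample_rate)
            for t in range(n)]
```

To try it, concatenate silence and synthesized tones from `dtmf_tone`, then pass the result to `find_tone_segments`; the returned boundaries can serve as cut points. A detector for real in-car recordings would likely need more checks than this sketch has, such as limits on the level difference between the two tones and a minimum tone duration.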