wiki:news [2009/12/22 13:39] (current) whf created |
||
|---|---|---|---|
| Line 1: | Line 1: | ||
| + | **"[[http://www.diggingintodata.org/|Digging into Data Challenge]]" 2009 funds two web-based speech-oriented projects:** | ||
| + | |||
| + | |||
| + | **Harvesting Speech Datasets for Linguistic Research on the Web** | ||
| + | |||
| + | **Awardees:** Mats Rooth, Cornell University, NSF; Michael Wagner, McGill University, SSHRC. | ||
| + | **Description:** This project will **harvest audio and transcribed data from podcasts, news broadcasts, public and educational lectures, and other sources to create a massive corpus of speech**. Tools will then be developed to analyze how prosody (rhythm, stress, and intonation) is used in spoken communication. | ||
| + | |||
| + | |||
| + | |||
| + | **Mining a Year of Speech** | ||
| + | |||
| + | **Awardees:** Mark Liberman, University of Pennsylvania, NSF; John Coleman, University of Oxford, JISC. | ||
| + | **Additional Key Participants:** The British Library. | ||
| + | **Description:** This project focuses on large-scale analysis of audio data, specifically the spoken word. It will create tools to enable rapid and flexible access to **over 9,000 hours of spoken audio files** containing a wide variety of speech, drawn from some of the leading British and American spoken word corpora, allowing for new kinds of linguistic analysis. | ||
| + | |||
| + | |||