● Mozilla has begun collecting Mandarin Chinese voice data in mainland China, further enriching its public voice dataset
● Voice recordings are being collected in 27 languages, with expansion to 72 languages under way
● Common Voice is the largest open source transcribed voice dataset to date; its newly released dataset includes recordings in 18 languages from more than 42,000 contributors, totaling 1,400 hours
Voice interfaces are a major trend for the future of the Internet. From in-car voice assistants to smart watches and smart light bulbs, the number of devices with built-in speech recognition grows by the day. Yet innovation in this area still faces a significant obstacle: companies, researchers, and developers who want to build voice-enabled solutions need large amounts of voice data transcribed into text to train machine learning algorithms. The existing public voice datasets are extremely limited in both volume and language coverage, while private voice data is held by a handful of companies and is expensive to obtain.
For this reason, Mozilla has been running the Common Voice project since June 2017. The project aims to build a global open source voice database that meets the development needs of voice interfaces and breaks through current market limitations. Mozilla believes that such interfaces should not be controlled by a handful of vendors with voice service technology, and hopes to let users receive and understand information in their own language and a familiar accent.
Voice data now collected in 27 languages, including Chinese (Mandarin)
Common Voice began collecting multilingual voice data in June 2018. Since then, the project has grown more global and inclusive. Over the past 10 months, enthusiastic contributors have responded in large numbers, launching voice collection in 27 languages on the Common Voice website, with preparations for recording in up to 72 languages under way.
The latest language to join is Chinese (Mandarin). Internet users all over the world can now go to https://voice.mozilla.org/zh-CN to "donate their voice" or verify other people's recordings.
Contributors can choose to create a personal profile to keep track of their own recordings. They can also optionally provide demographic information, which helps Mozilla improve the voice data used to train speech recognition engines.
As with the other languages collected by Common Voice, Mozilla's goal for Chinese (Mandarin) is to accumulate approximately 10,000 hours of verified audio, since 10,000 hours is enough to train a complete speech recognition system, so that everyone can advance speech recognition technology together. Whether on the way to work, on the bus, during a lunch break, at home, or when gathering with friends and family, anyone with a mobile phone or a computer can donate their voice or verify other people's audio through the voice.mozilla.org website or the iOS app.
George Roter, director of the Mozilla Open Source Innovation Program, said: "Even if each person records or listens to only a few seconds of audio, with hundreds of thousands of contributors the amount of data added will be amazing! The more people are willing to contribute, the faster the value of this voice dataset can grow."
Multilingual voice dataset released
Mozilla remains true to its original goal and continues to enrich the voice dataset, making it a public resource available to everyone. The first multilingual voice dataset was released in February this year, covering audio recordings in 18 languages, including widely spoken languages such as English, French, German, and Chinese (Taiwan), as well as less widely spoken languages such as Welsh and Kabyle. Common Voice has collected recordings from more than 42,000 people to date, with a total length of about 1,400 hours, and the amount of voice data continues to grow.
With the release of this dataset, Common Voice has surpassed other voice datasets of the same type, opening tens of thousands of recordings and their corresponding texts to the public (under the CC0 license). Anyone can download the complete voice dataset from the Common Voice website.
George Roter further stated: "Mozilla is committed to promoting the development of a more diverse and innovative voice technology ecosystem. We not only hope to launch our own voice technology products, but are also committed to supporting the work of researchers and small businesses. In the process of building public multilingual voice datasets, we are honored to be helped by more and more people, and we are very grateful to the volunteers whose enthusiasm has made it possible for us to support Mandarin Chinese."