Speech recognition is not a new technology, but transcribing speech into text in real time has become a new breakthrough point for applying artificial intelligence in vertical fields. Sogou recently launched Sogou Dictation, a transcription and shorthand "power tool". From the speech recognition built into the Sogou input method to Sogou Dictation, AI applications are gradually reaching ordinary households, and natural interaction is driving the deployment of AI in real-world scenarios.
When the Sogou input method formally launched in 2006, users were in the golden age of keyboard input. In 2011, Sogou began building its own speech technology and turned it into a product within a year. From the keyboard to the touch screen and then to voice input, the Sogou input method has steadily accumulated experience in human-computer interaction, and speaking instead of typing has gradually gone from a novelty to a habit for users.
Voice is the most natural way for humans to communicate and to interact with computers, and it is also considered the starting point of the era of artificial intelligence. Sogou, one of the strongest AI-focused Internet companies in China, has long maintained a strong in-house speech research team and currently holds the largest volume of speech data on the Internet. Statistics show that voice input in the Sogou input method has reached 260 million uses in a single day, up more than 80% from a year earlier. Drawing on large-scale, high-quality speech training data and its accumulated deep learning expertise, Sogou is carrying this speech recognition advantage into more application scenarios.
From a technical point of view, the key to Sogou Dictation is speech recognition accuracy. Sogou Dictation reportedly uses the long-form speech transcription technology from the Sogou input method, and the error rate has dropped by 30% since the project began. For the acoustic model, it adopts a deep recurrent neural network approach, Deep LC-CLDNN + CTC, and uses a deep CNN + CTC model in transcription mode. The language model is built on terabyte-scale data from the input method.
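To make the "+ CTC" part of the acoustic model more concrete, here is a minimal sketch of greedy CTC decoding, the simplest way to turn per-frame network outputs into a label sequence. It is an illustration only, not Sogou's implementation: the function name `ctc_greedy_decode`, the blank index `0`, and the toy scores are all assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Greedy CTC decoding: best label per frame, merge repeats, drop blanks."""
    # 1. Pick the highest-scoring label for every audio frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Collapse runs of the same label into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove the blank symbol, which only separates labels.
    return [label for label in collapsed if label != blank]


# Toy example with a made-up vocabulary: index 0 is the blank.
vocab = ["<blank>", "h", "i"]
frames = [
    [0.1, 0.8, 0.1],    # "h"
    [0.1, 0.7, 0.2],    # "h" again (merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],    # "i"
    [0.2, 0.1, 0.7],    # "i" again (merged)
]
decoded = ctc_greedy_decode(frames)
text = "".join(vocab[i] for i in decoded)
```

The blank symbol is what lets CTC output the same label twice in a row: two frames of "h" separated by a blank decode to two labels, while two adjacent frames of "h" decode to one. Production systems typically replace the greedy step with a beam search that also consults the language model.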
Sogou Dictation's recognition accuracy has reached an internationally leading level, making voice input faster, more convenient, and more accurate than typing on a keyboard. But applying AI is not purely technology-driven. It is product-driven and shaped by scenarios, with a focus on how to dig deeper into user needs and how to serve more scenarios. Only by combining needs and scenarios can a good AI product emerge. In the speech field, Sogou was the first to recognize that scenarios must drive product deployment, and that AI becomes truly useful to users in vertical scenarios.
In specific application scenarios, Sogou Dictation is optimized for the different environments in which people use it, such as meetings and novel writing, improving recognition by more than 15% over the general-purpose setting. For places such as libraries and cafes, where speaking loudly is inappropriate, it provides whisper recognition technology that can still recognize speech accurately at volumes as low as 30 decibels. As a multi-scenario voice dictation tool, Sogou Dictation greatly improves user productivity.
With the speech recognition capabilities of the Sogou input method extended to Sogou Dictation, the curtain is gradually rising on natural interaction that changes everyday life. In the future, speech technology will have many opportunities across application scenarios. In the smart home, for example, people will want to come home and talk to the television, remote control, speakers, curtains, and other devices. Beyond the smart home, in more vertical scenarios such as automotive, medical, and education, the changes that voice brings to human-computer interaction will profoundly change the way we live and our habits.
The ultimate vision people hold for artificial intelligence has always been a machine that communicates in natural language as humans do, and this is the goal of Sogou's AI development. AI also gives the Sogou input method more of a future. In Sogou's vision, the machine will better understand the user's intent while the input method is in use, so that it can push related information and derivative content. In the future, the Sogou input method's assisted dialogue will help people communicate better in the machine age.
From the input method to Sogou Dictation and on to assisted dialogue, Sogou's AI technology extends natural interaction to more devices, improving convenience and timeliness, broadening practical scenarios, and adding dimensions of interaction. What Sogou has been doing all along is "making it simpler for users to express themselves and obtain information". It will continue to focus on developing AI technology in the language domain, with natural interaction leading the deployment of AI applications.