
Jeff Dean Writes: A Look Back at Google AI's Major Breakthroughs in 2018

via: 博客园 (Cnblogs)     time: 2019/1/16 22:33:09

Planning Editor: Vincent

Author Jeff Dean

Translator nuclear cola, ignorance

For Google's research teams, 2018 was an exciting year. Google Research advances technology in many ways: fundamental computer science research and publications, applied research in emerging areas for Google (such as health care and robotics), open source software contributions, and close collaboration with Google product teams, all aimed at providing useful tools and services. Below, we highlight some of the work Google Research did in 2018.

Ethics and AI

AI for Social Good

It is obvious that AI will have tremendous potential impact on many areas of society. An example of applying AI to solve practical problems is our work in flood forecasting. We work with several other Google teams to provide timely and accurate information on the likelihood and extent of floods, so that people in flood-prone areas can better decide how best to protect themselves and their property.

The second example is our work on earthquake aftershock prediction. We showed that a machine learning model can predict aftershock locations more accurately than the traditional physics-based model. More importantly, because the ML model was designed to be interpretable, scientists have been able to make new discoveries about the behavior of aftershocks, so the work yields not only more accurate predictions but also a new level of understanding.

We also saw many external participants work with Google researchers and engineers to solve scientific and social problems using open source software such as TensorFlow. Examples include identifying humpback whale calls with convolutional neural networks, detecting new exoplanets, and identifying diseased cassava plants.

Assistive technology

Other examples include Smart Compose (https://ai.googleblog.com/2018/05/smart-compose-using-neural-networks-to.html), a tool that uses predictive models to offer suggestions as you write email, making drafting faster and easier, and Sound Search (https://ai.googleblog.com/2018/09/googles-next-generation-music.html), a technology built on Now Playing (https://support.google.com/pixelphone/answer/7535326?Hl=en) that lets you find out more quickly and accurately what song is playing. In addition, Smart Linkify in Android (https://ai.googleblog.com/2018/08/the-machine-learning-behind-android.html) uses an on-device ML model to understand the type of text the user has selected and then offers a more useful action for it on screen (for example, if the selected text is an address, it provides a map link to that address).


One focus of our research is making products like the Google Assistant support more languages and better understand semantic similarity, even when users express the same concept or idea in different ways. These new capabilities build on our work in improving speech synthesis and text-to-speech.

Quantum computation

Quantum computing is a new computing paradigm that promises to solve problems classical computers cannot. We have been actively pursuing this research for the past several years. We believe the field is close to demonstrating this capability on at least one problem (so-called quantum supremacy), which would be a watershed event for the field. Over the past year, we made many exciting advances, including the development of Bristlecone, a new 72-qubit computing device.


Research scientist Marissa Giustina installs a Bristlecone chip at the Quantum AI Lab in Santa Barbara

We also released Cirq, an open source programming framework for quantum computers, and explored how quantum computers could be applied to neural networks. Finally, we shared our experience and techniques for understanding performance fluctuations in quantum processors, along with some ideas on how quantum computers might serve as a computational substrate for neural networks. We look forward to more exciting results in quantum computing in 2019.

Natural language understanding

In 2018, Google's natural language research was also exciting, spanning basic research and product-focused collaborations. We improved on our 2017 Transformer work with a new parallel-in-time variant called the Universal Transformer, which shows strong gains on many natural language tasks, including translation and language inference. We also developed BERT, the first deeply bidirectional, unsupervised language representation, which is pre-trained using only a plain text corpus and can then be fine-tuned on a wide variety of natural language tasks through transfer learning. BERT improved on the previous state-of-the-art results on 11 natural language tasks.


In addition to working with various research teams to bring Smart Compose and Duplex to life, we also worked to make the Google Assistant better at handling multilingual use, with the goal of letting the Assistant converse naturally with all users.

Perceptual research

Our perception research tackles the hard problem of getting computers to understand images, sound, music and video, and provides powerful tools for image capture, compression, processing, creative expression and augmented reality. In 2018, we improved Google Photos' ability to organize the content users care about most, such as people and pets. Google Lens and the Google Assistant help users understand the natural world and answer their questions in real time. A key mission of Google AI is to enable people to benefit from our technology. This year, we made great progress in improving the functionality and building blocks of Google's APIs, including vision and video improvements in the Cloud ML APIs and face-related on-device building blocks in ML Kit.


Google Lens can help you learn more about the world around you. For example, Lens can identify the breed of the dog.

In 2018, our academic contributions included deep learning for 3D scene understanding, such as stereo magnification (https://arxiv.org/abs/1805.09817), which can synthesize new views of a scene. Our research on better understanding images and videos lets users find, organize, enhance and improve images and videos in Google products (such as Photos, YouTube and Search). Notable developments in 2018 include a model for joint pose estimation and person instance segmentation, a system for visualizing complex motion, a system that models spatio-temporal relations between people and objects, and improvements to video action recognition based on distillation and 3D convolutions.

Perception becomes more and more important on resource-limited platforms. MobileNetV2 is Google's next generation mobile computer vision model. Our MobileNets are widely used in academia and industry. MorphNet proposes an effective method to learn the deep network structure, which can achieve comprehensive performance improvements in image and audio models with limited computing resources.

Computational photography


Motion photos on Pixel 2


AR mode in Motion Stills


Left: The iPhone XS. Right: Pixel 3 Night Sight.

Algorithms and Theory

Algorithms are the backbone of Google's systems and touch all of our products, from the routing algorithms behind Google Trips to consistent hashing in Google Cloud. Over the past year, we continued our research in algorithms and theory, covering areas from theoretical foundations to applied algorithms, and from graph mining to privacy-preserving computation. In optimization, our work ranges from continuous optimization for machine learning to distributed combinatorial optimization. In the former area, our work on the convergence of stochastic optimization algorithms for training neural networks (ICLR 2018 Best Paper Award) revealed problems with popular gradient-based optimization methods (such as some variants of ADAM) and laid the foundation for new gradient-based optimization methods.


Performance comparison of ADAM and AMSGRAD on a one-dimensional convex problem
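To make the difference between the two optimizers in the caption concrete, here is a minimal sketch of the update rules for a single scalar parameter. It is not the authors' code: the function name `adam_like`, the hyperparameter defaults, and the omission of bias correction are assumptions made for illustration. The only difference between the two modes is that AMSGRAD divides by the running maximum of the second-moment estimate, so its effective step size can never grow back after a large gradient.

```python
import math

def adam_like(grads, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8,
              amsgrad=False, x0=0.0):
    """Apply Adam-style updates to a scalar x for a given gradient sequence.

    Returns the final x and the second-moment value used at each step.
    Illustrative only: no bias correction, hypothetical defaults.
    """
    x, m, v, v_hat = x0, 0.0, 0.0, 0.0
    used = []
    for g in grads:
        m = beta1 * m + (1 - beta1) * g        # first-moment moving average
        v = beta2 * v + (1 - beta2) * g * g    # second-moment moving average
        v_hat = max(v_hat, v)                  # running maximum (AMSGRAD only)
        second = v_hat if amsgrad else v
        used.append(second)
        x -= lr * m / (math.sqrt(second) + eps)
    return x, used
```

Feeding both modes one large gradient followed by small ones shows the contrast: with plain ADAM the second-moment estimate decays, so later steps are scaled up again, while with `amsgrad=True` it stays at its peak.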

In distributed optimization, we worked on improving the round and communication complexity of combinatorial optimization problems. On the more applied side, we developed algorithms that scale to massive data sets via sketching, as well as balanced partitioning and hierarchical clustering for graphs with trillions of edges. Our work on online delivery services won the WWW Best Paper Award in 2018. Finally, our open source optimization platform OR-Tools won four gold medals in the 2018 MiniZinc constraint programming competition.

In algorithmic choice theory, we proposed new models and studied the problems of reconstructing and learning mixtures of multinomial logits. We also studied the classes of functions that neural networks can learn, and how machine learning can be used to improve classical online algorithms.

At Google, it is important to us that our learning techniques offer strong privacy guarantees. We developed two new approaches for analyzing how differential privacy can be amplified by iteration and by shuffling. We also applied differential privacy techniques to design incentive-aware learning methods that are robust against gaming. Such learning techniques have been applied to efficient online market design. Our new research in market algorithms also includes helping advertisers test the incentive compatibility of ad auctions, and optimization techniques for in-app advertising. We also pushed the state of the art in dynamic mechanisms for repeated auctions and proposed robust dynamic auctions.

Finally, on the robustness of online optimization and online learning, we developed new online allocation algorithms for stochastic input with traffic spikes, and new bandit algorithms that are robust to corrupted data.

Software systems

Much of our software systems research continues to relate to building machine learning models, and to TensorFlow in particular. For example, we published the design and implementation of dynamic control flow for TensorFlow 1.0. Some of our newer research introduces a system called Mesh TensorFlow, which makes it easy to specify large-scale distributed computations with model parallelism, sometimes with billions of parameters. In addition, we released TF-Ranking, a scalable deep neural ranking library.


The TF-Ranking library supports a multi-item scoring architecture, an extension of traditional single-item scoring.

We also released JAX, an accelerator-enabled variant of NumPy that supports automatic differentiation of Python functions. Although JAX is not part of TensorFlow, it uses some of the same underlying software infrastructure (such as XLA), and some of its ideas and algorithms are helping our TensorFlow projects. Finally, we continued our research on the security and privacy of machine learning and on open source frameworks for safety and privacy in AI systems, such as CleverHans and TensorFlow Privacy.

Another important research direction for us is applying ML to software systems. For example, we continued our work on using hierarchical models to place computation onto devices, and contributed to learning memory access patterns. We also continued to explore how learned indexes could replace traditional index structures in database and storage systems.


The Hierarchical Planner's placement of an NMT model

In 2018, Spectre and Meltdown were disclosed as new classes of security vulnerabilities in modern computer processors. As we continue our efforts to model CPU behavior, our compiler research team integrated their tools for measuring machine instruction latency and port pressure into LLVM, making better compilation decisions possible.

Running large web services that host content requires stable load balancing in a dynamic environment. We developed a consistent hashing scheme with a guaranteed bound on the maximum load of each server and deployed it in Google Cloud Pub/Sub. Engineers at Vimeo found an early version of our paper, implemented it in haproxy, open-sourced it (https://github.com/arodland/haproxy/commit/b02bed24daf64743cb9a571e93ed29ee4bc7efe7), and used it for Vimeo's load balancing. The results were dramatic: the algorithm helped them reduce cache bandwidth by a factor of almost eight, eliminating a scaling bottleneck.
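The bounded-load idea can be sketched in a few lines. The version below is a simplified, static illustration and not the deployed implementation: the capacity formula `ceil((1 + eps) * keys / servers)`, the use of MD5 as the ring hash, and all function and parameter names are assumptions. Each key goes to the first server clockwise from its hash position that still has spare capacity, so no server exceeds the cap.

```python
import bisect
import hashlib
import math

def _ring_hash(name):
    # Any stable hash works for the sketch; MD5 is an arbitrary choice here.
    return int(hashlib.md5(name.encode()).hexdigest(), 16)

def assign_with_bounded_loads(keys, servers, eps=0.25):
    """Place keys on a hash ring, skipping servers that are already at capacity."""
    cap = math.ceil((1 + eps) * len(keys) / len(servers))
    ring = sorted((_ring_hash(s), s) for s in servers)
    points = [p for p, _ in ring]
    load = {s: 0 for s in servers}
    placement = {}
    for key in keys:
        # First server clockwise from the key's position on the ring.
        i = bisect.bisect_left(points, _ring_hash(key)) % len(ring)
        # Walk clockwise past full servers; total capacity >= len(keys),
        # so a server with room always exists.
        while load[ring[i][1]] >= cap:
            i = (i + 1) % len(ring)
        server = ring[i][1]
        load[server] += 1
        placement[key] = server
    return placement, load, cap
```

A real deployment also has to handle keys and servers arriving and leaving over time, which this batch sketch does not attempt.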


AutoML, also known as meta-learning, is the use of machine learning to automate machine learning itself. We have been doing research in this area for many years, and our long-term goal is to develop systems that can take a new problem and solve it automatically, using insights and capabilities derived from previously solved problems. Our early work in this field mainly used reinforcement learning, but we are also interested in evolutionary algorithms.

Last year, we showed how evolutionary algorithms can be used to automatically discover neural network architectures for a variety of visual tasks. We also explored how reinforcement learning can be applied to problems beyond neural architecture search. Our work shows that it can be used to automatically generate image transformation sequences that improve the accuracy of a wide variety of image models, and to find new symbolic optimization expressions that are more effective than commonly used optimization update rules. Our work on AdaNet showed how to build a fast and flexible AutoML algorithm with learning guarantees.


AdaNet adaptively grows an ensemble of neural networks. In each iteration, it measures the ensemble loss for each candidate and selects the best one to carry into the next iteration.
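The iteration described in the caption can be illustrated with a toy greedy loop. This is only a sketch of the selection step, not AdaNet itself: it uses a plain unweighted average instead of learned mixture weights, mean squared error as the loss, no complexity penalty, and hypothetical function names.

```python
def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

def ensemble_predict(members, xs):
    # Unweighted average of member predictions (a simplifying assumption).
    return [sum(f(x) for f in members) / len(members) for x in xs]

def grow_ensemble(candidates, xs, ys, rounds):
    """Greedily add the candidate that gives the lowest ensemble loss."""
    ensemble, best_loss = [], float("inf")
    for _ in range(rounds):
        scored = [
            (mse(ensemble_predict(ensemble + [c], xs), ys), i)
            for i, c in enumerate(candidates)
        ]
        loss, idx = min(scored)
        if loss >= best_loss:
            break  # no candidate improves the ensemble; stop growing
        ensemble.append(candidates[idx])
        best_loss = loss
    return ensemble, best_loss
```

For example, with targets `y = 2x` and candidates `x`, `3x` and `0`, the loop first picks `x`, then adds `3x` because their average matches the targets exactly, and then stops.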

Another focus of our research is automatically discovering computationally efficient neural network architectures that can run on mobile phones or autonomous vehicles, environments that impose strict limits on computing resources or inference time. Our work shows that by combining a model's accuracy with its inference time in the reward function of a reinforcement learning architecture search, we can find highly accurate models that also meet specific performance constraints. We also explored using ML to learn to automatically compress ML models so that they use fewer parameters and fewer computing resources.
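One way to picture a reward that trades accuracy against inference time is a weighted product, sketched below. The source does not give a formula, so the functional form, the exponent `w`, the target latency, and the candidate numbers are all illustrative assumptions rather than the values used in the actual search.

```python
def latency_aware_reward(accuracy, latency_ms, target_ms, w=-0.07):
    """Scale accuracy by how far the measured latency is from the target.

    With a negative exponent, models slower than the target are penalized
    and faster models get a small bonus. Form and exponent are assumptions.
    """
    return accuracy * (latency_ms / target_ms) ** w

def pick_best(candidates, target_ms):
    # candidates: list of (name, accuracy, latency_ms) tuples (hypothetical data)
    return max(
        candidates,
        key=lambda c: latency_aware_reward(c[1], c[2], target_ms),
    )[0]
```

With a target of 80 ms, a model at 76% accuracy and 120 ms can score lower than a model at 75% accuracy and 78 ms, which is the behavior a constrained search is looking for.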


The Tensor Processing Unit (TPU) is an ML hardware accelerator developed by Google, designed from the outset for large-scale training and inference. TPUs have helped Google achieve a series of research breakthroughs, including BERT, discussed above.

In addition, TPUs let researchers around the world build on Google's openly released research and pursue new breakthroughs of their own. For example, anyone can fine-tune BERT on a TPU for free through Colab, while the TensorFlow Research Cloud gives thousands of researchers access to an even larger amount of free Cloud TPU computing power.


Left: a single TPU v3 device. Right: part of a TPU v3 Pod. TPU v3 is the latest generation of Google's Tensor Processing Unit (TPU) hardware. These systems are available to external customers as Cloud TPU v3 and use liquid cooling for maximum performance (computer chips plus liquid cooling, always exciting!). A full TPU v3 Pod can apply more than 100 petaflops of computation to the world's largest machine learning problems.

Open Source Software and Data Set

In the process of collaboration with research and software engineering communities, the publication of open source software and the establishment of new public data sets have been our two most important contributions. One of our most remarkable achievements in this field is TensorFlow, which is a widely popular machine learning computing system released in November 2015. We celebrated TensorFlow's third birthday in 2018, during which it was downloaded more than 30 million times and more than 1700 contributors brought it more than 45,000 commits.

We are pleased to see that TensorFlow has the strongest GitHub user retention and engagement among the top machine learning and deep learning frameworks. The TensorFlow team is also committed to resolving GitHub issues quickly and providing a smooth path for external contributors. According to Google Scholar, TensorFlow continues to power much of the world's machine learning and deep learning research, measured by the number of published papers.

TensorFlow Lite was running on more than 1.5 billion devices worldwide just a year after its launch. In addition, TensorFlow.js has become the number one machine learning framework for JavaScript; just nine months after its launch, it had received more than 2 million Content Delivery Network (CDN) hits, 250,000 downloads, and more than 10,000 stars on GitHub.

In addition to continuing to develop existing open source ecosystems, in 2018 we introduced a new framework for flexible and reproducible reinforcement learning; built a new visualization tool for quickly understanding the characteristics of a dataset (without writing any code); added a new high-level library for expressing learning-to-rank problems (ordering a list of items in a way that maximizes the utility of the whole list, which matters to search engines, recommendation systems, machine translation, dialogue systems and even computational biology); released a fast, flexible framework for AutoML solutions with learning guarantees; built an in-browser, real-time t-SNE visualization using TensorFlow.js; and added FHIR tools and software for working with electronic health care data (discussed in the health care section of this article).


Real-time evolution of the tSNE embedding of the complete MNIST dataset, which contains 60,000 images of handwritten digits.

Demo link: https://nicola17.github.io/tfjs-tsne-demo/

Public datasets are often an important source of inspiration, and they help researchers in many fields make great progress by bringing interesting data and problems to a wider community. Public datasets also encourage healthy competition within a community to achieve better results on different tasks.

In 2018, we were delighted to release Google Dataset Search, a new tool for finding public datasets across the web. Over the years, we have curated and published many innovative datasets, including a large number of annotated images and videos, a crowdsourced Bengali dataset for speech recognition, and even robot arm grasping datasets. In 2018, we continued to add more datasets to this collection.


Use Crowdsource to add images of India and Singapore to the Open Images Extended dataset.

We also released Open Images V4, which contains 15.4 million bounding boxes (for 600 object categories on 1.9 million images) and 30.1 million human-verified image-level labels from 19,794 categories. In addition, we collected 5.5 million annotations from tens of thousands of users around the world through crowdsource.google.com, in order to broaden the dataset with more people and scenes from all over the world.

Beyond specific datasets, we also explored techniques in the Fluid Annotation project that can speed up the creation and visualization of datasets. Fluid Annotation is an exploratory, machine-learning-powered interface for annotating image content more quickly.


Fluid Annotation interface visualizes images in COCO data sets.



Application of Artificial Intelligence in Other Areas

In 2018, we applied machine learning to many problems in physics and biology. With machine learning, we can give scientists data mining capabilities equivalent to hundreds or even thousands of research assistants, significantly improving their creativity and productivity.


Our algorithm tracing a single neuron in 3D in a songbird brain.

Other applications of machine learning in science include:

  • Mining the light curves of stars to find new planets outside the solar system

  • Identifying the origin or function of short DNA sequences

  • Automatically detecting out-of-focus microscope images

  • Digitally creating images of the same cells with multiple stains

  • Automatically mapping mass spectrometry output to peptide chains


A pre-trained TensorFlow model rates the focus quality of a montage of microscope image patches of cells in Fiji (ImageJ). The hue and lightness of each patch border indicate the predicted focus quality and the prediction uncertainty, respectively.


Once the results have been clinically and scientifically validated, the next step is to conduct user and HCI studies to understand how to actually deploy them in the clinical environment. In 2018, we have further expanded the scope of research in the wide space of computer-aided diagnosis, hoping to make computer-aided diagnosis a new part of the clinical process.

At the end of 2016, we published a study showing that a model trained to detect signs of diabetic retinopathy in retinal fundus images performed on par with, or even slightly better than, US board-certified ophthalmologists. In 2018, we went a step further and showed that by training on images labeled by retinal specialists and using an adjudication protocol (in which multiple retinal specialists confer and jointly assess each fundus image), our model reached a level comparable to that of retinal specialists.

Since then, we have published an evaluation showing that ophthalmologists assisted by the machine learning model reach higher diagnostic accuracy than when working on their own. We have also worked with our Alphabet colleagues to deploy the diabetic retinopathy detection system at more than a dozen sites, including Aravind Eye Hospitals in India and Rajavithi Hospital, which is affiliated with the Ministry of Health in Thailand.


The retinal fundus image on the left was graded as moderate diabetic retinopathy ("Mo") by an adjudicated panel of ophthalmologists (the ground truth). On the upper right are the model's predicted scores ("N" is no retinopathy, "Mi" is mild retinopathy, and "Mo" is moderate retinopathy). On the lower right are the grades doctors gave without access to the model's results ("Unassisted") and with access to the model's predicted scores ("Grades Only").

In addition to our work with ophthalmologists, we published research on a machine learning model that can assess cardiovascular risk from retinal images. This raises the hope of helping clinicians better understand a patient's health through non-invasive biomarkers.

As part of this work, we developed tools that make it much easier for researchers to create such models, even for very different tasks and very different underlying electronic health record datasets. Alongside this work, we also released open source software related to the Fast Healthcare Interoperability Resources (FHIR) standard to help practitioners work with medical data more easily and in a more standardized way (see GitHub repo).

When machine learning is applied to historically collected data, it is important to understand which populations have experienced human and structural bias in the past and how that bias is encoded in the data. Machine learning gives us an opportunity to detect and address bias, and we are actively designing Google's AI systems to advance health equity.

Research outreach

We interact with the external research community in many ways, including faculty engagement and student support. We are proud to host hundreds of undergraduate, master's and PhD students as interns, and to provide multi-year PhD fellowships to students in North America, Europe and the Middle East. In addition to financial support, each fellowship recipient is paired with one or more Google researchers as mentors. We bring all the fellows together for an annual PhD Fellowship Summit at Google, where they are exposed to cutting-edge research under way at Google and have the opportunity to connect with Google researchers and other PhD fellows from around the world.

Complementing this fellowship program is the Google AI Residency, which gives people who want to learn more about deep learning research a year to work with and be mentored by Google researchers. 2018 was the program's third year. Residents are embedded in teams across Google's offices worldwide, working on machine learning, perception, algorithms and optimization, language understanding, and health care. Selection for the program's fourth year has just concluded, and we look forward to welcoming an energetic new group in 2019.

Every year, we also support the research projects of many faculty members and students through the Google Faculty Research Awards program. In 2018, we continued to host workshops for faculty and graduate students in specific fields at Google offices, including an AI/ML research and practice workshop in Bangalore, India, an algorithms and optimization workshop at our Zurich office, a workshop on machine learning applications in health care in Sunnyvale, and a workshop on fairness and bias in machine learning at our Cambridge, Massachusetts office.

We believe that contributing openly to the broader research community is essential to maintaining a healthy, productive and dynamic ecosystem for our own research. In addition to our open source projects and datasets, most of our research results are published at top conferences and in academic journals. We also actively help organize and sponsor conferences across many disciplines.

New environment, new face

In 2018, we are proud to welcome more new people with a wide range of backgrounds to join our research organization. We established the first AI Research Office in Africa in Accra, Ghana. We have expanded AI research facilities in Paris, Tokyo and Amsterdam and opened new research laboratories in Princeton. We continue to invite talented people to join our offices around the world. You can learn more about joining our research work here.


Looking forward to 2019

This blog post is only a brief summary of a small part of Google's research in 2018. Looking back on 2018, we are excited and proud of the breadth and depth of our achievements. Focusing on 2019, we look forward to a more significant and far-reaching impact on Google's own development direction and product research and development, as well as the broader scientific research and engineering fields.
