By Xiaocha, Lizi, and Annie, reporting from Aofeisi. Produced by QbitAI (WeChat official account: QbitAI)
The ICML 2019 best papers are here!
This year, a total of 3,424 papers were submitted to the annual International Conference on Machine Learning, of which 774 were accepted. Two papers stood out from this crowded field to become the best papers of ICML 2019.
Who took the top prize? A paper by Google and collaborators, “Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations,” shows that unsupervised methods (without inductive biases) cannot learn reliable disentangled representations.
This rigorous and courageous study overturns nearly all existing results from peers in the field, and also shows that Hinton’s earlier views are problematic.
The other is “Rates of Convergence for Sparse Variational Gaussian Process Regression” by three researchers at the University of Cambridge.
Let's look at this year's best research in detail:
Best Paper 1: Disentangled Representations, Unsupervised Learning
To sum up in one sentence: a team from Google Brain, ETH Zurich, and the Max Planck Institute tested 12,000 models and raised serious questions about existing research on unsupervised disentangled representation learning.
Understanding high-dimensional data and distilling knowledge into useful representations in an unsupervised manner is an important challenge for deep learning.
One approach is to use disentangled representations:
The model captures a variety of independent features, and if one of the features changes, the other features are not affected.
If this approach succeeds, it could yield machine learning systems that work in the real world, whether in robots or in autonomous vehicles, and that cope with scenes never seen during training.
However, from recent research on unsupervised disentangled representation learning it is hard to tell how well these methods actually work and how severe their limitations are.
The Google AI team carried out a large-scale assessment of recent results. Its findings pose serious challenges to existing research, and the team offers some advice for future work on disentangled representation learning.
What does large-scale mean here? The Google team trained 12,000 models, covering the most important current methods and evaluation metrics.
Importantly, the code used in the evaluation process, as well as 10,000 pre-trained models, have been released.
Together they form a large library called disentanglement_lib, which lets later researchers easily stand on the shoulders of their predecessors.
After the massive test, Google found two major problems:
1. No empirical evidence was found that unsupervised methods can learn reliable disentangled representations, because random seeds and hyperparameters seem to matter more than the choice of model.
That is to say, even if a large number of models are trained and some of them do learn disentangled representations, it is difficult to identify those models without looking at the ground-truth labels.
In addition, hyperparameter values that work well on one data set do not transfer well to others.
The Google team said that these results are consistent with their theorem:
Without inductive biases on both the data set and the model, it is impossible to learn disentangled representations with an unsupervised method.
In other words, you must impose assumptions on the data set and the model.
2. For the models and data sets in the assessment, it could not be confirmed that disentangled representations help downstream tasks. For example, there is no evidence that disentangled representations let a model learn from fewer labels.
The paper also offers advice for later researchers.
Incidentally, this study was accepted to an ICLR 2019 workshop before it went on to become a best paper at ICML.
Best Paper 2: Rates of Convergence for Sparse Variational Gaussian Process Regression
This year's second ICML best paper is research from the University of Cambridge and Prowler.io.
Excellent variational approximations to Gaussian process posteriors have been developed before. For a data set of size N, they avoid the O(N^3) time complexity of exact inference and reduce the computational cost to O(NM^2), where M is a number much smaller than N.
Although the computational cost is linear in N, the true complexity of the algorithm depends on how M must grow to ensure a given approximation quality.
This paper addresses the problem by characterizing the behavior of an upper bound on the KL divergence (relative entropy) to the posterior. The researchers show that, with high probability, the KL divergence can be made arbitrarily small while M grows more slowly than N. A special case: for regression with normally distributed inputs in D dimensions and the commonly used squared exponential kernel, M = O(log^D N) is enough to ensure convergence.
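To make the bounded quantity concrete, here is a minimal NumPy sketch; it is not the authors' code. It assumes a one-dimensional toy data set, a squared exponential kernel with unit lengthscale and variance, inducing inputs on an evenly spaced grid, and the standard collapsed variational lower bound for sparse GP regression. All function names are hypothetical. The sketch relies on the identity that the gap between the exact log marginal likelihood and this lower bound equals the KL divergence from the sparse approximation to the exact posterior, so printing the gap for growing M shows the divergence shrinking.

```python
import numpy as np

def sq_exp_kernel(a, b, lengthscale=1.0):
    """Unit-variance squared exponential kernel between 1-D arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def zero_mean_gaussian_logpdf(y, cov):
    """log N(y | 0, cov), computed through a Cholesky factorisation."""
    chol = np.linalg.cholesky(cov)
    whitened = np.linalg.solve(chol, y)
    return (-0.5 * whitened @ whitened
            - np.log(np.diag(chol)).sum()
            - 0.5 * len(y) * np.log(2.0 * np.pi))

def exact_log_marginal(x, y, noise_var):
    """Exact GP log marginal likelihood; this is the O(N^3) computation."""
    k_ff = sq_exp_kernel(x, x)
    return zero_mean_gaussian_logpdf(y, k_ff + noise_var * np.eye(len(x)))

def sparse_elbo(x, y, z, noise_var, jitter=1e-6):
    """Collapsed variational lower bound for inducing inputs z.

    For readability this builds the N x N matrix Q_ff explicitly, so it does
    NOT reach the O(N M^2) cost of a careful implementation.
    """
    k_uu = sq_exp_kernel(z, z) + jitter * np.eye(len(z))
    k_uf = sq_exp_kernel(z, x)
    a = np.linalg.solve(np.linalg.cholesky(k_uu), k_uf)  # shape (M, N)
    q_ff = a.T @ a                                       # low-rank approximation of K_ff
    fit = zero_mean_gaussian_logpdf(y, q_ff + noise_var * np.eye(len(x)))
    k_ff_trace = float(len(x))  # k(x, x) = 1 for the unit-variance kernel
    return fit - 0.5 * (k_ff_trace - np.trace(q_ff)) / noise_var

def kl_gap(x, y, num_inducing, noise_var=0.1):
    """log p(y) minus the lower bound, i.e. the KL divergence to the posterior."""
    z = np.linspace(x.min(), x.max(), num_inducing)  # assumed grid placement
    return exact_log_marginal(x, y, noise_var) - sparse_elbo(x, y, z, noise_var)

# Toy data, invented purely for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=200)
y = np.sin(x) + np.sqrt(0.1) * rng.standard_normal(200)

for m in (3, 6, 15):
    print(f"M = {m:2d}  KL gap = {kl_gap(x, y, m):.6f}")
```

On this smooth toy problem a handful of inducing points is already enough to bring the gap close to zero, which is the qualitative behavior the paper's bounds describe.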
The results show that, as data sets grow, Gaussian process posteriors can be approximated cheaply, and they provide a concrete rule for how to increase M in continual learning scenarios.
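As a rough illustration of such a rule, the sketch below grows M as (log N)^D and compares the leading-order operation counts N^3 and N·M^2. The helper names and the `scale` constant, which stands in for the constant hidden by the O(·) notation, are assumptions made for illustration and do not come from the paper.

```python
import math

def inducing_points_for(n, input_dim, scale=1.0):
    """Hypothetical rule M = ceil(scale * (log n)^D).

    `scale` is a placeholder for the unknown constant inside the O(.).
    """
    return max(1, math.ceil(scale * math.log(n) ** input_dim))

def exact_gp_cost(n):
    """Leading-order operation count of exact GP regression."""
    return n ** 3

def sparse_gp_cost(n, m):
    """Leading-order operation count of the sparse variational approximation."""
    return n * m ** 2

for n in (1_000, 10_000, 100_000):
    m = inducing_points_for(n, input_dim=2)
    ratio = exact_gp_cost(n) / sparse_gp_cost(n, m)
    print(f"N = {n:>7}  M = {m:>4}  exact/sparse cost ratio = {ratio:,.0f}")
```

Because M grows only polylogarithmically, the cost ratio keeps widening as N increases, although the real constants depend on the kernel, the input distribution, and the target accuracy.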
The researchers show that the bound on the KL divergence from the sparse variational GP regression approximation to the exact GP regression posterior depends only on the decay of the eigenvalues of the covariance operator of the prior kernel.
This bound proves that smooth kernels with training data concentrated in a small region admit high-quality, very sparse approximations. Truly sparse nonparametric inference, with M≪N, can still provide reliable estimates of the marginal likelihood and the pointwise posterior.
The authors conclude by pointing out that extending the analysis to models with non-conjugate likelihoods, especially to the additional error introduced by sparsity in the framework of Hensman et al., is a promising direction for future research.
The first author of the paper is David Burt, a Ph.D. student in the Department of Information Engineering at the University of Cambridge. His main research areas are Bayesian nonparametrics and approximate inference.
One of the authors, Mark van der Wilk, is a researcher at Prowler.io. He is also a Ph.D. student in machine learning at the University of Cambridge. His main research areas are Bayesian inference, reinforcement learning, and Gaussian process models.
7 best paper nominations
In addition to the 2 best papers, 7 papers received best paper nominations:
1. Analogies Explained: Towards Understanding Word Embeddings (University of Edinburgh)
2. SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver (CMU, University of Southern California, etc.)
3. A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks (Paris-Saclay University, etc.)
4. Towards A Unified Analysis of Random Fourier Features (University of Oxford, King's College London)
5. Amortized Monte Carlo Integration (University of Oxford, etc.)
6. Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning (MIT, DeepMind, Princeton)
7. Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement (University of Amsterdam, etc.)
Many Chinese universities make the list
This year's ICML is livelier than usual.
Germany's Bosch scraped the accepted-paper data from the ICML 2019 official website and compiled statistics on the acceptance rate, the institutions that contributed the most papers, and the most prolific individual authors. Many Chinese universities and scholars are on the list.
Original statistics: https://www.reddit.com/r/MachineLearning/comments/bn82ze/n_icml_2019_accepted_paper_stats/
This year, a total of 3,424 papers were submitted and 774 were accepted, an acceptance rate of 22.6%. In 2018, ICML received 2,473 submissions and accepted 621, an acceptance rate of about 25%.
Compared with last year, the number of submissions rose sharply, but the acceptance rate fell.
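The quoted rates can be checked with a few lines of arithmetic. The helper name below is made up for illustration; the counts are the ones given above.

```python
def acceptance_rate(accepted, submitted):
    """Acceptance rate as a percentage."""
    return 100.0 * accepted / submitted

rate_2019 = acceptance_rate(774, 3424)
rate_2018 = acceptance_rate(621, 2473)
growth = 100.0 * (3424 - 2473) / 2473  # growth in submissions, in percent

print(f"ICML 2019: {rate_2019:.1f}%")          # 22.6%
print(f"ICML 2018: {rate_2018:.1f}%")          # 25.1%
print(f"Submission growth: {growth:.1f}%")     # 38.5%
```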
So, among the many submitting institutions, which contributed the most?
Bosch counted the accepted papers by institution, ranking each institution by the total number of papers it contributed.
In Bosch's chart, red marks papers whose first author comes from the institution, and green marks papers whose last author does.
The results show that tech giant Google contributed the most papers, MIT came second, and the University of California, Berkeley took third place.
Tsinghua University, Peking University, Nanjing University, the Chinese University of Hong Kong, Shanghai Jiao Tong University, Alibaba, and many other Chinese universities and companies are on the list.
Among the accepted papers, far more come from academia than from industry:
Overall, the academic community contributed 77% of the papers, and the industry contributed 23%.
Among the authors of all these submissions, who contributed the most? Bosch counted this as well.
The results show that Michael Jordan, the machine learning heavyweight at the University of California, Berkeley, has the most papers. Volkan Cevher, a professor at EPFL (the Swiss Federal Institute of Technology in Lausanne), ranks second, and Sergey Levine of the University of California, Berkeley ranks third.
Quite a few Chinese scholars also did well. Zhu Jun, a professor in the Department of Computer Science and Technology at Tsinghua University, Liu Tieyan of Microsoft Research Asia, and Long Mingsheng of the School of Software at Tsinghua University each have four papers at ICML 2019.
Finally, here is the official website of ICML 2019: