By Terry Taewoong Um, University of Waterloo.
We believe there are classic deep learning papers worth reading regardless of their application domain. Rather than providing an overwhelming number of papers, we would like to provide a curated list of awesome deep learning papers that are considered must-reads in certain research domains.
Before this list, other awesome deep learning lists already existed, for example, Deep Vision and Awesome Recurrent Neural Networks. Also, after this list came out, another awesome list for deep learning beginners, called Deep Learning Papers Reading Roadmap, was created and has been loved by many deep learning researchers.
Although the Roadmap List includes many important deep learning papers, reading them all feels overwhelming to me. As I mentioned in the introduction, I believe that seminal works can give us lessons regardless of their application domain. Thus, I would like to introduce the top 100 deep learning papers here as a good starting point for getting an overview of deep learning research.
To get news of newly released papers every day, follow my Twitter or Facebook page!
Awesome list criteria
- This is a list of the top 100 deep learning papers published from 2012 to 2016.
- If a paper is added to the list, another paper (usually from the "More Papers from 2016" section) should be removed to keep the list at 100 papers. (Thus, removing papers is as important a contribution as adding them.)
- Papers that are important but did not make the list are listed in the More than Top 100 section.
- Please refer to the New Papers and Old Papers sections for papers published in the last 6 months or before 2012.
- < 6 months : New Papers (by discussion)
- 2016 : +60 citations or "More Papers from 2016"
- 2015 : +200 citations
- 2014 : +400 citations
- 2013 : +600 citations
- 2012 : +800 citations
- ~2012 : Old Papers (by discussion)
Please note that we prefer seminal deep learning papers that can be applied to various research areas over application papers. For that reason, some papers that meet the criteria may not be accepted while others may be. Acceptance depends on the impact of the paper, its applicability to other research, the scarcity of the research domain, and so on.
Editor: What follows is a selection of curated papers from this list, one from each category, as selected by the author. Please see the original for the full listing.
1. Understanding / Generalization / Transfer
Distilling the knowledge in a neural network (2015), G. Hinton et al. [pdf]
2. Optimization / Training Techniques
Batch normalization: Accelerating deep network training by reducing internal covariate shift (2015), S. Ioffe and C. Szegedy [pdf]
3. Unsupervised / Generative Models
Unsupervised representation learning with deep convolutional generative adversarial networks (2015), A. Radford et al. [pdf]
4. Convolutional Neural Network Models
Deep residual learning for image recognition (2016), K. He et al. [pdf]
5. Image: Segmentation / Object Detection
Fast R-CNN (2015), R. Girshick [pdf]
6. Image / Video / Etc.
Show and tell: A neural image caption generator (2015), O. Vinyals et al. [pdf]
7. Natural Language Processing / RNNs
Learning phrase representations using RNN encoder-decoder for statistical machine translation (2014), K. Cho et al. [pdf]
8. Speech / Other Domain
Speech recognition with deep recurrent neural networks (2013), A. Graves et al. [pdf]
9. Reinforcement Learning / Robotics
Human-level control through deep reinforcement learning (2015), V. Mnih et al. [pdf]
10. More Papers from 2016
Domain-adversarial training of neural networks (2016), Y. Ganin et al. [pdf]
Bio: Terry Taewoong Um is a PhD candidate at the University of Waterloo. Terry completed his B.S. and M.S. in Mechanical and Aerospace Engineering at Seoul National University in 2008 and 2010, respectively. He also worked at LIG Nex1 and the Korea Institute of Science and Technology (KIST) until 2014. His previous research was on robot motion planning and power-assist exoskeletons. In his Ph.D. studies, he is focusing on introducing Lie group geometry to deep learning techniques for learning human and robot motions.
Original. Reposted with permission.