NIPS 2017 Workshop: Deep Learning At Supercomputer Scale

Five years ago, it took more than a month to train a state-of-the-art image recognition model on the ImageNet dataset. Earlier this year, Facebook demonstrated that such a model could be trained in an hour. However, if we could parallelize this training problem across the world’s fastest supercomputers (~100 PFlops), it would be possible to train the same model in under a minute. This workshop is about closing that gap: how can we turn months into minutes and increase the productivity of machine learning researchers everywhere?

This one-day workshop will facilitate active debate and interaction across many different disciplines. The conversation will range from algorithms to infrastructure to silicon, with invited speakers from Cerebras, DeepMind, Facebook, Google, OpenAI, and other organizations. When should synchronous training be preferred over asynchronous training? Are large batch sizes the key to reaching supercomputer scale, or is it possible to fully utilize a supercomputer at batch size one? How important is sparsity in enabling us to scale? Should sparsity patterns be structured or unstructured? To what extent do we expect to customize model architectures for particular problem domains, and to what extent can a “single model architecture” deliver state-of-the-art results across many different domains? How can new hardware architectures unlock even higher real-world training performance?

Our goal is to bring together people who are trying to answer any of these questions, in the hope that cross-pollination will accelerate progress towards deep learning at true supercomputer scale.

Confirmed Speakers

  1. Priya Goyal - “ImageNet in 1 Hour” - Facebook Research
  2. Timothy Lillicrap - “Scalable RL & AlphaGo” - DeepMind
  3. Nitish Keskar - “Generalization Gap” - Salesforce Research
  4. Scott Gray - “Small World Network Architectures” - OpenAI
  5. Matthew Johnson & Daniel Duckworth - “KFAC and Natural Gradients” - Google Brain
  6. Tim Salimans - “Evolutionary Strategies” - OpenAI
  7. Elad Hoffer, Itay Hubara - “Closing the Generalization Gap” - Technion
  8. Michael James - “Scaling with Small Batches” - Cerebras
  9. Azalia Mirhoseini - “Learning Device Placement” - Google Brain
  10. Gregory Diamos - “Scaling is Predictable” - Baidu
  11. Simon Knowles - “Scalable Silicon Compute” - GraphCore
  12. Sam Smith - “Don’t Decay the Learning Rate, Increase the Batchsize” - Google Brain
  13. Shankar Krishnan - “Neumann Optimizer” - Google Brain
  14. Ujval Kapasi - “Practical Scaling Techniques” - NVIDIA
  15. Thorsten Kurth - “Scaling Deep Learning to 15 PetaFlops” - Lawrence Berkeley National Labs
  16. Chris Ying - “ImageNet is the New MNIST” - Google Brain

Important Dates

Accepted Posters

Registration Awards

We will be offering NIPS workshop registrations to the four best papers submitted by students at academic institutions. Please indicate in your submission if you would like to apply for this award.

Contact Us

Feel free to reach out to us with any questions you might have.