We heard a while back that ImageNet is over. Well, Fei-Fei Li recently gave a talk to cap off the end of ImageNet, reviewing its incredibly important history and where it's headed. ImageNet was definitely important (granted, this is from a casual observer's point of view - I'm sure there were a lot of other important threads of development at the same time). When people talk about when neural networks, and particularly CNNs, came to dominate, they go back to ImageNet 2012. And then VGG, Inception, ResNets - all these milestones come from ImageNet competitions. Transfer learning owes a lot to the VGG models. And now it's moving to Kaggle, which is a natural progression. Read the slides! It's a great review of ImageNet's history.
A quick round-up of papers I really haven't had time to read (carefully):
Deep Neural Networks Do Not Recognize Negative Images
by Hossein Hosseini, Radha Poovendran
A simple experiment with expected, if nevertheless concerning, results. If you take a deep learning network trained on MNIST and run it on negative (intensity-inverted) MNIST images, the model falls apart. This should not be surprising. An MNIST classification model is relatively low-capacity and has seen only a fraction of the visual space. Interpreting negative images makes sense to humans, but isn't natural for a model that's designed to act on a particular data domain. I am also not sure any other existing algorithm, trained on MNIST, would pass this test. Goodfellow has a good tweet on this too.
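To make the failure mode concrete, here's a toy sketch of the experiment - not the paper's setup (a CNN on MNIST), but a nearest-class-mean classifier on synthetic 4x4 "digits"; all shapes and parameters here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy "digit" templates (4x4 images with intensities in [0, 1]).
template_a = np.zeros((4, 4)); template_a[:, 1] = 1.0  # vertical stroke
template_b = np.zeros((4, 4)); template_b[1, :] = 1.0  # horizontal stroke

def sample(template, n):
    """Noisy positive-polarity samples of a template, flattened."""
    noisy = template + 0.1 * rng.standard_normal((n, 4, 4))
    return np.clip(noisy, 0.0, 1.0).reshape(n, -1)

X = np.vstack([sample(template_a, 200), sample(template_b, 200)])
y = np.array([0] * 200 + [1] * 200)

# "Train": a nearest-class-mean classifier, standing in for any model
# fit only to the positive-polarity data domain.
means = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(images):
    dists = ((images[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

acc_original = (predict(X) == y).mean()        # ~1.0
acc_negative = (predict(1.0 - X) == y).mean()  # ~0.0 on negative images
```

Inverting intensities moves every image far from its own class mean and (for these complementary templates) closer to the other class's, so accuracy collapses - the same qualitative effect the paper reports for CNNs on MNIST.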
Optimizing the Latent Space of Generative Networks
by Piotr Bojanowski, Armand Joulin, David Lopez-Paz, Arthur Szlam
FAIR has a new paper on GANs, or rather, a paper about whether the success of GANs in generating images should be attributed to the adversarial training regime or simply to the choice of network (deconvolutional/transposed-convolutional architectures). They train a decoder-only convolutional network (similar to an autoencoder without the encoder, where the latent code is learnable per instance), and find that the model can be used for generation and even interpolation on par with some GAN models. Interesting food for thought: convolutional decoders, which have been the primary site of GANs' success, may on their own just be great at generating images. Nevertheless, I am sure this can be used in tandem with GANs rather than replacing them.
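The core recipe - no encoder, no discriminator, just one learnable latent code per training image optimized jointly with the decoder on reconstruction loss - is easy to sketch. Here is a minimal version on toy data with a linear map standing in for the paper's deconvolutional decoder (dimensions, learning rate, and step count are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "images": 100 points of dimension 32 lying near a 2-D subspace.
true_z = rng.standard_normal((100, 2))
true_W = rng.standard_normal((2, 32))
X = true_z @ true_W + 0.01 * rng.standard_normal((100, 32))

# One free latent code per training image, plus decoder weights.
Z = 0.5 * rng.standard_normal((100, 2))  # learnable per-instance latents
W = 0.5 * rng.standard_normal((2, 32))   # linear stand-in for the decoder
lr = 0.2

loss_start = ((Z @ W - X) ** 2).mean()
for _ in range(3000):
    err = Z @ W - X                   # reconstruction residual, (100, 32)
    W -= lr * (Z.T @ err) / len(X)    # update the shared decoder
    Z -= lr * (err @ W.T) / len(X)    # update each image's latent code
loss_end = ((Z @ W - X) ** 2).mean()  # far below the starting loss
```

After training, sampling or interpolating in the latent space and decoding plays the role a GAN generator usually does.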
Houdini: Fooling Deep Structured Prediction Models
by Moustapha Cisse, Yossi Adi, Natalia Neverova, Joseph Keshet
FAIR also has a new paper out on generating adversarial examples for models beyond just classification. They show some pretty cool examples of tricking models for human-pose estimation and speech recognition.
A Distributional Perspective on Reinforcement Learning - Blog
by Marc G. Bellemare, Will Dabney, Rémi Munos
New research from DeepMind on reinforcement learning, this time doing away with estimating means (e.g. the mean reward) and instead modeling the whole distribution of returns. Sensible, and apparently leads to better results.
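Concretely, the paper's categorical algorithm keeps a probability distribution over a fixed grid of return values and applies the Bellman backup to the whole distribution, projecting the shifted support back onto the grid. A minimal numpy sketch of that projection step (support bounds and the uniform starting distribution are illustrative; the paper's best variant uses 51 atoms):

```python
import numpy as np

# Distribution over returns on a fixed support of atoms.
V_MIN, V_MAX, N_ATOMS = 0.0, 10.0, 51
support = np.linspace(V_MIN, V_MAX, N_ATOMS)
dz = support[1] - support[0]

def bellman_project(probs, reward, gamma):
    """Shift the support by r + gamma*z, then project the probability
    mass back onto the fixed atoms (splitting between neighbors)."""
    tz = np.clip(reward + gamma * support, V_MIN, V_MAX)
    b = (tz - V_MIN) / dz                   # fractional atom index
    lo, hi = np.floor(b).astype(int), np.ceil(b).astype(int)
    out = np.zeros_like(probs)
    np.add.at(out, lo, probs * (hi - b))    # mass to lower neighbor
    np.add.at(out, hi, probs * (b - lo))    # mass to upper neighbor
    np.add.at(out, lo, probs * (lo == hi))  # mass landing exactly on an atom
    return out

p = np.full(N_ATOMS, 1.0 / N_ATOMS)         # uniform belief over returns
p2 = bellman_project(p, reward=1.0, gamma=0.9)
mean_before = (p * support).sum()           # 5.0
mean_after = (p2 * support).sum()           # 1.0 + 0.9 * 5.0 = 5.5
```

In the full agent this backup target is compared to the network's predicted distribution with a cross-entropy loss; absent clipping, the projection preserves the mean, so the usual expected-value backup falls out as a special case.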
StyleBank: An Explicit Representation for Neural Image Style Transfer
by Dongdong Chen, Lu Yuan, Jing Liao, Nenghai Yu, Gang Hua
Microsoft has a new paper on Style Transfer!
We have talked about the previous iteration of this project before. This project again learns to transfer the style of Chinese fonts to new characters. This time, it's powered by GANs (along with several other ideas, such as Google's zero-shot GNMT)! The results continue to be pretty impressive. In this case, the model is actually trained on a large number of fonts all at once, learning an embedding for each style. (The decoder then takes a character and applies that embedding.) Because it's an embedding, you can also show smooth transitions between styles! Furthermore, the author even tests the network on Korean characters. This is a cool and really complete project with an in-depth write-up, so I definitely recommend checking it out. (zi is the Chinese word for "character".)
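The smooth style transitions fall out of the embedding directly: interpolate between two learned style vectors and decode. A sketch with made-up shapes and a linear map standing in for the conv decoder (everything here is illustrative, not the project's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(1)

n_styles, emb_dim, glyph_dim = 10, 8, 16              # assumed sizes
style_emb = rng.standard_normal((n_styles, emb_dim))  # one learned vector per font
W = rng.standard_normal((emb_dim, glyph_dim))         # linear decoder stand-in

def decode(content, style_vec):
    # Real model: a conv decoder taking (encoded character, style embedding).
    return content + style_vec @ W

content = rng.standard_normal(glyph_dim)  # stand-in for an encoded character

# Interpolate between style 0 and style 3 to render a smooth transition.
frames = [decode(content, (1 - a) * style_emb[0] + a * style_emb[3])
          for a in np.linspace(0.0, 1.0, 5)]
```

Each intermediate frame is the same character rendered in a blend of the two styles, which is exactly what the project's style-transition animations show.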
From FAIR: DrQA is a platform (with models!) for question-and-answering. In this case, the task is, given a question, to retrieve an answer from a large domain of text (e.g. all of wikipedia). It includes several models, including non-neural network components.
Yandex has released a new library for gradient boosting on decision trees! It supposedly outperforms some existing ones (e.g. XGBoost), but I don't have enough experience with them to confirm.
A really, really good blog post with an accompanying notebook on how vulnerable neural network or "big data" models are to unintentional biases. The author basically takes some SOTA models and procedures, runs them on a standard dataset, and shows how even an earnest effort can lead to terrible biases in the model. The code examples and results really drive home how easy it is to fall into this trap. Absolutely worth a read.
Apple has made its first machine learning research blog post, in line with its goal of publishing more open research to attract research talent. The post is about a GAN-like regime for refining synthetic images (think 3D renders) to look like real ones. There are a couple of differences from the usual GAN training setup: instead of starting from noise, the generator starts from synthetic images; an $L1$ loss ensures the refined images remain close to the synthetic ones; and there are discriminator-related tweaks, such as letting it remember past generator samples, for faster training. Since this is Apple, we may not get open-sourced code, but this is a good start for the tech giant.
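In loss terms, the refiner described in the post optimizes something like an adversarial term plus an L1 "self-regularization" term tying output to input. A sketch (the function name and lambda weight are mine, not Apple's):

```python
import numpy as np

def refiner_loss(refined, synthetic, d_prob_real, lam=0.1):
    """GAN-style refiner objective: fool the discriminator while
    keeping the refined image close to the synthetic input."""
    adv = -np.log(np.clip(d_prob_real, 1e-8, 1.0))  # discriminator's P(real)
    l1 = np.abs(refined - synthetic).mean()         # self-regularization
    return adv + lam * l1
```

The lam weight trades off realism against preserving the annotations that made the synthetic image useful in the first place.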
by Francois Chollet
The creator of Keras put out two blog posts summarizing his thoughts on the current state, pitfalls, and future of Deep Learning. I believe these are either summarized or expanded excerpts from his upcoming book. His conclusions are fairly reasonable (don't get carried away anthropomorphizing networks that get good results; the future of DL may point to more logical/reasoning components), and the content itself is relatively accessible. A recommended quick read.
Adversarial Attacks Q&A
by Ian Goodfellow, Alexey Kurakin
Not quite a blog post, but this is the Quora Q&A on adversarial attacks run by Ian Goodfellow and Alexey Kurakin. The Q&A actually extends into other topics too, like GANs (also invented by Goodfellow) and tips for beginners in ML/DL. I highly recommend reading through the answers from two of the top minds and rising stars of the Deep Learning community.
As expected, a response to the "NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles" paper from last week. Basically, adversarial attacks are pretty robust, and there's a sliding scale of how adversarial/human-fooling we want them to be. Goodfellow also chimes in on this.
Learning to Learn
by Chelsea Finn
A quick recap by Chelsea Finn of Berkeley on meta-learning. It goes through a lot very quickly but should serve as a good road-map for the topic.
On the State of the Art of Evaluation in Neural Language Models
by Gábor Melis, Chris Dyer, Phil Blunsom
An evaluation of many recurrent neural network architectures on language modelling tasks. Turns out a well-tuned LSTM is still hard to beat.
Near human performance in question answering
by Yoav Goldberg
Slides from Yoav Goldberg on why QA isn't solved yet.
Deep Learning for NLP Best Practices
by Sebastian Ruder
Haven't had time to comb through this, but looks good!
37 Reasons why your Neural Network is not working
by Slav Ivanov
Some lessons from the field.