# Machine Learning Mailing List - Issue 10

Jason Phang, Sun 09 July 2017, Machine learning mailing list

First week of the shorter Machine Learning Mailing List. Fitting, since it seems like (relatively) little happened this week. Apparently the Deep Learning Summer School at UMontreal is happening this week, so there are quite a few good topical talks and slides going around.

## arXiv Roundup

A quick round-up of papers I really haven't had time to read (carefully):

Efficient Attention using a Fixed-Size Memory Representation
by Denny Britz, Melody Y. Guan, Minh-Thang Luong

A simple idea / quick enhancement of attention mechanisms. In regular attention mechanisms, attention is run over all input hidden states for every output step, reaching $O(NM)$ computations altogether, where $N$ and $M$ are the input and output lengths respectively. This paper proposes an intermediary step where we summarize the $N$ hidden input states into a set of $K$ attention contexts, pushing the number of computations down to $O((N+M)K)$: $O(NK)$ to build the contexts once, plus $O(K)$ per output step. Further adjustments are made to force the intermediate computation to incorporate time-ordinality (by selectively upweighting/downweighting earlier/later states differently for the $K$ contexts).
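To make the two-stage idea concrete, here is a minimal pure-Python sketch. It is not the paper's exact formulation: the `slot_queries` vectors, the dot-product scoring, and all function names are assumptions made for illustration, and the time-ordinality weighting is left out.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def summarize(encoder_states, slot_queries):
    """Compress N encoder states into K contexts: N*K scores, computed once.

    slot_queries is a hypothetical stand-in for whatever learned scoring
    the real model uses to fill each of the K memory slots.
    """
    contexts = []
    for query in slot_queries:
        weights = softmax([dot(query, h) for h in encoder_states])
        contexts.append(weighted_sum(weights, encoder_states))
    return contexts

def attend(decoder_state, contexts):
    """One output step: only K scores, independent of the input length N."""
    weights = softmax([dot(decoder_state, c) for c in contexts])
    return weighted_sum(weights, contexts)

def score_counts(n, m, k):
    """Number of attention scores: (regular attention, fixed-size memory)."""
    return n * m, (n + m) * k
```

For example, `score_counts(100, 100, 8)` gives 10,000 scores for regular attention against 1,600 with eight contexts. The saving only materializes when $K$ is well below both $N$ and $M$.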

Online and Linear-Time Attention by Enforcing Monotonic Alignments - Blog Post
by Colin Raffel, Minh-Thang Luong, Peter J. Liu, Ron J. Weiss, Douglas Eck

Another simple idea / quick enhancement of Attention mechanisms. The idea here is that given that the peaks of attention weights tend to be monotonic in input and output steps (think about the attention-weight plots we often see, with a line going down diagonally), we might be able to save computation by simply enforcing that monotonicity. There are a few quirks about the idea that I will need to get into the paper to really understand, but it's also a pretty straightforward and actionable improvement.
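As a rough illustration of where the saving comes from, the sketch below implements a hard, left-to-right scan at decoding time: each output step resumes from the input position where the previous step stopped, and attends to the first position whose score clears a threshold. This is a simplified reading rather than the paper's full method (in particular, it says nothing about how such a model is trained), and the `energies` matrix, the 0.5 threshold, and the function names are assumptions for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def monotonic_decode(energies, threshold=0.5):
    """Pick one input position per output step, never moving backwards.

    energies[i][j] is a (hypothetical) score of output step i against
    input position j. Returns the attended index for each output step,
    or None once the scan has run off the end of the input.
    """
    n_inputs = len(energies[0]) if energies else 0
    position = 0  # where the previous output step stopped
    alignment = []
    for row in energies:
        chosen = None
        j = position
        while j < n_inputs:
            if sigmoid(row[j]) >= threshold:
                chosen = j
                break
            j += 1
        alignment.append(chosen)
        position = j  # the next step resumes here, so the pointer only advances
    return alignment
```

Because `position` only ever advances, the pointer moves forward at most $N$ times in total and each output step does at most one extra check, so the whole decode inspects $O(N+M)$ scores instead of $N \times M$. It also works online, since a step never needs input positions beyond the one it stops at.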

## Project Roundup

MuJoCo-py - GitHub
by OpenAI

OpenAI open-sources a high-performance Python wrapper around the MuJoCo robotics/physics library.

NIPS 2017: Adversarial Attacks and Defenses

Kaggle (with support from Google) is running a series of competitions for NIPS 2017 around adversarial attacks and defenses for Neural Networks. There are three sub-competitions:

1. Non-targeted Adversarial Attack: Simply confuse the classifier into predicting anything other than the correct class.
2. Targeted Adversarial Attack: Convince the classifier to misclassify an object as a specified target class.
3. Defense Against Adversarial Attack: Make a classifier robust to adversarial attacks.

Interestingly, these three sub-competitions are actually a single competition, as the adversarial attacks will be used against the classifiers from the defensive sub-competition. This will be fun to watch!
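For a feel of the difference between the two attack types, here is a toy sketch using the fast gradient sign method, a well-known attack that is my choice for illustration and not something the competition prescribes. It runs against a tiny linear softmax classifier. The weights, inputs, and function names are all made up for the example.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def predict_probs(weights, x):
    """Class probabilities of a linear softmax classifier (one weight row per class)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)

def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)

def loss_gradient(weights, x, label):
    """Gradient of the cross-entropy loss for `label` with respect to the input x."""
    probs = predict_probs(weights, x)
    grad = [0.0] * len(x)
    for c, row in enumerate(weights):
        coeff = probs[c] - (1.0 if c == label else 0.0)
        for d, w in enumerate(row):
            grad[d] += coeff * w
    return grad

def sign(v):
    return (v > 0) - (v < 0)

def non_targeted_attack(weights, x, true_label, eps):
    """Step *up* the loss of the true label: any wrong class will do."""
    grad = loss_gradient(weights, x, true_label)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

def targeted_attack(weights, x, target_label, eps):
    """Step *down* the loss of the chosen target label."""
    grad = loss_gradient(weights, x, target_label)
    return [xi - eps * sign(g) for xi, g in zip(x, grad)]
```

With the toy weights `[[1, 0], [0, 1], [-1, -1]]` and input `[0.6, 0.4]`, the clean prediction is class 0. A non-targeted step with `eps=0.3` moves the prediction off class 0, and a targeted step with `eps=1.0` toward class 2 lands on class 2. A defense, in this framing, is any classifier whose predictions survive such perturbations.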

Attention-based RNN for sentiment classification in TensorFlow. Good to see projects working on documents in other languages.

## Blog/Talk Roundup

Deep Learning: A Next Step
by Kyunghyun Cho

A quick overview of some recent research on expanding the frontier of Deep Learning networks to be more multi-purpose like the human brain. A good selection of research tackling the issue from different approaches/mechanisms.

Meow Generator
by Alexia Jolicoeur-Martineau

A quick run-through of generating cat images using various GANs, peppered with the author's comments and intuition throughout. Not a rigorous study, but still a fun read.

Unintuitive properties of neural networks
by Hugo Larochelle

Jump to slide 18 to skip the (still useful) overview of different learning/network paradigms. This talk goes through some still-annoying quirks in the modern, state-of-the-art networks and training regimes we have today. Very useful to keep in mind, especially when thinking about new research directions.

Deep learning In the brain
by Blake Aaron Richards

Turning the usual cliche of "neural networks are based on the brain" on its head, this talk instead notes how successful neural networks are, and ponders how the brain is or is not able to replicate the key properties of successful training of neural networks. A good read from a radically different perspective.

A good blog post on visualizing attention using Keras. I certainly should get back to more Keras work.

CAN (Creative Adversarial Network) — Explained
by Harshvardhan Gupta

An easy and intuitive overview of the idea and training regime behind Creative Adversarial Networks (CANs).