Machine Learning Mailing List - Issue 6

Jason Phang, Mon 22 May 2017, Machine learning mailing list

deep-learning, machine-learning, mlml

It's been quite an eventful few weeks in the world of Machine Learning. The submission deadline for NIPS 2017 just passed, with a total of 3297 submissions. Can't wait to see the new innovations and techniques coming up!

ParlAI, a dialog research platform by Facebook

ParlAI - GitHub - arXiv

Facebook AI Research has developed a platform for carrying out dialog and text-based language research. I haven't dug too deeply into it, but it looks like it ropes in common benchmark data sets, as well as a hook into Amazon Mechanical Turk (commonly used for creating/evaluating research data), for a standardized research/evaluation/benchmarking interface. I haven't seen too much fanfare about this one, but maybe we'll start to see FAIR's future papers make more use of it.

'Quick, Draw!' and SketchRNN by Google

A Neural Representation of Sketch Drawings - Blog Post
     by David Ha, Douglas Eck

Sketch RNN - GitHub - Notebook
     by Magenta (Google Brain)

Quick, Draw! Data

This is actually a follow-up release of code and data for a research paper from a while back, but it's tremendously comprehensive and worth walking through.

In April, David Ha and Douglas Eck released a paper titled A Neural Representation of Sketch Drawings, which described a network that sketches pictures of things. Rather than the usual ConvNet approach to generating images, this network draws sketches line by line, sequentially, much as we might on paper. They mentioned then that they were using a data set from Quick, Draw! (more on that in a bit), but did not release the data. Nevertheless, the paper and the results were pretty interesting.
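To give a sense of what the model actually predicts, here's a minimal sketch of the stroke-based representation as I understand it from the paper: each point is an offset (dx, dy) from the previous pen position, plus a flag for whether the pen is lifted afterwards. The example sequence and helper below are made up purely for illustration, not taken from the released data.

```python
import numpy as np

# Made-up example sequence in a (dx, dy, pen_lifted) format:
# each row moves the pen relative to its previous position, and
# pen_lifted = 1 marks the end of a stroke.
sketch = np.array([
    [ 10,   0, 0],   # move right, pen down
    [  0,  10, 0],   # move down
    [-10,   0, 1],   # move left, then lift the pen (stroke ends)
    [  5,  -5, 0],   # start a new stroke elsewhere
    [  5,   5, 1],
], dtype=np.float32)

def to_strokes(seq):
    """Convert (dx, dy, pen_lifted) offsets into absolute-coordinate strokes."""
    strokes, current = [], []
    x = y = 0.0
    for dx, dy, lifted in seq:
        x, y = x + dx, y + dy
        current.append((x, y))
        if lifted:              # pen lifted: close off this stroke
            strokes.append(current)
            current = []
    if current:
        strokes.append(current)
    return strokes

for i, stroke in enumerate(to_strokes(sketch)):
    print(f"stroke {i}: {stroke}")
```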

Now we're getting the full release of the data set, as well as more exposure on Google Magenta's research blog. For some context, Quick, Draw! was one of several Google A.I. Experiments used to source various kinds of human input data, presented as tiny games/challenges. So it's nice that this data and research is now being opened up for everyone to expand upon.

I've long thought that the big tech companies (and maybe other media companies too) have a big opportunity in making little apps and games with a large user audience that can help generate data for specific research projects. (In particular, I predict we'll soon see a Turing-test-style platform from either Google or Facebook, where humans and researcher-submitted agents alike interact and try to guess whether their counterparts are human or AI. I'm calling it now.) I suspect we'll be seeing many more of these experiments in the future.

Mixing Instruments with NSynth by Google

Making a Neural Synthesizer Instrument - arXiv
     by Magenta (Google Brain)

We previously talked about Magenta and NSynth, a WaveNet-based instrument synthesizer. Magenta has released a new set of user-friendly tools for people to play with NSynth, as well as additional analysis of its latent code representation of instruments and sounds, and other creative explorations of ways to "misuse" NSynth. The main downside of WaveNet remains that it is relatively slow to run, but this is nevertheless an interesting read, particularly for audiophiles.
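The core trick behind "mixing" instruments is interpolating between the latent codes of two sounds before decoding. The snippet below sketches only that blending step; the latent shapes are placeholders I made up, and the actual encoding/decoding would be done by NSynth's WaveNet autoencoder rather than anything shown here.

```python
import numpy as np

# Rough sketch of the instrument-mixing idea: encode two sounds into latent
# codes, blend the codes, then decode the blend back into audio. Only the
# blending step is shown; the shapes below are illustrative placeholders.
def mix_latents(z_a, z_b, alpha=0.5):
    """Linearly interpolate two latent codes; alpha=1 gives pure z_a, alpha=0 pure z_b."""
    return alpha * z_a + (1.0 - alpha) * z_b

z_flute = np.random.randn(125, 16)  # placeholder latent code for one note
z_bass = np.random.randn(125, 16)   # placeholder latent code for another
z_blend = mix_latents(z_flute, z_bass, alpha=0.3)
print(z_blend.shape)  # same shape as the inputs; this is what you'd decode
```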

arXiv Roundup

A quick round-up of papers I really haven't had time to read (carefully):

Evaluating vector-space models of analogy
     by Dawn Chen, Joshua C. Peterson, Thomas L. Griffiths

So we should all be familiar by now with the impressive property of word2vec-style word embeddings that lets them perform semantic analogy operations such as King - Man + Woman = Queen. It's a nice property that shows up in every introductory text on word2vec, and word embeddings have found tremendous applications in research, but it's worth exploring just how much information is actually captured in those embeddings. Researchers from Berkeley poked further at the power of these vector-based analogies and found that while they work well for certain kinds of analogies, they completely fall apart for others. The authors run through several forms of analogy where embeddings break down, making generous use of Mechanical Turk for data generation and evaluation. Worth a look, and certainly worth keeping in mind as we run up against the limits of word embeddings and of operating in the embedded latent space.
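For concreteness, here's a toy version of the analogy operation being evaluated (often called 3CosAdd: find the vocabulary word closest to a - b + c by cosine similarity, excluding the query words). The embeddings below are made up for illustration; with real word2vec vectors you'd load a pretrained model instead.

```python
import numpy as np

# Toy vocabulary with hand-picked 3-d vectors; real embeddings would be
# hundreds of dimensions and loaded from a pretrained model (e.g. via gensim).
emb = {
    "king":  np.array([0.8, 0.9, 0.1]),
    "queen": np.array([0.8, 0.1, 0.9]),
    "man":   np.array([0.2, 0.9, 0.1]),
    "woman": np.array([0.2, 0.1, 0.9]),
    "apple": np.array([0.1, 0.1, 0.1]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def analogy(a, b, c):
    """Return the word whose vector is closest to emb[a] - emb[b] + emb[c]
    (e.g. king - man + woman -> queen), excluding the query words."""
    target = emb[a] - emb[b] + emb[c]
    candidates = [w for w in emb if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(emb[w], target))

print(analogy("king", "man", "woman"))  # -> "queen" on this toy vocabulary
```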

A Survey on Deep Learning in Medical Image Analysis
     by Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen A.W.M. van der Laak, Bram van Ginneken, Clara I. Sánchez

Neural Style Transfer: A Review
     by Yongcheng Jing, Yezhou Yang, Zunlei Feng, Jingwen Ye, Mingli Song

Two review papers that I absolutely have not had the time to review. But review papers are always good for a quick recap and organization of common benchmarks and methods.

Outline Colorization through Tandem Adversarial Networks
     by Kevin Frans

This was written by a high-schooler! I feel bad about all my life decisions now.

Other News

  • Google DeepMind's AlphaGo is holding its Future of Go Summit next week (23-27 May), where it'll play one-on-one against the world's top player, Ke Jie, among other challenges. Definitely worth keeping an eye on.
  • Stanford is releasing a massive medical image data set. Like the original ImageNet, it's open for research but requires authorization to access the data.
  • Google has announced a second generation of TPUs, while NVIDIA has a fancy new data-center chip.

Contents of this post are intended for entertainment, and only secondarily for information purposes. All opinions, omissions, mistakes and misunderstandings are my own.