Amazon’s creepy facial recog doorbell, Facebook open sources machine learning code and much more

Plus: Listen to some new classical piano generated by an algorithm

Roundup Welcome to the last AI roundup of the year; thank you for reading.

We bring you the horrors of Amazon’s plans to install doorbells with facial recognition software and some new (and improved) classical piano pieces generated by a neural network.

Spying doorbell: The American Civil Liberties Union is clashing with Amazon again. And this time it’s over a patent filing that pairs facial recognition technology with a doorbell.

The New York-based nonprofit has been investigating Amazon’s Rekognition technology for a while. It called out Jeff Bezos’ biz for working with the police and the Department of Homeland Security, and even demonstrated how inaccurate facial recognition could be.

Now, it’s tutting [PDF] at Amazon’s proposal to put a camera running Rekognition in doorbells. Amazon acquired Ring, a startup that specializes in combining the two, for over $1bn earlier this year.

That’s not all, however, the ACLU claims. The patent also lays out plans to analyse “fingerprints, skin-texture analysis, DNA, palm-vein analysis, hand geometry, iris recognition, odor/scent recognition, and even behavioral characteristics, like typing rhythm, gait, and voice recognition.”

“It’s rare for patent applications to lay out, in such nightmarish detail, the world a company wants to bring about,” said Jacob Snow, a Technology & Civil Liberties Attorney for ACLU of Northern California.

“Amazon is dreaming of a dangerous future, with its technology at the center of a massive decentralized surveillance network, running real-time facial recognition on members of the public using cameras installed in people’s doorbells.”

The ACLU argued that facial recognition is biased against women and people with darker skin (because most training data is focused on white people) and is a threat to the First Amendment, since it creates a state of surveillance.

“It’s time for Amazon to take responsibility and stop chasing profit at the expense of safety and civil rights,” Snow concluded.

MLPerf results: The first round of results is in for MLPerf, a project aimed at benchmarking hardware by testing the training and inference times for different models across various AI chips.

Trying to work out the real performance of chips is a nightmare as so many companies claim to be faster than one another. So MLPerf was set up to try and test these claims in a fair way.

It covers several tasks, including image classification, object detection, machine translation, recommendation systems and reinforcement learning. The training datasets and neural network models are adapted to each task.

The first round of results has been published in a table. They’re a little confusing, and the different chips aren’t all tested across all tasks, so sometimes it’s difficult to compare them.

For example, Intel’s chips aren’t used in the object detection and machine translation tasks, there are no submissions for reinforcement learning using Nvidia’s chips and the different models of Google’s TPUs are used on different tasks.

So, what it broadly says is that in some cases GPUs are faster, in others TPUs are better and, um, Intel is behind. The Next Platform has more details.

PyTorch + Text = PyText: Facebook has released PyText, a framework that helps developers conduct natural language processing experiments and roll out the results for deployment.

It includes pre-built models in PyTorch to carry out a range of NLP tasks such as “document classification, sequence tagging, semantic parsing, multitask modeling, and other tasks”. Facebook uses the framework for Portal, a virtual assistant-like device powered by Amazon’s Alexa but with a camera, and for M, the bot that suggests features in its Messenger platform.

The idea is that developers can build PyText models in PyTorch, convert them to ONNX and export them to Caffe2 for production.

You can play around with it here.

Classic AI music: Machine-made music is notoriously bad, but it’s improving. Some of these samples from Google Magenta actually sound somewhat okay if you kind of squint with your ears.

All of it is produced by Music Transformer, a neural network that aims to create music with long-term structure, so that a string of notes actually sounds like a song instead of someone plonking random keys on a piano.

The details for Music Transformer were published in a paper on arXiv in September, but the new samples were released this week. It boils down to improving the “relative attention mechanism”, an algorithm that looks back on all the notes that were previously generated as it adds new ones.

The memory process needed to store old information is less clunky than using a recurrent neural network, and allows the researchers to train the system on longer sequences so it, too, can produce longer songs.
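For the curious, here is a rough idea of what "looking back on all previous notes" can mean in code. This is a minimal, generic sketch of causal self-attention with a relative-position term, written in plain Python. It is not the researchers' implementation, and it leaves out the memory savings they describe. The function name, the argument layout and the `rel_embeddings` table (one hypothetical learned vector per "distance back") are all assumptions made for illustration.

```python
import math


def relative_attention(queries, keys, values, rel_embeddings):
    """Naive causal self-attention with a relative-position term.

    queries, keys, values: one vector (list of floats) per note so far.
    rel_embeddings[d]: hypothetical vector for "d steps back" (d = i - j).
    This is an illustrative sketch, not the Music Transformer code.
    """
    dim = len(queries[0])
    scale = math.sqrt(dim)
    outputs = []
    for i, q in enumerate(queries):
        # Score the current note against itself and every earlier note.
        logits = []
        for j in range(i + 1):
            content = sum(a * b for a, b in zip(q, keys[j]))
            position = sum(a * b for a, b in zip(q, rel_embeddings[i - j]))
            logits.append((content + position) / scale)
        # Softmax over the scores (shifted by the max for stability).
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Blend the earlier notes' value vectors using those weights.
        width = len(values[0])
        outputs.append(
            [sum(w * values[j][k] for j, w in enumerate(weights))
             for k in range(width)]
        )
    return outputs
```

The position term depends only on how far back a note is, not on where it sits in the piece, which is one way a model could pick up on repeating patterns. Note that this naive version recomputes a score for every pair of notes, so its cost grows quickly with sequence length.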

The post also includes some examples where Music Transformer breaks down when given a few notes from Chopin’s Black-Key Etude. It doesn’t create anything that sounds like Chopin at all, but the “unconditioned samples” that don’t rely on any strict inputs from any songs sound pretty good.

You can have a listen here.