
Launching the Wolfram Neural Net Repository

June 14, 2018
Sebastian Bodenstein, Senior Developer, Advanced Research Group
Matteo Salvarezza, Developer, Advanced Research Group
Meghan Rieu-Werden, Data Manager, Advanced Research Group
Taliesin Beynon, Lead Developer, Advanced Research Group


Today, we are excited to announce the official launch of the Wolfram Neural Net Repository! A huge amount of work has gone into training or converting around 70 neural net models that now live in the repository, and can be accessed programmatically in the Wolfram Language via NetModel:


net = NetModel["ResNet-101 Trained on ImageNet Competition Data"]

(Example input: a photo of a peacock; the net's output identifies it as a peacock.)

Neural nets have generated a lot of interest recently, and rightly so: they form the basis for state-of-the-art solutions to a dizzying array of problems, from speech recognition to machine translation, from autonomous driving to playing Go. Fortunately, the Wolfram Language now has a state-of-the-art neural net framework (and a growing tutorial collection). This has made possible a whole new set of Wolfram Language functions, such as FindTextualAnswer, ImageIdentify, ImageRestyle and FacialFeatures. And deep learning will no doubt play an important role in our continuing mission to make human knowledge computable.

However, training state-of-the-art neural nets often requires huge datasets and significant computational resources that are inaccessible to most users. A repository of nets gives Wolfram Language users easy access to the latest net architectures and pre-trained nets, representing thousands of hours of computation time on powerful GPUs.

A great thing about the deep learning community is that it’s common for researchers to make their trained nets publicly available. These are often in the form of disparate scripts and data files using a multitude of neural net frameworks. A major goal of our repository is to curate and publish these models into a standard, easy-to-use format soon after they are released. In addition, we are providing our own trained models for various tasks.

This blog will cover three main use cases of the Wolfram Neural Net Repository:

  • Exposing technology based on deep learning. Although much of this functionality will eventually be packaged as official Wolfram Language functions, the repository provides early access to a large set of functionality that until now was simply unavailable in the Wolfram Language.
  • Using pre-trained nets as powerful feature extractors. Pre-trained nets can be used as powerful FeatureExtractor functions throughout the Wolfram Language’s other machine learning functionalities, such as Classify, Predict, FeatureSpacePlot, etc. This gives users fine-grained control over incorporating prior knowledge into their machine learning pipelines.
  • Building nets using off-the-shelf architectures and pre-trained components. Access to carefully designed and trained modules unlocks a higher-level paradigm for using the Wolfram neural net framework. This paradigm frees users from the difficult and laborious task of building good net architectures from individual layers and allows them to transfer knowledge from nets trained on different domains to their own problems.
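As a quick sketch of the feature-extraction use case: a model name from the repository can be supplied directly as the FeatureExtractor (here imgs is assumed to be a list of images you have already defined):

FeatureSpacePlot[imgs,
 FeatureExtractor ->
  NetModel["ResNet-101 Trained on ImageNet Competition Data"]]

The same FeatureExtractor specification works with Classify and Predict.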

An important but indirect benefit of having a diverse and rich library of nets available in the Wolfram Neural Net Repository is to catalyze the development of the Wolfram neural net framework itself. In particular, the addition of models operating on audio and text has driven a diverse set of improvements to the framework; these include extensive support for so-called dynamic dimensions (variable-length tensors), five new audio NetEncoder types, and NetStateObject for easy recurrent generation.

An Example

Each net published in the Wolfram Neural Net Repository gets its own webpage. Here, for example, is the page for a net that predicts the geoposition of an image:


At the top of the page is information about the net, such as its size and the data it was trained on. In this case, the net was trained on 100 million images. After that is a Wolfram Notebook showing how to use the net, which can be downloaded or opened in the Wolfram Cloud via these buttons:

Download examples

Using notebooks in the Wolfram Cloud lets you run the examples in your browser without needing to install anything.

Under the Basic Usage section, we can immediately see how easy it is to perform a computation with this net. Let’s trace this example in more detail. Firstly, we obtain the net itself using NetModel:


net = NetModel["ResNet-101 Trained on YFCC100m Geotagged Data"]


The first time this particular net is requested, the WLNet file will be downloaded from Wolfram Research’s servers, during which a progress window will be displayed:

Downloading content

Next, we apply this net to an image to obtain its prediction, which is the geographic position where the photo was taken:


position = net[]

The fact that this net produces a GeoPosition object as its output is in sharp contrast to most other frameworks, where only numeric arrays are valid inputs and outputs of a net. In those frameworks, a separate script is then required to import an image, reshape it, conform it to the correct color space and possibly remove the mean image, before producing the numeric tensor the net requires. In the Wolfram Language, we like nets to be “batteries included,” with the pre- and post-processing logic as part of the net itself. This is achieved by having an "Image" NetEncoder attached to the input port of the net and a "Class" NetDecoder that interprets the output as a GeoPosition object.
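The attached encoder and decoder can be inspected directly with NetExtract (a quick sketch, applied to the net obtained above):

NetExtract[net, "Input"]  (* the "Image" NetEncoder *)
NetExtract[net, "Output"]  (* the "Class" NetDecoder *)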

As the net returns a GeoPosition object rather than a simple list of data, further computation can immediately be performed on it. For example, we can plot the position on a map:


GeoGraphics[GeoMarker[position], GeoRange -> 4000000]


(A map with the predicted position marked.)

After the basic example section are sections with other interesting demonstrations—for example:

Multiple predictions

One very important feature we provide is the ability to export nets to other frameworks. Currently, we support exporting to Apache MXNet, and the final section in each example page usually shows how to do this:

Export to MXNet
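As a sketch of what such an export looks like, a single Export call writes the net in MXNet format (the filename here is illustrative), producing a .json architecture file along with a .params file holding the weights:

Export["net.json", net, "MXNet"]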

After the examples is a link to a notebook that shows how a user might construct the net themselves using NetChain, NetGraph and individual layers:

Construction notebook
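To give a flavor of what such a construction looks like, here is a small illustrative classifier assembled from individual layers (a sketch, not one of the repository nets):

NetChain[{
   ConvolutionLayer[16, 3], Ramp, PoolingLayer[2],
   FlattenLayer[], LinearLayer[10], SoftmaxLayer[]},
 "Input" -> NetEncoder[{"Image", {32, 32}}]]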

What’s in the Wolfram Neural Net Repository So Far?

We have invested much effort in converting publicly available models from other neural net frameworks (such as Caffe, Torch, MXNet, TensorFlow, etc.) into the Wolfram neural net format. In addition, we have trained a number of nets ourselves. For example, the net called by ImageIdentify is available via NetModel["Wolfram ImageIdentify Net V1"]. As of this release, there are around 70 available models:



Because adding new nets is an ongoing task, many more nets will be added over the next year. Let us have a look at some of the major classes of nets available in the repository.
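The current contents of the repository can be listed programmatically; evaluating NetModel with no arguments returns the names of all available models:

NetModel[]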

There are nets that perform classification—for example, for determining the type of object in an image:

ResNet-101 Trained on ImageNet Competition Data

NetModel["ResNet-101 Trained on ImageNet Competition Data"][image]

Or estimating a person’s age from an image of their face:

Age Estimation VGG-16 Trained on IMDB-WIKI Data

NetModel["Age Estimation VGG-16 Trained on IMDB-WIKI Data"][face]

There are nets that perform regression—for example, predicting the location of the eyes, mouth and nose in an image of a face:




landmarks =
 NetModel["Vanilla CNN for Facial Landmark Regression"][face]


HighlightImage[face, {PointSize[0.04], landmarks},
 DataRange -> {{0, 1}, {0, 1}}]

Or reconstructing the 3D shape of a face:

Unguided Volumetric Regression Net for 3D Face Reconstruction

Image3D[
 NetModel["Unguided Volumetric Regression Net for 3D Face \
Reconstruction"][face], "Byte", BoxRatios -> {1, 1, 0.5},
 ViewPoint -> Below]

There are nets that perform speech recognition:

record = AudioCapture["Memory"]

Deep Speech 2 Trained on Baidu English Data

NetModel["Deep Speech 2 Trained on Baidu English Data"][record]