A monthly roundup of news about Artificial Intelligence, Machine Learning and Data Science. This is an eclectic collection of interesting blog posts, software announcements and data applications from Microsoft and elsewhere that I've noted over the past month or so.
Open Source AI, ML & Data Science News
PyTorch 1.1 is now available, with new support for TensorBoard and improvements to distributed training and JIT compilation.
JupyterHub 1.0 is released, a milestone for the multi-user Jupyter Notebook server.
matplotlib 3.1 is released, adding several improvements to the Python data visualization library.
Python is now included in Windows 10, with updates available via the Microsoft Store.
FastBert, a simple PyTorch interface for training text classifiers based on the popular language representation model BERT, is released.
torchvision 0.3, the PyTorch library of datasets and tools for computer vision, adds new models for semantic segmentation and object detection.
ML.NET 1.0, the open-source cross-platform machine learning framework for .NET developers, is now available.
R 3.6.0 is released, with many new capabilities and improved memory usage and performance.
Industry News
The Wolfram Engine, featured in Mathematica and including a suite of algorithms for visualization, machine learning, NLP and more, is now available free for development (license required for production).
Facebook releases Pythia, a new open-source deep learning framework based on PyTorch, for multitasking in the vision and language domain.
Google Cloud AutoML Natural Language provides an interactive UI to classify content and build a predictive ML model without coding.
Google Cloud TPU Pods, cloud-based clusters that can include more than 1,000 TPU chips as an "ML supercomputer", are now publicly available in beta. NVIDIA T4 GPUs are also now generally available in GCP.
Snips open-sources Tract, an embedded neural network inference engine designed for wake-word detection by virtual assistant devices.
Hewlett Packard Enterprise announced plans to acquire Cray, the supercomputer company.
Intel announces several open-source initiatives for AI and cloud technologies, including a Deep Learning Reference Stack optimized for Intel chipsets.
Microsoft News
Azure Machine Learning Service adds new MLOps capabilities, providing version control, audit trails, and packaging, deployment and monitoring support for machine learning models via an Azure DevOps extension. Also, model deployment to FPGA is now generally available in Azure (and in preview for Data Box Edge).
Further updates to Azure Machine Learning Service are now in preview, including a new drag-and-drop visual interface, a new form-based UI for automated machine learning, model interpretability, and hosted Python notebooks.
ONNX Runtime 0.4 is released, adding support for Intel and NVIDIA accelerators to further reduce latency for deployed neural networks.
Azure Cognitive Services has added many new capabilities, highlights of which include:
- A new category of service: Decision. The services include Content Moderator, Anomaly Detector and a new service called Personalizer (in preview).
- Conversation Transcription, a new Speech service that can distinguish between multiple speakers and transcribe their audio in real time.
- Ink Recognizer, a new Vision service that allows developers to add digital handwriting recognition to apps.
- Additional Cognitive Services offering container support: Anomaly Detector, Speech-to-Text and Text-to-Speech.
- Azure Search, which offers many cognitive skills such as image recognition and language comprehension, is now generally available.
- Form Recognizer, a new knowledge discovery service that extracts key-value pairs and tables from forms in scanned and PDF documents, now in preview.
Visual Studio Code adds Remote Development, allowing use of remote Python workspaces over SSH, in Docker containers, and in Windows Subsystem for Linux. Other recent Python support improvements include IntelliSense in the console and enhancements to the Python Language Server.
Azure Data Explorer now supports queries with custom Python code, as well as integration with Spark.
Azure SQL Database Edge, a small-footprint data engine with support for Python, R and Spark and optimized for edge devices and time-series data, is now in private preview.
Microsoft announces an end-to-end toolchain for autonomous systems (in preview), which developers can use to simulate and build robots and other AI-driven autonomous devices.
Learning Resources
A beginner's tutorial on training a convolutional neural network, using only Python and NumPy.
Rules of Machine Learning: Google's list of best practices for developers looking to create applications with machine learning capabilities.
ODSC suggests 25 public data sets to get started with machine learning, spanning text, images, and tabular data.
Foundations of Data Science, a free book by Avrim Blum, John Hopcroft and Ravi Kannan with a focus on matrix decompositions and associated ML techniques.
Azure Open Datasets, a collection of curated public datasets easily accessible to Azure ML services.
Open Images v5, a large collection of annotated images including segmentation masks for 2.8 million objects in 350 categories, has been released by Google.
Microsoft Learn now offers several modules with free training for AI engineers and data scientists.
Six principles behind healthy data-driven organizations, a useful resource on building a data science culture by Francesca Lazzeri.
Applications
How Python is used at Netflix for personalization, machine learning, experimentation, statistical analysis and more.
Google develops a method to infer depth maps for video by using "mannequin challenge" videos as training data.
Samsung researchers develop a method of animating a single photograph of a person as a realistic talking head (video).
Google's "Translatotron", a speech-to-speech translation model that translates speech audio into a second language while retaining the original speaking voice.
AI-based applications help children with disabilities bridge language gaps.
Facebook researchers develop a method to synthesize full-body video of a person performing actions animated in real time under joystick control, based only on real source video.
LaLiga and BMW are using the Azure Bot Framework SDK to deliver specialized personal assistant applications.
Find previous editions of the monthly AI roundup here.