We are excited to announce ML.NET 1.4 Preview and updates to Model Builder and CLI.
ML.NET is an open-source and cross-platform machine learning framework for .NET developers. ML.NET also includes Model Builder (a simple UI tool) and CLI to make it super easy to build custom Machine Learning (ML) models using Automated Machine Learning (AutoML).
Using ML.NET, developers can leverage their existing tools and skillsets to develop and infuse custom ML into their applications by creating custom machine learning models for common scenarios like Sentiment Analysis, Price Prediction, Sales Forecast prediction, Image Classification and more!
Following are some of the key highlights in this update:
ML.NET Updates
ML.NET 1.4 Preview is a backward-compatible release with no breaking changes, so please update to get the latest changes.
In addition to bug fixes described here, in ML.NET 1.4 Preview we have released some exciting new features that are described in the following sections.
Database Loader (Preview)
This feature introduces a native database loader that enables training directly against relational databases. This loader supports any relational database provider supported by System.Data in .NET Core or .NET Framework, meaning that you can use any RDBMS such as SQL Server, Azure SQL Database, Oracle, SQLite, PostgreSQL, MySQL, Progress, IBM DB2, etc.
In previous releases, since ML.NET 1.0, you could already train against a relational database by providing the data through an IEnumerable collection with the LoadFromEnumerable() API, where the data could come from a relational database or any other source. However, with that approach you as a developer are responsible for the code that reads from the relational database (using Entity Framework or any other approach), and that code must be implemented properly so that data is streamed while the ML model trains, as in this previous sample using LoadFromEnumerable().
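As a rough illustration of what "streaming" means in that older approach, here is a minimal sketch of a lazy row reader. It is not taken from the referenced sample: the `SentimentRowStreamer` class, the `StreamRows` method and the `openReader` parameter are hypothetical names, and the column names are assumed to match the `SentimentData` class used later in this post. The reader is opened inside the iterator so that each enumeration runs a fresh query, in case the data is read more than once.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class SentimentData
{
    public string FeedbackText;
    public string Label;
}

// Hypothetical helper, for illustration only.
public static class SentimentRowStreamer
{
    // openReader is invoked once per enumeration, so the returned sequence
    // can be enumerated several times (each pass opens a fresh reader).
    public static IEnumerable<SentimentData> StreamRows(Func<IDataReader> openReader)
    {
        using (IDataReader reader = openReader())
        {
            int textOrdinal = reader.GetOrdinal("FeedbackText");
            int labelOrdinal = reader.GetOrdinal("Label");

            // yield return hands over one row at a time instead of
            // buffering the whole result set in memory.
            while (reader.Read())
            {
                yield return new SentimentData
                {
                    FeedbackText = reader.GetString(textOrdinal),
                    Label = reader.GetString(labelOrdinal)
                };
            }
        }
    }
}

// With the Microsoft.ML package referenced, the sequence would then be handed to ML.NET:
// IDataView trainingDataView =
//     mlContext.Data.LoadFromEnumerable(SentimentRowStreamer.StreamRows(openReader));
```

The `openReader` delegate is where your own data access code goes, for example opening a connection and calling `ExecuteReader()` on a command. Getting that part right is exactly the responsibility that stays with you in this approach.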
The new Database Loader, by contrast, requires much less code: ML.NET itself handles reading from the database and exposing the data through an IDataView, so you only need to specify your database connection string, the SQL statement that selects the dataset columns, and the data class to use when loading the data. It is that simple!
Here’s example code showing how you can now load data directly from a relational database into an IDataView, which is used later on when training your model.
using System.Data.SqlClient;
using Microsoft.ML;
using Microsoft.ML.Data;

// Load data from a database into an IDataView for later model training
string connectionString = @"Data Source=YOUR_SERVER;Initial Catalog=YOUR_DATABASE;Integrated Security=True";
string commandText = "SELECT * from SentimentDataset";

// The data class tells the loader how to map the result columns
DatabaseLoader loader = mlContext.Data.CreateDatabaseLoader<SentimentData>();
DatabaseSource dbSource = new DatabaseSource(SqlClientFactory.Instance, connectionString, commandText);

IDataView trainingDataView = loader.Load(dbSource);

// ML.NET model training code using the training IDataView
//...

public class SentimentData
{
    public string FeedbackText;
    public string Label;
}
This feature is in preview and can be accessed via the Microsoft.ML.Experimental v0.16-Preview NuGet package available here.
For further learning see this complete sample app using the new DatabaseLoader.
Image classification with deep neural networks retraining (Preview)
This new feature enables native DNN transfer learning with ML.NET, targeting image classification as our first high level scenario.
For instance, with this feature you can create your own custom image classifier model by natively training a TensorFlow model from ML.NET API with your own images.
Image classifier scenario – Train your own custom deep learning model with ML.NET
In order to use TensorFlow, ML.NET internally takes a dependency on the TensorFlow.NET library.
The TensorFlow.NET library is an open-source, low-level API library that provides the .NET Standard bindings for TensorFlow. That library is part of the SciSharp stack libraries.
Microsoft (the ML.NET team) is working closely with the TensorFlow.NET library team, not just to provide higher-level APIs for ML.NET users (such as our new ImageClassification API) but also to help improve and evolve the TensorFlow.NET library as an open-source project.
We would like to acknowledge the effort and say thank you to the TensorFlow.NET library team for their agility and great collaboration with us.
The stack diagram below shows how ML.NET implements these new DNN training features. Although we currently only support training TensorFlow models, PyTorch support is on the roadmap.
As the first main scenario for high level APIs, we are currently focusing on image classification. The goal of these new high-level APIs is to provide powerful and easy to use interfaces for DNN training scenarios like image classification, object detection and text classification.
The API code example below shows how easily you can train a new TensorFlow model, which under the covers is based on transfer learning from a selected architecture (pre-trained model) such as Inception v3 or ResNet.
Image classifier high level API code using transfer learning from Inceptionv3 pre-trained model
var pipeline = mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: "LabelAsKey", inputColumnName: "Label")
    .Append(mlContext.Model.ImageClassification("ImagePath", "LabelAsKey",
        arch: ImageClassificationEstimator.Architecture.InceptionV3)); // Can also use ResnetV2101

// Train the model
ITransformer trainedModel = pipeline.Fit(trainDataView);
The important line in the above code is the one using the mlContext.Model.ImageClassification classifier trainer. As you can see, it is a high-level API where you just need to select the base pre-trained model to derive from, in this case Inception v3, but you could also select other pre-trained models such as ResNet v2 101. Inception v3 is a widely used image recognition model trained on the ImageNet dataset. Those pre-trained models or architectures are the culmination of many ideas developed by multiple researchers over the years, and you can easily take advantage of them now.
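The pipeline above expects input rows with an "ImagePath" and a "Label" column. One common way to produce them, sketched below under the assumption that your images are stored in one subfolder per class (the folder name being the label), is a small file-system helper. The `ImageData` class, the `ImageFolderReader` helper and the list of file extensions are illustrative choices, not part of the ML.NET API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical input class whose field names match the column names used in the pipeline.
public class ImageData
{
    public string ImagePath;
    public string Label;
}

// Hypothetical helper, for illustration only.
public static class ImageFolderReader
{
    private static readonly string[] ImageExtensions = { ".jpg", ".jpeg", ".png" };

    // Assumes a layout such as root/cat/a.jpg and root/dog/b.png,
    // where the parent folder name is used as the label.
    public static IEnumerable<ImageData> LoadFromFolder(string rootFolder)
    {
        foreach (string file in Directory.EnumerateFiles(rootFolder, "*", SearchOption.AllDirectories))
        {
            // Skip anything that does not look like an image file.
            string extension = Path.GetExtension(file);
            if (!ImageExtensions.Contains(extension, StringComparer.OrdinalIgnoreCase))
                continue;

            yield return new ImageData
            {
                ImagePath = file,
                Label = Directory.GetParent(file).Name
            };
        }
    }
}
```

The resulting sequence could then be turned into the training IDataView, for example with `mlContext.Data.LoadFromEnumerable(ImageFolderReader.LoadFromFolder(folder))`.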
The DNN Image Classification training API is still in early preview, and we hope to get feedback from you that we can incorporate in upcoming releases.
For further learning see this sample app training a custom TensorFlow model with provided images.
Enhanced for .NET Core 3.0
ML.NET is now building for .NET Core 3.0. This means ML.NET can take advantage of the new features when running in a .NET Core 3.0 application. The first new feature we are using is the new hardware intrinsics feature, which allows .NET code to accelerate math operations by using processor specific instructions.
Of course, you can still run ML.NET on older versions, but when running on .NET Framework, or .NET Core 2.2 and below, ML.NET uses C++ code that is hard-coded to x86-based SSE instructions. SSE instructions allow for four 32-bit floating-point numbers to be processed in a single instruction. Modern x86-based processors also support AVX instructions, which allow for processing eight 32-bit floating-point numbers in one instruction. ML.NET’s C# hardware intrinsics code supports both AVX and SSE instructions and will use the best one available. This means when training on a modern processor, ML.NET will now train faster because it can do more concurrent floating-point operations than it could with the existing C++ code that only supported SSE instructions.
Another advantage the C# hardware intrinsics code brings is that when neither SSE nor AVX is supported by the processor, for example on an ARM chip, ML.NET will fall back to doing the math operations one number at a time. This means more processor architectures are now supported by the core ML.NET components. (Note: There are still some components that don’t work on ARM processors, for example FastTree, LightGBM, and OnnxTransformer. These components are written in C++ code that is not currently compiled for ARM processors.)
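To make the "best available instruction set, with a scalar fallback" idea concrete, here is a minimal sketch of a dot product written with the .NET Core 3.0 hardware intrinsics APIs. It is not ML.NET's actual implementation: the `SimdMath` class and its `Dot` method are hypothetical names, and the code favors readability over peak performance.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper, for illustration only (requires .NET Core 3.0 or later).
public static class SimdMath
{
    public static float Dot(ReadOnlySpan<float> left, ReadOnlySpan<float> right)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Both inputs must have the same length.");

        float total = 0f;
        int processed = 0;

        if (Avx.IsSupported)
        {
            // AVX: eight 32-bit floats per instruction.
            ReadOnlySpan<Vector256<float>> l = MemoryMarshal.Cast<float, Vector256<float>>(left);
            ReadOnlySpan<Vector256<float>> r = MemoryMarshal.Cast<float, Vector256<float>>(right);
            Vector256<float> sum = Vector256<float>.Zero;
            for (int i = 0; i < l.Length; i++)
                sum = Avx.Add(sum, Avx.Multiply(l[i], r[i]));
            for (int lane = 0; lane < Vector256<float>.Count; lane++)
                total += sum.GetElement(lane);
            processed = l.Length * Vector256<float>.Count;
        }
        else if (Sse.IsSupported)
        {
            // SSE: four 32-bit floats per instruction.
            ReadOnlySpan<Vector128<float>> l = MemoryMarshal.Cast<float, Vector128<float>>(left);
            ReadOnlySpan<Vector128<float>> r = MemoryMarshal.Cast<float, Vector128<float>>(right);
            Vector128<float> sum = Vector128<float>.Zero;
            for (int i = 0; i < l.Length; i++)
                sum = Sse.Add(sum, Sse.Multiply(l[i], r[i]));
            for (int lane = 0; lane < Vector128<float>.Count; lane++)
                total += sum.GetElement(lane);
            processed = l.Length * Vector128<float>.Count;
        }

        // Scalar path: handles leftover elements, and every element on
        // processors with neither AVX nor SSE (for example ARM).
        for (int i = processed; i < left.Length; i++)
            total += left[i] * right[i];

        return total;
    }
}
```

The `IsSupported` checks are resolved by the JIT for the processor the code is running on, so the same compiled assembly takes the AVX, SSE, or scalar path depending on the hardware.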
For more information on how ML.NET uses the new hardware intrinsics APIs in .NET Core 3.0, please check out Brian Lui’s blog post Using .NET Hardware Intrinsics API to accelerate machine learning scenarios.
Model Builder in VS and CLI updated to latest GA version
The Model Builder tool in Visual Studio and the ML.NET CLI (both in preview) have been updated to use the latest ML.NET GA version (1.3) and address lots of customer feedback. Learn more about the changes here.
Model Builder updated to latest ML.NET GA version
Model Builder uses the latest GA version of ML.NET (1.3) and therefore the generated C# code also references ML.NET 1.3.
Improved support for other OS cultures
This addresses many frequently reported issues where developers want to use their own local culture OS settings to train a model in Model Builder. Please read this issue for more details.
Customer feedback addressed for Model Builder
There were many issues fixed in this release. Learn more in the release notes.
New sample apps
Coinciding with this new release, we’re also announcing new interesting sample apps covering additional scenarios:
New ML.NET video playlist at YouTube
We have created an ML.NET YouTube playlist on the .NET Foundation channel with a selection of videos, each focusing on a single ML.NET feature, which makes it great for learning.
Access the ML.NET YouTube playlist here.
Try ML.NET and Model Builder today!
- Get started with ML.NET here.
- Get started with Model Builder here.
- Refer to documentation for tutorials and more resources.
- Learn from sample apps for different scenarios using ML.NET.
Summary
We are excited to release these updates for you and we look forward to seeing what you will build with ML.NET. If you have any questions or feedback, you can ask them here for ML.NET and Model Builder.
Happy coding!
The ML.NET team.
This blog post was authored by Cesar de la Torre and Eric Erhardt, with additional contributions from the ML.NET team.
Acknowledgements
- As mentioned above, we would like to acknowledge the effort and say thank you to the TensorFlow.NET library team for their agility and great collaboration with us. Special kudos to Haiping (Oceania2018).
- Special thanks to Jon Wood (@JWood) for his many great YouTube videos on ML.NET, which we also link to from the ML.NET YouTube playlist mentioned in this blog post. Also, thanks for being an early adopter and tester of the new DatabaseLoader.
The post Announcing ML.NET 1.4 Preview and Model Builder updates (Machine Learning for .NET) appeared first on .NET Blog.