
How do you use System.Drawing in .NET Core?


I've been doing .NET image processing since the beginning. In fact I wrote about it over 13 years ago on this blog when I talked about Compositing two images into one from the ASP.NET Server Side, and in it I used System.Drawing to do the work. For over a decade, folks using System.Drawing were just using it as a thin wrapper over GDI (Graphics Device Interface), a set of very old Win32 (Windows) unmanaged drawing APIs. We use them because they work fine.

.NET Conf: Join us this week! September 12-14, 2018 for .NET Conf! It's a FREE, 3 day virtual developer event co-organized by the .NET Community and Microsoft. Watch all the sessions here. Join a virtual attendee party after the last session ends on Day 1 where you can win prizes! Check out the schedule here and attend a local event in your area organized by .NET community influencers all over the world.

For a while there was a package called CoreCompat.System.Drawing that was a .NET Core port of a Mono version of System.Drawing.

However, since then Microsoft has released System.Drawing.Common to provide access to GDI+ graphics functionality cross-platform.

There is a lot of existing code - mine included - that makes assumptions that .NET would only ever run on Windows. Using System.Drawing was one of those things. The "Windows Compatibility Pack" is a package meant for developers that need to port existing .NET Framework code to .NET Core. Some of the APIs remain Windows only but others will allow you to take existing code and make it cross-platform with a minimum of trouble.

Here's a super simple app that resizes a PNG to 128x128. However, it's a .NET Core app and it runs on both Windows and Linux (Ubuntu!):

using System;
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;
using System.IO;

namespace imageresize
{
    class Program
    {
        static void Main(string[] args)
        {
            int width = 128;
            int height = 128;
            var file = args[0];
            Console.WriteLine($"Loading {file}");
            using (FileStream pngStream = new FileStream(args[0], FileMode.Open, FileAccess.Read))
            using (var image = new Bitmap(pngStream))
            {
                var resized = new Bitmap(width, height);
                using (var graphics = Graphics.FromImage(resized))
                {
                    graphics.CompositingQuality = CompositingQuality.HighSpeed;
                    graphics.InterpolationMode = InterpolationMode.HighQualityBicubic;
                    graphics.CompositingMode = CompositingMode.SourceCopy;
                    graphics.DrawImage(image, 0, 0, width, height);
                    resized.Save($"resized-{file}", ImageFormat.Png);
                    Console.WriteLine($"Saving resized-{file} thumbnail");
                }
            }
        }
    }
}

Here it is running on Ubuntu:

Resizing Images on Ubuntu

NOTE: on Ubuntu (and other Linux distributions) you may need to install some native dependencies, as System.Drawing sits on top of native libraries:

sudo apt install libc6-dev 

sudo apt install libgdiplus

There are lots of great options for image processing on .NET Core now! It's important to understand that this System.Drawing layer is great for existing System.Drawing code, but you probably shouldn't write NEW image management code with it. Instead, consider one of the other great open source options.

  • ImageSharp - A cross-platform library for the processing of image files; written in C#
    • Compared to System.Drawing ImageSharp has been able to develop something much more flexible, easier to code against, and much, much less prone to memory leaks. Gone are system-wide process-locks; ImageSharp images are thread-safe and fully supported in web environments.

Here's how you'd resize something with ImageSharp:

using (Image<Rgba32> image = Image.Load("foo.jpg"))
{
    image.Mutate(x => x
        .Resize(image.Width / 2, image.Height / 2)
        .Grayscale());
    image.Save("bar.jpg"); // Automatic encoder selected based on extension.
}
  • Magick.NET - A .NET library on top of ImageMagick
  • SkiaSharp - A .NET wrapper on top of Google's cross-platform Skia library
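
For comparison, here's a rough sketch (not from the post) of the same 128x128 resize done with SkiaSharp. It assumes the SkiaSharp NuGet package, and the exact resize overloads vary a bit between SkiaSharp versions, so treat it as illustrative rather than canonical:

using System.IO;
using SkiaSharp;

class SkiaResize
{
    static void Main(string[] args)
    {
        var file = args[0];
        using (var input = File.OpenRead(file))
        using (var original = SKBitmap.Decode(input))
        using (var resized = original.Resize(new SKImageInfo(128, 128), SKFilterQuality.High))
        using (var image = SKImage.FromBitmap(resized))
        using (var output = File.OpenWrite($"resized-{file}"))
        {
            // Re-encode the resized bitmap as a PNG and write it out.
            image.Encode(SKEncodedImageFormat.Png, 100).SaveTo(output);
        }
    }
}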

It's awesome that there are so many choices with .NET Core now!


Sponsor: Rider 2018.2 is here! Publishing to IIS, Docker support in the debugger, built-in spell checking, MacBook Touch Bar support, full C# 7.3 support, advanced Unity support, and more.



© 2018 Scott Hanselman. All rights reserved.

Use AI to streamline healthcare operations


The profound impact of machine learning (ML) and artificial intelligence (AI) is changing the way health organizations think about many of the challenges they face. Making data-informed decisions based on actionable insights is improving many aspects of healthcare, from patient diagnosis and outcomes to operational efficiencies.

Data-informed decision making

While making decisions with deep insight into relevant data, healthcare organizations must be especially mindful of how they implement such solutions. Regulations like HIPAA and frameworks like HITRUST require that data be kept secure, private, and anonymized from anyone who doesn't require access to patient data.

Further, IT staff are often unprepared or understaffed to implement such solutions. This is why the Azure Healthcare AI Blueprint was created: to bootstrap AI solutions for healthcare organizations using Microsoft Azure Platform as a Service (PaaS). After installing the blueprint, organizations can learn from the reference implementation and better understand the components of a complete solution built with Azure services.

The Azure healthcare AI blueprint

The blueprint is installed to Azure via PowerShell scripts, and creates a complete environment that can run Azure Machine Learning Studio (MLS) experiments right away. In fact, there is a simple patient length of stay (LOS) experiment built right in.

Other demo scripts are also available if you want to delve further. There is a script to admit 10 new patients and one to discharge two patients. Running these scripts is covered in the blueprint documentation.

Operational process flow

The patient's data is entered into the patient registration system, which invokes an Azure function with the data in the Fast Healthcare Interoperability Resources (FHIR) format. The Admit Patient Azure function stores the data to an Azure SQL database and to MLS, where it is used to predict the patient's LOS. The newly admitted patient data submitted by the care line manager is immediately available via a Power BI dashboard and is also processed by the LOS machine learning experiment. When patients are admitted or discharged from the facility, another Azure function writes the data to the Azure SQL database.
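
To make that flow concrete, here is a minimal, hypothetical sketch of what an HTTP-triggered "Admit Patient" Azure Function could look like in C#. It is not the blueprint's actual implementation: the scoring endpoint URL and names are placeholders, and persistence to Azure SQL is omitted.

using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class AdmitPatient
{
    private static readonly HttpClient http = new HttpClient();

    [FunctionName("AdmitPatient")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
        ILogger log)
    {
        // The registration system posts the patient record as FHIR-formatted JSON.
        string fhirPatient = await new StreamReader(req.Body).ReadToEndAsync();
        log.LogInformation("Received admission record.");

        // 1. Persist the record, e.g. to Azure SQL Database (omitted in this sketch).
        // 2. Forward it to the published ML Studio web service to score length of stay.
        //    The URL below is a placeholder for the experiment's web service endpoint.
        var scoringResponse = await http.PostAsync(
            "https://example-region.services.azureml.net/score",
            new StringContent(fhirPatient));

        return new OkObjectResult(await scoringResponse.Content.ReadAsStringAsync());
    }
}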


There are many other services in the overall solution such as monitoring and security. But MLS is where the magic of the LOS experiment happens, driving operational decisions such as staffing and projected bed counts.

Wrapping up

There are many components and services in the blueprint, and we’ve examined just a few that are used for a single LOS experiment. More sophisticated algorithms and models are used for more complex scenarios.

Recommended next steps

  1. Read about the Azure Healthcare AI Blueprint to see if it’s a good fit for your organization. This could serve as the catalyst for your organization’s change to data-informed decision making.
  2. Download the scripts, install instructions, and other artifacts in the blueprint on GitHub.

How Security Center and Log Analytics can be used for Threat Hunting


Organizations today are constantly under attack. Azure Security Center (ASC) uses advanced analytics and global threat intelligence to detect malicious threats, and the new capabilities that our product team is adding everyday empower our customers to respond quickly to these threats.

However, just having great tools that alert about threats and attacks is not enough. The reality is that no security tool can detect 100 percent of attacks. In addition, many of the tools that raise alerts are optimized for low false positive rates. Hence, they might miss some suspicious outlier activity in your environment which could have been flagged and investigated. This is something that Security Center and the Azure Log Analytics team understand. The product has built-in features that you can use to launch your investigations and hunting campaigns in addition to responding to the alerts it triggers.

In the real world, if you need to do threat hunting, there are several factors to consider. You not only need a good analyst team, you need an even larger team of service engineers and administrators to worry about deploying agents to collect investigation-related data, parsing it into a format where queries can be run, building tools that help query this data, and lastly indexing the data so that your queries run faster and actually return results. ASC and Log Analytics take care of all of this and make hunting for threats much easier. What organizations need is a change in mindset: instead of being just alert driven, they should also incorporate active threat hunting into their overall security program.

What is Threat Hunting?

Loosely defined, it is the process of proactively and iteratively searching through your varied log data with the goal of detecting threats that evade existing security solutions. If you think about it, threat hunting is a mindset: instead of just reacting to alerts, you are proactive about securing your organization's environment and are looking for signs of malicious activity within your enterprise, without prior knowledge of those signs. Threat hunting involves hypothesizing about attackers' behavior, researching these hypotheses and techniques to determine the artifacts that would be left in the logs, checking whether your organization collects and stores those logs, and then verifying the hypotheses against your environment's logs.

Hunting teaches you how to find data, how to distinguish between normal activity and an outlier, gives a better picture of your network and shows you your detection gaps. Security analysts who do regular hunting are better trained to respond and triage during an actual incident.

Today, we are going to look at some examples of simple hunts that an analyst can start with. In our previous posts, we have already touched on this a little. You can read more about detecting malicious activity and finding hidden techniques commonly deployed by attackers, and how Azure Security Center helps analyze attacks using Investigation and Log Search.

Scenario 1

A lot of security tools look for abnormally large data transfers to an external destination. To evade these security tools and to reduce the amount of data sent over the network, attackers often compress the collected data prior to exfil. The popular tools of choice for compression are typically 7zip/Winzip/WinRar etc. Attackers have also been known to use their own custom programs for compressing data.

For example, when using WinRAR to compress data, a few of the switches that seem to be most commonly used are "a -m5 -hp". While the "a" switch specifies adding the file to an archive, the "-m5" switch specifies the level of compression, where "5" is the maximum compression level. The "-hp" switch is used to encrypt content and header information. With knowledge of these command line switches, we may detect some of this activity.

Below is a simple query to run this logic where we are looking for these command line switches. In this example, if we look at the result we can see that all the command line data looks benign except where Svchost.exe is using the command line switches associated with WinRAR. In addition, the binary Svchost.exe is running from a temp directory when ideally it should be running from the %windir%\System32 folder. Threat actors have been known to rename their tools to a well-known process name to hide in plain sight. This is considered suspicious and a good starting point for an investigation. An analyst can take one of many approaches from here to uncover the entire attack sequence. They can pivot into logon data to find what happened in this logon session or focus on what user account was used. They can also investigate what IP addresses may have connected to this machine or what IP address this machine connected to during the given time frame.

SecurityEvent
| where TimeGenerated >= ago(2d)
| search CommandLine : " -m5 " and CommandLine : " a "
| project NewProcessName , CommandLine


Another good example like this could be the use of popular Nirsoft tools like the mail password viewer or IE password viewer being used maliciously by attackers to gather passwords from email clients as well as passwords stored in a browser. Knowing the command line for these tools, one may find interesting log entries by searching for command line parameters such as /stext or /scomma, which allows discovery of potentially malicious activity without needing to know the process name. For example, seeing a command line like "notepad.exe /stext output.txt" is a good indication that notepad might be a renamed Nirsoft tool and likely malicious activity.

Scenario 2

Building on the earlier example where we saw Rar.exe renamed to svchost.exe: malware writers often use Windows system process names for their malicious processes to make them blend in with other legitimate commands that the Windows system executes. If an analyst is familiar with the well-known Windows processes, they can easily spot the bad ones. For example, Svchost.exe is a system process that hosts many Windows services and is generally the most abused by attackers. For the svchost.exe process, it is common knowledge that:

  • It runs from %windir%\System32 or %windir%\SysWOW64.
  • It runs under the NT AUTHORITY\SYSTEM, LOCAL SERVICE, or NETWORK SERVICE accounts.

Based on this knowledge, an analyst can create a simple query looking for a process named Svchost.exe. It is recommended to filter out well-known security identifiers (SIDs) that are used to launch the legitimate svchost.exe process. The query also filters out the legitimate locations from which svchost.exe is launched.

SecurityEvent
| where TimeGenerated >= ago(2d)
| where ProcessName contains "svchost.exe"
| where SubjectUserSid != "S-1-5-18"
| where SubjectUserSid != "S-1-5-19"
| where SubjectUserSid != "S-1-5-20"
| where NewProcessName !contains @"C:\Windows\System32"
| where NewProcessName !contains @"C:\Windows\Syswow64"


Additionally, from the returned results we can also check whether svchost.exe is a child of services.exe, or whether it was launched with a command line that has the -k switch (e.g. svchost.exe -k defragsvc). Filtering using these conditions will often give interesting results that an analyst can dig into further to determine whether this is normal activity or part of a compromise or attack.

There is nothing new or novel here. Security analysts know about it. In fact, a lot of security tools including ASC will detect this. However, the goal here is a change in mindset of not only responding to alerts but proactively looking for anomalies and outliers in your environment.

Scenario 3

After initial compromise, whether through a brute force attack, spear phishing, or other methods, attackers often move to the next step, which can loosely be called the network propagation stage. The goal of the network propagation phase is to identify and move to desired systems within the target environment with the intention of discovering credentials and sensitive data. Sometimes as part of this, one might see one account being used to log in on an unusually high number of machines in the environment, or a lot of different account authentication requests coming from one machine. For the second case, where we want to find machines that have been used to authenticate more accounts than our desired threshold, we could write a query like the one below:

SecurityEvent
    | where EventID == 4624
    | where AccountType == "User"
    | where TimeGenerated >= ago(1d)
    | summarize IndividualAccounts = dcount(Account) by Computer
    | where IndividualAccounts > 4

If we also wanted to see what alerts fired on these machines, we could extend the above query and join it with the SecurityAlert table.

SecurityEvent
    | where EventID == 4624
    | where AccountType == "User"
    | where TimeGenerated >= ago(1d)
    | extend Computer = toupper(Computer)
    | summarize IndividualAccounts = dcount(Account) by Computer
    | where IndividualAccounts > 4
    | join (SecurityAlert
        | extend ExtProps = parsejson(ExtendedProperties)
        | extend Computer = toupper(tostring(ExtProps["Compromised Host"]))
    ) on Computer

These are just a few examples. The possibilities are endless. With practice, a good analyst knows when to dig deeper and when to move on to the next item on their hunting journey. Nothing in the world of cyber gets better unless victims start defending themselves more holistically. Enabling our customers on this journey and providing them with the tools to protect themselves when they move to Azure is what drives us every day.

Happy Hunting!

Real-time data analytics and Azure Data Lake Storage Gen2


It’s been a little more than two months since we launched Azure Data Lake Storage Gen2, and we’re thrilled and overwhelmed by the response we’ve received from customers and partners alike. We built Azure Data Lake Storage to deliver a no-compromises data lake, and the high level of customer engagement in Gen2’s public preview confirms our approach. We have heard from customers both large and small, across a broad range of markets and industries, that Gen2’s ability to provide object storage scale and cost effectiveness with a world-class data lake experience is exceeding their expectations, and we couldn’t be happier to hear it!

Partner enablement in Gen2

In fact, we are actively partnering with leading ISVs across the big data spectrum - platform providers, data movement and ETL, governance and data lifecycle management (DLM), analysis, presentation, and beyond - to ensure seamless integration between Gen2 and their solutions.

Over the next few months you will hear more about the exciting work these partners are doing with ADLS Gen2. We’ll do blog posts, events, and webinars that highlight these industry-leading solutions.

In fact, I am happy to announce our first joint Gen2 engineering-ISV webinar with Attunity on September 18th, Real-time Big Data Analytics in the Cloud 101: Expert Advice from the Attunity and Azure Data Lake Storage Gen2 Teams.

Real-time analytics and ADLS Gen2

Azure Data Lake Storage Gen2 is at the core of Azure Analytics workflows. One of the workflows that has generated significant interest is real-time analytics. With the explosive growth of data generated from sensors, social media, and business apps, many organizations are looking for ways to drive real-time insights and orchestrate immediate action using cloud analytic services. As the diagram below indicates, multiple Azure services exist to provide an end-to-end solution for driving real-time analytics workflows.

[Diagram: Azure services in an end-to-end real-time analytics workflow]

Many of our customers are just getting started with building their real-time analytics cloud architectures. We’re excited to partner with Attunity to help customers learn more about real-time analytics, data lakes and how you can quickly move from evaluation to execution.

Attunity is a recognized leader in data integration and real-time data capture with deep skills in real-time data management. They are also a Microsoft Gold partner. In our webinar, we’ll explore the following questions:

  • Why is real-time data important for driving business insights?
  • What’s a data lake and why would you use one to store your real-time data?
  • How can you use change data capture (CDC) technology to efficiently transfer data to the cloud?
  • How can you build sophisticated analytic workflows quickly?
  • Why is Azure Data Lake Storage Gen2 the best data lake for real-time analytics?

Next steps

We work with a number of amazing partners. We’re excited about the prospect of showcasing Attunity solutions and helping customers get to insights faster in their big data analytics workflows using Azure and partner solutions.

We hope you join us on September 18th for our webinar. To register for this webinar, please visit the event signup page. To sign up for the Azure Data Lake Storage Gen2 preview please visit our product page.

Azure preparedness for Hurricane Florence


As Hurricane Florence continues its journey to the mainland, our thoughts are with those in its path. Please stay safe. We’re actively monitoring Azure infrastructure in the region. We at Microsoft have taken all precautions to protect our customers and our people.

Our datacenters (US East, US East 2, and US Gov Virginia) have been reviewed internally and externally to ensure that we are prepared for this weather event. Our onsite teams are prepared to switch to generators if utility power is unavailable or unreliable. All our emergency operating procedures have been reviewed by our team members across the datacenters, and we are ensuring that our personnel have all necessary supplies throughout the event.

As a best practice, all customers should consider their disaster recovery plans and all mission-critical applications should be taking advantage of geo-replication.

Rest assured that Microsoft is focused on the readiness and safety of our teams, as well as our customers’ business interests that rely on our datacenters. 

You can reach our handle @AzureSupport on Twitter; we are online 24/7. Any business impact to customers will be communicated through Azure Service Health in the Azure portal.

If there is any change to the situation, we will keep customers informed of Microsoft’s actions through this announcement.

For guidance on Disaster Recovery best practices see references below: 

Five habits of highly effective Azure users


There’s a lot you can do with Azure. But whether you’re modernizing your IT environment, building next-generation apps, harnessing the power of artificial intelligence, or deploying any of a million other solutions, there are foundational habits that can help you succeed.

On the Azure team, we spend a lot of time listening to our customers and understanding what makes them successful. We’ve started to compile a handful of routine activities that can help you get the most out of Azure.

Stay on top of proven practice recommendations

Highly effective Azure users understand that optimization is never really finished. To stay on top of the latest proven practices, they regularly review Azure Advisor.

Advisor is a free tool that analyzes your Azure resource configurations and usage and offers recommendations to optimize your workloads for high availability, security, performance, and cost. Examples of popular Advisor recommendations include rightsizing or shutting down underutilized Virtual Machines (VMs) and configuring backup for your VMs.


You can think of Advisor as your single pane of glass for recommendations across Azure. Advisor integrates with companion tools like Azure Security Center and Azure Cost Management, which makes it easier for you to discover and act on all your optimizations. We’re constantly adding more recommendations, and you’re constantly doing new things with Azure, so make it a habit to check Advisor regularly to stay on top of Azure proven practices.

Review your personalized Advisor proven practice recommendations.

Check out Azure Advisor documentation to help you get started.

Stay in control of your resources on the go

We live in an always-on world, and highly effective Azure users recognize that they need to stay in control of their Azure resources anytime, anywhere, not just from their desktop. They use the Azure mobile app to monitor and manage their Azure resources on the go.


With the Azure mobile app, you can check for alerts, review metrics, and take corrective actions to fix common issues, right from your Android or iOS phone or tablet. You can also tap the Cloud Shell button and run your choice of shell environment (PowerShell or Bash) for full access to your Azure services in a command-line experience.

Download the Azure mobile app to stay in control on the go.

Explore the Azure mobile app in this video demo.

Stay informed during issues and maintenance

Highly effective Azure users recognize how important it is to stay informed about Azure maintenance and service issues. They turn to Azure Service Health, a free service that provides personalized alerts and guidance for everything from planned maintenance and service changes to health advisories like service transitions. Service Health can notify you, help you understand the impact to your resources, and keep you updated as the issue is resolved.


Highly effective Azure users configure Service Health to inform their teams of service issues by setting up alerts for the subscriptions, services, and regions that are relevant to them. For example, they might create an alert to:

  • Send an email to a dev team when a resource in a dev/test subscription is impacted.
  • Update ServiceNow or PagerDuty via webhook to alert your on-duty operations team when a resource in production is impacted.
  • Send an SMS to a regional IT operations manager when resources in a given region are impacted.

Stay informed during issues and maintenance by setting up your Azure Service Health alerts.

For more information, read documentation on creating Service Health alerts.

Stay up-to-date with the latest announcements

The pace of change in the cloud is rapid. Highly effective Azure users understand that they need all the help they can get to stay up-to-date with the latest releases, announcements, and innovations.


The Azure updates page on Azure.com is the central place to get all your updates about Azure, from pricing changes to SDK updates. Highly effective Azure users go beyond simply bookmarking the Azure updates page — they subscribe to products and features that are relevant to them, so they’ll receive proactive notifications whenever there’s an announcement or change.

Subscribe for your Azure updates.

Explore product availability by region to find specific release dates for your preferred regions.

Stay engaged with your peers – share and learn

This list of habits of highly effective Azure users is by no means exhaustive. There are many more, which is why this fifth habit is so important. Time and again, we see highly effective Azure users staying engaged with their peers to share good habits they’ve discovered and learn new ones from the community.

There are as many ways to stay engaged with your peers as there are good habits to share and learn. Here are a few ideas to get started:

  • Get involved with the Azure Community.
  • Create or join a practice-sharing community within your own organization.
  • Attend industry conferences and trade groups.
  • Participate in Meet Ups in your area.
  • Reach out to your peers and Azure on Twitter.

Join us at Microsoft Ignite in-person and online

Finally, another great way to share and learn strong Azure habits is to join us at Microsoft Ignite in Orlando, Florida from September 24 to September 28. You’ll gain new skills, meet other experts, and discover the latest technology.

Look for the Five Habits of Highly Effective Azure Users booth on the Ignite Expo floor to get started fast with these habits and become an even more effective Azure user. It’s also never too soon to start planning which Ignite sessions you’d like to attend, either in-person or streaming live and on-demand.

Regardless of how you’re using Azure, start putting these five habits of highly effective Azure users into practice today.

Kickstart your artificial intelligence/machine learning journey with the Healthcare Blueprint


Azure blueprints are far more than models drawn on paper or solution descriptions in a document. They are packages of scripts, data, and other artifacts needed to install and exercise a reference implementation solution on Azure. The Azure Security and Compliance Blueprint - HIPAA/HITRUST Health Data and AI is one such blueprint targeting a specific scenario common in healthcare.


The healthcare blueprint

The healthcare blueprint includes a real healthcare scenario and an associated machine learning experiment for predicting patient length of stay (LOS). This use case is valuable to healthcare organizations because it forecasts bed counts, operational needs, and staffing requirements. This adds up to considerable savings for an organization using a LOS machine learning experiment.

Blueprint solution guide

A blueprint, like the one for AI in healthcare, consists of multiple components along with documentation. That said, there may be some areas that lack clarity and cause trouble in using the blueprint services after installation. To help with any pain points in the installation and usage of the Healthcare AI blueprint, we’ve developed a solution guidance document, Implementing the Azure blueprint for AI.

The article introduces the blueprint and walks through tips for installation and running the AI/ML experiments. For those just getting started with this blueprint, the document gives some insight into the solution. It also provides guidance that those unfamiliar with Azure will find helpful.

Next steps


How to extract building footprints from satellite images using deep learning


As part of the AI for Earth team, I work with our partners and other researchers inside Microsoft to develop new ways to use machine learning and other AI approaches to solve global environmental challenges. In this post, we highlight a sample project of using Azure infrastructure for training a deep learning model to gain insight from geospatial data. Such tools will finally enable us to accurately monitor and measure the impact of our solutions to problems such as deforestation and human-wildlife conflict, helping us to invest in the most effective conservation efforts.


Applying machine learning to geospatial data

When we looked at the most widely-used tools and datasets in the environmental space, remote sensing data in the form of satellite images jumped out.

Today, subject matter experts working on geospatial data go through such collections manually with the assistance of traditional software, performing tasks such as locating, counting and outlining objects of interest to obtain measurements and trends. As high-resolution satellite images become readily available on a weekly or daily basis, it becomes essential to engage AI in this effort so that we can take advantage of the data to make more informed decisions.

Geospatial data and computer vision, an active field in AI, are natural partners: there are tasks involving visual data that cannot be automated by traditional algorithms, an abundance of labeled data, and even more unlabeled data waiting to be understood in a timely manner. The geospatial data and machine learning communities have joined efforts on this front, publishing several datasets, such as Functional Map of the World (fMoW) and the xView Dataset, for people to create computer vision solutions on overhead imagery.

An example of infusing geospatial data and AI into applications that we use every day is using satellite images to add street map annotations of buildings. In June 2018, our colleagues at Bing announced the release of 124 million building footprints in the United States in support of the Open Street Map project, an open data initiative that powers many location based services and applications. The Bing team was able to create so many building footprints from satellite images by training and applying a deep neural network model that classifies each pixel as building or non-building. Now you can do exactly that on your own!

With the sample project that accompanies this blog post, we walk you through how to train such a model on an Azure Deep Learning Virtual Machine (DLVM). We use labeled data made available by the SpaceNet initiative to demonstrate how you can extract information from visual environmental data using deep learning. For those eager to get started, you can head over to our repo on GitHub to read about the dataset, storage options and instructions on running the code or modifying it for your own dataset.

Semantic segmentation

In computer vision, the task of masking out pixels belonging to different classes of objects such as background or people is referred to as semantic segmentation. The semantic segmentation model (a U-Net implemented in PyTorch, different from what the Bing team used) we are training can be used for other tasks in analyzing satellite, aerial or drone imagery – you can use the same method to extract roads from satellite imagery, infer land use and monitor sustainable farming practices, as well as for applications in a wide range of domains such as locating lungs in CT scans for lung disease prediction and evaluating a street scene.

Illustration from slides by Tingwu Wang, University of Toronto (source).

Satellite imagery data

The data from SpaceNet is 3-channel, high-resolution (31 cm) satellite imagery over four cities where buildings are abundant: Paris, Shanghai, Khartoum and Vegas. In the sample code we make use of the Vegas subset, consisting of 3854 images of 650 x 650 pixels. About 17.37 percent of the training images contain no buildings. Since this is a reasonably small percentage of the data, we did not exclude or resample images. In addition, 76.9 percent of all pixels in the training data are background, 15.8 percent are interior of buildings and 7.3 percent are border pixels.

Original images are cropped into nine smaller chips with some overlap using utility functions provided by SpaceNet (details in our repo). The labels are released as polygon shapes defined using well-known text (WKT), a markup language for representing vector geometry objects on maps. These are transformed to 2D labels of the same dimension as the input images, where each pixel is labeled as one of background, boundary of building or interior of building.


Some chips are partially or completely empty like the examples below, which is an artifact of the original satellite images and the model should be robust enough to not propose building footprints on empty regions.

[Figure: example chips that are partially or completely empty]

Training and applying the model

The sample code contains a walkthrough of carrying out the training and evaluation pipeline on a DLVM. The following segmentation results are produced by the model at various epochs during training for the input image and label pair shown above. This image features buildings with roofs of different colors, roads, pavements, trees and yards. We observe that initially the network learns to identify edges of building blocks and buildings with red roofs (different from the color of roads), followed by buildings of all roof colors after epoch 5. After epoch 7, the network has learnt that building pixels are enclosed by border pixels, separating them from road pixels. After epoch 10, smaller, noisy clusters of building pixels begin to disappear as the shape of buildings becomes more defined.

[Figure: segmentation results at successive training epochs for the input image above]

A final step is to produce the polygons by assigning all pixels predicted to be building boundary as background to isolate blobs of building pixels. Blobs of connected building pixels are then described in polygon format, subject to a minimum polygon area threshold, a parameter you can tune to reduce false positive proposals.

Training and model parameters

There are a number of parameters for the training process, the model architecture and the polygonization step that you can tune. We chose a learning rate of 0.0005 for the Adam optimizer (default settings for other parameters) and a batch size of 10 chips, which worked reasonably well.

Another parameter, unrelated to the CNN part of the procedure, is the minimum polygon area threshold below which blobs of building pixels are discarded. Increasing this threshold from 0 to 300 square pixels causes the false positive count to decrease rapidly as noisy false segments are excluded. The optimum threshold is about 200 square pixels.

The weights for the three classes (background, boundary of building, interior of building) in computing the total loss during training are another parameter to experiment with. It was found that giving more weight to the interior of buildings helps the model detect significantly more small buildings (see the figure below).

[Figure: histograms of building polygons by area, with true positive detections, for loss weights 1:1:1 (top) and 1:8:1 (bottom)]

Each plot in the figure is a histogram of building polygons in the validation set by area, from 300 square pixels to 6000. The count of true positive detections in orange is based on the area of the ground truth polygon to which the proposed polygon was matched. The top histogram is for weights in ratio 1:1:1 in the loss function for background : building interior : building boundary; the bottom histogram is for weights in ratio 1:8:1. We can see that towards the left of the histogram where small buildings are represented, the bars for true positive proposals in orange are much taller in the bottom plot.

Last thoughts

Building footprint information generated this way could be used to document the spatial distribution of settlements, allowing researchers to quantify trends in urbanization and perhaps the developmental impact of climate change such as climate migration. The techniques here can be applied in many different situations and we hope this concrete example serves as a guide to tackling your specific problem.

Another piece of good news for those dealing with geospatial data is that Azure already offers a Geo Artificial Intelligence Data Science Virtual Machine (Geo-DSVM), equipped with ESRI’s ArcGIS Pro Geographic Information System. We also created a tutorial on how to use the Geo-DSVM for training deep learning models and integrating them with ArcGIS Pro to help you get started.

Finally, if your organization is working on solutions to address environmental challenges using data and machine learning, we encourage you to apply for an AI for Earth grant so that you can be better supported in leveraging Azure resources and become a part of this purposeful community.

Acknowledgement

I would like to thank Victor Liang, Software Engineer at Microsoft, who worked on the original version of this project with me as part of the coursework for Stanford’s CS231n in Spring 2018, and Wee Hyong Tok, Principal Data Scientist Manager at Microsoft, for his help in drafting this blog post.

ASP.NET Core 2.2.0-preview2 now available


Today we’re very happy to announce that the second preview of the next minor release of ASP.NET Core and .NET Core is now available for you to try out. We’ve been working hard on this release over the past months, along with many folks from the community, and it’s now ready for a wider audience to try it out and provide the feedback that will continue to shape the release.

How do I get it?

You can download the new .NET Core SDK for 2.2.0-preview2 (which includes ASP.NET 2.2.0-preview2) from https://www.microsoft.com/net/download/dotnet-core/sdk-2.2.0-preview2

Visual Studio requirements

Customers using Visual Studio should also install and use the Preview channel of Visual Studio 2017 (15.9 Preview 2) in addition to the SDK when working with .NET Core 2.2 and ASP.NET Core 2.2 projects. Please note that the Visual Studio preview channel can be installed side-by-side with an existing Visual Studio installation without disrupting your current development environment.

Azure App Service Requirements

If you are hosting your application on Azure App Service, you can follow these instructions to install the required site extension for hosting your 2.2.0-preview2 applications.

Impact to machines

Please note that this is a preview release and there are likely to be known issues and as-yet-to-be-discovered bugs. While the .NET Core SDK and runtime installs are side-by-side, your default SDK will become the latest one. If you run into issues working on existing projects using earlier versions of .NET Core after installing the preview SDK, you can force specific projects to use an earlier installed version of the SDK using a global.json file as documented here. Please log an issue if you run into such cases, as SDK releases are intended to be backwards compatible.

What’s new in Preview 2

For a full list of changes, bug fixes, and known issues you can read the release notes.

SignalR Java Client updated to support Azure SignalR Service

The SignalR Java Client, first introduced in preview 1, now has support for the Azure SignalR Service. You can now develop Java and Android applications that connect to a SignalR server using the Azure SignalR Service. To get this new functionality, just update your Maven or Gradle file to reference version 0.1.0-preview2-35174 of the SignalR Client package.

Problem Details support

In 2.1.0, MVC introduced ProblemDetails, based on the RFC 7807 specification for carrying details of an error with an HTTP response. In preview2, we’re standardizing around using ProblemDetails for client error codes in controllers attributed with ApiControllerAttribute. An IActionResult returning a client error status code (4xx) will now return a ProblemDetails body. The result additionally includes a correlation ID that can be used to correlate the error using request logs. Lastly, ProducesResponseType for client errors defaults to using ProblemDetails as the response type. This will be documented in Open API / Swagger output generated using NSwag or Swashbuckle.AspNetCore. Documentation for configuring the ProblemDetails response can be found at https://aka.ms/AA2k4zg.
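
As a rough illustration (the controller and route below are made up, not taken from the release notes), this is the kind of controller where the new behavior kicks in:

using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/[controller]")]
public class ValuesController : ControllerBase
{
    [HttpGet("{id}")]
    [ProducesResponseType(200)]
    [ProducesResponseType(404)] // Surfaces in Swagger as a ProblemDetails response
    public ActionResult<string> Get(int id)
    {
        if (id <= 0)
        {
            // With [ApiController], this 404 now carries a ProblemDetails body,
            // including a correlation ID you can match against request logs.
            return NotFound();
        }

        return $"value-{id}";
    }
}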

ASP.NET Core Module Improvements

We’ve introduced a new module (aspNetCoreModuleV2) for hosting ASP.NET Core application in IIS in 2.2.0-preview1. This new module adds the ability to host your .NET Core application within the IIS worker process and avoids the additional cost of reverse-proxying your requests over to a separate dotnet process.

ASP.NET Core 2.2.0-preview2 or newer projects default to the new in-process hosting model. If you are upgrading from preview1, you will need to add a new project property to your .csproj file.

<PropertyGroup>
  <TargetFramework>netcoreapp2.2</TargetFramework>
  <AspNetCoreHostingModel>inprocess</AspNetCoreHostingModel>
</PropertyGroup>

Visual Studio 15.9-preview2 adds the ability to switch your hosting model as part of your development-time experience.

Hosting in IIS

To deploy applications targeting ASP.NET Core 2.2.0-preview2 on servers with IIS, you require a new version of the 2.2 Runtime & Hosting Bundle on the target server. The bundle is available at https://www.microsoft.com/net/download/dotnet-core/2.2.

Caveats

There are a couple of caveats with the new in-process hosting model:

  • You are limited to one application per IIS Application Pool.
  • No support for .NET Framework. The new module is only capable of hosting .NET Core in the IIS process.

If you have an ASP.NET Core 2.2 app that’s using the in-process hosting model, you can turn it off by setting the <AspNetCoreHostingModel> element to outofprocess in your .csproj file.

Template Updates

We’ve cleaned up the Bootstrap 4 project template work that we started in Preview 1. We’ve also added support to the default Identity UI for using both Bootstrap 3 & 4. For compatibility with existing apps the default Bootstrap version for the default UI is now Bootstrap 3, but you can select which version of Boostrap you want to use when calling AddDefaultUI.

HealthCheck Improvements

There are a few small, but important, changes to health checks in preview2.

You can now call AddCheck<T> where T is a type of IHealthCheck:

services.AddHealthChecks()
        .AddCheck<MyHealthCheck>();

This will register your health check as a transient service, meaning that each time the health check service is called a new instance will be created. We allow you to register IHealthCheck implementations with any service lifetime when you register them manually:

services.AddHealthChecks();
services.AddSingleton<IHealthCheck, MySingletonCheck>();

A scope is created for each invocation of the HealthChecksService. As with all DI lifetimes you should be careful when creating singleton objects that depend on services with other lifetimes as described here.

You can filter which checks you want to execute when using the middleware or the HealthCheckService directly. In this example we are executing all our health checks when a request is made on the ready path, but just returning a 200 OK when the live path is hit:

// The readiness check uses all of the registered health checks (default)
app.UseHealthChecks("/health/ready");

// The liveness check uses an 'identity' health check that always returns healthy
app.UseHealthChecks("/health/live", new HealthCheckOptions()
{
    // Exclude all checks, just return a 200.
    Predicate = (check) => false,
});

You might do this if, for example, you are using Kubernetes and want to run a comprehensive set of checks before traffic is sent to your application but otherwise are OK as long as you are reachable and still running.

What’s still to come?

We are investigating adding a tags mechanism to checks, so that they can be set and filtered on. We also want to provide an Entity Framework-specific check that will check whatever database has been configured for use with your DbContext.

Migrating an ASP.NET Core 2.1 project to 2.2

To migrate an ASP.NET Core project from 2.1.x to 2.2.0-preview2, open the project’s .csproj file and change the value of the <TargetFramework> element to netcoreapp2.2. You do not need to do this if you’re targeting .NET Framework 4.x.

Giving Feedback

The main purpose of providing previews is to solicit feedback so we can refine and improve the product in time for the final release. Please help provide us feedback by logging issues in the appropriate repository at https://github.com/aspnet or https://github.com/dotnet. We look forward to receiving your feedback!

Announcing ML.NET 0.5


Today, coinciding with .NET Conf 2018, we’re announcing the release of ML.NET 0.5. It’s been a few months since we released ML.NET 0.1 at //Build 2018 - a cross-platform, open source machine learning framework for .NET developers. While we’re evolving through new preview releases, we are getting great feedback and would like to thank the community for your engagement as we continue to develop ML.NET together in the open.

In this 0.5 release we are adding TensorFlow model scoring as a transform to ML.NET. This enables using an existing TensorFlow model within an ML.NET experiment. In addition we are also addressing a variety of issues and feedback we received from the community. We welcome feedback and contributions to the conversation: relevant issues can be found here.

As part of the upcoming road in ML.NET, we really want your feedback on making ML.NET easier to use. We are working on a new ML.NET API which improves flexibility and ease of use. When the new API is ready and good enough, we plan to deprecate the current LearningPipeline API. Because this will be a significant change we are sharing our proposals for the multiple API options and comparisons at the end of this blog post. We also want an open discussion where you can provide feedback and help shape the long-term API for ML.NET.

This blog post provides details about the following topics in ML.NET: the new TensorFlow model scoring transform (TensorFlowTransform), and the upcoming new ML.NET API that we’d like your feedback on.

Added a TensorFlow model scoring transform (TensorFlowTransform)

TensorFlow is a popular deep learning and machine learning toolkit that enables training deep neural networks (and general numeric computations).

Deep learning is a subset of AI and machine learning that teaches programs to do what comes naturally to humans: learn by example.
Its main differentiator compared to traditional machine learning is that a deep learning model can learn to perform object detection and classification tasks directly from images, sound or text, or even deliver tasks such as speech recognition and language translation, whereas traditional ML approaches rely heavily on feature engineering and data processing.
Deep learning models need to be trained using very large sets of labeled data and neural networks that contain multiple layers. Its current popularity has several causes: first, it simply performs better on some tasks like computer vision, and second, it can take advantage of the huge amounts of data (and requires that volume in order to perform well) that are nowadays becoming available.

With ML.NET 0.5 we are starting to add support for deep learning in ML.NET. Today we are introducing the first level of integration with TensorFlow in ML.NET through the new TensorFlowTransform, which enables taking an existing TensorFlow model, either trained by you or downloaded from somewhere else, and getting scores from the TensorFlow model in ML.NET.

This new TensorFlow scoring capability doesn’t require you to have a working knowledge of TensorFlow internal details. Longer term we will be working on making the experience for performing Deep Learning with ML.NET even easier.

The implementation of this transform is based on code from TensorFlowSharp.

As shown in the following diagram, you simply add a reference to the ML.NET NuGet packages in your .NET Core or .NET Framework apps. Under the covers, ML.NET includes and references the native TensorFlow library which allows you to write code that loads an existing trained TensorFlow model file for scoring.

TensorFlow-ML.NET application diagram

The following code snippet shows how to use the TensorFlow transform in the ML.NET pipeline:

// ... Additional transformations in the pipeline code

pipeline.Add(new TensorFlowScorer()
{
    ModelFile = "model/tensorflow_inception_graph.pb",   // Example using the Inception v3 TensorFlow model
    InputColumns = new[] { "input" },                    // Name of input in the TensorFlow model
    OutputColumn = "softmax2_pre_activation"             // Name of output in the TensorFlow model
});

// ... Additional code specifying a learner and training process for the ML.NET model

The code example above uses the pre-trained TensorFlow model named Inception v3, which you can download from here. Inception v3 is a very popular image recognition model trained on the ImageNet dataset, where the TensorFlow model tries to classify entire images into a thousand classes, like “Umbrella”, “Jersey”, and “Dishwasher”.

The Inception v3 model can be classified as a deep convolutional neural network and can achieve reasonable performance on hard visual recognition tasks, matching or exceeding human performance in some domains. The model/algorithm was developed by multiple researchers and is based on the original paper, “Rethinking the Inception Architecture for Computer Vision” by Szegedy et al.

In the next ML.NET releases, we will add functionality to enable identifying the expected inputs and outputs of TensorFlow models. For now, use the TensorFlow APIs or a tool like Netron to explore the TensorFlow model.

If you open the previous sample TensorFlow model file (tensorflow_inception_graph.pb) with Netron and explore the model’s graph, you can see how it correlates the InputColumn with the node’s input at the beginning of the graph:

TensorFlow model's input in graph

And how the OutputColumn correlates with softmax2_pre_activation node’s output almost at the end of the graph.

TensorFlow model's output in graph

Limitations: We are currently updating the ML.NET APIs for improved flexibility, as there are a few limitations to using TensorFlow in ML.NET today. For now (when using the LearningPipeline API), these scores can only be used within a LearningPipeline as inputs (numeric vectors) to a learner such as a classifier. However, with the upcoming new ML.NET APIs, the TensorFlow model scores will be directly accessible, so you can score with the TensorFlow model without the current need to add an additional learner and its related training process, as implemented in this sample. That sample creates a multi-class classification ML.NET model based on a StochasticDualCoordinateAscentClassifier, using a label (object name) and a numeric vector feature generated/scored per image file by the TensorFlow model.

Take into account that the TensorFlow code examples using ML.NET mentioned here use the current LearningPipeline API available in v0.5. Moving forward, the ML.NET API for using TensorFlow will be slightly different and not based on the “pipeline”. This is related to the next section of this blog post, which focuses on the new upcoming API for ML.NET.

Finally, we also want to highlight that the ML.NET framework is currently surfacing TensorFlow, but in the future we might look into additional Deep Learning library integrations, such as Torch and CNTK.

You can find an additional code example using the TensorFlowTransform with the existing LearningPipeline API here.

Explore the upcoming new ML.NET API and provide feedback

As mentioned at the beginning of this blog post, we are really looking forward to get your feedback as we create the new ML.NET API while crafting ML.NET. This evolution in ML.NET offers more flexible capabilities than what the current LearningPipeline API offers. The LearningPipeline API will be deprecated when this new API is ready and good enough.

The following links to some example feedback we got in the form of GitHub issues about the limitations when using the LearningPipeline API:

Therefore, based on feedback on the LearningPipeline API, quite a few weeks ago we decided to switch to a new ML.NET API that would address most of the limitations the LearningPipeline API currently has.

Design principles for this new ML.NET API

We are designing this new API based on the following principles:

  • Using parallel terminology with other well-known frameworks like Scikit-Learn, TensorFlow and Spark; we will try to be consistent in terms of naming and concepts, making it easier for developers to understand and learn ML.NET.

  • Keeping simple and concise ML scenarios such as simple train and predict.

  • Allowing advanced ML scenarios (not possible with the current LearningPipeline API as explained in the next section).

We have also explored API approaches like Fluent API, declarative, and imperative.
For additional deeper discussion on principles and required scenarios, check out this issue in GitHub.

Why ML.NET is switching from the LearningPipeline API to a new API?

As part of the preview version crafting process (remember that ML.NET is still in early previews), we’ve been getting LearningPipeline API feedback and discovered quite a few limitations we need to address by creating a more flexible API.

Specifically, the new ML.NET API offers attractive features which aren’t possible with the current LearningPipeline API:

  • Strongly-typed API: This new strongly-typed API takes advantage of C# capabilities so errors can be discovered at compile time, along with improved IntelliSense in the editors.

  • Better flexibility: This API provides a decomposable train and predict process, eliminating rigid and linear pipeline execution. With the new API, you can execute a certain code path and then fork the execution so that multiple paths can re-use the initial common execution. For example, you can share a given transform’s execution and transformed data with multiple learners and trainers, or decompose pipelines and add multiple learners.

This new API is based on concepts such as Estimators, Transforms and DataView, shown in the following code in this blog post.

  • Improved usability: Direct calls to the APIs from your code - no more scaffolding or insulation layer creating an obscure separation between what the user/developer writes and the internal APIs. Entrypoints are no longer mandatory.

  • Ability to simply score with TensorFlow models. Thanks to the mentioned flexibility in the API, you can also simply load a TensorFlow model and score by using it without needing to add any additional learner and training process, as explained in the previous “Limitations” topic within the TensorFlow section.

  • Better visibility of the transformed data: You have better visibility of the data while applying transformers.

Comparison of strongly-typed API vs. LearningPipeline API

Another important comparison is related to the Strongly Typed API feature in the new API.
As an example of the issues you can hit when you don't have a strongly-typed API, the LearningPipeline API (as illustrated in the following code) provides access to data columns by specifying the columns' names as strings, so if you make a typo (for example, writing “Descrption” without the ‘i’ instead of “Description”, as in the sample code below), you will get a run-time exception:

pipeline.Add(new TextFeaturizer("Description", "Descrption"));       

However, the new ML.NET API is strongly typed, so if you make a typo it will be caught at compile time, and you can also take advantage of IntelliSense in the editor.

var estimator = reader.MakeEstimator()
                .Append(row => (
                    description: row.description.FeaturizeText()));

Details on decomposable train and predict API

The following code snippet shows how the transforms and training process of the “GitHub issues labeler” sample app can be implemented with the new API in ML.NET.

This is our current proposal and based on your feedback this API will probably evolve accordingly.

New ML.NET API code example:

public static async Task BuildAndTrainModelToClassifyGithubIssues()
{
    var env = new MLEnvironment();

    string trainDataPath = @"Data\issues_train.tsv";

    // Create reader
    var reader = TextLoader.CreateReader(env, ctx =>
                                    (area: ctx.LoadText(1),
                                    title: ctx.LoadText(2),
                                    description: ctx.LoadText(3)),
                                    new MultiFileSource(trainDataPath), 
                                    hasHeader : true);

    var loss = new HingeLoss(new HingeLoss.Arguments() { Margin = 1 });

    var estimator = reader.MakeNewEstimator()
        .Append(row => (
            // Convert string label to key. 
            label: row.area.ToKey(),
            // Featurize 'description'
            description: row.description.FeaturizeText(),
            // Featurize 'title'
            title: row.title.FeaturizeText()))
        .Append(row => (
            // Concatenate the two features into a vector and normalize.
            features: row.description.ConcatWith(row.title).Normalize(),
            // Preserve the label - otherwise it will be dropped
            label: row.label))
        .Append(row => (
            // Preserve the label (for evaluation)
            row.label,
            // Train the linear predictor (SDCA)
            score: row.label.PredictSdcaClassification(row.features, loss: loss)))
        .Append(row => (
            // Want the prediction, as well as label and score which are needed for evaluation
            predictedLabel: row.score.predictedLabel.ToValue(),
            row.label,
            row.score));

    // Read the data
    var data = reader.Read(new MultiFileSource(trainDataPath));

    // Fit the data to get a model
    var model = estimator.Fit(data);

    // Use the model to get predictions on the test dataset and evaluate the accuracy of the model
    var scores = model.Transform(reader.Read(new MultiFileSource(@"Data\issues_test.tsv")));
    var metrics = MultiClassClassifierEvaluator.Evaluate(scores, r => r.label, r => r.score);

    Console.WriteLine("Micro-accuracy is: " + metrics.AccuracyMicro);

    // Save the ML.NET model into a .ZIP file
    await model.WriteAsync("github-Model.zip");
}

public static async Task PredictLabelForGithubIssueAsync()
{
    // Read model from an ML.NET .ZIP model file
    var model = await PredictionModel.ReadAsync("github-Model.zip");

    // Create a prediction function that can be used to score incoming issues
    var predictor = model.AsDynamic.MakePredictionFunction<GitHubIssue, IssuePrediction>(env);

    // This prediction will classify this particular issue in a type such as "EF and Database access"
    var prediction = predictor.Predict(new GitHubIssue
    {
        title = "Sample issue related to Entity Framework",
        description = @"When using Entity Framework Core I'm experiencing database connection failures when running queries or transactions. Looks like it could be related to transient faults in network communication agains the Azure SQL Database."
    });

    Console.WriteLine("Predicted label is: " + prediction.predictedLabel);
}

Compare with the following old LearningPipeline API code snippet that lacks flexibility because the pipeline execution is not decomposable but linear:

Old LearningPipeline API code example:

public static async Task BuildAndTrainModelToClassifyGithubIssuesAsync()
{
    // Create the pipeline
    var pipeline = new LearningPipeline();

    // Read the data
    pipeline.Add(new TextLoader(DataPath).CreateFrom<GitHubIssue>(useHeader: true));

    // Dictionarize the "Area" column
    pipeline.Add(new Dictionarizer(("Area", "Label")));

    // Featurize the "Title" column
    pipeline.Add(new TextFeaturizer("Title", "Title"));

    // Featurize the "Description" column
    pipeline.Add(new TextFeaturizer("Description", "Description"));
    
    // Concatenate the provided columns
    pipeline.Add(new ColumnConcatenator("Features", "Title", "Description"));

    // Set the algorithm/learner to use when training
    pipeline.Add(new StochasticDualCoordinateAscentClassifier());

    // Specify the column to predict when scoring
    pipeline.Add(new PredictedLabelColumnOriginalValueConverter() { PredictedLabelColumn = "PredictedLabel" });

    Console.WriteLine("=============== Training model ===============");

    // Train the model
    var model = pipeline.Train<GitHubIssue, GitHubIssuePrediction>();

    // Save the model to a .zip file
    await model.WriteAsync(ModelPath);

    Console.WriteLine("=============== End training ===============");
    Console.WriteLine("The model is saved to {0}", ModelPath);
}

public static async Task<string> PredictLabelForGitHubIssueAsync()
{
    // Read model from an ML.NET .ZIP model file
    _model = await PredictionModel.ReadAsync<GitHubIssue, GitHubIssuePrediction>(ModelPath);
    
    // This prediction will classify this particular issue in a type such as "EF and Database access"
    var prediction = _model.Predict(new GitHubIssue
        {
            Title = "Sample issue related to Entity Framework", 
            Description = "When using Entity Framework Core I'm experiencing database connection failures when running queries or transactions. Looks like it could be related to transient faults in network communication agains the Azure SQL Database..."
        });

    return prediction.Area;
}

The old LearningPipeline API is a fully linear code path, so you can’t decompose it into multiple pieces.
For instance, the BikeSharing ML.NET sample (available at the machine-learning-samples GitHub repo) is using the current LearningPipeline API.

This sample compares the regression learner accuracy using the evaluators API by:

  • Performing several data transforms to the original dataset
  • Training and creating seven different ML.NET models based on seven different regression trainers/algorithms (such as FastTreeRegressor, FastTreeTweedieRegressor, StochasticDualCoordinateAscentRegressor, etc.)

The intent is to help you compare the regression learners for a given problem.

Since the data transformations are the same for those models, you might want to re-use the code execution related to transforms. However, because the LearningPipeline API only provides a single linear execution, you need to run the same data transformation steps for every model you create/train, as shown in the following code excerpt from the BikeSharing ML.NET sample.

var fastTreeModel = new ModelBuilder(trainingDataLocation, new FastTreeRegressor()).BuildAndTrain();
var fastTreeMetrics = modelEvaluator.Evaluate(fastTreeModel, testDataLocation);
PrintMetrics("Fast Tree", fastTreeMetrics);

var fastForestModel = new ModelBuilder(trainingDataLocation, new FastForestRegressor()).BuildAndTrain();
var fastForestMetrics = modelEvaluator.Evaluate(fastForestModel, testDataLocation);
PrintMetrics("Fast Forest", fastForestMetrics);

var poissonModel = new ModelBuilder(trainingDataLocation, new PoissonRegressor()).BuildAndTrain();
var poissonMetrics = modelEvaluator.Evaluate(poissonModel, testDataLocation);
PrintMetrics("Poisson", poissonMetrics);

//Other learners/algorithms
//...

The BuildAndTrain() method needs to include both the data transforms and the specific algorithm for each case, as shown in the following code:

public PredictionModel<BikeSharingDemandSample, BikeSharingDemandPrediction> BuildAndTrain()
{
    var pipeline = new LearningPipeline();
    pipeline.Add(new TextLoader(_trainingDataLocation).CreateFrom<BikeSharingDemandSample>(useHeader: true, separator: ','));
    pipeline.Add(new ColumnCopier(("Count", "Label")));
    pipeline.Add(new ColumnConcatenator("Features", 
                                        "Season", 
                                        "Year", 
                                        "Month", 
                                        "Hour", 
                                        "Weekday", 
                                        "Weather", 
                                        "Temperature", 
                                        "NormalizedTemperature",
                                        "Humidity",
                                        "Windspeed"));
    pipeline.Add(_algorythm);

    return pipeline.Train<BikeSharingDemandSample, BikeSharingDemandPrediction>();
}            

With the old LearningPipeline API, for every training run with a different algorithm you need to repeat the same process, performing the following steps again and again:

  • Load dataset from file
  • Make column transformations (concat, copy, or additional featurizers or dictionarizers, if needed)

But with the new ML.NET API, based on Estimators and DataView, you will be able to re-use parts of the execution; in this case, re-using the data transforms execution as the base for multiple models that use different algorithms, as sketched below.
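
To illustrate the idea, here is an aspirational sketch. It reuses only the calls and variable names shown in the new-API example earlier in this post, and, like that example, it is a proposal that will likely evolve: the shared transforms are fitted once, and the transformed data can then feed more than one trainer.

// Build only the shared data transforms as their own estimator.
var transformEstimator = reader.MakeNewEstimator()
    .Append(row => (
        label: row.area.ToKey(),
        description: row.description.FeaturizeText(),
        title: row.title.FeaturizeText()))
    .Append(row => (
        features: row.description.ConcatWith(row.title).Normalize(),
        label: row.label));

// Execute the shared transforms once...
var data = reader.Read(new MultiFileSource(trainDataPath));
var transformModel = transformEstimator.Fit(data);
var transformedData = transformModel.Transform(data);

// ...then append and fit multiple different trainers (SDCA, FastTree, etc.)
// over the same transformedData, without re-running the transforms.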

You can also explore other “aspirational code examples” with the new API here.

Because this will be a significant change in ML.NET we want to share our proposals and start an open discussion with you where you can provide your feedback and help shape the long-term API for ML.NET.

Provide your feedback on the new API

Want to get involved? Start by providing feedback at this blog post comments below or through issues at the ML.NET GitHub repo: https://github.com/dotnet/machinelearning/issues

Get started!

If you haven’t already, get started with ML.NET here!

Next, explore some other great resources:

We look forward to your feedback and welcome you to file issues with any suggestions or enhancements in the ML.NET GitHub repo.

This blog was authored by Cesar de la Torre, Gal Oshri, John Alexander, and Ankit Asthana

Thanks,

The ML.NET Team

A (Belated) Welcome to C# 7.3


Better late than never! Some of you may have noticed that C# 7.3 already shipped, back in Visual Studio 2017 update 15.7. Some of you may even be using the features already.

C# 7.3 is the newest point release in the 7.0 family and it continues themes of performance-focused safe code, as well as bringing some small "quality of life" improvements in both new and old features.

For performance, we have a few features that improve ref variables, pointers, and stackalloc. ref variables can now be reassigned, letting you treat ref variables more like traditional variables. stackalloc now has an optional initializer syntax, letting you easily and safely initialize stack allocated buffers. For struct fixed-size buffers, you can now index into the buffer without using a pinning statement. And when you do need to pin, we’ve made the fixed statement more flexible by allowing it to operate on any type that has a suitable GetPinnableReference method.
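
To make the performance features concrete, here is a minimal, illustrative sketch (not from the release notes) showing ref local reassignment and the stackalloc initializer syntax; Span<T> is used so no unsafe context is required:

using System;

class CSharp73PerfDemo
{
    static void Main()
    {
        int x = 1, y = 2;

        // C# 7.3: ref locals can now be reassigned to refer to different storage.
        ref int r = ref x;
        r = ref y;        // re-point the ref; this was previously a compile-time error
        r = 42;           // writes through the ref, so y is now 42
        Console.WriteLine(y);

        // C# 7.3: stackalloc now has an optional initializer; assigning to Span<int>
        // keeps the code safe (no unsafe keyword needed).
        Span<int> buffer = stackalloc int[] { 1, 2, 3 };
        Console.WriteLine(buffer[2]);
    }
}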

For feature improvements, we’ve removed some long time restrictions on constraints to System.Enum, System.Delegate, and we’ve added a new unmanaged constraint that allows you to take a pointer to a generic type parameter. We’ve also improved overload resolution (again!), allowed out and pattern variables in more places, enabled tuples to be compared using == and !=, and fixed the [field: ] attribute target for auto-implemented properties to target the property’s backing field.
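
And a similarly minimal, illustrative sketch (again, not from the release notes) of the new generic constraints and tuple equality:

using System;

static class Constraints73
{
    // C# 7.3: System.Enum can now be used as a constraint.
    public static string[] Names<TEnum>() where TEnum : Enum
        => Enum.GetNames(typeof(TEnum));

    // C# 7.3: the 'unmanaged' constraint permits sizeof(T) and T* in unsafe code
    // (the project must allow unsafe blocks for this method to compile).
    public static unsafe int SizeOf<T>() where T : unmanaged => sizeof(T);
}

class Program
{
    enum Color { Red, Green, Blue }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Constraints73.Names<Color>()));
        Console.WriteLine(Constraints73.SizeOf<decimal>()); // 16

        // C# 7.3: tuples can be compared with == and != (element-wise).
        var a = (x: 1, y: 2);
        var b = (x: 1, y: 2);
        Console.WriteLine(a == b); // True
    }
}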

All of these features are small additions to the language, but they should make each of these parts of the language a little easier and more pleasant. If you want more details, you can see the 15.7 release notes or check out the documentation for What’s new in C# 7.3.

Andy Gocke
C#/VB Compiler Team

Announcing Entity Framework Core 2.2 Preview 2 and the preview of the Cosmos DB provider and spatial extensions for EF Core


Today we are making EF Core 2.2 Preview 2 available, together with a preview of our data provider for Cosmos DB and new spatial extensions for our SQL Server and in-memory providers.

Obtaining the preview

The preview bits are available on NuGet, and also as part of ASP.NET Core 2.2 Preview 2 and the .NET Core SDK 2.2 Preview 2, both of which are also releasing today.

If you are working on an application based on ASP.NET Core, we recommend you upgrade to ASP.NET Core 2.2 Preview 2 following the instructions in the announcement.

The SQL Server and the in-memory providers are included in ASP.NET Core, but for other providers and any other type of application, you will need to install the corresponding NuGet package. For example, to add the 2.2 Preview 2 version of the SQL Server provider in a .NET Core library or application from the command line, use:

$ dotnet add package Microsoft.EntityFrameworkCore.SqlServer -v 2.2.0-preview2-35157

Or from the Package Manager Console in Visual Studio:

PM> Install-Package Microsoft.EntityFrameworkCore.SqlServer -Version 2.2.0-preview2-35157

For more details on how to add EF Core to your projects see our documentation on Installing Entity Framework Core.

The Cosmos DB provider and the spatial extensions ship as new separate NuGet packages. We’ll explain how to get started with them in the corresponding feature descriptions.

What is new in this preview?

As we explained in our roadmap announcement back in June, there will be a large number of bug fixes (you can see the list of issues we have fixed so far here), but only a relatively small number of new features in EF Core 2.2.

Here are the most salient new features:

New EF Core provider for Cosmos DB

This new provider enables developers familiar with the EF programming model to easily target Azure Cosmos DB as an application database, with all the advantages that come with it, including global distribution, elastic scalability, “always on” availability, very low latency, and automatic indexing.

The provider targets the SQL API in Cosmos DB, and can be installed in an application by issuing the following command from the command line:

$ dotnet add package Microsoft.EntityFrameworkCore.Cosmos.Sql -v 2.2.0-preview2-35157

Or from the Package Manager Console in Visual Studio:

PM> Install-Package Microsoft.EntityFrameworkCore.Cosmos.Sql -Version 2.2.0-preview2-35157

To configure a DbContext to connect to Cosmos DB, you call the UseCosmosSql() extension method. For example, the following DbContext connects to a database called “MyDocuments” on the Cosmos DB local emulator to store a simple blogging model:

public class BloggingContext : DbContext
{
  public DbSet<Blog> Blogs { get; set; }
  public DbSet<Post> Posts { get; set; }

  protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
  {
    optionsBuilder.UseCosmosSql(
      "https://localhost:8081",
      "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==",
      "MyDocuments");
  }
}

public class Blog
{
  public int BlogId { get; set; }
  public string Name { get; set; }
  public string Url { get; set; }
  public List<Post> Posts { get; set; }
}

public class Post
{
  public int PostId { get; set; }
  public string Title { get; set; }
  public string Content { get; set; }
}

If you want, you can create the database programmatically, using EF Core APIs:

using (var context = new BloggingContext())
{
  await context.Database.EnsureCreatedAsync();
}

Once you have connected to an existing database and you have defined your entities, you can start storing data in the database, for example:

using (var context = new BloggingContext())
{
  context.Blogs.Add(
    new Blog
    {
        BlogId = 1,
        Name = ".NET Blog",
        Url = "https://blogs.msdn.microsoft.com/dotnet/",
        Posts = new List<Post>
        {
            new Post
            {
                PostId = 2,
                Title = "Welcome to this blog!"
            }
        }
    });

  // Persist the changes (as noted in the limitations below, the preview only supports the async version)
  await context.SaveChangesAsync();
}

And you can write queries using LINQ:

var dotNetBlog = context.Blogs.Single(b => b.Name == ".NET Blog");

Current capabilities and limitations of the Cosmos DB provider

Around a year ago, we started showing similar functionality in demos, using a Cosmos DB provider prototype we put together as a proof of concept. This helped us get some great feedback:

  • Most customers we talked to confirmed that they could see a lot of value in being able to use the EF APIs they were already familiar with to target Cosmos DB, and potentially other NoSQL databases.
  • There were specific details about how the prototype worked that we needed to fix.
    For example, our prototype mapped entities in each inheritance hierarchy to their own separate Cosmos DB collections, but because of the way Cosmos DB pricing works, this could become unnecessarily expensive. Based on this feedback, we decided to implement a new mapping convention that by default stores all entity types defined in the DbContext in the same Cosmos DB collection, and uses a discriminator property to identify the entity type.

The preview we are releasing today, although limited in many ways, is no longer a prototype, but the actual code we plan to keep working on and eventually ship. Our hope is that by releasing it early in development, we will enable many developers to play with it and provide more valuable feedback.

Here are some of the known limitations we are working on overcoming for Preview 3 and RTM.

  • No asynchronous query support: Currently, LINQ queries can only be executed synchronously.
  • Only some of the LINQ operators translatable to Cosmos DB’s SQL dialect are currently translated.
  • No synchronous API support for SaveChanges(), EnsureCreated() or EnsureDeleted(): you can use the asynchronous versions.
  • No auto-generated unique keys: Since entities of all types share the same collection, each entity needs to have a globally unique key value, but in Preview 2, if you use an integer Id key, you will need to set it explicitly to unique values on each added entity. This has been addressed, and in our nightly builds we now automatically generate GUID values.
  • No nesting of owned entities in documents: We are planning to use entity ownership to decide when an entity should be serialized as part of the same JSON document as the owner. In fact we are extending the ability to specify ownership to collections in 2.2. However this behavior hasn’t been implemented yet and each entity is stored as its own document.

You can track in more detail our progress overcoming these and other limitations in this issue on GitHub.

For anything else that you find, please report it as a new issue.

Spatial extensions for SQL Server and in-memory

Support for exposing the spatial capabilities of databases through the mapping of spatial columns and functions is a long-standing and popular feature request for EF Core. In fact, some of this functionality has been available to you for some time if you use the EF Core provider for PostgreSQL, Npgsql. In EF Core 2.2, we are finally attempting to address this for the providers that we ship.

Our implementation picks the same NetTopologySuite library that the PostgreSQL provider uses as the source of spatial .NET types you can use in your entity properties. NetTopologySuite is a database-agnostic spatial library that implements standard spatial functionality using .NET idioms like properties and indexers.

The extension then adds the ability to map and convert instances of these types to the column types supported by the underlying database, and usage of methods defined on these types in LINQ queries, to SQL functions supported by the underlying database.

You can install the spatial extension using the following command from the command line:

$ dotnet add package Microsoft.EntityFrameworkCore.SqlServer.NetTopologySuite -v 2.2.0-preview2-35157

Or from the Package Manager Console in Visual Studio:

PM> Install-Package Microsoft.EntityFrameworkCore.SqlServer.NetTopologySuite -Version 2.2.0-preview2-35157

Once you have installed this extension, you can enable it in your DbContext by calling the UseNetTopologySuite() method inside UseSqlServer() either in OnConfiguring() or AddDbContext().

For example:

public class SensorContext : DbContext
{
  public DbSet<Measurement> Measurements { get; set; }

  protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
  {
    optionsBuilder
      .UseSqlServer(
        @"Server=(localdb)\mssqllocaldb;Database=SensorDatabase;Trusted_Connection=True;ConnectRetryCount=0",
        sqlOptions => sqlOptions.UseNetTopologySuite());
  }
}

Then you can start using spatial types in your model definition. In this case, we will use NetTopologySuite.Geometries.Point to represent the location of a measurement:

using NetTopologySuite.Geometries;
...
  public class Measurement
  {
      public int Id { get; set; }
      public DateTime Time { get; set; }
      public Point Location { get; set; }
      public double Temperature { get; set; }
  }

Once you have configured the DbContext and the model in this way, you can create the database, and start persisting spatial data:

using (var context = new SensorContext())
{
  context.Database.EnsureCreated();
  context.AddRange(
    new Measurement { Time = DateTime.Now, Location = new Point(0, 0), Temperature = 0.0},
    new Measurement { Time = DateTime.Now, Location = new Point(1, 1), Temperature = 0.1},
    new Measurement { Time = DateTime.Now, Location = new Point(1, 2), Temperature = 0.2},
    new Measurement { Time = DateTime.Now, Location = new Point(2, 1), Temperature = 0.3},
    new Measurement { Time = DateTime.Now, Location = new Point(2, 2), Temperature = 0.4});
  context.SaveChanges();
}

And once you have a database containing spatial data, you can start executing queries:

var currentLocation = new Point(0, 0);

var nearestMeasurements =
  from m in context.Measurements
  where m.Location.Distance(currentLocation) < 2
  orderby m.Location.Distance(currentLocation) descending
  select m;

foreach (var m in nearestMeasurements)
{
    Console.WriteLine($"A temperature of {m.Temperature} was detected on {m.Time} at {m.Location}.");
}

This will result in the following SQL query being executed:

SELECT [m].[Id], [m].[Location], [m].[Temperature], [m].[Time]
FROM [Measurements] AS [m]
WHERE [m].[Location].STDistance(@__currentLocation_0) < 2.0E0
ORDER BY [m].[Location].STDistance(@__currentLocation_0) DESC

Current capabilities and limitations of the spatial extensions

  • It is possible to map properties of concrete types from NetTopologySuite.Geometries such as Geometry, Point, or Polygon, or interfaces from GeoAPI.Geometries, such as IGeometry, IPoint, IPolygon, etc.
  • Only SQL Server and in-memory database are supported: For in-memory it is not necessary to call UseNetTopologySuite(). SQLite will be enabled in Preview 3.
  • EF Core Migrations does not scaffold spatial types correctly, so you currently cannot use Migrations to create the database schema or apply seed data without workarounds.
  • Mapping to Geography columns isn’t fully tested and may have limitations. If you attempt this, make sure that you configure the underlying column type in OnModelCreating():
    modelBuilder.Entity<Measurement>().Property(b => b.Location).HasColumnType("Geography");

    And that you specify an SRID, for example 4326, in all spatial instances you persist or use in queries:

    var currentLocation = new Point(0, 0) { SRID = 4326 };

For anything else that you find, please report it as a new issue.

Collections of owned entities

EF Core 2.2 extends the ability to express ownership relationships to one-to-many associations. This helps constrain how entities in an owned collection can be manipulated (for example, they cannot be used without an owner) and triggers automatic behaviors such as implicit eager loading. In the case of relational databases, owned collections are mapped to separate tables from the owner, just like regular one-to-many associations, but in the case of a document-oriented database such as Cosmos DB, we plan to nest owned entities (in owned collections or references) within the same JSON document as the owner.

You can use the feature by invoking the new OwnsMany() API:

modelBuilder.Entity<Customer>().OwnsMany(c => c.Addresses);
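
For context, here is a minimal sketch of a model this call could apply to (the Customer and Address classes and the key configuration are illustrative assumptions, not from the original announcement):

using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;

public class Customer
{
  public int Id { get; set; }
  public string Name { get; set; }
  public List<Address> Addresses { get; set; }
}

public class Address
{
  public string Street { get; set; }
  public string City { get; set; }
}

public class CustomerContext : DbContext
{
  public DbSet<Customer> Customers { get; set; }

  protected override void OnModelCreating(ModelBuilder modelBuilder)
  {
    // Addresses become an owned collection: they can only be reached through
    // their owning Customer. On a relational database they still map to a
    // separate table, just like a regular one-to-many association.
    modelBuilder.Entity<Customer>().OwnsMany(c => c.Addresses, a =>
    {
      // Owned collection types need a key; one common pattern is a composite
      // key made of a foreign key back to the owner plus an Id property.
      a.HasForeignKey("CustomerId");
      a.Property<int>("Id");
      a.HasKey("CustomerId", "Id");
    });
  }
}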

Query tags

This feature is designed to facilitate the correlation of LINQ queries in code with the corresponding generated SQL output captured in logs.

To take advantage of the feature, you annotate a LINQ query using the new WithTag() API. Using the spatial query from the previous example:

var nearestMeasurements =
    from m in context.Measurements.WithTag(@"This is my spatial query!")
    where m.Location.Distance(currentLocation) < 2.5
    orderby m.Location.Distance(currentLocation) descending
    select m;

This will generate the following SQL output:

-- EFCore: (#This is my spatial query!)
SELECT [m].[Id], [m].[Location], [m].[Temperature], [m].[Time]
FROM [Measurements] AS [m]
WHERE [m].[Location].STDistance(@__currentLocation_0) < 2.5E0
ORDER BY [m].[Location].STDistance(@__currentLocation_0) DESC

Provider compatibility

Although we have set up testing to make sure that existing providers will continue to work with EF Core 2.2, there might be unexpected problems, and we welcome users and provider writers to report compatibility issues on our issue tracker.

What comes next?

We are still working on some additional features we would like to include in EF Core 2.2, such as reverse engineering of database views into query types and support for spatial types with SQLite, as well as additional bug fixes. We are planning on releasing EF Core 2.2 in the last calendar quarter of 2018.

In the meantime, our team has started working on our next major release, EF Core 3.0, which will include, among other improvements, a significant overhaul of our LINQ implementation.

We will also soon start the work to make Entity Framework 6 compatible with .NET Core 3.0, which was announced last May.

Your feedback is really needed!

We encourage you to play with the new features, and we thank you in advance for posting any feedback to our issue tracker.

The spatial extensions and the Cosmos DB provider in particular are very large features that expose a lot of new capabilities and APIs. Really being able to ship these features as part of EF Core 2.2 RTM is going to depend on your valuable feedback and on our ability to use it to iterate over the design in the next few months.

If not Notebooks, then what? Look to Literate Programming


Author and research engineer Joel Grus kicked off an important conversation about Jupyter Notebooks in his recent presentation at JupyterCon

Joel grus title slide

There's no video yet available of Joel's talk, but you can guess the theme of that opening slide, and walking through the slides conveys the message well, I think. Yihui Xie, author and creator of the rmarkdown package, provides a detailed summary and response to Joel's talk, where he lists Joel's main critiques of Notebooks: 

  1. Hidden state and out-of-order execution
  2. Notebooks are difficult for beginners
  3. Notebooks encourage bad habits
  4. Notebooks discourage modularity and testing
  5. Jupyter’s autocomplete, linting, and way of looking up the help are awkward
  6. Notebooks encourage bad processes
  7. Notebooks hinder reproducible + extensible science
  8. Notebooks make it hard to copy and paste into Slack/Github issues
  9. Errors will always halt execution
  10. Notebooks make it easy to teach poorly
  11. Notebooks make it hard to teach well 

R Markdown book

Yihui suggests that many of these shortcomings of Notebooks could be addressed through literate programming systems, where the document you edit is plain-text (and so easy to edit, manage, and track), and computations are strictly processed from the beginning of the document to the end. I use the RMarkdown system myself, and find it a delightful way of combining code, output and graphics in a single document, which can in turn be rendered in a variety of formats including HTML, PDF, Word and even PowerPoint.

Yihui expands on these themes in greater detail in his excellent book (with JJ Allaire and Garrett Grolemund), R Markdown: The Definitive Guide, published by CRC Press. Incidentally, the book itself is a fine example of literate programming; you can find the R Markdown source here, and you can read the book in its entirety here. As Joel mentions in his talk, an automatically-generated document of that length and complexity simply wouldn't be possible with Notebooks.

All that being said, RMarkdown is (for now) a strictly R-based system. Are there equivalent literate programming systems for Python? That's a genuine question — I don't know the Python ecosystem well enough to answer — but if you have suggestions please leave them in the comments.

Yihui Xie: The First Notebook War

Announcing .NET Core 2.2 Preview 2


Today, we are announcing .NET Core 2.2 Preview 2. We have great improvements that we want to share and that we would love to get your feedback on, either in the comments or at dotnet/core #1938.

ASP.NET Core 2.2 Preview 2 and Entity Framework 2.2 Preview 2 are also releasing today. We are also announcing C# 7.3 and ML.NET 0.5.

You can see complete details of the release in the .NET Core 2.2 Preview 2 release notes. Related instructions, known issues, and workarounds are included in the release notes. Please report any issues you find in the comments or at dotnet/core #1938.

Thanks to everyone who contributed to .NET Core 2.2. You’ve helped make .NET Core a better product!

Download .NET Core 2.2

You can download and get started with .NET Core 2.2, on Windows, macOS, and Linux:

Docker images are available at microsoft/dotnet for .NET Core and ASP.NET Core.

.NET Core 2.2 Preview 2 can be used with Visual Studio 15.8, Visual Studio for Mac and Visual Studio Code.

Tiered Compilation Enabled

The biggest change in .NET Core 2.2 Preview 2 is that tiered compilation is now enabled by default. We announced that tiered compilation was available as part of the .NET Core 2.1 release. At that time, you had to enable tiered compilation via application configuration or an environment variable. It is now enabled by default and can be disabled as needed.

You can see the benefit of tiered compilation in the image below. The baseline is .NET Core 2.1 RTM, running in a default configuration, with tiered compilation disabled. The second scenario has tiered compilation. You can see a significant request-per-second (RPS) throughput benefit with tiered compilation enabled.

The numbers in the chart are scaled so that baseline always measures 1.0. That approach makes it very easy to calculate performance changes as a percentage. The first two tests are TechEmpower benchmarks and the last one is Music Store, our frequent sample ASP.NET app.

Platform Support

.NET Core 2.2 is supported on the following operating systems:

  • Windows Client: 7, 8.1, 10 (1607+)
  • Windows Server: 2008 R2 SP1+
  • macOS: 10.12+
  • RHEL: 6+
  • Fedora: 27+
  • Ubuntu: 14.04+
  • Debian: 8+
  • SLES: 12+
  • openSUSE: 42.3+
  • Alpine: 3.7+

Chip support follows:

  • x64 on Windows, macOS, and Linux
  • x86 on Windows
  • ARM32 on Linux (Ubuntu 18.04+, Debian 9+)

Closing

Please download and test .NET Core 2.2 Preview 2. We’re looking for feedback on the release with the intent of shipping the final version later this year.

We recently shared how Bing.com runs on .NET Core 2.1. The Bing.com site experienced significant benefits when it moved to .NET Core 2.1. Please do check out that post if you are interested in a case study of running .NET Core in production. You may also want to take a look at the .NET Customers site if you are interested in a broader set of customer stories.


Deep dive into Azure Boards


Azure Boards is a service for managing the work for your software projects. Teams need tools that flex and grow, and Azure Boards does just that, bringing you a rich set of capabilities including native support for Scrum and Kanban, customizable dashboards, and integrated reporting.

Azure Boards

In this post I’ll walk through a few core features in Azure Boards and give some insight in to how you can make them work for your teams and projects.

Work items

All work in Azure Boards is tracked through an artifact called a work item. Work items are where you and your team describe the details of what’s needed. Each work item uses a state model to track and communicate progress. For example, a common state model might be: New > Active > Closed. As work progresses, items are updated accordingly, allowing everyone who works on the project to have a complete picture of where things are at. Below is a picture of the work items hub in Azure Boards. This page is the home for all work items and provides quick filters to allow you to find the items you need.

WorkItemsHub

Opening a work item brings you to a much richer view, including the history of all changes, any related discussion, and links to development artifacts including branches, pull requests, commits, and builds. Work items are customizable, supporting the ability to add new fields, create rules, and modify aspects of the layout. For more information, visit the work items documentation page.

Azure DevOps - work items

Boards, Backlogs, and Sprints

Azure Boards provides a variety of choices for planning and managing work. Let’s look at a few of the core experiences.

Boards

Each project comes with a pre-configured Kanban board perfect for managing the flow of your work. Boards are highly customizable allowing you to add the columns you need for each team and project. Boards support swim lanes, card customization, conditional formatting, filtering, and even WIP limits. For more information, visit the Kanban boards documentation page.

Boards

Backlogs

Backlogs help you keep things in order of priority, and to understand the relationships between your work. Drag and drop items to adjust the order, or quickly assign work to an upcoming sprint. For more information, visit backlogs documentation page.

Backlogs

Sprints

Finally, sprints give you the ability to create increments of work for your team to accomplish together. Each sprint comes equipped with a backlog, taskboard, burndown chart, and capacity planning view to help you and your team deliver your work on time. For more information, visit the sprints documentation page.

Sprints

Dashboards

In any project, it’s critical that you have a clear view of what’s happening. Azure Boards comes complete with a rich canvas for creating dashboards. Add widgets as needed to track progress and direction. For more information, visit the dashboards documentation page.

Dashboard

Queries

And finally, one of the most powerful features in Azure Boards is the query engine. Queries let you tailor exactly what you’re tracking, creating easy-to-monitor KPIs. It’s simple to create new queries and pin them to dashboards for quick monitoring and status. For more information, visit the queries documentation page.

Queries

Getting started

If you’re new to Azure Boards, it’s easy to get started: just head over to the Azure DevOps homepage and click Start free to create your first Azure DevOps project. If you’ve got feedback to share, or questions that need answering, please reach out on Twitter at @AzureDevOps.

Thanks,

Aaron Bjork

HDInsight Tools for VSCode: Integrations with Azure Account and HDInsight Explorer


Making it easy for developers to get started on coding has always been our top priority. We are happy to announce that HDInsight Tools for VS Code now integrates with VS Code Azure Account. This new feature makes your Azure HDInsight sign-in experience much easier. For first-time users, the tools put the required sign-in code into the copy buffer and automatically open the Azure sign-in portal, where you can paste the code and complete the authentication process. For returning users, the tools sign you in automatically. You can quickly start authoring PySpark or Hive jobs, performing data queries, or navigating your Azure resources.

We are also excited to introduce a graphical tree view for the HDInsight Explorer within VS Code. With HDInsight Explorer, data scientists and data developers can navigate HDInsight Hive and Spark clusters across subscriptions and tenants, and browse Azure Data Lake Storage and Blob Storage connected to these HDInsight clusters. Moreover, you can inspect your Hive metadata database and table schema.

Key Customer Benefits

  • Support Azure auto sign-in and improve sign-in experiences via integration with Azure Account extension.
  • Enable multi-tenant support so you can manage your Azure subscription resources across tenants.
  • Gain insights into available HDInsight Spark, Hadoop and HBase clusters across environments, subscriptions, and tenants.
  • Facilitate Spark and Hive programming by exposing Hive metadata tables and schema in HDInsight Explorer, as well as displaying Blob Storage and Azure Data Lake Storage.

 hdi-azure-hdinsight-cluster

How to install or update

First, install Visual Studio Code and download Mono 4.2.x (for Linux and Mac). Then get the latest HDInsight Tools by going to the VSCode Extension repository or the VSCode Marketplace and searching for HDInsight Tools for VSCode.

Install_thumb2_thumb

For more information about HDInsight Tools for VSCode, please use the following resources:

Learn more about today’s announcements on the Azure blog and Big Data blog. Discover more on the Azure service updates page.

If you have questions, feedback, comments, or bug reports, please use the comments below or send a note to hdivstool@microsoft.com.

Azure Marketplace new offers – Volume 19


We continue to expand the Azure Marketplace ecosystem. From August 1 to August 15, 2018, 50 new offers successfully met the onboarding criteria and went live. See details of the new offers below:

Virtual Machine

AudioCodes IP Phone Manager Express

AudioCodes IP Phone Manager Express: AudioCodes IP Phone Manager enables administrators to offer a reliable desktop phone service within their organization. Deploy and monitor AudioCodes IP phones to increase productivity and lower IT expenses.


Balabit Privileged Session Management (PSM)

Balabit Privileged Session Management (PSM): Balabit Privileged Session Management (PSM) controls privileged access to remote IT systems; records activities in searchable, movie-like audit trails; and prevents malicious actions.

BOSH Stemcell for Windows Server 1803

BOSH Stemcell for Windows Server 1803: BOSH Stemcell for Windows Server 1803 by Pivotal Software Inc.

Consul Certified by Bitnami

Consul Certified by Bitnami: Consul is a tool for discovering and configuring services in your infrastructure. Bitnami certifies that our images are secure, up-to-date, and packaged using industry best practices.

etcd Certified by Bitnami

etcd Certified by Bitnami: etcd is a distributed key-value store designed to securely store data across a cluster. etcd is widely used in production due to its reliability, fault tolerance, and ease of use.

F5 BIG-IP Virtual Edition (BYOL)

F5 BIG-IP Virtual Edition (BYOL): This is F5's application delivery services platform for Azure. From traffic management and service offloading to application access, acceleration, and security, the BIG-IP Virtual Edition ensures your apps are fast, available, and secure.

F5 Per-App Virtual Edition (PAYG)

F5 Per-App Virtual Edition (PAYG): F5 Per-App Virtual Editions (VEs) provide application delivery controller (ADC) and web application firewall (WAF) functionality for Azure-hosted applications, delivering intelligent traffic management and security services on a per-app basis.

GigaSECURE Cloud 5.4.00 - Hourly (100 pack)

GigaSECURE Cloud 5.4.00: GigaSECURE Cloud delivers intelligent network traffic visibility for workloads running in Azure and enables increased security, operational efficiency, and scale across virtual networks (VNets).

Informix

Informix: Informix features a cloud-delivered, ready-to-run database system. Informix is configured for OLTP workloads and includes entitlement to the Informix Warehouse Accelerator, delivering incredible query acceleration.

Intellicus BI Server (25 Users - Linux)

Intellicus BI Server (100 Users - Linux): Intellicus BI Server is an enterprise reporting and business intelligence platform with all the features needed to create a comprehensive data analytics platform.

Intellicus BI Server (50 Users - Linux)

Intellicus BI Server (100 Users): Intellicus BI Server is an enterprise reporting and business intelligence platform with all the features needed to create a comprehensive data analytics platform.

Intellicus BI Server (100 Users - Linux)

Intellicus BI Server (25 Users - Linux): Intellicus BI Server is an enterprise reporting and business intelligence platform with all the features needed to create a comprehensive data analytics platform.

Intellicus BI Server (100 Users)

Intellicus BI Server (50 Users - Linux): Intellicus BI Server is an enterprise reporting and business intelligence platform with all the features needed to create a comprehensive data analytics platform.

NATS Certified by Bitnami

NATS Certified by Bitnami: NATS is an open-source, lightweight, high-performance messaging system. It is ideal for distributed systems and supports modern cloud architectures and pub-sub, request-reply, and queuing models.

Neo4j Certified by Bitnami

Neo4j Certified by Bitnami: Neo4j is a high-performance graph store with all the features expected of a mature and robust database, like a friendly query language and ACID transactions.

ZooKeeper Certified by Bitnami

ZooKeeper Certified by Bitnami: ZooKeeper provides a reliable, centralized register of configuration data and services for distributed applications. Bitnami certifies that our images are secure, up-to-date, and packaged using industry best practices.

Web Applications

AccessData Lab 6.4 for Azure

AccessData Lab 6.4 for Azure: Manage digital forensic investigations in the cloud with AccessData Lab 6.4 for Azure. Power through massive data sets, handle various data types, and run multiple cases at the same time, all within a collaborative, scalable environment.

Axians myOperations Family

Axians myOperations Family: The myOperations family by Axians opens up an era of new freedom and control for IT managers and users. Developed by experienced consultants, it is the product of years of listening to customers’ voices and needs.

Advanced Threat Protection for Office365

BitDam: BitDam protects email from advanced content-borne attacks. BitDam couples deep application learning with alien application code flow detection, preventing illegal attack code hidden in URLs and documents from being run by enterprise applications.

etcd Cluster

etcd Cluster: etcd is a distributed key-value store designed to securely store data across a cluster. This solution provisions a configurable number of etcd nodes to create a fault-tolerant, distributed, reliable key-value store.

HPCBOX Cluster for OpenFOAM

HPCBOX Cluster for OpenFOAM: HPCBOX provides intelligent workflow capability that lets you plug cloud infrastructure into your application pipeline, giving you granular control of your HPC cloud resources and applications.

NATS Cluster

NATS Cluster: NATS is an open-source, lightweight, high-performance messaging system. This solution provisions a configurable number of NATS nodes to create a high-performance distributed messaging system.

NeuVector Container Security Platform

NeuVector Container Security Platform: This multi-vector container security platform is integrated into Docker and Kubernetes and deploys easily on any Azure instance running Docker and/or Kubernetes.

StealthMail Email Security

StealthMail Email Security: StealthMail makes your emails secure and invisible to email relays, hackers, or public internet. StealthMail gives you exclusive control over your encryption keys, data, and access rights so your email communication is fully protected.

Veritas Resiliency Platform (express install)

Veritas™ Resiliency Platform (express install): This Bring Your Own License (BYOL) version of Veritas Resiliency Platform (VRP) provides single-click disaster recovery and migration for any source workload into Azure. Meet your recovery time objectives with confidence.

Vertica Analytics Platform

Vertica Analytics Platform: With Vertica Analytics Platform for Azure, you can tap into core enterprise capabilities with the deployment model that makes sense for your business. Vertica Analytics Platform runs on-premises, on industry-standard hardware, and in the cloud.

ZooKeeper Cluster

ZooKeeper Cluster: ZooKeeper gives you a reliable, centralized register of configuration data and services for distributed applications. This solution provides scalable data storage and provisions a configurable number of nodes that form a fault-tolerant ZooKeeper cluster.

Consulting Services 

AI in business - 1-Day Assessment

AI in business: 1-Day Assessment: Discover which AI technologies can bring business value to your company in this one-day assessment. Topics will include cognitive services, machine learning, and bots.

Azure Architecture 1-Day Workshop

Azure Architecture: 1-Day Workshop: igroup will hold an on-site technical workshop with your IT team and a stakeholder and will conduct a deep dive into your business objectives and software needs, helping you gather requirements and prioritize your goals.

Azure for Data Management & IoT Half-Day Briefing

Azure for Data Management & IoT: Half-Day Briefing: In this free half-day briefing, TwoConnect will discuss how to take your business management and IoT project to the next level in a fast, flexible, and affordable manner.

Azure Governance 1 Day Workshop

Azure Governance: 1 Day Workshop: ClearPointe will hold an Azure governance workshop and consultation to evaluate your current policies and procedures and align with the pillars of a strong Azure governance model.

Azure IaaS Jumpstart 1-Wk Proof of Concept

Azure IaaS Jumpstart - Proof of Concept: T4SPartners' Azure Jumpstart is a fixed-scope services offering designed to help you quickly plan and deploy a hybrid infrastructure spanning your datacenters and the cloud.

Azure IoT 8-Wk Initial Deployment

Azure IoT: 8-Wk Initial Deployment: Work with Lixar to implement an IoT initial deployment for remote monitoring that leverages Azure IoT Central and Lixar’s experience in deploying large-scale IoT solutions.

Azure Optimization 5-Day Assessment (USA)

Azure Optimization: 5-Day Assessment (USA): In this free assessment, our Azure experts personally review (no tools) every aspect of your tenant and produce recommendations to improve performance, lower costs, add availability, and strengthen security.

Data Intelligence AI & Machine Learning 4 Wk PoC

Big Data Platform: 8-Wk PoC: Work with Lixar to implement a Big Data proof of concept that leverages Azure Data Lake and Lixar’s methodology for Azure-based data platform solutions.

DevOps Strategy and PoC

Blockchain: 5-Wk PoC: Leveraging Lixar’s approach to implementing blockchain solutions, companies can quickly turn out end-to-end prototypes on Azure using Blockchain-as-a-Service and Azure App Service components.

Cloud Migration - 1 Hour Briefing

Cloud Migration - 1 Hour Briefing: This briefing will provide a high-level view of the Azure platform and how it can transform your datacenter. T4SPartners will demo the solution to show the capabilities that will help potential customers with an end-to-end migration method.

Current State & Solution Design 3-Wk Assessment

Current State & Solution Design: 3-Wk Assessment: Clientek will define a set of minimal marketable features, create a release plan, and outline an architectural approach. At the end of the assessment, you will receive a full project proposal.

Big Data Platform 8-Wk PoC

Data Intelligence+AI & Machine Learning: 4 Wk PoC: Lixar has a proven methodology for developing machine learning models that work with numeric data to provide hindsight analysis and deeper insight into the data, along with foresight and predictions.

Blockchain 5-Wk PoC

DevOps Strategy and PoC: Leveraging Azure DevOps, organizations can focus on building applications while automating processes and maintaining insights into the environment and the health of the application.

Essentials for Hands-On Labs 1-Hr Briefing

Essentials for Hands-On Labs: 1-Hr Briefing: This briefing will include an overview and demo of using preconfigured, extended Microsoft Azure environments and/or virtual machines for hands-on labs.

Intercept Managed Security 2-Hr Implementation

Intercept Managed Security: 2-Hr Implementation: Gain control over the security and compliance of your IT environment by monitoring behavior and taking preventive automated actions. You receive a dashboard that displays the latest security status of the components.

Current State & Solution Design 3-Wk Assessment

Introductions & Technical Deep-Dive: 3-Hr Briefing: Learn how Clientek's agile approach to custom software development will provide your organization with the technical advancements needed to reach the next level.

Lift & Shift to Azure Cloud 3-Wk PoC

Lift & Shift to Azure Cloud: 3-Wk PoC: Lixar is offering a lift-and-shift or a digital transformation, giving you a boost to help you move your web-based application to the cloud in a matter of weeks.

Supply Chain Logistics to Azure - half Day Briefing

Migrate HL7 & HIPAA Apps to Azure 1/2 Day Briefing: In this free briefing, we will discuss how to take your healthcare apps to the next level in a fast, flexible, and affordable manner by leveraging Microsoft Azure and modern technologies.

Sage on Azure - 5-Day Implementation

Sage on Azure: 5-Day Implementation: Move your on-premises Sage accounting application to Microsoft Azure for centralized access anytime, anywhere. This lift-and-shift implementation is for technical and business leaders and is delivered remotely.

SharePoint Add-in development 5-Wk Implementation

SharePoint Add-in development: 5-Wk Implementation: SharePoint Add-ins let you customize your SharePoint sites’ behavior to your specific business needs. Add-ins will extend boundaries and improve your SharePoint experience.

SharePoint Add-in development 5-Wk Implementation

SQL Management Studio Add-in: 6-Wk Implementation: SQL Server Management Studio Add-ins let you safely and effectively customize your SSMS behavior to your specific business needs. Add-ins will extend boundaries and improve your SQL Database management experience.

Azure for Data Management & IoT Half-Day Briefing

Supply Chain Logistics to Azure - 1/2 Day Briefing: In this free briefing, we will discuss how to take your supply chain logistics apps to the next level in a fast, flexible, and affordable manner by leveraging modern Microsoft technologies.

Azure for Data Management & IoT Half-Day Briefing

Use Azure to Connect Everything: Half-Day Briefing: This free half-day briefing will cover how TwoConnect can help you use Azure and related Microsoft technologies to seamlessly connect all your apps to one another.

Search MSRC fix for TFS 2017 Update 3

Issue description: The Service endpoints feature was introduced in TFS 2018. With that feature, the Elasticsearch URL can be configured as an endpoint by any team member (Contributor). As a result, Elasticsearch index data (which serves as the backend for the search feature) can be accessed or modified by server-side tasks running on the TFS server. This would mean... Read More

How can I pause my code in Visual Studio?: Breakpoints FAQ


Have you ever found a bug in your code and wanted to pause code execution to inspect the problem? If you are a developer, there’s a strong chance you have experienced or will experience this issue many, many times. While the short and sweet answer to this problem is to use a breakpoint, the longer answer is that Visual Studio actually provides multiple kinds of breakpoints and methods that let you pause your code depending on the context! Based on the different scenarios you may experience while debugging, here are some of the various ways to pause your code and set or manage a breakpoint in Visual Studio 2017:

While my app is running, how can I pause to inspect a line of code that may contain a bug?

The easiest way to pause or “break” execution to inspect a line of code is to use a breakpoint, a tool that allows you to run your code up to a specified line before stopping. Breakpoints are an essential aspect of debugging, which is the process of detecting and removing errors and bugs from your code.

  1. Select the left margin or press F9 next to the line of code you would like to stop at.
  2. Run your code or hit Continue (F5) and your program will pause prior to execution at the location you marked.

Basic Breakpoint

Where can I manage and keep track of all my breakpoints?

If you have set multiple breakpoints located in different areas or files of your project, it can be hard to find and keep track of them. The Breakpoints Window is a central location where you can view, add, delete, and label your breakpoints. If it’s not already visible, this window can be accessed by navigating to the top toolbar in Visual Studio and selecting Debug –> Windows –> Breakpoints (or CTRL + ALT + B).

Breakpoints Window

How can I stop execution only when my application reaches a specific state?

Conditional Breakpoints are an extended feature of regular breakpoints that allow you to control where and when a breakpoint executes by using conditional logic. If it’s difficult or time-consuming to manually recreate a particular state in your application to inspect a bug, conditional breakpoints are a good way to mitigate that process. Conditional breakpoints are also useful for determining the state in your application where a variable is storing incorrect data. To create a conditional breakpoint:

  1. Set a breakpoint on the desired line.
  2. Hover over the breakpoint and select the Settings gear icon that appears.
  3. Check the Conditions option. Make sure the first dropdown is set to Conditional Statement.
  4. Input valid conditional logic for when you want the break to occur and hit enter to save the breakpoint.

Conditional Breakpoint
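
To picture where a condition helps, consider this illustrative snippet (not from the original post): with a breakpoint on the ProcessOrder call and the condition order.Total > 1000, the debugger pauses only for the orders you actually care about.

using System;
using System.Collections.Generic;

class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

class Program
{
    static void Main()
    {
        var orders = new List<Order>
        {
            new Order { Id = 1, Total = 250m },
            new Order { Id = 2, Total = 1800m },
            new Order { Id = 3, Total = 90m }
        };

        foreach (var order in orders)
        {
            ProcessOrder(order);   // breakpoint here, condition: order.Total > 1000
        }
    }

    static void ProcessOrder(Order order) =>
        Console.WriteLine($"Processing order {order.Id} ({order.Total:C})");
}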

How can I break a loop at a certain iteration when debugging?

You can select the Hit Count option when creating a conditional breakpoint (see above) to specify the loop iteration at which you want to halt your code. Instead of having to manually step through each iteration, you can use hit count to break at the relevant iteration where your code starts misbehaving.

Hit Count

How can I break at the start of a function that I know the name of but not its location in my code?

Though a standard breakpoint can be used here, function breakpoints can also be used to break at the start of a function call. Function breakpoints can be used over other breakpoints when you know the function’s name but not its location in code. If you have multiple overloaded methods or a function contained within several different projects, function breakpoints are a good way to avoid having to manually set a breakpoint at each function call location. To create a function breakpoint:

  1. Select Debug –> New Breakpoint –> Break at Function.
  2. Input the desired function name and hit enter. These breakpoints can also be created and viewed via the Breakpoints Window.

Functional Breakpoint

How can I break only when a specific object’s property or value changes?

If you are debugging in C++, data breakpoints can be used to stop execution when a particular variable stored at a specific memory address changes. Exclusive to C++, these can be set via the Watch Window or the Breakpoints Window. For more info on data breakpoints, check out this blog post.

Data Breakpoints

If you are debugging managed code, a current workaround and equivalent alternative to data breakpoints is to use an Object ID with a conditional breakpoint. To perform this task:

  1. In break mode, right click on the desired object and select Make Object ID, which will give you a handle to that object in memory.
  2. Add a conditional breakpoint to the desired setter where the conditional statement is “this == $[insert handle here].”
  3. Press Continue (F5) and you will now break in the setter when that particular property value changes for the desired instance.
  4. In the Call Stack, double click on the previous frame to view the line of code that is changing the specific object’s property.

ObjectID Setter Break

How can I break when a handled or unhandled exception is thrown?

When exceptions are thrown at runtime, you are typically given a message about them in the console window and/or browser, but you would then have to set your own breakpoints to debug the issue. However, Visual Studio also allows you to break automatically when a specified exception is thrown, regardless of whether it is handled or not.

You can configure which thrown exceptions will break execution in the Exception Settings window.

Exception Break

Can I set a breakpoint in the call stack?

If you are using the call stack to examine your application’s execution flow or view function calls currently on the stack, you may want to use call stack breakpoints to pause execution at the line where a calling function returns.

  1. Open the call stack (Debug –> Windows –> Call Stack, or CTRL + ALT + C)
  2. In the call stack, right-click on the calling function and select Breakpoint –> Insert Breakpoint (F9).

CallStack Breakpoint

How can I pause execution at a specific assembly instruction?

If you are examining the disassembly window to inspect method efficiency or unexplained debugger behavior, or you just want to study how your code works behind the scenes when translated into assembly code, disassembly breakpoints may be useful to you. Disassembly breakpoints can be used to break at a specific line of assembly code, and the disassembly window is accessible only when code execution is already paused. To place a disassembly breakpoint:

  1. Open the disassembly window (Debug –> Windows –> Disassembly, or Ctrl + Alt + D)
  2. Click in the left margin at the line you want to break at (or press F9).

Disassembly Breakpoint

Excited to try out any of these breakpoints? Let us know in the comments!

For more info on Visual Studio 2017 breakpoints, check out the official documentation. For any issues or suggestions, please let us know via Help > Send Feedback > Report a Problem in the IDE.

Leslie Richardson, Program Manager, Visual Studio Debugging & Diagnostics

Leslie is a Program Manager on the Visual Studio Debugging and Diagnostics team, focusing primarily on improving the overall debugging experience and feature set.
