
Learning ML.NET (1): Building a Machine Learning Pipeline with LearningPipeline

胡说叔叔

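ML.NET uses the LearningPipeline class to define the steps needed to carry out a machine learning task, which makes the machine learning workflow easy to follow.

The sample code from the iris petal prediction quick start is used below to explain how the pipeline works.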



using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;
using System;

namespace myApp
{
    class Program
    {
        // STEP 1: Define your data structures

        // IrisData is used to provide training data, and as
        // input for prediction operations
        // - First 4 properties are inputs/features used to predict the label
        // - Label is what you are predicting, and is only set when training
        public class IrisData
        {
            [Column("0")]
            public float SepalLength;

            [Column("1")]
            public float SepalWidth;

            [Column("2")]
            public float PetalLength;

            [Column("3")]
            public float PetalWidth;

            [Column("4")]
            [ColumnName("Label")]
            public string Label;
        }

        // IrisPrediction is the result returned from prediction operations
        public class IrisPrediction
        {
            [ColumnName("PredictedLabel")]
            public string PredictedLabels;
        }

        static void Main(string[] args)
        {
            // STEP 2: Create a pipeline and load your data
            var pipeline = new LearningPipeline();

            // If working in Visual Studio, make sure the 'Copy to Output Directory'
            // property of iris-data.txt is set to 'Copy always'
            string dataPath = "iris-data.txt";
            pipeline.Add(new TextLoader(dataPath).CreateFrom<IrisData>(separator: ','));

            // STEP 3: Transform your data
            // Assign numeric values to text in the "Label" column, because only
            // numbers can be processed during model training
            pipeline.Add(new Dictionarizer("Label"));

            // Puts all features into a vector
            pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"));

            // STEP 4: Add learner
            // Add a learning algorithm to the pipeline.
            // This is a classification scenario (What type of iris is this?)
            pipeline.Add(new StochasticDualCoordinateAscentClassifier());

            // Convert the Label back into original text (after converting to number in step 3)
            pipeline.Add(new PredictedLabelColumnOriginalValueConverter() { PredictedLabelColumn = "PredictedLabel" });

            // STEP 5: Train your model based on the data set
            var model = pipeline.Train<IrisData, IrisPrediction>();

            // STEP 6: Use your model to make a prediction
            // You can change these numbers to test different predictions
            var prediction = model.Predict(new IrisData()
            {
                SepalLength = 3.3f,
                SepalWidth = 1.6f,
                PetalLength = 0.2f,
                PetalWidth = 5.1f,
            });

            Console.WriteLine($"Predicted flower type is: {prediction.PredictedLabels}");
        }
    }
}

 

Creating the pipeline instance

First, create a LearningPipeline instance:

var pipeline = new LearningPipeline();

Adding steps

Next, call the Add method of the LearningPipeline instance to add steps to the pipeline. Every step implements the ILearningPipelineItem interface.

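A basic pipeline consists of the following steps. Data preprocessing and label conversion, which were highlighted in blue in the original post, are optional.

  • Load the dataset

Implements the ILearningPipelineLoader interface.

A pipeline must contain at least one dataset-loading step.

// Load the data with TextLoader
string dataPath = "iris-data.txt";
pipeline.Add(new TextLoader(dataPath).CreateFrom<IrisData>(separator: ','));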
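
To picture what the Column("0") through Column("4") attributes on IrisData refer to, the sketch below parses one comma-separated line by column index using only the .NET base library. It is not how TextLoader is implemented. IrisRow, IrisLineParser and the sample line are hypothetical names and values introduced for illustration, and the sample assumes iris-data.txt holds four numbers followed by a text label on each line.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in for IrisData, without ML.NET attributes.
public class IrisRow
{
    public float SepalLength;  // column 0
    public float SepalWidth;   // column 1
    public float PetalLength;  // column 2
    public float PetalWidth;   // column 3
    public string Label;       // column 4
}

public static class IrisLineParser
{
    // Splits one line on ',' and assigns each field by its column index.
    public static IrisRow Parse(string line)
    {
        string[] parts = line.Split(',');
        if (parts.Length != 5)
        {
            throw new FormatException($"Expected 5 columns but found {parts.Length}.");
        }

        return new IrisRow
        {
            SepalLength = float.Parse(parts[0], CultureInfo.InvariantCulture),
            SepalWidth = float.Parse(parts[1], CultureInfo.InvariantCulture),
            PetalLength = float.Parse(parts[2], CultureInfo.InvariantCulture),
            PetalWidth = float.Parse(parts[3], CultureInfo.InvariantCulture),
            Label = parts[4].Trim()
        };
    }
}

public static class IrisLineParserDemo
{
    public static void Main()
    {
        // Illustrative line only; the real file contents are not shown in the article.
        IrisRow row = IrisLineParser.Parse("5.1,3.5,1.4,0.2,Iris-setosa");
        Console.WriteLine($"Label column: {row.Label}");
    }
}
```

Parsing with CultureInfo.InvariantCulture keeps the decimal point from being misread on machines whose locale uses a decimal comma.
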
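  • Data preprocessing

Implements the CommonInputs.ITransformInput interface.

A pipeline can contain zero or more data preprocessing steps, which put the loaded dataset into a standard form. The sample code contains two of them.

// Label holds text, which the algorithm cannot process, so convert it to dictionary keys
pipeline.Add(new Dictionarizer("Label"));

// The algorithm reads its input only from the Features column, so concatenate the data columns into Features
pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"));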
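
As a rough mental model of these two transforms, the sketch below maps text labels to integer keys and packs four numbers into one vector, using only the .NET base library. It is a conceptual illustration, not ML.NET's implementation. PreprocessingSketch and its methods are hypothetical names, the label strings are illustrative, and numbering keys from zero in order of first appearance is an assumption; the article does not say which key values Dictionarizer assigns.

```csharp
using System;
using System.Collections.Generic;

public static class PreprocessingSketch
{
    // Gives each distinct label an integer key in order of first appearance.
    // Zero-based numbering is an assumption made for this sketch.
    public static Dictionary<string, int> BuildLabelKeys(IEnumerable<string> labels)
    {
        var keys = new Dictionary<string, int>();
        foreach (string label in labels)
        {
            if (!keys.ContainsKey(label))
            {
                keys[label] = keys.Count;
            }
        }
        return keys;
    }

    // Packs separate numeric columns into a single feature vector, keeping their order.
    public static float[] ConcatFeatures(params float[] columns)
    {
        return (float[])columns.Clone();
    }
}

public static class PreprocessingDemo
{
    public static void Main()
    {
        // Illustrative labels; a repeated label reuses its existing key.
        string[] labels = { "Iris-setosa", "Iris-versicolor", "Iris-setosa", "Iris-virginica" };
        Dictionary<string, int> keys = PreprocessingSketch.BuildLabelKeys(labels);
        foreach (KeyValuePair<string, int> pair in keys)
        {
            Console.WriteLine($"{pair.Key} has key {pair.Value}");
        }

        float[] features = PreprocessingSketch.ConcatFeatures(5.1f, 3.5f, 1.4f, 0.2f);
        Console.WriteLine($"Features vector length: {features.Length}");
    }
}
```
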
  • Choose a learning algorithm

Implements the CommonInputs.ITrainerInput interface.

A pipeline must contain exactly one learning algorithm.

// Use a linear classifier
pipeline.Add(new StochasticDualCoordinateAscentClassifier());
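  • Label conversion

Implements the CommonInputs.ITransformInput interface.

A pipeline can contain zero or more label conversion steps, which turn the predicted label into a form that is easy to read.

// Convert the predicted label from its dictionary key back to text
pipeline.Add(new PredictedLabelColumnOriginalValueConverter() { PredictedLabelColumn = "PredictedLabel" });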
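
Label conversion can be thought of as the reverse lookup of the dictionary built during preprocessing. The sketch below shows that idea with plain .NET collections. It is not the implementation of PredictedLabelColumnOriginalValueConverter. LabelDecoderSketch, its methods and the key values are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LabelDecoderSketch
{
    // Inverts a label-to-key dictionary so that a predicted key can be shown as text.
    public static Dictionary<int, string> Invert(Dictionary<string, int> labelKeys)
    {
        var inverse = new Dictionary<int, string>();
        foreach (KeyValuePair<string, int> pair in labelKeys)
        {
            inverse[pair.Value] = pair.Key;
        }
        return inverse;
    }

    // Looks up the text for a predicted key, or returns a placeholder if the key is unknown.
    public static string Decode(Dictionary<int, string> inverse, int predictedKey)
    {
        return inverse.TryGetValue(predictedKey, out string label) ? label : "(unknown key)";
    }
}

public static class LabelDecoderDemo
{
    public static void Main()
    {
        // Assumed key assignment, for illustration only.
        var labelKeys = new Dictionary<string, int>
        {
            { "Iris-setosa", 0 },
            { "Iris-versicolor", 1 },
            { "Iris-virginica", 2 }
        };

        Dictionary<int, string> inverse = LabelDecoderSketch.Invert(labelKeys);
        Console.WriteLine(LabelDecoderSketch.Decode(inverse, 2));
    }
}
```
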

Running the pipeline

Finally, call the Train method of the LearningPipeline instance to run the pipeline and obtain the prediction model.

var model = pipeline.Train<IrisData, IrisPrediction>();
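
The composition rules from the step list (at least one loader, exactly one learning algorithm, any number of transforms) can be pictured as a check performed on the step list before training. The sketch below is a hypothetical illustration in plain C#. StepKind and PipelineRules are invented names, and the article does not say whether Train performs such a check itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical classification of steps; ML.NET uses interfaces rather than an enum.
public enum StepKind
{
    Loader,
    Transform,
    Learner
}

public static class PipelineRules
{
    // Returns null when the step list satisfies the rules, otherwise a message naming the problem.
    public static string Check(IReadOnlyList<StepKind> steps)
    {
        int loaders = steps.Count(s => s == StepKind.Loader);
        int learners = steps.Count(s => s == StepKind.Learner);

        if (loaders < 1)
        {
            return "At least one loader step is required.";
        }
        if (learners != 1)
        {
            return $"Exactly one learner is required, found {learners}.";
        }
        return null; // Any number of transforms, including zero, is allowed.
    }
}

public static class PipelineRulesDemo
{
    public static void Main()
    {
        // Mirrors the iris sample: loader, two transforms, learner, label conversion.
        var steps = new List<StepKind>
        {
            StepKind.Loader,
            StepKind.Transform,
            StepKind.Transform,
            StepKind.Learner,
            StepKind.Transform
        };

        string problem = PipelineRules.Check(steps);
        Console.WriteLine(problem ?? "Step list satisfies the rules.");
    }
}
```
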

Original source: https://www.cnblogs.com/feiyun0112/p/ML-NET-1.html
