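ML.NET uses the LearningPipeline class to define the steps needed to carry out a machine learning task, which makes the machine learning workflow intuitive.
The sample code from the iris petal prediction quick start below shows how the pipeline works.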
```csharp
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;
using System;

namespace myApp
{
    class Program
    {
        // STEP 1: Define your data structures
        // IrisData is used to provide training data, and as
        // input for prediction operations
        // - First 4 properties are inputs/features used to predict the label
        // - Label is what you are predicting, and is only set when training
        public class IrisData
        {
            [Column("0")]
            public float SepalLength;

            [Column("1")]
            public float SepalWidth;

            [Column("2")]
            public float PetalLength;

            [Column("3")]
            public float PetalWidth;

            [Column("4")]
            [ColumnName("Label")]
            public string Label;
        }

        // IrisPrediction is the result returned from prediction operations
        public class IrisPrediction
        {
            [ColumnName("PredictedLabel")]
            public string PredictedLabels;
        }

        static void Main(string[] args)
        {
            // STEP 2: Create a pipeline and load your data
            var pipeline = new LearningPipeline();

            // If working in Visual Studio, make sure the 'Copy to Output Directory'
            // property of iris-data.txt is set to 'Copy always'
            string dataPath = "iris-data.txt";
            pipeline.Add(new TextLoader(dataPath).CreateFrom<IrisData>(separator: ','));

            // STEP 3: Transform your data
            // Assign numeric values to text in the "Label" column, because only
            // numbers can be processed during model training
            pipeline.Add(new Dictionarizer("Label"));

            // Puts all features into a vector
            pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"));

            // STEP 4: Add learner
            // Add a learning algorithm to the pipeline.
            // This is a classification scenario (What type of iris is this?)
            pipeline.Add(new StochasticDualCoordinateAscentClassifier());

            // Convert the Label back into original text (after converting to number in step 3)
            pipeline.Add(new PredictedLabelColumnOriginalValueConverter() { PredictedLabelColumn = "PredictedLabel" });

            // STEP 5: Train your model based on the data set
            var model = pipeline.Train<IrisData, IrisPrediction>();

            // STEP 6: Use your model to make a prediction
            // You can change these numbers to test different predictions
            var prediction = model.Predict(new IrisData()
            {
                SepalLength = 3.3f,
                SepalWidth = 1.6f,
                PetalLength = 0.2f,
                PetalWidth = 5.1f,
            });

            Console.WriteLine($"Predicted flower type is: {prediction.PredictedLabels}");
        }
    }
}
```
Creating a workflow instance
First, create a LearningPipeline instance:
```csharp
var pipeline = new LearningPipeline();
```
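Adding steps
Next, call the Add method of the LearningPipeline instance to add steps to the pipeline. Every step implements the ILearningPipelineItem interface.
A basic workflow consists of the following steps, of which data preprocessing and label conversion are optional.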

- Load the dataset
Implements the ILearningPipelineLoader interface.
A workflow must contain at least one dataset loading step.
```csharp
// Load the data with TextLoader
string dataPath = "iris-data.txt";
pipeline.Add(new TextLoader(dataPath).CreateFrom<IrisData>(separator: ','));
```
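- Preprocess the data
Implements the CommonInputs.ITransformInput interface.
A workflow can contain zero or more data preprocessing steps, which normalize the loaded dataset. The sample code contains two of them.
```csharp
// Label is text, which the algorithm cannot process directly,
// so it has to be converted into dictionary (numeric) values
pipeline.Add(new Dictionarizer("Label"));

// The algorithm reads its input only from the Features column,
// so the individual data columns are concatenated into Features
pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"));
```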
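- Choose a learning algorithm
Implements the CommonInputs.ITrainerInput interface.
A workflow must contain exactly one learning algorithm.
```csharp
// Use a linear classifier
pipeline.Add(new StochasticDualCoordinateAscentClassifier());
```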
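- Convert the label
Implements the CommonInputs.ITransformInput interface.
A workflow can contain zero or more label conversion steps, which turn the predicted label into a form that is easy to read.
```csharp
// Convert the predicted label from its dictionary value back into text
pipeline.Add(new PredictedLabelColumnOriginalValueConverter() { PredictedLabelColumn = "PredictedLabel" });
```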
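To make the preprocessing and label conversion steps more concrete, here is a minimal, dependency-free sketch of what they do to the data conceptually. The class `TransformSketch` and its methods are hypothetical names invented for this illustration; they are not part of ML.NET and do not reflect how ML.NET implements `Dictionarizer`, `ColumnConcatenator`, or `PredictedLabelColumnOriginalValueConverter`. The label strings are only sample values, and starting the keys at 1 is an assumption of the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual illustration only: hypothetical helpers, not ML.NET code.
public static class TransformSketch
{
    // Similar in spirit to Dictionarizer: give each distinct text label
    // a numeric key, in order of first appearance.
    public static Dictionary<string, uint> BuildLabelDictionary(IEnumerable<string> labels)
    {
        var keys = new Dictionary<string, uint>();
        foreach (var label in labels)
        {
            if (!keys.ContainsKey(label))
            {
                keys[label] = (uint)(keys.Count + 1); // keys start at 1 in this sketch
            }
        }
        return keys;
    }

    // Similar in spirit to ColumnConcatenator: pack separate numeric
    // columns into a single feature vector.
    public static float[] Concatenate(params float[] columns)
    {
        return columns.ToArray();
    }

    // Similar in spirit to the label conversion step: map a numeric key
    // back to the original text label.
    public static string ToOriginalLabel(Dictionary<string, uint> keys, uint key)
    {
        return keys.First(pair => pair.Value == key).Key;
    }

    public static void Main()
    {
        var keys = BuildLabelDictionary(
            new[] { "Iris-setosa", "Iris-versicolor", "Iris-setosa", "Iris-virginica" });
        float[] features = Concatenate(5.1f, 3.5f, 1.4f, 0.2f);

        Console.WriteLine(keys["Iris-versicolor"]);   // prints 2
        Console.WriteLine(features.Length);           // prints 4
        Console.WriteLine(ToOriginalLabel(keys, 3));  // prints Iris-virginica
    }
}
```

The learner only ever sees the numeric key and the feature vector, which is why the pipeline needs a final step to turn the predicted key back into text.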
Running the workflow
Finally, call the Train method of the LearningPipeline instance to run the workflow and obtain the prediction model.
```csharp
var model = pipeline.Train<IrisData, IrisPrediction>();
```
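The overall pattern described in this article (Add collects steps, Train runs them) can be sketched with a toy pipeline. Everything here is hypothetical: `ToyPipeline`, `Step`, and `StepKind` are stand-ins invented for the illustration, not ML.NET types. The two checks in `Train` simply encode the composition rules stated above (at least one loader, exactly one learning algorithm); whether the real LearningPipeline enforces them in this way is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for illustration; not ML.NET types.
public enum StepKind { Loader, Transform, Trainer }

public sealed class Step
{
    public Step(StepKind kind, string name)
    {
        Kind = kind;
        Name = name;
    }

    public StepKind Kind { get; }
    public string Name { get; }
}

public sealed class ToyPipeline
{
    private readonly List<Step> _steps = new List<Step>();

    // Names of the steps that have actually been executed.
    public List<string> Log { get; } = new List<string>();

    // Add only records the step; no work happens here.
    public void Add(Step step)
    {
        _steps.Add(step);
    }

    // Train checks the composition rules from the text, then runs
    // the recorded steps in the order they were added.
    public void Train()
    {
        if (!_steps.Any(s => s.Kind == StepKind.Loader))
        {
            throw new InvalidOperationException("At least one loader step is required.");
        }
        if (_steps.Count(s => s.Kind == StepKind.Trainer) != 1)
        {
            throw new InvalidOperationException("Exactly one trainer step is required.");
        }
        foreach (var step in _steps)
        {
            Log.Add(step.Name); // a real pipeline would do the work here
        }
    }
}

public static class ToyPipelineDemo
{
    public static void Main()
    {
        var pipeline = new ToyPipeline();
        pipeline.Add(new Step(StepKind.Loader, "TextLoader"));
        pipeline.Add(new Step(StepKind.Transform, "Dictionarizer"));
        pipeline.Add(new Step(StepKind.Transform, "ColumnConcatenator"));
        pipeline.Add(new Step(StepKind.Trainer, "StochasticDualCoordinateAscentClassifier"));

        Console.WriteLine(pipeline.Log.Count); // prints 0, nothing has run yet
        pipeline.Train();
        Console.WriteLine(pipeline.Log.Count); // prints 4
    }
}
```

Separating step registration from execution is what lets the whole workflow be declared up front and read top to bottom before any data is touched.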
 