
SAS Iris dataset Plot

SAS Day 34:

Background Story:

Once, in my machine learning class, the professor asked what software we use for data science. One student answered: “SAS”.
Then the professor laughed and said: “Oh dear, you must be in the wrong class, nobody uses SAS in the data science industry”.

SAS originally stood for Statistical Analysis System, and it is most widely used in health-related fields. Although Python and R are known to be the most popular data science languages, I think SAS has its strengths as well (better than Excel!!).

At least it came with the Iris Dataset!

[Image: Fotomanie / Pixabay]

Today we will use the Iris dataset to draw scatter plots:

Scatter Plot Matrix

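/* Set the size of the ODS graphics output */
ods graphics on / height=500px width=500px;

/* Scatter plot matrix for the Virginica species only */
proc sgscatter data=sashelp.iris(where=(species="Virginica"));
  title "Fisher Iris Data";
  matrix petallength petalwidth sepallength /
    ellipse=(type=predicted)             /* prediction ellipse in each cell */
    diagonal=(histogram normal kernel);  /* histogram with normal and kernel density curves */
run;

/* Restore the default ODS graphics settings */
ods graphics on / reset=all;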

Panel of scatter plots

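/* Set the size of the ODS graphics output */
ods graphics on / height=500px width=500px;

/* Four scatter plots in one panel, with points colored by species */
proc sgscatter data=sashelp.iris;
  title "Fisher Iris Data";
  plot petallength*petalwidth
       sepallength*sepalwidth
       petallength*sepallength
       petalwidth*sepalwidth
       / group=species;
run;

/* Restore the default ODS graphics settings */
ods graphics on / reset=all;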

As we can observe from the plots above, Setosa differs more from Versicolor and Virginica than those two differ from each other, which is consistent with our Iris Dataset Cluster Analysis with Python.
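To put numbers behind what the plots suggest, one option is to summarize the four measurements by species. This is only a minimal sketch, not part of the original analysis: the choice of PROC MEANS and of the statistics shown is my own assumption about what makes a useful check.

```sas
/* Descriptive statistics for each measurement, split by species */
proc means data=sashelp.iris mean std min max maxdec=1;
  class species;                                      /* one group of rows per species */
  var petallength petalwidth sepallength sepalwidth;  /* the four measurements plotted above */
run;
```

If the petal measurements for Setosa fall in a range that barely overlaps with the other two species, that would agree with the separation seen in the scatter plots.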

Personal Thought:

I used to feel a bit ashamed that I use SAS more often than Python or R, because those languages sound a lot cooler. Now, I think SAS deserves my appreciation as well, like the lyrics “Wild Lily also has Spring(野百合也有春天)”! SAS is wonderful software, and it comes with the Iris dataset!
