
Normalization


Data transformation is one of the critical steps in data mining. Among the many data transformation methods, normalization is one of the most frequently used. For example, Z-score normalization can be used to reduce possible noise in sound frequency data.

We will introduce three common normalization methods: Max-Min Normalization, Z-Score Normalization, and Scale Multiplication.

  • Max-Min Normalization: x_normal = (x - min(x)) / (max(x) - min(x))
    It rescales all the data to the range 0 to 1.
    Example: Chinese high schools use a 150-point scale, USA high schools use a 100-point scale, and Russian high schools use a 5-point scale. Max-Min normalization maps all three onto the same 0 to 1 range so that the scores can be compared.

[Figure norm1: histograms of the random data and the Max-Min normalized data]
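As a small worked illustration of the Max-Min formula, here is a minimal sketch. The function name `min_max_normalize` and the exam scores are hypothetical, chosen only to mirror the school-scale example; the check for identical values is an added assumption, since the formula is undefined when max(x) equals min(x).

```python
import numpy as np


def min_max_normalize(values):
    """Rescale values to [0, 1] using (x - min) / (max - min)."""
    x = np.asarray(values, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        # All values are identical, so the formula would divide by zero.
        raise ValueError("Max-Min normalization needs at least two distinct values")
    return (x - x.min()) / span


# Hypothetical exam scores on a 150-point scale and on a 5-point scale.
scores_150 = [60, 90, 120, 150]
scores_5 = [2, 3, 4, 5]

norm_150 = min_max_normalize(scores_150)
norm_5 = min_max_normalize(scores_5)
print(norm_150)
print(norm_5)
```

Both score lists end up on the same 0 to 1 range, so a student's relative position can be compared across the two grading scales.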

  • Z-Score Normalization: x_znormal = (x - mean) / sd
    It expresses each value in units of standard deviations from the mean. Example: it is useful when comparing data sets recorded in different units (cm and inches).

[Figure norm2: histograms of the random data and the Z-score normalized data]
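The cm versus inch example can be checked with a short sketch. The function name `z_score_normalize` and the height values are hypothetical. The sketch assumes the population standard deviation (NumPy's default, `ddof=0`), which is also what `np.std` computes in the source code of this post.

```python
import numpy as np


def z_score_normalize(values):
    """Return (x - mean) / sd, using the population standard deviation."""
    x = np.asarray(values, dtype=float)
    return (x - x.mean()) / x.std()


# Hypothetical heights, recorded once in centimetres and once in inches.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
heights_in = heights_cm / 2.54

z_cm = z_score_normalize(heights_cm)
z_in = z_score_normalize(heights_in)
print(z_cm)
print(z_in)
```

The two outputs agree up to floating-point rounding: after Z-score normalization the unit of measurement no longer matters, and the result has mean 0 and standard deviation 1.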

  • Scale Multiplication
    x_scaled = x * 10  or  x_scaled = x / 10
    It rescales the data by a multiple of 10.
    Example: when money transaction amounts are too large to read comfortably, we divide them by 1000 to make them viewer friendly.

[Figure norm3: histograms of the random data and the scale normalized data]
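The transaction example amounts to a single division. A minimal sketch, with made-up transaction amounts:

```python
import numpy as np

# Hypothetical transaction amounts.
amounts = np.array([125_000, 2_500_000, 48_000])

# Divide by 1000 so the values read as "thousands".
amounts_in_thousands = amounts / 1000
print(amounts_in_thousands)
```

Unlike the two methods above, this changes only the unit of the values; the shape of the distribution and the ratios between values stay the same.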

Happy studying!

Source Code

import random

import matplotlib.pyplot as plt
import numpy as np

# --- Max-Min normalization ---
y = random.sample(range(0, 150), 50)
x = list(map(int, y))
x1 = np.array(x)
xmin = min(x)
xmax = max(x)
mmnorm = (x1 - xmin) / (xmax - xmin)

# Plot the original data next to the normalized data
fig, axs = plt.subplots(1, 2, sharey=True)
axs[0].hist(x, bins=10)
axs[0].set_title("Random Data")
axs[1].hist(mmnorm, bins=10, color="lightblue")
axs[1].set_title("Max-Min Normalized Data")
plt.show()

# --- Z-score normalization ---
y2 = random.sample(range(0, 150), 50)
x2 = list(map(int, y2))  # the original read y3 here, which is not defined yet
x21 = np.array(x2)
mean = np.mean(x21)
sd = np.std(x21)  # population standard deviation (ddof=0)
znorm = (x21 - mean) / sd

fig, axs = plt.subplots(1, 2, sharey=True)
axs[0].hist(x2, bins=10, color="green")
axs[0].set_title("Random Data")
axs[1].hist(znorm, bins=10, color="lightgreen")
axs[1].set_title("Z-score Normalized Data")
plt.show()

# --- Scale multiplication ---
y3 = random.sample(range(1000, 10000), 50)
x3 = list(map(int, y3))
x31 = np.array(x3)
snorm = x31 / 1000

fig, axs = plt.subplots(1, 2, sharey=True)
axs[0].hist(x3, bins=10, color="orange")
axs[0].set_title("Random Data")
axs[1].hist(snorm, bins=10, color="yellow")
axs[1].set_title("Scale Normalized Data")
plt.show()



Author: 乌然娅措
Link: https://www.jianshu.com/p/5b0f1638c460

