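I wrote a recursive function that extends pandas.DataFrame.describe. It adds kurtosis and skew as rows. It also builds a second describe table from the transpose of the first, so you get summary statistics of the summary statistics.
It works fine, except that I don't like writing functions with multiple exits. I tried writing it with a single return statement (see the commented-out section), but then it produces two transposed tables in one result. Correct, but too much.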
import pandas as pd


def get_better_desc(df, recursions: int = 1):
    '''Adds kurtosis and skew to pandas.DataFrame.describe output. Also creates
    a second, transposed version of this table, called on itself, for summary
    stats of summary stats.
    Parameters:
        df: pandas.DataFrame, or Series but its super_desc isn't so meaningful.
        recursions: integer number of times to apply recursively to create
            super_desc. Default value of 1 is all that is necessary.
    Returns:
        better_desc: pandas.DataFrame (or Series) with kurtosis and skew added.
        super_desc: pandas.DataFrame (or Series) of better_desc transposed and
            made into a better_desc itself.'''
    kurt = df.kurtosis()
    kurt.name = 'kurt'
    skew = df.skew()
    skew.name = 'skew'
    # DataFrame.append was removed in pandas 2.0; pd.concat adds the same rows.
    better_desc = pd.concat([df.describe(), kurt.to_frame().T, skew.to_frame().T])
    if recursions > 0:
        super_desc = get_better_desc(better_desc.transpose(),
                                     recursions=(recursions - 1))
        return better_desc, super_desc
    else:
        return better_desc
    # if recursions > 0:
    #     super_desc = get_better_desc(better_desc.transpose(),
    #                                  recursions=(recursions - 1))
    # else:
    #     super_desc = better_desc
    # return better_desc, super_desc
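
For comparison, here is a minimal sketch of one way a single-exit version might look. It swaps the recursion for a loop, which is a different technique from the function above. In the commented-out attempt, the inner call itself returns a pair, so super_desc ends up holding the transposed table twice. The names better_describe and get_better_desc_chain are hypothetical, and the sketch assumes a DataFrame with numeric columns only.

```python
import pandas as pd


def better_describe(df: pd.DataFrame) -> pd.DataFrame:
    """describe() output with 'kurt' and 'skew' rows added (hypothetical helper)."""
    # Build the two extra rows as a small frame whose columns match df's columns.
    extra = pd.DataFrame({'kurt': df.kurtosis(), 'skew': df.skew()}).T
    return pd.concat([df.describe(), extra])


def get_better_desc_chain(df: pd.DataFrame, recursions: int = 1) -> list:
    """Return a flat list [better_desc, super_desc, ...] through one return."""
    tables = [better_describe(df)]
    for _ in range(recursions):
        # Each further table describes the transpose of the previous one.
        tables.append(better_describe(tables[-1].transpose()))
    return tables
```

With the default recursions=1 the result unpacks as `better_desc, super_desc = get_better_desc_chain(df)`. The returned list always has recursions + 1 entries, so the caller gets the same flat shape at every depth.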