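I'm trying to apply a form of normalization to my data: I want to subtract each row's median from every value in that row of the DataFrame. Here is what I have so far: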
import pandas as pd

# Generate sample data
data = {
    "sample_name": ["s1", "s2", "s3", "s4", "s5", "s6"],
    "group_name": ["g1", "g1", "g1", "g2", "g2", "g2"],
    "col1": [1, 22, 3, 45, 31, 53],
    "col2": [30, 21, 10, 42, 56, 20],
    "col3": [78, 25, 33, 87, 20, 19],
    "col4": [11, 23, 14, 98, 55, 66],
    "col5": [19, 29, 39, 49, 59, 69],
}
df = pd.DataFrame(data)

# Calculate the median of each row. numeric_only=True skips the string columns,
# which recent pandas versions would otherwise reject with a TypeError.
median_ls = list(df.median(axis=1, numeric_only=True))
# [19.0, 23.0, 14.0, 49.0, 55.0, 53.0]
The expected result is:
-18,11,59,-8,0
-1,-2,2,0,6
-11,-4,19,0,25
-4,-7,38,49,0
-24,1,-35,0,4
0,-33,-34,13,16
I've looked at df.apply(<function>, axis=1), but I can't work out the syntax for applying a row-specific function across the rows.
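One possible way to get the expected result, sketched here rather than offered as the definitive answer: compute the row medians on the numeric columns only, then subtract them with `DataFrame.sub(..., axis=0)` so that each median is aligned with its own row. The names `value_cols`, `row_medians`, `normalized`, and `via_apply` are illustrative and not part of the original post. The `df.apply(..., axis=1)` form mentioned in the question is shown as well for comparison.

```python
import pandas as pd

# Sample data from the question
data = {
    "sample_name": ["s1", "s2", "s3", "s4", "s5", "s6"],
    "group_name": ["g1", "g1", "g1", "g2", "g2", "g2"],
    "col1": [1, 22, 3, 45, 31, 53],
    "col2": [30, 21, 10, 42, 56, 20],
    "col3": [78, 25, 33, 87, 20, 19],
    "col4": [11, 23, 14, 98, 55, 66],
    "col5": [19, 29, 39, 49, 59, 69],
}
df = pd.DataFrame(data)

# Assumption: only these columns hold the values to normalize
value_cols = ["col1", "col2", "col3", "col4", "col5"]

# One median per row, indexed like df
row_medians = df[value_cols].median(axis=1)

# axis=0 aligns the medians with the row index, so each row
# has its own median subtracted from every value
normalized = df[value_cols].sub(row_medians, axis=0)

# Equivalent row-by-row version using apply (usually slower on large frames)
via_apply = df[value_cols].apply(lambda row: row - row.median(), axis=1)

print(normalized)
```

The results come back as floats because the medians are floats. If the label columns are needed alongside the normalized values, something like `pd.concat([df[["sample_name", "group_name"]], normalized], axis=1)` should put them back together.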