Given the following pandas DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Id': [102,102,102,303,303,944,944,944,944], 'A': [1.2,1.2,1.2,0.8,0.8,2.0,2.0,2.0,2.0], 'B': [1.8,1.8,1.8,1.0,1.0,2.2,2.2,2.2,2.2],
                   'A_scored_time': [10,25,0,33,0,40,0,90,0], 'B_scored_time': [0,0,30,0,41,0,75,0,95]})
I am trying to build lists from the ['A_scored_time', 'B_scored_time'] columns, leaving out the zeros, so that each unique Id gets the following lists:
Id(102) = A_Time = [10,25], B_Time = [30]
Id(303) = A_Time = [33], B_Time = [41]
Id(944) = A_Time = [40,90], B_Time = [75,95]
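One possible way to build these per-Id lists is to drop the zero entries and then group by Id. This is only a sketch: it assumes a 0 in either time column means "no score at that row", and the helper name `scored_times` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Id': [102, 102, 102, 303, 303, 944, 944, 944, 944],
    'A': [1.2, 1.2, 1.2, 0.8, 0.8, 2.0, 2.0, 2.0, 2.0],
    'B': [1.8, 1.8, 1.8, 1.0, 1.0, 2.2, 2.2, 2.2, 2.2],
    'A_scored_time': [10, 25, 0, 33, 0, 40, 0, 90, 0],
    'B_scored_time': [0, 0, 30, 0, 41, 0, 75, 0, 95],
})

def scored_times(frame, column):
    """Hypothetical helper: per-Id list of the non-zero values in `column`."""
    # Assumption: 0 marks "no score", so those rows are filtered out first.
    nonzero = frame[frame[column] != 0]
    return nonzero.groupby('Id')[column].apply(list)

a_times = scored_times(df, 'A_scored_time')
b_times = scored_times(df, 'B_scored_time')

# Side-by-side view, indexed by Id.
times = pd.DataFrame({'A_Time': a_times, 'B_Time': b_times})
print(times)
```

An Id with no non-zero entries in one of the columns would be missing from that Series and show up as NaN in `times`, so that case may need separate handling.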
These lists will be used in the function below.
x1 = [1,0,0]
x2 = [0,1,0]
x3 = [0,0,1]
k = 100 # constant
total_timeslot = 100 # same as k
A_Time = []
B_Time = []
Loop over the distinct Ids (df has 3 distinct Ids here). For each Id i, with A and B taken from that Id's rows, the probability array y is:
y = np.array([1-(A + B)/k, A/k, B/k])
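As a rough illustration, the three components of y could be tabulated for every Id at once. The sketch assumes A and B are constant within an Id (as they are in the sample data), and the column names `p_none`, `p_A`, `p_B` are invented here.

```python
import pandas as pd

df = pd.DataFrame({
    'Id': [102, 102, 102, 303, 303, 944, 944, 944, 944],
    'A': [1.2, 1.2, 1.2, 0.8, 0.8, 2.0, 2.0, 2.0, 2.0],
    'B': [1.8, 1.8, 1.8, 1.0, 1.0, 2.2, 2.2, 2.2, 2.2],
})
k = 100  # constant from the question

# Assumption: A and B do not vary within an Id, so the first row is enough.
rates = df.groupby('Id')[['A', 'B']].first()

# One row per Id; the columns are the three entries of y.
y_by_id = pd.DataFrame({
    'p_none': 1 - (rates['A'] + rates['B']) / k,
    'p_A': rates['A'] / k,
    'p_B': rates['B'] / k,
})
print(y_by_id)
```

Each row sums to 1, which is a quick sanity check that y is a valid probability array.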
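def sum_squared_diff(x1, x2, x3, y):
    # Relies on the module-level A_Time, B_Time and total_timeslot.
    ssd = []
    for t in range(total_timeslot):  # t, not k, so the constant k is not shadowed
        if t in A_Time:
            ssd.append(sum((x2 - y) ** 2))  # A scored in this slot
        elif t in B_Time:
            ssd.append(sum((x3 - y) ** 2))  # B scored in this slot
        else:
            ssd.append(sum((x1 - y) ** 2))  # nobody scored
    return ssd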
The output is an array of length k. Once I have it for every Id, I will sum over all n arrays (n distinct Ids). This is what I am after.
Expected result for df:
Id(102) = sum(sum_squared_diff(x1, x2, x3, y)) = 5.872800000000018
Id(303) = sum(sum_squared_diff(x1, x2, x3, y)) = 3.9407999999999896
Id(944) = sum(sum_squared_diff(x1, x2, x3, y)) = 7.760800000000006
giving a total sum = 17.574400000000015.
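A sketch of how the whole computation might be wired together, one group per Id. It is not the only way to do it. It assumes 0 means "no score" and that A and B are constant within an Id, and it passes the time lists as arguments (`a_time`, `b_time`, names chosen here) instead of relying on globals.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Id': [102, 102, 102, 303, 303, 944, 944, 944, 944],
    'A': [1.2, 1.2, 1.2, 0.8, 0.8, 2.0, 2.0, 2.0, 2.0],
    'B': [1.8, 1.8, 1.8, 1.0, 1.0, 2.2, 2.2, 2.2, 2.2],
    'A_scored_time': [10, 25, 0, 33, 0, 40, 0, 90, 0],
    'B_scored_time': [0, 0, 30, 0, 41, 0, 75, 0, 95],
})

k = 100                     # constant, also the number of time slots
x1, x2, x3 = np.eye(3)      # rows [1,0,0], [0,1,0], [0,0,1]

def sum_squared_diff(y, a_time, b_time, total_timeslot=k):
    """Squared distance between y and the outcome vector for each slot."""
    ssd = []
    for t in range(total_timeslot):
        if t in a_time:
            target = x2     # A scored in slot t
        elif t in b_time:
            target = x3     # B scored in slot t
        else:
            target = x1     # nobody scored
        ssd.append(np.sum((target - y) ** 2))
    return np.array(ssd)

results = {}
for id_, g in df.groupby('Id'):
    # Assumption: A and B are the same on every row of an Id.
    A, B = g['A'].iloc[0], g['B'].iloc[0]
    y = np.array([1 - (A + B) / k, A / k, B / k])
    # Assumption: 0 means "no score", so zeros are dropped.
    a_time = set(g.loc[g['A_scored_time'] != 0, 'A_scored_time'].tolist())
    b_time = set(g.loc[g['B_scored_time'] != 0, 'B_scored_time'].tolist())
    results[int(id_)] = sum_squared_diff(y, a_time, b_time).sum()

total = sum(results.values())
print(results)
print(total)
```

With the sample data this should reproduce the per-Id values listed above up to floating-point rounding. Using sets for the time lists keeps the `t in a_time` lookup cheap if the number of slots grows.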