Data analysis in Python (DataFrame and nested loops)

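I am trying to understand how to use nested loops in Python. I wanted to sum rows that share the same values and learned to use the groupby function (based on another Stack Overflow question I saw today). I would like to learn the Pythonic, DataFrame-based way.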

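Now I want to sum the working hours in the following way. I sum across units within each scenario, for example: Scenario = 1, Company = A, Country = USA, Unit = HR + Corporate Client, total working hours = 65 + 63 = 128, and so on. After the raw data I include what the output should look like. I am not sure whether this also works with groupby; it looks more like a pivot.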
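The summation described above could be sketched with groupby roughly as follows. This is only an illustration under assumptions: the column name working_hours and the sample rows are hypothetical (the real CSV layout is not shown here), and only the figures 65 and 63 come from the example above.

```python
import pandas as pd

# Hypothetical sample rows; 'working_hours' is an assumed column name.
sample = pd.DataFrame({
    'Scenario': [1, 1, 2],
    'Company': ['A', 'A', 'A'],
    'Country': ['USA', 'USA', 'USA'],
    'Unit': ['HR', 'Corporate Client', 'HR'],
    'working_hours': [65, 63, 40],
})

# One row per (Scenario, Company, Country):
# join the unit names with '+' and sum the hours.
summary = (
    sample.groupby(['Scenario', 'Company', 'Country'], as_index=False)
          .agg(Unit=('Unit', lambda s: '+'.join(s)),
               working_hours=('working_hours', 'sum'))
)
print(summary)
```

For Scenario 1 this gives Unit = HR+Corporate Client with 128 working hours, so a plain groupby appears to be enough here; a pivot would only be needed if the units should become separate columns.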

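I started with nested loops but ran into problems when indexing by date. As a result, my code only filters by date; it is not efficient, but it works. I have learned that nested loops are not a good fit for DataFrames, but I am not sure which way to go instead. The code looks like this: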

import pandas as pd

working_date_start = '2017-07-14'
working_date_end = '2017-07-15'
flag_scenario = 0
Scenario = 0

df = pd.read_csv('C:/Comapny_WorkingHours.csv', encoding='cp1252', sep=';', index_col=None).dropna()

# Keep rows inside the date window that match the flag and the scenario threshold
df = df[(df['working_date'] >= working_date_start) & (df['working_date'] < working_date_end) & (df['flag'] == flag_scenario) & (df['Scenario'] >= Scenario)]

# Convert the date strings to datetimes and build a date-indexed copy
pd_date = pd.DatetimeIndex(df['working_date'].values)
df['working_date'] = pd_date
index_data = df.set_index('working_date')

for current_date in index_data.index.unique():
    print('calculating date: ' + str(current_date))
    # Compare every pair of rows (i, j); this scans all of df, not only current_date
    for i in range(0, len(df)):
        for j in range(i+1, len(df)):
            if df.iloc[i]['Scenario'] == df.iloc[j]['Scenario'] and df.iloc[i]['Unit'] != df.iloc[j]['Unit'] and df.iloc[i]['Company'] == 'Company A' and df.iloc[j]['Company'] == 'Company A' and df.iloc[i]['Country'] == 'USA' and df.iloc[j]['Country'] == 'USA':
                print(df.iloc[i]['Scenario'], df.iloc[j]['Scenario'])
                print(df.iloc[i]['Unit'], df.iloc[j]['Unit'])
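One possible way to get the same row pairs without the two inner loops is a self-merge on Scenario. The sketch below is only an assumption about how this could be done: the sample rows are hypothetical stand-ins for the filtered df, and the helper column name row is introduced purely for illustration.

```python
import pandas as pd

# Hypothetical rows standing in for the filtered df.
df = pd.DataFrame({
    'Scenario': [1, 1, 2, 1],
    'Company': ['Company A', 'Company A', 'Company A', 'Company B'],
    'Country': ['USA', 'USA', 'USA', 'USA'],
    'Unit': ['HR', 'Corporate Client', 'HR', 'IT'],
})

# Filter once instead of testing Company and Country inside the loops.
subset = df[(df['Company'] == 'Company A') & (df['Country'] == 'USA')]
# Keep the original row label in a column ('row' is an illustrative name).
subset = subset.reset_index().rename(columns={'index': 'row'})

# Self-join on Scenario to build every candidate pair of rows.
pairs = subset.merge(subset, on='Scenario', suffixes=('_i', '_j'))

# Keep each pair once (row_i < row_j) and only pairs with different units.
pairs = pairs[(pairs['row_i'] < pairs['row_j']) &
              (pairs['Unit_i'] != pairs['Unit_j'])]
print(pairs[['Scenario', 'Unit_i', 'Unit_j']])
```

The merge still builds all pairs within a scenario, so it can grow large when a scenario has many rows, but the comparison itself runs in pandas rather than row by row through iloc.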

繁华开满天机


