Creating a column with continuous values across individual csv files


I have a very large csv file that I split into six separate files. I am using a for loop to read each file and create a column whose values increase by 1.

import numpy as np
import pandas as pd

whole_file = ['100Hz1-raw.csv', '100Hz2-raw.csv', '100Hz3-raw.csv', '100Hz4-raw.csv', '100Hz5-raw.csv', '100Hz6-raw.csv']

first_file = True

for piece in whole_file:
    if not first_file:
        skip_row = [0]  # if it is not the first csv file then skip the header row (row 0) of that file
    else:
        skip_row = []
    V_raw = pd.read_csv(piece)
    V_raw['centiseconds'] = np.arange(len(V_raw))  # label each centisecond

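My output: [image not preserved; with the code above, the centiseconds column starts again at 0 in every file]

The output I want: [image not preserved; judging by the title, the centiseconds column should keep counting from one file to the next]

Is there a smart way to do what I want?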

慕斯王


2 Answers

森栏

Store the last centisecond value and start counting from there:

import numpy as np
import pandas as pd

whole_file = ['100Hz1-raw.csv', '100Hz2-raw.csv', '100Hz3-raw.csv', '100Hz4-raw.csv', '100Hz5-raw.csv', '100Hz6-raw.csv']

# create old_centiseconds variable: the number of rows labelled so far
old_centiseconds = 0

for piece in whole_file:
    # pd.read_csv treats the first row of each file as the header by default,
    # so the unused first_file / skip_row bookkeeping from the question is dropped here
    V_raw = pd.read_csv(piece)
    # add old_centiseconds onto what you had before
    V_raw['centiseconds'] = np.arange(len(V_raw)) + old_centiseconds  # label each centisecond
    # update old_centiseconds
    old_centiseconds += len(V_raw)
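To check the running-offset idea without the real data, here is a minimal sketch that wraps the same loop in a function and feeds it two tiny in-memory "files". The function name `add_running_centiseconds`, the `voltage` column, and the sample values are all made up for illustration; only the offset logic comes from the answer above.

```python
import io

import numpy as np
import pandas as pd


def add_running_centiseconds(sources):
    """Read each CSV source and label its rows with one counter that spans all sources."""
    offset = 0  # number of rows labelled so far
    frames = []
    for source in sources:
        frame = pd.read_csv(source)
        # continue counting where the previous piece stopped
        frame["centiseconds"] = np.arange(len(frame)) + offset
        offset += len(frame)
        frames.append(frame)
    return frames


# Two in-memory CSV pieces (hypothetical data) stand in for the real files.
part_1 = io.StringIO("voltage\n0.1\n0.2\n0.3\n")
part_2 = io.StringIO("voltage\n0.4\n0.5\n")

labelled = add_running_centiseconds([part_1, part_2])
first_piece_labels = labelled[0]["centiseconds"].tolist()
second_piece_labels = labelled[1]["centiseconds"].tolist()
```

The second piece picks up at 3 because the first piece has three rows. Passing the real file names instead of the `io.StringIO` objects should behave the same way, assuming each file has its own header row.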


RISEBY

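As I said in the comments, you may want to treat the data as a numpy array, since that takes less memory. You can do this by opening each .csv file as a numpy array and appending it to an empty list. If you want to join these numpy arrays together, you can use np.vstack. The code below should do it:

import numpy as np

whole_file = ['100Hz1-raw.csv', '100Hz2-raw.csv', '100Hz3-raw.csv', '100Hz4-raw.csv', '100Hz5-raw.csv', '100Hz6-raw.csv']

whole_file_numpy_array = []

for file_name in whole_file:
    # skip_header=1 assumes every file starts with a header row, as the comment in the question suggests
    my_data = np.genfromtxt(file_name, delimiter=',', skip_header=1)
    whole_file_numpy_array.append(my_data)  # append the loaded data, not the file name

combined_numpy_array = np.vstack(whole_file_numpy_array)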
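This answer stops at stacking the arrays and does not add the counter itself. One possible way to finish the job, sketched under the assumption that every piece has the same columns and a header row, is to number the rows of the stacked array once and attach that as an extra column. The column names and values below are invented test data.

```python
import io

import numpy as np

# Two in-memory CSV pieces (hypothetical data) stand in for the real files.
part_1 = io.StringIO("voltage,current\n0.1,1.0\n0.2,2.0\n")
part_2 = io.StringIO("voltage,current\n0.3,3.0\n0.4,4.0\n")

# Load each piece as a 2-D float array, skipping its header row.
arrays = [np.genfromtxt(part, delimiter=",", skip_header=1) for part in (part_1, part_2)]

# Stack the pieces row-wise, then number the rows of the combined array once.
combined = np.vstack(arrays)
centiseconds = np.arange(len(combined))

# Attach the counter as the last column; it is stored as float like the rest of the array.
labelled = np.column_stack([combined, centiseconds])
```

Because the counter is created after stacking, there is no per-file offset to track. The trade-off is that all pieces must fit in memory at the same time.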

