Is there an efficient way to assign IDs to cells when comparing two DataFrames?

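I want to give each row of df2 a sequential ID and then, based on that ID, replace every occurrence of the matching value in df. The code I wrote takes a very long time to run. Is there another way?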


for i in range(0, 35261):          # rows of df2
    for j in range(0, 54793):      # rows of df
        if df2.V_ID[i] == df.V_ID[j]:
            # .loc avoids chained assignment (df.V_ID[j] = i)
            df.loc[j, 'V_ID'] = i

Sample data for df:


    time                 IP1          IP2              GETVIDEO  V_ID                   IP3
0   2008-03-11 17:28:17  63.22.65.77  205.181.173.92   GETVIDEO  ORDhCi6JQaY&signature  254.212.25.169
1   2008-03-11 17:28:20  63.22.65.94  35.139.184.95    GETVIDEO  xEcFchOvj4Y&signature  254.212.19.255
2   2008-03-11 17:28:22  63.22.65.73  35.139.176.183   GETVIDEO  z-oBoCMSfbw&signature  254.212.19.196
3   2008-03-11 17:28:23  63.22.65.73  102.15.230.123   GETVIDEO  pSo-_TavE1U&signature  254.212.25.206
4   2008-03-11 17:28:23  63.22.65.77  102.15.134.225   GETVIDEO  kHtaORb0LUk&signature  254.212.22.122
5   2008-03-11 17:28:23  63.22.65.77  102.15.111.222   GETVIDEO  t7qjlPPmeJE&origin     105.136.78.115
6   2008-03-11 17:28:27  63.22.65.73  35.139.31.8      GETVIDEO  2UPaRi0WY7c&origin     105.136.78.115
7   2008-03-11 17:28:28  63.22.65.73  102.15.143.68    GETVIDEO  lAzrUxpybs0&signature  254.212.21.130
8   2008-03-11 17:28:30  63.22.65.73  205.181.139.118  GETVIDEO  J_KKyw8V-l0&origin     105.136.78.115
9   2008-03-11 17:28:31  63.22.65.73  102.15.143.20    GETVIDEO  xnsPfRdSU0Q&origin     105.136.78.115
10  2008-03-11 17:28:34  63.22.65.94  102.15.141.151   GETVIDEO  qDKx6CkQM04&origin     105.136.78.115

Sample data for df2:


    V_ID                count
0   2UPaRi0WY7c&origin  768
1   t7qjlPPmeJE&origin  142
2   CKrTlXN9-iE&origin  107
3   IZtPejST9IQ&origin  103
4   FKb3qRljGBc&origin  93
5   LcM0OT6mnqA&origin  67
6   7sei-eEjy4g&origin  62
7   qDKx6CkQM04&origin  53
8   4rb8aOzy9t4&origin  46
9   wjv4Fp7GiGk&origin  46
10  SKDXBvPIepI&sign    44

侃侃无极
1 Answer

慕斯709654

import pandas as pd

df2 = pd.DataFrame({'V_ID': ['a', 'b', 'c', 'd'], 'count': [12, 5, 7, 9]})
df = pd.DataFrame({'time': ['2008-03-11', '2008-03-11', '2008-03-11', '2008-03-11',
                            '2008-03-11', '2008-03-11', '2008-03-11'],
                   'V_ID': ['a', 'sdf', 'c', 'rge', 'gfg', 'a', 'a']})

# Create an index column for df2
df2 = df2.reset_index()

# Key-value pairs of index and V_ID
mapping = df2['V_ID'].to_dict()

# Invert key-value pairs
mapping = {v: k for k, v in mapping.items()}

# Replace values in df['V_ID'] that match keys in mapping with the mapped values
df['V_ID'] = df['V_ID'].replace(mapping)

print(df)

Output:

         time V_ID
0  2008-03-11    0
1  2008-03-11  sdf
2  2008-03-11    2
3  2008-03-11  rge
4  2008-03-11  gfg
5  2008-03-11    0
6  2008-03-11    0
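For context on why the original loop is slow: with 35,261 rows in df2 and 54,793 rows in df, the nested loop performs roughly 1.9 billion comparisons, each through pandas indexing. A dictionary lookup needs only one pass over each frame. The sketch below is one possible variant of the same idea using `Series.map` with a plain dict. It reuses the toy data from this answer, assumes the `V_ID` values in df2 are unique, and the name `lookup` is only illustrative.

```python
import pandas as pd

# Toy data mirroring the example in this answer (not the asker's real data).
df2 = pd.DataFrame({'V_ID': ['a', 'b', 'c', 'd'], 'count': [12, 5, 7, 9]})
df = pd.DataFrame({
    'time': ['2008-03-11'] * 7,
    'V_ID': ['a', 'sdf', 'c', 'rge', 'gfg', 'a', 'a'],
})

# Build V_ID -> row position in df2 in a single pass.
# Assumption: V_ID is unique in df2; with duplicates, the last position wins.
lookup = dict(zip(df2['V_ID'], df2.index))

# One hash lookup per row of df; values missing from df2 are left unchanged.
df['V_ID'] = df['V_ID'].map(lambda v: lookup.get(v, v))

print(df)
```

If a purely numeric column is preferred, `df['V_ID'].map(lookup)` without the fallback yields NaN for values that do not appear in df2, which can then be handled separately.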
