After seriously misunderstanding how `on` works in `join` (spoiler: very differently from `on` in `merge`), this is my example code:
import pandas as pd
index1 = pd.MultiIndex.from_product([["variables"], ["number", "fruit"]])
df1 = pd.DataFrame([["one", "apple"], ["two", "banana"]], columns=index1)
index2 = pd.MultiIndex.from_product([["variables"], ["fruit", "color"]])
df2 = pd.DataFrame([["banana", "yellow"]], columns=index2)
print(df1.merge(df2, on="fruit", how="left"))
I get a KeyError. How do I correctly reference variables.fruit here?
To understand what I am after, consider the same problem without a MultiIndex:
import pandas as pd
df1 = pd.DataFrame([["one", "apple"], ["two", "banana"]], columns=["number", "fruit"])
df2 = pd.DataFrame([["banana", "yellow"]], columns=["fruit", "color"])
# this is obviously incorrect as it uses indexes on `df1` as well as `df2`:
print(df1.join(df2, rsuffix="_"))
# this is *also* incorrect: I initially thought it would work, but it matches `fruit` against the index of `df2`:
print(df1.join(df2, on="fruit", rsuffix="_"))
# this is correct:
print(df1.merge(df2, on="fruit", how="left"))
The expected and desired result is this:
  number   fruit   color
0    one   apple     NaN
1    two  banana  yellow
How do I get the same result when fruit is part of a MultiIndex?
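For reference, here is a minimal sketch of one way this might be approached. It assumes that a column under MultiIndex columns can be addressed by its full tuple label, and that the tuple is wrapped in a list so that `merge` treats it as a single key rather than as two separate keys. The names `result` and `flat` are only illustrative.

```python
import pandas as pd

index1 = pd.MultiIndex.from_product([["variables"], ["number", "fruit"]])
df1 = pd.DataFrame([["one", "apple"], ["two", "banana"]], columns=index1)
index2 = pd.MultiIndex.from_product([["variables"], ["fruit", "color"]])
df2 = pd.DataFrame([["banana", "yellow"]], columns=index2)

# Assumption: the key is the full column label, i.e. the tuple
# ("variables", "fruit"). It is wrapped in a list so that it is read
# as one key, not as the two keys "variables" and "fruit".
result = df1.merge(df2, on=[("variables", "fruit")], how="left")
print(result)

# Optional: drop the outer column level to compare with the flat example.
flat = result.droplevel(0, axis=1)
print(flat)
```

If this behaves as assumed, `flat` should have the columns number, fruit and color, with a missing color for the apple row.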