"TypeError: Cannot convert bool to numpy.ndarray" when grouping by multiple columns

I want to group a DataFrame by two columns to summarize the average monthly sales of each store.

Data (the pandas DataFrame fact):


   store_id  sku_id  date        quantity  city    city    category  month
0  354       31253   2017-08-08  1         Paris   Paris   Shirt     8
1  354       31253   2017-08-19  1         Paris   Paris   Shirt     8
2  354       31258   2017-07-30  1         Paris   Paris   Shirt     7
3  354       277171  2017-09-28  1         Paris   Paris   Shirt     9
4  174       295953  2017-08-16  1         London  London  Shirt     8

Grouping by store_id alone or by month alone works fine, but when I try to group by both store_id and month at the same time, I get:


groupby_month = fact['quantity'].groupby(fact['store_id', 'month'])

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-169-a8cffb72ab7c> in <module>
----> 1 groupby_month = fact['quantity'].groupby(fact['store_id', 'month'])
      2 
      3 

D:\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

D:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2655                                  'backfill or nearest lookups')
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:
   2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine._get_loc_duplicates()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine._maybe_get_bool_indexer()

TypeError: Cannot convert bool to numpy.ndarray


阿晨1998
2 answers

猛跑小猪

First, check the index labels and the columns:

fact.index
fact.columns

If you need to convert the index into columns, use:

fact = fact.reset_index()

Then you can use:

fact.groupby(['store_id', 'month'])['quantity'].mean()

Output:

store_id  month
174       8        1
354       7        1
          8        1
          9        1
Name: quantity, dtype: int64

Or better, add the group mean to every row as a new column:

fact['mean'] = fact.groupby(['store_id', 'month'])['quantity'].transform('mean')
print(fact)

   store_id  sku_id        date  quantity    city  city.1 category  month  \
0       354   31253  2017-08-08         1   Paris   Paris    Shirt      8   
1       354   31253  2017-08-19         1   Paris   Paris    Shirt      8   
2       354   31258  2017-07-30         1   Paris   Paris    Shirt      7   
3       354  277171  2017-09-28         1   Paris   Paris    Shirt      9   
4       174  295953  2017-08-16         1  London  London    Shirt      8   

   mean  
0     1  
1     1  
2     1  
3     1  
4     1  
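For context on why the original call fails: fact['store_id', 'month'] passes the tuple ('store_id', 'month') to the DataFrame as a single column label, not as two columns. The traceback goes through _get_loc_duplicates, which suggests the column labels are not unique (the sample data shows city twice), and that is probably why the lookup ends in this TypeError and not a plain KeyError. A minimal sketch, assuming the frame looks like the sample rows in the question. The names monthly_mean and same are made up for illustration, and dropping the second city column is only one possible way to handle the duplicate:

```python
import pandas as pd

# Sample rows from the question, including the duplicated "city" label.
fact = pd.DataFrame(
    [
        [354, 31253, "2017-08-08", 1, "Paris", "Paris", "Shirt", 8],
        [354, 31253, "2017-08-19", 1, "Paris", "Paris", "Shirt", 8],
        [354, 31258, "2017-07-30", 1, "Paris", "Paris", "Shirt", 7],
        [354, 277171, "2017-09-28", 1, "Paris", "Paris", "Shirt", 9],
        [174, 295953, "2017-08-16", 1, "London", "London", "Shirt", 8],
    ],
    columns=["store_id", "sku_id", "date", "quantity",
             "city", "city", "category", "month"],
)

# The column labels are not unique because "city" appears twice.
print(fact.columns.is_unique)

# Assumption: the second "city" column is redundant, so keep only the
# first occurrence of each label.
fact = fact.loc[:, ~fact.columns.duplicated()]

# Pass the grouping keys as a list of column names.
monthly_mean = fact.groupby(["store_id", "month"])["quantity"].mean()

# Same grouping written closer to the question: a list of two Series,
# not a single fact['store_id', 'month'] lookup.
same = fact["quantity"].groupby([fact["store_id"], fact["month"]]).mean()

print(monthly_mean)
```

Both forms group on the same two keys. The first is the more common spelling.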

慕勒3428872

You need to add "as_index=True", for example: "count_in = df.groupby(['time_in','id'], as_index=True)['time_in'].count()"
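Note that as_index=True is the default for groupby, so passing it explicitly should not change the result. What the parameter controls is whether the group keys end up in the index or as ordinary columns. A small sketch, assuming a cut-down version of the question's data. The names df, indexed, default and flat are illustrative only:

```python
import pandas as pd

# Cut-down version of the question's data (illustrative).
df = pd.DataFrame({
    "store_id": [354, 354, 354, 354, 174],
    "month": [8, 8, 7, 9, 8],
    "quantity": [1, 1, 1, 1, 1],
})

# Explicit as_index=True and the default give the same result:
# a Series indexed by a (store_id, month) MultiIndex.
indexed = df.groupby(["store_id", "month"], as_index=True)["quantity"].mean()
default = df.groupby(["store_id", "month"])["quantity"].mean()

# as_index=False keeps the group keys as regular columns and
# returns a flat DataFrame.
flat = df.groupby(["store_id", "month"], as_index=False)["quantity"].mean()

print(indexed)
print(flat)
```

The flat form is often easier to merge back or export, while the indexed form is convenient for lookups such as indexed.loc[(354, 8)].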