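I have a dataframe of restaurant customer ratings. star_rating is the rating a customer gave.
What I want to do is add a column nb_fave_rating to the same dataframe, holding the total number of favorable ratings for each restaurant. I consider a rating "favorable" if its star count is >= 3.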
import pandas as pd

data = {'rating_id': ['1', '2','3','4','5','6','7','8','9'],
'user_id': ['56', '13','56','99','99','13','12','88','45'],
'restaurant_id': ['xxx', 'xxx','yyy','yyy','xxx','zzz','zzz','eee','eee'],
'star_rating': ['2.3', '3.7','1.2','5.0','1.0','3.2','1.0','2.2','0.2'],
'rating_year': ['2012','2012','2020','2001','2020','2015','2000','2003','2004'],
'first_year': ['2012', '2012','2001','2001','2012','2000','2000','2001','2001'],
'last_year': ['2020', '2020','2020','2020','2020','2015','2015','2020','2020'],
}
df = pd.DataFrame(data, columns=['rating_id','user_id','restaurant_id','star_rating','rating_year','first_year','last_year'])
df['star_rating'] = df['star_rating'].astype(float)
positive_reviews = df[df.star_rating >= 3.0 ].groupby('restaurant_id')
positive_reviews.head()
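From here, I don't know how to count the positive reviews per restaurant and add that count as a new column to my initial dataframe df.
The expected output would look like this.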
data = {'rating_id': ['1', '2','3','4','5','6','7','8','9'],
'user_id': ['56', '13','56','99','99','13','12','88','45'],
'restaurant_id': ['xxx', 'xxx','yyy','yyy','xxx','zzz','zzz','eee','eee'],
'star_rating': ['2.3', '3.7','1.2','5.0','1.0','3.2','1.0','2.2','0.2'],
'rating_year': ['2012','2012','2020','2001','2020','2015','2000','2003','2004'],
'first_year': ['2012', '2012','2001','2001','2012','2000','2000','2001','2001'],
'last_year': ['2020', '2020','2020','2020','2020','2015','2015','2020','2020'],
'nb_fave_rating': ['1', '1','1','1','1','1','1','0','0'],
}
So I tried this and got a bunch of NaNs:
df['nb_fave_rating']=df[df.star_rating >= 3.0 ].groupby('restaurant_id').agg({'star_rating': 'count'})
df.head()
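The NaNs most likely come from index alignment: the aggregated result is indexed by restaurant_id, while df has the default integer index, so no labels match on assignment. One possible way around this, offered as a minimal sketch rather than the definitive answer, is to flag each favorable row and use groupby(...).transform('sum') so the per-restaurant count is broadcast back to every row. The is_fave variable is a name introduced here for illustration, and only the two relevant columns are reproduced.

```python
import pandas as pd

# Only the columns needed for the count, with the values from the question
df = pd.DataFrame({
    'restaurant_id': ['xxx', 'xxx', 'yyy', 'yyy', 'xxx', 'zzz', 'zzz', 'eee', 'eee'],
    'star_rating': [2.3, 3.7, 1.2, 5.0, 1.0, 3.2, 1.0, 2.2, 0.2],
})

# 1 where the rating is favorable (>= 3.0), 0 otherwise (is_fave is an illustrative name)
is_fave = (df['star_rating'] >= 3.0).astype(int)

# Sum the flags within each restaurant; transform keeps the original row index,
# so the result aligns with df and restaurants with no favorable rating get 0
df['nb_fave_rating'] = is_fave.groupby(df['restaurant_id']).transform('sum')

print(df)
```

Because the filter is applied as a 0/1 flag instead of dropping rows first, restaurants such as eee still appear in the grouping and receive 0 rather than NaN.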