
Fetching WeChat Friends with Python and Analyzing the Data

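I recently saw a few articles recommended by WeChat official accounts about using Python to scrape your own WeChat friend list and then analyze it. I had had the same idea for a while but never got around to it. Today happens to be New Year's Day, so I went to the office and wrote this small project.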

Fetching WeChat friends is actually simple, because a ready-made module already exists: itchat, documented at https://itchat.readthedocs.io/zh/latest/ . First install it with pip3:

pip3 install itchat

Then import the itchat module and call get_friends() to fetch all WeChat friends:

import itchat

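# With no arguments, auto_login() displays a QR code to scan.
# Passing True (the hotReload flag) caches the login state, so later runs only need a confirmation on the phone.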
itchat.auto_login(True)
friends = itchat.get_friends()
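
Before storing anything, it can help to flatten each friend entry into a plain row. The following is a minimal sketch, assuming each entry behaves like a dict with the keys used in the insert step of this post; the helper name `friend_to_row` and the sample entry are made up for illustration and are not part of itchat.

```python
# Keys read from each friend entry, in the order of the table columns.
FIELDS = ("UserName", "NickName", "RemarkName", "Sex",
          "HeadImgUrl", "Province", "City", "Signature")


def friend_to_row(friend):
    """Return one friend entry as a tuple in FIELDS order.

    Missing keys fall back to an empty string (an assumption of this sketch).
    """
    return tuple(friend.get(key, "") for key in FIELDS)


# Hypothetical entry, shaped like the dict-style objects in the friend list.
sample_friend = {
    "UserName": "@abc123",
    "NickName": "Alice",
    "RemarkName": "",
    "Sex": 2,
    "HeadImgUrl": "",
    "Province": "Anhui",
    "City": "Hefei",
    "Signature": "hello",
}

row = friend_to_row(sample_friend)
print(row)
```

A row built this way can be passed directly as the parameter tuple of a parameterized INSERT.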

To make the later analysis easier, I store the friend information in a database. First create the table:

create table t_friends
(
  id           int auto_increment primary key,
  user_name    varchar(255) null,
  nick_name    varchar(20)  null,
  remark_name  varchar(20)  null,
  sex          int          null,
  head_img_url varchar(255) null,
  province     varchar(20)  null,
  city         varchar(20)  null,
  signature    varchar(255) null
);

Then insert the fetched friends into the database:

import pymysql

connect = pymysql.connect(host='localhost',
                          user='root',
                          password='root1234',
                          db='itchat_db',
                          charset='utf8mb4')

cursor = connect.cursor()

# Parameterized statement: the values are passed separately to execute()
sql = ("INSERT INTO t_friends (`user_name`, `nick_name`, `remark_name`, `sex`, "
       "`head_img_url`, `province`, `city`, `signature`) "
       "VALUES (%s, %s, %s, %s, %s, %s, %s, %s)")

for friend in friends:
    cursor.execute(sql, (friend['UserName'], friend['NickName'], friend['RemarkName'], friend['Sex'],
                         friend['HeadImgUrl'], friend['Province'], friend['City'], friend['Signature']))

# Commit once after all rows are inserted, then release the connection
connect.commit()
connect.close()
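
The per-row loop can also be written as a single batch insert with `executemany`, which is part of the Python DB-API and is available on pymysql cursors as well. The sketch below swaps in the standard-library `sqlite3` module with an in-memory database so that it runs without a MySQL server; the sample rows are made up. Note that sqlite3 uses `?` placeholders where pymysql uses `%s`.

```python
import sqlite3

# Made-up rows in the same column order as t_friends (without the id).
rows = [
    ("@abc", "Alice", "", 2, "", "Anhui", "Hefei", "hello"),
    ("@def", "Bob", "Bobby", 1, "", "Henan", "Zhengzhou", ""),
]

# In-memory SQLite database as a stand-in for the MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE t_friends (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_name TEXT, nick_name TEXT, remark_name TEXT, sex INTEGER,
        head_img_url TEXT, province TEXT, city TEXT, signature TEXT
    )
""")

insert_sql = (
    "INSERT INTO t_friends (user_name, nick_name, remark_name, sex, "
    "head_img_url, province, city, signature) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?)"
)

# One call inserts every row; commit once at the end.
conn.executemany(insert_sql, rows)
conn.commit()

inserted = conn.execute("SELECT COUNT(*) FROM t_friends").fetchone()[0]
print(inserted)
conn.close()
```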

With the data in place, the analysis can begin. For plotting I use matplotlib, through its pyplot interface. Again, install it with pip3:

pip3 install matplotlib

First, the gender ratio among my friends:

import pymysql
import matplotlib.pyplot as plt

connect = pymysql.connect(host='localhost',
                          user='root',
                          password='root1234',
                          db='itchat_db',
                          charset='utf8mb4')

cursor = connect.cursor()

# sex = 1 is male, sex = 2 is female, anything else is grouped as other
sql = "select case when sex = 1 then 'Male' when sex = 2 then 'Female' else 'Other' end as gender, count(sex) from t_friends group by sex;"

cursor.execute(sql)
results = cursor.fetchall()
connect.close()

fig, ax = plt.subplots(figsize=(15, 8), subplot_kw=dict(aspect="equal"))

data = [row[1] for row in results]
sex = [row[0] for row in results]


def func(pct, allvals):
    # Convert a wedge's percentage back into a head count
    absolute = int(round(pct / 100. * sum(allvals)))
    return "{:.1f}%\n({:d} people)".format(pct, absolute)

wedges, texts, autotexts = ax.pie(data, autopct=lambda pct: func(pct, data), textprops=dict(color="w"))

ax.legend(wedges, sex, title="Gender", loc="center left", bbox_to_anchor=(1, 0, 0.5, 1))

plt.setp(autotexts, size=8, weight="bold")

ax.set_title("WeChat friends by gender")

plt.show()

The result is a pie chart of the gender split.
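
The same tally can also be computed directly in Python from the friend list, without going through the database. This is a minimal sketch, assuming each entry is dict-like with a `Sex` key coded as in the SQL above (1 for male, 2 for female, anything else treated as other); the helper names and the sample data are made up for illustration.

```python
from collections import Counter

# Mapping of the Sex codes used in the SQL query; other codes become "Other".
SEX_LABELS = {1: "Male", 2: "Female"}


def count_by_sex(friends):
    """Count friends per gender label."""
    return Counter(SEX_LABELS.get(f.get("Sex"), "Other") for f in friends)


def pie_label(pct, total):
    """Format a pie wedge label: percentage plus the head count it stands for."""
    count = int(round(pct / 100.0 * total))
    return "{:.1f}%\n({:d} people)".format(pct, count)


# Hypothetical friend entries with only the field this sketch needs.
sample = [{"Sex": 1}, {"Sex": 1}, {"Sex": 2}, {"Sex": 0}]

counts = count_by_sex(sample)
print(counts)
print(pie_label(50.0, sum(counts.values())))
```

The keys and values of `counts` could be handed to a pie chart as labels and data in place of the query results.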

Next, which provinces and cities my WeChat friends are spread across:

import pymysql
import matplotlib.pyplot as plt

connect = pymysql.connect(host='localhost',
                          user='root',
                          password='root1234',
                          db='itchat_db',
                          charset='utf8mb4')

cursor = connect.cursor()

# Friend count per province, top 20, skipping empty provinces
sql = "select province, count(1) counts from t_friends where province != '' group by province order by counts desc limit 20;"

cursor.execute(sql)
results = cursor.fetchall()
connect.close()

provinces = [row[0] for row in results]
counts = [row[1] for row in results]

fig, axs = plt.subplots(1, 1, figsize=(15, 8), sharey=True)

axs.bar(provinces, counts)

# Print the count just above each bar
for x, y in zip(provinces, counts):
    axs.text(x, y + 0.05, '%.0f' % y, ha='center', va='bottom', fontsize=11)

axs.set_title('Top 20 provinces of WeChat friends')
plt.show()

The result is a bar chart of the top 20 provinces. The city breakdown follows the same pattern:

import pymysql
import matplotlib.pyplot as plt

connect = pymysql.connect(host='localhost',
                          user='root',
                          password='root1234',
                          db='itchat_db',
                          charset='utf8mb4')

cursor = connect.cursor()

# Friend count per city, top 25, skipping empty cities
sql1 = "select city, count(1) counts from t_friends where city != '' group by province, city order by counts desc limit 25;"

cursor.execute(sql1)
results1 = cursor.fetchall()
connect.close()

cities1 = [row[0] for row in results1]
counts1 = [row[1] for row in results1]

fig, axs = plt.subplots(1, 1, figsize=(15, 8), sharey=True)

# Plot the city data (the original passed the undefined province variables here)
axs.bar(cities1, counts1)

# Print the count just above each bar
for x, y in zip(cities1, counts1):
    axs.text(x, y + 0.05, '%.0f' % y, ha='center', va='bottom', fontsize=11)

axs.set_title('Top 25 cities of WeChat friends')
plt.show()

The result is a bar chart of the top 25 cities.
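
Both rankings follow the same recipe: drop empty values, count, sort descending, and keep the first N. As a rough Python equivalent of those SQL queries, here is a sketch that assumes dict-like friend entries with `Province` and `City` keys; the helper name `top_n` and the sample data are made up for illustration.

```python
from collections import Counter


def top_n(friends, key, n):
    """Return the n most common non-empty values of `key` as (value, count) pairs."""
    values = (f.get(key, "") for f in friends)
    return Counter(v for v in values if v).most_common(n)


# Hypothetical friend entries with only the fields this sketch needs.
sample = [
    {"Province": "Anhui", "City": "Hefei"},
    {"Province": "Anhui", "City": "Hefei"},
    {"Province": "Anhui", "City": "Wuhu"},
    {"Province": "Henan", "City": "Zhengzhou"},
    {"Province": "Henan", "City": "Zhengzhou"},
    {"Province": "", "City": ""},
]

top_provinces = top_n(sample, "Province", 20)
top_cities = top_n(sample, "City", 25)
print(top_provinces)
print(top_cities)
```

Unlike the city query above, this sketch counts by city name alone, so two cities with the same name in different provinces would be merged.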

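Judging by the pie chart and bar charts above, my WeChat friends are mostly male, with a portion of unknown gender, ahahaha (evil 😈). Since I am from Anhui, it is no surprise that Anhui has the most: mostly classmates from primary school through university, plus friends and family. Henan comes second, which also makes sense, since I spent a year in Zhengzhou for work after graduating. Sigh, I still miss my friends in Zhengzhou a little. The rest, such as Jiangsu, Zhejiang, and Shanghai, are places many people aspire to and move to in order to build a career. The others include friends I met on Facebook and Twitter, which I won't go into.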

Life is short, so keep working toward your dreams!


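itchat is an open-source interface project for personal WeChat accounts. It supports both Python 2 and Python 3, and makes it easy to extend a personal WeChat account and simplify everyday tasks. If you are interested, explore the official documentation.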

