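I want to scrape match results from this page: https://www.tennisexplorer.com/player/paire-4a33b/
From the scraped results, I want to build a table with the following columns: tournament, date, match_player_1, match_player_2, round, score. I wrote some code that works, but I don't know how to add the tournament to each match row.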
import requests
from bs4 import BeautifulSoup

# Assumption: `headers` was not defined in the original snippet; a basic User-Agent is used here.
headers = {'User-Agent': 'Mozilla/5.0'}
u = 'https://www.tennisexplorer.com/player/paire-4a33b/'
r = requests.get(u, timeout=120, headers=headers)
# print(r.status_code)
soup = BeautifulSoup(r.content, 'html.parser')
# Walk every row of the 2020 matches table and read the first three cells
for tr in soup.select('#matches-2020-1-data tr'):
    match_date = tr.select_one('td:nth-of-type(1)').get_text(strip=True)
    match_surface = tr.select_one('td:nth-of-type(2)').get_text(strip=True)
    match = tr.select_one('td:nth-of-type(3)').get_text(strip=True)
    #...
I need to create a table like this:
tournament                     date    match_player_1  match_player_2  round  score
Cincinnati Masters (New York)  22.08.  Coric B.        Paire B.        1R     6-0, 1-0
Ultimate Tennis Showdown 2     01.08.  Moutet C.       Paire B.        NaN    15-0, 15-0, 15-0, 15-0
How can I associate the tournament with each match?
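
One possible approach, sketched here under assumptions rather than verified against the live page: if the tournament name sits in its own header row above the matches it covers, it can be remembered while iterating and copied onto every match row that follows. The CSS class `head`, the cell order (date, surface, "player1 - player2", round, score), and the helper names `rows_from_soup` and `attach_tournament` are all hypothetical and would need checking against the real markup.

```python
def rows_from_soup(soup, selector='#matches-2020-1-data tr'):
    """Turn bs4 rows into (css_classes, cell_texts) pairs, in page order."""
    return [
        (tr.get('class') or [],
         [cell.get_text(' ', strip=True) for cell in tr.select('td, th')])
        for tr in soup.select(selector)
    ]


def attach_tournament(rows, header_class='head'):
    """Carry the most recent header row's text forward onto each match row.

    ASSUMPTIONS (not confirmed against the page): header rows carry the CSS
    class `header_class`, and match rows hold their cells in the order
    date, surface, "player1 - player2", round, score.
    """
    tournament = None
    table = []
    for classes, cells in rows:
        if header_class in classes:
            # A header row: remember its text for the match rows below it.
            tournament = cells[0] if cells else None
            continue
        if len(cells) < 5:
            continue  # skip rows that do not look like a match
        date, _surface, match, rnd, score = cells[:5]
        player_1, _, player_2 = match.partition(' - ')
        table.append({
            'tournament': tournament,
            'date': date,
            'match_player_1': player_1,
            'match_player_2': player_2,
            'round': rnd or None,  # an empty round cell becomes a missing value
            'score': score,
        })
    return table


# Hypothetical rows mirroring the desired table; the class names are made up.
sample = [
    (['head', 'flags'], ['Cincinnati Masters (New York)']),
    (['one'], ['22.08.', '', 'Coric B. - Paire B.', '1R', '6-0, 1-0']),
    (['head', 'flags'], ['Ultimate Tennis Showdown 2']),
    (['two'], ['01.08.', '', 'Moutet C. - Paire B.', '', '15-0, 15-0, 15-0, 15-0']),
]
for row in attach_tournament(sample):
    print(row)
```

With the `soup` object from the snippet above, this would be called as `attach_tournament(rows_from_soup(soup))`, and the resulting list of dicts can be passed to `pandas.DataFrame(...)` to get the table. Splitting the match text on " - " is only a guess at how the two player names are separated.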