Python - BeautifulSoup - only one result returned

4 Answers

慕村225694

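Your matchscrape function is wrong. Instead of match.find, which returns only the first matching element, you should use match.findAll, the same way you do in your matches function. Then loop over the datetimes it finds, as in the example below.

    def matchscrape(g_data):
        for match in g_data:
            # findAll returns every matching div, not just the first one
            datetimes = match.findAll('div', class_='main time col-sm-2 hidden-xs')
            for datetime in datetimes:
                print("DateTimes; ", datetime.text.strip())
                print('-' * 80)

The second thing is how the HTML page is parsed. The page is written in HTML, so you should probably use BeautifulSoup(page.text, 'html.parser') instead of lxml.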
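To make the difference between find and find_all (the current name for findAll) concrete, here is a minimal sketch that runs against a small inline HTML snippet rather than the real site. The SAMPLE_HTML markup and the extract_times helper are made up for illustration; only the class string is taken from the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics two listings on the page.
SAMPLE_HTML = """
<div class="listing">
  <div class="main time col-sm-2 hidden-xs"> 17:05 </div>
  <div class="main time col-sm-2 hidden-xs"> 19:05 </div>
</div>
"""

TIME_CLASS = "main time col-sm-2 hidden-xs"


def extract_times(html):
    """Return (first match only, all matches) for the time divs."""
    soup = BeautifulSoup(html, "html.parser")
    # find stops at the first matching element
    first = soup.find("div", class_=TIME_CLASS)
    # find_all collects every matching element
    all_divs = soup.find_all("div", class_=TIME_CLASS)
    return first.text.strip(), [d.text.strip() for d in all_divs]


first_time, all_times = extract_times(SAMPLE_HTML)
print(first_time)
print(all_times)
```

If find_all still returns a single element on the real page, the missing entries are probably not in the HTML that was downloaded, which is what the other answers look into.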

江户川乱折腾

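I also get only 1 timestamp. However, there are other things that could be causing the problem. Sites like this often have dynamic content, and in some cases that content does not load properly with requests. If you are really sure the problem is that requests is not fetching the site correctly, try requests_html (pip install requests-html), which opens a session that can load dynamic content. Note that JavaScript is only executed once you call request.html.render(); a plain session.get returns the same static HTML as requests.

    from requests_html import HTMLSession
    from bs4 import BeautifulSoup

    session = HTMLSession()
    request = session.get(LINK)  # LINK is the URL of the page to scrape
    html = BeautifulSoup(request.text, "html.parser")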

陪伴而非守候

There is only one time listed for today, but you can get tomorrow's times by first making a POST request with the desired date and then reloading the page. For example:

    import requests
    from bs4 import BeautifulSoup

    url = 'https://sport-tv-guide.live/live/darts'
    select_date_url = 'https://sport-tv-guide.live/ajaxdata/selectdate'

    with requests.session() as s:
        # print times for today:
        print('Times for today:')
        soup = BeautifulSoup(s.get(url).content, 'html.parser')
        for t in soup.select('.time'):
            print(t.get_text(strip=True, separator=' '))

        # select tomorrow:
        s.post(select_date_url, data={'d': '2020-07-19'}).text

        # print times for tomorrow:
        print('Times for 2020-07-19:')
        soup = BeautifulSoup(s.get(url).content, 'html.parser')
        for t in soup.select('.time'):
            print(t.get_text(strip=True, separator=' '))

Prints:

    Times for today:
    Darts 17:05
    Times for 2020-07-19:
    Darts 19:05
    Darts 19:05
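The date in the POST body is hard-coded in the example. A small sketch of how it could be computed instead, assuming the endpoint keeps expecting a YYYY-MM-DD string in the 'd' field as in the answer (the selectdate_payload helper is a hypothetical name, not part of the site or of requests):

```python
from datetime import date, timedelta


def selectdate_payload(today=None):
    """Build the form data for the select-date request for the next day."""
    if today is None:
        today = date.today()
    tomorrow = today + timedelta(days=1)
    # isoformat() yields YYYY-MM-DD, the format used in the answer
    return {"d": tomorrow.isoformat()}


print(selectdate_payload(date(2020, 7, 18)))
```

The returned dict could then be passed as data= to the s.post call in place of the literal date.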

哈士奇WWW

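After getting the helpful answers above, I was able to determine that the problem was a cookie the site stores, which holds the countries the user has selected and controls which sports schedule data is shown. In this example, an Australian channel had a listing at 18:00. Because the requests made through the requests module carried no cookie data, that listing did not initially show up in the output of my code above. I was able to supply the necessary cookie information with the following code:

    import requests
    from bs4 import BeautifulSoup

    def makesoup(url):
        cookies = {'mycountries': '101,28,3,102,42,10,18,4,2'}  # pass cookie data
        r = requests.post(url, cookies=cookies)
        return BeautifulSoup(r.text, "html.parser")

The correct information is now output. I am just posting this answer in case it helps someone with a similar problem in the future.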
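If several pages need to be fetched, one option is to store the cookie on a requests.Session so that it is sent with every request. The sketch below is only an illustration of that idea and makes no network call: it prepares a request and inspects the Cookie header that would be sent. The make_session helper is a hypothetical name; the cookie name and value are the ones from the answer above.

```python
import requests

URL = "https://sport-tv-guide.live/live/darts"
COUNTRY_COOKIE = "101,28,3,102,42,10,18,4,2"  # value taken from the answer


def make_session(countries=COUNTRY_COOKIE):
    """Return a session that sends the country-selection cookie."""
    session = requests.Session()
    session.cookies.set("mycountries", countries)
    return session


session = make_session()
# Build the request without sending it, to see which headers it would carry.
prepared = session.prepare_request(requests.Request("GET", URL))
cookie_header = prepared.headers.get("Cookie")
print(cookie_header)
```

Calling session.get(URL) afterwards would send the same header, and the response text could be handed to BeautifulSoup as in makesoup.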