
Cannot concatenate a string returned by BeautifulSoup

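Hello, I want to scrape a web page. I have posted my code below, and the line I marked is the important one. It doesn't work: there is no error, but there is no output either. The problem appears where I need to concatenate strings.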


import requests
from bs4 import BeautifulSoup
import pandas as pd

url='http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php'
html_content = requests.get(url).text
soup = BeautifulSoup(html_content, "lxml")

url_course_main='http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php?fb='

url_course=url_course_main+soup.find_all('option')[1].get_text()    # <--- this line
html_content_course=requests.get(url_course).text
soup_course=BeautifulSoup(html_content_course,'lxml')
for j in soup_course.find_all('td'):
    print(j.get_text())

When I change the line I marked to

url_course=url_course_main+'AKM'

it works.

Also, soup.find_all('option')[1].get_text() appears to equal AKM. Can you guess where the mistake is?


至尊宝的传说
123 views
2 answers

沧海一幻觉

Try using Python's standard urllib.request instead of requests. The requests module has a problem opening the page:

import urllib.request
from bs4 import BeautifulSoup

url='http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php'
html_content = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html_content, "lxml")

url_course_main='http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php?fb='
url_course=url_course_main+soup.find_all('option')[1].get_text()
html_content_course=urllib.request.urlopen(url_course).read()
soup_course=BeautifulSoup(html_content_course,'lxml')
for j in soup_course.find_all('td'):
    print(j.get_text(strip=True))

Prints:

2019-2020 Yaz Dönemi AKM Kodlu Derslerin Ders Programı
...

潇湘沐

The problem is that get_text() returns 'AKM ' with a space at the end, and requests sends the URL with this space, so the server can't find a file for 'AKM '.

I used >< in the string '>{}<'.format(param) to display this space (>AKM <), because without >< the value looks fine.

The code needs get_text(strip=True) or get_text().strip() to remove this space.

import requests
from bs4 import BeautifulSoup

url = 'http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php'
html_content = requests.get(url).text
soup = BeautifulSoup(html_content, 'lxml')

url_course_main = 'http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php?fb='

param = soup.find_all('option')[1].get_text()
print('>{}<'.format(param))   # I use `> <` to show spaces

param = soup.find_all('option')[1].get_text(strip=True)
print('>{}<'.format(param))   # I use `> <` to show spaces

url_course = url_course_main + param
html_content_course = requests.get(url_course).text
soup_course = BeautifulSoup(html_content_course, 'lxml')
for j in soup_course.find_all('td'):
    print(j.get_text())
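The trailing-space issue can also be handled at the point where the URL is built. The following is only a minimal sketch that runs offline with the standard library; it is not code from either answer. The helper name `build_course_url` and the constant `BASE_URL` are hypothetical, and the literal `'AKM '` stands in for the value that `get_text()` returned in the question.

```python
from urllib.parse import urlencode

# Base URL taken from the question; BASE_URL is a name made up for this sketch.
BASE_URL = 'http://www.sis.itu.edu.tr/tr/ders_programlari/LSprogramlar/prg.php'


def build_course_url(raw_code, base_url=BASE_URL):
    """Hypothetical helper: strip whitespace, then append ?fb=<code>."""
    code = raw_code.strip()
    if not code:
        raise ValueError('empty course code: {!r}'.format(raw_code))
    # urlencode also escapes any characters that are not URL-safe.
    return base_url + '?' + urlencode({'fb': code})


raw = 'AKM '  # stand-in for what get_text() returned in the question
print(repr(raw))              # repr() makes the trailing space visible
print(build_course_url(raw))  # URL built from the stripped code
```

Printing with `repr()` serves the same purpose as the `'>{}<'.format(param)` trick. If you use requests, passing `params={'fb': code}` to `requests.get` is another way to build the query string, but the value still has to be stripped first.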