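How do I scrape a website with a ul-li dropdown list in Python?

Building on the question "Scraping a specific website with a search box and javascripts in Python", I am trying to get company ratings from https://www.msci.com/esg-ratings/. The idea is to type a name into the search box, select every option the dropdown suggests for that name (for "rio tinto" these are "RIO TINTO LIMITED" and "RIO TINTO PLC"), and capture the image with the rating shown in the upper-right corner for each of them.

However, I am having trouble handling the ul-li dropdown menu of suggested companies: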

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# Configure headless Chrome.
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--window-size=1920,1080')
wd = webdriver.Chrome(options=options)

wd.get('https://www.msci.com/esg-ratings')

# Type the company name into the search box.
WebDriverWait(wd, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="_esgratingsprofile_keywords"]'))).send_keys("RIO TINTO")

# Click the first suggestion in the ul-li dropdown.
WebDriverWait(wd, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="ui-id-1"]/li[1]'))).click()

#WebDriverWait(wd,10).until(EC.visibility_of_element_located((By.CSS_SELECTOR,"#_esgratingsprofile_esg-ratings-profile-header > div.esg-ratings-profile-header-ratingdata > div.ratingdata-container > div.ratingdata-outercircle.esgratings-profile-header-yellow > div")))

# Locate the rating element and print it.
print(wd.find_element(By.XPATH, '//*[@id="_esgratingsprofile_esg-ratings-profile-header"]/div[2]/div[1]/div[2]/div'))

(The code raises an ElementClickInterceptedException.)

How can I get the data I need for both "RIO TINTO LIMITED" and "RIO TINTO PLC"?

森林海

1 Answer

慕工程0101907

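Regarding "I am having trouble handling the ul-li dropdown menu of suggested companies": this is expected, because the element you are targeting is rendered by a dynamic script. To get around it, you will have to drop the headless argument passed to options.add_argument.

There is also a problem with the final print(...) call, where you try to print the located element. Because the target element is an icon rendered with CSS, print() cannot output it; it only shows the representation of the WebElement object. Instead, save the element as a .png file:

with open('filename.png', 'wb') as file:
    file.write(wd.find_element(By.XPATH, '//*[@id="_esgratingsprofile_esg-ratings-profile-header"]/div[2]/div[1]/div[2]/div').screenshot_as_png)

Then use the file however you need.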
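To cover both "RIO TINTO LIMITED" and "RIO TINTO PLC", the same screenshot step can be repeated for each entry in the dropdown. The following is only a rough sketch, not a tested solution for this site. It assumes the locators from the question still match the page and that reloading the page before each suggestion is acceptable. The JavaScript click fallback for ElementClickInterceptedException is an addition of this sketch (it assumes some overlay is covering the item), and the names rating_filename and save_rating_icons are hypothetical.

```python
import re


def rating_filename(company_name):
    """Build a file name such as 'rio_tinto_plc.png' from a suggestion label."""
    slug = re.sub(r"[^a-z0-9]+", "_", company_name.lower()).strip("_")
    return f"{slug or 'company'}.png"


def save_rating_icons(wd, query, timeout=20):
    """Save a screenshot of the rating icon for every suggestion of `query`.

    `wd` is an already created Selenium WebDriver. Returns the saved file names.
    """
    # Selenium is imported here so rating_filename() is usable without it.
    from selenium.common.exceptions import ElementClickInterceptedException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Locators copied from the question; they are assumed to still be valid.
    search_box = (By.ID, "_esgratingsprofile_keywords")
    suggestions = (By.XPATH, '//*[@id="ui-id-1"]/li')
    rating_icon = (
        By.XPATH,
        '//*[@id="_esgratingsprofile_esg-ratings-profile-header"]'
        "/div[2]/div[1]/div[2]/div",
    )

    wait = WebDriverWait(wd, timeout)
    saved = []
    index = 0
    while True:
        # Reload and search again so the dropdown is rebuilt for each entry.
        wd.get("https://www.msci.com/esg-ratings")
        wait.until(EC.element_to_be_clickable(search_box)).send_keys(query)
        items = wait.until(EC.visibility_of_all_elements_located(suggestions))
        if index >= len(items):
            break  # every suggestion has been visited

        item = items[index]
        name = item.text.strip()
        try:
            item.click()
        except ElementClickInterceptedException:
            # Assumption: an overlay intercepts the click, so click via JavaScript.
            wd.execute_script("arguments[0].click();", item)

        # Wait for the rating icon, then store it as a PNG file.
        icon = wait.until(EC.visibility_of_element_located(rating_icon))
        path = rating_filename(name)
        with open(path, "wb") as file:
            file.write(icon.screenshot_as_png)
        saved.append(path)
        index += 1
    return saved
```

With the wd driver from the question, this would be called as save_rating_icons(wd, "RIO TINTO"). If no suggestions appear within the timeout, the wait raises a TimeoutException rather than returning an empty list.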
