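I am trying to download Excel files from a website using selenium in headless mode. It works fine in most cases, but in a few cases (certain months of a year) driver.find_element_by_xpath() does not work as expected. I have read many posts suggesting that the element may not yet be present when the driver looks for it, but that is not what is happening here: I checked this thoroughly and also tried slowing the process down with time.sleep(). As a side note, I also use driver.implicitly_wait(), because the website takes a while to load its content. I cannot use requests, because the response to the GET request contains no data. My script is as follows: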
from selenium import webdriver
import datetime
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import Select
import os
import shutil
import time
import calendar
currentdir = os.path.dirname(__file__)
Initial_path = 'whateveritis'
chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_experimental_option("prefs", {
    "download.default_directory": Initial_path,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "safebrowsing.enabled": True
})
def save_hist_data(year, months):
    def waitUntilDownloadCompleted(maxTime=1200):
        driver.execute_script("window.open()")
        # switch to new tab
        driver.switch_to.window(driver.window_handles[-1])
        # navigate to chrome downloads
        driver.get('chrome://downloads')
        # define the endTime
        endTime = time.time() + maxTime
        while True:
            try:
                # get the download percentage
                downloadPercentage = driver.execute_script(
                    "return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('#progress').value")
                # check if downloadPercentage is 100 (otherwise the script will keep waiting)
                if downloadPercentage == 100:
                    # exit the method once it's completed
                    return downloadPercentage
            except Exception:
                pass
            # give up once maxTime has elapsed
            if time.time() > endTime:
                break
            # pause briefly before polling again
            time.sleep(1)
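For comparison, here is a minimal sketch of another way to wait for a download: poll the download directory itself instead of reading chrome://downloads. This is not the approach used in the script above. It assumes Chrome marks in-progress downloads with a .crdownload suffix, and the function name wait_for_download and its parameters (directory, known, timeout, poll_interval) are hypothetical names chosen for illustration.

```python
import os
import time


def wait_for_download(directory, known=(), timeout=1200, poll_interval=1.0):
    """Return the sorted names of newly finished files in `directory`.

    Assumption: a partial download carries a '.crdownload' suffix.
    `known` holds file names that were already present before the
    download was triggered, so they are ignored.
    Raises TimeoutError if nothing new has finished within `timeout` seconds.
    """
    known = set(known)
    deadline = time.time() + timeout
    while True:
        names = os.listdir(directory)
        # files Chrome is still writing
        partial = [n for n in names if n.endswith(".crdownload")]
        # new files that are no longer marked as partial
        finished = sorted(
            n for n in names
            if n not in known and not n.endswith(".crdownload")
        )
        if finished and not partial:
            return finished
        if time.time() >= deadline:
            raise TimeoutError(f"download not finished after {timeout} s")
        time.sleep(poll_interval)
```

One possible usage would be to take a snapshot with os.listdir(Initial_path) before clicking the download link, then call wait_for_download(Initial_path, known=snapshot) afterwards. Whether this behaves better than the chrome://downloads check in headless mode would have to be verified against the actual site.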