这是几个小时抓取后的错误回溯:
The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.
这是我的 selenium python 设置:
#scrape.py
from selenium.common.exceptions import *
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.chrome.options import Options
def run_scrape(link):
chrome_options = Options()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument("--headless")
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument("--lang=en")
chrome_options.add_argument("--start-maximized")
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option('useAutomationExtension', False)
chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36")
chrome_options.binary_location = "/usr/bin/google-chrome"
browser = webdriver.Chrome(executable_path=r'/usr/local/bin/chromedriver', options=chrome_options)
browser.get(<link passed here>)
try:
#scrape process
except:
#other stuffs
browser.quit()
#multiprocess.py
import time,
from multiprocessing import Pool
from scrape import *
if __name__ == '__main__':
start_time = time.time()
#links = list of links to be scraped
pool = Pool(20)
results = pool.map(run_scrape, links)
pool.close()
print("Total Time Processed: "+"--- %s seconds ---" % (time.time() - start_time))
Chrome、ChromeDriver 设置、Selenium 版本
ChromeDriver 79.0.3945.36 (3582db32b33893869b8c1339e8f4d9ed1816f143-refs/branch-heads/3945@{#614})
Google Chrome 79.0.3945.79
Selenium Version: 4.0.0a3
我想知道为什么 chrome 正在关闭但其他进程正在工作?
慕莱坞森
HUX布斯
相关分类