
How can I extract the view count of every video in the YouTube search results with Selenium?

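What I want:

  • To extract the view count of every video on the YouTube search results page that Selenium loads.

  • For example, if I search YouTube for "Believer from Imagine Dragons", it should give me the view counts of all the resulting videos (e.g. 104M views, 1.5B views, 698M views, etc.), say for up to the first 20 videos.

What I tried: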

from selenium import webdriver

driver = webdriver.Chrome(executable_path='C:\\ProgramData\\chocolatey\\bin\\chromedriver.exe')
search = 'Believer from Imagine Dragons'
driver.get("https://www.youtube.com/results?search_query=" + search)

main = driver.find_elements_by_id("metadata")
for datas in main:
    info = datas.find_elements_by_id("metadata-line")
    for views in info:
        view_counts = views.find_element_by_xpath("""//*[@id="metadata-line"]/span[1]""")
        print('view_counts: ' + str(view_counts.text))

Output of this:

view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views

I also tried:


from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(executable_path='C:\\ProgramData\\chocolatey\\bin\\chromedriver.exe')
search = 'Believer from Imagine Dragons'
driver.get("https://www.youtube.com/results?search_query=" + search)

main = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "metadata"))
)

data = main.find_elements_by_id("metadata-line")

for datas in data:
    views = datas.find_element_by_xpath("""//*[@id="metadata-line"]/span[1]""")
    print(views.text)

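Output of this:

104M views

However, neither of them gives me what I want. Please help.

Future goal (if you can help with it):

To be able to play the video with the most views on that page.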


慕姐4208626

BIG阳

To extract the text from each <span> using Selenium and Python, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR and get_attribute("innerHTML"):

driver.get("https://www.youtube.com/results?search_query=Believer%20from%20Imagine%20Dragons")
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div#metadata-line span:first-child")))])

  • Using XPATH and the text attribute:

driver.get("https://www.youtube.com/results?search_query=Believer%20from%20Imagine%20Dragons")
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@id='metadata-line']/span[@class='style-scope ytd-video-meta-block' and contains(., 'views')]")))])

Console output:

['1.5B views', '104M views', '32M views', '93M views', '98M views', '2.3M views', '39M views', '26M views', '1.4B views', '9.6M views', '6.7M views', '748K views', '1.3B views', '11M views', '84M views', '51M views', '13M views', '18M views', '197M views', '7.2M views', '79K views', '3.5M views']

Note: you have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
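For the future goal in the question (finding the most-viewed video), the labels printed above would first have to be turned into numbers. The following is only a minimal sketch of that step, not part of the original answer: the helper names `parse_view_count` and `index_of_most_viewed` are hypothetical, and it assumes English labels with the K/M/B suffixes seen in the console output.

```python
import re

# Multipliers for the abbreviated suffixes seen in the console output (assumed: K, M, B).
_SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_view_count(text):
    """Convert a label such as '104M views' into an approximate integer.

    Hypothetical helper: it only understands a leading number with an
    optional K/M/B suffix and raises ValueError for anything else.
    """
    match = re.match(r"\s*([\d.,]+)([KMB])?", text)
    if match is None:
        raise ValueError(f"unrecognized view count: {text!r}")
    number = float(match.group(1).replace(",", ""))
    multiplier = _SUFFIXES.get(match.group(2) or "", 1)
    # round() guards against float artifacts such as 2.3 * 1_000_000.
    return int(round(number * multiplier))


def index_of_most_viewed(labels):
    """Return the position of the label with the largest parsed view count."""
    counts = [parse_view_count(label) for label in labels]
    return max(range(len(counts)), key=counts.__getitem__)


if __name__ == "__main__":
    # A few labels taken from the console output above.
    labels = ["1.5B views", "104M views", "32M views", "748K views"]
    best = index_of_most_viewed(labels)
    print(best, labels[best], parse_view_count(labels[best]))
```

The returned index lines up with the list of elements returned by the wait, so it identifies which search result has the most views. How to get from that element to a clickable video link is not covered in this thread and is left out here.

On the repeated 104M views in the question: an XPath that starts with // is evaluated against the whole document even when it is called on a child element, so every iteration finds the first match on the page. Prefixing the expression with a dot (.//) scopes it to the current element.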