Web scraping a JavaScript page in Python

Hello world,


I'm new to Python, and I'm trying to scrape a JavaScript page: https://search.gleif.org/#/search/


Below is the result I get from my code (using requests):


<!DOCTYPE html>

<html>

<head><meta charset="utf-8"/>

<meta content="width=device-width,initial-scale=1" name="viewport"/>

<title>LEI Search 2.0</title>

<link href="/static/icons/favicon.ico" rel="shortcut icon" type="image/x-icon"/>

<link href="https://fonts.googleapis.com/css?family=Open+Sans:200,300,400,600,700,900&amp;subset=cyrillic,cyrillic-ext,greek,greek-ext,latin-ext,vietnamese" rel="stylesheet"/>

<link href="/static/css/main.045139db483277222eb714c1ff8c54f2.css" rel="stylesheet"/></head>

<body>

<div id="app"></div>

<script src="/static/js/manifest.2ae2e69a05c33dfc65f8.js" type="text/javascript"></script>

<script src="/static/js/vendor.6bd9028998d5ca3bb72f.js" type="text/javascript"></script>

<script src="/static/js/main.5da23c5198041f0ec5af.js" type="text/javascript"></script>

</body>

</html>

Problem: instead of retrieving the script tags above, such as:

"src="/static/js/manifest.2ae2e69a05c33dfc65f8.js" type="text/javascript""


I would like to get the contents of the table so that I can store them.

The table I want to scrape:

http://img1.mukewang.com/61c3e02d000171a516390629.jpg
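
The response shown above is only the application shell: the `<div id="app">` container is empty and the table is filled in later by the scripts, which requests does not execute. As a minimal sketch using only the standard library, the check below reports whether that container already holds any markup or text. The names `AppShellChecker` and `is_empty_shell` are made up for illustration, and the assumption that the rendered table ends up inside `#app` comes only from the HTML above.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class AppShellChecker(HTMLParser):
    """Records whether the element with id="app" has any child tag or text."""

    def __init__(self):
        super().__init__()
        self.depth = 0            # nesting depth inside #app; 0 means outside
        self.found = False        # True once an element with id="app" is seen
        self.has_content = False  # True if #app contains a tag or visible text

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.has_content = True
            if tag not in VOID_TAGS:
                self.depth += 1
        elif dict(attrs).get("id") == "app" and tag not in VOID_TAGS:
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.has_content = True


def is_empty_shell(html):
    """Return True if the page has an #app container with nothing inside it."""
    checker = AppShellChecker()
    checker.feed(html)
    checker.close()
    return checker.found and not checker.has_content


shell = '<html><body><div id="app"></div><script src="/static/js/main.js"></script></body></html>'
print(is_empty_shell(shell))
```

If this returns True for the text returned by requests, the data has to come from a rendered page (for example through a browser driver) rather than from the raw HTML.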

aluckdog
1 answer

FFIVE

The following code is written with Selenium for Python.

    import time

    from selenium import webdriver
    from selenium.webdriver.common.by import By  # Selenium 4 locator API

    country = []
    legal_name = []
    lei = []

    driver = webdriver.Chrome()
    driver.implicitly_wait(5)

    # Walk through the paginated results, 50 records per page.
    for page in range(1, 30395):
        driver.get('https://search.gleif.org/#/search/fulltextFilterId=LEIREC_FULLTEXT&currentPage=' + str(page) + '&perPage=50&expertMode=false#results-section')
        time.sleep(5)  # give the JavaScript time to render the table
        country += [el.get_attribute('innerHTML') for el in driver.find_elements(By.XPATH, '//*[@class="table-cell country"]/a')]
        legal_name += [el.get_attribute('innerHTML') for el in driver.find_elements(By.XPATH, '//*[@class="table-cell legal-name"]/a')]
        lei += [el.get_attribute('innerHTML') for el in driver.find_elements(By.XPATH, '//*[@class="table-cell lei"]/a')]

Logging in (change this to match the corresponding elements on your page):

    driver.find_element(By.ID, "UserName").send_keys("xxxx")
    driver.find_element(By.NAME, "Password").send_keys("yyyy")
    driver.find_element(By.CLASS_NAME, "loginButton").click()

Getting the page content:

    print(driver.page_source)
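
Since the goal in the question is to store the table, here is one possible way to write the three collected lists to CSV with the standard library. This is only a sketch: the function name `rows_to_csv`, the column headers, and the sample values are made up for illustration and are not part of the answer's code or of the site's data.

```python
import csv
import io


def rows_to_csv(country, legal_name, lei, out):
    """Write the three parallel lists as CSV rows to a text file object."""
    if not (len(country) == len(legal_name) == len(lei)):
        raise ValueError("column lists have different lengths")
    writer = csv.writer(out)
    writer.writerow(["country", "legal_name", "lei"])  # header row
    writer.writerows(zip(country, legal_name, lei))    # one row per record


# Placeholder values standing in for the lists filled by the scraping loop.
buffer = io.StringIO()
rows_to_csv(["DE"], ["Example GmbH"], ["EXAMPLELEI0000000000"], buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with `open("lei.csv", "w", newline="", encoding="utf-8")`; `newline=""` is what the `csv` module documentation recommends so that line endings are not doubled on Windows.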