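I am trying to collect data by scraping several web pages. The problem is that I want to transpose the columns into rows so that the scraped data ends up as a DataFrame with one row per page.
I looked at this question and applied it to my Python code, but it does not work properly.
Here is my code: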
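import time
import numpy as np
import pandas as pd
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# `browser` is an already-created Selenium WebDriver (its creation is not shown here).
browser.get('https://fortune.com/global500/2019/walmart')
data = []
i = 1
while True:
    table = browser.find_element_by_css_selector('tbody')
    if i > 2:  # stop after two pages
        break
    try:
        print("Scraping Page no. " + str(i))
        i = i + 1
        for row in table.find_elements_by_css_selector('tr'):
            # Each <tr> yields a one-element list, and it is appended as its own row.
            cols = [cell.text for cell in row.find_elements_by_css_selector('td.dataTable__value--3n5tL.dataTable__valueAlignLeft--3uvNx')]
            colsT = data.append(np.array(cols).T.tolist())
        try:
            # Click the pagination arrow to move to the next company page.
            WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a > span.singlePagination__icon--2KbZn"))).click()
            time.sleep(3)
        except TimeoutException:
            break
    except Exception as e:
        print(e)
        break
data1 = pd.DataFrame(data)
print(data1)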
Here is the output of the code I ran:
Scraping Page no. 1
Scraping Page no. 2
0
0 C. Douglas McMillon
1 Retailing
2 General Merchandisers
3 Bentonville, Ark.
4 -
5 25
6 2,200,000
7 Dai Houliang
8 Energy
9 Petroleum Refining
10 Beijing
11 -
12 21
13 619,151
This is what I want it to look like:
0 C. Douglas McMillon Retailing General Merchandisers Bentonville, Ark. - ...
1 Dai Houliang Energy Petroleum Refining Beijing - ...
Any suggestions or corrections would be greatly appreciated.
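
One possible way to get from the single-column output to the desired layout is sketched below. It is only an illustration under an assumption: that every page contributes the same number of fields (7 in the output above). The helper name `to_records` and the constant `FIELDS_PER_PAGE` are made up for this example, and the sample values are copied from the output shown in the question rather than scraped live.

```python
# Flat result as produced by the loop in the question: one single-cell list
# per <tr>, 7 rows per company page (values copied from the output above).
scraped = [
    ["C. Douglas McMillon"], ["Retailing"], ["General Merchandisers"],
    ["Bentonville, Ark."], ["-"], ["25"], ["2,200,000"],
    ["Dai Houliang"], ["Energy"], ["Petroleum Refining"],
    ["Beijing"], ["-"], ["21"], ["619,151"],
]

FIELDS_PER_PAGE = 7  # assumption: every page lists the same 7 fields


def to_records(rows, width):
    """Flatten single-cell rows and regroup them into records of `width` values."""
    flat = [value for row in rows for value in row]
    if len(flat) % width != 0:
        raise ValueError("number of values is not a multiple of the record width")
    return [flat[start:start + width] for start in range(0, len(flat), width)]


records = to_records(scraped, FIELDS_PER_PAGE)
for record in records:
    print(record)
```

Passing `records` to `pd.DataFrame(records)` should then give one row per company. Inside the scraping loop itself, the same effect can probably be had by building a single list of cell texts for the whole page and calling `data.append(...)` once per page, after the `for` loop, instead of once per `<tr>`.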