Python web scraper & missing output data

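I'm trying to scrape reviews from a website and store them in a CSV file using Python (3.7) and BeautifulSoup. The scraping appears to succeed, but when I write to the file, only one column contains the full data; the others contain only the first character.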


Any tips would be much appreciated, and apologies if it's something obvious. This is a new hobby :)


from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

#URL to scrape
my_url = "https://www.indeed.com/cmp/Capital-One/reviews?fcountry=ALL&lang="

#open connection, grab page
uClient = uReq(my_url)
page_html = uClient

#html parsing
page_soup = soup(page_html, "lxml")

#grab all reviews on page
containers = page_soup.findAll("div",{"cmp-review-container"})
uClient.close()

#write to csv
filename = "indeedreviewtest.csv"
f = open(filename, "w")

headers = "review_id, review_score, role, review_text\n"

f.write(headers)

#loop through each review, collect review ID, rating, role & verbatim
for container in containers:
    reviewid_container = container.div["data-tn-entityid"]
    reviewid = reviewid_container[0]
    score_container = container.div.div.div.meta["content"]
    reviewscore = score_container[0]
    role_container = container.find("span", attrs={"class":"cmp-reviewer-job-title"}).text
    reviewerrole = role_container[0]
    reviewtext_container = container.find("span", attrs={"class":"cmp-review-text"}).text
    reviewtext = reviewtext_container

    f.write(reviewid + "," + reviewscore + "," + reviewerrole.replace(",", "|") + "," + reviewtext.replace(",", "|") + "\n")

f.close()

Thanks!
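
A note on the likely cause, offered as a minimal sketch rather than a definitive fix: in BeautifulSoup, an attribute lookup such as container.div["data-tn-entityid"] and the .text property both return plain strings, so indexing them with [0] keeps only the first character. That would match the symptom described, since review_text is the only field written without [0]. The sample values below are hypothetical stand-ins for scraped data, write_reviews is an illustrative helper name, and io.StringIO stands in for the real output file. The sketch also swaps the manual string concatenation for the standard csv module, which quotes embedded commas so the replace(",", "|") workaround is not needed.

```python
import csv
import io

# Hypothetical values standing in for what the scraper extracts.
# tag["some-attribute"] and tag.text are both ordinary strings.
reviewid_container = "1c2d3e4f"
score_container = "5.0"
role_container = "Customer Service Representative"
reviewtext_container = "Great place to work, good benefits"

# Indexing a string with [0] yields only its first character,
# which is why three of the four columns were truncated.
first_char = reviewid_container[0]


def write_reviews(rows, out):
    """Write review rows as CSV to a file-like object.

    csv.writer quotes any field containing a comma, so the text
    can be stored unchanged.
    """
    writer = csv.writer(out)
    writer.writerow(["review_id", "review_score", "role", "review_text"])
    writer.writerows(rows)


# Use the full strings (no [0]) and strip surrounding whitespace.
row = (
    reviewid_container,
    score_container,
    role_container.strip(),
    reviewtext_container.strip(),
)

buffer = io.StringIO()  # stands in for the real CSV file
write_reviews([row], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

When writing to a real file instead of the in-memory buffer, the csv documentation recommends opening it with newline="", for example open(filename, "w", newline="", encoding="utf-8"), ideally inside a with block so the file is closed automatically.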


天涯尽头无女友