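I am scraping data with Beautiful Soup to build a DataFrame, but I have two questions.
Why does the for loop run twice?
How do I remove the brackets from the values in the DataFrame?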
import urllib.request as req
from bs4 import BeautifulSoup
import bs4
import requests
import pandas as pd
url = "https://finance.yahoo.com/quote/BF-B/profile?p=BF-B"
root = requests.get(url)
soup = BeautifulSoup(root.text, 'html.parser')
records = []
for result in soup:
    name = soup.find_all('h1', attrs={'D(ib) Fz(18px)'})
    website = soup.find_all('a')[44]
    sector = soup.find_all('span')[35]
    industry = soup.find_all('span')[37]
    records.append((name, website, sector, industry))
df = pd.DataFrame(records, columns=['name', 'website', 'sector', 'industry'])
df.head()
The result looks like this: [output image not included]
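
Both symptoms can be reproduced offline. The sketch below is a minimal illustration, not the actual Yahoo page: the HTML string, the company name, and the URL in it are hypothetical stand-ins, and the span positions are simplified to 0 and 1. It shows that iterating over a BeautifulSoup object walks its top-level nodes, so if the downloaded page has two of them (for example a doctype declaration and the html element), the loop body runs twice. It also shows that find_all() returns a list-like ResultSet, which is one likely source of the brackets, whereas find() followed by get_text() yields a plain string.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for the downloaded page; the real markup differs.
HTML = (
    '<!DOCTYPE html>'
    '<html><body>'
    '<h1 class="D(ib) Fz(18px)">Example Corp (EX)</h1>'
    '<a href="https://example.com">https://example.com</a>'
    '<span>Example Sector</span>'
    '<span>Example Industry</span>'
    '</body></html>'
)

soup = BeautifulSoup(HTML, "html.parser")

# Iterating over the soup walks its top-level nodes (soup.contents),
# so a loop body runs once per node, not once per company.
top_level_nodes = list(soup)

# find_all() returns a list-like ResultSet, even for a single match.
name_list = soup.find_all("h1", class_="D(ib) Fz(18px)")

# find() returns a single Tag; get_text() extracts its text as a str.
name = soup.find("h1", class_="D(ib) Fz(18px)").get_text(strip=True)
website = soup.find("a")["href"]
spans = soup.find_all("span")
sector = spans[0].get_text(strip=True)
industry = spans[1].get_text(strip=True)

# One record, built once, with no surrounding loop.
records = [(name, website, sector, industry)]
df = pd.DataFrame(records, columns=["name", "website", "sector", "industry"])

print(len(top_level_nodes), "top-level nodes")
print(df)
```

Dropping the loop and storing strings instead of Tag or ResultSet objects gives one row of plain text per company. On the real page the indices 44, 35, and 37 would still apply, although positional indexing is fragile if the page layout changes.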