Downloading a CSV file from bseindia with Python

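I want to download Results.csv from "https://www.bseindia.com/corporates/Forth_Results.aspx". Basically, I want to get the data as a DataFrame. I downloaded the file with the code below, but it contains the wrong data.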


import requests
import pandas as pd

bse_url = 'https://www.bseindia.com/corporates/Forth_Results.aspx'
r = requests.get(bse_url)
file_name = 'Results.csv'

# write the response body to disk chunk by chunk
with open(file_name, 'wb') as f:
    for chunk in r.iter_content():
        f.write(chunk)
        f.flush()
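
One likely reason for the wrong data: the URL above is the results page itself, so `requests.get` probably saves the page's HTML rather than a CSV. A minimal sketch for checking what was actually saved follows; `looks_like_html` is a hypothetical helper (not part of requests or pandas), and the two sample byte strings are illustrative, not real responses from the site.

```python
def looks_like_html(payload: bytes) -> bool:
    """Return True when the payload starts like an HTML document rather than CSV."""
    # Skip a UTF-8 byte-order mark and leading whitespace before inspecting the start.
    head = payload.lstrip(b"\xef\xbb\xbf \t\r\n")[:200].lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


# Illustrative inputs only (assumptions, not captured from bseindia.com).
html_sample = b"\r\n<!DOCTYPE html>\r\n<html><head><title>Results</title></head></html>"
csv_sample = b"Security Code,Security Name,Company name,Result Date\n500875,ITC,ITC LTD.,24 Jul 2020\n"

print(looks_like_html(html_sample))
print(looks_like_html(csv_sample))
```

With the code in the question, the same check could be run on `r.content` before writing the file.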


扬帆大鱼
2 answers

湖上湖

You can do this with the help of Selenium. Follow these steps:

Step 1: To download the web driver for Chrome, first check your Chrome version: browser menu (three vertical dots) -> Help -> About Google Chrome.
Step 2: Download the driver that matches your Chrome version (mine is 81.0.4044.138).
Step 3: After downloading, unzip the file and put chromedriver.exe in the same directory as your script.
Step 4: pip install selenium

Now use the code below:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import os
import time
import pandas as pd

# your website url
site = 'https://www.bseindia.com/corporates/Forth_Results.aspx'
# your driver path (Selenium 4 style: the path is passed through a Service object)
driver = webdriver.Chrome(service=Service(executable_path='chromedriver.exe'))
# passing website url
driver.get(site)
# wait until the whole site loads
time.sleep(5)
# click the download icon using xpath
driver.find_element(By.XPATH, "/html/body/div[1]/form/div[4]/div/div[2]/div/div/div[2]/a/i").click()
# closing browser
driver.close()
# reading Results.csv from the default download directory
df = pd.read_csv("c:/users/viupadhy/downloads/Results.csv")
df

Output:

    Security Code  Security Name  Company name                               Result Date
0   542579         AGOL           Ashapuri Gold Ornament Ltd                 24 Jul 2020
1   500425         AMBUJACEM      AMBUJA CEMENTS LTD.                        24 Jul 2020
2   531223         ANJANI         ANJANI SYNTHETICS LTD.-$                   24 Jul 2020
3   500820         ASIANPAINT     ASIAN PAINTS LTD.                          24 Jul 2020
4   500027         ATUL           ATUL LTD.                                  24 Jul 2020
5   512063         AYOME          AYOKI MERCANTILE LTD.                      24 Jul 2020
6   517246         BCCFUBA        BCC FUBA INDIA LTD.                        24 Jul 2020
7   540700         BRNL           Bharat Road Network Ltd                    24 Jul 2020
8   519600         CCL            CCL PRODUCTS (INDIA) LTD.                  24 Jul 2020
9   531621         CENTERAC       CENTERAC TECHNOLOGIES LTD.                 24 Jul 2020
10  539991         CFEL           Confidence Futuristic Energetech Ltd       24 Jul 2020
11  500110         CHENNPETRO     CHENNAI PETROLEUM CORPORATION LTD.         24 Jul 2020
12  534691         COMCL          COMFORT COMMOTRADE LTD.                    24 Jul 2020
13  531216         COMFINTE       COMFORT INTECH LTD.-$                      24 Jul 2020
14  526829         CONFIPET       CONFIDENCE PETROLEUM INDIA LTD.            24 Jul 2020
15  506395         COROMANDEL     COROMANDEL INTERNATIONAL LTD.              24 Jul 2020
16  539876         CROMPTON       Crompton Greaves Consumer Electricals Ltd  24 Jul 2020
17  526269         CRSTCHM        CRESTCHEM LTD.                             24 Jul 2020
18  541546         GAYAHWS        Gayatri Highways Ltd                       24 Jul 2020
19  500171         GHCL           GHCL LTD.                                  24 Jul 2020
20  524590         HEMORGANIC     Hemo Organic Limited                       24 Jul 2020
21  505725         HINDEVER       HINDUSTAN EVEREST TOOLS LTD.               24 Jul 2020
22  501295         IITL           INDUSTRIAL INVESTMENT TRUST LTD.           24 Jul 2020
23  513295         IMEC           Imec Services Ltd                          24 Jul 2020
24  541300         INDINFR        IndInfravit Trust                          24 Jul 2020
25  500875         ITC            ITC LTD.                                   24 Jul 2020
26  509715         JAYSHREETEA    JAY SHREE TEA & INDUSTRIES LTD.            24 Jul 2020
27  500228         JSWSTEEL       JSW STEEL LTD.                             24 Jul 2020
28  506184         KANANIIND      KANANI INDUSTRIES LTD.                     24 Jul 2020
29  512036         KAPILCO        KAPIL COTEX LTD.                           24 Jul 2020
... ...            ...            ...                                        ...
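
A fixed `time.sleep` can be too short for a slow download, and Chrome typically writes to a temporary `.crdownload` file until the download completes. A minimal sketch of polling for the finished file instead follows; `wait_for_file` is a hypothetical helper (not a Selenium or pandas function), and the default timeout and poll interval are arbitrary assumptions.

```python
import time
from pathlib import Path


def wait_for_file(path, timeout=30.0, poll=0.5):
    """Poll until `path` exists and is non-empty; return True on success, False on timeout."""
    target = Path(path)
    deadline = time.monotonic() + timeout
    while True:
        # The final file name only appears once the browser has finished writing it.
        if target.exists() and target.stat().st_size > 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

In the script above it could be called between the click and `driver.close()`, for example `wait_for_file("c:/users/viupadhy/downloads/Results.csv")`, reading the CSV only if it returns True.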

ABOUTYOU

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import os
import time
import pandas as pd
# PATH CHECK
import pathlib

filename = 'C:/Users/Administrator/Downloads/Results.csv'
file = pathlib.Path(filename)

while True:  # keep retrying until the file has been downloaded
    # remove any stale copy so that file.exists() reflects this attempt
    if file.exists():
        os.remove(filename)
    # your website url
    site = 'https://www.bseindia.com/corporates/Forth_Results.aspx'
    # your driver path (Selenium 4 style: the path is passed through a Service object)
    driver = webdriver.Chrome(service=Service(executable_path='chromedriver.exe'))
    # passing website url
    driver.get(site)
    time.sleep(10)
    # wait until the download link is present in the page
    wait = WebDriverWait(driver, 20)
    wait.until(EC.presence_of_element_located((By.ID, 'ContentPlaceHolder1_lnkDownload')))
    # click the download icon using xpath
    el = driver.find_element(By.XPATH, "/html/body/div[1]/form/div[4]/div/div[2]/div/div/div[2]/a/i")
    el.click()
    # give the download time to finish before closing the browser
    time.sleep(20)
    driver.close()
    if file.exists():
        break

df = pd.read_csv(filename)
print(df)
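
The loop above retries forever if the download never succeeds. A minimal sketch of a bounded retry follows, assuming the download step and the file check are wrapped in callables; `retry_until`, `fake_download`, and `file_present` are hypothetical names used only for illustration, and the stand-in functions do not touch the browser or the file system.

```python
def retry_until(action, succeeded, max_attempts=3):
    """Run `action` until `succeeded()` is true; return the attempts used or raise."""
    for attempt in range(1, max_attempts + 1):
        action()
        if succeeded():
            return attempt
    raise RuntimeError(f"gave up after {max_attempts} attempts")


# Stand-ins for the real steps: the "download" only succeeds on the second call.
calls = []


def fake_download():
    calls.append(1)


def file_present():
    return len(calls) >= 2


print(retry_until(fake_download, file_present))
```

In the real script, `action` would open the browser and click the download link, and `succeeded` would be something like `file.exists`.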