I'm new to web scraping and just can't find a solution to my problem. I'm stuck at the login page.
import requests

POST_LOGIN_URL = 'https://ocjene.skole.hr/pocetna/prijava'  # Login page
REQUEST_URL = 'https://ocjene.skole.hr/pregled/predmeti'  # Goal page for scraping

with requests.Session() as session:
    session.get(POST_LOGIN_URL)  # Loading all cookies...
    login_page = session.get(POST_LOGIN_URL)  # Login page content (for comparison)
    token = session.cookies["csrf_cookie"]  # This cookie on chrome has a valid csrf token
    payload = {
        'csrf_token': token,
        'user_login': 'xxx',
        'user_password': 'xxx'
    }
    post = session.post(POST_LOGIN_URL, data=payload)  # Logging in...
    afterLogin = session.get(REQUEST_URL)  # This is where I need to get all the content, but...
    print(afterLogin.content)
    print(login_page.content)
    # These two share exact same content, except the csrf token is different
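The observation in that last comment (both responses look the same apart from the token) can be turned into an explicit check. The sketch below is only a heuristic, and the names `PasswordFieldFinder` and `looks_like_login_page` are made up for illustration: it assumes that a page which still contains a password input is the login form, which may not hold for every site.

```python
from html.parser import HTMLParser


class PasswordFieldFinder(HTMLParser):
    """Sets `found` to True when the page contains an <input type="password">."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a value can be None
        if tag == "input" and (dict(attrs).get("type") or "").lower() == "password":
            self.found = True


def looks_like_login_page(html):
    """Heuristic (assumption): a page that still shows a password field is the login form."""
    finder = PasswordFieldFinder()
    finder.feed(html)
    return finder.found
```

With the session code above, `looks_like_login_page(afterLogin.text)` returning `True` would suggest the server sent the login form again. Printing `post.status_code`, `post.url` and `post.history` (all standard `requests.Response` attributes) should also show whether the POST was redirected and where it ended up.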
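I'm not sure whether the login succeeded. I double-checked everything, the form data is correct, and I also tried replacing the request headers, like this: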
post = session.post(POST_LOGIN_URL, data=payload, headers=headers)
What am I missing? Thanks.
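One possibility worth ruling out is that the form expects a token from a hidden input in the login page HTML and not the cookie value. Whether this site works that way is an assumption, not something confirmed in the post. A minimal sketch of collecting every hidden field from the login page and merging it with the credentials follows; `HiddenInputCollector`, `build_login_payload` and the sample HTML are hypothetical, while the field names `csrf_token`, `user_login` and `user_password` are taken from the payload in the question.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of all <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_login_payload(login_html, username, password):
    """Start from the hidden fields the server sent, then add the credentials."""
    collector = HiddenInputCollector()
    collector.feed(login_html)
    payload = dict(collector.fields)
    payload["user_login"] = username
    payload["user_password"] = password
    return payload


# Hypothetical login form, only to show the shape of the result.
sample_html = (
    '<form method="post">'
    '<input type="hidden" name="csrf_token" value="abc123">'
    '<input type="text" name="user_login">'
    '<input type="password" name="user_password">'
    '</form>'
)
print(build_login_payload(sample_html, "xxx", "xxx"))
```

In the session code this would be called as `build_login_payload(login_page.text, 'xxx', 'xxx')`, and the result passed as `data=` to `session.post`. Comparing the resulting keys with the form data shown in the browser's network tab should reveal any field the hand-written payload leaves out.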
米琪卡哇伊