"java.lang.String org.jsoup.nodes.Element." on a null object reference

I have been trying to fetch a string from a website using Java. Here is my code:


protected String doInBackground(String... urls) {
    try {
        gotten_next_date = Jsoup.connect("https://www.vividseats.com/nba-basketball/toronto-raptors-schedule.html")
                .get().getElementsByClass("productionsDate").first().text();
        full_next = gotten_next_date;

        return full_next;
    } catch (IOException e) {
        return "Unable to retrieve data. URL may be invalid.";
    }
}

I wrote this yesterday and it worked fine, but when I tried it again today, for some reason it gave me this error:


java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.String org.jsoup.nodes.Element.text()' on a null object reference

I don't understand why this is happening. Can anyone help?


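Edit: I believe the error is not caused by how the variable is created, but by the element not being received from the website. I think this question was incorrectly marked as a duplicate.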


MM们
1 Answer

蝴蝶刀刀

What you did should work fine. I ran it once and it worked, but then it stopped working. The problem is that the website has an anti-scraping mechanism that blocks you if you make too many requests to their site.

What I suggest you do:

1. Add userAgent() to identify yourself as a bot/scraper.
2. Read their terms of service to check whether you are allowed to scrape their site.
3. Send them an email explaining what you intend to do and asking whether you may scrape certain parts of their site.

By the way, if you want to debug what is happening, I just changed the Jsoup call to:

String gotten_next_date =
        Jsoup.connect("https://www.vividseats.com/nba-basketball/toronto-raptors-schedule.html").get().html();

This returns the HTML of the requested page and, if you look at it, there is nothing interesting in it:

<!doctype html>
<html>
 <head>
  <meta NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
  <meta http-equiv="cache-control" content="max-age=0">
  <meta http-equiv="cache-control" content="no-cache">
  <meta http-equiv="expires" content="0">
  <meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT">
  <meta http-equiv="pragma" content="no-cache">
  <meta http-equiv="refresh" content="10; url=/distil_r_captcha.html?requestId=291c6193-eb12-4e96-b1cd-23ba9a75e659&amp;httpReferrer=%2Fnba-basketball%2Ftoronto-raptors-schedule.html">
  <script type="text/javascript">
    (function(window){
        try {
            if (typeof sessionStorage !== 'undefined'){
                sessionStorage.setItem('distil_referrer', document.referrer);
            }
        } catch (e){}
    })(window);
  </script>
  <script type="text/javascript" src="/vvdstsdstl.js" defer></script>
  <style type="text/css">#d__fFH{position:absolute;top:-5000px;left:-5000px}#d__fF{font-family:serif;font-size:200px;visibility:hidden}#twsyxyabbqdwrxzyzxesxywvwuzbszeeacwd{display:none!important}</style>
  <script>var w=window;if(w.performance||w.mozPerformance||w.msPerformance||w.webkitPerformance){var d=document;AKSB=w.AKSB||{},AKSB.q=AKSB.q||[],AKSB.mark=AKSB.mark||function(e,_){AKSB.q.push(["mark",e,_||(new Date).getTime()])},AKSB.measure=AKSB.measure||function(e,_,t){AKSB.q.push(["measure",e,_,t||(new Date).getTime()])},AKSB.done=AKSB.done||function(e){AKSB.q.push(["done",e])},AKSB.mark("firstbyte",(new Date).getTime()),AKSB.prof={custid:"632139",ustr:"",originlat:"0",clientrtt:"124",ghostip:"72.247.179.76",ipv6:false,pct:"10",clientip:"79.119.120.57",requestid:"418cf776",region:"26128",protocol:"",blver:14,akM:"b",akN:"ae",akTT:"O",akTX:"1",akTI:"418cf776",ai:"275708",ra:"false",pmgn:"",pmgi:"",pmp:"",qc:""},function(e){var _=d.createElement("script");_.async="async",_.src=e;var t=d.getElementsByTagName("script"),t=t[t.length-1];t.parentNode.insertBefore(_,t)}(("https:"===d.location.protocol?"https:":"http:")+"//ds-aksb-a.akamaihd.net/aksb.min.js")}</script>
 </head>
 <body>
  <div id="distilIdentificationBlock">
  </div>
 </body>

Update (from zack6849): If you look closely inside the head tag, the last meta tag indicates that you are being redirected to a captcha page:

<meta http-equiv="refresh" content="10; url=/distil_r_captcha.html?requestId=291c6193-eb12-4e96-b1cd-23ba9a75e659&amp;httpReferrer=%2Fnba-basketball%2Ftoronto-raptors-schedule.html">

If you also search for distilIdentificationBlock, which appears in the HTML, you can see that it is related to crawlers being blocked.

Hope this helps you better understand what is going on.
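
The two signals described above (a meta refresh pointing at a captcha page, and the distilIdentificationBlock element) can be checked in code before trying to read productionsDate. The following is only a minimal sketch under stated assumptions: the class BlockPageCheck and the methods refreshTarget and looksBlocked are hypothetical names, not part of Jsoup, and the check uses plain string and regex matching on the HTML text so that it runs without any library. It is a heuristic based on the single response shown above, and the site may well return different markup.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup: inspects raw HTML text for the
// two block-page signals seen in the response above.
class BlockPageCheck {

    // Captures the target URL of <meta http-equiv="refresh" content="N; url=...">.
    // Assumption: attributes appear in this order and use double quotes,
    // as they do in the HTML shown above.
    private static final Pattern META_REFRESH = Pattern.compile(
            "<meta\\s+http-equiv=\"refresh\"\\s+content=\"\\s*\\d+\\s*;\\s*url=([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    /** Returns the meta-refresh target URL, if the HTML contains one. */
    public static Optional<String> refreshTarget(String html) {
        Matcher m = META_REFRESH.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Heuristic: true if the HTML looks like the block page shown above. */
    public static boolean looksBlocked(String html) {
        boolean hasMarker = html.contains("id=\"distilIdentificationBlock\"");
        boolean redirectsToCaptcha = refreshTarget(html)
                .map(target -> target.contains("captcha"))
                .orElse(false);
        return hasMarker || redirectsToCaptcha;
    }

    public static void main(String[] args) {
        // Shortened, made-up inputs for illustration only.
        String blocked = "<html><head>"
                + "<meta http-equiv=\"refresh\" content=\"10; url=/distil_r_captcha.html?requestId=abc\">"
                + "</head><body><div id=\"distilIdentificationBlock\"></div></body></html>";
        String normal = "<html><body><div class=\"productionsDate\">some date</div></body></html>";

        System.out.println(looksBlocked(blocked));
        System.out.println(looksBlocked(normal));
        System.out.println(refreshTarget(blocked).orElse("(none)"));
    }
}
```

In the code from the question, one way to use this would be to pass the string returned by .get().html() to looksBlocked first, and to call getElementsByClass("productionsDate").first() only when the page is not blocked. Since first() returns null when nothing matches, checking its result for null before calling text() would avoid the NullPointerException either way.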
