Trying to grab certain elements

I'm new to the lxml module in Python. I'm trying to parse data from this website: https://weather.com/weather/tenday/l/USCA1037:1:US

I'm trying to get the following:


<span classname="narrative" class="narrative">

  Cloudy. Low 49F. Winds WNW at 10 to 20 mph.

</span>

However, I've gotten my XPath mixed up.

To be precise, the location of this line is


//*[@id="twc-scrollabe"]/table/tbody/tr[4]/td[2]/span

This is what I tried:


import requests
import lxml.html
from lxml import etree

html = requests.get("https://weather.com/weather/tenday/l/USCA1037:1:US")

# html.content is bytes; fromstring() returns an HtmlElement rooted at <html>
element_object = lxml.html.fromstring(html.content)

table = element_object.xpath('//div[@class="twc-table-scroller"]')[0]
day_of_week = table.xpath('.//span[@class="date-time"]/text()')  # list of text items from the "date-time" spans
dates = table.xpath('.//span[@class="day-detail clearfix"]/text()')

td = table.xpath('.//tbody/tr/td/span[contains(@class, "narrative")]')
print(td)

# print(td) displays an empty list.

I'd like my program to also parse "Cloudy. Low 49F. Winds WNW at 10 to 20 mph."


慕虎7371278
2 answers

交互式爱情

Some <td> elements carry the description in their title= attribute.

import requests
import lxml.html

html = requests.get("https://weather.com/weather/tenday/l/USCA1037:1:US")
element_object = lxml.html.fromstring(html.content)

table = element_object.xpath('//div[@class="twc-table-scroller"]')[0]
td = table.xpath('.//tr/td[@class="twc-sticky-col"]/@title')
print(td)

Result

['Mostly cloudy skies early, then partly cloudy after midnight. Low 48F. Winds SSW at 5 to 10 mph.',
 'Mainly sunny. High 66F. Winds WNW at 5 to 10 mph.',
 'Sunny. High 71F. Winds NW at 5 to 10 mph.',
 'A mainly sunny sky. High 69F. Winds W at 5 to 10 mph.',
 'Some clouds in the morning will give way to mainly sunny skies for the afternoon. High 67F. Winds WSW at 5 to 10 mph.',
 'Considerable clouds early. Some decrease in clouds later in the day. High 67F. Winds WSW at 5 to 10 mph.',
 'Partly cloudy. High near 65F. Winds WSW at 5 to 10 mph.',
 'Cloudy skies early, then partly cloudy in the afternoon. High 61F. Winds WSW at 10 to 20 mph.',
 'Sunny skies. High 62F. Winds WNW at 10 to 20 mph.',
 'Mainly sunny. High 61F. Winds WNW at 10 to 20 mph.',
 'Sunny along with a few clouds. High 64F. Winds WNW at 10 to 15 mph.',
 'Mostly sunny skies. High around 65F. Winds WNW at 10 to 15 mph.',
 'Mostly sunny skies. High 66F. Winds WNW at 10 to 20 mph.',
 'Mainly sunny. High around 65F. Winds WNW at 10 to 20 mph.',
 'A mainly sunny sky. High around 65F. Winds WNW at 10 to 20 mph.']

There is no <tbody> in the HTML, but a web browser may show one in DevTools, so don't use tbody in the XPath.

Some of the text is in <span></span>, but some is in <span><span></span></span>.

import requests
import lxml.html

html = requests.get("https://weather.com/weather/tenday/l/USCA1037:1:US")
element_object = lxml.html.fromstring(html.content)

table = element_object.xpath('//div[@class="twc-table-scroller"]')[0]
td = table.xpath('.//tr/td//span/text()')
print(td)

Result

['Tonight', 'APR 21', 'Partly Cloudy', '--', '48', '10', '%', 'SSW 7 mph ', '85', '%',
 'Mon', 'APR 22', 'Sunny', '66', '51', '10', '%', 'WNW 9 mph ', '67', '%',
 'Tue', 'APR 23', 'Sunny', '71', '53', '0', '%', 'NW 8 mph ', '59', '%',
 'Wed', 'APR 24', 'Sunny', '69', '52', '10', '%', 'W 9 mph ', '71', '%',
 'Thu', 'APR 25', 'Partly Cloudy', '67', '51', '10', '%', 'WSW 9 mph ', '71', '%',
 'Fri', 'APR 26', 'Partly Cloudy', '67', '51', '10', '%', 'WSW 9 mph ', '69', '%',
 'Sat', 'APR 27', 'Partly Cloudy', '65', '50', '10', '%', 'WSW 9 mph ', '71', '%',
 'Sun', 'APR 28', 'AM Clouds/PM Sun', '61', '49', '20', '%', 'WSW 13 mph ', '75', '%',
 'Mon', 'APR 29', 'Sunny', '62', '48', '10', '%', 'WNW 14 mph ', '63', '%',
 'Tue', 'APR 30', 'Sunny', '61', '49', '0', '%', 'WNW 14 mph ', '61', '%',
 'Wed', 'MAY 1', 'Mostly Sunny', '64', '50', '0', '%', 'WNW 12 mph ', '60', '%',
 'Thu', 'MAY 2', 'Mostly Sunny', '65', '50', '0', '%', 'WNW 12 mph ', '61', '%',
 'Fri', 'MAY 3', 'Mostly Sunny', '66', '51', '0', '%', 'WNW 13 mph ', '61', '%',
 'Sat', 'MAY 4', 'Sunny', '65', '51', '0', '%', 'WNW 14 mph ', '62', '%',
 'Sun', 'MAY 5', 'Sunny', '65', '51', '0', '%', 'WNW 14 mph ', '63', '%']
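The flat list from `.//tr/td//span/text()` loses track of which value belongs to which row. One possible way around that is to walk the rows first and collect the span text per cell. This is only a sketch: the SAMPLE markup below is made up for illustration (it is not the real weather.com page), and it only assumes that text can sit at different nesting depths inside the spans, as described in this answer.

```python
import lxml.html

# Hypothetical markup: two rows shaped like the table described above,
# with text sitting at different nesting depths inside <span> elements.
SAMPLE = """
<div class="twc-table-scroller">
  <table>
    <tr>
      <td><span>Mon</span><span>APR 22</span></td>
      <td><span>Sunny</span></td>
      <td><span><span>66</span></span></td>
    </tr>
    <tr>
      <td><span>Tue</span><span>APR 23</span></td>
      <td><span>Sunny</span></td>
      <td><span><span>71</span></span></td>
    </tr>
  </table>
</div>
"""

root = lxml.html.fromstring(SAMPLE)

rows = []
for tr in root.xpath('.//tr'):  # no tbody in the path
    # For each cell, collect the non-empty text nodes of spans at any depth.
    cells = [
        [t.strip() for t in td.xpath('.//span/text()') if t.strip()]
        for td in tr.xpath('./td')
    ]
    rows.append(cells)

for row in rows:
    print(row)
```

Each entry in `rows` then corresponds to one table row, so values no longer have to be matched up by counting positions in one long list.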

繁星coding

If you want to grab text like "Sunny. High 66F. Winds WNW at 5 to 10 mph.", you can get it from the title attribute of the <td>. This should work:

td = table.xpath('.//tbody/tr/td[@class="description"]/@title')
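Whether the path above matches depends on whether the HTML that lxml receives actually contains a <tbody>. Here is a small sketch comparing a path that names tbody with one that doesn't, on the same row with and without that element. The markup and the STRICT / LENIENT names are made up for illustration; the `description` class is taken from this answer and hasn't been checked against the live page.

```python
import lxml.html

ROW = '<tr><td class="description" title="Sunny. High 71F."><span>Sunny</span></td></tr>'

# Hypothetical markup: the same row with and without an explicit <tbody>.
WITHOUT_TBODY = '<div><table>' + ROW + '</table></div>'
WITH_TBODY = '<div><table><tbody>' + ROW + '</tbody></table></div>'

STRICT = './/tbody/tr/td[@class="description"]/@title'   # requires <tbody>
LENIENT = './/tr/td[@class="description"]/@title'        # matches either way

results = {}
for name, markup in (("without_tbody", WITHOUT_TBODY), ("with_tbody", WITH_TBODY)):
    table = lxml.html.fromstring(markup)
    results[name] = (table.xpath(STRICT), table.xpath(LENIENT))
    print(name, results[name])
```

If the tbody version comes back empty on the real page, dropping tbody from the path is probably the safer choice, since `.//tr` finds the rows in both cases.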