-
幕布斯7119047
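First import the package: from lxml import etree. Then parse the page: tree = etree.HTML(detailHtml), where detailHtml is the HTML content of the page. Finally, dataNoteList = tree.xpath(u'//td') returns every td element in the document; td is the tag name.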
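As a minimal sketch of the steps above, assuming lxml is installed: the HTML string below is a made-up stand-in for the real page content in detailHtml, and cell_texts is a hypothetical helper variable added only to show what comes back.

```python
from lxml import etree

# Hypothetical sample page; in practice detailHtml holds the downloaded HTML.
detailHtml = """
<html><body>
<table>
  <tr><td>alpha</td><td>beta</td></tr>
  <tr><td>gamma</td></tr>
</table>
</body></html>
"""

# Parse the HTML string into an element tree.
tree = etree.HTML(detailHtml)

# '//td' matches every td element anywhere in the document.
dataNoteList = tree.xpath('//td')

# Each result is an element; .text holds the text directly inside it.
cell_texts = [td.text for td in dataNoteList]
print(cell_texts)  # ['alpha', 'beta', 'gamma']
```

Note that xpath() returns a list of elements here, so the text has to be read from each element separately.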
-
繁星点点滴滴
XPath = '//*[@id="j-nav-menu-container"]/div/div/div/div/div/div[2]/div[1]/a/@href' gets the href attribute of the a tag. XPath = '//*[@id="j-nav-menu-container"]/div/div/div/div/div/div[2]/div[1]/a/text()' gets the text content of the a tag.
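To illustrate the difference between /@href and /text(), here is a small sketch with lxml. The markup, the id "nav", and the shortened paths are invented for the example and are much simpler than the real page.

```python
from lxml import etree

# Hypothetical, simplified markup; the real page nests far deeper.
html = """
<div id="nav">
  <div><a href="/home">Home</a></div>
  <div><a href="/about">About</a></div>
</div>
"""

tree = etree.HTML(html)

# Ending the path with /@href returns the attribute value as a string.
hrefs = tree.xpath('//*[@id="nav"]/div[2]/a/@href')

# Ending the path with /text() returns the text nodes inside the a tag.
texts = tree.xpath('//*[@id="nav"]/div[2]/a/text()')

print(hrefs)  # ['/about']
print(texts)  # ['About']
```

Both calls return lists, even when only one node matches, so index into the result before using the value.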
-
呼如林
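response.xpath('//h3/a/descendant-or-self::text()[normalize-space()]'). The descendant-or-self axis selects the current node and all of its descendants, so text() here picks up text nested at any depth inside the a tag. The [normalize-space()] predicate drops the text nodes that contain only whitespace.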
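The effect of the predicate can be checked outside Scrapy. The sketch below uses lxml rather than Scrapy's response object (an assumption made to keep the example self-contained), and the markup is invented; the XPath expression itself is the one from the answer.

```python
from lxml import etree

# Hypothetical markup: the link text is split across nested tags
# and surrounded by whitespace-only text nodes.
html = """
<h3><a href="/post">
  <span>Hello</span>
  <b> world</b>
</a></h3>
"""

tree = etree.HTML(html)

# Without the predicate, whitespace-only text nodes may be included.
all_text = tree.xpath('//h3/a/descendant-or-self::text()')

# normalize-space() yields an empty string for whitespace-only nodes,
# which is false in a predicate, so those nodes are filtered out.
non_blank = tree.xpath('//h3/a/descendant-or-self::text()[normalize-space()]')

print(non_blank)  # ['Hello', ' world']
```

The predicate only filters nodes; it does not trim the ones it keeps, so ' world' still carries its leading space. Call strip() on each result if trimmed strings are needed.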