
AttributeError: 'NoneType' object has no attribute 'get_text'

I have been struggling with this code:


import requests
from bs4 import BeautifulSoup as bs


def MainPageSpider(max_pages):
    page = 1
    while page <= max_pages:
        url = 'url' + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = bs(plain_text, 'html.parser')
        for link in soup.findAll(attrs={'class':'col4'}):
            href = 'url' + link.a['href']
            title = link.span.text

            PostPageItems(href)
        page += 1


def PostPageItems(post_url):
    source_code = requests.get(post_url)
    plain_text = source_code.text
    soup = bs(plain_text, 'html.parser')
    for items in soup.findAll(attrs={'class':'container'}):
        title2 = items.find('h1', {'class':'title'}).get_text()

        print(title2)


MainPageSpider(1)

Every time I try to get the text from the 'h1', I get this error:


Traceback (most recent call last):
  File "Xfeed.py", line 33, in <module>
    MainPageSpider(1)
  File "Xfeed.py", line 17, in MainPageSpider
    PostPageItems(href)
  File "Xfeed.py", line 27, in PostPageItems
    test = title2.get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'

But when I run it without 'get_text()', I get the 'h1' HTML:


<h1 class="title">Title 1</h1>
None
None
None
None
<h1 class="title">Title 2</h1>
None
None
None
None
<h1 class="title">Title 3</h1>
None
None
None
None

I really don't understand why this error occurs, since title = link.span.text gets the text without any problem. I only want the text.


GCT1015
3 Answers

largeQ

Not every container has an h1, so just check whether None was returned and print only when it wasn't.

for items in soup.findAll(attrs={'class':'container'}):
    title2 = items.find('h1', {'class':'title'})
    if title2:
        print(title2.text)
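To see the check work without hitting a real site, here is a minimal sketch that runs against an inline HTML string. SAMPLE_HTML and collect_titles are made-up names for illustration, and the markup is only a guess at what the post pages look like (several 'container' elements, of which only one holds the h1).

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for one post page: only the first container
# holds an <h1 class="title">, the other two do not.
SAMPLE_HTML = """
<div class="container"><h1 class="title">Title 1</h1></div>
<div class="container"><p>sidebar</p></div>
<div class="container"><p>footer</p></div>
"""


def collect_titles(html):
    """Return the text of every h1.title found inside a .container element."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for container in soup.find_all(attrs={"class": "container"}):
        heading = container.find("h1", {"class": "title"})
        # find() returns None when nothing matches, so guard before get_text().
        if heading is not None:
            titles.append(heading.get_text())
    return titles


titles = collect_titles(SAMPLE_HTML)
print(titles)
```

The containers without an h1 are simply skipped, so only the real title text ends up in the list.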

守候你守候我

Judging from the output without get_text(), title2 is often None, and that is what fails with the error you posted, because None has no get_text() attribute. You can split the line into two statements and add a check like this:

title2_item = items.find('h1', {'class':'title'})
if title2_item:  # Check for None
    title2 = title2_item.get_text()
    print(title2)
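If the same lookup is needed in several places, the two-step check could be wrapped in a small helper. This is only a sketch: title_text and its default parameter are hypothetical names, not part of Beautiful Soup, and the inline HTML is invented for the demo.

```python
from bs4 import BeautifulSoup


def title_text(container, default=None):
    """Hypothetical helper: text of the h1.title inside container, or default."""
    heading = container.find("h1", {"class": "title"})
    # Step 1 found nothing: return the fallback instead of calling get_text() on None.
    if heading is None:
        return default
    # Step 2: the tag exists, so extracting its text is safe.
    return heading.get_text(strip=True)


# Made-up markup: one container with a title, one without.
html = (
    '<div class="container"><h1 class="title"> Title 2 </h1></div>'
    '<div class="container"><span>no heading here</span></div>'
)
soup = BeautifulSoup(html, "html.parser")
results = [
    title_text(container, default="(no title)")
    for container in soup.find_all(attrs={"class": "container"})
]
print(results)
```

Returning a default rather than raising keeps the loop going over containers that have no heading, which matches the output shown in the question.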

芜湖不芜

Rewrite it with a CSS selector that selects only the elements matching the criteria:

for item in soup.select('.container h1.title'):
    title2 = item.text
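A small self-contained sketch of the selector approach, assuming a Beautiful Soup version with select() support. The HTML below is invented to show which elements '.container h1.title' picks up and which it leaves out.

```python
from bs4 import BeautifulSoup

# Invented markup covering the cases the selector has to tell apart.
SAMPLE_HTML = """
<div class="container"><h1 class="title">Title 1</h1></div>
<div class="container"><h1>Heading without the title class</h1></div>
<div class="other"><h1 class="title">Outside any container</h1></div>
<div class="container"><div><h1 class="title">Title 2</h1></div></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# '.container h1.title' matches h1 elements with class "title" that are
# descendants (at any depth) of an element with class "container".
selected = [h1.get_text() for h1 in soup.select(".container h1.title")]
print(selected)
```

Because select() only returns elements that actually matched, there is no None to check for inside the loop.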