NameError: name 'item' is not defined

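I took the suggestions and was able to get past the initial error, so thank you all very much :) I am almost where I want to be. It seems I still have a huge knowledge gap when it comes to indentation. You really are the gems of the coding community.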


Here is the current code. It gets past those errors and is now down to a warning, but it is not extracting anything.


import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://dc.urbanturf.com/pipeline'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')

pipeline_items = soup.find_all('div', attrs={'class': 'pipeline-item'})
rows = []
columns = ['Listing Title', 'Listing url', 'listing image url', 'location', 'Project type', 'Status', 'Size']

for item in pipeline_items:
    # title, image url, listing url
    listing_title = item.a['title']
    listing_url = item.a['href']
    listing_image_url = item.a.img['src']

    for p_tag in item.find_all('p'):
        if not p_tag.h2:
            if p_tag.text == 'Location:':
                p_tag.span.extract()
                property_location = p_tag.text.strip()
            elif p_tag.span.text == 'Project type:':
                p_tag.span.extract()
                property_type = p_tag.text.strip()
            elif p_tag.span.text == 'Status:':
                p_tag.span.extract()
                property_status = p_tag.text.strip()
            elif p_tag.span.text == 'Size:':
                p_tag.span.extract()
                property_size = p_tag.text.strip()

    row = [listing_title, listing_url, listing_image_url, property_location, property_type, property_status, property_size]
    rows.append(row)
    df = pd.Dataframe(rows, columns=columns)
    df.to_excel('DC Pipeline Properties.xlsx', index=False)

print('File Saved')

The error I get is shown below. I am using PyCharm 2020.2; maybe that was a bad choice?


row = [listing_title, listing_url, listing_image_url, property_location, property_type, property_status, property_size]
NameError: name 'property_location' is not defined


皈依舞
4 Answers

尚方宝剑之说

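It looks to me like your second for loop, for p_tag in item.find_all('p'):, is not inside the scope of the first for loop, the one that iterates over the items. Add to that the fact that the first loop may have 0 items to iterate over, and you end up with nothing bound to item. Just put that for loop and its contents inside the for loop that iterates over the items in pipeline_items.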

慕容3067478

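The problem is that pipeline_items = soup.find_all('div', attrs={'class': 'pipline-item'}) returns an empty list. As a result, the body of for item in pipeline_items: never actually runs, so item is never assigned a value.

I am not sure exactly what you are trying to do, but I see two solutions:

1. Indent for p_tag in item.find_all('p'): so that it runs once for each item. That way, if there are no items, it is never called (I assume this is what you originally intended?).
2. Add an if statement before the loop that checks whether item exists and skips the loop if it does not. This comes closest to replicating what your code currently does, but I do not think it is what you want it to do.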
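The effect of an empty result list can be checked without any scraping. The following is only a minimal sketch, assuming plain Python lists and dicts stand in for the BeautifulSoup result; `loop_variable_is_bound` and `collect_fields` are hypothetical helper names, not part of the original code.

```python
def loop_variable_is_bound(pipeline_items):
    """Report whether the loop variable exists after the loop has finished."""
    try:
        for item in pipeline_items:
            pass
        last = item  # fails if the loop body never ran
        return True
    except NameError:
        # Inside a function Python raises UnboundLocalError,
        # which is a subclass of NameError.
        return False


def collect_fields(pipeline_items):
    """Solution 1: keep the per-item work nested inside the outer loop."""
    collected = []
    for item in pipeline_items:
        # `item` is only used inside the loop that binds it, so an
        # empty list simply produces an empty result.
        for field in item.get("fields", []):
            collected.append((item["title"], field))
    return collected


print(loop_variable_is_bound([]))
print(collect_fields([{"title": "A", "fields": ["x", "y"]}]))
```

With the inner loop nested, an empty list is harmless: the outer body is skipped entirely and no name is read before it has been assigned.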

哔哔one

Line 17 and below need to be inside the for loop in order to see "item".

for item in pipeline_items:
    # title, image url, listing url
    listing_title = item.a['title']
    listing_url = item.a['href']
    listing_image_url = item.a.img['src']
for p_tag in item.find_all('p'):  # <-- Indent this for loop to be inside the previous for loop.
    if not p_tag.h2:
        if p_tag.text == 'Location:':

qq_花开花谢_0

Mission accomplished. Thanks to everyone here, cheers! I was missing a few things:

1. Getting the indentation right.
2. I missed the span in the first branch: if p_tag.span.text == 'Location:':
3. I was missing a package, openpyxl, which is used at the bottom to write to Excel.

Below is the 100% working code. I promise to get better and to help out when I can :)

import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://dc.urbanturf.com/pipeline'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')

pipeline_items = soup.find_all('div', attrs={'class': 'pipeline-item'})
rows = []
columns = ['listing title', 'listing url', 'listing image url', 'location', 'Project type', 'Status', 'Size']

for item in pipeline_items:
    # title, image url, listing url
    listing_title = item.a['title']
    listing_url = item.a['href']
    listing_image_url = item.a.img['src']

    for p_tag in item.find_all('p'):
        if not p_tag.h2:
            # Compare the label inside the <span>, then remove the span
            # so that only the value is left in the <p> text.
            if p_tag.span.text == 'Location:':
                p_tag.span.extract()
                property_location = p_tag.text.strip()
            elif p_tag.span.text == 'Project type:':
                p_tag.span.extract()
                property_type = p_tag.text.strip()
            elif p_tag.span.text == 'Status:':
                p_tag.span.extract()
                property_status = p_tag.text.strip()
            elif p_tag.span.text == 'Size:':
                p_tag.span.extract()
                property_size = p_tag.text.strip()

    row = [listing_title, listing_url, listing_image_url, property_location, property_type, property_status, property_size]
    rows.append(row)

# Build the table and write the file once, after all items are collected.
df = pd.DataFrame(rows, columns=columns)
df.to_excel('DC Pipeline Properties.xlsx', index=False)
print('File Saved')
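One possible hardening, offered only as a sketch: the code above still relies on every listing containing all four labels. If one is missing, the property_* name is either undefined (on the first item) or carried over from the previous listing. The example below assumes the label/value pairs have already been pulled out of the p and span tags; `FIELD_LABELS` and `build_row` are hypothetical names, not part of the original code.

```python
# Maps the label text found in each <span> to a key in the output row.
FIELD_LABELS = {
    "Location:": "location",
    "Project type:": "project_type",
    "Status:": "status",
    "Size:": "size",
}


def build_row(title, url, image_url, labelled_values):
    """Build one output row; labels that never appear stay None."""
    # Fresh defaults for every listing, so nothing leaks from the previous one.
    fields = dict.fromkeys(FIELD_LABELS.values())
    for label, value in labelled_values:
        key = FIELD_LABELS.get(label.strip())
        if key is not None:
            fields[key] = value.strip()
    return [
        title,
        url,
        image_url,
        fields["location"],
        fields["project_type"],
        fields["status"],
        fields["size"],
    ]


print(build_row("A", "/a", "a.jpg", [("Status:", " Planned ")]))
```

A missing label then shows up as an empty cell in the spreadsheet instead of raising NameError or silently repeating the previous listing's value.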