I want to exclude bold paragraphs when scraping a website

I am using the following code to scrape the website:

import requests
from bs4 import BeautifulSoup

resp = requests.get('https://www.ecb.europa.eu/press/pressconf/2018/html/ecb.is180913.en.html')
soup = BeautifulSoup(resp.content, 'html5lib')
article = soup.find('article')
paragraphs = article.find_all('p')

The output looks like this:

[Based on our regular economic and monetary analyses, we decided to keep the key ECB interest rates unchanged. .... to levels that are below, but close to, 2% over the medium term.,

Has QE been used well by the various euro area countries?,

By and large, yes, it's been used well in the sense that the intended effects of the QE – mind, ... It reduced dispersion in growth rates everywhere. An employment situation which is by and large improving almost everywhere, some countries more than others. ,

If your question is meant to say; shouldn't governments have taken advantage of the situation of such low rates to decrease budget deficits, to restore? ... is a good situation for doing that.,

My second question is on reinvestment. ...Have you today explicitly asked the committees to come up with proposals on reinvestments?,

About inflation: I said inflation is going to hover around the present level for the rest of the year and then I gave numbers for next year and 2020. ...will reach our objective over the medium term. ,]

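I want to exclude the bold paragraphs, that is, the ones that contain a <strong> tag.

I tried to code this myself but could not get the desired output. I would appreciate any help.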

神不在的星期二


2 Answers

郎朗坤

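Use str() to convert each bs4 object to a string, then keep only the paragraphs whose markup does not contain <strong>:

..........
paragraphs = article.find_all('p')
for p in paragraphs:
    # Skip any paragraph whose serialized HTML contains a <strong> tag
    if '<strong>' not in str(p):
        print(str(p))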
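A variation on the same idea is to ask BeautifulSoup whether the paragraph has a bold descendant instead of searching the serialized markup, which also covers tags written with attributes such as <strong class="...">. This is only a minimal sketch: SAMPLE_HTML is a hypothetical stand-in for the scraped page, non_bold_paragraphs is an illustrative helper name, and html.parser is used here just so the example runs without html5lib installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption): bold questions, plain answers.
SAMPLE_HTML = """
<article>
  <p>Based on our regular economic and monetary analyses, we decided to keep the key ECB interest rates unchanged.</p>
  <p><strong>Has QE been used well by the various euro area countries?</strong></p>
  <p>By and large, yes, it's been used well.</p>
  <p><strong>My second question is on reinvestment.</strong></p>
  <p>About inflation: I said inflation is going to hover around the present level.</p>
</article>
"""


def non_bold_paragraphs(article):
    """Return the <p> tags under `article` that have no <strong> or <b> descendant."""
    return [p for p in article.find_all('p') if p.find(['strong', 'b']) is None]


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
article = soup.find('article')
kept = non_bold_paragraphs(article)

for p in kept:
    print(p.get_text(strip=True))
```

Note that this drops a paragraph as soon as any part of it is bold, so a plain paragraph with one emphasized word would be excluded too.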

当年话下

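Try the extract() function, which removes a tag from the tree:

article = soup.find('article')
paragraphs = article.find_all('p')
article.strong.extract()
paragraphs_without_bold = article.find_all('p')

Note that article.strong refers only to the first <strong> tag in the article, so this call removes a single bold element and leaves its enclosing <p> in place; it has to be repeated for every bold tag to remove them all.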
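To remove every bold paragraph in place rather than only the first <strong> tag, one option is to loop over the paragraphs and delete each one that contains a <strong> descendant. The sketch below assumes each bold question sits inside its own <p>; SAMPLE_HTML is hypothetical markup standing in for the scraped page, and decompose() is used instead of extract() because the removed paragraphs are not needed afterwards.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption) standing in for the scraped page.
SAMPLE_HTML = """
<article>
  <p>Opening statement paragraph.</p>
  <p><strong>First bold question?</strong></p>
  <p>First answer.</p>
  <p><strong>Second bold question?</strong></p>
  <p>Second answer.</p>
</article>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
article = soup.find('article')

# find_all() returns a list, so it is safe to delete paragraphs while looping over it.
for p in article.find_all('p'):
    if p.find('strong') is not None:
        p.decompose()  # remove the whole paragraph from the tree

paragraphs_without_bold = article.find_all('p')
for p in paragraphs_without_bold:
    print(p.get_text(strip=True))
```

Because this modifies the parsed tree, parse the page again if the bold paragraphs are needed later.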
