
Adding ID attributes to HTML tags with Python (BeautifulSoup?)

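I have an HTML file containing some markup, and I need to add an id attribute to every tag in the format id="rule_1", id="rule_1.1", id="rule_1.2", id="rule_1.2.1", and so on. For example, the current HTML is: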


<div style="styles">
    <p class="classname">TEXT</p>
    <p class="classname">TEXT</p>
    <ul style="styles">
        <li>
            <p class="classname">TEXT</p>
        </li>
        <li>
            <p class="classname">TEXT</p>
        </li>
    </ul>
</div>

I need the HTML to look like this:


<div style="styles" id="rule_1">
    <p class="classname" id="rule_1.1">TEXT</p>
    <p class="classname" id="rule_1.2">TEXT</p>
    <ul style="styles" id="rule_1.3">
        <li id="rule_1.3.1">
            <p class="classname" id="rule_1.3.1.1">TEXT</p>
        </li>
        <li id="rule_1.3.2">
            <p class="classname" id="rule_1.3.2.1">TEXT</p>
        </li>
    </ul>
</div>

I could write this by hand, but I would rather use an existing HTML parser library. Can this be done with BeautifulSoup or some other module?


I tried something like this:


from bs4 import BeautifulSoup as html_parser

with open('outputs/HTML/{}.html'.format(deal), 'r') as read_file:
    html_source = read_file.read()

soup = html_parser(html_source, 'html.parser')
html_tags = soup.find_all(['div', 'p', 'span', 'ul', 'li'])

for each_tag in html_tags:
    each_tag.attrs['id'] = html_tags.index(each_tag)

with open('outputs/HTML/{}-id.html'.format(deal), 'w') as save_file:
    save_file.write(str(soup))

But this only adds flat ids such as id="1", id="2", and so on. How can I make them hierarchical, like 1, 1.1, 1.1.1, etc.?


繁星coding
1 Answer

梦里花落0921

Never mind, I figured it out:

# Note: this assumes the first matched tag is the single root element and that
# every later tag's parent is itself in html_tags (so it already has an id).
curr_tags = {}  # parent id -> number of children numbered so far
for position, each_tag in enumerate(html_tags):
    if position == 0:
        # The first tag found is treated as the root
        each_tag.attrs['id'] = 'rule_1'
    else:
        parent_id = each_tag.parent.attrs['id']
        if parent_id in curr_tags:
            curr_tags[parent_id] += 1
        else:
            curr_tags[parent_id] = 1
        # Child id = parent id plus the child's running position under that parent
        each_tag.attrs['id'] = parent_id + '.{0}'.format(curr_tags[parent_id])
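For anyone who wants a version that does not depend on the first tag being the only root, here is a minimal recursive sketch of the same idea. It is not the code from the answer above: the function name `number_children` and the inline sample HTML are made up for illustration, and it numbers every tag rather than only div, p, span, ul and li.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag


def number_children(parent, prefix, sep):
    """Give each direct child tag the id prefix + sep + position, then recurse.

    `number_children` is a hypothetical helper, not part of BeautifulSoup.
    """
    index = 0
    for child in parent.children:
        # Skip text nodes, comments, and whitespace between tags
        if not isinstance(child, Tag):
            continue
        index += 1
        child["id"] = "{}{}{}".format(prefix, sep, index)
        # Nested levels are always joined with a dot
        number_children(child, child["id"], ".")


# Sample input copied from the question, compacted
html = """<div style="styles">
<p class="classname">TEXT</p>
<p class="classname">TEXT</p>
<ul style="styles">
<li><p class="classname">TEXT</p></li>
<li><p class="classname">TEXT</p></li>
</ul>
</div>"""

soup = BeautifulSoup(html, "html.parser")
# Top-level tags become rule_1, rule_2, ...; descendants get dotted suffixes
number_children(soup, "rule", "_")

# Collect the ids in document order to check the result
ids = [tag["id"] for tag in soup.find_all(True)]
print(ids)
```

Because the numbering follows the tree directly, a second top-level element would simply become rule_2, and no tag ever needs to look up an id on a parent that was not visited. To restrict numbering to certain tag names, a name check could be added next to the `isinstance` test, at the cost of deciding what to do with children of skipped tags.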