PHP: matching with preg_match_all


My task is to extract data from HTML: for each group of p tags, I need to collect the data into arrays. Here is a sample.

<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 63px; white-space: nowrap;">Title </p>

<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 349px; white-space: nowrap;">1234 </p>

<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 461px; white-space: nowrap;">$30 </p>

<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 563px; white-space: nowrap;">$10,000,000 </p>

<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 777px; white-space: nowrap;">3,000,000 </p>

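This HTML block repeats many times with the "Title" and "1234" labels unchanged, then switches to different labels at some point. The top and left values keep changing throughout the HTML. I can already loop over the known "Title" and "1234" labels, so I want to match on that part.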

$title_label = 'Title';

$number_label = '1234';

preg_match_all('%\d{2}px; white-space: nowrap;">$title_label </p>%', $html_content, $array_match);

$array_cost_name = $array_match[1];

$array_return_name = $array_match[2];

$array_number_name = $array_match[3];

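Then I need 3 arrays holding the last 3 tag fields. For the sample HTML above, I expect "$30", "$10,000,000" and "3,000,000" to be the first value of each array.

I can't figure out how to write a regular expression for this. Can anyone help?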

冉冉说

154 views

3 answers

森林海

A regular expression is not the right tool for this task; an XML parser makes it much easier:

$html = '<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 63px; white-space: nowrap;">Title </p>
<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 349px; white-space: nowrap;">1234 </p>
<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 461px; white-space: nowrap;">$30 </p>
<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 563px; white-space: nowrap;">$10,000,000 </p>
<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 777px; white-space: nowrap;">3,000,000 </p>';
$doc = new DOMDocument();
$doc->loadHTML($html);
$xml = simplexml_import_dom($doc);
// find all texts inside p tags with class ft01
$parts = $xml->xpath('//p[@class="ft01"]/text()');
// indexes 0 and 1 are the "Title" and "1234" labels; trim() drops the trailing space
$array_cost_name = trim((string) $parts[2]);
$array_return_name = trim((string) $parts[3]);
$array_number_name = trim((string) $parts[4]);
echo $array_cost_name; // $30
echo $array_return_name; // $10,000,000
echo $array_number_name; // 3,000,000
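The parser approach can be extended to the repeated rows the question describes. The following is only a sketch under assumptions: the shortened style attributes and the second row's values are invented for illustration, and it assumes each row is exactly five consecutive p tags starting with the two labels. It uses DOMXPath instead of SimpleXML and collects one array per column.

```php
<?php
// Hypothetical, shortened sample: the second row's values are invented for illustration.
$html = <<<'HTML'
<p class="ft01" style="top: 103px; left: 63px;">Title </p>
<p class="ft01" style="top: 103px; left: 349px;">1234 </p>
<p class="ft01" style="top: 103px; left: 461px;">$30 </p>
<p class="ft01" style="top: 103px; left: 563px;">$10,000,000 </p>
<p class="ft01" style="top: 103px; left: 777px;">3,000,000 </p>
<p class="ft01" style="top: 121px; left: 63px;">Title </p>
<p class="ft01" style="top: 121px; left: 349px;">1234 </p>
<p class="ft01" style="top: 121px; left: 461px;">$45 </p>
<p class="ft01" style="top: 121px; left: 563px;">$20,000,000 </p>
<p class="ft01" style="top: 121px; left: 777px;">5,000,000 </p>
HTML;

$title_label = 'Title';
$number_label = '1234';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

// Collect the trimmed text of every <p class="ft01"> in document order.
$xpath = new DOMXPath($doc);
$texts = [];
foreach ($xpath->query('//p[@class="ft01"]') as $p) {
    $texts[] = trim($p->textContent);
}

$array_cost_name = [];
$array_return_name = [];
$array_number_name = [];

// A row is assumed to be: title label, number label, then three value cells.
$count = count($texts);
for ($i = 0; $i + 4 < $count; $i++) {
    if ($texts[$i] === $title_label && $texts[$i + 1] === $number_label) {
        $array_cost_name[] = $texts[$i + 2];
        $array_return_name[] = $texts[$i + 3];
        $array_number_name[] = $texts[$i + 4];
        $i += 4; // skip the rest of this row
    }
}

print_r($array_cost_name);
print_r($array_return_name);
print_r($array_number_name);
```

Because the comparison is on the text content only, the changing top and left values in the style attribute do not matter here.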


守候你守候我

You can use a simple global regular expression such as /ace: nowrap;">(.*?) <\/p>/, or anything along those lines, to capture the groups you are looking for, and then drop the first 2 items to keep only the last 3. Here is an example, with a link for testing it.

$html_content = '<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 63px; white-space: nowrap;">Title </p>
<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 349px; white-space: nowrap;">1234 </p>
<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 461px; white-space: nowrap;">$30 </p>
<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 563px; white-space: nowrap;">$10,000,000 </p>
<p class="ft01" style="margin: 0; padding: 0; font-size: 16px; font-family: Times; color: #000000; position: absolute; top: 103px; left: 777px; white-space: nowrap;">3,000,000 </p>';
// the lazy (.*?) stops at the first " </p>", so each paragraph is matched separately
preg_match_all('/ace: nowrap;">(.*?) <\/p>/', $html_content, $array_match);
// index 1 holds the captured texts; skip the "Title" and "1234" labels
$array_match = array_slice($array_match[1], 2);
print_r($array_match);

http://sandbox.onlinephpfunctions.com/code/5ac69d44ff8168b4b21133c46dfa9c6db6986b6a
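If the regular-expression route is preferred, one pattern can anchor on the two known labels and capture the three cells that follow, so that $array_match[1], $array_match[2] and $array_match[3] become the three arrays the question asks for. This is a sketch under assumptions: the sample HTML is shortened, the second row's values are invented, and the $open and $close helper variables are hypothetical names introduced here.

```php
<?php
// Hypothetical, shortened sample: the second row's values are invented for illustration.
$html_content = <<<'HTML'
<p class="ft01" style="top: 103px; left: 63px; white-space: nowrap;">Title </p>
<p class="ft01" style="top: 103px; left: 349px; white-space: nowrap;">1234 </p>
<p class="ft01" style="top: 103px; left: 461px; white-space: nowrap;">$30 </p>
<p class="ft01" style="top: 103px; left: 563px; white-space: nowrap;">$10,000,000 </p>
<p class="ft01" style="top: 103px; left: 777px; white-space: nowrap;">3,000,000 </p>
<p class="ft01" style="top: 121px; left: 63px; white-space: nowrap;">Title </p>
<p class="ft01" style="top: 121px; left: 349px; white-space: nowrap;">1234 </p>
<p class="ft01" style="top: 121px; left: 461px; white-space: nowrap;">$45 </p>
<p class="ft01" style="top: 121px; left: 563px; white-space: nowrap;">$20,000,000 </p>
<p class="ft01" style="top: 121px; left: 777px; white-space: nowrap;">5,000,000 </p>
HTML;

$title_label = 'Title';
$number_label = '1234';

// Pattern fragments: any opening <p ...> tag, and the closing tag plus whitespace.
$open = '<p\b[^>]*>\s*';
$close = '\s*</p>\s*';

// Concatenation is used because single-quoted strings do not interpolate variables.
// preg_quote() escapes regex metacharacters in the labels, including the "#" delimiter.
$pattern = '#'
    . $open . preg_quote($title_label, '#') . $close
    . $open . preg_quote($number_label, '#') . $close
    . str_repeat($open . '([^<]*?)' . $close, 3)
    . '#';

// Default PREG_PATTERN_ORDER: one array per capture group.
preg_match_all($pattern, $html_content, $array_match);

$array_cost_name = $array_match[1];
$array_return_name = $array_match[2];
$array_number_name = $array_match[3];

print_r($array_cost_name);
print_r($array_return_name);
print_r($array_number_name);
```

Matching any opening p tag with [^>]* sidesteps the changing top and left values, at the cost of assuming the cells contain no nested tags.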


喵喔喔

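With a regular expression, you could try the following:

// [^>]* keeps the opening tag from running past its own ">", and the lazy (.*?) stops at the first </p>
preg_match_all('/<p[^>]*>(.*?)<\/p>/', $html, $out);
$result = $out[1];

This captures all characters between the <p> and </p> tags.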
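Quantifier choice matters when several p elements sit on the same line. The sketch below compares a greedy pattern with a lazy one on a shortened, hypothetical sample (the class value "a" is made up for brevity).

```php
<?php
// Hypothetical sample with all paragraphs on one line.
$html = '<p class="a">Title </p><p class="a">1234 </p><p class="a">$30 </p>';

// Greedy: "<p.*>" runs as far right as it can, so only one match is found.
preg_match_all('/<p.*>(.*)<\/p>/', $html, $greedy);

// Lazy: "[^>]*" stays inside the opening tag and "(.*?)" stops at the first </p>.
preg_match_all('/<p[^>]*>(.*?)<\/p>/', $html, $lazy);

print_r($greedy[1]); // a single element: "$30 "
print_r($lazy[1]);   // three elements: "Title ", "1234 ", "$30 "
```

The captured texts keep their trailing space; array_map('trim', $lazy[1]) would remove it.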

