Parsing img tags and HTML with DOM (PHP)

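I have some code that parses img tags and text. When I run it in a PHP file, it only outputs the img src and abc, then the img src and def. The HTML I'm parsing is also irregular: an img tag may be wrapped in a link.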


I want to capture each img together with the HTML that follows it, like this:


Array
(
    [0] => Array
        (
            [src] => http://www.whatever.com
            [text] =>  abc
    <br>
    <h3>title</h3>
    <div class="content">content <a href="link">my link</a></div>
        )

    [1] => Array
        (
            [src] => http://goingnowhere.com
            [text] =>  def
    <br>
    <h3>title 2</h3>
    <div class="content">content <a href="link">my link</a>

    bla bla bla

    </div>
        )

)

How can I do this? My current code:


<?php
$sample_html = '
<img src="http://www.whatever.com" alt="" />
abc
<br>
<h3>title</h3>
<div class="content">content <a href="link">my link</a></div>
<img src="http://goingnowhere.com" alt="">
def
<br>
<h3>title 2</h3>
<div class="content">content <a href="link">my link</a>

bla bla bla

</div>
';

$dom = new DOMDocument();
$dom->loadHTML($sample_html);

$data = array();
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) {
    $data[] = array(
        'src'  => $image->getAttribute('src'),
        'text' => trim($image->nextSibling->textContent),
    );
}

echo '<pre>';
print_r($data);
?>


回首忆惘然
1 answer

哔哔one

Use XPath to loop over the img tags and collect the nodes that sit between each pair of consecutive img tags.

<?php
$sample_html = '<img src="http://www.whatever.com" alt="" />abc<br><h3>title</h3><div class="content">content <a href="link">my link</a></div><img src="http://goingnowhere.com" alt="">def<br><h3>title 2</h3><div class="content">content <a href="link">my link</a>bla bla bla</div>';

$dom = new DOMDocument();
@$dom->loadHTML($sample_html); // @ silences parser warnings
$xpath = new DOMXPath($dom);

$snippet = '';
$arr = array();
$images = $xpath->query('//img');
$count = $images->length;

// Loop through all img tags
for ($i = 0; $i < $count; $i++) {
    $node = $images->item($i);
    $img_src = $node->getAttribute('src'); // src of the current image

    // Walk the siblings that follow the current image
    while ($node = $node->nextSibling) {
        // Only element nodes are collected; text nodes such as "abc" are skipped
        if (!($node instanceof DOMElement)) {
            continue;
        }
        if ($node->tagName == 'img') {
            // Reached the next image: append it, store the section, and reset
            $snippet .= $dom->saveXML($node);
            $arr[] = array(
                'src'     => $img_src,
                'content' => $snippet
            );
            $img_src = $node->getAttribute('src'); // src of the next image
            $snippet = '';
            break;
        }
        $snippet .= $dom->saveXML($node);
    }
}

// Store the section that follows the last image
if ($count > 0) {
    $arr[] = array('src' => $img_src, 'content' => $snippet);
}
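The loop above skips text nodes, so the abc and def from the question are dropped, and it appends the next img tag to each section. If you want output closer to what the question asks for, a variation along these lines might work. This is only a sketch under a few assumptions: `collectImageSections` is a hypothetical helper name, the images are siblings inside the same parent, and a linked image is an img directly inside an `<a>`. It uses `saveHTML($node)` instead of `saveXML($node)` so that `<br>` is not serialized as `<br/>`.

```php
<?php
// Hypothetical helper (not part of the answer above): for each <img>, collect
// the markup of every following sibling, text nodes included, up to the next image.
function collectImageSections(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $sections = array();

    foreach ($xpath->query('//img') as $img) {
        // Assumption: if the image is wrapped in a link, walk the siblings of the <a>.
        $start = $img;
        if ($img->parentNode instanceof DOMElement && $img->parentNode->tagName === 'a') {
            $start = $img->parentNode;
        }

        $text = '';
        for ($node = $start->nextSibling; $node !== null; $node = $node->nextSibling) {
            // Stop at the next image, whether bare or nested inside another element.
            if ($node instanceof DOMElement
                && ($node->tagName === 'img' || $node->getElementsByTagName('img')->length > 0)) {
                break;
            }
            // saveHTML() serializes text nodes as well as elements.
            $text .= $dom->saveHTML($node);
        }

        $sections[] = array(
            'src'  => $img->getAttribute('src'),
            'text' => trim($text),
        );
    }

    return $sections;
}
```

Calling `print_r(collectImageSections($sample_html));` should then give one entry per image with `src` and `text` keys. Note that stopping at any element containing an img is a design choice for this sketch; if your content blocks can themselves contain images, you would need a different stop condition.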