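I am experimenting with web scraping techniques, and for the link above it always returns an empty description. The reason is that the description is populated by JS using the code below. How do we handle these kinds of scenarios?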
// Frontend JS
P.when('DynamicIframe').execute(function(DynamicIframe){
var BookDescriptionIframe = null,
bookDescEncodedData = "book desc data",
bookDescriptionAvailableHeight,
minBookDescriptionInitialHeight = 112,
options = {},
iframeId = "bookDesc_iframe";
I am using PHP DOMXPath as follows:
$file = 'sample.html';
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
// I am saving the returned html to a file and reading the file.
@$dom->loadHTMLFile($file);
$xpath = new DOMXPath($dom);
// This xpath works on chrome console, but not here
// because the content is dynamically created via js
$desc = $xpath->query('//*[@id="bookDesc_iframe"]');
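One possible direction, offered only as a sketch: since the iframe is created at runtime, the saved HTML never contains it, but the inline script still carries the description in `bookDescEncodedData`. The snippet below looks for that assignment in the `<script>` elements and decodes it. The helper name `extractEncodedDescription` is hypothetical, and two things are assumptions rather than facts about the page: that the value is a double-quoted string without escaped quotes, and that it is URL-encoded.

```php
<?php
// Hypothetical helper: return the decoded value assigned to
// bookDescEncodedData in an inline <script>, or null if not found.
function extractEncodedDescription(string $html): ?string
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Only inspect scripts that mention the variable at all.
    $scripts = $xpath->query('//script[contains(., "bookDescEncodedData")]');

    foreach ($scripts as $script) {
        // Assumption: the value is a double-quoted literal with no escaped quotes.
        $pattern = '/bookDescEncodedData\s*=\s*"([^"]*)"/';
        if (preg_match($pattern, $script->textContent, $m)) {
            // Assumption: the payload is URL-encoded (percent-encoded).
            return rawurldecode($m[1]);
        }
    }

    return null;
}
```

With the file from the code above, this could be called as `extractEncodedDescription(file_get_contents('sample.html'))`. If the page builds the description in a way that cannot be read from the script text, rendering the page in a headless browser before parsing is the usual alternative.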
隔江千里