I am trying to match the entire script tag whose content contains "@type": "NewsArticle".
Something like this:
<script type="application\/ld\+json">[^\{]*?{(.*?)\}[^\}]*?<\/script>
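With the regex above I can match the topmost script tag. But I am looking for the NewsArticle JSON data, which is the second tag in this example. Some pages have four or more application/ld+json tags, but "@type": "NewsArticle" is always present on every page. So I am looking for a way to target that specific script tag.
Thanks for the help.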
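For illustration, here is a minimal Python sketch of two possible ways to pick out that tag. It assumes the page source is already available as a string and that each application/ld+json block is valid JSON. The names LD_JSON_RE, NEWS_ARTICLE_RE and find_ld_json are hypothetical, made up for this example only.

```python
import json
import re

# Option 1: capture the body of every JSON-LD script element.
# The lazy (.*?) with re.DOTALL stops at the first closing tag,
# so each element is captured on its own.
LD_JSON_RE = re.compile(
    r'<script\s+type="application/ld\+json"\s*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

# Option 2: a single regex. The (?:(?!</script>).)*? part consumes
# characters one at a time but refuses to cross a closing tag, so the
# "@type" check cannot leak into a later script element.
# Caveat: this also matches when "NewsArticle" is only a nested @type.
NEWS_ARTICLE_RE = re.compile(
    r'<script\s+type="application/ld\+json"\s*>'
    r'((?:(?!</script>).)*?"@type"\s*:\s*"NewsArticle"(?:(?!</script>).)*?)'
    r'</script>',
    re.DOTALL,
)


def find_ld_json(html, wanted_type="NewsArticle"):
    """Return the first JSON-LD object whose top-level "@type" equals
    wanted_type, or None if there is no such object."""
    for match in LD_JSON_RE.finditer(html):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        # A block may hold a single object or a list of objects.
        candidates = data if isinstance(data, list) else [data]
        for item in candidates:
            if isinstance(item, dict) and item.get("@type") == wanted_type:
                return item
    return None
```

Calling find_ld_json(page_source) returns the parsed NewsArticle object as a dict, while NEWS_ARTICLE_RE.search(page_source).group(1) gives the raw JSON text. Parsing the JSON is likely the more robust of the two, because it checks only the top-level "@type" and does not depend on how the JSON is spaced.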
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Organization",
"@id": "https://www.givemesport.com/#gms",
"name": "GiveMeSport",
"url": "https://www.givemesport.com",
"logo": {
"@type": "ImageObject",
"url": "https://gmsrp.cachefly.net/v4/images/logo-gms-black.png"
},
"sameAs":[
"https://www.facebook.com/GiveMeSport",
"https://www.instagram.com/givemesport",
"https://twitter.com/GiveMeSport",
"https://www.youtube.com/user/GiveMeSport"
]
}
</script>
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "NewsArticle",
"mainEntityOfPage": "https://www.givemesport.com/1612447-man-uniteds-scott-mctominay-delighted-fans-with-reaction-after-third-goal-vs-rb-leipzig",
"url": "https://www.givemesport.com/1612447-man-uniteds-scott-mctominay-delighted-fans-with-reaction-after-third-goal-vs-rb-leipzig",
"headline": "Man United's Scott McTominay delighted fans with reaction after third goal vs RB Leipzig",
"datePublished": "2020-10-30T21:52:48.3510000Z",
"dateModified": "2020-10-30T21:52:48.3510000Z",
"description": "Man United's Scott McTominay delighted fans with reaction after third goal vs RB Leipzig",
"articleSection": "Football",
"keywords": ["Football","Manchester United","Marcus Rashford","RB Leipzig","Scott McTominay","UEFA Champions"],
"creator": ["Scott Wilson"],
"thumbnailUrl": "https://gmsrp.cachefly.net/images/20/10/30/03a426c8204af5c8d02282afaeed6189/144.jpg",
"author": {
"@type": "Person",
"name": "Scott Wilson",
"sameAs": "https://www.givemesport.com/scott-wilson-1"
},