I am trying to parse track names from Discogs.com HTML. Here is a sample of the HTML:
<tr class=" tracklist_track track" data-track-position="8">
<td class="tracklist_track_pos">8</td>
<td class="track tracklist_track_title ">
<span class="tracklist_track_title">Shapeshifting</span><blockquote><span class="tracklist_extra_artist_span">Vocals – <a href="/artist/764815-Rachel-Dreyer">Rachel Dreyer</a></span></blockquote></td>
<td width="25" class="tracklist_track_duration">
<span>6:02</span>
</td>
</tr>
<tr class=" tracklist_track track" data-track-position="9">
<td class="tracklist_track_pos">9</td>
<td class="track tracklist_track_title ">
<span class="tracklist_track_title">Rose</span><blockquote><span class="tracklist_extra_artist_span">Vocals – <a href="/artist/764814-Silke-Roch">Silke Roch</a></span></blockquote></td>
<td width="25" class="tracklist_track_duration">
<span>5:49</span>
</td>
</tr>
My goal is to extract the innerText of the elements with class tracklist_track_title ("Shapeshifting", "Rose").
If I try document.getElementsByClassName("tracklist_track_title"), I get an array-like collection that covers the classes track tracklist_track_title, tracklist_track_title and tracklist_extra_artist_span: ("Shapeshifting", "Shapeshifting", "Vocals – Rachel Dreyer", "Rose", "Rose", "Vocals – Silke Roch").
If I try document.getElementsByClassName("track tracklist_track_title"), I get only the elements with class track tracklist_track_title, but unfortunately the tracklist_extra_artist_span text is included as well: ("Shapeshifting", "Vocals – Rachel Dreyer", "Rose", "Vocals – Silke Roch").
Could you suggest a way to include only tracklist_track_title in the resulting array, or to filter the array afterwards to get rid of the tracklist_extra_artist_span class? ("Shapeshifting", "Rose").
My only idea is to use document.getElementsByClassName("tracklist_track_title") and then drop every element at an even index of the result, which gets rid of the track tracklist_track_title class:
var trackBlock = document.getElementsByClassName("tracklist_track_title");
var trackList = [];
for (var i = 0; i < trackBlock.length; i++) {
    // Keep odd indices only; in the sample, even indices are the enclosing td cells.
    if (i % 2 !== 0) {
        trackList.push(trackBlock[i].innerText);
    }
}
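Two variations that might be less fragile than relying on index parity are sketched below. Both assume that every title sits in a span carrying the tracklist_track_title class, as in the sample above; the function names titlesBySelector and titlesByTagFilter and the root parameter are made up for illustration and are not part of the Discogs page.

```javascript
// Sketch 1: let a CSS selector pick only the <span> elements.
// Assumption: titles are always <span class="tracklist_track_title">.
function titlesBySelector(root) {
    var spans = root.querySelectorAll("span.tracklist_track_title");
    // Array.from accepts a mapping function as its second argument.
    return Array.from(spans, function (span) {
        return span.innerText.trim();
    });
}

// Sketch 2: keep the class lookup, then filter by tag name instead of index.
// tagName is upper case for HTML elements in an HTML document.
function titlesByTagFilter(root) {
    var matches = root.getElementsByClassName("tracklist_track_title");
    return Array.from(matches)
        .filter(function (el) { return el.tagName === "SPAN"; })
        .map(function (el) { return el.innerText.trim(); });
}
```

In the browser console either one would be called with the document itself, for example titlesBySelector(document). Against the sample markup both should skip the td cells and the tracklist_extra_artist_span spans, since neither is a span with the tracklist_track_title class.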
Any other ideas? Thanks!
慕姐4208626