猿问

vba 中的网页抓取 - 构造工作数据并从左到右单元格写入

刚刚在这里注册了一个帐户,是的,我是一个真正的菜鸟 - 请对我好一点。现在我面临的挑战是:我正在用 VBA 构建一个网络抓取工具,并找到了一个代码,我根据自己的需要做了一些修改。一切都很完美,而且实际上非常顺利。现在我希望加载到我的 exel 文档中的文本不要太长,而是很宽。我怀疑它与“.Offset(I,j)”有关。我玩过一点,但我只是设法毁了一切。这是我使用的代码:


Dim IE As InternetExplorer

Dim htmldoc As MSHTML.IHTMLDocument 'Document object

Dim eleColtr As MSHTML.IHTMLElementCollection 'Element collection for tr tags

Dim eleColtd As MSHTML.IHTMLElementCollection 'Element collection for td tags

Dim eleRow As MSHTML.IHTMLElement 'Row elements

Dim eleCol As MSHTML.IHTMLElement 'Column elements

Dim ieURL As String 'URL


'Open InternetExplorer

Set IE = CreateObject("InternetExplorer.Application")

IE.Visible = True

'Navigate to webpage

ieURL = "#"

IE.Navigate ieURL

'Wait

Do While IE.Busy Or IE.ReadyState <> 4

 DoEvents

Loop

Set htmldoc = IE.Document 'Document webpage

Set eleColtr = htmldoc.getElementsByTagName("tr") 'Find all tr tags

'This section populates Excel

I = 0 'start with first value in tr collection

For Each eleRow In eleColtr 'for each element in the tr collection

 Set eleColtd = htmldoc.getElementsByTagName("tr")(I).getElementsByTagName("td") 'get all the td elements in that specific tr

 j = 0 'start with the first value in the td collection

 For Each eleCol In eleColtd 'for each element in the td collection

 Sheets("Sheet1").Range("A1").Offset(I, j).Value = eleCol.innerText 'paste the inner text of the td element, and offset at the same time

 j = j + 1 'move to next element in td collection

 Next eleCol 'rinse and repeat

 I = I + 1 'move to next element in td collection

Next eleRow 'rinse and repeat


End Sub ```


一只萌萌小番薯
浏览 91回答 1
1回答

慕村225694

你不需要浏览器。您可以使用更快的 xhr。抓取表格并循环行,然后循环填充预先确定大小的数组的列(请务必删除标题所在的行。它们可以被识别为[colspan='2']在第一个中具有td)。然后转置数组并写入工作表。Option ExplicitPublic Sub TransposeTable()&nbsp; &nbsp; Dim xhr As MSXML2.XMLHTTP60, html As MSHTML.HTMLDocument, table As MSHTML.htmltable&nbsp; &nbsp; 'required VBE (Alt+F11) > Tools > References > Microsoft HTML Object Library ;&nbsp; Microsoft XML, v6 (your version may vary)&nbsp; &nbsp; Set xhr = New MSXML2.XMLHTTP60&nbsp; &nbsp; Set html = New MSHTML.HTMLDocument&nbsp; &nbsp; '&nbsp; 7NXBG2 ;&nbsp; 8QT2E3&nbsp; &nbsp; With xhr&nbsp; &nbsp; &nbsp; &nbsp; .Open "GET", "https://www.chrono24.com/watch/8QT2E3", False&nbsp; &nbsp; &nbsp; &nbsp; .send&nbsp; &nbsp; &nbsp; &nbsp; html.body.innerHTML = .responseText&nbsp; &nbsp; End With&nbsp; &nbsp; Set table = html.querySelector(".specifications table")&nbsp; &nbsp; Dim results(), rowCountToExclude As Long&nbsp; &nbsp; rowCountToExclude = html.querySelectorAll(".specifications table [colspan='2']").Length&nbsp; &nbsp; ReDim results(1 To table.rows.Length - rowCountToExclude, 1 To table.getElementsByTagName("tr")(0).Children(0).getAttribute("colspan"))&nbsp; &nbsp; Dim r As Long, c As Long, outputRow As Long, outputColumn As Long, html2 As MSHTML.HTMLDocument&nbsp; &nbsp; Set html2 = New MSHTML.HTMLDocument&nbsp; &nbsp; For r = 0 To table.getElementsByTagName("tr").Length - 1&nbsp; &nbsp; &nbsp; &nbsp; Dim row As Object&nbsp; &nbsp; &nbsp; &nbsp; Set row = table.getElementsByTagName("tr")(r)&nbsp; &nbsp; &nbsp; &nbsp; html2.body.innerHTML = "<body> <table>" & row.outerHTML & "</table></body> "&nbsp; &nbsp; &nbsp; &nbsp; If html2.querySelectorAll("[colspan='2']").Length = 0 Then&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; outputRow = outputRow + 1: outputColumn = 1&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; For c = 0 To row.getElementsByTagName("td").Length - 1&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; results(outputRow, outputColumn) = row.getElementsByTagName("td")(c).innerText&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; outputColumn = outputColumn + 1&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Next&nbsp; &nbsp; &nbsp; &nbsp; End If&nbsp; &nbsp; &nbsp; &nbsp; Set row = Nothing&nbsp; &nbsp; Next&nbsp; &nbsp; results = Application.Transpose(results)&nbsp; &nbsp; ActiveSheet.Cells(1, 1).Resize(UBound(results, 1), UBound(results, 2)) = resultsEnd Sub
随时随地看视频慕课网APP

相关分类

Html5
我要回答