Unable to fetch numbers that are populated dynamically from a webpage after performing certain steps

The data is fetched from:

    POST http://alta.registries.gov.ab.ca/SpinII/mapserver.aspx

The content is encoded in a custom format before it is used by the OpenLayers library. All of the decoding lives in a single JS file loaded by the page. If you beautify it, look for `WayTo.Wtb.Format.WTB`, the `OpenLayers.Class` that does the decoding. The binary is decoded byte by byte, as in this excerpt from the JS:

    switch(elementType){
        case 1:
            var lineColor = new WayTo.Wtb.Element.LineColor();
            byteOffset = lineColor.parse(dataReader, byteOffset);
            outputElement = lineColor;
            break;
        case 2:
            var lineStyle = new WayTo.Wtb.Element.LineStyle();
            byteOffset = lineStyle.parse(dataReader, byteOffset);
            outputElement = lineStyle;
            break;
        case 3:
            var ellipse = new WayTo.Wtb.Element.Ellipse();
            byteOffset = ellipse.parse(dataReader, byteOffset);
            outputElement = ellipse;
            break;
        ........
    }

We have to reproduce this decoding algorithm to get the raw data. We don't need to decode every object; we only need to advance the offset correctly and extract the strings properly. Here is a Python script for the decoding part. It decodes the data from a file (the response body saved with curl):

    with open("wtb.bin", mode="rb") as file:
        encodedData = file.read()

    offset = 0
    objects = []
    while offset < len(encodedData):
        # every element starts with a 1-byte size and a 1-byte type
        elementSize = encodedData[offset]
        offset += 1
        elementType = encodedData[offset]
        offset += 1
        if elementType == 0:
            break
        curElemSize = elementSize
        curElemType = elementType
        if elementType == 114:
            # large element: 4-byte big-endian size, then 2-byte little-endian type
            largeElementSize = int.from_bytes(encodedData[offset:offset + 4], "big")
            offset += 4
            largeElementType = int.from_bytes(encodedData[offset:offset + 2], "little")
            offset += 2
            curElemSize = largeElementSize
            curElemType = largeElementType
        print(f"type {curElemType} | size {curElemSize}")
        offsetInit = offset
        if curElemType == 1:
            offset += 4
        elif curElemType == 2:
            offset += 2
        elif curElemType == 3:
            offset += 20
        elif curElemType == 4:
            offset += 28
        elif curElemType == 5:
            offset += 12
        elif curElemType == 6:
            # text: x, y, rotation (2 bytes each), then 2 bytes per character
            textLength = curElemSize - 3
            objects.append({
                "type": "Text",
                "x_position": int.from_bytes(encodedData[offset:offset + 2], "little"),
                "y_position": int.from_bytes(encodedData[offset + 2:offset + 4], "little"),
                "rotation": int.from_bytes(encodedData[offset + 4:offset + 6], "little"),
                "text": encodedData[offset + 6:offset + 6 + (textLength * 2)].decode("utf-8").replace('\x00', '')
            })
            offset += 6 + (textLength * 2)
        elif curElemType == 7:
            numPoint = curElemSize // 2
            offset += 4 * numPoint
        elif curElemType == 27:
            numPoint = curElemSize // 4
            offset += 8 * numPoint
        elif curElemType == 8:
            numPoint = curElemSize // 2
            offset += 4 * numPoint
        elif curElemType == 28:
            numPoint = curElemSize // 4
            offset += 8 * numPoint
        elif curElemType == 13:
            offset += 4
        elif curElemType == 14:
            offset += 2
        elif curElemType == 15:
            offset += 2
        elif curElemType == 100:
            pass
        elif curElemType == 101:
            offset += 20
        elif curElemType == 102:
            offset += 2
        elif curElemType == 103:
            pass
        elif curElemType == 104:
            highShort = int.from_bytes(encodedData[offset + 2:offset + 4], "little")
            lowShort = int.from_bytes(encodedData[offset + 4:offset + 6], "little")
            objects.append({
                "type": "StartNumericCell",
                "entity": int.from_bytes(encodedData[offset:offset + 2], "little"),
                "occurrence": (highShort << 16) + lowShort
            })
            offset += 6
        elif curElemType == 105:
            # end cell
            pass
        elif curElemType == 109:
            textLength = curElemSize - 1
            objects.append({
                "type": "StartAlphanumericCell",
                "entity": int.from_bytes(encodedData[offset:offset + 2], "little"),
                "occurrence": encodedData[offset + 2:offset + 2 + (textLength * 2)].decode("utf-8").replace('\x00', '')
            })
            offset += 2 + (textLength * 2)
        elif curElemType == 111:
            offset += 40
        elif curElemType == 112:
            objects.append({
                "type": "CoordinatePlane",
                "projection_code": encodedData[offset + 48:offset + 52].decode("utf-8").replace('\x00', '')
            })
            offset += 52
        elif curElemType == 113:
            offset += 24
        elif curElemType == 256:
            nameLength = int.from_bytes(encodedData[offset + 14:offset + 16], "little")
            objects.append({
                "type": "LargePolygon",
                "name": encodedData[offset + 16:offset + 16 + nameLength].decode("utf-8").replace('\x00', ''),
                "occurence": int.from_bytes(encodedData[offset + 2:offset + 6], "little")
            })
            if nameLength > 0:
                offset += 16 + nameLength
                # skip the padding byte after the name, if present
                if encodedData[offset] == 0:
                    offset += 1
            else:
                offset += 16
            numberOfPoints = int.from_bytes(encodedData[offset:offset + 2], "little")
            offset += 2
            offset += numberOfPoints * 8
        elif curElemType == 257:
            pass
        else:
            # unknown element: skip it, assuming 2 bytes per size unit
            offset += curElemSize * 2
        print(f"offset diff {offset - offsetInit}")
        print("--------------------------------")
    print(objects)
    print(len(encodedData))
    print(offset)

(Side note: the 4-byte size of large elements is big-endian; all other multi-byte values are little-endian.)

From there we can build the steps to scrape the data. For clarity I will describe all the steps, even the ones you have already figured out.

Login

Log in to the website with:

    GET https://alta.registries.gov.ab.ca/spinii/logon.aspx

Scrape the input names/values, add `uctrlLogon:cmdLogonGuest.x` and `uctrlLogon:cmdLogonGuest.y`, then call:

    POST https://alta.registries.gov.ab.ca/spinii/logon.aspx

Legal notice

The legal notice call is not necessary to get the map values, but it is necessary to get the item information (the last step in your post).

    GET https://alta.registries.gov.ab.ca/spinii/legalnotice.aspx

Scrape the `input` tag names/values, set `cmdYES.x` and `cmdYES.y`, then call:

    POST https://alta.registries.gov.ab.ca/spinii/legalnotice.aspx

Map data

Call the server map API:

    POST http://alta.registries.gov.ab.ca/SpinII/mapserver.aspx

with the following data:

    {
        "mt": "titleresults",
        "qt": "lincNo",
        "LINCNumber": lincNumber,
        "rights": "B",  # not required
        "cx": 1920,  # screen definition
        "cy": 1080,
    }

`cx`/`cy` are the canvas dimensions.

Decode the encoded data using the method above. You will get (output truncated):

    [{'type': 'LargePolygon', 'name': '0010495134 8722524;1;162', 'entity': 23, 'occurence': 628079167, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0012170859 8022146;8;99', 'entity': 23, 'occurence': 628048595, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0010691822 8722524;1;163', 'entity': 23, 'occurence': 628222354, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0012169736 8022146;8;89', 'entity': 23, 'occurence': 628021327, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0010694454 8722524;1;179', 'entity': 23, 'occurence': 628191678, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0010694362 8722524;1;178', 'entity': 23, 'occurence': 628307403, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0010433381 8722524;1;177', 'entity': 23, 'occurence': 628209696, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0012169710 8022146;8;88A', 'entity': 23, 'occurence': 628021328, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0010694355 8722524;1;176', 'entity': 23, 'occurence': 628315826, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0012170866 8022146;8;100', 'entity': 23, 'occurence': 628163431, 'line_color_green': 0, 'line_color_red': 129, 'line_color_blue': 129, 'fill_color_green': 255, 'fill_color_red': 255, 'fill_color_blue': 180},
     {'type': 'LargePolygon', 'name': '0010694347 8722524;1;175', 'entity': 23, 'occurence': 628132810, 'line_color_green': 0, 'line_color_red': 129, ...

Extract info

If you want to target a specific `lincNumber`, you need to look at the style of the polygons. For "multiple" values (e.g. ones that cover several items), the response does not mention the `lincNumber` id, only a link reference. The following gets the selected item:

    selectedZone = [
        t
        for t in objects
        if t.get("fill_color_green", 255) < 255 and t.get("line_color_red") == 255
    ][0]
    print(selectedZone)

Call the URL you mentioned in your post to get the data and extract the table:

    GET https://alta.registries.gov.ab.ca/SpinII/popupTitleSearch.aspx?title={selectedZone["occurence"]}

Full code:

    from io import StringIO

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    lincNumber = "0030278592"
    #lincNumber = "0010661156"

    s = requests.Session()

    # 1) login
    r = s.get("https://alta.registries.gov.ab.ca/spinii/logon.aspx")
    soup = BeautifulSoup(r.text, "html.parser")
    # collect every named form input (including the hidden ASP.NET fields)
    payload = {
        t["name"]: t.get("value", "")
        for t in soup.find_all("input")
        if t.get("name")
    }
    payload["uctrlLogon:cmdLogonGuest.x"] = 76
    payload["uctrlLogon:cmdLogonGuest.y"] = 25
    s.post("https://alta.registries.gov.ab.ca/spinii/logon.aspx", data=payload)

    # 2) legal notice
    r = s.get("https://alta.registries.gov.ab.ca/spinii/legalnotice.aspx")
    soup = BeautifulSoup(r.text, "html.parser")
    payload = {
        t["name"]: t.get("value", "")
        for t in soup.find_all("input")
        if t.get("name")
    }
    payload["cmdYES.x"] = 82
    payload["cmdYES.y"] = 3
    s.post("https://alta.registries.gov.ab.ca/spinii/legalnotice.aspx", data=payload)

    # 3) map data
    r = s.post("http://alta.registries.gov.ab.ca/SpinII/mapserver.aspx",
        data={
            "mt": "titleresults",
            "qt": "lincNo",
            "LINCNumber": lincNumber,
            "rights": "B",  # not required
            "cx": 1920,  # screen definition
            "cy": 1080,
        })

    def decodeWtb(encodedData):
        """Walk the WTB byte stream and return the LargePolygon elements."""
        offset = 0
        objects = []
        while offset < len(encodedData):
            elementSize = encodedData[offset]
            offset += 1
            elementType = encodedData[offset]
            offset += 1
            if elementType == 0:
                break
            curElemSize = elementSize
            curElemType = elementType
            if elementType == 114:
                # large element: 4-byte big-endian size, 2-byte little-endian type
                largeElementSize = int.from_bytes(encodedData[offset:offset + 4], "big")
                offset += 4
                largeElementType = int.from_bytes(encodedData[offset:offset + 2], "little")
                offset += 2
                curElemSize = largeElementSize
                curElemType = largeElementType
            if curElemType == 1:
                offset += 4
            elif curElemType == 2:
                offset += 2
            elif curElemType == 3:
                offset += 20
            elif curElemType == 4:
                offset += 28
            elif curElemType == 5:
                offset += 12
            elif curElemType == 6:
                textLength = curElemSize - 3
                offset += 6 + (textLength * 2)
            elif curElemType == 7:
                numPoint = curElemSize // 2
                offset += 4 * numPoint
            elif curElemType == 27:
                numPoint = curElemSize // 4
                offset += 8 * numPoint
            elif curElemType == 8:
                numPoint = curElemSize // 2
                offset += 4 * numPoint
            elif curElemType == 28:
                numPoint = curElemSize // 4
                offset += 8 * numPoint
            elif curElemType == 13:
                offset += 4
            elif curElemType == 14:
                offset += 2
            elif curElemType == 15:
                offset += 2
            elif curElemType == 100:
                pass
            elif curElemType == 101:
                offset += 20
            elif curElemType == 102:
                offset += 2
            elif curElemType == 103:
                pass
            elif curElemType == 104:
                offset += 6
            elif curElemType == 105:
                pass
            elif curElemType == 109:
                textLength = curElemSize - 1
                offset += 2 + (textLength * 2)
            elif curElemType == 111:
                offset += 40
            elif curElemType == 112:
                offset += 52
            elif curElemType == 113:
                offset += 24
            elif curElemType == 256:
                nameLength = int.from_bytes(encodedData[offset + 14:offset + 16], "little")
                objects.append({
                    "type": "LargePolygon",
                    "name": encodedData[offset + 16:offset + 16 + nameLength].decode("utf-8").replace('\x00', ''),
                    "entity": int.from_bytes(encodedData[offset:offset + 2], "little"),
                    "occurence": int.from_bytes(encodedData[offset + 2:offset + 6], "little"),
                    "line_color_green": encodedData[offset + 8],
                    "line_color_red": encodedData[offset + 7],
                    "line_color_blue": encodedData[offset + 9],
                    "fill_color_green": encodedData[offset + 10],
                    "fill_color_red": encodedData[offset + 11],
                    "fill_color_blue": encodedData[offset + 13]
                })
                if nameLength > 0:
                    offset += 16 + nameLength
                    if encodedData[offset] == 0:
                        offset += 1
                else:
                    offset += 16
                numberOfPoints = int.from_bytes(encodedData[offset:offset + 2], "little")
                offset += 2
                offset += numberOfPoints * 8
            elif curElemType == 257:
                pass
            else:
                offset += curElemSize * 2
        return objects

    # 4) decode custom format
    objects = decodeWtb(r.content)

    # 5) get the selected area
    selectedZone = [
        t
        for t in objects
        if t.get("fill_color_green", 255) < 255 and t.get("line_color_red") == 255
    ][0]
    print(selectedZone)

    # 6) get the info about the item
    r = s.get(f'https://alta.registries.gov.ab.ca/SpinII/popupTitleSearch.aspx?title={selectedZone["occurence"]}')
    # wrap the HTML in StringIO: recent pandas versions deprecate passing literal HTML
    df = pd.read_html(StringIO(r.text), attrs={'class': 'bodyText'}, header=0)[0]
    del df['Add to Cart']
    del df['View']
    print(df[:-1])

Output:

      Title Number           Type LINC Number Short Legal   Rights Registration Date Change/Cancel Date
    0    052400228  Current Title  0030278592  0420091;16  Surface        19/09/2005         13/11/2019
    1    072294084  Current Title  0030278551  0420091;12  Surface        22/05/2007         21/08/2007
    2    072400529  Current Title  0030278469   0420091;3  Surface        05/07/2007         28/08/2007
    3    072498228  Current Title  0030278501   0420091;7  Surface        18/08/2007         08/02/2008
    4    072508699  Current Title  0030278535  0420091;10  Surface        23/08/2007         13/12/2007
    5    072559500  Current Title  0030278477   0420091;4  Surface        17/09/2007         19/11/2007
    6    072559508  Current Title  0030278576  0420091;14  Surface        17/09/2007         09/01/2009
    7    072559521  Current Title  0030278519   0420091;8  Surface        17/09/2007         07/11/2007
    8    072559530  Current Title  0030278493   0420091;6  Surface        17/09/2007         25/08/2008
    9    072559605  Current Title  0030278485   0420091;5  Surface        17/09/2007         23/12/2008

If you want more entries, look at the `objects` list. If you want more information about the items, such as coordinates, you can improve the decoder. You can also match the other lincNumbers around the target by looking at the `name` field, which contains the lincNumber, unless it holds a "multiple" name.
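To illustrate that last remark, here is a minimal sketch of pulling the neighbouring lincNumbers out of the decoded polygons. It assumes the `name` field looks like the sample output above (`'0010495134 8722524;1;162'`, i.e. a 10-digit LINC number followed by the short legal description). The helper names `linc_from_name` and `other_lincs` are hypothetical, and I have not checked what a "multiple" name looks like, so anything that does not start with 10 digits is simply skipped.

```python
import re
from typing import List, Optional

# Assumption: a LINC number is a leading run of exactly 10 digits,
# followed by whitespace or the end of the string.
_LINC_RE = re.compile(r"(\d{10})(?:\s|$)")


def linc_from_name(name: str) -> Optional[str]:
    """Return the leading LINC number of a polygon name, or None."""
    m = _LINC_RE.match(name)
    return m.group(1) if m else None


def other_lincs(objects: List[dict], target_linc: str) -> List[str]:
    """List the distinct LINC numbers in `objects`, excluding the target."""
    found = []
    for obj in objects:
        linc = linc_from_name(obj.get("name", ""))
        if linc and linc != target_linc and linc not in found:
            found.append(linc)
    return found


if __name__ == "__main__":
    sample = [
        {"type": "LargePolygon", "name": "0010495134 8722524;1;162"},
        {"type": "LargePolygon", "name": "0012170859 8022146;8;99"},
        {"type": "LargePolygon", "name": ""},
    ]
    print(other_lincs(sample, "0010495134"))
```

With the real data you would pass the `objects` list returned by `decodeWtb` and your own `lincNumber`.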
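Coming back to the side note on byte order: the element header is the one place where the two byte orders meet, so it may help to see it in isolation. This is only a sketch of the header logic used in the decoder above, rewritten with the standard `struct` module; the function name `read_header` and the constant name are my own, and the layout (1-byte size, 1-byte type, and for type 114 a 4-byte big-endian size followed by a 2-byte little-endian type) is taken from that decoder rather than from any specification.

```python
import struct
from typing import Tuple

LARGE_ELEMENT_MARKER = 114  # type value that announces an extended header


def read_header(data: bytes, offset: int) -> Tuple[int, int, int]:
    """Return (element_type, element_size, offset_after_header)."""
    size, etype = data[offset], data[offset + 1]
    offset += 2
    if etype == LARGE_ELEMENT_MARKER:
        # ">I" = 4-byte unsigned, big-endian; "<H" = 2-byte unsigned, little-endian
        (size,) = struct.unpack_from(">I", data, offset)
        (etype,) = struct.unpack_from("<H", data, offset + 4)
        offset += 6
    return etype, size, offset


if __name__ == "__main__":
    # synthetic bytes, not real server output
    small = bytes([3, 6])
    large = bytes([0, 114]) + (300).to_bytes(4, "big") + (256).to_bytes(2, "little")
    print(read_header(small, 0))
    print(read_header(large, 0))
```

Keeping the header in one function makes it harder to mix up the two byte orders when you extend the decoder to more element types.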
