Extracting array values from JSON and stripping characters in Python

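I have the following JSON string and I am trying to extract the values into a Python list. I can get the id_list string, but I want each value without its trailing ":".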

Using the Python json library is not an option. My attempt so far (I have not used regular expressions much before): https://regex101.com/r/qxYe9N/1


I would like to use an expression with re.findall(EXPR, jsonstr) to get back a list like this:


result = ["B01M8QSY16", "B017XBDBI6", ...more ]


{
  "ajax": {
    "params": {
      "asinMetadataKeys": "adId",
      "featureId": "SimilaritiesCarousel",
      "reftagPrefix": "pd_sbs_60",
      "widgetTemplateClass": "PI::Similarities::ViewTemplates::Carousel::Desktop",
      "imageHeight": 160,
      "linkGetParameters": "{\"pf_rd_s\":\"desktop-dp-sims\",\"pf_rd_m\":\"A3JWKAKR8XB7XF\",\"pd_rd_r\":\"ac83cd73-b019-11e8-99c8-33d23753c678\",\"pf_rd_r\":\"H21WNBAW5EGZX90ND4PN\",\"pf_rd_t\":\"40701\",\"pd_rd_wg\":\"e6DPw\",\"pf_rd_p\":\"946762da-975a-438a-9e2b-a585cbe769b5\",\"pf_rd_i\":\"desktop-dp-sims\",\"pd_rd_w\":\"xg8TH\"}",
      "faceoutTemplateClass": "PI::P13N::ViewTemplates::Product::Desktop::CarouselFaceout",
      "auiDeviceType": "desktop",
      "imageWidth": 160,
      "schemaVersion": 2,
      "productDetailsTemplateClass": "PI::P13N::ViewTemplates::ProductDetails::Desktop::Base",
      "forceFreshWin": 0,
      "productDataFlavor": "Faceout",
      "relatedRequestID": "H21WNBAW5EGZX90ND4PN",
      "maxLineCount": 6
    },

墨色风雨
3 answers

白衣染霜花

Just use Python's json library:

import json

j1 = """{
  "ajax": {
    "params": {
      "asinMetadataKeys": "adId",
      "featureId": "SimilaritiesCarousel",
      "reftagPrefix": "pd_sbs_60",
      "widgetTemplateClass": "PI::Similarities::ViewTemplates::Carousel::Desktop",
      "imageHeight": 160,
      "faceoutTemplateClass": "PI::P13N::ViewTemplates::Product::Desktop::CarouselFaceout",
      "auiDeviceType": "desktop",
      "imageWidth": 160,
      "schemaVersion": 2,
      "productDetailsTemplateClass": "PI::P13N::ViewTemplates::ProductDetails::Desktop::Base",
      "forceFreshWin": 0,
      "productDataFlavor": "Faceout",
      "relatedRequestID": "H21WNBAW5EGZX90ND4PN",
      "maxLineCount": 6
    },
    "id_list": ["B01M8QSY16:", "B017XBDBI6:", "B01GL5MYCE:", "B0751DHYXC:", "B01AHWOH54:", "B01M7XYENW:", "B01N7FKKXV:", "B07C1NLKS5:", "B00R25QZDC:", "B01AJB1VFW:", "B079K773M7:", "B07DX3W41P:", "B01GL5606A:", "B07654YLSB:", "B01GFL6MZE:", "B00WLI5E3M:", "B01CTE28DG:", "B01BELELVC:", "B00ZY7H91M:", "B077TPG2WK:", "B01G503MC6:", "B01LYZFC4V:", "B00ID9UQYK:", "B07C3T52LB:", "B07DX39RNS:", "B076551MZP:", "B0761RWKPQ:", "B00T8FD9YM:", "B07653JBYS:", "B07G316H74:", "B01FSEBC9K:", "B014QKBVH0:", "B01BVA2I4S:", "B01CVOZNAE:", "B07D19JDH9:", "B018ACDMJK:", "B00V0H83YW:", "B07C432PK3:", "B07B9P4T4V:", "B076H4WWLK:", "B077G3Y86F:", "B077Z7XLJF:", "B01NCFB2BB:", "B01M4I7FMC:", "B01BEVFJCM:", "B01FSEBC8G:", "B07DXCTKB6:", "B01NBHYAR0:", "B07DGWJ887:", "B00SLP58SU:", "B01N55H5AE:", "B013AZCPLS:", "B076PC3NYV:", "B01BVA2JHE:", "B07FF38J8C:", "B07DHGTS81:", "B00R25QZHS:"],
    "url": "/gp/p13n-shared/faceout-partial",
    "id_param_name": "asins"
  },
  "baseAsin": "B01GL56060",
  "name": "desktop-dp-sims_session-similarities",
  "set_size": 57
}"""

# Parse the document, then drop the ":" from every ID in ajax.id_list
d1 = json.loads(j1)
id_list = [elem.replace(":", "") for elem in d1["ajax"]["id_list"]]
print(id_list)

Output:

['B01M8QSY16', 'B017XBDBI6', ... 'B00R25QZHS']

I had to remove the "linkGetParameters": ... line because it would not parse for me; it did not seem to be valid JSON.
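A possible explanation for the line that had to be removed, offered as an assumption rather than a confirmed diagnosis: inside a regular triple-quoted Python string, every \" is collapsed to a bare " before json.loads ever sees the text, which breaks the nested string value. A raw string (r"""...""") keeps the backslashes intact. The sketch below uses a shortened excerpt of the payload with only one of the original linkGetParameters pairs; the names raw_doc, plain_doc, link_params and ids are illustrative.

```python
import json

# Shortened excerpt for illustration (not the full payload).
# Raw string: the backslashes survive, so the text is still valid JSON.
raw_doc = r"""{
  "linkGetParameters": "{\"pf_rd_t\":\"40701\"}",
  "id_list": ["B01M8QSY16:", "B017XBDBI6:"]
}"""

# Same text without the r prefix: Python turns each \" into " first,
# so the nested quotes end the JSON string too early.
plain_doc = """{
  "linkGetParameters": "{\"pf_rd_t\":\"40701\"}",
  "id_list": ["B01M8QSY16:", "B017XBDBI6:"]
}"""

parsed = json.loads(raw_doc)
# The nested value is itself a JSON document stored as a string.
link_params = json.loads(parsed["linkGetParameters"])
# rstrip only removes the trailing colon, leaving the rest untouched.
ids = [elem.rstrip(":") for elem in parsed["id_list"]]

try:
    json.loads(plain_doc)
    plain_ok = True
except json.JSONDecodeError:
    plain_ok = False

print(link_params, ids, plain_ok)
```

If the JSON arrives from a file or an HTTP response rather than a pasted literal, this escaping issue should not come up at all, since no Python string literal is involved.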

千巷猫影

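Since you can't use the JSON library, you can try this expression (tested on Python 3):

import re

result = [item.strip('":') for item in re.search(r'"id_list": \[(.*)\],', jsonstr).group(1).split(", ")]

(where jsonstr is the string containing all of the original JSON).

To make it easier to follow: the code above uses re.search (rather than re.findall, as you suggested) to locate and select the id_list line as a whole, group narrows the selection to the text between the brackets, split turns that string into a list, and strip trims the unneeded characters from each list item. That leaves you with a list of IDs like the one you specified in your question.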
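The chain in that one-liner can be unpacked step by step, and the re.findall form asked about in the question is also possible. This is a sketch on a shortened, hypothetical sample rather than the full document; sample, ids_by_search and ids_by_findall are illustrative names. The findall pattern assumes every ID consists of uppercase letters and digits followed by a colon, which holds for the values shown in this thread but is not guaranteed in general.

```python
import re

# Shortened sample; the real jsonstr would hold the whole document.
sample = (
    '{"ajax": {"id_list": ["B01M8QSY16:", "B017XBDBI6:", "B01GL5MYCE:"], '
    '"url": "/gp/p13n-shared/faceout-partial"}, "baseAsin": "B01GL56060"}'
)

# Step 1: locate the id_list array. Non-greedy, so the capture stops at the first "]".
match = re.search(r'"id_list":\s*\[(.*?)\]', sample, re.DOTALL)
# Step 2: group(1) is only the text between the brackets.
inner = match.group(1)
# Step 3: split on commas to get one quoted item per ID.
pieces = inner.split(",")
# Step 4: trim whitespace, then the quotes and the trailing colon.
ids_by_search = [piece.strip().strip('":') for piece in pieces]

# re.findall alternative: capture the ID part of every "XXXX:" token.
ids_by_findall = re.findall(r'"([A-Z0-9]+):"', sample)

print(ids_by_search)
print(ids_by_findall)
```

Because re.findall scans the entire string, running it on inner instead of the whole document would be the more cautious choice if other values could happen to match the same pattern.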