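I'm trying to scrape data from a few websites for a proof-of-concept project. I'm currently using Python 3 and BS4 to collect the data I need. I have a dictionary of URLs from three sites. Each site needs a different method to collect the data because their HTML differs. I've been using a "try, if, else" stack, but I keep running into problems. If you could take a look at my code and help me fix it, that would be great!
As I add more sites to scrape, I won't be able to keep using the "try, if, else" stack to cycle through the various methods until one of them finds the data. How can I future-proof this code so that I can add any number of sites and scrape data from the various elements they contain?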
# Scraping Script Here:
def job():
    prices = {
        # LIVEPRICES
        "LIVEAUOZ": {"url": "https://www.gold.co.uk/",
                     "trader": "Gold.co.uk",
                     "metal": "Gold",
                     "type": "LiveAUOz"},
        # GOLD
        "GLDAU_BRITANNIA": {"url": "https://www.gold.co.uk/gold-coins/gold-britannia-coins/britannia-one-ounce-gold-coin-2020/",
                            "trader": "Gold.co.uk",
                            "metal": "Gold",
                            "type": "Britannia"},
        "GLDAU_PHILHARMONIC": {"url": "https://www.gold.co.uk/gold-coins/austrian-gold-philharmoinc-coins/austrian-gold-philharmonic-coin/",
                               "trader": "Gold.co.uk",
                               "metal": "Gold",
                               "type": "Philharmonic"},
        "GLDAU_MAPLE": {"url": "https://www.gold.co.uk/gold-coins/canadian-gold-maple-coins/canadian-gold-maple-coin/",
                        "trader": "Gold.co.uk",
                        "metal": "Gold",
                        "type": "Maple"},
        # SILVER
        "GLDAG_BRITANNIA": {"url": "https://www.gold.co.uk/silver-coins/silver-britannia-coins/britannia-one-ounce-silver-coin-2020/",
                            "trader": "Gold.co.uk",
                            "metal": "Silver",
                            "type": "Britannia"},
        "GLDAG_PHILHARMONIC": {"url": "https://www.gold.co.uk/silver-coins/austrian-silver-philharmonic-coins/silver-philharmonic-2020/",
                               "trader": "Gold.co.uk",
                               "metal": "Silver",
                               "type": "Philharmonic"}
    }
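
For what it's worth, one possible way to avoid a growing "try, if, else" stack is to key a parser function on the "trader" field that every entry already carries, so that adding a site means adding one function. The sketch below is only an illustration under assumptions: PARSERS, parser_for, parse_gold_co_uk and extract_price are hypothetical names, the "price" class is a guess rather than the real markup of gold.co.uk, and the regex is a standard-library stand-in for whatever BS4 calls each site actually needs.

```python
import re
from typing import Callable, Dict, Optional

# Maps a trader name (the "trader" field in the prices dict) to its parser.
PARSERS: Dict[str, Callable[[str], Optional[str]]] = {}


def parser_for(trader: str):
    """Decorator that registers a parsing function for one trader."""
    def register(func):
        PARSERS[trader] = func
        return func
    return register


@parser_for("Gold.co.uk")
def parse_gold_co_uk(html: str) -> Optional[str]:
    # Placeholder body: the real function would hold the BS4 calls for this
    # site. The "price" class name is an assumption, not the site's markup.
    match = re.search(r'class="price"[^>]*>\s*£?([\d,]+\.?\d*)', html)
    return match.group(1) if match else None


def extract_price(entry: dict, html: str) -> Optional[str]:
    """Pick the parser from the entry's trader and apply it to the page HTML."""
    parser = PARSERS.get(entry["trader"])
    if parser is None:
        raise KeyError(f"No parser registered for trader {entry['trader']!r}")
    return parser(html)
```

With something like this, the loop over prices would fetch each entry's "url" and pass the entry and the page HTML to extract_price. A trader with no registered parser fails loudly instead of falling through a chain of try blocks.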