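I wrote a spider that returns data littered with spaces and newline characters. The newline characters also cause the extract() method to return a list.
How can I filter these out before the text reaches the selector? Filtering after each extract() call violates the DRY principle, because I need to extract a lot of data from the page and most of it has no attributes, so the only way to parse it is by index.
How do I filter these?
The spider returns bad data, like this: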
{ 'aired': ['\n ', '\n Apr 3, 2016 to Jun 26, 2016\n '],
'broadcast': [],
'duration': ['\n ', '\n 24 min. per ep.\n '],
'episodes': ['\n ', '\n 13\n '],
'favourites': ['\n ', '\n 22,673\n'],
'genres': ['Action', 'Comedy', 'School', 'Shounen', 'Super Power'],
'image_url': ['https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
'https://myanimelist.cdn-dena.com/images/anime/10/78745.jpg',
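
One possible way to avoid repeating the cleanup after every extract() call is to keep it in a single helper and apply it to the whole item at once. The following is a minimal sketch in plain Python, not tied to Scrapy; the names clean and clean_item are made up for illustration, and the sample values are copied from the output above.

```python
def clean(values):
    """Strip each extracted string and drop entries that are only whitespace."""
    stripped = (value.strip() for value in values)
    return [value for value in stripped if value]


def clean_item(item):
    """Apply clean() to every field of a dict that maps field names to lists."""
    return {key: clean(values) for key, values in item.items()}


# Sample values taken from the spider output shown in the question.
raw = {
    'aired': ['\n ', '\n Apr 3, 2016 to Jun 26, 2016\n '],
    'broadcast': [],
    'episodes': ['\n ', '\n 13\n '],
    'genres': ['Action', 'Comedy', 'School', 'Shounen', 'Super Power'],
}

cleaned = clean_item(raw)
print(cleaned)
```

In Scrapy itself, a function like clean could probably be used as an item loader input processor so that it runs for every field automatically, or the whitespace could be collapsed inside the query with the XPath normalize-space() function. Neither is shown here, and which one fits depends on how the spider builds its items.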