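Item pipeline example

Price validation and dropping items with no prices

Let's take a look at the following hypothetical pipeline, which adjusts the price attribute of items whose price does not include VAT (the price_excludes_vat attribute) and drops items that have no price:

    from itemadapter import ItemAdapter
    from scrapy.exceptions import DropItem


    class PricePipeline:

        vat_factor = 1.15

        def process_item(self, item, spider):
            adapter = ItemAdapter(item)
            if adapter.get('price'):
                # Add VAT only when the scraped price excludes it.
                if adapter.get('price_excludes_vat'):
                    adapter['price'] = adapter['price'] * self.vat_factor
                return item
            else:
                # No usable price: discard the item.
                raise DropItem(f"Missing price in {item}")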
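To see the effect of the pipeline above without running a crawl, the same decision logic can be exercised on plain dicts. This is only a sketch: the DropItem class here is a local stand-in for scrapy.exceptions.DropItem, apply_vat and run_examples are hypothetical helper names, and the sample items are invented for illustration.

```python
VAT_FACTOR = 1.15  # same factor as PricePipeline.vat_factor


class DropItem(Exception):
    """Local stand-in for scrapy.exceptions.DropItem (assumption for this sketch)."""


def apply_vat(item):
    """Mirror PricePipeline.process_item on a plain dict (hypothetical helper)."""
    if item.get("price"):
        if item.get("price_excludes_vat"):
            item["price"] = item["price"] * VAT_FACTOR
        return item
    raise DropItem(f"Missing price in {item}")


def run_examples():
    """Run three invented items through apply_vat and collect the outcomes."""
    samples = [
        {"name": "a", "price": 100, "price_excludes_vat": True},  # VAT is added
        {"name": "b", "price": 100},                              # left unchanged
        {"name": "c"},                                            # dropped
    ]
    kept, dropped = [], []
    for sample in samples:
        try:
            kept.append(apply_vat(sample))
        except DropItem:
            dropped.append(sample["name"])
    return kept, dropped


if __name__ == "__main__":
    kept, dropped = run_examples()
    print(kept)
    print(dropped)
```

Because the check relies on truthiness, an item whose price is 0 is dropped in the same way as an item with no price at all.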
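Write items to a JSON file

The following pipeline stores all scraped items (from all spiders) in a single items.jl file, with one item per line serialized as JSON:

    import json

    from itemadapter import ItemAdapter


    class JsonWriterPipeline:

        def open_spider(self, spider):
            # Called when the spider is opened: create the output file.
            self.file = open('items.jl', 'w')

        def close_spider(self, spider):
            # Called when the spider is closed: release the file handle.
            self.file.close()

        def process_item(self, item, spider):
            # Serialize each item as one JSON object per line.
            line = json.dumps(ItemAdapter(item).asdict()) + "\n"
            self.file.write(line)
            return item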
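The items.jl file written by this pipeline holds one JSON object per line, so it can be read back one line at a time. The following is a minimal sketch using only the standard library; read_items and demo are hypothetical helper names, and the sample records are invented.

```python
import json
import os
import tempfile


def read_items(path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def demo():
    """Write two invented records in the pipeline's line format, then read them back."""
    records = [{"name": "a", "price": 10}, {"name": "b", "price": 20}]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "items.jl")
        with open(path, "w", encoding="utf-8") as f:
            for record in records:
                f.write(json.dumps(record) + "\n")  # same line format as process_item
        # Materialize the list before the temporary directory is removed.
        return list(read_items(path))


if __name__ == "__main__":
    for item in demo():
        print(item)
```

Reading line by line keeps memory use flat even for large crawls, which is one practical reason to prefer this layout over a single JSON array.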
Note: The purpose of JsonWriterPipeline is only to show how to write item pipelines. If you really want to store all scraped items in a JSON file, you should use Feed exports.
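For comparison, a Feed exports configuration that produces the same kind of file might look like the following in a project's settings.py. This is a sketch that assumes a Scrapy version supporting the FEEDS setting (2.1 or later); the output file name is simply carried over from the example above.

```python
# settings.py (fragment)
# Export every scraped item to items.jl, one JSON object per line.
FEEDS = {
    "items.jl": {
        "format": "jsonlines",
        "encoding": "utf8",
    },
}
```

With a setting like this in place, no custom pipeline is needed for the export itself.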