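TELNETCONSOLE_ENABLED
Default: […]
A boolean that specifies whether the telnet console will be enabled (provided its extension is also enabled).

TEMPLATES_DIR
Default: […]
The directory in which to look for templates when creating new projects […]
The project name must not conflict with the names of custom files or directories in […].

TWISTED_REACTOR
New in version 2.0.
Default: […]
Import path of a given reactor.
Scrapy installs this reactor if no other reactor is installed yet, such as when […]
If you are using […]
If a reactor is already installed, […]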
In order to use the reactor installed by Scrapy:

import scrapy
from twisted.internet import reactor  # module-level reactor import


class QuotesSpider(scrapy.Spider):
    name = 'quotes'

    def __init__(self, *args, **kwargs):
        # Optional spider argument: seconds to wait before stopping the crawl
        self.timeout = int(kwargs.pop('timeout', '60'))
        super(QuotesSpider, self).__init__(*args, **kwargs)

    def start_requests(self):
        # Schedule self.stop() to run after self.timeout seconds
        reactor.callLater(self.timeout, self.stop)

        urls = ['http://quotes.toscrape.com/page/1']
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        for quote in response.css('div.quote'):
            yield {'text': quote.css('span.text::text').get()}

    def stop(self):
        self.crawler.engine.close_spider(self, 'timeout')
becomes:

import scrapy


class QuotesSpider(scrapy.Spider):
    name = 'quotes'

    def __init__(self, *args, **kwargs):
        # Optional spider argument: seconds to wait before stopping the crawl
        self.timeout = int(kwargs.pop('timeout', '60'))
        super(QuotesSpider, self).__init__(*args, **kwargs)

    def start_requests(self):
        # Import the reactor here rather than at module level
        from twisted.internet import reactor
        # Schedule self.stop() to run after self.timeout seconds
        reactor.callLater(self.timeout, self.stop)

        urls = ['http://quotes.toscrape.com/page/1']
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        for quote in response.css('div.quote'):
            yield {'text': quote.css('span.text::text').get()}

    def stop(self):
        self.crawler.engine.close_spider(self, 'timeout')
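
The two spiders differ only in when the import statement runs: a module-level import executes as soon as the file is loaded, while an import inside a method executes only when that method is called. Because importing twisted.internet.reactor installs Twisted's default reactor if none is installed yet, that timing matters. The sketch below illustrates the timing only. It uses the standard-library colorsys module as a stand-in for twisted.internet.reactor (an assumption, so that the example runs without Twisted or Scrapy), and LazySpider and is_loaded are hypothetical names.

```python
import sys

# Stand-in for twisted.internet.reactor (assumption made for this sketch only).
STAND_IN = "colorsys"


def is_loaded(name):
    """Return True if the named module has already been imported."""
    return name in sys.modules


class LazySpider:
    """Hypothetical class that mimics the deferred import in start_requests."""

    def start_requests(self):
        # Deferred import: this line runs only when the method is called.
        import colorsys
        return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)


# Start from a clean state in case something imported the module earlier.
sys.modules.pop(STAND_IN, None)

spider = LazySpider()
loaded_before_call = is_loaded(STAND_IN)  # class defined, method not yet called
spider.start_requests()
loaded_after_call = is_loaded(STAND_IN)   # the import ran inside the method

print(loaded_before_call, loaded_after_call)
```

Defining the class does not trigger the import; only calling the method does. In the spider above, this gives Scrapy the chance to install its reactor before the import statement is reached.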
The default value of […]
For additional information, see Choosing a Reactor and GUI Toolkit Integration.
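
As a rough illustration of how these three settings might appear together in a project's settings module, here is a minimal sketch. The values are assumptions chosen for the example, not defaults: the template directory name is hypothetical, and the reactor path is Twisted's asyncio-based reactor, picked only as one valid import path.

```python
# settings.py (sketch; all values below are illustrative assumptions)

# Turn the telnet console off for this project.
TELNETCONSOLE_ENABLED = False

# Hypothetical directory holding custom project templates.
TEMPLATES_DIR = "my_templates"

# Import path of the reactor Scrapy should install if none is installed yet.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
```

Scrapy reads settings as plain module-level names, so each setting is an ordinary assignment.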