笨鸟编程 - Python Tutorial for Absolute Beginners


Requests and Responses

Published by: 笨鸟自学网



bindaddress

The outgoing IP address to use when performing the request.

download_timeout

The amount of time (in seconds) that the downloader waits before timing out. See also: DOWNLOAD_TIMEOUT.

download_latency

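The amount of time spent fetching the response, measured from the moment the request was started, i.e. from when the HTTP message was sent over the network. This meta key only becomes available once the response has been downloaded. While most other meta keys are used to control Scrapy's behavior, this one is meant to be read-only.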

download_fail_on_dataloss

Whether or not to fail on broken responses. See: DOWNLOAD_FAIL_ON_DATALOSS.

max_retry_times

This meta key sets the number of retries for an individual request. When it is set, the max_retry_times meta key takes precedence over the RETRY_TIMES setting.

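Stopping the download of a Response

Raising a StopDownload exception from a handler for the bytes_received or headers_received signals stops the download of the given response. See the following example: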

import scrapy
from scrapy import signals
from scrapy.exceptions import StopDownload


class StopSpider(scrapy.Spider):
    name = "stop"
    start_urls = ["https://docs.scrapy.org/en/latest/"]

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super().from_crawler(crawler, *args, **kwargs)
        # Run on_bytes_received every time a chunk of the response body arrives.
        crawler.signals.connect(spider.on_bytes_received, signal=signals.bytes_received)
        return spider

    def parse(self, response):
        # 'last_chars' shows that the full response was not downloaded
        yield {"len": len(response.text), "last_chars": response.text[-40:]}

    def on_bytes_received(self, data, request, spider):
        # Stop after the first chunk; fail=False sends the partial response to the callback.
        raise StopDownload(fail=False)

This produces the following output:

2020-05-19 17:26:12 [scrapy.core.engine] INFO: Spider opened
2020-05-19 17:26:12 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2020-05-19 17:26:13 [scrapy.core.downloader.handlers.http11] DEBUG: Download stopped for <GET https://docs.scrapy.org/en/latest/> from signal handler StopSpider.on_bytes_received
2020-05-19 17:26:13 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://docs.scrapy.org/en/latest/> (referer: None) ['download_stopped']
2020-05-19 17:26:13 [scrapy.core.scraper] DEBUG: Scraped from <200 https://docs.scrapy.org/en/latest/>
{'len': 279, 'last_chars': 'dth, initial-scale=1.0">\n  \n  <title>Scr'}
2020-05-19 17:26:13 [scrapy.core.engine] INFO: Closing spider (finished)

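By default, the resulting responses are handled by their corresponding errbacks. To have their callback called instead, as in this example, pass fail=False to the StopDownload exception.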

