Command line tool

view

Syntax: scrapy view <url>
Requires project: no

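Opens the given URL in a browser, as your Scrapy spider would "see" it. Spiders sometimes see a page differently from regular users, so this command lets you check what the spider "sees" and confirm that it is what you expect.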

Supported options:

--spider=SPIDER: bypass spider autodetection and force use of a specific spider
--no-redirect: do not follow HTTP 3xx redirects (the default is to follow them)

Usage example:

$ scrapy view http://www.example.com/some/page.html
[ ... browser starts ... ]
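
The options above can also be assembled programmatically. The following is a minimal sketch, assuming you want to launch `scrapy view` from a Python script; the helper name `build_view_command` and the spider name `myspider` are hypothetical and not part of Scrapy.

```python
import shlex


def build_view_command(url, spider=None, no_redirect=False):
    """Build the argument list for a `scrapy view` call.

    Hypothetical helper: it only assembles the flags documented above.
    """
    argv = ["scrapy", "view"]
    if spider is not None:
        # Force a specific spider instead of relying on autodetection.
        argv.append(f"--spider={spider}")
    if no_redirect:
        # Do not follow HTTP 3xx redirects.
        argv.append("--no-redirect")
    argv.append(url)
    return argv


if __name__ == "__main__":
    argv = build_view_command(
        "http://www.example.com/some/page.html",
        spider="myspider",  # hypothetical spider name
        no_redirect=True,
    )
    # Print the command as it would be typed in a terminal.
    print(shlex.join(argv))
    # To actually run it (requires Scrapy to be installed):
    # import subprocess; subprocess.run(argv, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the URL contains characters such as `?` or `&`.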

shell

Syntax: scrapy shell [url]
Requires project: no

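Starts the Scrapy shell for the given URL, or an empty shell if no URL is given. UNIX-style local file paths are also supported, either relative (with a ./ or ../ prefix) or absolute. See Scrapy shell for more information.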
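
To illustrate the path convention, here is a rough sketch of how an argument with a `./`, `../`, or `/` prefix could be mapped to a `file://` URI while other arguments are passed through as URLs. This is an assumption-laden illustration using only the standard library, not Scrapy's own implementation; the function name `to_shell_target` is hypothetical.

```python
import os
from pathlib import Path


def to_shell_target(arg):
    """Map a command-line argument to a URL string (illustrative only)."""
    if arg.startswith(("./", "../", "/")):
        # Treat the argument as a local file path and convert it to a file URI.
        return Path(os.path.abspath(arg)).as_uri()
    # Anything else is assumed to already be a URL and is returned unchanged.
    return arg


if __name__ == "__main__":
    print(to_shell_target("./page.html"))
    print(to_shell_target("http://www.example.com/some/page.html"))
```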

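Supported options:

--spider=SPIDER: bypass spider autodetection and force use of a specific spider
-c code: evaluate the code in the shell, print the result and exit
--no-redirect: do not follow HTTP 3xx redirects (the default is to follow them); this only affects the URL passed as an argument on the command line; once you are inside the shell, fetch(url) will still follow HTTP redirects by default.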

Usage example:

$ scrapy shell http://www.example.com/some/page.html
[ ... scrapy shell starts ... ]

$ scrapy shell --nolog http://www.example.com/ -c '(response.status, response.url)'
(200, 'http://www.example.com/')

# shell follows HTTP redirects by default
$ scrapy shell --nolog http://httpbin.org/redirect-to?url=http%3A%2F%2Fexample.com%2F -c '(response.status, response.url)'
(200, 'http://example.com/')

# you can disable this with --no-redirect
# (only for the URL passed as command line argument)
$ scrapy shell --no-redirect --nolog http://httpbin.org/redirect-to?url=http%3A%2F%2Fexample.com%2F -c '(response.status, response.url)'
(302, 'http://httpbin.org/redirect-to?url=http%3A%2F%2Fexample.com%2F')
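
The redirect examples above pass the target address as a percent-encoded query parameter (`http%3A%2F%2Fexample.com%2F`). As a small sketch of how such a test URL can be produced with the Python standard library, the helper below builds it from a plain target URL; the function name `redirect_test_url` is hypothetical.

```python
from urllib.parse import parse_qs, quote, urlsplit


def redirect_test_url(target):
    """Build an httpbin redirect-to URL for the given target (hypothetical helper)."""
    # safe="" forces ':' and '/' to be percent-encoded as well.
    return "http://httpbin.org/redirect-to?url=" + quote(target, safe="")


if __name__ == "__main__":
    url = redirect_test_url("http://example.com/")
    print(url)
    # Decoding the query string recovers the original target.
    print(parse_qs(urlsplit(url).query)["url"][0])
```

In shells that treat `?` as a glob character, it may be safer to wrap the resulting URL in single quotes when passing it to `scrapy shell`.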