V2EX › All replies by forgetTb › Page 1 of 1

2018-05-18 10:24:43 +08:00

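Replied to the topic created by forgetTb › Python › Combining the Tornado and scrapy (Twisted) frameworks: 0, support newrelic server monitoring; 1, asynchronous and non-blocking; 2, realtime responses (without going through any database storage)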

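@golmic 1, What I need is a real-time response, similar to scrapyrt (realtime, nonblocking). However, scrapyrt uses Twisted as its web server, which cannot be used together with newrelic (a server monitoring tool). So now I want to use Tornado as the web server (it supports newrelic and is asynchronous).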

2018-05-18 10:20:50 +08:00

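@crb912 Use Tornado as the web server (so that newrelic can be used for server monitoring), call the spiders of a scrapy project, and return the results in real time. (scrapyrt uses Twisted as its web server, which cannot be used together with newrelic.)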

2018-05-16 10:42:22 +08:00

Alternatively, tornado.platform.twisted could be used.
Specifically, inside the Tornado framework:
import tornado.platform.twisted
tornado.platform.twisted.install()
from twisted.internet import reactor

Then call Scrapy to start the spider (the event loop code):
dfd = process.crawl(QuotesSpider)
# process.start()  # the script will block here until the crawling is finished
# d.addBoth(lambda _: reactor.stop())
result = dfd.addCallback(self.result_items)
and get the scraped data directly.
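For anyone trying this today, here is a minimal sketch of the same idea, not the exact code from the post. As far as I know, tornado.platform.twisted.install() is no longer available in Tornado 6, so this sketch swaps in a different technique: Twisted's asyncio reactor is installed on the event loop that Tornado also uses, and the Deferred returned by Scrapy is awaited inside the handler. The spider body, the start URL, the /crawl route, port 8888, and the helper items_to_json are all assumptions made up for illustration; only the QuotesSpider name comes from the post.

```python
import asyncio
import json

ASYNCIO_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"


def items_to_json(items):
    """Serialize scraped items (dicts or dict-like objects) to a JSON string."""
    return json.dumps({"items": [dict(item) for item in items]}, ensure_ascii=False)


def main(port=8888):
    # Third-party imports live inside main() on purpose: the asyncio reactor
    # must be installed before anything imports twisted.internet.reactor.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    from twisted.internet import asyncioreactor
    asyncioreactor.install(loop)
    from twisted.internet import reactor

    import scrapy
    import tornado.web
    from scrapy import signals
    from scrapy.crawler import CrawlerRunner

    class QuotesSpider(scrapy.Spider):
        # Hypothetical spider body; the post only mentions the class name.
        name = "quotes"
        start_urls = ["https://quotes.toscrape.com/"]  # placeholder URL

        def parse(self, response):
            for quote in response.css("div.quote"):
                yield {"text": quote.css("span.text::text").get()}

    runner = CrawlerRunner({"TWISTED_REACTOR": ASYNCIO_REACTOR})

    class CrawlHandler(tornado.web.RequestHandler):
        async def get(self):
            items = []

            def collect(item, **kwargs):
                # Called once per scraped item; nothing is written to a database.
                items.append(item)

            crawler = runner.create_crawler(QuotesSpider)
            crawler.signals.connect(collect, signal=signals.item_scraped)
            # Convert the Twisted Deferred into an asyncio Future so the
            # handler waits for the crawl without blocking other requests.
            await runner.crawl(crawler).asFuture(asyncio.get_running_loop())
            self.set_header("Content-Type", "application/json")
            self.write(items_to_json(items))

    app = tornado.web.Application([(r"/crawl", CrawlHandler)])
    app.listen(port)
    # reactor.run() fires Twisted's startup hooks and then runs the shared
    # asyncio loop, which also serves the Tornado application.
    reactor.run()


if __name__ == "__main__":
    main()
```

With this running, requesting /crawl should start one crawl and respond with the collected items once the spider closes. I have not measured how it behaves under load or together with newrelic, so treat it as a starting point only.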

Reference links:
https://stackoverflow.com/questions/36384286/how-to-integrate-flask-scrapy
http://www.tornadoweb.org/en/stable/twisted.html#twisted-on-tornado
https://doc.scrapy.org/en/latest/topics/practices.html

2016-09-18 13:52:11 +08:00

Replied to the topic created by ammzen › Q&A › Found this line of code on the official website of the People's Bank of China

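@ammzen, how did you solve it? Could you explain in detail? I ran into the same problem on another website.
When I request a URL with Python's requests, it always returns that block of code. Is there a selenium parameter that can be set to enable javascript?
When I visit with a browser, that dialog pops up on the first visit; after closing it and refreshing, the page loads normally.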
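On the selenium question: Selenium drives a real browser, so JavaScript runs by default and normally no extra parameter is needed. The behaviour described above (normal page after a refresh) suggests the script sets a cookie, which plain requests never gets because it does not execute JavaScript. Below is a rough sketch under that assumption: load the page in headless Chrome, refresh once, then copy the cookies into a requests session. The function names fetch_after_js and cookies_for_requests are made up for illustration, and a site may need an explicit wait or other steps that this sketch does not cover.

```python
def cookies_for_requests(selenium_cookies):
    """Turn Selenium's list of cookie dicts into a name -> value mapping."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def fetch_after_js(url):
    """Load url in a real browser, then return (html, requests session)."""
    # Imported here so the helper above can be used without Selenium
    # or a browser driver installed.
    import requests
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)    # first visit: the browser executes the page's script
        driver.refresh()   # second visit: mirrors the manual refresh described above
        html = driver.page_source
        cookies = cookies_for_requests(driver.get_cookies())
        user_agent = driver.execute_script("return navigator.userAgent")
    finally:
        driver.quit()

    # Reuse the browser's cookies and User-Agent for later plain HTTP requests.
    session = requests.Session()
    session.headers["User-Agent"] = user_agent
    session.cookies.update(cookies)
    return html, session
```

If the cookie assumption holds, later calls such as session.get(url) may return the real page instead of the script. If it does not, fetching every page through the browser is the fallback.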