[Question 3] Simulated login: POSTing XML-format parameters and automatic redirects

This topic was created 3074 days ago; the information in it may have evolved or changed since then.

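Writing this scraper for the Telecom online service portal, I feel like I have fallen into every pit there is, and I still can't cleanly solve all the problems.

From packet captures, the HTTP exchange during the whole login process looks roughly like this:

1. POST the login form to the LoginServlet URL; status code 200.
2. POST a set of XML parameters to the Login URL; status code 302.
3. The 302 redirects three times, ending at the account login-success page.
4. GET the login-success page, which completes the login.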

——————————————————

*0.LoginURL

*1.POST (LoginServlet) --------- login form --------- referer:0.LoginURL --------- status_code:200

*2.POST --------- SSORuquetXML --------- referer:LoginServlet --------- location:3 --------- status_code:302

*3.GET --------- referer:LoginServlet --------- location:4 --------- status_code:302

*4.GET --------- referer:LoginServlet --------- location:5 --------- status_code:302

*5.GET --------- referer:LoginServlet --------- status_code:200

*6.GET --------- acount/init.action --------- referer:5

—————————————————— I can't take it anymore...

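1. How do I POST data in XML format from Python? (What I have is in dict form; searching Baidu only turns up things like parsing XML files.)

2. How are some of the ID parameters in the XML generated? (I inspected the page and found no JS that produces them.) Comparing different accounts, only a few ID parameters differ.

3. Does requests.Session() not manage cookies automatically? (The browser capture shows many cookies, but my code ends up with only one cookie or none.)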
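
For question 1, a minimal sketch of one way to do it, assuming the server accepts the XML as the raw request body: build the XML from the dict with the standard library, then pass the resulting bytes to `data=` instead of passing the dict itself (requests form-encodes a dict). The helper names `dict_to_xml` and `post_xml`, the root tag, the field names, and the `Content-Type` value are all hypothetical; the real ones have to be copied from the captured request.

```python
import xml.etree.ElementTree as ET


def dict_to_xml(root_tag, fields):
    """Serialize a flat dict into an XML document, returned as UTF-8 bytes."""
    root = ET.Element(root_tag)
    for name, value in fields.items():
        child = ET.SubElement(root, name)
        child.text = str(value)
    return ET.tostring(root, encoding="utf-8")


def post_xml(session, url, root_tag, fields, referer=None):
    """POST the XML as the raw request body.

    `session` is expected to be a requests.Session (anything with a
    compatible .post() works). The header values are assumptions.
    """
    headers = {"Content-Type": "application/xml; charset=utf-8"}
    if referer:
        headers["Referer"] = referer
    # Bytes passed to data= are sent unchanged; a dict would be form-encoded.
    body = dict_to_xml(root_tag, fields)
    return session.post(url, data=body, headers=headers)
```

With a real session this would be called as `post_xml(requests.Session(), login_url, "Request", {"UserName": "..."}, referer=servlet_url)`, where `login_url` and `servlet_url` stand for the Login and LoginServlet addresses from the capture. If the captured XML has nested elements or attributes, a flat dict is not enough and the body is easier to write as a string template.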

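If I hit a few more pits and still can't solve this... I'll give up. Thanks, everyone, for your patient answers over the past few days.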

2 replies • 2016-08-01 18:13:22 +08:00

Huayx9

2016-08-01 17:07:17 +08:00

After the POST, any redirects that follow can be handled automatically by requests.
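
A small sketch for checking this, assuming the response and session come from requests: requests follows redirects by default, keeps the intermediate responses in `response.history`, and stores cookies from every hop in `session.cookies`, while `response.cookies` holds only what the final response set. That difference is one possible reason (an assumption, not confirmed in this thread) for seeing only one cookie or none in code. The helper names `redirect_chain` and `cookie_names` are made up for illustration.

```python
def redirect_chain(response):
    """Return (status_code, url, Location header) per hop, final response last."""
    hops = []
    for hop in list(response.history) + [response]:
        hops.append((hop.status_code, hop.url, hop.headers.get("Location")))
    return hops


def cookie_names(session):
    """Sorted names of all cookies the session has collected so far."""
    return sorted(cookie.name for cookie in session.cookies)
```

After `resp = session.post(...)`, printing `redirect_chain(resp)` should show the 302 hops followed by the final 200, and `cookie_names(session)` can be compared against the cookies seen in the browser capture.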

Huayx9

2016-08-01 18:13:22 +08:00

I'm an idiot... please don't reply to me.