
[Help] How to extract all attribute values from the tags in a web page

  •   15874103329 · Dec 22, 2018 · 2354 views

    Here is my code. How should I change it?

    import requests
    from pyquery import PyQuery as pq
    from urllib.parse import urlencode
    import re


    def dizhi():
        # Fetch the Weibo search results page and return its HTML.
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3573.0 Safari/537.36'}
        data = {
            'q': '微信群',
            'typeall': '1',
            'suball': '1',
            'timescope': 'custom:2018 - 12 - 20 - 0: 2018 - 12 - 22 - 0',
            'Refer': 'g'
        }
        url = 'https://s.weibo.com/weibo/%25E5%25AE%259D%25E5%25A6%2588%25E7%25BE%25A4?' + urlencode(data)
        wangzhi = requests.get(url, headers=headers)
        return wangzhi.text


    def jiexi(html):
        # Parse the HTML and print the src attribute of the matched images.
        doc = pq(html)
        item = doc('.m3 li')
        # This prints only one src, even though item contains many images.
        print(item('img').attr('src'))


    def main():
        html = dizhi()
        jiexi(html)


    if __name__ == '__main__':
        main()


    Printed result:

    //ww4.sinaimg.cn/thumb150/475ee913ly1fydb7js7inj20orcmvx6q.jpg

    4 replies    2018-12-22 20:59:31 +08:00
    #1 · 15874103329 (OP) · Dec 22, 2018
    Printing item shows many images, but getting the attribute value prints only one. How can I print all the images in item?
    #2 · ClutchBear · Dec 22, 2018
    To get all the images in item, you have to iterate over it.
    #3 · 15874103329 (OP) · Dec 22, 2018
    @ClutchBear Oh, I see. Thanks a lot!
    #4 · dreambig183 · Dec 22, 2018 via Android
    I recommend scrapy's selector, or just using the scrapy framework directly. It's really convenient!