How to keep a Python crawler from getting its IP banned (Python crawler in practice: a one-click check of whether your ex has blocked you on Qzone)

I recently noticed that someone had blocked me on their QQ空间 (Qzone). I have no idea how I offended them, so in a fit of anger I wrote a little crawler to find out exactly who blocked me. Into the little notebook it goes!!!

Preparation

Python environment: Python 3.7.4. Third-party libraries: requests, lxml, threadpool, selenium (the Selenium login step also needs a matching ChromeDriver on the PATH).

Use Selenium to simulate a login, grab the cookies, and save them locally

import os
import json
import time
import getpass

def search_cookie():
    # Check whether the script has run before; if not, log in first and cache the cookies.
    if not os.path.exists('cookie_dict.txt'):
        get_cookie_json()
    with open('cookie_dict.txt', 'r') as f:
        cookie = json.load(f)
    return cookie

def get_cookie_json():
    # Headless Selenium login (Selenium 3 API; find_element_by_* was removed in Selenium 4).
    qq_number = input('Enter your QQ number: ')
    password = getpass.getpass('Enter your QQ password: ')
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    login_url = 'https://i.qq.com/'
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    driver = webdriver.Chrome(options=chrome_options)
    driver.get(login_url)
    driver.switch_to_frame('login_frame')  # the login form sits inside an iframe
    driver.find_element_by_xpath('//*[@id="switcher_plogin"]').click()  # switch to account/password login
    time.sleep(1)
    driver.find_element_by_xpath('//*[@id="u"]').send_keys(qq_number)
    driver.find_element_by_xpath('//*[@id="p"]').send_keys(password)
    time.sleep(1)
    driver.find_element_by_xpath('//*[@id="login_button"]').click()
    time.sleep(1)
    cookie_list = driver.get_cookies()
    cookie_dict = {}
    for cookie in cookie_list:
        if 'name' in cookie and 'value' in cookie:
            cookie_dict[cookie['name']] = cookie['value']
    with open('cookie_dict.txt', 'w') as f:
        json.dump(cookie_dict, f)
    return True
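The rest of the script leans on two specific cookie fields: p_skey (used to derive the g_tk token) and ptui_loginuin (your own QQ number), both read further below. A minimal check of the saved file, assuming the login above succeeded:

import json

with open('cookie_dict.txt', 'r') as f:
    cookie = json.load(f)

# These are the two fields the later functions rely on.
for key in ('p_skey', 'ptui_loginuin'):
    print(key, 'found' if key in cookie else 'MISSING - log in again')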

Find the API that returns the friends list

Go to my Qzone page, press F12 to open the developer tools, clear the Network panel, and then click into the friends page.


First, a blind guess: the friends-list request probably contains the string "friend". Searching for it in the Network panel turns up a handful of requests, and going through them one by one reveals the one carrying the friend data. Save the URL of that request for later use.
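For reference, the endpoint captured at this step is the same one the code further down requests; keeping it around as a constant might look like this (FRIEND_API is my own name for it):

# Endpoint spotted in the Network panel; it reappears as yurl in get_friends_uin() below.
FRIEND_API = ('https://user.qzone.qq.com/proxy/domain/r.qzone.qq.com'
              '/cgi-bin/tfriend/friend_ship_manager.cgi?')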


Crack the encrypted parameter in the request data


Seeing that g_tk is the only encrypted parameter in the request is exciting: just one piece of encryption! Search for g_tk in the Sources panel to find out how it is computed. It turns out to be a function, and clicking into it shows a very simple little hash that is easy to re-implement in Python.


The Python code is as follows:

def get_g_tk():
    # Qzone's g_tk token: a "times 33" hash of the p_skey cookie, masked to 31 bits.
    p_skey = cookie['p_skey']
    h = 5381
    for i in p_skey:
        h += (h << 5) + ord(i)
    g_tk = h & 2147483647
    return g_tk
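As a quick sanity check on the re-implementation above, here is the same hash as a standalone function run on a made-up p_skey (both the function name and the sample value are mine, for illustration only):

def qzone_hash(p_skey):
    # h starts at 5381; each character folds in as h += (h << 5) + ord(c),
    # i.e. the classic "h = h * 33 + character code" hash, masked to 31 bits at the end.
    h = 5381
    for c in p_skey:
        h += (h << 5) + ord(c)
    return h & 0x7fffffff

print(qzone_hash('abc'))  # 193485963 for this made-up p_skey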

Fetch the friends list from the Qzone friends page

With the encrypted parameter in hand, all that is left is to go back to the friends page mentioned above and scrape every friend's QQ number. urllib.parse.urlencode(data) turns the parameter dict into the familiar key=value pairs joined by & that trail a URL; append that query string to the base URL, then send the request with the cookies attached to get the JSON data back.
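As a quick illustration of what urlencode produces here (the values below are made up):

import urllib.parse

data = {'uin': '123456789', 'do': 1, 'g_tk': 1234567890}  # hypothetical values
print(urllib.parse.urlencode(data))  # uin=123456789&do=1&g_tk=1234567890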

def get_friends_uin(g_tk):
    # Fetch every friend's QQ number from the friend-management endpoint.
    yurl = 'https://user.qzone.qq.com/proxy/domain/r.qzone.qq.com/cgi-bin/tfriend/friend_ship_manager.cgi?'
    data = {
        'uin': cookie['ptui_loginuin'],
        'do': 1,
        'g_tk': g_tk
    }
    url = yurl + urllib.parse.urlencode(data)
    res = requests.get(url, headers=headers, cookies=cookie)
    r = res.text.split('(')[1].split(')')[0]  # strip the JSONP callback wrapper
    friends_list = json.loads(r)['data']['items_list']
    friends_uin = []
    for f in friends_list:
        friends_uin.append(f['uin'])
    return friends_uin
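The split('(') / split(')') line works because the endpoint answers with JSONP: the JSON payload comes wrapped in a callback call, roughly _Callback({...}) (the exact callback name here is an assumption). A slightly more robust way to unwrap it, as a sketch:

import re
import json

def strip_jsonp(text):
    # Take everything between the first '(' and the last ')' and parse it as JSON.
    match = re.search(r'\((.*)\)', text, re.S)
    return json.loads(match.group(1)) if match else None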

Find the "ruthless ones" who blocked me

With the friends' QQ numbers in hand, we can visit each friend's Qzone directly. If a friend has set theirs to deny access, that absolutely goes into the little notebook!


def get_blacklist(friends):
    # Check each friend's Qzone and note down the ones that deny access.
    access_denied = []  # the "little notebook" of friends who blocked me
    yurl = 'https://user.qzone.qq.com/'
    for friend in friends:
        print("Checking: " + str(friend))
        url = yurl + str(friend)
        res = requests.get(url, headers=headers, cookies=cookie)
        # A blocked space shows a warning paragraph ("主人设置了权限...", i.e. the owner has restricted access).
        tip = etree.HTML(res.text).xpath('/html/body/div/div/div[1]/p/text()')
        if len(tip) > 0:
            # if tip[0][:7] == "主人设置了权限":
            print(str(friend) + " has blocked me!")
            access_denied.append(friend)
    return access_denied
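The dependency list at the top mentions threadpool, but the loop above checks friends one at a time. A minimal sketch of how the same check could run in parallel with that library (the function name check_one and the pool size are my own choices; headers, cookie and friends_uin are the same objects the code in this article builds):

import threadpool
import requests
from lxml import etree

access_denied = []  # shared result list; list.append is atomic under the GIL

def check_one(friend):
    # Same check as get_blacklist(): a visible warning paragraph means access is denied.
    res = requests.get('https://user.qzone.qq.com/' + str(friend),
                       headers=headers, cookies=cookie)
    tip = etree.HTML(res.text).xpath('/html/body/div/div/div[1]/p/text()')
    if tip:
        access_denied.append(friend)

pool = threadpool.ThreadPool(10)  # 10 worker threads
for req in threadpool.makeRequests(check_one, friends_uin):
    pool.putRequest(req)
pool.wait()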

Now blacklist this bunch who value romance over friendship! Go into your own Qzone, to the place your heart tells you to go, and blacklist them!


There is only one POST request, so that must be the one.


A glance at the required parameters: your own QQ number, the QQ number to blacklist, your own Qzone URL, plus one unused parameter and the encrypted parameter obtained earlier.


The more I think about it, the angrier I get. Time to write the code!

def pull_black():
    # Blacklist them. Absolutely blacklist them!
    global cookie
    cookie = search_cookie()
    with open('access_denied.txt', 'r') as f:
        access_denied = f.readlines()
    for fake_friend in access_denied:
        fake_friend = fake_friend.split('\n')[0]
        yurl = "https://user.qzone.qq.com/proxy/domain/w.qzone.qq.com/cgi-bin/right/cgi_black_action_new?"
        g_tk = get_g_tk()
        url_data = {'g_tk': g_tk}
        data = {
            'uin': cookie['ptui_loginuin'],  # your own QQ number
            'action': '1',
            'act_uin': fake_friend,  # the QQ number being blacklisted
            'fupdate': '1',
            'qzreferrer': 'https://user.qzone.qq.com/1223411083'  # your own Qzone URL; replace the number with your own
        }
        url = yurl + urllib.parse.urlencode(url_data)
        res = requests.post(url, headers=headers, data=data, cookies=cookie)
        print(str(fake_friend) + " has been blacklisted")
    print("All blacklisted. What a relief!!")

Full code

import os
import time
import json
import re
import getpass
import urllib.parse

import requests
from lxml import etree
import threadpool  # listed as a dependency but not used in this script

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}

def search_cookie():
    # Check whether the script has run before; if not, log in first and cache the cookies.
    if not os.path.exists('cookie_dict.txt'):
        get_cookie_json()
    with open('cookie_dict.txt', 'r') as f:
        cookie = json.load(f)
    return cookie

def get_cookie_json():
    # Headless Selenium login (Selenium 3 API; find_element_by_* was removed in Selenium 4).
    qq_number = input('Enter your QQ number: ')
    password = getpass.getpass('Enter your QQ password: ')
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    login_url = 'https://i.qq.com/'
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    driver = webdriver.Chrome(options=chrome_options)
    driver.get(login_url)
    driver.switch_to_frame('login_frame')  # the login form sits inside an iframe
    driver.find_element_by_xpath('//*[@id="switcher_plogin"]').click()  # switch to account/password login
    time.sleep(1)
    driver.find_element_by_xpath('//*[@id="u"]').send_keys(qq_number)
    driver.find_element_by_xpath('//*[@id="p"]').send_keys(password)
    time.sleep(1)
    driver.find_element_by_xpath('//*[@id="login_button"]').click()
    time.sleep(1)
    cookie_list = driver.get_cookies()
    cookie_dict = {}
    for cookie in cookie_list:
        if 'name' in cookie and 'value' in cookie:
            cookie_dict[cookie['name']] = cookie['value']
    with open('cookie_dict.txt', 'w') as f:
        json.dump(cookie_dict, f)
    return True

def get_g_tk():
    # Qzone's g_tk token: a "times 33" hash of the p_skey cookie, masked to 31 bits.
    p_skey = cookie['p_skey']
    h = 5381
    for i in p_skey:
        h += (h << 5) + ord(i)
    g_tk = h & 2147483647
    return g_tk

def get_friends_uin(g_tk):
    # Fetch every friend's QQ number from the friend-management endpoint.
    yurl = 'https://user.qzone.qq.com/proxy/domain/r.qzone.qq.com/cgi-bin/tfriend/friend_ship_manager.cgi?'
    data = {
        'uin': cookie['ptui_loginuin'],
        'do': 1,
        'g_tk': g_tk
    }
    url = yurl + urllib.parse.urlencode(data)
    res = requests.get(url, headers=headers, cookies=cookie)
    r = res.text.split('(')[1].split(')')[0]  # strip the JSONP callback wrapper
    friends_list = json.loads(r)['data']['items_list']
    friends_uin = []
    for f in friends_list:
        friends_uin.append(f['uin'])
    return friends_uin

def get_blacklist(friends):
    # Check each friend's Qzone and note down the ones that deny access.
    access_denied = []  # the "little notebook" of friends who blocked me
    yurl = 'https://user.qzone.qq.com/'
    for friend in friends:
        print("Checking: " + str(friend))
        url = yurl + str(friend)
        res = requests.get(url, headers=headers, cookies=cookie)
        tip = etree.HTML(res.text).xpath('/html/body/div/div/div[1]/p/text()')
        if len(tip) > 0:
            # if tip[0][:7] == "主人设置了权限":
            print(str(friend) + " has blocked me!")
            access_denied.append(friend)
    return access_denied

def pull_black():
    # Blacklist them. Absolutely blacklist them!
    global cookie
    cookie = search_cookie()
    with open('access_denied.txt', 'r') as f:
        access_denied = f.readlines()
    for fake_friend in access_denied:
        fake_friend = fake_friend.split('\n')[0]
        yurl = "https://user.qzone.qq.com/proxy/domain/w.qzone.qq.com/cgi-bin/right/cgi_black_action_new?"
        g_tk = get_g_tk()
        url_data = {'g_tk': g_tk}
        data = {
            'uin': cookie['ptui_loginuin'],  # your own QQ number
            'action': '1',
            'act_uin': fake_friend,  # the QQ number being blacklisted
            'fupdate': '1',
            'qzreferrer': 'https://user.qzone.qq.com/1223411083'  # your own Qzone URL; replace the number with your own
        }
        url = yurl + urllib.parse.urlencode(url_data)
        res = requests.post(url, headers=headers, data=data, cookies=cookie)
        print(str(fake_friend) + " has been blacklisted")
    print("All blacklisted. What a relief!!")

def recording():
    # Main workflow: log in, fetch the friends list, and record who denies access.
    global cookie
    cookie = search_cookie()
    g_tk = get_g_tk()
    friends_uin = get_friends_uin(g_tk)
    access_denied = get_blacklist(friends_uin)
    print(f"A total of {len(access_denied)} people have blocked you!")
    with open('access_denied.txt', 'w') as f:
        for a in access_denied:
            f.write(str(a) + '\n')

if __name__ == '__main__':
    # Run it
    recording()
    pull_black()

Want to learn more crawler tricks??? Want learning materials that take you from zero Python basics all the way to Python crawlers???

Follow and share this post, then send the editor a private message saying "爬虫" (crawler).
