0x01 Preface
My programming is pretty bad, so I'm reading up and writing things down as I go.
0x02 action
    from urllib.request import urlopen

    url = "http://www.baidu.com"
    r = urlopen(url)
    # Decode the response bytes, then write them out as UTF-8 text.
    with open("mybaidu.html", "w", encoding="utf-8") as f:
        f.write(r.read().decode("utf-8"))
    print("over!")
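Opening a web page turns out to be nothing special. One thing to note: wb is binary write mode, and since the code writes decoded text, it should use w mode and specify the encoding.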
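To make the difference between the two modes concrete, here is a small offline sketch. The html_bytes value is only a stand-in for what r.read() would return, and the file names are made up for the demo.

```python
import os
import tempfile

# Stand-in for the bytes returned by r.read() (assumption for this demo).
html_bytes = "<title>百度一下</title>".encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "text.html")
    raw_path = os.path.join(tmp, "raw.html")

    # Text mode: decode first, then open() encodes again when writing.
    with open(text_path, "w", encoding="utf-8") as f:
        f.write(html_bytes.decode("utf-8"))

    # Binary mode: write the bytes untouched; no encoding argument is allowed.
    with open(raw_path, "wb") as f:
        f.write(html_bytes)

    # Mixing them up fails: a str cannot be written to a binary file.
    try:
        with open(raw_path, "wb") as f:
            f.write("this is a str, not bytes")
        mismatch_error = False
    except TypeError:
        mismatch_error = True

    with open(text_path, "rb") as f:
        text_result = f.read()

same_bytes = text_result == html_bytes
print(same_bytes, mismatch_error)
```

Either mode works as long as the data type matches: str goes with w plus an encoding, bytes go with wb.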
    import requests

    name = input("Enter what you want to search for: ")
    url = "https://zh.wikipedia.org/zh-hant/{}".format(name)
    # Send a browser-like User-Agent along with the request.
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36",
    }
    r = requests.get(url=url, headers=headers)
    print(r.status_code)
    print(r.text)
    with open("test.html", "w", encoding="utf-8") as f:
        f.write(r.text)
Make good use of F12 (the browser's developer tools) to find the API endpoint.
    import requests

    url = "https://fanyi.baidu.com/sug"
    s = input("Enter the word you want to look up: ")
    data = {"kw": s}  # form field sent in the POST body
    r = requests.post(url=url, data=data)
    print(r.json())
Filtering: for someone like me who never knew how to use F12, this was a big deal. Until now I had always searched by hand.
    import requests

    url = "https://m.douban.com/j/to_app"
    # Query-string parameters; requests appends them to the URL.
    params = {
        "url": "https://m.douban.com/movie/",
        "source": "m_ad_nav",
        "copy_open": 1,
    }
    headers = {
        "referer": "https://m.douban.com/movie/",
        "sec-ch-ua": '"Chromium";v="130", "Google Chrome";v="130", "Not?A_Brand";v="99"',
        "sec-ch-ua-mobile": "?0",
        "sec-ch-ua-platform": "Windows",
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36",
    }
    r = requests.get(url=url, params=params, headers=headers)
    print(r.json())
    r.close()
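To see what the params dict turns into on the wire, here is a rough sketch that builds the query string by hand with the standard library. It only illustrates the encoding. The URL that requests actually sends should look much the same, though I have not compared them byte for byte.

```python
from urllib.parse import urlencode

url = "https://m.douban.com/j/to_app"
params = {
    "url": "https://m.douban.com/movie/",
    "source": "m_ad_nav",
    "copy_open": 1,
}

# urlencode percent-encodes reserved characters such as ":" and "/".
query = urlencode(params)
full_url = url + "?" + query
print(full_url)
```

This is handy for comparing against the request URL shown in the F12 Network panel.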
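Call r.close() when you are done with the response, so the connection is closed and you are less likely to get flagged.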
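It is easy to forget close(), especially when an exception is raised halfway through. Below is a minimal sketch of letting a with-block do it. FakeResponse and read_json are hypothetical names so the example runs offline. With requests, the response object can be used in a with-statement in the same way.

```python
from contextlib import closing


class FakeResponse:
    """Hypothetical stand-in for a response object; it only tracks close()."""

    def __init__(self, payload):
        self.payload = payload
        self.closed = False

    def json(self):
        return self.payload

    def close(self):
        self.closed = True


def read_json(response):
    # closing() calls response.close() on exit, even if json() raises.
    with closing(response):
        return response.json()


resp = FakeResponse({"ok": True})
result = read_json(resp)
print(result, resp.closed)
```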
The re module
You need to learn a little regex first before you can use this module fluently. Here is a small site I recommend.