Python scripting practice

0x01 Preface

My programming is pretty weak, so I'm reading and writing some code to practice.

0x02 action

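from urllib.request import urlopen

url = "http://www.baidu.com"

# Fetch the page; r.read() returns raw bytes
r = urlopen(url)
# Decode the bytes to str and write them in text mode
with open("mybaidu.html", "w", encoding="utf-8") as f:
    f.write(r.read().decode("utf-8"))

print("over!")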

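Fetching a web page turns out to be nothing special. One thing to watch: "wb" is binary write mode, and since the response is decoded to a string here, the file should be opened in "w" mode with an explicit encoding.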
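To make the difference between the two modes concrete, here is a minimal offline sketch. The `raw` bytes are a made-up stand-in for what `r.read()` returns, and the temporary file names are only for illustration. It writes the same content once in text mode and once in binary mode, then compares the bytes on disk.

```python
import os
import tempfile

# Stand-in for r.read(): urlopen() hands back raw bytes, not str.
raw = "<title>百度</title>".encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "text.html")
    bin_path = os.path.join(tmp, "binary.html")

    # Text mode: decode the bytes first, then open() encodes them on write.
    with open(text_path, "w", encoding="utf-8") as f:
        f.write(raw.decode("utf-8"))

    # Binary mode: write the bytes untouched; no encoding argument here.
    with open(bin_path, "wb") as f:
        f.write(raw)

    # Read both files back as bytes to compare what landed on disk.
    with open(text_path, "rb") as f:
        text_bytes = f.read()
    with open(bin_path, "rb") as f:
        bin_bytes = f.read()

same_bytes = text_bytes == bin_bytes
print(same_bytes)
```

Either mode works as long as the data type matches it: str for "w", bytes for "wb". Mixing them (for example writing a str to a file opened with "wb") raises a TypeError.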

import requests

name = input("Enter what you want to search for: ")
url = "https://zh.wikipedia.org/zh-hant/{}".format(name)

# Send a browser-like User-Agent header with the request
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36",
}

r = requests.get(url=url, headers=headers)

print(r.status_code)
print(r.text)

# Save the page to disk
with open("test.html", "w", encoding="utf-8") as f:
    f.write(r.text)

Note how useful F12 (the browser's developer tools) is for finding the API endpoint a page calls.

import requests

url = "https://fanyi.baidu.com/sug"
s = input("Enter the word you want to look up: ")
# Form fields sent in the POST body
data = {
    "kw": s,
}
r = requests.post(url=url, data=data)

# print(r.text)
print(r.json())
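It can help to see what a form POST like this actually puts on the wire, since that is what you compare against the request shown in the F12 network view. The following is a rough sketch using only the standard library; the word "dog" is just an example value, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

url = "https://fanyi.baidu.com/sug"
form = {"kw": "dog"}  # "dog" is an example word, not from the original post

# A dict passed as form data is sent as a URL-encoded body.
body = urlencode(form).encode("utf-8")

# Building a Request object does not send anything; urlopen(req) would.
req = Request(url, data=body, method="POST")

print(req.get_method())
print(req.data)
```

If the body printed here matches the form data the browser sent, the script is reproducing the same request.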

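Filtering requests: for someone like me who didn't know how to use F12, this was a big deal. Until now I had been hunting for them by hand.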

import requests

url = "https://m.douban.com/j/to_app"

# Query-string parameters, copied from the request seen in F12
params = {
    "url": "https://m.douban.com/movie/",
    "source": "m_ad_nav",
    "copy_open": 1,
}
# Request headers, copied from the same request
headers = {
    "referer": "https://m.douban.com/movie/",
    "sec-ch-ua": '"Chromium";v="130", "Google Chrome";v="130", "Not?A_Brand";v="99"',
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "Windows",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36",
}
r = requests.get(url=url, params=params, headers=headers)
print(r.json())

# Release the connection when done
r.close()

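Close the response with r.close() when done, so the connection is not left open and the client is less likely to be flagged.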
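An easy way to never forget the close is a `with` block, which closes the response automatically. Below is a small sketch with the standard library's `urlopen`; the helper name `fetch_text` is made up for illustration, and a temporary local file served through a `file:` URL stands in for a real site so the example runs offline. As far as I know, a `requests` response can be used the same way (`with requests.get(...) as r:`).

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def fetch_text(url, encoding="utf-8"):
    # The with block calls close() on the response when it exits,
    # even if read() or decode() raises.
    with urlopen(url) as r:
        return r.read().decode(encoding)


# A local file: URL stands in for a real site so the sketch runs offline.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "page.html"
    page.write_text("<h1>hello</h1>", encoding="utf-8")
    text = fetch_text(page.as_uri())

print(text)
```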

The re module

You need to learn a bit of regex before you can use this module comfortably. Here is a handy little site I recommend:

https://regex101.com/
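As a first taste of the module, here is a minimal sketch that pulls fields out of a page with `re`. The HTML snippet, its movie titles, and its links are all invented for illustration; a real page would need its own pattern, which is exactly what a tester site is good for.

```python
import re

# Hypothetical HTML snippet standing in for a downloaded page.
html = """
<ul>
  <li><a href="/movie/1">First Movie</a> <span class="year">1994</span></li>
  <li><a href="/movie/2">Second Movie</a> <span class="year">2001</span></li>
</ul>
"""

# Named groups keep the results readable; .*? is non-greedy, so each
# group stops at the first delimiter that follows it.
pattern = re.compile(
    r'<a href="(?P<link>.*?)">(?P<title>.*?)</a>\s*'
    r'<span class="year">(?P<year>\d{4})</span>'
)

# finditer() yields one match object per list item.
movies = [m.groupdict() for m in pattern.finditer(html)]
for movie in movies:
    print(movie["title"], movie["year"], movie["link"])
```

Compiling the pattern once with `re.compile` is convenient when the same expression is applied to many pages.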