Preface:
Bored today, I saw a bunch of goofy meme images and figured I'd crawl them all down.
Without further ado, here's the code:
import os
import threading
import urllib.request

import requests
from bs4 import BeautifulSoup

BASE_PAGE_URL = ''  # base listing URL (left blank here)
# List of all listing-page URLs
PAGE_URL_LIST = []
# List of all image URLs
FACE_URL_LIST = []
# Global lock protecting both shared lists
glock = threading.Lock()

for x in range(1, 101):
    url = BASE_PAGE_URL + str(x)
    PAGE_URL_LIST.append(url)

def producer():
    while True:
        glock.acquire()
        if len(PAGE_URL_LIST) == 0:
            glock.release()
            break
        else:
            page_url = PAGE_URL_LIST.pop()
            glock.release()
        response = requests.get(page_url)
        soup = BeautifulSoup(response.content, 'lxml')
        img_list = soup.find_all('img', attrs={'class': 'img-responsive lazy image_dta'})
        glock.acquire()
        for img in img_list:
            url = img['data-original']
            if not url.startswith('http'):
                url = 'http:' + url
            FACE_URL_LIST.append(url)
        glock.release()

def consumer():
    while True:
        glock.acquire()
        if len(FACE_URL_LIST) == 0:
            glock.release()
            continue  # busy-wait until a producer adds more URLs
        else:
            face_url = FACE_URL_LIST.pop()
            glock.release()
        # Keep the full filename (e.g. 'xxx.jpg') instead of chopping off the extension
        filename = face_url.split('/').pop()
        path = os.path.join('biaoqing', filename)
        urllib.request.urlretrieve(face_url, filename=path)

def main():
    os.makedirs('biaoqing', exist_ok=True)  # the target folder must exist
    # Create three producer threads for crawling
    for x in range(3):
        th = threading.Thread(target=producer)
        th.start()
    # Create five consumer threads for downloading
    for x in range(5):
        th = threading.Thread(target=consumer)
        th.start()

if __name__ == "__main__":
    main()
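The hand-rolled lock-and-list pattern above (acquire, check, pop, release) is exactly what Python's built-in queue.Queue provides, with the added benefit that consumers can block instead of busy-waiting and can be shut down cleanly with a sentinel. A minimal sketch of that alternative, with the download step replaced by a dummy transformation so it runs standalone:

```python
import queue
import threading

def run_pipeline(items, num_workers=4):
    """Producer-consumer via queue.Queue: the queue does all the locking,
    and one None sentinel per worker signals a clean shutdown."""
    q = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            if item is None:          # sentinel: no more work
                q.task_done()
                break
            with results_lock:        # stand-in for the real download step
                results.append(item.upper())
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:                # producer side
        q.put(item)
    for _ in threads:                 # one sentinel per worker
        q.put(None)
    for t in threads:
        t.join()
    return results
```

With this structure the consumers exit on their own once the work is done, whereas the consumer threads in the script above spin forever and keep the process alive after the last image is saved.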
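One more detail worth hardening: deriving the filename by splitting on '/' breaks as soon as a URL carries a query string, and the original's trick of slicing off the last four characters silently mangles extensions like '.jpeg'. A small helper using urllib.parse and os.path handles both cases (the sample URL below is made up for illustration):

```python
import os
from urllib.parse import urlsplit

def filename_from_url(url):
    """Extract a safe local filename from an image URL, keeping the
    extension and ignoring any query string or fragment."""
    path = urlsplit(url).path          # drops ?query and #fragment parts
    return os.path.basename(path)

name = filename_from_url('http://example.com/imgs/funny01.jpg?size=large')
# name == 'funny01.jpg'
```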
Here's the folder after the crawl finished.
I was in a good mood, so I crawled all 100 pages, haha ~ staring at five thousand-plus memes makes me pretty happy too.
If you want meme packs to go PK with your friends, follow the official account "幺猫折耳鹿" and reply "表情包" to get them.
Tags: #python表情包搞笑