Preface:

This post walks through scraping and downloading images from a photo-gallery site with requests and BeautifulSoup. Website:
Open the page in a browser, right-click and choose Inspect Element, then switch to the tab shown in the figure below.
Right-click and refresh the page, and the panel shown below appears.
Select the first entry; some request information appears on the right. It will be useful later, so copy it down now, for example the following headers:
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate',
'Accept-Language': 'zh-CN,zh;q=0.9',
'Cache-Control': 'max-age=0',
'Connection': 'keep-alive',
'Cookie': '__guid=93692315.2789414934844540000.1575542378114.0908; Hm_lvt_19d597dda6fff717a20d276fa907bd17=1575376217,1575542380; monitor_count=27; Hm_lpvt_19d597dda6fff717a20d276fa907bd17=1575543370',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
The strings on either side of each colon had no single quotes originally; the quotes were added by hand so the values can be pasted straight into a Python dict.
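As a quick check that the copied headers work, you can replay one request with them. A minimal sketch, where the URL is a placeholder for the gallery's address:

    import requests

    headers = {
        # two entries shown; in practice paste in the full set copied above
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
        'Accept-Language': 'zh-CN,zh;q=0.9',
    }

    url = 'http://example.com/'   # placeholder; substitute the gallery page
    resp = requests.get(url, headers=headers)
    resp.raise_for_status()                    # error out on non-2xx responses
    resp.encoding = resp.apparent_encoding     # guess the charset from the body
    print(resp.status_code, len(resp.text))   # expect 200 and a nonzero size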
Next, view the page source, and you will find:
The address for each image is shown in the red box in the figure above. All we have to do is extract these addresses from the page, then request them to download the images.
The code that extracts the addresses:
import requests
from bs4 import BeautifulSoup

list = []   # holds the extracted addresses (note: shadows the built-in list)

def zhuye(aurl):
    # Fetch one list page using the headers copied above.
    data = requests.get(aurl, headers=headers)
    data.raise_for_status()
    data.encoding = data.apparent_encoding
    # Parse it and select every <a> inside the TypeList items.
    soup = BeautifulSoup(data.text, 'lxml')
    imgs = soup.select('body > div.wrap > div.TypeList > ul > li > a')
    for img in imgs:
        photo = img.get('href')   # the address we want
        list.append(photo)
        print(photo)
First set up a list to store the addresses. Then define a function whose parameter is the start URL passed in from the main() function. It calls requests.get to fetch the page source, parses it with BeautifulSoup, and selects the matching links. To get the selector, right-click the blue address in the figure, choose Copy > Copy selector, which yields something like 'body > div.wrap > div.TypeList > ul > li:nth-child(2) > a'; change the second-to-last part so it reads 'body > div.wrap > div.TypeList > ul > li > a', matching every list item instead of only the second one, as in the code above. Finally, the tag's get('href') method pulls out each address, which is appended to the list.
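To see the effect of that edit, here is a minimal sketch on made-up HTML (the markup below is an assumption modeled on the site's list page, not copied from it):

    from bs4 import BeautifulSoup

    # Hypothetical markup shaped like the site's list page.
    html = '''
    <body><div class="wrap"><div class="TypeList"><ul>
      <li><a href="/photo/1.htm">pic 1</a></li>
      <li><a href="/photo/2.htm">pic 2</a></li>
      <li><a href="/photo/3.htm">pic 3</a></li>
    </ul></div></div></body>
    '''
    soup = BeautifulSoup(html, 'lxml')

    # Copied selector: matches only the second <li>.
    print(soup.select('body > div.wrap > div.TypeList > ul > li:nth-child(2) > a'))

    # Generalized selector: matches every <li>, which is what we want.
    for a in soup.select('body > div.wrap > div.TypeList > ul > li > a'):
        print(a.get('href'))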
After that it is the same pattern again: parse each extracted address, then download and save the images, so I won't walk through it line by line.
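For reference, the download step on its own looks like this. A minimal sketch, where xurl is a placeholder image address and the headers dict is shortened (in practice, use the full dict copied above); the slice split('?')[0][-10:] drops any query string and keeps the last ten characters of the path as a crude filename:

    import requests

    headers = {'User-Agent': 'Mozilla/5.0'}  # shortened; use the full dict above

    # xurl: one image address extracted from a detail page (placeholder value).
    xurl = 'http://example.com/uploads/12345.jpg?x=1'

    resp = requests.get(xurl, headers=headers)
    resp.raise_for_status()

    # Drop the query string, then keep the last 10 characters as the filename.
    fname = xurl.split('?')[0][-10:]   # -> '/12345.jpg'
    with open('D:/pythonapk/ceshi' + fname, 'wb') as fp:
        fp.write(resp.content)         # resp.content holds the raw image bytes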
The complete source code:
from bs4 import BeautifulSoup
import requests

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Cookie': '__guid=93692315.2789414934844540000.1575542378114.0908; Hm_lvt_19d597dda6fff717a20d276fa907bd17=1575376217,1575542380; monitor_count=27; Hm_lpvt_19d597dda6fff717a20d276fa907bd17=1575543370',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
}

list = []   # detail-page addresses collected from the list pages

def zhuye(aurl):
    # Fetch one list page and collect every photo link on it.
    data = requests.get(aurl, headers=headers)
    data.raise_for_status()
    data.encoding = data.apparent_encoding
    soup = BeautifulSoup(data.text, 'lxml')
    imgs = soup.select('body > div.wrap > div.TypeList > ul > li > a')
    for img in imgs:
        photo = img.get('href')
        list.append(photo)
        print(photo)

# An earlier download attempt, left commented out:
# def xiazai():
#     path = 'D:\pythonapk\ceshi'
#     for itme in list:
#         dataimg = requests.get(itme, headers=headers)
#         fp = open(path + itme.split('?')[0][-10:], 'wb')
#         fp.write(dataimg.content)
#         fp.close()
#     for itm in range(40):
#         if itm == 1:
#             dataimg = requests.get(itme, headers=headers)
#         else:
#             itme = itme + '_itm'
#             dataimg = requests.get(itme, headers=headers)

def main():
    # Walk the first 10 list pages (the site's base URL, omitted in this
    # post, precedes these relative paths).
    for i in range(10):
        url_path = '{0}.htm'.format(i + 1)
        # print(url_path)
        zhuye(url_path)
    path = 'D://pythonapk/ceshi/1'
    for itme in list:
        # Fetch each detail page and pull out the image address.
        dataimg = requests.get(itme, headers=headers)
        dataimg.raise_for_status()
        dataimg.encoding = dataimg.apparent_encoding
        soup = BeautifulSoup(dataimg.text, 'lxml')
        xzimg = soup.select('#ArticleId0 > p > a > img')
        for url in xzimg:
            xurl = url.get('src')
            print(xurl)
            # headers must be passed as a keyword argument; passed
            # positionally, requests.get would treat it as params.
            shuju = requests.get(xurl, headers=headers)
            # print(dataimg.text)
            fp = open(path + xurl.split('?')[0][-10:], 'wb')
            fp.write(shuju.content)
            fp.close()

main()
Of course, this only downloads the first image at each address; for lack of time I did not download the rest.
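The commented-out range(40) loop in the source hints at how those remaining images could be fetched: request suffixed page URLs one by one. A sketch of that idea, with the caveat that the '_2.htm'-style naming below is a guess inferred from that dead code, not something the post confirms:

    # Sketch only: assumes each photo set paginates as 1234.htm, 1234_2.htm,
    # 1234_3.htm, ... (guessed from the commented-out range(40) loop; verify
    # against the real site before relying on it).
    def page_urls(first_url, pages=40):
        base = first_url[:-len('.htm')]          # strip the '.htm' suffix
        yield first_url                          # page 1 keeps the plain name
        for n in range(2, pages + 1):
            yield '{0}_{1}.htm'.format(base, n)  # assumed pattern for page n

    for u in page_urls('http://example.com/photo/1234.htm', pages=3):
        print(u)
    # -> .../1234.htm  .../1234_2.htm  .../1234_3.htm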
Tags: #python美女