龙空技术网

Implementing Pagination on Qiushibaike with the Scrapy Framework

幼姿沫 64

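Foreword:

Automatic pagination in Scrapy comes up often, and many readers want to know how it works. This post gathers some material on the topic from around the web and walks through a working spider. I hope you find it useful, so let's go through it together.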

Site URL:

① Page analysis

② Writing the spider project

items.py

qsbk.py

import scrapy

from qsbk_project.items import QsbkProjectItem


class QsbkSpider(scrapy.Spider):
    name = 'qsbk'
    allowed_domains = ['qiushibaike.com']
    # The start URL and the site root were stripped from this copy of the
    # post; fill both in before running the spider.
    start_urls = ['']
    domain_url = ''

    def parse(self, response):
        # Each child div of the main column is one joke entry.
        cross_talks = response.xpath('//div[@class="col1 old-style-col1"]/div')
        for cross_talk in cross_talks:
            item = QsbkProjectItem()

            # The avatar src is protocol-relative, so prepend the scheme.
            image = cross_talk.xpath("./div[@class='author clearfix']/a/img/@src").get()
            if image is not None:
                item['image'] = 'https:' + image

            # Check for None before calling strip(); get() returns None
            # when the node is missing.
            name = cross_talk.xpath("./div[@class='author clearfix']/a[2]/h2/text()").get()
            item['name'] = name.strip() if name is not None else None

            # default='' keeps strip() from failing when the author block
            # has no age/gender div.
            age = cross_talk.xpath("./div[@class='author clearfix']/div/text()").get(default='').strip()
            item['age'] = age

            gender = cross_talk.xpath("./div[@class='author clearfix']/div/@class").get(default='').strip()
            gender = gender.replace('articleGender', '').replace('Icon', '')
            item['gender'] = gender

            content = cross_talk.xpath("./a[1]/div[@class='content']/span/text()").getall()
            content = ''.join(content).strip()
            item['content'] = content

            laugh = cross_talk.xpath("./div[@class='stats']/span[@class='stats-vote']/i[@class='number']/text()").get()
            item['laugh'] = laugh

            comments = cross_talk.xpath("./div[@class='stats']/span[@class='stats-comments']/a/i[@class='number']/text()").get()
            item['comments'] = comments

            yield item

        # Pagination: the last <li> of the pager holds the next-page link.
        next_url = response.xpath('//ul[@class="pagination"]/li[last()]/a/@href').get()
        if next_url is None:
            print("Qiushibaike crawl finished")
            return
        next_url = self.domain_url + next_url
        # Pass the bound method self.parse so Scrapy can call it with the response.
        yield scrapy.Request(next_url, callback=self.parse)
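The pagination step joins the site root with the relative href taken from the pager. As a minimal sketch of that join, assuming a hypothetical page URL `https://example.com/text/page/1/` and a hypothetical href `/text/page/2/` (the real values are not preserved in this post), the standard library's `urljoin` copes with both relative and absolute hrefs, which plain string concatenation does not:

```python
from typing import Optional
from urllib.parse import urljoin


def build_next_url(page_url: str, href: Optional[str]) -> Optional[str]:
    """Return the absolute URL of the next page, or None when the pager has no link."""
    if href is None:
        return None
    # urljoin resolves a relative href against the current page URL and
    # returns an already-absolute href unchanged.
    return urljoin(page_url, href.strip())


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(build_next_url("https://example.com/text/page/1/", "/text/page/2/"))
```

Inside a spider, `response.urljoin(href)` performs the same resolution against the URL of the current response, so a separate `domain_url` attribute is not strictly needed.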

pipelines.py
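The pipeline code is not shown here. Since the results further down are one JSON object per line with unescaped Chinese text, a pipeline along the following lines would produce that shape. This is only a sketch: the class name `QsbkProjectPipeline` and the output file name `qsbk.json` are assumptions, not taken from the post.

```python
import json


class QsbkProjectPipeline:
    """Write each item as one JSON object per line (hypothetical sketch)."""

    def __init__(self, path="qsbk.json"):
        # Output path is an assumed default; Scrapy builds the pipeline
        # with no arguments, so the default is what a crawl would use.
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called once when the spider starts.
        self.file = open(self.path, "w", encoding="utf-8")

    def process_item(self, item, spider):
        # ensure_ascii=False keeps Chinese text readable instead of \uXXXX escapes.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.file.close()
```

Returning the item from `process_item` lets any later pipeline in the chain receive it as well.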

settings.py
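The settings used for this project are not shown either. As a hedged sketch, the entries below are the ones such a project would typically touch; the pipeline path assumes a class named `QsbkProjectPipeline` in `qsbk_project/pipelines.py`, and the delay value is an arbitrary illustration.

```python
# Project identity, following the qsbk_project package name used by the spider import.
BOT_NAME = "qsbk_project"
SPIDER_MODULES = ["qsbk_project.spiders"]
NEWSPIDER_MODULE = "qsbk_project.spiders"

# Respect robots.txt (the project template default).
ROBOTSTXT_OBEY = True

# Wait one second between requests; the value is an assumption.
DOWNLOAD_DELAY = 1

# Enable the item pipeline; the class path is an assumed name.
ITEM_PIPELINES = {
    "qsbk_project.pipelines.QsbkProjectPipeline": 300,
}

# Keep non-ASCII text unescaped when exporting feeds.
FEED_EXPORT_ENCODING = "utf-8"
```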

Results: the command entered in the console was:

Result screenshot: one page yields 25 items; only nine jokes are copied here.

{"image": ";, "name": "才下眉头却上心头", "age": "31", "gender": " women", "content": "我吃西瓜不吃籽,一点都不吃,有一次吃西瓜我又一个个吐籽,三岁多的儿子吃着西瓜对我说:“妈妈,西瓜籽能吃,你看我就吃下去了!”我就故意逗他:“宝贝,西瓜籽不能吃,要不然西瓜籽就在你肚子里发芽到时候长出来一个大西瓜,从你嘴巴里出来,然后你就每天嘴里叼着一个大西瓜了!”信以为真的儿子立马认真地蹲在垃圾桶旁边挑西瓜籽,我妈怎么哄都没用,所有人吃完西瓜都出去了,他还在仔细的挑,气的我妈对我是又踹又骂的,我就纳闷了,我小时候你们也这样骗过我啊,我拿来骗我儿子代代相传没毛病啊", "laugh": "441", "comments": "11"}

{"image": ";, "name": "夏有凉风秋望月", "age": "20", "gender": " women", "content": "男友:媳妇,为啥新买的洗面奶搓不出 泡 泡 啊,是不是买的假货?我:你用的哪一个?男友:就是那瓶白色的,标签 破了的。我:你特么……那是 劳 资 的免 喜 面 膜!", "laugh": "1783", "comments": "52"}

{"image": ";, "name": "吃了两碗又盛", "age": "41", "gender": " man", "content": "请一天假去办事,可能是因为在外面待太久吧,我冻得der呵的,就感冒了。我现在躺在床上心里非常舒坦,你看我请了个事假,还赠送个病假,真值啊!", "laugh": "1062", "comments": "36"}

{"image": ";, "name": "逍遥ベ冰儿小迷妹", "age": "26", "gender": " women", "content": "闺蜜谈恋爱了,向我讨教恋爱经验。 我资深一笑:必须要抓住男人的心,让他觉得离不开你,一秒不见,他就会发疯地找你。话还没说完,男友给我打来电话,我得意地向闺蜜挑了个眉,打开接听键:媳妇,我刚睡醒,你在哪呢?你又把门反锁了,我出不去啊,都快死了。。。", "laugh": "1454", "comments": "25"}

{"image": ";, "name": "骑着二哈啃黄瓜", "age": "28", "gender": " women", "content": "父亲是老学究,客厅闲聊,意气风发地讲到儒家,孔老先生弟子三千,传道授业解惑,攻在千秋。。。侄儿正玩着玩具呢,看着爷爷讲的唾液横飞,连忙捧场:如家是不错,每天生意兴隆,主要是惠民价格,到处都有它的分店。。。", "laugh": "1547", "comments": "60"}

{"image": ";, "name": "骑着二哈啃黄瓜", "age": "28", "gender": " women", "content": "闺蜜和妈妈视频,家长里短的可嗨了,两个女人聊半个小时聊够了,闺蜜才说:妈,你让爸过来聊两句嘛?闺蜜妈妈扭过头,叫了声:老头子,闺女要和你聊两句。闺蜜爸爸那边低沉的声音传过来:在游戏呢,别打扰我,你们聊够了才想起我,没空,哼╯^╰。。。闺蜜乐了,对着屏幕甩了一句:妈,爸这号沉迷游戏算是废了,我支持你重新练一个。。。", "laugh": "2053", "comments": "84"}

{"image": ";, "name": "忧郁的创可贴", "age": "93", "gender": " women", "content": "周末晚上小侄女来了,肆无忌惮地玩起了平时玩不到的平板电脑,过了很久,我妈忍无可忍说道:你该睡觉了,都九点了。小侄女着急地反驳到:有那么快么,这个钟?我都调慢两个小时了。。。", "laugh": "1294", "comments": "75"}

{"image": ";, "name": "狐小乔", "age": "20", "gender": " women", "content": "房东阿姨收房租,她说:屋里怎么这么冷?你晚上睡得着吗?我心里一喜:莫非房东阿姨要安装暖气?我高兴地说:睡不着也得忍着,谁让咱交不起暖气费呢。阿姨给我们免费通暖气吗?房东阿姨笑呵呵地走了,丢下一句:年轻人努力奋斗,将来买一套带暖气的房子…", "laugh": "1032", "comments": "57"}

{"image": ";, "name": "狐小乔", "age": "20", "gender": " women", "content": "我给闺蜜介绍男朋友,她问我长相咋样,我说:白白净净的,就是有点胖。闺蜜:胖子?你以为老娘嫁不出去了?弄个猪八戒糊弄我?你是不是见不得我好!我辩解:那小哥也不是特别胖,他个子高,坐他家卡宴里有点挤,正准备换一辆悍马……话没说完,闺蜜跳起来大叫:乔啊,哪天见面?我说:我不想用猪八戒糊弄你。闺蜜:别瞎说,我正想着生个佩奇呢……", "laugh": "1405", "comments": "62"}
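The gender values in the output carry a leading space (" women", " man") because the spider removes the `articleGender` and `Icon` fragments from the class attribute but leaves the space that separated them. A small sketch of a tidier cleanup, assuming the attribute looks like `articleGender womenIcon` (inferred from the output, not shown in the post) and using a hypothetical helper name:

```python
def clean_gender(css_class: str) -> str:
    """Reduce a class attribute such as 'articleGender womenIcon' to 'women'."""
    # Same two replacements as the spider, followed by strip() to drop
    # the separating space left behind.
    return css_class.replace("articleGender", "").replace("Icon", "").strip()


if __name__ == "__main__":
    print(clean_gender("articleGender womenIcon"))
```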
