
Document middleware design and behavior #4

Open
redapple opened this issue Jun 28, 2016 · 2 comments
Comments

@redapple
Contributor

See scrapinghub/scrapylib#45 (comment) for motivation.

It can be counter-intuitive for newcomers that the middleware lets the spider revisit pages that did not produce any items.
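The behavior being discussed can be sketched as follows. This is a simplified stand-in for the middleware, not its actual implementation (a plain dict replaces its on-disk key/value store, and URLs replace request fingerprints): a page's key is recorded only when its response yields items, so item-less pages are fetched again on the next run.

```python
# Simplified model of the deltafetch skip behavior (NOT the real middleware):
# a page is remembered only if it produced items, so pages that yielded
# nothing are revisited on subsequent runs.

def crawl(urls, extract, seen_db):
    """Fetch each URL unless it is already recorded in seen_db.

    extract(url) -> list of items scraped from that page (hypothetical helper).
    seen_db is a dict standing in for deltafetch's database.
    Returns the list of URLs actually fetched this run.
    """
    fetched = []
    for url in urls:
        if url in seen_db:          # produced items on an earlier run: skip
            continue
        fetched.append(url)
        items = extract(url)
        if items:                   # record only item-producing pages
            seen_db[url] = True
    return fetched

# First run: page "B" yields no items, so it is not recorded.
db = {}
first = crawl(["A", "B"], lambda u: ["item"] if u == "A" else [], db)
# Second run: "A" is skipped, but "B" is revisited.
second = crawl(["A", "B"], lambda u: ["item"] if u == "A" else [], db)
```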

@kmike
Member

kmike commented Mar 3, 2017

FTR: I've recently created a middleware similar to deltafetch, but one that is more explicit: https://github.com/TeamHG-Memex/scrapy-crawl-once. It does a similar thing, but in a less automatic way: the user needs to set request.meta['crawl_once'] = True. I considered contributing to scrapy-deltafetch instead, but the implementations have almost nothing in common (sqlite vs bsddb, items vs meta keys, different options).
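The difference in behavior can be sketched as follows. The `crawl_once` meta key is the one named above; the surrounding code is a simplified stand-in for the middleware, not its real implementation (a plain dict replaces its sqlite database): a request is only ever skipped when the user has explicitly opted it in.

```python
# Simplified model of the scrapy-crawl-once behavior (NOT the real middleware):
# nothing is skipped automatically; a request is skipped on later runs only
# if the user set request.meta['crawl_once'] = True on it.

def crawl_once_run(requests, seen_db):
    """requests: list of (url, meta) pairs; seen_db is a dict standing in
    for the middleware's database. Returns URLs fetched this run."""
    fetched = []
    for url, meta in requests:
        opted_in = meta.get("crawl_once", False)
        if opted_in and url in seen_db:
            continue                # already crawled and opted in: skip
        fetched.append(url)
        if opted_in:
            seen_db[url] = True     # remember only opted-in requests
    return fetched

db = {}
reqs = [("A", {"crawl_once": True}), ("B", {})]
first = crawl_once_run(reqs, db)    # both fetched
second = crawl_once_run(reqs, db)   # "A" skipped; "B" always refetched
```

Unlike the deltafetch behavior, whether a page is revisited here depends only on the flag the user set, not on whether the page produced items.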

@arunsayone

arunsayone commented Oct 9, 2017

@redapple, hi, I am new here. I have a project in which I used deltafetch.
Is there a way to specify a main URL and some sub-URLs that the spider should visit again?
I am using my spider to scrape data periodically. Can you please help me?
