Scraping Order for Images

Apologies in advance if this has been asked already.

I want to scrape various URLs – I have created the sitemap and it works great. One thing causing me problems is that I am scraping several images from image galleries, and I would like the image URLs to be scraped in the order they are displayed: 1, 2, 3, etc.

From what I've read, the scraper tool doesn't seem to honour any order. My question: is there any way that I could scrape the very first image displayed and set it as a 'key' image? The order of the rest doesn't matter too much, but getting the first image really would help.

Hi there!

For ordered results, you have to use CouchDB as the storage backend.

I've written about it here:

Thanks for your help, iconoclast.

I have installed CouchDB and configured it as per your instructions, which seems to have worked.

However, I now have a problem – I restarted Chrome and my Sitemaps had disappeared. I reverted back to Local Storage, restarted Chrome and the sitemaps reappeared. I copied the contents of the sitemaps and saved them into a text document. Changed back to CouchDB for storage, imported them and they appear to have saved into CouchDB which I assume is normal.

However, when I now try to actually 'Scrape', the web scraper interface clears and just shows the word 'Loading'. Nothing else appears to happen. Previously, when I scraped a site I could see all of the activity on the site itself, now I just see "Loading" and it does nothing else.

I'm really hoping you can help again.

Does it show anything after scraping is complete? If the scrape uses an Element Click selector, no data will be shown while it is running; results only appear once the scraping job has finished.

I just tried running it again. I left it for a couple of hours and it still just shows "Loading". Previously (when I was storing the data locally), you could see the web pages opening and processing. Whilst there was no scraped data displaying until it finished and then I saved a CSV, you could still see it working on the different pages. This time it's just a "loading" that shows.

I've attached a screenshot.

As a test, I reverted back to local storage and ran the scraper again – it worked fine. I've since reverted back to CouchDB storage and nothing.

Thanks again, your help is greatly appreciated.

To make sure everything is working, I recommend creating a test sitemap that returns only a few results.

Please keep in mind that switching between DB types requires you to restart the browser each time you do it.

P.S. also, when installing CouchDB it's best to restart your PC/Mac as well to have the service running.
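
If it helps, here is a rough way to check whether the CouchDB service is actually up before starting a scrape. It is only a sketch: it assumes CouchDB is listening on its default local address (http://127.0.0.1:5984/), and the function names are made up for this example. A running CouchDB answers a plain GET on its root URL with a small JSON greeting, and that is all this looks for.

```python
import json
import urllib.error
import urllib.request

# Assumption: CouchDB listens on its default local address and port.
# Change this if your install uses a different host or port.
COUCHDB_URL = "http://127.0.0.1:5984/"


def looks_like_couchdb(body: str) -> bool:
    """Return True if `body` is the JSON greeting CouchDB serves at its root URL."""
    try:
        data = json.loads(body)
    except ValueError:
        # Not JSON at all, so something other than CouchDB answered.
        return False
    return isinstance(data, dict) and data.get("couchdb") == "Welcome"


def couchdb_is_running(url: str = COUCHDB_URL, timeout: float = 3.0) -> bool:
    """Fetch the root URL and report whether a CouchDB server answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, etc.: the service is probably not running.
        return False
    return looks_like_couchdb(body)


if __name__ == "__main__":
    print("CouchDB reachable:", couchdb_is_running())
```

If this prints False, the scraper has nothing to write to, so it is worth fixing that first (start the service or reboot) before looking at the sitemap itself.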