Duplicated data with SelectorElementClick

Hi everyone,

I'm new to the subject and I'm trying to list all the clubs from this URL: http://www.ffbb.com/jouer/trouver-un-club?DepartementClub=01

Below is a WIP sitemap that I've run many times (with many different combinations of options) and spent hours on. Unfortunately, I still can't figure out a way to make it work.

{"_id":"ffbb-dev","startUrl":["http://www.ffbb.com/jouer/trouver-un-club?DepartementClub=01"],"selectors":[{"id":"clubs","type":"SelectorElement","selector":"ul.trouver-club li","parentSelectors":["page"],"multiple":true,"delay":0},{"id":"name","type":"SelectorText","selector":"div.salle-header span.salle-title","parentSelectors":["clubs"],"multiple":false,"regex":"","delay":0},{"id":"page","type":"SelectorElementClick","selector":"section div div.region div.content","parentSelectors":["_root"],"multiple":true,"delay":"5000","clickElementSelector":"li.pager-next a","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueHTMLText"},{"id":"city","type":"SelectorText","selector":"span.club-ville","parentSelectors":["clubs"],"multiple":false,"regex":"","delay":0}]}

It correctly opens the URL and navigates through the 5 pages. However, the result is not what I expect. Instead of all 37 clubs, I get 5 x the clubs on page 5 = 25 duplicated results.

Someone having a look would be a fantastic help. :innocent:

Wishing you all a bright Sunday. I will keep you posted if I find a solution before a good soul finds my bottle in the sea.

Oh I was about to forget.

The URL above is infested with ads and tracking stuff. I've found that if you don't use an adblocker, you should increase the scraping delays.

Hi!

You can save time by using a pagination range in the start URL, since the URL of the site changes whenever you press Next or a page number.

Try this start URL in your sitemap metadata: http://www.ffbb.com/jouer/trouver-un-club?page=[0-4]&DepartementClub=01

Once it is set, the scrape will go through the pages from 5 down to 1, so it never has to click any pager buttons.
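
To make it clearer what the [0-4] range stands for, here is a minimal Python sketch that expands it into the individual page URLs. The expand_range helper is hypothetical and only illustrates the idea; it is not Web Scraper's own code, and the order in which the extension actually visits the pages may differ.

```python
import re


def expand_range(url_template):
    """Expand the first [start-end] numeric range in a URL template.

    Hypothetical helper for illustration only; it mimics the idea of a
    range start URL, not the extension's real implementation.
    """
    match = re.search(r"\[(\d+)-(\d+)\]", url_template)
    if match is None:
        # No range in the template: nothing to expand.
        return [url_template]

    first, last = int(match.group(1)), int(match.group(2))
    prefix = url_template[:match.start()]
    suffix = url_template[match.end():]
    # The range is inclusive on both ends.
    return [f"{prefix}{n}{suffix}" for n in range(first, last + 1)]


urls = expand_range(
    "http://www.ffbb.com/jouer/trouver-un-club?page=[0-4]&DepartementClub=01"
)
for url in urls:
    print(url)
```

Each of the five URLs is a separate page load, so the club list is read from a fresh page every time instead of from a page that was modified by clicking.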

Hi Iconoclast and thank you for your answer.

I forgot to mention that I also need to iterate over &DepartementClub.
I've tried ?DepartementClub=[01-99]&page=[0-5], but it doesn't iterate over both ranges.
Any ideas?

Unfortunately, you can use only one range per URL. However, you can create a sitemap with multiple start URLs that all use the same page range.

Do you have an example of such a sitemap?

You can add as many URLs as you like by pressing the [ + ] button to the right of the Start URL field, which gives you a sitemap with multiple start URLs.

To access your sitemap, go to Sitemap (sitemap name) -> Edit Metadata
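
If adding dozens of start URLs by hand is too tedious, a sitemap with one start URL per department could also be generated and then imported. The Python sketch below is only a rough illustration under several assumptions: the "ffbb-multi" id and the build_sitemap helper are made up, department codes are assumed to be two-digit numbers, every department is assumed to have at most 5 pages, and the "clubs" selector is attached directly to _root because the click-based "page" selector is no longer needed.

```python
import json

BASE_URL = "http://www.ffbb.com/jouer/trouver-un-club"


def build_sitemap(departments, last_page):
    """Build a sitemap dict with one ranged start URL per department.

    Hypothetical helper: each start URL keeps a single [0-N] page range,
    and the department code is written out explicitly.
    """
    start_urls = [
        f"{BASE_URL}?page=[0-{last_page}]&DepartementClub={dept:02d}"
        for dept in departments
    ]
    return {
        "_id": "ffbb-multi",  # made-up sitemap name
        "startUrl": start_urls,
        "selectors": [
            {
                "id": "clubs",
                "type": "SelectorElement",
                "selector": "ul.trouver-club li",
                # Assumption: no click selector, so the parent is the root.
                "parentSelectors": ["_root"],
                "multiple": True,
                "delay": 0,
            },
            {
                "id": "name",
                "type": "SelectorText",
                "selector": "div.salle-header span.salle-title",
                "parentSelectors": ["clubs"],
                "multiple": False,
                "regex": "",
                "delay": 0,
            },
            {
                "id": "city",
                "type": "SelectorText",
                "selector": "span.club-ville",
                "parentSelectors": ["clubs"],
                "multiple": False,
                "regex": "",
                "delay": 0,
            },
        ],
    }


# Example: departments 01 to 03, pages 0 to 4 for each of them.
sitemap = build_sitemap(range(1, 4), last_page=4)
print(json.dumps(sitemap))
```

The printed JSON can be pasted into the sitemap import dialog. Departments with fewer pages than the chosen range may return empty or repeated pages, so the page limit is worth checking against the site first.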

P.S. I'll try to help you with pagination using the page numbers on the page, so you won't have to add multiple URLs.

P.P.S. Or you can save some time by following this tutorial, which uses a Link selector for pagination.