Scraping with Load More Button not working well with too many items

Hi, I'm working with this page: https://www.olx.com.ec/autos_c378?filter=condition_eq_2 . I need to download the information for all vehicles listed at this link. The sitemap works well with few vehicles: for example, if I filter to a province with few vehicles, the information downloads correctly. But when I try to download everything for Ecuador, the scraper stops after a short time and does not download anything. I don't know why this happens; maybe it's due to the number of vehicles. If anyone can help me, I would appreciate it.

{"_id":"olx","startUrl":["https://www.olx.com.ec/autos_c378?filter=condition_eq_2"],"selectors":[{"id":"loadclicker","type":"SelectorElementClick","parentSelectors":["root"],"selector":"li:nth-of-type(n+5) div.IKo3","multiple":true,"delay":"5000","clickElementSelector":".rui-23TLR span","clickType":"clickMore","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueText"},{"id":"link","type":"SelectorLink","parentSelectors":["_root"],"selector":"li.EIR5N:nth-of-type(n+5) a","multiple":true,"delay":0},{"id":"marca","type":"SelectorText","parentSelectors":["link"],"selector":"span[data-aut-id='value_make']","multiple":false,"regex":"","delay":0},{"id":"año","type":"SelectorText","parentSelectors":["link"],"selector":"span[data-aut-id='value_year']","multiple":false,"regex":"","delay":0},{"id":"condicion","type":"SelectorText","parentSelectors":["link"],"selector":"span[data-aut-id='value_condition']","multiple":false,"regex":"","delay":0},{"id":"color","type":"SelectorText","parentSelectors":["link"],"selector":"span[data-aut-id='value_color']","multiple":false,"regex":"","delay":0},{"id":"vendedor","type":"SelectorText","parentSelectors":["link"],"selector":"span[data-aut-id='value_sellertype']","multiple":false,"regex":"","delay":0},{"id":"modelo","type":"SelectorText","parentSelectors":["link"],"selector":"span[data-aut-id='value_model']","multiple":false,"regex":"","delay":0},{"id":"kilometraje","type":"SelectorText","parentSelectors":["link"],"selector":"span[data-aut-id='value_mileage']","multiple":false,"regex":"","delay":0},{"id":"combustible","type":"SelectorText","parentSelectors":["link"],"selector":"span[data-aut-id='value_fuel']","multiple":false,"regex":"","delay":0},{"id":"transmision","type":"SelectorText","parentSelectors":["link"],"selector":"span[data-aut-id='value_vehicletransmission']","multiple":false,"regex":"","delay":0},{"id":"precio","type":"SelectorText","parentSelectors":["link"],"selector":"span._2xKfz","multiple":false,"regex":"","delay":0},{"id":"ubicacion","type":"SelectorText","parentSelectors":["link"],"selector":"._1uzVV span","multiple":false,"regex":"","delay":0},{"id":"fecha_publicacion","type":"SelectorText","parentSelectors":["link"],"selector":"._2DGqt span","multiple":false,"regex":"","delay":0},{"id":"nombre","type":"SelectorText","parentSelectors":["link"],"selector":"h1","multiple":false,"regex":"","delay":0}]}

You're probably hitting RAM limits in Web Scraper/Chrome because of the amount of data. On my computer, Load More starts to fail at around 1,500-2,000 rows. There's a kludge you can use: separate the Load More clicker from the data scrapers. In the example below, I have also set a limiter on the clicker so it stops at around 1,000 items. You can change that number in ul > li[data-aut-id='itemBox']:nth-of-type(-n+1000)

You might still hit another limit after that; e.g., on my computer, simple scrapers without Load More or scrollers can fail at around 5,000 to 8,000 rows. You'd probably want to narrow your search so that it returns fewer results and then run separate scraping sessions. On this site, for example, you could narrow it by year or by price range.
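If you end up running many separate sessions, cloning the sitemap by hand gets tedious. Here is a rough Python sketch, not a Web Scraper feature, that produces one sitemap per start URL. The function name split_sitemap, the "-partN" suffix, and the example.com URLs are placeholders I made up; in practice you would apply each filter on the site and paste the resulting URL from the address bar.

```python
import copy
import json

def split_sitemap(base, start_urls):
    """Return one copy of `base` per start URL, each with its own _id."""
    sitemaps = []
    for i, url in enumerate(start_urls, start=1):
        sm = copy.deepcopy(base)              # leave the original untouched
        sm["_id"] = f"{base['_id']}-part{i}"  # hypothetical naming scheme
        sm["startUrl"] = [url]
        sitemaps.append(sm)
    return sitemaps

# Minimal stand-in for a full sitemap; the selectors are omitted for brevity.
base = {
    "_id": "forum-olx-feb",
    "startUrl": ["https://www.olx.com.ec/autos_c378?filter=condition_eq_2"],
    "selectors": [],
}

# Placeholder URLs (assumption): replace with the URLs you get after
# applying a year or price filter on the site.
urls = [
    "https://example.com/search?range=1",
    "https://example.com/search?range=2",
]

parts = split_sitemap(base, urls)
for sm in parts:
    print(json.dumps(sm, ensure_ascii=False))
```

Each printed line can then be imported as its own sitemap and scraped in a separate session.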

Sitemap:
{"_id":"forum-olx-feb","startUrl":["https://www.olx.com.ec/autos_c378?filter=condition_eq_2"],"selectors":[{"id":"Separate Load More","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"ul > li[data-aut-id='itemBox']:nth-of-type(-n+1000)","multiple":true,"delay":"2800","clickElementSelector":"button[data-aut-id='btnLoadMore']","clickType":"clickMore","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueHTML"},{"id":"Click link","type":"SelectorLink","parentSelectors":["_root"],"selector":"ul[data-aut-id='itemsList'] > li > a","multiple":true,"delay":0},{"id":"marca","type":"SelectorText","parentSelectors":["Click link"],"selector":"span[data-aut-id='value_make']","multiple":false,"regex":"","delay":0},{"id":"año","type":"SelectorText","parentSelectors":["Click link"],"selector":"span[data-aut-id='value_year']","multiple":false,"regex":"","delay":0},{"id":"condicion","type":"SelectorText","parentSelectors":["Click link"],"selector":"span[data-aut-id='value_condition']","multiple":false,"regex":"","delay":0},{"id":"color","type":"SelectorText","parentSelectors":["Click link"],"selector":"span[data-aut-id='value_color']","multiple":false,"regex":"","delay":0},{"id":"vendedor","type":"SelectorText","parentSelectors":["Click link"],"selector":"span[data-aut-id='value_sellertype']","multiple":false,"regex":"","delay":0},{"id":"modelo","type":"SelectorText","parentSelectors":["Click link"],"selector":"span[data-aut-id='value_model']","multiple":false,"regex":"","delay":0},{"id":"kilometraje","type":"SelectorText","parentSelectors":["Click link"],"selector":"span[data-aut-id='value_mileage']","multiple":false,"regex":"","delay":0},{"id":"combustible","type":"SelectorText","parentSelectors":["Click link"],"selector":"span[data-aut-id='value_fuel']","multiple":false,"regex":"","delay":0},{"id":"transmision","type":"SelectorText","parentSelectors":["Click link"],"selector":"span[data-aut-id='value_vehicletransmission']","multiple":false,"regex":"","delay":0},{"id":"precio","type":"SelectorText","parentSelectors":["Click link"],"selector":"span._2xKfz","multiple":false,"regex":"","delay":0},{"id":"ubicacion","type":"SelectorText","parentSelectors":["Click link"],"selector":"._1uzVV span","multiple":false,"regex":"","delay":0},{"id":"fecha_publicacion","type":"SelectorText","parentSelectors":["Click link"],"selector":"._2DGqt span","multiple":false,"regex":"","delay":0},{"id":"nombre","type":"SelectorText","parentSelectors":["Click link"],"selector":"h1","multiple":false,"regex":"","delay":0}]}
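
If you want to try different caps without editing the JSON by hand, a small Python sketch like the one below can rewrite the number in the clicker's :nth-of-type(-n+N) selector before you import the sitemap. The helper name set_row_limit is my own, and the embedded string is a shortened copy of the sitemap above with only the clicker kept; treat it as an illustration, not part of Web Scraper itself.

```python
import json
import re

# Matches the cap used in the clicker selector, e.g. ":nth-of-type(-n+1000)".
CAP_PATTERN = re.compile(r":nth-of-type\(-n\+\d+\)")

def set_row_limit(sitemap, limit):
    """Rewrite the -n+N cap on every Element Click selector in the sitemap."""
    for sel in sitemap["selectors"]:
        if sel.get("type") == "SelectorElementClick":
            sel["selector"] = CAP_PATTERN.sub(
                f":nth-of-type(-n+{limit})", sel["selector"]
            )
    return sitemap

# Shortened copy of the sitemap above (data selectors removed for brevity).
exported = """{"_id":"forum-olx-feb","startUrl":["https://www.olx.com.ec/autos_c378?filter=condition_eq_2"],"selectors":[{"id":"Separate Load More","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"ul > li[data-aut-id='itemBox']:nth-of-type(-n+1000)","multiple":true,"delay":"2800","clickElementSelector":"button[data-aut-id='btnLoadMore']","clickType":"clickMore","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueHTML"}]}"""

sitemap = set_row_limit(json.loads(exported), 500)
print(json.dumps(sitemap, ensure_ascii=False))
```

The printed JSON can be pasted back into Import Sitemap. Only selectors of type SelectorElementClick are touched, so the data selectors would pass through unchanged.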


Web Scraper is not working on inactive ads. Please check and suggest a solution.
Example inactive ad link: https://www.olx.com.ec/item/perfume-ohm-black-de-yanbal-100-ml-iid-1100403292

{"_id":"olxec","startUrl":["https://www.olx.com.ec/item/vendo-gran-vitara-del-2004-iid-1100403292"],"selectors":[{"id":"Price","type":"SelectorText","parentSelectors":["_root"],"selector":"span._2xKfz","multiple":false,"regex":"","delay":0}]}