Unable to scrape after update!

I've been using Web Scraper for a week. I scrape 240 pages from a specific website daily and have done so for the last 7 days.

However, the latest update for this extension (0.4.0, released on 15 April 2019) does not work on my system!
I even tried going to the sample test website and doing simple text extractions. That did not work either.
Is anyone else facing this problem?

I've been using Web Scraper for months and I am facing the same issue.

I just updated Chrome to the current version (73) and it seems to work normally again.

Please describe exactly how the extension isn't working. Which features do not work? Which Chrome version are you running?

I have a scrape that I have been running four times a day for the last nine months without any issue. It typically scrapes around 200 records, pulling data from six pages per record (approximately 1200 pages per scrape, 4800 pages a day). As of yesterday's update, the Element click selectors used to navigate between the HTML pages are no longer functioning. Web Scraper navigates from the list page to the first page of each collection, but instead of navigating to the second through sixth pages, it skips to the first page of the next collection. I am using the current version of Chrome (73.0.3683.103).

Because our scrape is critical to our business processes, I have reverted to the prior version of Web Scraper to get our business back up and running. I would like to use the current version, but I need to figure out what you've changed in the Element click selector and adjust / retest my scrape configuration. Any suggestions are appreciated.

Here's how the click selector is currently configured, as well as a snapshot of the first three levels of the graph. The delay is at 10000 ms as I was experimenting with different delay lengths to see if it was a timing issue.

Can you please post your sitemap?

Hello, I have had the same issue since the update.

I used to scrape without any problem using this same sitemap.

Here is the website: https://candidat.pole-emploi.fr/offres/recherche?emission=1&motsCles=vendeur&offresPartenaires=false&rayon=10&tri=0

Here is the sitemap:

{"_id":"pole-emploi","startUrl":["https://candidat.pole-emploi.fr/offres/recherche?emission=1&motsCles=vendeur&offresPartenaires=false&range=0-9&rayon=10&tri=1"],"selectors":[{"id":"offres","type":"SelectorLink","parentSelectors":["_root","pagination"],"selector":"h2.t4 a.btn-reset","multiple":true,"delay":0},{"id":"titre","type":"SelectorText","parentSelectors":["offres"],"selector":"h2.t2","multiple":false,"regex":"","delay":0},{"id":"lieu","type":"SelectorText","parentSelectors":["offres"],"selector":"p.t4","multiple":false,"regex":"","delay":0},{"id":"type","type":"SelectorText","parentSelectors":["offres"],"selector":"div.description-aside dd:nth-of-type(1)","multiple":false,"regex":"","delay":0},{"id":"recruteur","type":"SelectorText","parentSelectors":["offres"],"selector":"div.apply-block dd:nth-of-type(1)","multiple":false,"regex":"","delay":0},{"id":"email","type":"SelectorText","parentSelectors":["offres"],"selector":"div.apply-block dd:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"profil","type":"SelectorText","parentSelectors":["offres"],"selector":"div.description p","multiple":false,"regex":"","delay":0},{"id":"pagination","type":"SelectorElementClick","parentSelectors":["_root","offres"],"selector":"p.results-more","multiple":true,"delay":0,"clickElementSelector":"p.results-more a.btn","clickType":"clickMore","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueText"}]}

Here's the scrape referenced in the images above. Unfortunately I can't provide access to the site for troubleshooting, as the data is behind authentication.

{
"_id": "bret_scrape_path_1",
"startUrl": [
"https://localhost.com/erfp/searchRfp.do?action=search&searchCriteriaVo.DateFrom=09%2F19%2F2018"
],
"selectors": [
{
"id": "QUERY_TABLE_ROW",
"type": "SelectorElement",
"parentSelectors": [
"_root"
],
"selector": "#divContentH div table tr:nth-of-type(n+3)",
"multiple": true,
"delay": "7000"
},
{
"id": "RFP_ID",
"type": "SelectorLink",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "a",
"multiple": false,
"delay": 0
},
{
"id": "BRET_ID",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(2)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "PARENT_STATUS",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(3)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "RELATED_RFPs",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(4)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "PROSPECT_NAME",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(5)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "ERFP_STATE",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(6)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "EPM_STATE",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(7)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "BRET_STATE",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(8)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "BRET_STATE_TIMESTAMP",
"type": "SelectorElementAttribute",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(8)",
"multiple": false,
"extractAttribute": "title",
"delay": 0
},
{
"id": "NUM_EEs",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(9)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "TLM_STATE",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(10)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "ETS_STATE",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(11)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "ADP_UW",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(12)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "AP_REGION",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(13)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "REV_CENTER",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(14)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "PAYGROUP",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(15)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "PARENT",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.smcopy:nth-of-type(16)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "FEIN",
"type": "SelectorText",
"parentSelectors": [
"QUERY_TABLE_ROW"
],
"selector": "td.border_bottom.border_right.smcopy",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "CLIENT_DBA",
"type": "SelectorElementAttribute",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.prospectBusiness']",
"multiple": false,
"extractAttribute": "value",
"delay": 0
},
{
"id": "CLIENT_ADDRESS",
"type": "SelectorElementAttribute",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.prospectAddr1']",
"multiple": false,
"extractAttribute": "value",
"delay": 0
},
{
"id": "CLIENT_ADDRESS1",
"type": "SelectorElementAttribute",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.prospectAddr2']",
"multiple": false,
"extractAttribute": "value",
"delay": 0
},
{
"id": "CLIENT_CITY",
"type": "SelectorElementAttribute",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.prospectCity']",
"multiple": false,
"extractAttribute": "value",
"delay": 0
},
{
"id": "CLIENT_STATE",
"type": "SelectorText",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.prospectState'] option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "CLIENT_ZIP",
"type": "SelectorElementAttribute",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.prospectZip']",
"multiple": false,
"extractAttribute": "value",
"delay": 0
},
{
"id": "LEAD_SOURCE",
"type": "SelectorText",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name ='demographicsVo.leadId'] option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "LEAD_SOURCE_DETAIL",
"type": "SelectorText",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.leadSourceDetailId'] option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "BROKER_TYPE",
"type": "SelectorText",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.brokerType'] option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "BROKER_NAME",
"type": "SelectorText",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.brokerId'] option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "AGENT_NAME",
"type": "SelectorText",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.brokerAgentId'] option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "BROKER_LOCATION",
"type": "SelectorText",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.brokerLocationId'] option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "IS_NOT_EXISTING_CLIENT",
"type": "SelectorElementAttribute",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.hasAdpAffiliation']:nth-of-type(2)",
"multiple": false,
"extractAttribute": "checked",
"delay": 0
},
{
"id": "PARENT_ID",
"type": "SelectorElementAttribute",
"parentSelectors": [
"RFP_ID"
],
"selector": "[name = 'demographicsVo.parentTsCode'] ",
"multiple": false,
"extractAttribute": "value",
"delay": 0
},
{
"id": "CONTACTS_LINK",
"type": "SelectorElementClick",
"parentSelectors": [
"RFP_ID"
],
"selector": "tr:nth-of-type(2)",
"multiple": false,
"delay": "2500",
"clickElementSelector": "td.copy table:contains('Contacts') a",
"clickType": "clickOnce",
"discardInitialElements": true,
"clickElementUniquenessType": "uniqueText"
},
{
"id": "DM_NAME",
"type": "SelectorText",
"parentSelectors": [
"CONTACTS_LINK"
],
"selector": "td tr:nth-of-type(2) td.smcopy:nth-of-type(2)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "OFFICE",
"type": "SelectorText",
"parentSelectors": [
"CONTACTS_LINK"
],
"selector": "tr:nth-of-type(3) td.smcopy:nth-of-type(2)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "REGION",
"type": "SelectorText",
"parentSelectors": [
"CONTACTS_LINK"
],
"selector": "tr:nth-of-type(2) td.smcopy:nth-of-type(6)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "CENSUS_LINK",
"type": "SelectorElementClick",
"parentSelectors": [
"CONTACTS_LINK"
],
"selector": "",
"multiple": false,
"delay": "2000",
"clickElementSelector": "td.copy table:contains('BRET Census') a",
"clickType": "clickOnce",
"discardInitialElements": true,
"clickElementUniquenessType": "uniqueText"
},
{
"id": "CURRENT_CARRIERS",
"type": "SelectorGroup",
"parentSelectors": [
"CENSUS_LINK"
],
"selector": "td.label.border_left td:nth-of-type(3) option",
"delay": 0,
"extractAttribute": ""
},
{
"id": "PEO",
"type": "SelectorText",
"parentSelectors": [
"CENSUS_LINK"
],
"selector": "td.smcopy.border_top:nth-of-type(6)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "SIC_CODE",
"type": "SelectorElementAttribute",
"parentSelectors": [
"CENSUS_LINK"
],
"selector": "input#sic",
"multiple": false,
"extractAttribute": "value",
"delay": 0
},
{
"id": "NAICS_CODE",
"type": "SelectorText",
"parentSelectors": [
"CENSUS_LINK"
],
"selector": "select#naicsCodeSelect option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "QUESTIONNAIRE_LINK",
"type": "SelectorElementClick",
"parentSelectors": [
"CENSUS_LINK"
],
"selector": "
",
"multiple": false,
"delay": "9000",
"clickElementSelector": "td.copy table:contains('BRET Questionnaire') a",
"clickType": "clickOnce",
"discardInitialElements": true,
"clickElementUniquenessType": "uniqueText"
},
{
"id": "COVERAGE_EMP",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "tr:nth-of-type(2):contains('Employee') td.smcopy.border_right",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "COVERAGE_ES",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "tr:nth-of-type(3):contains('Employee+Spouse') td.smcopy.border_right",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "COVERAGE_EC",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "tr:nth-of-type(4):contains('Employee+Children') td.smcopy.border_right",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "COVERAGE_EF",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "tr:nth-of-type(5):contains('Employee+Family') td.smcopy.border_right",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "COVERAGE_NONE",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "tr:nth-of-type(6):contains('No Coverage') td.smcopy.border_right",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "COVERAGE_TERM",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "tr:nth-of-type(7):contains('Terminated') td.smcopy.border_right",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "AS_FACTOR",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "tr:nth-of-type(2):contains('Age/Sex Factor') td.smcopy.border_right",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "IND_FACTOR",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "tr:nth-of-type(3):contains('Industry Factor') td.smcopy.border_right",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "ASI_SCORE",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "tr:nth-of-type(4):contains('Age/Sex/Industry') td.smcopy.border_right",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "EXISTING_MEDICAL",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "select#question_3.smcopy option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "MESSAGE_HISTORY",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "td.smcopy:contains('Message History:')",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "NOTES",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "td.smcopy:contains('Notes:')",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "SELF_INSURED",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "[name='question_15'] option:checked",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "RENEWAL_DATE",
"type": "SelectorHTML",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "#questionnaire tr:contains('b. What is the medical renewal date?') td.label:nth-of-type(2)",
"multiple": false,
"regex": "(?<=value=")([0-9]|/)+",
"delay": 0
},
{
"id": "NUM_RENEWALS",
"type": "SelectorHTML",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "#questionnaire tr:contains('c. How many renewals') td.label:nth-of-type(2)",
"multiple": false,
"regex": "(?<=value=")([0-9])+",
"delay": 0
},
{
"id": "LAST_RENEWAL_INCREASE",
"type": "SelectorHTML",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "#questionnaire tr:contains('d. What was the last renewal') td.label:nth-of-type(2)",
"multiple": false,
"regex": "(?<=value=")([0-9])+",
"delay": 0
},
{
"id": "UPCOMING_RENEWAL_INCREASE",
"type": "SelectorHTML",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "#questionnaire tr:contains('e. What is the upcoming renewal') td.label:nth-of-type(2)",
"multiple": false,
"regex": "(?<=value=")([0-9])+",
"delay": 0
},
{
"id": "DISABLED_WC_COUNT",
"type": "SelectorHTML",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "#questionnaire tr:contains('3. How many employees are not actively at work') td.label:nth-of-type(2)",
"multiple": false,
"regex": "(?<=value=")([0-9])+",
"delay": 0
},
{
"id": "COVERAGE_WAIVED_RATIO",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "table:nth-child(9) tbody tr:nth-child(3) td table tbody tr:nth-child(2) td",
"multiple": false,
"regex": "([0-9])+",
"delay": 0
},
{
"id": "COVERAGE_COBRA_RATIO",
"type": "SelectorText",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "td.label:contains('Total COBRA Participants')",
"multiple": false,
"regex": "([0-9])+",
"delay": 0
},
{
"id": "ERFPSUMMARY_LINK",
"type": "SelectorElementClick",
"parentSelectors": [
"QUESTIONNAIRE_LINK"
],
"selector": "",
"multiple": false,
"delay": "1500",
"clickElementSelector": "td.copy table:contains('eRFP Summary') a",
"clickType": "clickOnce",
"discardInitialElements": true,
"clickElementUniquenessType": "uniqueText"
},
{
"id": "SALES_COORDINATOR",
"type": "SelectorText",
"parentSelectors": [
"ERFPSUMMARY_LINK"
],
"selector": "table:contains('eRFP Summary') table:nth-of-type(1) tr:contains('Sales Coordinator') td.smcopy",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "TLM_SALES_LINK",
"type": "SelectorElementClick",
"parentSelectors": [
"ERFPSUMMARY_LINK"
],
"selector": "
",
"multiple": false,
"delay": "2000",
"clickElementSelector": "td.copy table:contains('TLM Sales') a",
"clickType": "clickOnce",
"discardInitialElements": true,
"clickElementUniquenessType": "uniqueText"
},
{
"id": "SALES_EXECUTIVE",
"type": "SelectorText",
"parentSelectors": [
"TLM_SALES_LINK"
],
"selector": "#mainTBody table table tr:contains('SE/SD Name:') td:nth-of-type(2)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "SALES_EXECUTIVE_EMAIL",
"type": "SelectorText",
"parentSelectors": [
"TLM_SALES_LINK"
],
"selector": "#mainTBody table table tr:contains('SE/SD Name:') td:nth-of-type(4)",
"multiple": false,
"regex": "",
"delay": 0
},
{
"id": "DM_EMAIL",
"type": "SelectorText",
"parentSelectors": [
"TLM_SALES_LINK"
],
"selector": "#mainTBody table table tr:contains('DM Name:') td:nth-of-type(4)",
"multiple": false,
"regex": "",
"delay": 0
}
]
}

Bump. Wondering if there are any updates.

For me, the following was happening when I started my scrape:

  1. The pop-up window in which scraping is done gets opened.
  2. The scraper goes through the pages exactly as I have set it up (clicking on the link, going to the next page, etc.).
  3. The scraper finishes and the window gets closed.
  4. I click the "Refresh data" button, and it shows no data scraped.

I needed my setup running ASAP, so I have switched back to the earlier version of Web Scraper. That continues to work flawlessly.

Got the same trouble here. How do you guys go back to a previous version? It's really important to me too...

The pagination handler in the sitemap was set up incorrectly and shouldn't have worked in the previous version. The Element click selector has to select both the elements that are being loaded within the page and the load more button. You had selected the wrapper element of the button and the button itself. The recursive selector chain that you had set up was somehow making the sitemap work. If the Element click selector is set up correctly, you don't need these recursive selector chains. Watch the video tutorial that shows how to set up pagination handling with the Element click selector: https://www.youtube.com/watch?v=x8bZmUrJBl0

Here is the fixed sitemap:

{"_id":"pole-emploi","startUrl":["https://candidat.pole-emploi.fr/offres/recherche?emission=1&motsCles=vendeur&offresPartenaires=false&range=0-9&rayon=10&tri=1"],"selectors":[{"id":"offres","type":"SelectorLink","parentSelectors":["pagination"],"selector":"h2.t4 a.btn-reset","multiple":true,"delay":0},{"id":"titre","type":"SelectorText","parentSelectors":["offres"],"selector":"h2.t2","multiple":false,"regex":"","delay":0},{"id":"lieu","type":"SelectorText","parentSelectors":["offres"],"selector":"p.t4","multiple":false,"regex":"","delay":0},{"id":"type","type":"SelectorText","parentSelectors":["offres"],"selector":"div.description-aside dd:nth-of-type(1)","multiple":false,"regex":"","delay":0},{"id":"recruteur","type":"SelectorText","parentSelectors":["offres"],"selector":"div.apply-block dd:nth-of-type(1)","multiple":false,"regex":"","delay":0},{"id":"email","type":"SelectorText","parentSelectors":["offres"],"selector":"div.apply-block dd:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"profil","type":"SelectorText","parentSelectors":["offres"],"selector":"div.description p","multiple":false,"regex":"","delay":0},{"id":"pagination","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.media-body","multiple":true,"delay":"1000","clickElementSelector":"p.results-more a.btn","clickType":"clickMore","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueText"}]}
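One way to spot a recursive selector chain like the one described above is to treat the "parentSelectors" lists as a graph and flag any selector that ends up being its own ancestor. The following is only a sketch in Python, not something Web Scraper provides: the function name selectors_in_cycles is made up for illustration, and the two dictionaries keep only the "id" and "parentSelectors" fields of the "offres" and "pagination" selectors from the two sitemaps in this thread. A cycle is not necessarily a mistake in every sitemap, so treat the result as a list of selectors worth reviewing.

```python
def selectors_in_cycles(sitemap: dict) -> set:
    """Return ids of selectors that are their own ancestor via parentSelectors."""
    parents = {
        s["id"]: set(s.get("parentSelectors", []))
        for s in sitemap["selectors"]
    }

    def ancestors(start):
        # Walk upwards through parentSelectors, collecting every ancestor id.
        seen, stack = set(), list(parents.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(parents.get(node, ()))
        return seen

    return {sid for sid in parents if sid in ancestors(sid)}


# Parent relationships copied from the original sitemap posted in this thread.
before = {"selectors": [
    {"id": "offres", "parentSelectors": ["_root", "pagination"]},
    {"id": "pagination", "parentSelectors": ["_root", "offres"]},
]}

# Parent relationships copied from the fixed sitemap.
after = {"selectors": [
    {"id": "offres", "parentSelectors": ["pagination"]},
    {"id": "pagination", "parentSelectors": ["_root"]},
]}

print(sorted(selectors_in_cycles(before)))  # ['offres', 'pagination']
print(sorted(selectors_in_cycles(after)))   # []
```

In the original sitemap the two selectors list each other as parents, which is the recursive chain; in the fixed one the chain runs one way, from _root to pagination to offres.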

The sitemap must be incorrect. Almost every click selector within the sitemap has nothing in the "selector" input field. Either you somehow tricked the validation into keeping these fields empty, or you edited the sitemap code. You will have to update the Element click selectors. Try putting "parent" in the selector field if you are going through tabs and not going deeper into some kind of navigation.
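A quick way to see which Element click selectors have a blank "selector" field is to load the exported sitemap JSON and filter on the selector type. The following is a minimal sketch in Python, not part of Web Scraper itself: the function name find_blank_click_selectors and the trimmed-down example sitemap are made up for illustration, and only the "id", "type", and "selector" keys are taken from the sitemaps posted in this thread.

```python
import json


def find_blank_click_selectors(sitemap: dict) -> list:
    """Return ids of Element click selectors whose "selector" field is blank."""
    return [
        s["id"]
        for s in sitemap.get("selectors", [])
        if s.get("type") == "SelectorElementClick"
        and not s.get("selector", "").strip()
    ]


# Illustrative, shortened sitemap (assumption: not a real export).
example = json.loads("""
{
  "_id": "example",
  "selectors": [
    {"id": "CONTACTS_LINK", "type": "SelectorElementClick", "selector": "tr:nth-of-type(2)"},
    {"id": "CENSUS_LINK", "type": "SelectorElementClick", "selector": ""},
    {"id": "OFFICE", "type": "SelectorText", "selector": "td.smcopy"}
  ]
}
""")

print(find_blank_click_selectors(example))  # ['CENSUS_LINK']
```

To run it on a real sitemap, replace the example string with the text copied from the extension's export view.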

If a newer version is really breaking something, we are fully committed to fixing the issue. Right now it seems that the only reported cases involve misconfiguration.

Please provide the sitemap that is broken. Without the sitemap we cannot replicate problems.

Indeed, it was my mistake. I have another problem now: it scrolls normally (as long as there are elements to scroll), but sometimes it crashes (the window where the scraping is being done closes). Maybe it's because the page I am scrolling is very long? (It stops after 10-15 minutes of scrolling.)

What should I do?

Edit / details: I scroll a page; when I reach the bottom, new elements are loaded, and so on. When all elements are loaded, I scrape the links of the elements shown.

It would appear that the asterisk character got stripped when I pasted the sitemap above. The selector properties that you identified as empty are actually populated by an asterisk ("*"), as follows:

Based on your documentation, this does not appear to be a "trick". The 'all' selector is supported in both CSS and jQuery.

The use of "parent" for the selector input field does not work in my sitemap. I get the error "cannot find parent" when I try to select elements on the target page.

The asterisk is a supported CSS selector. That said, it selects every element within the page, so all the child selectors of this selector would be executed within every element. This doesn't seem correct.

The selector in Element Click Selector should select the wrapper element in which child selectors should be executed. Try updating the selector so that it selects the right parent element for the child selectors.

Additional information. Here's an example of the 'OFFICE' text selector, which is a child of the CONTACTS_LINK link selector. Element preview works fine to identify the content, but data preview returns nothing. CONTACTS_LINK is the first failing link in the nested link path, and uses "selector": "tr:nth-of-type(2)".

The best solution would be to avoid the scroll down selector. Try navigating deeper in the site by opening subcategory links or some kind of filter links.

If that is not possible, you can try to limit the number of elements that the selector extracts. Unfortunately, this currently requires manually editing the CSS selector to add an element limiter. For example, appending :nth-child(-n+10) to the CSS selector selects the first 10 elements.
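To see why :nth-child(-n+10) keeps only the first 10 elements, recall that :nth-child(an+b) matches a child at position p when p = a*n + b for some integer n >= 0. The Python sketch below just works through that arithmetic; it does not call the extension or a browser. Both function names and the "div.item" selector are made up for illustration. Note that :nth-child counts positions among siblings of the same parent, so the limiter only behaves as "first 10" when the scraped elements sit next to each other under one parent.

```python
def nth_child_matches(a: int, b: int, sibling_count: int) -> list:
    """1-based positions matched by :nth-child(an+b) among sibling_count siblings."""
    matched = []
    for pos in range(1, sibling_count + 1):
        diff = pos - b
        if a == 0:
            ok = diff == 0
        else:
            # pos matches when pos = a*n + b for some integer n >= 0
            ok = diff % a == 0 and diff // a >= 0
        if ok:
            matched.append(pos)
    return matched


def limit_selector(css_selector: str, limit: int) -> str:
    """Append an element limiter of the form :nth-child(-n+LIMIT)."""
    return f"{css_selector}:nth-child(-n+{limit})"


# -n+10 means a = -1, b = 10
print(nth_child_matches(-1, 10, 25))   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(limit_selector("div.item", 10))  # div.item:nth-child(-n+10)
```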