Results in incorrect order and in different rows

Hello, I am trying to get data from mismarcadores.com. I watched the Web Scraper tutorial videos, but I didn't get the results I want.

I am trying to extract data from Resultados LaLiga Santander 2018/2019 - Fútbol/España, but I can't get the results in the correct order (as they appear on the web page), and I get three records per match: one for the home team, one for the away team, and another for the result.

Can you help me?

My sitemap is:

{"_id":"ligasantander","startUrl":["https://www.mismarcadores.com/futbol/espana/laliga/resultados/"],"selectors":[{"id":"Local","type":"SelectorText","parentSelectors":["_root"],"selector":"span.padr","multiple":true,"regex":"","delay":0},{"id":"Visitante","type":"SelectorText","parentSelectors":["_root"],"selector":"span.padl","multiple":true,"regex":"","delay":0},{"id":"Resultado","type":"SelectorText","parentSelectors":["_root"],"selector":"td.cell_sa","multiple":true,"regex":"","delay":0}]}

Thanks.

Hi,

The offset probably happens because, with the way you have selected the information, the scraper cannot tell which items should be grouped together. That is why any inconsistency in the data can give you a faulty result.

Next time, try using the Element selector first to group the items together the way you would like them to be extracted.

Here is an updated sitemap that should work:

{"_id":"ligasantander","startUrl":["https://www.mismarcadores.com/futbol/espana/laliga/resultados/"],"selectors":[{"id":"Local","type":"SelectorText","parentSelectors":["score-element"],"selector":".team-home","multiple":false,"regex":"","delay":0},{"id":"Visitante","type":"SelectorText","parentSelectors":["score-element"],"selector":".team-away","multiple":false,"regex":"","delay":0},{"id":"Resultado","type":"SelectorText","parentSelectors":["score-element"],"selector":".score","multiple":false,"regex":"","delay":0},{"id":"score-element","type":"SelectorElement","parentSelectors":["_root"],"selector":"tr.stage-finished","multiple":true,"delay":0}]}
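For anyone who wants to see the difference outside the extension, here is a minimal Python sketch of the grouping idea. It is not how Web Scraper is implemented: the HTML fragment, the team names, and the helper names `texts` and `first` are made up for illustration, and only the class names are taken from the sitemap above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The second match has no score cell,
# which stands in for "any inconsistency in the data".
HTML = """
<table>
  <tr class="stage-finished">
    <td class="team-home">Team A</td>
    <td class="team-away">Team B</td>
    <td class="score">2 : 1</td>
  </tr>
  <tr class="stage-finished">
    <td class="team-home">Team C</td>
    <td class="team-away">Team D</td>
  </tr>
  <tr class="stage-finished">
    <td class="team-home">Team E</td>
    <td class="team-away">Team F</td>
    <td class="score">0 : 0</td>
  </tr>
</table>
""".strip()

root = ET.fromstring(HTML)


def texts(node, css_class):
    """Return the text of every <td> with this class below `node`."""
    return [td.text for td in node.findall(f".//td[@class='{css_class}']")]


def first(node, css_class):
    """Return the first match below `node`, or None if there is none."""
    found = texts(node, css_class)
    return found[0] if found else None


# Flat approach: three independent "multiple" selectors on the root.
# Nothing links a home team to its away team or its score.
flat_home = texts(root, "team-home")
flat_away = texts(root, "team-away")
flat_score = texts(root, "score")   # only 2 items for 3 matches

# Grouped approach: select each row first, then look inside that row only.
rows = [
    {
        "Local": first(tr, "team-home"),
        "Visitante": first(tr, "team-away"),
        "Resultado": first(tr, "score"),
    }
    for tr in root.findall(".//tr[@class='stage-finished']")
]

print(len(flat_home), len(flat_score))
for row in rows:
    print(row)
```

In the flat version the score list is shorter than the team lists, so there is no reliable way to line them up afterwards. In the grouped version the match without a score simply gets an empty `Resultado`, and the other rows stay aligned.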

Look at this:

{"_id":"mismarcadoresdirecto","startUrl":["https://www.mismarcadores.com/"],"selectors":[{"id":"endirecto","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.event","multiple":true,"delay":0,"clickElementSelector":"div.tabs li:nth-of-type(2) a","clickType":"clickOnce","discardInitialElements":true,"clickElementUniquenessType":"uniqueHTML"},{"id":"pais","type":"SelectorElement","parentSelectors":["pais"],"selector":"div.event__header","multiple":false,"delay":0},{"id":"partidos","type":"SelectorElement","parentSelectors":["endirecto"],"selector":"div.event__match","multiple":true,"delay":0},{"id":"local","type":"SelectorText","parentSelectors":["partidos"],"selector":"div.event__participant.event__participant--home","multiple":false,"regex":"","delay":0},{"id":"visitante","type":"SelectorText","parentSelectors":["partidos"],"selector":"div.event__participant.event__participant--away","multiple":false,"regex":"","delay":0},{"id":"resultado","type":"SelectorText","parentSelectors":["partidos"],"selector":"div.event__scores","multiple":false,"regex":"","delay":0}]}

Hello, thanks for your answer. Now the result is correct, but I am trying to reproduce it myself and I can't arrive at your sitemap. I did several tests without success. For example, the test most similar to your sitemap is:

{"_id":"prueba","startUrl":["https://www.mismarcadores.com/futbol/espana/laliga/resultados/"],"selectors":[{"id":"Local","type":"SelectorText","parentSelectors":["score-element"],"selector":"tr#g_1_U9NUtHGo.odd span.padr","multiple":false,"regex":"","delay":0},{"id":"Visitante","type":"SelectorText","parentSelectors":["score-element"],"selector":"tr#g_1_U9NUtHGo.odd span.padl","multiple":false,"regex":"","delay":0},{"id":"Resultado","type":"SelectorText","parentSelectors":["score-element"],"selector":"tr#g_1_U9NUtHGo.odd td.cell_sa","multiple":false,"regex":"","delay":0},{"id":"score-element","type":"SelectorElement","parentSelectors":["_root"],"selector":"tr#g_1_U9NUtHGo.odd td","multiple":true,"delay":0}]}

There are a few differences, but I don't know how you got ".team-home". I think you looked at the HTML code with Inspect. I don't do that. I only click on the column with the home team, so I get tr#g_1_U9NUtHGo.odd span.padr.

I followed these steps to create this sitemap: I created the 'Local' text selector, the 'Visitante' text selector, and the 'Resultado' text selector. Then I created the score-element selector and changed the parent of the 'Local', 'Visitante', and 'Resultado' text selectors to 'score-element'. Is this sequence correct?

Thanks for your help.

Thanks, 'socruel', for your answer. At the moment I want to understand webber's answer first.

Best regards.

I'm running into the exact same issue. I'm trying to get 3 columns (very similar to the multi-item element tutorial in the tutorial videos); however, even when I follow it step by step, I end up with the entries for each column on separate lines. Here's the map:

{"_id":"grubhub-chinalee","startUrl":["https://www.grubhub.com/restaurant/china-lee-4662-s-yosemite-st-greenwood-village/166590"],"selectors":[{"id":"ItemWrapper","type":"SelectorElement","parentSelectors":["_root"],"selector":".menuItemNew","multiple":true,"delay":0},{"id":"miName","type":"SelectorText","parentSelectors":["ItemWrapper"],"selector":".menuItem-name","multiple":true,"regex":"","delay":0},{"id":"miPrice","type":"SelectorText","parentSelectors":["ItemWrapper"],"selector":".menuItem-displayPrice","multiple":true,"regex":"","delay":0},{"id":"miDesc","type":"SelectorText","parentSelectors":["ItemWrapper"],"selector":".menuItemNew-description--truncate","multiple":true,"regex":"","delay":0}]}

Like magarcia, I'm trying to wrap my head around where the point of failure is. Does the element wrapper parent class selector need to be present in the child selectors? I tried that, but it ends up not scraping anything.

Hello Rule72, I tried to import your sitemap, but I got the message "Grubhub food delivery is not available in your country". So I would need a VPN to view your sitemap in Chrome.

Best regards.

I'm having this same issue, was there ever a fix?

Hello, if you haven't done so already, I suggest checking out the "single page multiple record extraction" tutorial video:

Keep in mind that in most cases when using an Element selector, its child selectors shouldn't have Multiple checked. If the selector hierarchy seems to be set up correctly but the issue persists, this is often the problem.
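As a rough illustration of that last point, here is a small Python sketch that takes a sitemap and unchecks Multiple on every data selector whose parent is an Element selector. The function name `uncheck_multiple_on_children` is hypothetical, the sitemap is a shortened copy of the grubhub one posted earlier in this thread, and whether this change fixes a particular sitemap is an assumption you would need to verify by re-importing it.

```python
import copy
import json

# Wrapper selector types that appear in this thread (assumption: treat
# only these as "Element" parents; nested wrappers are left untouched).
ELEMENT_TYPES = {"SelectorElement", "SelectorElementClick"}


def uncheck_multiple_on_children(sitemap):
    """Return a copy of `sitemap` with multiple=false on data selectors
    that sit directly under an Element-type selector."""
    fixed = copy.deepcopy(sitemap)
    element_ids = {
        s["id"] for s in fixed["selectors"] if s["type"] in ELEMENT_TYPES
    }
    for s in fixed["selectors"]:
        is_wrapper = s["type"] in ELEMENT_TYPES
        has_element_parent = any(p in element_ids for p in s["parentSelectors"])
        if not is_wrapper and has_element_parent:
            s["multiple"] = False
    return fixed


# Shortened copy of the grubhub sitemap from this thread (startUrl omitted).
sitemap = {
    "_id": "grubhub-chinalee",
    "selectors": [
        {"id": "ItemWrapper", "type": "SelectorElement",
         "parentSelectors": ["_root"], "selector": ".menuItemNew",
         "multiple": True, "delay": 0},
        {"id": "miName", "type": "SelectorText",
         "parentSelectors": ["ItemWrapper"], "selector": ".menuItem-name",
         "multiple": True, "regex": "", "delay": 0},
        {"id": "miPrice", "type": "SelectorText",
         "parentSelectors": ["ItemWrapper"], "selector": ".menuItem-displayPrice",
         "multiple": True, "regex": "", "delay": 0},
    ],
}

fixed = uncheck_multiple_on_children(sitemap)
print(json.dumps(fixed, separators=(",", ":")))
```

The wrapper (`ItemWrapper`) keeps Multiple checked, since it is the selector that should produce one record per item; only its text children are switched to single.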