Please help: it selects more elements than it should

Hey Forum,

I need help again. I tried it, but it doesn't work :confused:
I want to scrape the following contents from a page (1):
Team A - Team B, the first digit of the first entry of every bar graph, and the second digit of the first entry of every bar graph.

For example:
First row:

  1. cell: FC Schalke 04 - SV Werder Bremen
  2. cell: 2
  3. cell: 1

Second row:

  1. cell: Bayer Leverkusen - Hannover 96
  2. cell: 2
  3. cell: 1

...and so on...

It's pretty tricky (for me), because when I select the results it also selects a team name, and I don't know why. Can you help me?
Thanks!

(1) http://www.westline-tippspiel.de/tipptrend.php

Does nobody know an answer or have a tip for me? :frowning:

I can't sort it out, sorry. The way the page is set up, I believe you would need regex to scrape it. @iconoclast, this one's up your alley.

Yeah, I tried it with regex, but I just don't have enough skill to use it. @iconoclast, can you help?

Hi!

It can be done without RegEx; it just takes figuring out the right selectors.

In this case, due to the design of the website, it isn't possible to get a perfect grouping, but I've managed to make it work the way you described.

Your sitemap:

{"_id":"tippspiel","startUrl":["http://www.westline-tippspiel.de/tipptrend.php"],"selectors":[{"id":"group","type":"SelectorElement","parentSelectors":["_root"],"selector":"h3, [id^=chart_div] g g g:nth-of-type(1) text, [id^=chart_div] g g g:nth-of-type(2) text","multiple":true,"delay":0},{"id":"Teams","type":"SelectorText","parentSelectors":["group"],"selector":"_parent_","multiple":false,"regex":"","delay":0}]}
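
Since the `group` selector returns headings and chart labels as one flat list, the rows can be regrouped after export, outside the scraper. Below is a minimal Python sketch of that idea, not part of the sitemap itself. It assumes the exported values arrive in page order, that a match heading contains a hyphen, and that a result looks like `2:1`; the `regroup` function and the sample list are hypothetical and only for illustration.

```python
import re

# Assumption: a result label looks like "2:1" (digits, colon, digits).
SCORE = re.compile(r"^(\d+):(\d+)$")


def regroup(values):
    """Turn a flat, page-ordered list of scraped strings into
    (match, first_digit, second_digit) rows."""
    rows = []
    current_match = None
    for value in values:
        text = value.strip()
        score = SCORE.match(text)
        if score and current_match is not None:
            # Keep only the first result that follows a heading.
            rows.append((current_match, int(score.group(1)), int(score.group(2))))
            current_match = None
        elif not score and "-" in text:
            # Assumption: only match headings contain a hyphen,
            # with or without spaces around it.
            current_match = text
        # Anything else (for example a lone team name) is skipped.
    return rows


# Hypothetical sample in the shape described in this thread.
scraped = [
    "TSG 1899 Hoffenheim - VfB Stuttgart",
    "VfB Stuttgart",  # stray text element between heading and result
    "2:1",
    "1:1",            # made-up later bar label, ignored
    "Borussia Dortmund-Hertha BSC",
    "3:1",
]

for row in regroup(scraped):
    print(row)
```

Each printed row holds the match name followed by the two digits as separate values, so it maps directly onto three spreadsheet cells. If a team name itself contains a hyphen, the heading check would need to be stricter.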

Thanks for helping!

How did you group it without the text element between the teams and the results (for example, in the current version of the website, without "VfB Stuttgart" between "TSG 1899 Hoffenheim - VfB Stuttgart" and "2:1")? I want to learn how it works, so I tried it myself, but I can't deselect that element.

Also, my actual plan is to put the teams and the result on the same line. For the first example in the current version of the website:
Line 1: Cell 1: "TSG 1899 Hoffenheim - VfB Stuttgart" Cell 2: "2" Cell 3: "1"
Line 2: Cell 1: "Borussia Dortmund-Hertha BSC" Cell 2: "3" Cell 3: "1"
...and so on...
Sorry for all the questions, but I'm really interested in how to scrape this.