Hi,
How can I exclude part of an element? For example, with the sitemap below I only want the value "12 Watts", but I only manage to scrape "Power Handling (RMS) De belasting in elektrisch vermogen uitgedrukt in watt dat een luidsprekerspoel kan opnemen voor een langere tijd, zonder de luidsprekerspoel te beschadigen. RMS staat voor Root Mean Square. 12 Watts".
This is what I see when I inspect the element:
<li class>
<span>Power Handling (RMS)
<i class="info-wrapper"> ... </i>
</span>
12 Watts
::after
</li>
This is the selector I use, which gives me all the text: .list-info li:nth-of-type(4)
If I wanted everything BUT the value, I could use: .list-info li:nth-of-type(4) span
So what selector do I use to get only 12 Watts, without the span?
EDIT: In this particular case I can manage with the regex (?<=Square.)[1-9 ]+ (result = 12 ), but there must be a better, more generalized way, no?
In case it helps: if I use the "CSS Selector Finder" plugin on the value in the developer console, it gives me the error: Can't generate CSS selector for non-element node type.
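That error hints at the cause: "12 Watts" is not wrapped in its own element. It is a bare text node sitting directly inside the li, and CSS selectors can only match elements, never text nodes. As a rough illustration outside Web Scraper, here is a small Python sketch that parses a simplified copy of the markup above and collects only the text that belongs to the li itself. The helper name direct_text is made up for this example, and the snippet is trimmed so it parses as XML; it is a sketch of the idea, not something the extension runs.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the inspected markup (assumption: trimmed so it is valid XML).
SNIPPET = (
    '<li>'
    '<span>Power Handling (RMS)<i class="info-wrapper">...</i></span>'
    ' 12 Watts '
    '</li>'
)


def direct_text(element):
    """Return the text owned by the element itself, skipping child elements.

    Hypothetical helper for this illustration only.
    """
    # Text before the first child is stored on the element itself.
    parts = [element.text or ""]
    # Text that follows a child element is stored as that child's "tail".
    parts.extend(child.tail or "" for child in element)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(" ".join(parts).split())


li = ET.fromstring(SNIPPET)
print(direct_text(li))
```

In a browser the same idea would mean walking the li's child nodes and keeping only the text nodes (nodeType 3). Whether the scraper's Text selector can do that on its own is something I have not verified, which is why a regex on the full text is the usual fallback.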
Url: Tang Band W3-1878 woofer kopen - SoundImports
Sitemap:
{"_id":"soundimports_oneitempage","startUrl":["https://www.soundimports.eu/nl/tang-band-w3-1878.html"],"selectors":[{"id":"specs_element","type":"SelectorElement","parentSelectors":["_root"],"selector":"article.a","multiple":false,"delay":0},{"id":"value-select--not-working","type":"SelectorText","parentSelectors":["specs_element"],"selector":"li:nth-of-type(4) ","multiple":false,"regex":"","delay":0}]}
The selector alternative is new to me, and for the regex I'll need to look up what \d does. For now I went with (?<="end of beginning string that I wish to ignore").*$
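For anyone landing here later, a quick sketch of how the two regex ideas behave on the scraped string. It is written in Python only so it can be run standalone; Web Scraper evaluates the regex in the browser, so treat the identical behaviour there as an assumption. \d matches a single digit 0-9, so \d+ also handles values such as 100, which the earlier class [1-9 ] would cut off after the 1. The variable names and the second pattern are my own illustration, not taken from an answer.

```python
import re

# The full text the Text selector currently returns for the list item.
TEXT = (
    "Power Handling (RMS) De belasting in elektrisch vermogen uitgedrukt in watt "
    "dat een luidsprekerspoel kan opnemen voor een langere tijd, zonder de "
    "luidsprekerspoel te beschadigen. RMS staat voor Root Mean Square. 12 Watts"
)

# Lookbehind approach: keep everything after the known end of the description.
after_label = re.search(r"(?<=Square\. ).*$", TEXT)

# More general approach: a number (optionally with a decimal part) followed by
# a unit word at the very end of the string, whatever the description says.
trailing_value = re.search(r"\d+(?:[.,]\d+)?\s*\w+\s*$", TEXT)

print(after_label.group(0))
print(trailing_value.group(0))
```

The lookbehind version depends on the exact wording of each description, so it needs a different pattern per spec row. The trailing-value version only assumes that the value comes last and starts with a digit, which may not hold for every row on the page.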