What if an element's ID changes often?

cryptoforbreakfast · February 8, 2024, 10:13am

Hi,

I have had this same situation happen multiple times, and I'm starting to wonder if there isn't some very elegant solution to the problem...

Web designers (most) often use generators to build their websites. As a result, element IDs are often randomly generated strings of characters. You can set up a scraping job to run on a recurring basis, but every so often the website is updated, and some or all elements suddenly have a new ID string. Often, however, the ID consists of a fixed part followed by the random string.

For example: <span id="OfferPrice-ffCeppM">whatever text element I want to scrape</span>

The "ffCeppM" part is random and subject to frequent change, but the "OfferPrice-" part probably never changes.

Can I set up my text selector in such a way that it keeps working even after the random string is changed? Something like:

span[id="OfferPrice-$"]
or
span[class="price-section"]:contains("OfferPrice-")

Could something like this work? Or is it just wishful thinking, and are we doomed to redo the webscraper sitemap every time there's an update on the target site?

Thanks!!

JanAp · February 8, 2024, 10:52am

Hi, you can add a * before the = in the attribute selector, e.g. span[id*="OfferPrice-"]

This will match every span whose ID contains 'OfferPrice-'.
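
To illustrate the difference between the match types, here is a minimal Python sketch that only emulates the attribute test with the standard library's html.parser; it is not how Web Scraper itself evaluates selectors. The sample markup, the price values, the "OldOfferPrice-" ID, and the helper names (SpanIdCollector, select_span_text) are all made up for illustration. It compares a substring test (like id*=) with a prefix test (like the standard CSS id^= operator), which may be a tighter fit when the fixed part always comes first.

```python
from html.parser import HTMLParser


class SpanIdCollector(HTMLParser):
    """Collect the text of <span> elements whose id passes a predicate."""

    def __init__(self, id_matches):
        super().__init__()
        self.id_matches = id_matches  # callable: id string -> bool
        self.results = []             # text of each matching span
        self._depth = 0               # > 0 while inside a matching span
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            self._depth += 1          # nested span inside a match
            return
        element_id = dict(attrs).get("id") or ""
        if self.id_matches(element_id):
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "span" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def select_span_text(html, id_matches):
    """Return the text of every span whose id satisfies id_matches."""
    parser = SpanIdCollector(id_matches)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical markup: the same page before and after a site update.
before = ('<span id="OfferPrice-ffCeppM">19.99</span>'
          '<span id="OldOfferPrice-aaa">24.99</span>')
after = ('<span id="OfferPrice-Zk31qTx">19.99</span>'
         '<span id="OldOfferPrice-bbb">24.99</span>')


def contains(value):
    # Emulates span[id*="OfferPrice-"] (substring match)
    return "OfferPrice-" in value


def starts_with(value):
    # Emulates span[id^="OfferPrice-"] (prefix match)
    return value.startswith("OfferPrice-")


print(select_span_text(before, contains))
print(select_span_text(before, starts_with))
print(select_span_text(after, starts_with))
```

In this made-up sample the substring test also picks up the "OldOfferPrice-" span, while the prefix test returns only the first span, both before and after the random part of the ID changes.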

I hope this helps!

cryptoforbreakfast · February 9, 2024, 10:59am

Hi Janis,

That worked! Thank you!