How to extract an ID from a URL

mage · February 12, 2021, 4:57am

Hello,

I'm trying to extract an ID from a URL.

For example, on this page: Age of Conan Quest - Ruins and Demise
In the page's code, many attributes contain "119", so I tried to use "SelectorElementAttribute", but it has no regex option. How can I accomplish this?

Sitemap:
{"_id":"extract_id","startUrl":["https://web.archive.org/web/20131207042638/http://conan.doomdealer.com:80/quests/quest/119"],"selectors":[{"id":"the-id","type":"SelectorElementAttribute","parentSelectors":["root"],"selector":"div[id*='unit_long_quest_hard']","multiple":false,"extractAttribute":"id","delay":0}]}

Thx

mage · February 12, 2021, 2:15pm

I have found out how to do it :).

This post helped me => How to scrape youtube id from meta data? - #3 by leemeng

The id is present in the id of an element:
<div id="unit_long_quest_hard_119"></div>

So, set the selector type to 'HTML',
then select the element with:
div[id*='unit_long_quest_hard_'] // *= means "contains"
and the regex is simply:
\d+

The result is '119'.
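
To check the regex outside the scraper, here is a minimal Python sketch. It assumes the regex field keeps the first match, which is how re.search behaves; the html string is the element quoted above, and the variable names are only illustrative.

```python
import re

# Markup of the element matched by div[id*='unit_long_quest_hard_'],
# copied from the post above.
html = '<div id="unit_long_quest_hard_119"></div>'

# \d+ matches the first run of one or more digits in the string.
match = re.search(r"\d+", html)
quest_id = match.group(0) if match else None

print(quest_id)  # 119
```

This works because the only digits in the extracted HTML belong to the ID. If other numbers appeared earlier in the markup, a more specific pattern would be needed.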

Note: the element returned by the HTML selector is the child, not the selected element itself.
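
For this page the ID is also the last path segment of the start URL, so as an alternative it can be read from the URL string directly. The sketch below is plain Python, not a Web Scraper feature, and the helper name trailing_id is hypothetical. Note that a bare \d+ would not work here, because the archive timestamp comes first in the URL.

```python
import re
from typing import Optional

# Start URL taken from the sitemap in the first post.
url = ("https://web.archive.org/web/20131207042638/"
       "http://conan.doomdealer.com:80/quests/quest/119")


def trailing_id(u: str) -> Optional[str]:
    """Return the digits that end the URL (hypothetical helper)."""
    # Anchor at the end of the string so the archive timestamp
    # and the port number are skipped.
    m = re.search(r"(\d+)/?$", u)
    return m.group(1) if m else None


# An unanchored \d+ picks up the archive timestamp instead of the ID.
first_digits = re.search(r"\d+", url).group(0)

print(trailing_id(url))  # 119
print(first_digits)      # 20131207042638
```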