Update: Somehow solved. How do I append the token number to the end of the scraped URL?

JojoChi · April 15, 2022, 3:54pm

Old Title: Part of the targeted URLs is missing when scraping
I’m scraping a website that is only accessible to logged-in users.

The sitemap is beyond simple: I just need to get the URLs that belong to the “vote count” buttons.

When I click “preview”, the URL data looks good to go. But when I run the actual scrape, part of each URL is missing. To be specific, they all lose the same last part of the URL, after the third “&”. It's a token number, so it's crucial, and it’s also the same as the last part of the parent URL.

I’m confused. Why would this happen? I’ve scraped this site successfully before, though a different part of it. Has anyone here run into the same problem? Thanks a lot!

Sitemap:
{"_id":"experience","startUrl":["https://mp.weixin.qq.com/cgi-bin/appmsgpublish?sub=list&begin=0&count=10&token=99209221&lang=zh_CN"],"selectors":[{"delay":0,"id":"justNumber","multiple":true,"parentSelectors":["_root"],"selector":"div.weui-desktop-block:nth-of-type(2) a.appmsg-vote","type":"SelectorLink"}]}

Update: I checked the page's HTML and figured out that the links in the original page don't have the token attached, which is why it doesn't get scraped. Is there any way to add it during scraping?
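
One possible workaround, as a rough sketch only: fix the URLs after scraping instead of during it. The idea is to read the token from the parent (start) URL and append it to each exported link. This assumes the token really is the `token` query parameter seen in the start URL above. The function names and the example link below are made up for illustration, and this runs on the exported data in Python, not inside Web Scraper itself.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit


def get_token(parent_url: str) -> str:
    """Read the value of the `token` query parameter from the parent URL."""
    values = parse_qs(urlsplit(parent_url).query).get("token")
    if not values:
        raise ValueError("no token parameter found in the parent URL")
    return values[0]


def append_token(url: str, token: str) -> str:
    """Return `url` with `token=<token>` as its last query parameter."""
    parts = urlsplit(url)
    # Keep the existing parameters, dropping any stale token first.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "token"
    ]
    params.append(("token", token))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    start_url = (
        "https://mp.weixin.qq.com/cgi-bin/appmsgpublish"
        "?sub=list&begin=0&count=10&token=99209221&lang=zh_CN"
    )
    # Hypothetical scraped link that came back without its token.
    scraped = "https://mp.weixin.qq.com/cgi-bin/example?a=1&b=2"
    print(append_token(scraped, get_token(start_url)))
```

In practice you would loop over the link column of the exported CSV and call `append_token` on each row. Note that the token probably changes with each login session, so it should be read from the current start URL rather than hard-coded.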