Almost about to give up. Please Guide Me

Radius · May 22, 2024, 6:43am

Hi everyone,

I've been trying to scrape some data from BeatStars and could really use your help. Here’s what I’m looking to scrape:

Producer Name
Track Name
Instagram Link

This is the website I’m working with: BeatStars Search

Here’s what I’ve done so far:

Created a new sitemap and inserted the URL above.
Created 2 selectors (type: text) and selected multiple producer and Instagram names. This works, but I only get 20 results after scraping. How can I get more results?

I also need the Instagram links, and I'm having trouble with this part. Here’s the process:

Click the producer name.
Scroll down.
The Instagram icon is on the left side.

I tried getting the Instagram URLs with the following steps:

Added a new selector > element click (selected “producer name”).
Added a new selector > link (selected Instagram Icon).

Not all producers have an Instagram icon. For example, visit this producer with an Instagram icon. If you don’t see an Instagram icon after clicking a producer name, try another producer.

Could someone please provide step-by-step instructions, including the parent-child relations?

To summarize, I need:

Producer Name
Track Name
Instagram Link (click the producer name, scroll down, then on the left side is the IG icon)
More than 20 results (Data Preview shows more, but scraping only gives 20)

Many thanks!

JanAp · May 22, 2024, 8:37am

Hi,

Please try this setup:

{"_id":"beatstars","startUrl":["https://www.beatstars.com/search?type=tracks&q=young%20thug"],"selectors":[{"delay":0,"elementLimit":100,"id":"listing-wrapper","multiple":true,"parentSelectors":["_root"],"selector":"mp-card-figure-track.list-item:has(.sponsored-badge.hidden)","type":"SelectorElementScroll"},{"id":"title","multiple":false,"parentSelectors":["listing-wrapper"],"regex":"","selector":".text-m-loud a","type":"SelectorText"},{"id":"producer","linkType":"linkFromHref","multiple":false,"parentSelectors":["listing-wrapper"],"selector":".text-s-silent a","type":"SelectorLink"},{"extractAttribute":"href","id":"instagram","multiple":false,"parentSelectors":["producer"],"selector":"a[href*=\"instagram\"]","type":"SelectorElementAttribute"}]}

Currently, the scroll is limited to 100 listings. You can adjust that by editing the listing-wrapper selector.
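
If you would rather change the limit before importing, here is a minimal Python sketch that edits the sitemap JSON. The helper name `set_element_limit` and the value 300 are made up for illustration. Only the `elementLimit` field and the `listing-wrapper` id come from the sitemap above.

```python
import json

def set_element_limit(sitemap_json: str, selector_id: str, limit: int) -> str:
    """Return the sitemap JSON with elementLimit changed on one selector."""
    sitemap = json.loads(sitemap_json)
    matched = False
    for selector in sitemap["selectors"]:
        if selector["id"] == selector_id:
            selector["elementLimit"] = limit
            matched = True
    if not matched:
        raise ValueError(f"no selector with id {selector_id!r}")
    # Compact separators keep the single-line form used for import.
    return json.dumps(sitemap, separators=(",", ":"))

# Trimmed-down stand-in for the full sitemap (hypothetical example input).
example = (
    '{"_id":"beatstars","selectors":[{"id":"listing-wrapper",'
    '"elementLimit":100,"type":"SelectorElementScroll"}]}'
)
updated = set_element_limit(example, "listing-wrapper", 300)
print(updated)
```

Paste the full sitemap in place of `example`, then import the printed result as a new sitemap.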

Radius · May 22, 2024, 9:23am

Thanks, where can I paste the code?

JanAp · May 22, 2024, 9:27am

Radius · May 22, 2024, 11:09am

I imported the sitemap and it returns the producer name and track name, so that's good. However, it doesn't give me the Instagram link yet. The Instagram icon is on the subpage that appears when the producer name is clicked. Can you help me with that, please?

JanAp · May 22, 2024, 11:51am

It returns the Instagram link. Did you run the actual scrape (Sitemap -> Scrape) or just click the Data preview?

Radius · May 22, 2024, 12:22pm

It works. You were right, I had only used the data preview. Thanks. One last question: how can I make sure the next scrape starts with the next 100 listings? I have already scraped one set of listings, and when I scrape again I want the listings that follow.

JanAp · May 22, 2024, 2:47pm

It will make more sense to restart the scrape and adjust the element limit for the listing-wrapper selector.

Radius · May 22, 2024, 3:14pm

Thanks, I will try it as soon as I am home. Fantastic support!

Radius · May 22, 2024, 5:54pm

Sorry for asking so many questions. It is quite uncomfortable for me, but I was hoping you could help me adjust the listing-wrapper so that the already scraped items are skipped.

JanAp · May 29, 2024, 11:56am

If you start a new scrape, the previous data is automatically discarded. It is not possible to continue the scrape from where you stopped last time.

don2010 · May 29, 2024, 12:35pm

You should export and save each scraping run separately. Afterwards you will be able to combine them all together.
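
A rough Python sketch of that combining step, in case it helps. It merges several CSV exports and drops rows that were already seen in an earlier run. The column names (`title`, `producer-href`, `instagram`) and the sample rows are assumptions for illustration, so check the header of your own export and adjust `key_columns` to match.

```python
import csv
import io

def merge_exports(csv_texts, key_columns):
    """Merge several CSV exports, keeping the first row seen for each key."""
    seen = set()
    merged = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            # Rows with the same values in key_columns count as duplicates.
            key = tuple(row.get(col, "") for col in key_columns)
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged

# Hypothetical exports from two runs; the second overlaps with the first.
run1 = (
    "title,producer-href,instagram\n"
    "Beat A,https://example.com/p1,https://instagram.com/p1\n"
)
run2 = (
    "title,producer-href,instagram\n"
    "Beat A,https://example.com/p1,https://instagram.com/p1\n"
    "Beat B,https://example.com/p2,\n"
)
rows = merge_exports([run1, run2], key_columns=["title", "producer-href"])
print(len(rows))  # prints 2
```

With real files, read each export with `open(path, newline="", encoding="utf-8").read()` and pass the resulting strings in place of `run1` and `run2`.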