Please help me, I need urgent help.

I have a small project, but I'm not able to complete it. Here is my target website link: https://brokercheck.finra.org/search/genericsearch/list

Please use this zip code in the zip code field: 94102

I only need the data from the area shown in the image below.

Ok - I might need some help from @iconoclast or @jeremyrem, who know some API.

Problem 1: The Search URL doesn't indicate the search so I'm not sure how to make the start URL automatically load your zipcode.

Problem 2: There are a lot of pages to scrape, this will take a bit.

Option 1 - Use Dataminer as an alternative; I can share the recipe.
Option 2 - Use the following code, but instead of running it, click Data preview on the main element selector. This will go through all 1200+ pages before scraping (it might overrun your memory; I stopped it at 1200 results).

{"_id":"a-a-broker-check-scrape","startUrl":["https://brokercheck.finra.org/search/genericsearch/list"],"selectors":[{"id":"Page-Next","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.listpanel","multiple":true,"delay":0,"clickElementSelector":"a:contains(\"›\")","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueCSSSelector"},{"id":"Name","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"span.md-title span.smaller","multiple":false,"regex":"","delay":0},{"id":"CRD #","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.layout-xs-row div.font-dark-gray span.spacerright","multiple":false,"regex":"","delay":0},{"id":"Firm","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.bold > span.ng-binding","multiple":false,"regex":"","delay":0},{"id":"Location","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.font-dark-gray div.smaller:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"Broker","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.cell.ng-scope","multiple":false,"regex":"","delay":0}]}

Here is a partial scrape, using option #2. You could run this through a data enrichment program like ZapInfo to automatically grab contact details and social profiles

Hi there,
Thanks for your help. Sorry to say, the sitemap is not working.

Try #1

{"_id":"a-a-broker-check-scrape","startUrl":["https://brokercheck.finra.org/search/genericsearch/list"],"selectors":[{"id":"Page-Next","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.listpanel","multiple":true,"delay":0,"clickElementSelector":"a:contains(\"›\")","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueCSSSelector"},{"id":"Name","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"span.md-title span.smaller","multiple":false,"regex":"","delay":0},{"id":"CRD #","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.layout-xs-row div.font-dark-gray span.spacerright","multiple":false,"regex":"","delay":0},{"id":"Firm","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.bold > span.ng-binding","multiple":false,"regex":"","delay":0},{"id":"Location","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.font-dark-gray div.smaller:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"Broker","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.cell.ng-scope","multiple":false,"regex":"","delay":0}]}

If that doesn't work:

{"_id":"a-a-broker-check-scrape","startUrl":["https://brokercheck.finra.org/search/genericsearch/list"],"selectors":[{"id":"Page-Next","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.listpanel","multiple":true,"delay":0,"clickElementSelector":"a:contains('›')","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueCSSSelector"},{"id":"Name","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"span.md-title span.smaller","multiple":false,"regex":"","delay":0},{"id":"CRD #","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.layout-xs-row div.font-dark-gray span.spacerright","multiple":false,"regex":"","delay":0},{"id":"Firm","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.bold > span.ng-binding","multiple":false,"regex":"","delay":0},{"id":"Location","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.font-dark-gray div.smaller:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"Broker","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.cell.ng-scope","multiple":false,"regex":"","delay":0}]}

Sorry to say, this code is not working, Boss. I ran into some problems; please see this video:

Please, sir, make a short video for me. I need this help urgently and I'm stuck.

Yes - because there is no way to tell it to run the search, that I'm aware of.

Load the sitemap and manually run your search.

Now click on "Data Preview" on the page next selector.

This will run for 60-120 minutes (a guess).

You'll see it paginate, but the data won't be available until it ends.

Sir what do you mean by manual search?


Sir, actually I don't understand what you are saying. Please explain. I'm only able to collect 12 records.

If it doesn't render, here is the download link
https://drive.google.com/file/d/1fg_H1eatJOlbaBnlAOpFRflGq8dyI8KI/view

Looks like the easiest way for you to do this is to extract the JSON response of your search.

  1. Open Chrome & navigate to "https://brokercheck.finra.org/search/genericsearch/list"

  2. Open up the Chrome Dev Tools (Ctrl + Shift + I or right click > Inspect)

  3. Go to the Network tab

  4. Enter the zip code, or whatever you want to search for, and click Search

  5. Look for an address like this

  6. Double click it and it will take you to a page with all your data in JSON format

After that, you will need to clean up the JSON a bit to standardize it so it's parseable.

At the beginning of the file, replace "/**/angular.callbacks._1({"errorMessage":null,"errorCode":0," with "{" (type it without the surrounding quotes).

At the end of the file, delete ");" (again without the quotes).

Copy what you've got, go to https://jsonformatter.curiousconcept.com/#, paste it in the box, and click Process.

If it says Valid JSON, we are good to go to the next step; if not, try to correct the mistakes.
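If you'd rather not edit the file by hand, the same cleanup can be sketched in Python. This is only a rough sketch: the `strip_jsonp` helper and the `results` key in the sample are made up for illustration, and unlike the manual steps it keeps the `errorMessage` and `errorCode` fields instead of cutting them out. `json.loads` does the same validity check as the online formatter.

```python
import json
import re


def strip_jsonp(text):
    """Return the JSON payload inside a JSONP wrapper such as `/**/cb({...});`."""
    # Optional leading /* ... */ comment, then the callback name, then (payload);
    pattern = r"^\s*(?:/\*.*?\*/)?\s*[\w.$]+\s*\((.*)\)\s*;?\s*$"
    match = re.match(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("text does not look like a JSONP response")
    return match.group(1)


# Made-up sample in the same shape as the response; the "results" key is hypothetical.
sample = '/**/angular.callbacks._1({"errorMessage":null,"errorCode":0,"results":[{"name":"A"}]});'

# json.loads raises json.JSONDecodeError if the payload is not valid JSON.
data = json.loads(strip_jsonp(sample))
print(data["errorCode"], len(data["results"]))
```

To use it on the real response, save the page from step 6 to a file and pass the file's text to `strip_jsonp` instead of `sample`.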

Converting JSON to CSV/EXCEL

There are a number of ways to do this. I prefer using a program called jq, but jq is pretty hard to use for someone who isn't used to SQL, large datasets, and the Linux CLI, so I'm going to recommend we go on over to https://sqlify.io/convert for this project.

  1. Click Paste raw data
  2. Paste the valid JSON in the box
  3. Click CSV & Convert to CSV
  4. (Optional) Rename any of the Result field names to better match what they are; DO NOT TOUCH the SOURCE FIELD
  5. Click Save schema and continue
  6. Download file

And you are all set
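
If you would rather script the conversion than use the website, here is a minimal Python sketch of the same JSON to CSV step. It assumes the cleaned JSON boils down to a list of flat records; the field names in the sample (`name`, `crd`, `firm`) are hypothetical, so check the real response for the actual keys. Nested values are not flattened here.

```python
import csv
import io
import json


def records_to_csv(records):
    """Write a list of flat dicts as CSV text, using the union of keys as the header."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)  # keep first-seen column order
    buffer = io.StringIO()
    # restval fills in an empty cell when a record is missing a column.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical records; the real field names come from the BrokerCheck response.
cleaned = '[{"name": "Jane Doe", "crd": 123}, {"name": "John Roe", "firm": "Example LLC"}]'
csv_text = records_to_csv(json.loads(cleaned))
print(csv_text)
```

Write `csv_text` to a `.csv` file and it should open in Excel.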


Thank you so much. Your trick is working, Boss.

Does that get all 1500 results or just the first page? Would changing nrows or r= change anything?

It should get everything.

From the look of their site, they load up everything in that json response and the website parses it out.

I don't see it making any new requests.

Once you extract it, let us know.

Normally on API calls, unless it says limit, I don't change it.

Limit & Offset are pretty much standard, and most devs won't change the names.

Limit = How many records to display
Offset = Where to start

Limit=10&Offset=0 means show 10 records from the beginning = 1-10
Limit=10&Offset=10 means show 10 records, starting 10 from the beginning = 11-20
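
To make the arithmetic concrete, here is a small Python sketch of how Limit and Offset map to record numbers. The helper names `page_params` and `record_range` are made up for illustration, and it doesn't call any API; whether a given site accepts these exact parameter names is something you'd have to check in its requests.

```python
def page_params(total, limit):
    """Yield (limit, offset) pairs that cover `total` records in order."""
    for offset in range(0, total, limit):
        yield limit, offset


def record_range(limit, offset):
    """Return the 1-based (first, last) record numbers a page would show."""
    return offset + 1, offset + limit


print(record_range(10, 0))        # (1, 10)
print(record_range(10, 10))       # (11, 20)
print(list(page_params(25, 10)))  # [(10, 0), (10, 10), (10, 20)]
```

The last page can come back short: with 25 records and Limit=10, the third request only has records 21-25 to return.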

Max on limit is 999


Interesting, in this case there are more than 999 records. Does that change anything?

No, it has all the data in the single json response

Sir, please see this video. I'm not able to select the next page area. How can I select it?