I have a small project, but I'm not able to complete it. Here is my target website: https://brokercheck.finra.org/search/genericsearch/list
Please use the zip code 94102 in the zip code field.
I only need the data from the area shown in the image below.
OK, I might need some help from @iconoclast or @jeremyrem, who know some API.
Problem 1: The search URL doesn't encode the search terms, so I'm not sure how to make the start URL automatically load your zip code.
Problem 2: There are a lot of pages to scrape, so this will take a while.
Option 1: Use Dataminer as an alternative; I can share the recipe.
Option 2: Use the following sitemap, but instead of running it, click Data preview on the main element selector. This will go through all 1200+ pages before scraping (it might overrun your memory; I stopped it at 1200 results).
{"_id":"a-a-broker-check-scrape","startUrl":["https://brokercheck.finra.org/search/genericsearch/list"],"selectors":[{"id":"Page-Next","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.listpanel","multiple":true,"delay":0,"clickElementSelector":"a:contains(\"›\")","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueCSSSelector"},{"id":"Name","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"span.md-title span.smaller","multiple":false,"regex":"","delay":0},{"id":"CRD #","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.layout-xs-row div.font-dark-gray span.spacerright","multiple":false,"regex":"","delay":0},{"id":"Firm","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.bold > span.ng-binding","multiple":false,"regex":"","delay":0},{"id":"Location","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.font-dark-gray div.smaller:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"Broker","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.cell.ng-scope","multiple":false,"regex":"","delay":0}]}
Here is a partial scrape, using option #2. You could run this through a data enrichment program like ZapInfo to automatically grab contact details and social profiles.
Try #1
{"_id":"a-a-broker-check-scrape","startUrl":["https://brokercheck.finra.org/search/genericsearch/list"],"selectors":[{"id":"Page-Next","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.listpanel","multiple":true,"delay":0,"clickElementSelector":"a:contains(\"›\")","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueCSSSelector"},{"id":"Name","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"span.md-title span.smaller","multiple":false,"regex":"","delay":0},{"id":"CRD #","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.layout-xs-row div.font-dark-gray span.spacerright","multiple":false,"regex":"","delay":0},{"id":"Firm","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.bold > span.ng-binding","multiple":false,"regex":"","delay":0},{"id":"Location","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.font-dark-gray div.smaller:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"Broker","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.cell.ng-scope","multiple":false,"regex":"","delay":0}]}
If that doesn't work:
{"_id":"a-a-broker-check-scrape","startUrl":["https://brokercheck.finra.org/search/genericsearch/list"],"selectors":[{"id":"Page-Next","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"div.listpanel","multiple":true,"delay":0,"clickElementSelector":"a:contains('›')","clickType":"clickMore","discardInitialElements":false,"clickElementUniquenessType":"uniqueCSSSelector"},{"id":"Name","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"span.md-title span.smaller","multiple":false,"regex":"","delay":0},{"id":"CRD #","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.layout-xs-row div.font-dark-gray span.spacerright","multiple":false,"regex":"","delay":0},{"id":"Firm","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.bold > span.ng-binding","multiple":false,"regex":"","delay":0},{"id":"Location","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.font-dark-gray div.smaller:nth-of-type(2)","multiple":false,"regex":"","delay":0},{"id":"Broker","type":"SelectorText","parentSelectors":["Page-Next"],"selector":"div.cell.ng-scope","multiple":false,"regex":"","delay":0}]}
Sorry to say, this code is not working, Boss. I'm facing a problem; please see this video:
Please, sir, create a short video for me. I need this help urgently. I'm helpless now, sir.
Yes, because there is no way to tell it to run the search, that I'm aware of.
Load the sitemap and manually run your search.
Now click on "Data Preview" on the Page-Next selector.
This will run for 60-120 minutes (a guess).
You'll see it paginate, but the data won't be available until it ends.
Sir, what do you mean by manual search?
If it doesn't render, here is the download link:
https://drive.google.com/file/d/1fg_H1eatJOlbaBnlAOpFRflGq8dyI8KI/view
Looks like the easiest way for you to do this is to extract the JSON response of your search.
Open Chrome and navigate to "https://brokercheck.finra.org/search/genericsearch/list"
Open the Chrome Dev Tools (Ctrl + Shift + I, or right-click and choose Inspect)
Go to the Network tab
Enter the zip code, or whatever you want to search for, and click Search
Look for a request address like this:
Double-click it and it will take you to a page with all your data in JSON format
After that, you will need to clean up the response a bit so that it is standard, parseable JSON.
At the beginning of the file, replace "/**/angular.callbacks._1({"errorMessage":null,"errorCode":0," with "{" (without the outer quotes).
At the end of the file, delete ");" (without the quotes).
Copy the result, go to https://jsonformatter.curiousconcept.com/#, paste it in the box, and click Process.
If it says Valid JSON, we are good to go to the next step; if not, try to correct the mistakes.
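If you would rather script the cleanup than edit the file by hand, here is a minimal Python sketch. It assumes the response has the JSONP shape described above, a callback name followed by the payload in parentheses. Unlike the manual steps, it only strips the wrapper and keeps the "errorMessage" and "errorCode" fields. The function name strip_jsonp and the "results" key in the sample are made up for illustration; the real field names may differ.

```python
import json
import re


def strip_jsonp(text: str):
    """Strip a JSONP wrapper like /**/angular.callbacks._1( ... ); and parse the payload."""
    # Skip everything up to the first "(", capture up to the last ")" before an optional ";".
    match = re.search(r"^[^(]*\((.*)\)\s*;?\s*$", text.strip(), re.DOTALL)
    if match is None:
        raise ValueError("input does not look like a JSONP response")
    # json.loads raises an error if the payload is not valid JSON,
    # which plays the same role as the "Valid JSON" check above.
    return json.loads(match.group(1))


# Tiny hand-made sample; "results" is a hypothetical key, not taken from the real response.
sample = '/**/angular.callbacks._1({"errorMessage":null,"errorCode":0,"results":[{"name":"A"}]});'
data = strip_jsonp(sample)
print(data["errorCode"])
```

For the real thing, save the response from the browser to a file, read it in with open(...).read(), and pass the text to strip_jsonp.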
Converting JSON to CSV/Excel
There are a number of ways to do this. I prefer a program called jq, but jq is pretty hard to use for someone who isn't familiar with SQL, large datasets, and the Linux CLI, so I recommend going over to https://sqlify.io/convert for this project.
And you are all set.
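If you don't want to upload your data to an online converter, the same conversion can be sketched in a few lines of Python with the standard csv module. This assumes you have already parsed the JSON into a flat list of records (a list of dicts). The field names "name", "crd", and "firm" are placeholders I made up, not the real keys in the BrokerCheck response, and nested fields would need flattening first.

```python
import csv
import io


def records_to_csv(records):
    """Turn a list of flat dicts into CSV text, using the union of all keys as columns."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)  # keep columns in first-seen order
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record are written as empty cells
    return buffer.getvalue()


# Hypothetical records; the real response will have different keys.
records = [
    {"name": "Jane Doe", "crd": 1, "firm": "Example LLC"},
    {"name": "John Roe", "crd": 2},
]
csv_text = records_to_csv(records)
print(csv_text)
```

Write csv_text to a file ending in .csv and Excel will open it directly.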
Thank you so much. Your trick is working, Boss.
Does that get all 1500 results or just the first page? Would changing nrows or r= change anything?
It should get everything.
From the look of their site, they load everything in that one JSON response and the website parses it out.
I don't see it making any new requests.
Once you extract it, let us know.
Normally on API calls, unless it says limit, I don't change it.
Limit and Offset are pretty much standard, and most devs won't change the names.
Limit = How many records to display
Offset = Where to start
Limit=10&Offset=0 means show 10 records from the beginning = records 1-10
Limit=10&Offset=10 means show 10 records, starting 10 from the beginning = records 11-20
Max on limit is 999
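To make the Limit/Offset arithmetic concrete, here is a small Python sketch of how a paginated API would be walked page by page. It is a generic illustration only: the parameter names Limit and Offset follow the convention described above and are not confirmed for the BrokerCheck endpoint, and page_params and record_range are helper names I made up.

```python
def page_params(total, limit):
    """Yield the query parameters needed to cover `total` records, `limit` at a time."""
    for offset in range(0, total, limit):
        yield {"Limit": limit, "Offset": offset}


def record_range(limit, offset):
    """Return the 1-based (first, last) record numbers a full page covers."""
    return offset + 1, offset + limit


# 25 records fetched 10 at a time need three requests.
for params in page_params(total=25, limit=10):
    first, last = record_range(params["Limit"], params["Offset"])
    print(params, "-> records", first, "to", last)
```

The last page can be shorter than Limit; in the loop above the third request asks for records 21 to 30 but only 21 to 25 would exist.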
Interesting, in this case there are more than 999 records. Does that change anything?
No, it has all the data in the single JSON response.
Sir, please see this video. I'm not able to select the next page area; how can I select it?