Need help with pagination past the given tabs

Describe the problem.

Hi,

I am trying to scrape data from a directory. The pagination lets me click pages 1-9, and there is an option at the bottom for the remaining pages. I want to collect data from every page, including those past page 9.

Right now I am only getting data from the first page. I am just gathering the business name and the email. Please take a look at my sitemap and offer any input. Thank you so much.

Url: https://www.goodfirms.co/directory/marketing-services/top-digital-marketing-companies?sort_by=Any&rate=&location=us&employees=

Sitemap:
{"_id":"goodfirms","startUrl":["https://www.goodfirms.co/directory/marketing-services/top-digital-marketing-companies?sort_by=Any&rate=&location=us&employees="],"selectors":[{"id":"Business","type":"SelectorLink","parentSelectors":["_root"],"selector":"a.font20","multiple":true,"delay":0},{"id":"Pagination","type":"SelectorElementClick","parentSelectors":["_root","Business"],"selector":"ul.pagination li:nth-of-type(11) a","multiple":true,"delay":0,"clickElementSelector":"ul.pagination a","clickType":"clickOnce","discardInitialElements":false,"clickElementUniquenessType":"uniqueText"}]}

Can anyone help with this problem?

Hi, try this sitemap:

{"_id":"a_test_goodfirms","startUrl":["https://www.goodfirms.co/directory/marketing-services/top-digital-marketing-companies?sort_by=Any&rate=&location=us&employees="],"selectors":[{"id":"Business","type":"SelectorElement","parentSelectors":["_root","Pagination"],"selector":"div.directory-content div.whitebg div.col-md-12.padding","multiple":true,"delay":0},{"id":"Pagination","type":"SelectorLink","parentSelectors":["_root","Pagination"],"selector":"li.next a","multiple":true,"delay":"200"},{"id":"name","type":"SelectorLink","parentSelectors":["Business"],"selector":"h3.font20 a","multiple":false,"delay":0},{"id":"employee_count","type":"SelectorText","parentSelectors":["Business"],"selector":"div.firm-employees","multiple":false,"regex":"","delay":"100"},{"id":"website","type":"SelectorLink","parentSelectors":["Business"],"selector":"div.visit-website-div a.visit-website","multiple":false,"delay":"100"}]}

In this case, you only need a Link selector for Pagination, selecting the ">" button. Make Pagination its own child so that each page links on to the next one. I changed Business to an Element selector, which acts as a container for further child selectors. Business should be a child of both _root and Pagination; you can check this layout in the Selector graph. Business then holds the child selectors for text and links, such as the website URL. I added small delays. You will need to add further child selectors under Business to gather more info.

Alternatively, note that this website passes the page number in the URL as an explicit GET parameter, e.g. &page=2:

https://www.goodfirms.co/directory/marketing-services/top-digital-marketing-companies?sort_by=Any&rate=&location=us&employees=&page=2

That means you can edit the Start URL and use the "range" format, like this:

https://www.goodfirms.co/directory/marketing-services/top-digital-marketing-companies?sort_by=Any&rate=&location=us&employees=&page=[1-25]

The upper bound comes from the last-page ">>" button, which shows that the highest page number is 25.
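
As a rough illustration of what the range stands for (assuming Web Scraper expands it inclusively into one start URL per number), a shorter range such as [1-3] should behave the same as listing the pages by hand in the sitemap's startUrl array. This is only the startUrl part of a sitemap, not a complete one:

```json
{
  "startUrl": [
    "https://www.goodfirms.co/directory/marketing-services/top-digital-marketing-companies?sort_by=Any&rate=&location=us&employees=&page=1",
    "https://www.goodfirms.co/directory/marketing-services/top-digital-marketing-companies?sort_by=Any&rate=&location=us&employees=&page=2",
    "https://www.goodfirms.co/directory/marketing-services/top-digital-marketing-companies?sort_by=Any&rate=&location=us&employees=&page=3"
  ]
}
```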

If you use the range format, you would have to remove the Pagination selector from the earlier sitemap, since the range in the Start URL already covers every page.
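
For reference, here is a sketch of how the earlier sitemap might look with the range Start URL and without the Pagination selector. The selectors are copied unchanged from the sitemap above; the "_id" value goodfirms_range is just a placeholder name, and the upper bound of 25 assumes the page count has not changed since it was checked. I have not run this exact version, so treat it as a starting point:

```json
{
  "_id": "goodfirms_range",
  "startUrl": ["https://www.goodfirms.co/directory/marketing-services/top-digital-marketing-companies?sort_by=Any&rate=&location=us&employees=&page=[1-25]"],
  "selectors": [
    {"id": "Business", "type": "SelectorElement", "parentSelectors": ["_root"], "selector": "div.directory-content div.whitebg div.col-md-12.padding", "multiple": true, "delay": 0},
    {"id": "name", "type": "SelectorLink", "parentSelectors": ["Business"], "selector": "h3.font20 a", "multiple": false, "delay": 0},
    {"id": "employee_count", "type": "SelectorText", "parentSelectors": ["Business"], "selector": "div.firm-employees", "multiple": false, "regex": "", "delay": "100"},
    {"id": "website", "type": "SelectorLink", "parentSelectors": ["Business"], "selector": "div.visit-website-div a.visit-website", "multiple": false, "delay": "100"}
  ]
}
```

Note that Business now has only _root as its parent, because there is no Pagination selector left to chain from.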

See https://www.webscraper.io/documentation#scraping-a-site