Data Crawling Issue

I'm having an issue with data crawling and haven't been able to find an answer in other threads. Can someone please help?

My initial problem was (A): data for specific columns was exported into separate rows when I wanted it on the same row.

I followed suggestions to use an Element selector so that data for specific columns is exported into the same row. That did put the column data on the same row, but now I have problem (B): the crawler only grabs column data for one row, not the entire data set.

My startURL is: Cambridge, MA Property Records & Home Values | realtor.com®

(1) The above link shows all the STREETS in Cambridge, MA
(2) The STREETS are themselves links, which show all the ADDRESSES and HOME VALUES on that STREET
(3) I want to crawl each STREET and gather the ADDRESSES and HOME VALUES

I've set up the Sitemap below. When ADDRESS and HOME VALUE are set to multiple, each column value ends up on its own row, but all addresses and home values are produced. When I deselect multiple, the column data are on the same row, but only one address is crawled.

{"_id":"Cambridge","startUrl":["Property Record Search, Find Home & Real Estate Records | Claim Your Home on realtor.com® a","multiple":true,"linkType":"linkFromHref"},{"id":"Element","parentSelectors":["Street"],"type":"SelectorElement","selector":"body","multiple":true},{"id":"Address","parentSelectors":["Element"],"type":"SelectorText","selector":"a.item","multiple":true,"regex":""},{"id":"HomeValue","parentSelectors":["Element"],"type":"SelectorText","selector":".col-3 li","multiple":true,"regex":""}]}
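To make the two behaviors described above concrete, here is a toy Python model of how rows seem to be produced. It is only a sketch of the symptoms, not the scraper's actual code: the addresses, values, and function names are all made up for illustration. It also shows a third case that may be worth trying, assuming the page has one wrapper element per listing: pointing Element at that wrapper instead of `body`, with the child selectors left non-multiple.

```python
# Toy model only: hypothetical data standing in for one STREET page.
page = [
    {"Address": "1 Example St", "HomeValue": "$500,000"},
    {"Address": "2 Example St", "HomeValue": "$610,000"},
    {"Address": "3 Example St", "HomeValue": "$720,000"},
]

def rows_body_multiple(listings):
    """Element = whole page, children multiple: every match is its own row."""
    rows = []
    for column in ("Address", "HomeValue"):
        for item in listings:
            rows.append({column: item[column]})
    return rows

def rows_body_single(listings):
    """Element = whole page, children single: only the first match per column."""
    first = listings[0]
    return [{"Address": first["Address"], "HomeValue": first["HomeValue"]}]

def rows_per_listing(listings):
    """Element = one wrapper per listing, children single: one paired row each."""
    return [{"Address": i["Address"], "HomeValue": i["HomeValue"]} for i in listings]

for name, fn in [("body + multiple", rows_body_multiple),
                 ("body + single", rows_body_single),
                 ("per-listing wrapper + single", rows_per_listing)]:
    print(name, "->", len(fn(page)), "rows")
```

In this model, selecting `body` yields a single element, so non-multiple children can only ever return one address and one home value. Whether the real page has a usable per-listing wrapper is an assumption that would need checking in the page's HTML.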

You are awesome! Thank you so much!
