Hi everyone, I am working on a project where I need to collect structured data from real estate websites, mainly about property design, layouts, and the software tools used for design visualization. For example, when analyzing projects like Experion Windchants, I want to extract details such as floor plans, design features, and visual elements.
I am using Web Scraper with a JSON sitemap, but I am running into some issues. The main problem is that my XPath and CSS selectors do not consistently capture all the required fields, such as design descriptions, images, and software-related details. Some pages load their content dynamically, so the collected data ends up incomplete.
I have tried adjusting the selectors and inspecting the page structure, but the results are still inconsistent. My goal is to build a clean dataset for analysis and to better understand real estate design trends.
Can anyone suggest best practices for handling dynamically loaded elements and making selectors more reliable in cases like this?