How to remove the null results

When I scrape, I get null results for every row where the filter doesn't match.

Url: Palo Alto Networks Security Advisories

Sitemap:
{"_id":"paloalto-psirt","startUrl":["https://security.paloaltonetworks.com/?product=PAN-OS&sort=-date&limit=300"],"selectors":[{"id":"Element-T","multiple":true,"parentSelectors":["_root"],"selector":"tbody tr","type":"SelectorElement"},{"id":"CVSS","multiple":false,"parentSelectors":["Element-T"],"regex":"","selector":"b","type":"SelectorText"},{"id":"Name","multiple":false,"parentSelectors":["Element-T"],"regex":"","selector":"td a","type":"SelectorText"},{"id":"Version","multiple":false,"parentSelectors":["Element-T"],"regex":"","selector":"td:nth-of-type(3) .vflx div:contains('PAN-OS 9.1')","type":"SelectorText"},{"id":"Published","multiple":false,"parentSelectors":["Element-T"],"regex":"","selector":"td:nth-of-type(6) span","type":"SelectorText"}]}

@hakal Hi, you can manually adjust the Element selector with the additional jQuery selectors ':contains' and ':has', so that only the rows containing the version you want are selected:

tbody tr:has(div:contains('PAN-OS 9.1'))
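
To illustrate why moving the filter to the row level removes the nulls, here is a rough plain-Python model of the two approaches. It is only a sketch: the rows and advisory names are made up, and the substring check stands in for ':contains'; it is not how Web Scraper evaluates selectors internally.

```python
# Hypothetical rows modeled on the advisories table; names and values are made up.
rows = [
    {"name": "ADV-1", "versions": ["PAN-OS 10.0", "PAN-OS 9.1"]},
    {"name": "ADV-2", "versions": ["PAN-OS 10.0"]},
    {"name": "ADV-3", "versions": ["PAN-OS 9.1", "PAN-OS 9.0"]},
]
NEEDLE = "PAN-OS 9.1"


def first_match(row, needle):
    """Return the first version cell containing needle, or None (a null result)."""
    return next((v for v in row["versions"] if needle in v), None)


# Filter applied only to the child selector: every row is still an element,
# so rows without a matching cell produce None.
unfiltered = [(r["name"], first_match(r, NEEDLE)) for r in rows]

# Filter applied to the element itself, in the spirit of tr:has(div:contains(...)):
# rows without a matching cell are never selected in the first place.
filtered = [
    (r["name"], first_match(r, NEEDLE))
    for r in rows
    if first_match(r, NEEDLE) is not None
]

print(unfiltered)
print(filtered)
```

Note that, like ':contains', the check is a substring match, so a needle of "PAN-OS 9.1" would also match a cell such as "PAN-OS 9.1.3".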

Hi,

I tried it, but now I am getting null for every Version value. Did I do something wrong?

@hakal You can use the original sitemap, just make sure to adjust the 'Element' selector:

{"_id":"paloalto-psirt","startUrl":["https://security.paloaltonetworks.com/?product=PAN-OS&sort=-date&limit=300"],"selectors":[{"id":"Element-T","parentSelectors":["_root"],"type":"SelectorElement","selector":"tbody tr:has(div:contains('PAN-OS 9.1'))","multiple":true},{"id":"CVSS","parentSelectors":["Element-T"],"type":"SelectorText","selector":"b","multiple":false,"regex":""},{"id":"Name","parentSelectors":["Element-T"],"type":"SelectorText","selector":"td a","multiple":false,"regex":""},{"id":"Version","parentSelectors":["Element-T"],"type":"SelectorText","selector":"td:nth-of-type(3) .vflx div:contains('PAN-OS 9.1')","multiple":false,"regex":""},{"id":"Published","parentSelectors":["Element-T"],"type":"SelectorText","selector":"td:nth-of-type(6) span","multiple":false,"regex":""}]}
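
If you prefer to patch an exported sitemap with a script instead of editing it by hand, the same change could be sketched in Python as follows. The helper name `restrict_element` is hypothetical and the sitemap is a trimmed copy for illustration; only the 'Element' selector is rewritten, and the child selectors stay as they are.

```python
import json

# Trimmed, illustrative copy of the sitemap: the Element selector and one child.
sitemap = {
    "_id": "paloalto-psirt",
    "selectors": [
        {"id": "Element-T", "parentSelectors": ["_root"],
         "type": "SelectorElement", "selector": "tbody tr", "multiple": True},
        {"id": "Version", "parentSelectors": ["Element-T"],
         "type": "SelectorText",
         "selector": "td:nth-of-type(3) .vflx div:contains('PAN-OS 9.1')",
         "multiple": False, "regex": ""},
    ],
}


def restrict_element(sitemap, element_id, condition):
    """Return a copy whose Element selector only matches rows that contain `condition`."""
    patched = json.loads(json.dumps(sitemap))  # deep copy via a JSON round trip
    for sel in patched["selectors"]:
        if sel["id"] == element_id and sel["type"] == "SelectorElement":
            # Append :has(...) so non-matching rows are never selected.
            sel["selector"] = f"{sel['selector']}:has({condition})"
    return patched


patched = restrict_element(sitemap, "Element-T", "div:contains('PAN-OS 9.1')")
# Compact JSON, ready to paste into the sitemap import dialog.
print(json.dumps(patched, separators=(",", ":")))
```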

I did that and changed td:nth-of-type(3) .vflx div:contains('PAN-OS 9.1') to td:nth-of-type(3) .vflx div:has(vflx:contains('PAN-OS 9.1')).

But with that change, all results come back as null. Can you try it on your machine and share the sitemap with me?

@hakal

WORKED now!

Thank you!