Row Order Issue When Columns Have Missing Values

Hi everyone,

First of all, thank you for creating such a great tool! It has really boosted my work efficiency, and I’m finding it extremely useful in daily tasks.

That said, I’ve run into one issue I’m hoping to get some advice on. When some columns return no value for a record, the row order gets thrown off and values from different records end up mixed together. As a result, the dataset as a whole becomes unreliable.

I’ve attached my sitemap below for reference. Please note that the target site requires a membership login in order to view the data.

Has anyone experienced something similar, and is there a way to keep the row order consistent even when some values are missing?

Thanks a lot in advance for any tips or solutions!


{"_id":"thomas","startUrl":["https://www.thomasnet.com/suppliers/search?cov=NA&heading=5122007&searchsource=suppliers&searchterm=ALUMINUM+BILLET&what=Aluminum+Billets"],"selectors":[{"id":"More","parentSelectors":["_root","More"],"paginationType":"linkFromRedirect","type":"SelectorPagination","selector":".width-full.flex-col button.width-max"},{"id":"Company","parentSelectors":["_root"],"type":"SelectorText","selector":".sm\\:hide button.bg-none","regex":"","multipleType":"singleColumnWithSeparator"},{"id":"Location","parentSelectors":["_root"],"type":"SelectorText","selector":"a.sm\\:hide, .sm\\:hide div.txt-smallest","regex":"","multipleType":"singleColumnWithSeparator"},{"id":"Business Brief","parentSelectors":["_root"],"type":"SelectorText","selector":".mar-0 span[data-sentry-component]","regex":"","multipleType":"singleColumnWithSeparator"},{"id":"Business Description","parentSelectors":["_root"],"type":"SelectorText","selector":".search-result-supplier_supplierDescriptionItem__qnZx1 .flex-col p","regex":"","multipleType":"singleColumnWithSeparator"},{"id":"Website","parentSelectors":["_root"],"type":"SelectorText","selector":"a.pad-t-1","regex":"","multipleType":"singleColumnWithSeparator"}]}

Hi, could you post the sitemap as Preformatted text please?

Hi Sir,

Thanks for the reply. Could you please look into the sitemap below?

{
  "_id": "thomas",
  "startUrl": [
    "https://www.thomasnet.com/suppliers/search?cov=NA&heading=5122007&searchsource=suppliers&searchterm=ALUMINUM+BILLET&what=Aluminum+Billets"
  ],
  "selectors": [
    {
      "id": "More",
      "parentSelectors": ["_root", "More"],
      "paginationType": "linkFromRedirect",
      "type": "SelectorPagination",
      "selector": ".width-full.flex-col button.width-max"
    },
    {
      "id": "Company",
      "parentSelectors": ["_root"],
      "type": "SelectorText",
      "selector": ".sm\\:hide button.bg-none",
      "regex": "",
      "multipleType": "singleColumnWithSeparator"
    },
    {
      "id": "Location",
      "parentSelectors": ["_root"],
      "type": "SelectorText",
      "selector": "a.sm\\:hide, .sm\\:hide div.txt-smallest",
      "regex": "",
      "multipleType": "singleColumnWithSeparator"
    },
    {
      "id": "Business Brief",
      "parentSelectors": ["_root"],
      "type": "SelectorText",
      "selector": ".mar-0 span[data-sentry-component]",
      "regex": "",
      "multipleType": "singleColumnWithSeparator"
    },
    {
      "id": "Business Description",
      "parentSelectors": ["_root"],
      "type": "SelectorText",
      "selector": ".search-result-supplier_supplierDescriptionItem__qnZx1 .flex-col p",
      "regex": "",
      "multipleType": "singleColumnWithSeparator"
    },
    {
      "id": "Website",
      "parentSelectors": ["_root"],
      "type": "SelectorText",
      "selector": "a.pad-t-1",
      "regex": "",
      "multipleType": "singleColumnWithSeparator"
    },
    {
      "id": "Website Link",
      "parentSelectors": ["_root"],
      "type": "SelectorLink",
      "selector": "a.pad-t-1",
      "multiple": true
    }
  ]
}

Hi,

Please try this setup:

{"_id":"thomas","startUrl":["https://www.thomasnet.com/suppliers/search?cov=NA&heading=5122007&searchsource=suppliers&searchterm=ALUMINUM+BILLET&what=Aluminum+Billets"],"selectors":[{"clickActionType":"real","clickElementSelector":"button.width-max","clickElementUniquenessType":"uniqueText","clickType":"clickOnce","delay":0,"discardInitialElements":"do-not-discard","id":"More","multiple":true,"parentSelectors":["listing-wrapper"],"selector":"_parent_","type":"SelectorElementClick"},{"id":"Company","multipleType":"singleColumnWithSeparator","parentSelectors":["listing-wrapper"],"regex":"","selector":".sm\\:hide h2","type":"SelectorText"},{"id":"Location","multipleType":"singleColumnWithSeparator","parentSelectors":["listing-wrapper"],"regex":"","selector":"div:not(:has(.gt-sm\\:hide)) [data-sentry-component=\"SupplierNameLink\"] + div","type":"SelectorText"},{"id":"Business Brief","multipleType":"singleColumnWithSeparator","parentSelectors":["listing-wrapper"],"regex":"","selector":"[data-sentry-component=\"SupplierSearchDetails\"] p","type":"SelectorText"},{"id":"Business Description","multipleType":"singleColumnWithSeparator","parentSelectors":["listing-wrapper"],"regex":"","selector":"[data-sentry-component=\"TrimmedDescription\"] p","type":"SelectorText"},{"id":"Website","multipleType":"singleColumnWithSeparator","parentSelectors":["listing-wrapper"],"regex":"","selector":"a.pad-t-1","type":"SelectorText"},{"elementLimit":0,"id":"listing-wrapper","multiple":true,"parentSelectors":["_root"],"scroll":false,"selector":"[data-sentry-component=\"SearchResultSupplier\"]","type":"SelectorElement"}]}
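
To illustrate why moving the selectors under a wrapper element (`listing-wrapper` above) can keep rows aligned, here is a minimal Python sketch. It is a simplified model of the behavior described in this thread, not Web Scraper's actual implementation. The `listings` data and both function names are hypothetical.

```python
from itertools import zip_longest

# Hypothetical listings: the second supplier has no website on the page.
listings = [
    {"Company": "Acme Metals", "Website": "acme.example"},
    {"Company": "Billet Co"},
    {"Company": "Cast Works", "Website": "castworks.example"},
]


def scrape_columns_independently(listings):
    """Model selectors attached to _root: each column is collected on its own,
    so a missing value leaves no gap and later values shift up."""
    companies = [item["Company"] for item in listings if "Company" in item]
    websites = [item["Website"] for item in listings if "Website" in item]
    return list(zip_longest(companies, websites, fillvalue=""))


def scrape_per_wrapper(listings):
    """Model selectors attached to a wrapper element: each listing yields
    exactly one row, and a missing value becomes an empty cell."""
    return [(item.get("Company", ""), item.get("Website", "")) for item in listings]


for row in scrape_columns_independently(listings):
    print("independent:", row)
for row in scrape_per_wrapper(listings):
    print("per wrapper:", row)
```

In this model, the independent version pairs "Billet Co" with the website that belongs to "Cast Works", while the per-wrapper version leaves the missing website as an empty cell and keeps every other value in its own row.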