Scraping hex color value of filled cell in a table

I'm trying to scrape an automobile paint color database. I can gather all the data I need except the hex color value for a column of cells whose fill color corresponds to the color name in each row. See the example page below.

Url: https://paintref.com/cgi-bin/colorcodedisplay.cgi?make=Infiniti&con=k&page=13&rows=50

I need Web Scraper to pick up the hex color value from the page source for every cell under the "Sample" column. Is there any way to do this?

Sitemap:
{id:"sitemap code"}

You cannot do this within the table parameters, but you can get the value by using the Element Attribute selector to extract the bgcolor attribute, like this:

{"_id":"paintref","startUrl":["https://paintref.com/cgi-bin/colorcodedisplay.cgi?make=Infiniti&con=k&page=13&rows=50"],"selectors":[{"id":"hex","type":"SelectorElementAttribute","parentSelectors":["_root"],"selector":"center > table[cellpadding] tr:nth-of-type(n+2) td[align]","multiple":true,"extractAttribute":"bgcolor","delay":0}]}
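For anyone who wants to see what that attribute extraction amounts to outside Web Scraper, here is a minimal Python sketch. It is only an illustration: the HTML fragment, the color values, and the BgcolorCollector class are made up for the example and are not taken from the real page. It assumes the fill color sits in a bgcolor attribute on td cells that also carry an align attribute, which is what the selector above targets.

```python
from html.parser import HTMLParser

# Hypothetical fragment shaped like the "Sample" column; not copied from the real page.
SAMPLE_HTML = """
<table cellpadding="2">
  <tr class="head"><th>code</th><th>sample</th></tr>
  <tr><td>A1</td><td align="center" bgcolor="#111111">Color A</td></tr>
  <tr><td>B2</td><td align="center" bgcolor="#EEEEEE">Color B</td></tr>
</table>
"""


class BgcolorCollector(HTMLParser):
    """Collects the bgcolor attribute of every <td> that also has an align attribute."""

    def __init__(self):
        super().__init__()
        self.colors = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "td" and "align" in attr_map and "bgcolor" in attr_map:
            self.colors.append(attr_map["bgcolor"])


collector = BgcolorCollector()
collector.feed(SAMPLE_HTML)
print(collector.colors)  # ['#111111', '#EEEEEE']
```

Note that this yields a flat list of values with no link back to the row each one came from.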

Thank you for this @webber, it does seem to be pulling hex codes. I'm having one issue though:

I need all of the data in the table presented on that PaintRef.com website, but for the "Sample" column, I need it to pull the hex code for the cell, not the text inside the cell. Using the sitemap script you provided, I added a table data selector, but when it gathers the data, the hex code values aren't lining up with the rows of data from the table.

Here's the sitemap code I used. If you browse the data file it produces, it seems to leave blank cells in the "hex" column:

{"_id":"test","startUrl":["https://paintref.com/cgi-bin/colorcodedisplay.cgi?make=Infiniti&con=k&page=13&rows=50"],"selectors":[{"id":"hex","type":"SelectorElementAttribute","parentSelectors":["_root"],"selector":"center > table[cellpadding] tr:nth-of-type(n+2) td[align]","multiple":true,"extractAttribute":"bgcolor","delay":0},{"id":"tabled","type":"SelectorTable","parentSelectors":["_root"],"selector":"table[cellpadding]:nth-of-type(4)","multiple":true,"columns":[{"header":"image","name":"image","extract":false},{"header":"[year]","name":"[year]","extract":true},{"header":"make","name":"make","extract":true},{"header":"model","name":"model","extract":true},{"header":"paint color name","name":"paint color name","extract":true},{"header":"code","name":"code","extract":true},{"header":"sample","name":"sample","extract":true},{"header":"Ditzler PPG","name":"Ditzler PPG","extract":false},{"header":"Dupont","name":"Dupont","extract":false},{"header":"RM BASF","name":"RM BASF","extract":false},{"header":"Glasurit","name":"Glasurit","extract":false},{"header":"comment","name":"comment","extract":false}],"delay":0,"tableDataRowSelector":"tr:nth-of-type(n+2)","tableHeaderRowSelector":"tr.head"}]}
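To make the goal concrete, here is a rough Python sketch of the output I'm after: one record per table row that holds both the cell text and the bgcolor of that row's colored cell. It is not a Web Scraper sitemap, and the HTML fragment, values, and RowCollector class are hypothetical. The middle row deliberately has no bgcolor, to show one way a flat list of hex values could drift out of step with the rows; I don't know whether that is what happens on the real page.

```python
from html.parser import HTMLParser

# Hypothetical rows; the second one has no bgcolor on purpose.
ROWS_HTML = """
<table>
  <tr class="head"><th>code</th><th>sample</th></tr>
  <tr><td>A1</td><td align="center" bgcolor="#111111">Color A</td></tr>
  <tr><td>B2</td><td>Color B</td></tr>
  <tr><td>C3</td><td align="center" bgcolor="#333333">Color C</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Builds one record per <tr>, so cell text and bgcolor stay on the same row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None      # record being built, or None when outside a <tr>
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "tr":
            self._row = {"cells": [], "hex": None}
        elif tag == "td" and self._row is not None:
            self._in_td = True
            self._row["cells"].append("")
            if "bgcolor" in attr_map:
                self._row["hex"] = attr_map["bgcolor"]

    def handle_data(self, data):
        if self._in_td:
            self._row["cells"][-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and self._row is not None:
            if self._row["cells"]:  # the header row has only <th> cells, so skip it
                self.rows.append(self._row)
            self._row = None


collector = RowCollector()
collector.feed(ROWS_HTML)

# A flat list loses the gap left by the row without a color.
flat = [row["hex"] for row in collector.rows if row["hex"]]
print(flat)  # ['#111111', '#333333']

# Row-wise records keep the gap as None, so every hex stays with its own row.
for row in collector.rows:
    print(row["cells"], row["hex"])
```

In this sketch the flat list has two values for three rows, while the row-wise records keep each hex value (or its absence) attached to the right row.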

Just for reference, this user had an analogous problem: trying to grab different element info from a table and have the information consolidated in rows. I'm trying to figure out how I can do the same thing for my use case.