How to use a different delimiter other than comma?

How can I choose to output a different delimiter other than a comma? In other words, how do I use a custom delimiter other than the default comma?

For what? The CSV data download?

Yes. I need to replace the comma delimiter with, say, pipes or another custom delimiter in the CSV download.

You can try making a macro in an advanced text editor, like Notepad++, to replace the column separators with something different.

If you open your CSV file, you will see that each value is wrapped in double quotes and the values are separated by commas.
You can replace the "," sequences with, say, "|" as you mentioned.

Are you the developer? If so, can we not introduce such a fundamental option to choose what delimiter to use?

I am not a WebScraper developer. I just help people get better at using it, of my own accord.

The delimiter choice option was in WebScraper before, but for some reason it was removed. My guess is that the majority of people don't really need it.

You can still change the delimiter yourself, though. Open the devtools_panel.js file located in the extension directory, find the getDataExportCsvBlob function, and change the two lines of code containing join(",") to use any other delimiter you prefer, such as join("|").

It's not clear if this would work, because the comma is not reserved in my data, and in fact, no symbol is reserved. That means that unless Webscraper is smart enough to escape or quote everything, you would not be able to selectively replace the delimiters; commas in CSV fields would also be replaced.
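To illustrate the concern, here is a minimal Python sketch (not part of Web Scraper; the function name and sample row are made up for illustration). It assumes the export quotes every field that contains a comma. Under that assumption, a quote-aware CSV parser can change only the real delimiters and leave commas inside fields alone, which a blind find-and-replace cannot do.

```python
import csv
import io

def convert_delimiter(text, new_delimiter="|"):
    """Re-emit CSV text with a different delimiter (hypothetical helper)."""
    # csv.reader understands quoting, so a comma inside "..." stays in its field.
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    # The writer quotes a field only if it contains the new delimiter,
    # a double quote, or a line break.
    writer = csv.writer(out, delimiter=new_delimiter, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

# Made-up sample row: the first field contains a comma.
sample = '"Acme, Inc.","24 inch monitor",1.25\n'
print(convert_delimiter(sample))
```

If a field happens to contain the new delimiter itself, the writer wraps that field in quotes, so no symbol needs to be reserved in the data.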

Your thoughts?

Can't verify that it really works, but you could try changing the Windows regional setting for Numbers, as shown here:

https://resrequest.helpspot.com/index.php?pg=kb.page&id=279

I've had clients who sent me WS .csv files where the delimiters are tabs, not commas (technically, .tsv files). They are from Europe and said the files were exported directly from Chrome/WS with no modifications. So it seems to be a regional setting for delimiters that affects Chrome.

@leemeng A good idea - worth a try, but that would affect the desktop globally I imagine? Could that have some undesirable side effects? What other applications or operations on the desktop would it affect?

Did you try replacing the comma and the double quotes together in Notepad++?

Like this:

Search : ","
Replace with : ";"

I replaced the comma with a semicolon on my end, but you could of course use whatever separator suits you.

The idea is that while a lone comma is common in the data, a comma enclosed by double quotes on both sides should be much rarer.

If a comma enclosed by double quotes can also occur in your extract, this won't work, but I just wanted to share how I personally handled it.

Hope this helps.

This would fail if you have mixed unquoted and quoted data, which WS tends to produce, e.g.:

1.25,"example1"
3.56,"example2"
2.77,"example3"

Fortunately, Notepad++ supports regex, so you can easily deal with the example above with:
Find: (\d)\,"
Replace: \1;"
(note: Regular expression search mode must be enabled)

which would turn it into:
1.25;"example1"
3.56;"example2"
2.77;"example3"

Ref: http://blog.hakzone.info/posts-and-articles/editors/understanding-regex-with-notepad/comment-page-1/
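For anyone who prefers a script to an editor, the same search and replace can be sketched in Python. This is only an illustration of the pattern above, not a full CSV parser; the function name is hypothetical, and it shares the regex's limitation that a digit followed by a comma and a double quote inside a field would also be rewritten.

```python
import re

def swap_digit_comma_quote(text, new_delimiter=";"):
    """Replace the comma in every digit-comma-quote sequence (hypothetical helper)."""
    # Same pattern as the Notepad++ search: a digit, a comma, then a double quote.
    # A function is used for the replacement so the delimiter is inserted literally.
    return re.sub(r'(\d),"', lambda m: m.group(1) + new_delimiter + '"', text)

sample = '1.25,"example1"\n3.56,"example2"\n2.77,"example3"\n'
print(swap_digit_comma_quote(sample))
```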

Unfortunately, that is not even a possibility for me, as the files are larger than Notepad++ allows; I can't even open them on Windows. Some are >100MB. Even smaller files that are allowed cause Notepad++ to crash and lag out.

Not to mention how much repetitive manual labour that implies, as I have to do a lot of these routinely.

Furthermore there are in-field quotes e.g. ,"24" monitor", and empty fields ,"", which cannot always be differentiated, even when quotes are escaped.
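One possible way around the file size and repetition problems, sketched here under assumptions rather than tested on real exports: a short Python script that streams each file row by row and converts a whole folder in one run. The function names, the .psv output suffix, and the UTF-8 encoding are all assumptions for illustration. It also assumes the quoting in the file is well formed, so rows with unescaped in-field quotes such as the one above may still come out wrong.

```python
import csv
from pathlib import Path

def convert_file(src, dst, new_delimiter="|", encoding="utf-8"):
    """Copy src to dst with a new delimiter, one row at a time."""
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(src, newline="", encoding=encoding) as fin, \
         open(dst, "w", newline="", encoding=encoding) as fout:
        writer = csv.writer(fout, delimiter=new_delimiter)
        for row in csv.reader(fin):  # only the current row is held in memory
            writer.writerow(row)

def convert_folder(folder, new_delimiter="|", out_suffix=".psv"):
    """Convert every .csv file in folder; out_suffix is a made-up convention."""
    written = []
    for src in sorted(Path(folder).glob("*.csv")):
        dst = src.with_suffix(out_suffix)
        convert_file(src, dst, new_delimiter)
        written.append(dst)
    return written
```

Calling convert_folder with the path of the download folder would write a converted copy next to each export and leave the originals untouched.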

I haven't tested it myself, but maybe you could have a look at the PilotEdit software. The free version claims to support files up to 10 GB, and it apparently has regex and script automation support (which might also help with the long, repetitive passes).

Might be worth a try...