How to extract text without linebreaks

bret · August 4, 2018, 3:13pm

I'm trying to retrieve the descriptions of several cryptocurrencies on Coinlib.
I can extract all the data without any problems except the description. The description contains HTML and line breaks, and these break my .csv file (the .csv file contains line breaks and now I can't use it).

If I hit refresh and look at the preview, everything looks fine. But when I export the .csv and open it locally, the results are not as expected and there are too many line breaks.

P.S. I'm aware it's not yet using pagination correctly; I haven't gotten there yet.

Url: https://coinlib.io/coin/BTC/Bitcoin
Target data: Description (below the "Show more" button)

Sitemap:
{"_id":"coinlib1","startUrl":["https://coinlib.io/coins"],"selectors":[{"id":"coin-link","type":"SelectorLink","selector":"div.tbl-currency a","parentSelectors":["_root"],"multiple":true,"delay":0},{"id":"coin-name","type":"SelectorText","selector":"h1","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0},{"id":"coin-ticker","type":"SelectorText","selector":"span.abbrev","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0},{"id":"market-cap","type":"SelectorText","selector":"div#coin-market-cap.coin-stat-value","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0},{"id":"type","type":"SelectorText","selector":"div.tab-pane.active div.col-6:nth-of-type(4)","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0},{"id":"algorithm","type":"SelectorText","selector":"div.tab-pane.active div.col-6:nth-of-type(6)","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0},{"id":"mineable","type":"SelectorText","selector":"div.tab-pane.active div.col-6:nth-of-type(10)","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0},{"id":"premined","type":"SelectorText","selector":"div.comp-top div.col-6:nth-of-type(12)","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0},{"id":"reddit-users","type":"SelectorText","selector":"div.comp-top tr:nth-of-type(1) td.align-middle","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0},{"id":"reddit-active","type":"SelectorText","selector":"div.comp-top tr:nth-of-type(2) td.align-middle","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0},{"id":"description","type":"SelectorText","selector":"div.coin-description","parentSelectors":["coin-link"],"multiple":false,"regex":"","delay":0}]}

Thank you in advance!

bret · August 5, 2018, 10:46am

My .CSV export file looks like this because of the line breaks / enters / empty spaces in the description.

bret · August 9, 2018, 6:22am

Anyone able to help me move forward or point me in the right direction?

Help is much appreciated!

iconoclast · August 9, 2018, 9:41pm

Hi Bret!

There is an easy solution for your issue: after scraping is complete, just copy and paste the resulting text from the preview window.

The other option is to force the page to redraw the text without <br> tags using the Tampermonkey extension.
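
A third route, if you would rather clean the export afterwards: below is a minimal Python sketch, assuming the exported file is a standard CSV in which the multi-line descriptions are wrapped in double quotes. The function names `collapse_whitespace` and `clean_csv` and the file names in the comment are made up for illustration; they are not part of Web Scraper. It reads the CSV with Python's built-in csv module, which understands quoted line breaks, and writes a copy where every field sits on a single line.

```python
import csv
import re


def collapse_whitespace(value: str) -> str:
    """Turn <br> tags and any run of whitespace (including line breaks) into one space."""
    without_br = re.sub(r"<br\s*/?>", " ", value, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", without_br).strip()


def clean_csv(source, target) -> None:
    """Copy CSV rows from source to target, flattening every field to a single line.

    source and target are open text file objects (or io.StringIO buffers).
    """
    reader = csv.reader(source)
    writer = csv.writer(target, lineterminator="\n")
    for row in reader:
        writer.writerow([collapse_whitespace(field) for field in row])


# Hypothetical usage with made-up file names:
#
# with open("coinlib1.csv", newline="", encoding="utf-8") as src, \
#         open("coinlib1_clean.csv", "w", newline="", encoding="utf-8") as dst:
#     clean_csv(src, dst)
```

Since this only touches the exported file, it should not matter whether the export came from the browser extension or from somewhere else, as long as the file is valid CSV.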

waiff · January 9, 2021, 11:35am

Hi, I'm having the same issue. My goal is to try out scraping through the Chrome extension and, if I manage to get everything working the way I need, move it to the cloud subscription. What would the solution to the problem be then? I assume the Tampermonkey extension is not going to solve that.

Thanks for any advice.