Wikipedia - images from post

Hello,
Describe the problem:

  1. When I click on an image, a viewer window opens with the image in it, but I can't set up the image download from that window.
  2. Alternatively, I would at least like to open the image's own page directly, where I could download the image itself.

I would like the full-size pictures.
I couldn't solve either of these points, so please advise.

Url:
e.g. Zoo Brno – Wikipedie - click -> Zoo Brno – Wikipedie

Sitemap:
{"_id":"SEZNAM_ZOOLOGICKA","startUrl":["https://cs.wikipedia.org/wiki/Seznam_zoologick%C3%BDch_zahrad_v_%C4%8Cesku"],"selectors":[{"id":"LINKY","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":".mw-content-ltr li > a[title]:nth-of-type(1)","type":"SelectorLink"},{"id":"NAME","multiple":false,"parentSelectors":["LINKY"],"regex":"","selector":"span.mw-page-title-main","type":"SelectorText"},{"id":"CONTENT","multiple":false,"parentSelectors":["LINKY"],"regex":"","selector":"p","type":"SelectorText"},{"id":"CONTENT HTML","multiple":false,"parentSelectors":["LINKY"],"regex":"","selector":"p","type":"SelectorHTML"},{"id":"link IMG","linkType":"linkFromHref","multiple":false,"parentSelectors":["LINKY"],"selector":".infobox a.mw-file-description","type":"SelectorLink"},{"id":"IMG","multiple":false,"parentSelectors":["link IMG"],"selector":"img.mw-mmv-final-image","type":"SelectorImage"}]}

Thank you

Hey, you can scrape the src attribute of the thumbnails and then use regex and/or find/replace in post-processing to build the full-size links. See the selector 'images':

{"_id":"SEZNAM_ZOOLOGICKA","startUrl":["https://cs.wikipedia.org/wiki/Seznam_zoologick%C3%BDch_zahrad_v_%C4%8Cesku"],"selectors":[{"id":"LINKY","linkType":"linkFromHref","multiple":true,"parentSelectors":["_root"],"selector":".mw-content-ltr li > a[title]:nth-of-type(1)","type":"SelectorLink"},{"id":"NAME","multiple":false,"parentSelectors":["LINKY"],"regex":"","selector":"span.mw-page-title-main","type":"SelectorText"},{"id":"CONTENT","multiple":false,"parentSelectors":["LINKY"],"regex":"","selector":"p","type":"SelectorText"},{"id":"CONTENT HTML","multiple":false,"parentSelectors":["LINKY"],"regex":"","selector":"p","type":"SelectorHTML"},{"id":"link IMG","linkType":"linkFromHref","multiple":false,"parentSelectors":["LINKY"],"selector":".infobox a.mw-file-description","type":"SelectorLink"},{"id":"IMG","multiple":false,"parentSelectors":["link IMG"],"selector":"img.mw-mmv-final-image","type":"SelectorImage"},{"extractAttribute":"src","id":"images","parentSelectors":["LINKY"],"selector":".mw-content-ltr [class=\"mw-file-description\"] img","type":"SelectorGroup"}]}

Example output for one image:

//upload.wikimedia.org/wikipedia/commons/thumb/6/66/Przewalski%27s_horse%2C_ZOO_Brno.jpg/250px-Przewalski%27s_horse%2C_ZOO_Brno.jpg

To get the full size images:

  1. Add https: to the beginning
  2. Remove /thumb from the path
  3. Remove the last path segment from the end, here /250px-Przewalski%27s_horse%2C_ZOO_Brno.jpg
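
If you'd rather do this in a script than by hand in the spreadsheet, the three steps above can be sketched in Python roughly like this. It is only a minimal sketch: the function name full_size_url is made up for illustration, and it assumes every scraped src follows the same thumbnail pattern as the example output above.

```python
def full_size_url(thumb_src: str) -> str:
    """Turn a scraped thumbnail src into a full-size image URL.

    Hypothetical helper; assumes the src looks like the example above:
    //upload.wikimedia.org/.../thumb/<a>/<ab>/<file>/<width>px-<file>
    """
    url = thumb_src.strip()

    # Step 1: the scraped src is protocol-relative, so add "https:".
    if url.startswith("//"):
        url = "https:" + url

    # Values without a /thumb/ segment are left as they are.
    if "/thumb/" not in url:
        return url

    # Step 2: remove the /thumb path segment (first occurrence only).
    url = url.replace("/thumb/", "/", 1)

    # Step 3: drop the last path segment (the "250px-..." part).
    return url.rsplit("/", 1)[0]


example = (
    "//upload.wikimedia.org/wikipedia/commons/thumb/6/66/"
    "Przewalski%27s_horse%2C_ZOO_Brno.jpg/"
    "250px-Przewalski%27s_horse%2C_ZOO_Brno.jpg"
)
print(full_size_url(example))
```

For the example above this should print https://upload.wikimedia.org/wikipedia/commons/6/66/Przewalski%27s_horse%2C_ZOO_Brno.jpg. The same logic works as a find/replace in a spreadsheet; the script is just handy when there are many rows.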

Wow, great. I already have the links to the images downloaded in the spreadsheet so I'll just edit them :slight_smile:

Thanks a lot.