Tuesday, May 31, 2016

Importing the Web Gallery of Art into RStudio

The Web Gallery of Art is open data: its catalog can be downloaded as a CSV file and imported into RStudio. Depending on your system, however, you might see strange characters instead of umlauts and accents, for example <f6> instead of ö.

That is a sign that the file is encoded in ISO-8859 rather than UTF-8 (<f6> is the byte 0xF6, which stands for ö in ISO-8859-1). This is probably because the developers of the site are from Budapest, Hungary.

Oh, what is this?
The simple solution is to read the file as it is, as ISO-8859 (and not as UTF-8). Some suggest changing the 'Default text encoding' in the Preferences, but there is a much easier way:
  1. Do not use the "Import Dataset" assistant. Its encoding option has no effect in some versions.
  2. Use the command line (the R console) and enter: WGAcatalog <- read.csv2("WGAcatalog.csv", sep=";", encoding="iso_8859-1")
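
If the call above still shows garbled characters on your system, one variant worth trying is the fileEncoding argument of read.csv2, which re-encodes the file while it is being read. This is only a minimal sketch and not part of the original recipe: it assumes the file is Latin-1 (ISO-8859-1) and that your R session runs in a UTF-8 locale. The temporary file and its two columns are made up so that the example runs without the real WGAcatalog.csv.

```r
# Build a tiny ISO-8859-1 encoded, semicolon-separated file as a stand-in
# for WGAcatalog.csv (column names and row are invented for illustration).
tmp <- tempfile(fileext = ".csv")
lines <- c("AUTHOR;TITLE", "B\u00f6cklin, Arnold;Island of the Dead")
latin1_lines <- iconv(lines, from = "UTF-8", to = "latin1")
writeLines(latin1_lines, tmp, useBytes = TRUE)  # write the Latin-1 bytes unchanged

# fileEncoding tells R which encoding the file is in, so the text is
# converted to the session's native encoding while reading.
catalog <- read.csv2(tmp, fileEncoding = "ISO-8859-1", stringsAsFactors = FALSE)
print(catalog$AUTHOR)
```

For the real catalog you would replace tmp with "WGAcatalog.csv". If the umlauts then print correctly, the encoding guess was right; if not, the file may use a different ISO-8859 variant.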
Does that work for you?

