Tuesday, April 07, 2009

The missing iso-8859-1 code page

It's definitely not there. Some people have complained about this when parsing XML files with the .NET CF, but I only found out about it last night while trying to custom-parse an iso-8859-1 encoded KML file. These files are used by Google Earth to add layers of geographical information on top of the displayed map. The content is regular XML with a specialized syntax that the Google Earth application recognizes, and you can find lots of sources for this type of file. I'm currently developing a native Virtual Earth map browser for Windows Mobile and I want to add the option to read these files and dynamically add the information to the map - which is how I ran into this issue.

When I tried to convert a string with the MultiByteToWideChar API using the iso-8859-1 code page (28591), the call failed with an invalid parameter error. A very brief search showed me that this is a known issue and that, apparently, it's up to the device manufacturer to decide whether to include a given code page. Thankfully, there seems to be a workaround that should cover most conversion cases: use the windows-1252 code page instead.
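For the cases where windows-1252 is not a good enough stand-in, here is a minimal sketch of a different technique that needs no code page at all: every iso-8859-1 byte value is, by definition, the same number as its Unicode code point, so the bytes can simply be widened. This is not what I ended up using, and the function name Latin1ToWide is made up for illustration. Note that the two encodings only disagree in the 0x80-0x9F range, where windows-1252 has printable characters and iso-8859-1 has control codes.

```cpp
#include <string>

// Hypothetical helper (not part of any Windows API): converts iso-8859-1
// bytes to wide characters by widening each byte to the code point with
// the same numeric value. No code page needs to be installed for this.
std::wstring Latin1ToWide(const std::string& input)
{
    std::wstring output;
    output.reserve(input.size());

    for (std::string::size_type i = 0; i < input.size(); ++i)
    {
        // Go through unsigned char first so that bytes >= 0x80 are not
        // sign-extended into negative values on platforms where char is signed.
        const unsigned char byte = static_cast<unsigned char>(input[i]);
        output.push_back(static_cast<wchar_t>(byte));
    }

    return output;
}
```

Calling Latin1ToWide("Jo\xE3o") yields the wide string "João", since 0xE3 is U+00E3 in both iso-8859-1 and Unicode.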

Now I'm glad that I'm custom-parsing the KML file, because I can cheat on the encoding declaration and replace iso-8859-1 with windows-1252 on the fly. I'm not sure I would be so successful if I were using an off-the-shelf parser (native or managed). Now the question is: why is iso-8859-1 not there?
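One way to picture the cheat is as a patch applied to the XML declaration before the rest of the file is decoded. The sketch below is only an illustration under that assumption: the function name PatchEncodingDeclaration is hypothetical, it expects the file to start directly with the XML declaration, and it deliberately touches nothing outside of it.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lowercases one character; used to make the search case-insensitive.
static char ToLowerChar(char c)
{
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Hypothetical helper: if the document starts with an XML declaration that
// names iso-8859-1, return a copy that names windows-1252 instead.
// Any other input is returned unchanged.
std::string PatchEncodingDeclaration(const std::string& xml)
{
    const std::string from = "iso-8859-1";
    const std::string to = "windows-1252";

    // Only look inside the declaration (<?xml ... ?>), never in the content.
    if (xml.compare(0, 5, "<?xml") != 0)
        return xml;

    const std::string::size_type end = xml.find("?>");
    if (end == std::string::npos)
        return xml;

    // Search a lowercased copy so that "ISO-8859-1" is matched as well.
    // Lowercasing keeps the length, so positions map back to the original.
    std::string declaration = xml.substr(0, end);
    std::transform(declaration.begin(), declaration.end(),
                   declaration.begin(), ToLowerChar);

    const std::string::size_type pos = declaration.find(from);
    if (pos == std::string::npos)
        return xml;

    std::string patched = xml;
    patched.replace(pos, from.size(), to);
    return patched;
}
```

The patched text can then be converted with the windows-1252 code page (1252), which is the one that was actually available on my device.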

2 comments:

smartmobili said...

Hi,

Wouldn't it be easier to use Google/Gears instead of developing a native app?

João Paulo Figueira said...

One of the requirements is to use the Virtual Earth tile system, so that leaves Google out of the equation. Also, this has been a great learning experience in stitching maps, zooming in and out, using IImage and a host of other techniques that I will describe on the blog.