Sunday, February 04, 2007

???

It's pretty irritating when someone else's XML document object model - running on a remote server with little in the way of a debugging toolkit - screws up your document by replacing every non-ASCII character with a ? (question mark). It's especially annoying if your application is used globally by important clients whose data contains lots of accented characters. But it's better to spit out loads of question marks than to completely F&*^ things up by just chewing off the most significant bit of every byte in a non-ASCII character (ASCII characters are all encoded using the 7 least significant bits of a byte). The first issue is a display problem that will usually only be caught by the human eye. The second one breaks other law-abiding XML parsers because they are being fed invalid XML: the leftover 7-bit values can land on control characters or on markup characters such as < and &, so the document may no longer be well-formed. A client is more likely to forgive a couple of question marks on the screen - at least they can see the rest of the data - than an error page because their document couldn't even be parsed.
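
To make the difference concrete, here is a rough Python sketch of the two failure modes. It is not the remote DOM's actual code, which I never got to see; the function names are made up for illustration, and it assumes the document was UTF-8 encoded before the damage was done.

```python
import xml.etree.ElementTree as ET


def replace_non_ascii(text):
    # Failure mode 1: every non-ASCII character becomes a single '?'.
    return text.encode("ascii", errors="replace")


def strip_high_bit(text):
    # Failure mode 2: clear the most significant bit of every byte.
    # Assumes UTF-8 input (an assumption; the real encoding is unknown).
    return bytes(b & 0x7F for b in text.encode("utf-8"))


def parses(doc):
    # True if a standards-following XML parser accepts the bytes.
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


doc = "<name>Müller</name>"

lossy = replace_non_ascii(doc)
mangled = strip_high_bit(doc)

print(lossy, parses(lossy))      # b'<name>M?ller</name>' True
print(mangled, parses(mangled))  # b'<name>MC<ller</name>' False
```

In UTF-8 the ü is the two bytes 0xC3 0xBC. With the top bit removed they become 0x43 and 0x3C, that is "C" and "<", so the stray "<" opens a tag that never closes properly and the parser gives up. The question-mark version loses the character but still parses.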
