I wish this site were powered by Django

August 31st, 2006

Django 0.95 has unicode problems, too

Filed under: Django, Python, Technology, Web — jm @ 04:48

I had to revise my post titled “UTF8-encoded Unicode support”, because I found out that django’s unicode support has its own problems. Some are, of course, connected to the character-set handling of their database code, but in general django currently handles all strings as binary data, assuming that everything works within the DEFAULT_ENCODING setting.

That way, foreign character sets can break functions (#1355) in django and, even worse, you can’t hook up a legacy database that uses a different character set without patching the database driver. At least current work on SQLAlchemy integration might make that less of a concern. I’m just hoping that django 1.0 will include full unicode support. For more information, read the update on the old post.
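To make the "strings as binary" problem concrete, here is a minimal Python 3 sketch. It is not Django code and does not reproduce ticket #1355; the helper names `truncate_bytes` and `truncate_text` are made up for this illustration. It only shows the general failure mode: an operation that is harmless on decoded text can cut a multi-byte character in half when it runs on the encoded bytes.

```python
def truncate_bytes(raw: bytes, limit: int) -> bytes:
    """Naive truncation that treats text as binary data (hypothetical helper)."""
    return raw[:limit]


def truncate_text(text: str, limit: int) -> str:
    """Truncation on decoded text, which counts characters instead of bytes."""
    return text[:limit]


title = "Grüße"                  # 5 characters
raw = title.encode("utf-8")      # 7 bytes: "ü" and "ß" take 2 bytes each

# Cutting after 3 bytes splits the 2-byte sequence for "ü" in half,
# so the result is no longer valid UTF-8.
broken = truncate_bytes(raw, 3)
try:
    broken.decode("utf-8")
    byte_cut_ok = True
except UnicodeDecodeError:
    byte_cut_ok = False

# Cutting after 3 characters keeps "ü" intact.
char_cut = truncate_text(title, 3)
```

Decoding at the boundary (database driver, request parsing) and working on text everywhere else avoids this class of bug, which is presumably what a full "unicodification" would amount to.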

Update (02/15/2008)

Django has come a long way since this post. An update can be found here.

Every single lie connected to the Iraq war

Filed under: Cutting the crap, Politics — jm @ 02:48

or, more appropriately: “The Administration of Crooks and Liars”, compiled by Mother Jones Magazine. A must read!

August 15th, 2006

Navy diver goes down to 610 meters

Filed under: General, Technology — jm @ 01:48

Holy crap! The suit looks kinda bulky, so it might be hard to get it declared as diving baggage. I still want it, though… (via jwz)

August 12th, 2006

Apple’s lawyers had their brains replaced with bricks

Filed under: Attitude, Cutting the crap — jm @ 18:29

There’s no other explanation for this… I mean… come on, looking like an iPod??

August 11th, 2006

I hate IE6! (or “why my posts jumped all over the place”)

Filed under: Technology, Web — jm @ 20:10

I just fixed a bug in my maurus.net design that recently showed up in IE6/Win. Two days ago I decided to revalidate the site, found some invalid markup that had crept in, and fixed it. It turns out that this triggered a bug that only occurs in IE6’s standards-compliant mode. So congratulations to me: I introduced a bug by fixing my XHTML templates.

I fixed it by applying a width: 100%; CSS rule to the p elements under div.storycontent, the wrapper for all weblog posts’ content.

Yes, I have lots of non-semantic markup in my templates and I feel bad about it, but it’s the best I can do right now. Now that I’ve admitted it, information scientists all over the world may burn me at the stake.

More RoR fallout

Filed under: Security, Technology — jm @ 12:02

Rails 1.1.6, backports, and full disclosure. Seems like they only caught one part of the problem in 1.1.5, so they updated again.

Some of the people leaving comments seem to be pissed off at the no-disclosure thing that the RoR core team did yesterday. Also, the update seems to break compatibility with “3rd party engines” (unfortunately, I don’t know what that means in Rails-speak), which reminds me of the memory leak that the PHP developers had to fix with an incompatible change. I hope that this doesn’t have the same impact on Rails developers as PHP 4.4 had on PHP developers.

August 10th, 2006

Ruby On Rails security leak

Filed under: General — jm @ 16:50

Rails 1.1.5: Mandatory security patch. Time to poke around and find out what it was, but it seems to be quite serious.

August 08th, 2006

New features

Filed under: General — jm @ 18:48

Finally I had time to port some features from my maurus.net staging host. For some time now, I’ve been experimenting with all kinds of JavaScript toolkits (Prototype in particular). Last week I integrated the excellent WordPress Widgets Plug-In into my theme and rolled out the changeable header graphic (you’ve noticed that, haven’t you? ;-)). Today, in this code update for maurus.net, there’s new JavaScript functionality.

I tend to clutter my articles with lots of notes that are important to create context, but might turn a reader away. So go and meet the new hide-the-notes button.

I know that it’s not particularly impressive, but as with all dynamic JavaScript-based features, I think a lot of thought has to go into usability and graceful fallback, so I’m not very eager to add such things before testing them extensively.

UTF8-encoded Unicode support

Filed under: Attitude, Cutting the crap, Django, Java, Python, Technology — jm @ 02:15

From this post at MySQL DBA:

The utf8 spec says that a utf8 character can take up to 4 bytes, mySQL currently only supports up to 3 bytes.

……holy crap. Let’s summarize how different languages and frameworks support Unicode at the moment:

For some reason, someone over at Sun decided that Java’s char data type should be 2 bytes wide (one UTF-16 code unit) and that strings should be serialized as modified UTF8, so that certain characters that require 4 bytes in normal UTF8 now require 6. This actually makes sense for compatibility with C programs (modified UTF8 has no “0-bytes” in strings), but it makes it hard to support Unicode’s supplementary planes, where a character takes 4 bytes in UTF8 and two chars in Java. The details can be found in JSR-204. Ever since then… string operations in Java cannot reliably calculate string length, because methods like length() count UTF-16 code units, not actual characters.
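The numbers in the Java paragraph can be checked with a short Python 3 sketch. The function name `measure` is made up for this example, and the "per unit" encoding only approximates modified UTF8: it encodes every UTF-16 code unit separately, but ignores the special two-byte form that modified UTF8 uses for U+0000.

```python
def measure(text: str) -> dict:
    """Count one string in several units (illustrative helper, not a library API)."""
    utf16 = text.encode("utf-16-be")
    # Split the UTF-16 bytes into 16-bit code units, which is what a Java char holds.
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    # Encode each code unit on its own, the way modified UTF8 treats surrogate pairs.
    # "surrogatepass" lets Python encode lone surrogates instead of raising an error.
    per_unit = "".join(chr(u) for u in units).encode("utf-8", "surrogatepass")
    return {
        "code_points": len(text),                 # what a reader would call characters
        "utf16_units": len(units),                # what Java's String.length() counts
        "utf8_bytes": len(text.encode("utf-8")),  # standard UTF8
        "per_unit_utf8_bytes": len(per_unit),     # approximation of modified UTF8
    }


# U+1D11E MUSICAL SYMBOL G CLEF lives in a supplementary plane.
clef = measure("\U0001D11E")
```

For the G clef this gives 1 code point, 2 UTF-16 code units, 4 bytes in standard UTF8 and 6 bytes in the per-unit encoding, which is also why a 3-byte-only UTF8 implementation cannot store it.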

PHP… don’t even get me started. PHP just sucks.

Ruby apparently also has problems with moving to Unicode (and this language was designed in Japan!).

So while I’m very glad that I decided to focus on Django and Python, which have excellent Unicode support, I really feel, more than ever, the need to get the word out that text processing is incredibly hard and that there is no excuse for so many developers and teachers not caring about it.

Update (08/31/2006)

Django’s unicode support apparently also sucks. They try to do the right thing in their MySQL driver (using SET NAMES 'utf8'), but fail to set the connection character set properly, so even if the model is in UTF8, MySQL will treat every incoming string as latin1. This leads to ugly reencoding errors. It seems to work better with PostgreSQL, but it’s still a huge fucking bug. The developers are trying to get their act together, though, and the “unicodification” will probably be done before they hit 1.0. #1356, #1355 and #952 read very badly. At least I now have pointers on what to fix (follow the links), but out of the box you’re fucked if you want to connect your legacy Windows-1252 database to the web using UTF8 with django.

This reminds me a bit of the sad situation with Typo3. At least, with django, it’s not the programming language that’s the core problem.
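The reencoding errors from the update above are easy to reproduce outside of any database. The following Python 3 sketch is not Django or MySQL code; the function names are hypothetical, and Python's `latin-1` codec stands in for whatever the server assumes on the connection. It shows what happens when UTF8 bytes are read as latin1, and that the damage is only reversible as long as nothing else has touched the garbled text.

```python
def misread_as_latin1(text: str) -> str:
    """Simulate a receiver that decodes UTF8 bytes as latin1 (hypothetical helper)."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo a single round of the mix-up by reversing both steps."""
    return garbled.encode("latin-1").decode("utf-8")


original = "Müller"
garbled = misread_as_latin1(original)           # "ü" becomes the two characters "Ã¼"
twice = misread_as_latin1(garbled)              # a second round makes it even worse
restored = repair(garbled)                      # round trip gives the original back
```

Every non-ASCII character grows with each round of misreading, which matches the typical symptom of a connection character set that does not agree with the data.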

Update (02/15/2008)

Unlike Java and PHP, Django has come a long way since this post was written. I wrote an update on Django’s Unicode capabilities that can be found here.

August 07th, 2006

Network infrastructure risks

Filed under: Security — jm @ 23:50

After the flaws in WLAN drivers were discovered that effectively make your PC an easy target, there's a new entry on Bruce Schneier's blog talking about the risks that printers pose to a network. What other parts of everyday infrastructure are a currently unmanaged risk?

But I have to say... I really like the "paper-clip idea".
