
Unicode Question


  • From: fairwinds at eastlink.ca (David Pratt)
  • Subject: Unicode Question
  • Date: Mon, 09 Jan 2006 21:00:14 -0400

Hi. I am working through some tutorials on Unicode and am hoping that 
someone can help explain this for me. I am on a Mac using Python 2.4.1 
at the moment, and I am experimenting with the 3/4 symbol.

I want to prepare strings for database storage that come from a normal 
Windows machine (cp1252). My understanding is that I should convert them 
to unicode and then encode them to UTF-8 before storing them. Since the 
data will be used on the web, I would then not have to change the 
encoding when extracting it from the database. I believe this first 
example simulates this with the 3/4 symbol. Here I want to store 
'\xc2\xbe' in my database.

 >>> tq = u'\xbe'
 >>> tq_utf = tq.encode('utf8')
 >>> tq, tq_utf
(u'\xbe', '\xc2\xbe')
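
The storage step described above can be sketched as a decode from the source encoding followed by an encode to UTF-8. This is only a minimal sketch: it assumes a current Python 3 interpreter (where byte strings are written with a b prefix) rather than the 2.4.1 session shown here, and the helper name to_utf8_for_storage is hypothetical, not part of any library.

```python
def to_utf8_for_storage(raw_bytes, source_encoding="cp1252"):
    """Decode bytes from the source encoding, then re-encode them as UTF-8."""
    # Step 1: bytes -> text, using the encoding the data actually arrived in.
    text = raw_bytes.decode(source_encoding)
    # Step 2: text -> bytes, using the encoding chosen for storage.
    return text.encode("utf-8")


# 0xBE is the 3/4 symbol in cp1252.
three_quarters = b"\xbe"
stored = to_utf8_for_storage(three_quarters)
print(repr(stored))
```

The source encoding is a parameter because the right value depends on where the bytes came from; cp1252 is only the default assumed for the Windows case mentioned above.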

To unicode with a variable, my understanding is that I can unicode and 
encode at the same time:

 >>> tq = '\xbe'
 >>> tq_utf = unicode(tq, 'utf-8')
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbe in position 0: unexpected code byte

This is not working for me. Can someone explain why? Many thanks.
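
One way to narrow the failure down might be to check which codecs accept the single byte 0xbe. The sketch below again assumes a current Python 3 interpreter, and try_decode is a hypothetical helper. It relies on the fact that the encoding argument names the encoding of the input bytes, so a lone 0xbe, which is a UTF-8 continuation byte, cannot be decoded as UTF-8 on its own.

```python
def try_decode(raw_bytes, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw_bytes.decode(encoding)
    except UnicodeDecodeError:
        return None


raw = b"\xbe"

# A lone 0xbe is not a complete UTF-8 sequence, so this decode fails.
as_utf8 = try_decode(raw, "utf-8")

# The same byte is a valid cp1252 character (the 3/4 symbol).
as_cp1252 = try_decode(raw, "cp1252")

print(as_utf8 is None, as_cp1252 is not None)
```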

Regards,
David