featar (featar) wrote in ljcharm,

Once more about UTF-8

Hello!

I use Charm 1.9.0 on Fedora 8 with uk_UA.UTF-8 as the system locale (Ukrainian language, Cyrillic script, UTF-8 encoding).

The original program re-encodes my posts into unreadable text.

Playing with iconv, I found that it applies a Latin-1 -> UTF-8 conversion to text that is already encoded in UTF-8.

I found the following lines in the source script:

# -----------------------
# Dealing with Unicode.
# -----------------------

def utf8(s):
    "UTF-8 encode a string, if supported."

    try:
        return unicode(s, "iso-8859-1").encode("UTF-8")
    except:
        return s




It does not work correctly: it assumes the input is always Latin-1.
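To illustrate the double encoding described above, here is a minimal sketch. The sample word is my own assumption, not something from Charm, and the code is written so that it should run on both Python 2 and Python 3.

```python
# Sample text is an assumption: the Ukrainian word for "hello",
# written with escapes so the source file stays pure ASCII.
text = u"\u041f\u0440\u0438\u0432\u0456\u0442"

# The bytes a program receives from a UTF-8 locale.
utf8_bytes = text.encode("utf-8")

# The same steps as the original utf8(): decode as Latin-1, encode as UTF-8.
# Latin-1 decoding accepts every byte value, so no exception is raised
# and the "except" branch is never reached.
double_encoded = utf8_bytes.decode("iso-8859-1").encode("utf-8")

# What a reader sees when the result is displayed as UTF-8.
garbled = double_encoded.decode("utf-8")

# The damage can be undone by reversing the extra step.
repaired = double_encoded.decode("utf-8").encode("iso-8859-1").decode("utf-8")
```

Every byte of the Cyrillic text is above 0x7F, so each one turns into two bytes and the post doubles in size while becoming unreadable.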

Is Charm aimed only at English speakers?

I propose the following modification. It works for me.

# -----------------------
# Dealing with Unicode.
# -----------------------
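def utf8(s):
    "UTF-8 encode a string, if supported."
    import locale
    # Encoding of the user's locale, e.g. 'UTF8' or 'UTF-8'; may be None.
    enc = locale.getdefaultlocale()[1]
    # Text from a UTF-8 locale is already UTF-8: return it unchanged.
    if enc is None or enc.replace('-', '').upper() == 'UTF8':
        return s
    try:
        # Otherwise decode from the locale's encoding and re-encode.
        return unicode(s, enc).encode("UTF-8")
    except (LookupError, TypeError, UnicodeError):
        return s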
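The encoding name reported by the locale is not always spelled the same way ('UTF8', 'UTF-8', 'utf_8'), so a plain string comparison may miss some spellings. Below is a possible sketch that lets the codecs module resolve aliases instead. The helper names is_utf8 and to_utf8 are hypothetical and are not part of Charm; the code should run on both Python 2 and Python 3.

```python
import codecs


def is_utf8(encoding_name):
    """Return True if encoding_name is any alias of UTF-8 (hypothetical helper)."""
    if not encoding_name:
        # None or an empty string: the locale did not report an encoding.
        return False
    try:
        # codecs.lookup() maps every known alias to one canonical name.
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        # Unknown encoding name.
        return False


def to_utf8(raw, source_encoding):
    """Re-encode raw bytes from source_encoding to UTF-8 (hypothetical helper)."""
    if is_utf8(source_encoding):
        # Already UTF-8: return the bytes untouched.
        return raw
    try:
        return raw.decode(source_encoding).encode("utf-8")
    except (LookupError, TypeError, UnicodeError):
        # Unknown or missing encoding, or undecodable bytes: give up quietly.
        return raw
```

A caller would pass the locale's encoding as the second argument, for example the value of locale.getdefaultlocale()[1] used in the modification above.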

