VBA Unicode done right, redux

Some time ago I wrote a short article on forcing VBA to use some form of unicode to allow simple insertion of non-latin1 text into VBA modules. It sort of worked, for the most part, kind of. Well, it didn't. The problem is down to the VBA editor being locked to the local encoding of the machine it is running on. You can only type characters from BIG5 in China, Shift-JIS in Japan, and so on. If you have a need to make a VBA module that uses strings suitable for all locales ... you're pretty much stuffed.

Except you're not. If you do it properly, you can have any unicode character displayed in any VBA locale. So how do you do it properly? Well, you're supposed to use ChrW$() to generate unicode characters individually. Yes. Really.

Say you want to include the Chinese character 汉 in your latin1 text. The only way to do it is to use a string like this:

    MyString = "This is my string with a " &
    ChrW$(&H6C49) & " in it."

Imagine if you have entire sentences made up of such characters. Doesn't bear thinking about, does it? However, I've made a simple way to work with it. The trick is to write your strings outside the VBA editor using a unicode encoding (e.g. UTF-8). For example, take this little proto-module:

Sub Tomato
    MyForm.MyLabel.Caption = "Chinese or the Sinitic language(s) (汉语/漢語, pinyin: Hànyǔ; 中文, pinyin: Zhōngwén) can be considered a language or a language family and is originally the indigenous languages spoken by the Han Chinese in China."
End Sub

That won't work with VBA due to the Chinese characters. However, they display fine in unicode. Hrm. What to do. Well, you can use my little utility to fix it. It's a simple .NET app that should run on just about anything (Mono included, although I've not tested it).

Using it is very simple. Just paste your unicode string into the top box and click the Unicode to VBA button. The output should look like this:

You can also go the other way, so copy the already converted string back out of your VBA module into the top box and hit VBA to Unicode. You should see this:

That's basically it. Once the converted string is copied into your VBA, it will look like this.

Sub Tomato
    MyForm.MyLabel.Caption = "Chinese or the Sinitic language(s) (" & ChrW$(&H6C49) & ChrW$(&H8BED) & "/" & ChrW$(&H6F22) & ChrW$(&H8A9E) & ", pinyin: Hàny" & ChrW$(&H1D4) & "; " & ChrW$(&H4E2D) & ChrW$(&H6587) & ", pinyin: Zh" & ChrW$(&H14D) & "ngwén) can be considered a language or a language family and is originally the indigenous languages spoken by the Han Chinese in China."
End Sub

That should be fine for inclusion directly in VBA and all will work properly.

You can download VBAUnicodeConverter.exe directly, or the source files in VBAUnicodeCoverter.zip. There is also the original Perl script that you can find in convert.pl. There are no particular restrictions on the source, except that you should credit me if you nick bits of it. Not that there's much to nick.

Issue displaying chinese characters using a msgbox

Hello Craiga,

I need what you have just written. However I don´t get the chinese characters when using a msgbox.

This works fine...
Range("A1") = "This is my string with a " & ChrW$(&H6C49) & " in it."

This doesn´t
msgbox("This is my string with a " & ChrW$(&H6C49) & " in it.")
The excel message displays "This is my string with a ? in it."

I would really appreciate any help on this issue.
Kind regards,

Rodrigo Gomez
skype: rodrogg
rodrigo@arcadiaconsulting.es

Using MessageBoxW

The normal MsgBox function uses the Windows system font, and as such is tied to the OS's code page. You can display unicode characters in message boxes using the MessageBoxW function from User32.dll. Include this code before you use it:
Private Declare Function User32MsgBox Lib "user32" Alias "MessageBoxW" _
    (Optional ByVal hWnd As Long, Optional ByVal Prompt As Long, _
     Optional ByVal Title As Long, Optional ByVal Buttons As Long) As Long

Function MessageBoxW(cPrompt As String, _
    Optional cButtons As VbMsgBoxStyle = vbOKOnly, _
    Optional cTitle As String) As Long

    MessageBoxW = User32MsgBox(0, StrPtr(cPrompt), StrPtr(cTitle), cButtons)

End Function
You can then call MessageBoxW using just the text, or with the optional caption and button styles like this:
MessageBoxW "This is my string with a " & ChrW$(&H6C49) & " in it."

MessageBoxW "Can you read this character: " & ChrW$(&H6C49) & "?", _
            vbYesNo, "Test Message Box"

MessageBoxW does not work for me

Hello Craig,

Thank you for your solution. However it doesn´t work for me.
You are right, message box characters depend on the operating system. We have tried on a chinese pc and it´s displayed correctly.

I checked User32.dll and it´s in C:\WINDOWS\system32.
Any clue?.
Once more, I really appreciate your time.
Regards,

Rodrigo Gomez
skype: rodrogg
rodrigo@arcadiaconsulting.es

Interesting ...

I've put together a Word 2003 document that illustrates the MessageW example. You can download it here. See if that works for you. I've tested it on Word 2007 and Word 2003.