Most likely the version of awk you have simply doesn't support Unicode at all.

Virtually every interpreted language you might reach for (Perl, PHP, Python) is going to have an XML parser. Granted, that xmlstarlet program you're using is pretty handy.
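
For what it's worth, here is a rough sketch of the parser route in Python, using the standard library's xml.etree.ElementTree. The sample document, the element names (library, track, title) and the extract_titles helper are all made up for illustration, since I don't know what your actual file looks like. The point is only that the parser decodes the UTF-8 for you, so non-ASCII text comes back as proper strings.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are invented for this example.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<library>
  <track><title>Café del Mar</title></track>
  <track><title>Ænima</title></track>
</library>
"""


def extract_titles(xml_text):
    """Return the text of every <title> element, in document order."""
    # Encode to bytes so the parser honours the encoding named in the XML declaration.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [element.text for element in root.iter("title")]


if __name__ == "__main__":
    for title in extract_titles(SAMPLE):
        # len() counts characters, not bytes, because the text is already decoded.
        print(len(title), title)
```

For a real file you would probably call ET.parse on the filename instead of feeding it a string, and swap "title" for whatever element you're actually after.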
_________________________
Bitt Faulk