If you need to deal with Unicode in depth, ditch awk and get something that is designed to handle Unicode. Seriously. awk barely handles 8-bit characters, much less variable-length ones.
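To see why variable-length encodings trip up a byte-oriented tool, here is a rough Python sketch (Python is only my choice for illustration, and `char_and_byte_lengths` is a made-up helper name). It compares the number of characters in a string with the number of bytes in its UTF-8 encoding. A tool that counts bytes, as awk may do depending on the implementation and locale, would report the second number where you expect the first.

```python
def char_and_byte_lengths(text):
    """Return (number of code points, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


if __name__ == "__main__":
    # Plain ASCII: one byte per character, so both counts match.
    print(char_and_byte_lengths("plain"))
    # Accented letters take two bytes each in UTF-8, so the counts differ.
    print(char_and_byte_lengths("naïve café"))
    # The euro sign takes three bytes in UTF-8.
    print(char_and_byte_lengths("€"))
```

Anything that indexes or truncates by byte position can also cut a character in half, which is the kind of breakage a Unicode-aware language avoids by working on code points.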
_________________________
Bitt Faulk