posted on: 2015-10-22 08:24:26
This is a short Java program that uses a regular expression to convert Unicode escape sequences (such as \u0073) into the characters they represent.

Here is the example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegShow{
    
    public static void main (String[] args) throws java.lang.Exception {
        String a = "the\\u0073e unicode letter\\u0073 are odd";
        System.out.println(a);
        // Unicode escapes use hexadecimal digits (for example 00e9), so \d alone is not enough.
        Pattern p = Pattern.compile("\\\\u([0-9a-fA-F]{4})");
        Matcher m = p.matcher(a);
        StringBuffer buff = new StringBuffer();
        while(m.find()){
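            // quoteReplacement keeps a decoded '$' or '\' from being read as a group reference or escape.
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(buff, Matcher.quoteReplacement(String.valueOf(c)));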
        }
        m.appendTail(buff);
        System.out.println(buff.toString());
    }

}

The output is then:

the\u0073e unicode letter\u0073 are odd
these unicode letters are odd

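In the Java source the pattern starts with "\\\\u": the compiler turns that into the regex \\u, which matches a literal backslash followed by the letter u. After that come the four digits of the character code, captured in a group. The matcher finds each escape in turn, the group is parsed as a base-16 number and cast to a char, and appendReplacement writes the result to a StringBuffer. Finally, appendTail copies whatever text follows the last match.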
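
As a minimal sketch, the same loop can be wrapped in a reusable method. The class name UnicodeUnescape and the method name unescape are made up for illustration. The sketch assumes that every escape has exactly four hexadecimal digits (0-9, a-f, A-F), and it passes each decoded character through Matcher.quoteReplacement so that a decoded $ or \ is inserted literally instead of being treated as a group reference.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not part of the original program.
class UnicodeUnescape {

    // A literal backslash, the letter u, then four hexadecimal digits in group 1.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Parse the captured digits as base 16 and narrow the result to a char.
            char c = (char) Integer.parseInt(m.group(1), 16);
            // Quote the replacement so '$' and '\' are inserted as plain text.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        // Copy the text that follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: these unicode letters are odd
        System.out.println(unescape("the\\u0073e unicode letter\\u0073 are odd"));
        // prints: Java costs $5
        System.out.println(unescape("\\u004aava costs \\u00245"));
    }
}
```

The second call covers two cases the first one does not: an escape containing a hex letter (004a is J) and an escape that decodes to a dollar sign (0024).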
