Q: Java - Compiler parses Unicode source file incorrectly

Consider the following scenario:

example.txt:

ÄäÖöÜü

Java source:

try (FileInputStream fileInputStream = new FileInputStream("example.txt");
     InputStreamReader inputStreamReader = new InputStreamReader(fileInputStream, StandardCharsets.UTF_8);
     BufferedReader bufferedReader = new BufferedReader(inputStreamReader)) {

    String stringLoadedFromOutside = bufferedReader.readLine();
    String stringConstructedInside = "ÄäÖöÜü";

    System.out.println("string constant: " + stringConstructedInside);
    System.out.println("loaded string: " + stringLoadedFromOutside);
    System.out.println("equal: " + stringConstructedInside.equals(stringLoadedFromOutside));
} catch (IOException e) {
    e.printStackTrace();
}

Both files are encoded in UTF-8.

This outputs:

string constant: ÄäÖöÜü
loaded string: ÄäÖöÜü
equal: false

How can I prevent the compiler from turning the Unicode text in my source file into the wrong string?

Answer 1:

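The problem is that the compiler decodes source files using the system's default charset unless told otherwise, and UTF-8 is apparently not the default charset on your system. The UTF-8 bytes of the string literal are therefore misread at compile time, so the constant stored in the class file no longer matches the text read from example.txt.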
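
The effect can be reproduced at runtime without involving the compiler at all. The following is a minimal sketch, not the asker's code: the class name MojibakeDemo and the helper decodeUtf8BytesAs are made up for illustration, and ISO-8859-1 is only an assumed stand-in for whatever non-UTF-8 default charset the affected system uses. The literal is written with Unicode escapes so the demo itself does not depend on the source encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original question or answer.
class MojibakeDemo {
    // The six umlaut characters from example.txt, written as Unicode escapes
    // so this file compiles the same way under any source encoding.
    static final String EXPECTED = "\u00C4\u00E4\u00D6\u00F6\u00DC\u00FC";

    /** Encodes text as UTF-8, then decodes those bytes with the assumed charset. */
    static String decodeUtf8BytesAs(String text, Charset assumed) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, assumed);
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-1 stands in for a non-UTF-8 platform default.
        String misread = decodeUtf8BytesAs(EXPECTED, StandardCharsets.ISO_8859_1);
        String correct = decodeUtf8BytesAs(EXPECTED, StandardCharsets.UTF_8);

        System.out.println("default charset: " + Charset.defaultCharset());
        // Each character takes two bytes in UTF-8, so the misread string
        // has 12 characters instead of 6.
        System.out.println("misread length: " + misread.length());
        System.out.println("misread equals expected: " + misread.equals(EXPECTED));
        System.out.println("correct equals expected: " + correct.equals(EXPECTED));
    }
}
```

A compiler that reads a UTF-8 source file with the wrong charset makes the same mistake as the first call, only at compile time, which is why the comparison in the question returns false.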

When using javac directly:

javac -encoding utf8 MySourceFile.java

When using Gradle, you can use:

  • For all Java compile tasks:

    tasks.withType(JavaCompile) {
        options.encoding = 'utf8'
    }
    
  • For a single task:

    compileJava.options.encoding = 'utf8'
    

Now the code would output:

string constant: ÄäÖöÜü
loaded string: ÄäÖöÜü
equal: true
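
Because the console can garble output on its own, it may help to check the compiled literal without printing the characters themselves. The sketch below is one possible way to do that, under the assumption that dumping code points is acceptable for debugging; CodePointDump and dump are hypothetical names, and the escaped literal in main is only a placeholder for the string under test.

```java
import java.util.stream.Collectors;

// Hypothetical helper class; not part of the original answer.
class CodePointDump {
    /** Formats each code point of s as U+XXXX, separated by single spaces. */
    static String dump(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Placeholder: substitute the string constant from your own source file.
        String literal = "\u00C4\u00E4\u00D6\u00F6\u00DC\u00FC";
        // Prints only ASCII, so the result is readable on any console.
        System.out.println(dump(literal));
    }
}
```

A correctly compiled constant from the question yields six code points (U+00C4 U+00E4 U+00D6 U+00F6 U+00DC U+00FC); a longer list suggests the source file was decoded with the wrong charset.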

java  unicode  compilation