Study Notes on Encoding in Java - Original Notes - imooc (慕课网)

Encoding in Java

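In GBK, a Chinese character takes 2 bytes and an English letter takes 1 byte.
In UTF-8, a common Chinese character takes 3 bytes and an English letter takes 1 byte.
Java strings are sequences of 16-bit char values (UTF-16). In UTF-16BE, the Chinese characters used here and English letters each take 2 bytes.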
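The byte counts above can be checked directly by measuring the length of the encoded byte array. This is only a minimal sketch: the class name `EncodingLengths` and the helper `byteLength` are made up for this illustration, and it assumes the JDK in use ships the GBK charset. The character 你 is written as the escape `\u4f60` so the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;

class EncodingLengths {
    // Hypothetical helper: number of bytes `text` occupies in the named charset.
    static int byteLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String chinese = "\u4f60"; // the character 你
        String english = "a";
        String[] charsets = {"GBK", "UTF-8", "UTF-16BE"};
        for (String cs : charsets) {
            // Prints one line per charset with the two byte counts.
            System.out.println(cs + ": Chinese char = " + byteLength(chinese, cs)
                    + " bytes, English letter = " + byteLength(english, cs) + " bytes");
        }
    }
}
```

Using `UTF-16BE` rather than `UTF-16` matters for the count: the plain `UTF-16` encoder in Java prepends a 2-byte byte-order mark, which would inflate every result by 2.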

Code example

package com.zhb.java;

public class EncodeDemo {
    public static void main(String[] args) throws Exception {
        String s = "你好abc";
        // GBK: each Chinese character takes 2 bytes, each English letter 1 byte
        System.out.println("-----GBK-----");
        byte[] bytes1 = s.getBytes("gbk");
        for (byte b : bytes1) {
            // Print each byte (promoted to int) in hexadecimal.
            // & 0xff clears the upper 24 bits produced by the promotion and
            // keeps only the low 8 bits; see the explanation further down.
            System.out.print(Integer.toHexString(b & 0xff) + " ");
        }

        System.out.println("\n-----UTF-8-----");
        // UTF-8: each Chinese character here takes 3 bytes, each English letter 1 byte
        byte[] bytes2 = s.getBytes("utf-8");
        for (byte b : bytes2) {
            System.out.print(Integer.toHexString(b & 0xff) + " ");
        }
        System.out.println("\n-----UTF-16BE-----");
        // Java strings are UTF-16 internally.
        // UTF-16BE: the Chinese characters and English letters here take 2 bytes each
        byte[] bytes3 = s.getBytes("utf-16be");
        for (byte b : bytes3) {
            System.out.print(Integer.toHexString(b & 0xff) + " ");
        }
        System.out.println();
        /**
         * When a byte sequence was produced with a particular encoding,
         * turning it back into a string must use that same encoding,
         * otherwise the text comes out garbled.
         */
        // Garbled: new String(bytes) decodes with the platform default charset,
        // which is not UTF-16BE here
        String str1 = new String(bytes3);
        System.out.println(str1);
        // Correct: decode with the same charset that produced the bytes
        String str2 = new String(bytes3, "utf-16be");
        System.out.println(str2);

    }

}

Output

-----GBK-----
c4 e3 ba c3 61 62 63
-----UTF-8-----
e4 bd a0 e5 a5 bd 61 62 63
-----UTF-16BE-----
4f 60 59 7d 0 61 0 62 0 63
O`Y} a b c
你好abc
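The garbled line depends on the platform default charset, so it can differ between machines. To make the effect reproducible, the sketch below names both charsets explicitly. The class `DecodeDemo` and the helper `transcode` are hypothetical names used only for this illustration, and ISO-8859-1 is picked as the "wrong" charset simply because it maps every byte to exactly one character.

```java
import java.nio.charset.Charset;

class DecodeDemo {
    // Hypothetical helper: encode `text` with one charset, decode the bytes with another.
    static String transcode(String text, String encodeWith, String decodeWith) {
        byte[] bytes = text.getBytes(Charset.forName(encodeWith));
        return new String(bytes, Charset.forName(decodeWith));
    }

    public static void main(String[] args) {
        String s = "\u4f60\u597dabc"; // the string 你好abc from the example above

        // Same charset on both sides: the original text comes back.
        String same = transcode(s, "UTF-16BE", "UTF-16BE");
        System.out.println(same.equals(s)); // true

        // Mismatched charset: every byte becomes its own character.
        String wrong = transcode(s, "UTF-16BE", "ISO-8859-1");
        System.out.println(wrong.equals(s));          // false
        System.out.println(wrong.length());           // 10, one char per byte
        System.out.println(wrong.startsWith("O`Y}")); // true
    }
}
```

This also explains the shape of the garbled output: the bytes 4f 60 59 7d happen to be the ASCII codes for O, `, Y and }, and each 0 byte in front of a, b and c is decoded as a NUL character, which many terminals render as a blank.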

What & 0xff does

I still did not fully understand 0xff: why does ANDing with it get rid of the upper 24 bits?

So I changed the code above to print the same byte both with and without & 0xff:

| With & 0xff | Without & 0xff |
| :-------- | --------: |
| e4 | ffffffe4 |

Why is that, and what does f stand for? I looked up how hexadecimal converts to binary:

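| Hexadecimal | Binary |
| :-------- | --------: |
| 0 | 0000 |
| 1 | 0001 |
| 2 | 0010 |
| 3 | 0011 |
| 4 | 0100 |
| 5 | 0101 |
| 6 | 0110 |
| 7 | 0111 |
| 8 | 1000 |
| 9 | 1001 |
| A | 1010 |
| B | 1011 |
| C | 1100 |
| D | 1101 |
| E | 1110 |
| F | 1111 |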
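The table can also be generated rather than memorized. A minimal sketch, with `HexBinaryTable` and `nibble` as names invented for this illustration:

```java
class HexBinaryTable {
    // Hypothetical helper: the 4-bit binary form of one hex digit value (0..15),
    // left-padded with zeros.
    static String nibble(int value) {
        String bits = Integer.toBinaryString(value & 0xf);
        return "0000".substring(bits.length()) + bits;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 16; i++) {
            // One row per hex digit: the digit, then its four bits.
            System.out.println(Integer.toHexString(i).toUpperCase() + " " + nibble(i));
        }
    }
}
```

Because each hex digit stands for exactly four bits, an 8-digit value such as ffffffe4 describes a full 32-bit int.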

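So the six leading f's are 24 binary 1s. They appear because a Java byte is signed: 0xe4 has its top bit set, so as a byte it is negative, and when it is promoted to a 32-bit int the sign bit is copied into the upper 24 bits.
ANDing with 0xff clears those 24 leading 1s while keeping the low 8 bits.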

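Recall how the & operation works on single bits:
1 & 1 = 1, 0 & 1 = 0, 1 & 0 = 0, 0 & 0 = 0
As a 32-bit int, 0xff is 0000 0000 0000 0000 0000 0000 1111 1111.
Whatever the upper 24 bits are, ANDing them with 0 gives 0; each of the lower 8 bits is ANDed with 1, so it keeps its original value.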
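The same reasoning can be watched in code. The sketch below is only an illustration; `MaskDemo`, `hexOfPromoted` and `hexOfMasked` are names made up for it. `Byte.toUnsignedInt` is the standard library method (Java 8 and later) that performs the same masking.

```java
class MaskDemo {
    // Hex of the byte after plain promotion to int (sign-extended).
    static String hexOfPromoted(byte b) {
        return Integer.toHexString(b);
    }

    // Hex of the byte after masking off the upper 24 bits.
    static String hexOfMasked(byte b) {
        return Integer.toHexString(b & 0xff);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xe4;
        System.out.println(b);                                 // -28: the byte is signed
        System.out.println(Integer.toBinaryString(b));         // 24 ones followed by 11100100
        System.out.println(Integer.toBinaryString(b & 0xff));  // 11100100
        System.out.println(hexOfPromoted(b));                  // ffffffe4
        System.out.println(hexOfMasked(b));                    // e4
        // The library method gives the same value as the manual mask.
        System.out.println(Byte.toUnsignedInt(b) == (b & 0xff)); // true
    }
}
```

For bytes whose top bit is 0, such as 0x61 (the letter a), the promotion adds only zeros, so the output is the same with or without the mask. The mask matters only for byte values from 0x80 to 0xff.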