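A character set is a defined collection of characters and symbols. GB2312-1980, for example, contains the ASCII characters, Japanese kana, Cyrillic (Russian) letters, and about 7,000 Chinese characters (6,763, to be exact). An encoding assigns each character in the set its own unique number. You can think of it as a two-column table: one column holds the shape of each character or symbol, the other holds its code.

"Code page" is traditional IBM terminology for "one character encoding table"; the table can be very large or very small. Examples are the IBM PC (OEM) code page and the Chinese GBK code page. Code page is the traditional IBM term used for a specific character encoding table: a mapping in which a sequence of bits, usually a single octet representing integer values 0 through 255, is associated with a specific character. IBM and Microsoft often allocate a code page number to a character set even if that charset is better known by another name.

The GB2312 code page is a double-byte code: a table in which both bytes of every code are greater than 0xA0, so the two-byte values start at 0xA1A1. A code page may also contain gaps, that is, a small number of codes with no character assigned.

UTF (Unicode Transformation Format) is the transfer encoding for Unicode: an encoding applied on top of the Unicode code values. The UTF encoding method is simple and can be computed with plain arithmetic, so the three bytes of a UTF-8 sequence mean little when read on their own. Unicode corresponds to the character set, and a UTF-8 byte sequence corresponds to a Unicode value. A computer's internal code consists of instruction codes, data, and addresses.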
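To make the "plain arithmetic" remark concrete, here is a minimal Python sketch of the UTF-8 calculation. The function name `utf8_encode` is hypothetical and chosen only for illustration; the `gb2312` codec is the one shipped with Python's standard library. The same character is also encoded as GB2312 to show that both bytes are above 0xA0.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 using only shifts and masks.

    Hypothetical helper for illustration; in practice use str.encode("utf-8").
    """
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x80:        # 1 byte:  0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


if __name__ == "__main__":
    ch = "中"
    cp = ord(ch)                       # Unicode value of the character
    print(f"U+{cp:04X}")               # the code point in the usual notation
    print(utf8_encode(cp).hex(" "))    # UTF-8 bytes computed by hand
    print(ch.encode("utf-8").hex(" ")) # UTF-8 bytes from the built-in codec
    print(ch.encode("gb2312").hex(" "))  # GB2312: two bytes, both above 0xA0
```

The hand-written function and the built-in codec should agree for every valid code point. The GB2312 bytes for the same character are unrelated to its Unicode value, which is why converting between the two needs a lookup table rather than a formula.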