前言
java序列化与反序列化应该是非常基本的知识点,但细想起来还是一头雾水, 不知道序列化与反序列化到底底层是如何实现的,所以特意花了些时间整理这篇文章。所以你如果还只是停留在使用和知道这么一个知识点那么这篇文章对你有一定帮助,看完这篇文章你能得到哪些东西呢?我的本文也是根据这些东西层层递进进行书写,归总为如下几条: 1.序列化与反序列化的概念 2.序列化与反序列化的实现与原理 3.为什么要序列化与反序列化,它的应用场景是什么? 4.序列化与反序列化底层是如何实现的? 5.阅读序列化反序列化源代码
废话不多说,撸起袖子就是开写。
1.序列化与反序列化的概念
序列化:Java**对象**转换为**字节序列**的过程。反序列化:Java把**字节序列**恢复为**对象**的过程。
2.序列化与反序列化的实现与原理
对象可序列化条件:
一个类的对象要想序列化成功,必须满足两个条件:
该类必须实现 java.io.Serializable 对象。
该类的所有属性必须是可序列化的。如果有一个属性不是可序列化的,则该属性必须注明是短暂的。
如果你想知道一个 Java 标准类是否是可序列化的,请查看该类的文档。检验一个类的实例是否能序列化十分简单, 只需要查看该类有没有实现 java.io.Serializable接口。下面是一个序列化的代码:
// 序列化代码
// Employee.javaclass Employee implements Serializable { public String name; public String address; //该属性为不可序列化的,所以声明为短暂的 public transient int SSN; public Number number;}// 测试类public class DeserializeTest{ public static void main(String[] args) { Employee e = new Employee(); e.name = "MikeHuang"; e.address = "XXXXXXXXXXXX"; e.SSN = 12345678; e.number = 110; try { FileOutputStream fileOut = new FileOutputStream("/tmp/employee.ser"); ObjectOutputStream out = new ObjectOutputStream(fileOut); out.writeObject(e); out.close(); fileOut.close(); System.out.printf("Serialized data is saved in /tmp/Employee.ser"); }catch(IOException i) { i.printStackTrace(); } }}
上面测试代码执行后会在/tmp目录下多出一个employee.ser文件,我于是好奇打开了这文件,看里面都是什么内容。里面还是能够看出一些信息的比如说类名、可序列化的属性名、可序列化属性的值与类型。你可以自行打开查看。那么通过上面的代码我们总结一下,如何实现对象序列化: a.必须实现 java.io.Serializable,所有属性必须是可序列化的,属性不是可序列化的,则该属性必须注明是短暂的 b.通过ObjectOutputStream对象的writeObject方法将对象转换为字节序列。writeObject的源码,会在5.源码上贴出。
// 反序列化代码
// 反序列化 public static void main(String[] args) { Employee e = null; try { FileInputStream fileIn = new FileInputStream("/tmp/employee.ser"); ObjectInputStream in = new ObjectInputStream(fileIn); e = (Employee) in.readObject(); in.close(); fileIn.close(); }catch(IOException i) { i.printStackTrace(); return; }catch(ClassNotFoundException c) { System.out.println("Employee class not found"); c.printStackTrace(); return; } System.out.println("Deserialized Employee..."); System.out.println("Name: " + e.name); System.out.println("Address: " + e.address); System.out.println("SSN: " + e.SSN); System.out.println("Number: " + e.number); }
通过上面的代码,执行后SSN的值输出为0,这是因为该属性声明为暂时的,所以它是不可序列化的属性,也就没有保存在employee.ser中,你可以通过打开搜索SSN关键字来确认。在反序列化时,该属性的值110也就没有,而是0.通过上面的代码我们可以知道要将**字节序列转化为对象,需要使用ObjectInputStream的readObject方法**,具体readObject源码我们会在5处进行贴出来。
3.为什么要序列化与反序列化,它的应用场景是什么以及注意事项
说白了就是这东西能干嘛用?有什么便利提供给我们。总的来说可以归结为以下几点:
(1)永久性保存对象,保存对象的字节序列到本地文件或者数据库中;
(2)通过序列化以字节流的形式使对象在网络中进行传递和接收;
(3)通过序列化在进程间传递对象;
注意事项:
1.序列化时,只对对象的状态进行保存,而不管对象的方法;
2.当一个父类实现序列化,子类自动实现序列化,不需要显式实现Serializable接口;
3.当一个对象的实例变量引用其他对象,序列化该对象时也把引用对象进行序列化;
4.并非所有的对象都可以序列化,至于为什么不可以,有很多原因了,比如:
安全方面的原因,比如一个对象拥有private,public等field,对于一个要传输的对象,比如写到文件,或者进行RMI传输等等,在序列化进行传输的过程中,这个对象的private等域是不受保护的;
资源分配方面的原因,比如socket,thread类,如果可以序列化,进行传输或者保存,也无法对他们进行重新的资源分配,而且,也是没有必要这样实现;
5.声明为static和transient类型的成员数据不能被序列化。因为static代表类的状态,transient代表对象的临时数据。
6.序列化运行时使用一个称为 serialVersionUID 的版本号与每个可序列化类相关联,该序列号在反序列化过程中用于验证序列化对象的发送者和接收者是否为该对象加载了与序列化兼容的类。为它赋予明确的值。显式地定义serialVersionUID有两种用途:
在某些场合,希望类的不同版本对序列化兼容,因此需要确保类的不同版本具有相同的serialVersionUID;
在某些场合,不希望类的不同版本对序列化兼容,因此需要确保类的不同版本具有不同的serialVersionUID。
7.Java有很多基础类已经实现了serializable接口,比如String,Vector等。但是也有一些没有实现serializable接口的;
8.如果一个对象的成员变量是一个对象,那么这个对象的数据成员也会被保存!这是能用序列化解决深拷贝的重要原因;
4.序列化与反序列化底层是如何实现的?
这一节,我们带着问题去一个一个的解开谜团,看到代码的最终本质,从而加深我们队序列化反序列化的理解。
a.为什么一定要实现这个java.io.Serializable才能序列化?
我们可以通过去除Employee的Serializable实现,你会发现执行报异常,
异常如下:
java.io.NotSerializableException: Employee at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1180) at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:346) at DeserializeDemo.saveSer(DeserializeDemo.java:25) at DeserializeDemo.main(DeserializeDemo.java:64)
那么我们就去看下writeObject0中的代码片段如下:
// remaining cases if (obj instanceof String) { writeString((String) obj, unshared); } else if (cl.isArray()) { writeArray(obj, desc, unshared); } else if (obj instanceof Enum) { writeEnum((Enum) obj, desc, unshared); } else if (obj instanceof Serializable) { writeOrdinaryObject(obj, desc, unshared); } else { if (extendedDebugInfo) { throw new NotSerializableException( cl.getName() + "\n" + debugInfoStack.toString()); } else { throw new NotSerializableException(cl.getName()); } }
会判断要被序列化的类是否是String、Enum、Array和Serializable类型,如果不是则直接抛出NotSerializableException。
b.String、Enum、Array都实现了Serializable为啥要单独拿出来进行序列化呢?
其实这个也很简单,因为ObjectOutPutStream给String、Enum、Array对象数据结构已经做了特殊的序列化的方法,
而除了上述三个外,唯一能够实现的就是通过实现Serializable来达到序列化。
c.Serializable这个东西一定要,那必须了解一下,这里面到底是个啥样的?
/** * 说明文字已经去掉了,如果要看可以自行查看源码, * 其实这里的说明也说明了如何实现序列化。 * @author unascribed * @see java.io.ObjectOutputStream * @see java.io.ObjectInputStream * @see java.io.ObjectOutput * @see java.io.ObjectInput * @see java.io.Externalizable * @since JDK1.1 */public interface Serializable {}
这只是一个空接口,实现这个接口只是为了标识为可序列化,所有实现了这个接口的对象,都会有一个serialVersionUID,这个东西使用与确定序列化与反序列化是否匹配的一个标识。具体的说明在 Serializable接口
中有说明,我把这部分贴出来如下,如果需查看全部,请进入源码自行查看:
/*** This readResolve method follows the same invocation rules and * accessibility rules as writeReplace.<p> * * The serialization runtime associates with each serializable class a version * number, called a serialVersionUID, which is used during deserialization to * verify that the sender and receiver of a serialized object have loaded * classes for that object that are compatible with respect to serialization. * If the receiver has loaded a class for the object that has a different * serialVersionUID than that of the corresponding sender's class, then * deserialization will result in an {@link InvalidClassException}. A * serializable class can declare its own serialVersionUID explicitly by * declaring a field named <code>"serialVersionUID"</code> that must be static, * final, and of type <code>long</code>:<p> * * <PRE> * ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L; * </PRE> * * If a serializable class does not explicitly declare a serialVersionUID, then * the serialization runtime will calculate a default serialVersionUID value * for that class based on various aspects of the class, as described in the * Java(TM) Object Serialization Specification. However, it is <em>strongly * recommended</em> that all serializable classes explicitly declare * serialVersionUID values, since the default serialVersionUID computation is * highly sensitive to class details that may vary depending on compiler * implementations, and can thus result in unexpected * <code>InvalidClassException</code>s during deserialization. Therefore, to * guarantee a consistent serialVersionUID value across different java compiler * implementations, a serializable class must declare an explicit * serialVersionUID value. It is also strongly advised that explicit * serialVersionUID declarations use the <code>private</code> modifier where * possible, since such declarations apply only to the immediately declaring * class--serialVersionUID fields are not useful as inherited members. Array * classes cannot declare an explicit serialVersionUID, so they always have * the default computed value, but the requirement for matching * serialVersionUID values is waived for array classes. * */
举例:String
private static final long serialVersionUID = -6849794470754667710L;
d.Employee类中没有书写,那么它是什么时候加上版本号的呢?
当实现java.io.Serializable接口的类没有显式地定义一个serialVersionUID变量时候,Java序列化机制会根据编译的Class自动生成一个serialVersionUID作序列化版本比较用,这种情况下,如果Class文件(类名,方法明等)没有发生变化(增加空格,换行,增加注释等等),就算再编译多次,serialVersionUID也不会变化的。如果不显示的去写版本号,那么就可能造成反序列化时,因为类改变了(怎加了方法,修改了方法名等)而生成了不一样的版本号,那么原先序列化的字节序列将无法转成该版本的对象,因为版本不一致嘛。所以一定要显示的去设置版本号。
e.如果定制序列化策略,该如何实现呢?
回答这个问题前,我们先来看下数组(ArrayList)这个类。
public class ArrayList<E> extends AbstractList<E> implements List<E>, RandomAccess, Cloneable, java.io.Serializable{ private static final long serialVersionUID = 8683452581122892189L; /** * The array buffer into which the elements of the ArrayList are stored. * The capacity of the ArrayList is the length of this array buffer. */ private transient Object[] elementData; /** * The size of the ArrayList (the number of elements it contains). * * @serial */ private int size;}
private transient Object[] elementData;说明这个数据是临时数据,不能序列化的,但实际上操作,我们却能够序列化。这是为什么?
在序列化过程中,如果被序列化的类中定义了writeObject 和 readObject 方法,虚拟机会试图调用对象类里的 writeObject 和 readObject 方法,进行用户自定义的序列化和反序列化。
如果没有这样的方法,则默认调用是 ObjectOutputStream 的 defaultWriteObject 方法以及 ObjectInputStream 的 defaultReadObject 方法。
用户自定义的 writeObject 和 readObject 方法可以允许用户控制序列化的过程,比如可以在序列化的过程中动态改变序列化的数值。
5.阅读序列化反序列化源代码
// 序列化源码
/** * Write the specified object to the ObjectOutputStream. The class of the * object, the signature of the class, and the values of the non-transient * and non-static fields of the class and all of its supertypes are * written. Default serialization for a class can be overridden using the * writeObject and the readObject methods. Objects referenced by this * object are written transitively so that a complete equivalent graph of * objects can be reconstructed by an ObjectInputStream. * * <p>Exceptions are thrown for problems with the OutputStream and for * classes that should not be serialized. All exceptions are fatal to the * OutputStream, which is left in an indeterminate state, and it is up to * the caller to ignore or recover the stream state. * * @throws InvalidClassException Something is wrong with a class used by * serialization. * @throws NotSerializableException Some object to be serialized does not * implement the java.io.Serializable interface. * @throws IOException Any exception thrown by the underlying * OutputStream. */ public final void writeObject(Object obj) throws IOException { if (enableOverride) { writeObjectOverride(obj); return; } try { writeObject0(obj, false); } catch (IOException ex) { if (depth == 0) { writeFatalException(ex); } throw ex; } } /** * Underlying writeObject/writeUnshared implementation. */ private void writeObject0(Object obj, boolean unshared) throws IOException { boolean oldMode = bout.setBlockDataMode(false); depth++; try { // handle previously written and non-replaceable objects int h; if ((obj = subs.lookup(obj)) == null) { writeNull(); return; } else if (!unshared && (h = handles.lookup(obj)) != -1) { writeHandle(h); return; } else if (obj instanceof Class) { writeClass((Class) obj, unshared); return; } else if (obj instanceof ObjectStreamClass) { writeClassDesc((ObjectStreamClass) obj, unshared); return; } // check for replacement object Object orig = obj; Class cl = obj.getClass(); ObjectStreamClass desc; for (;;) { // REMIND: skip this check for strings/arrays? Class repCl; desc = ObjectStreamClass.lookup(cl, true); if (!desc.hasWriteReplaceMethod() || (obj = desc.invokeWriteReplace(obj)) == null || (repCl = obj.getClass()) == cl) { break; } cl = repCl; } if (enableReplace) { Object rep = replaceObject(obj); if (rep != obj && rep != null) { cl = rep.getClass(); desc = ObjectStreamClass.lookup(cl, true); } obj = rep; } // if object replaced, run through original checks a second time if (obj != orig) { subs.assign(orig, obj); if (obj == null) { writeNull(); return; } else if (!unshared && (h = handles.lookup(obj)) != -1) { writeHandle(h); return; } else if (obj instanceof Class) { writeClass((Class) obj, unshared); return; } else if (obj instanceof ObjectStreamClass) { writeClassDesc((ObjectStreamClass) obj, unshared); return; } } // remaining cases if (obj instanceof String) { writeString((String) obj, unshared); } else if (cl.isArray()) { writeArray(obj, desc, unshared); } else if (obj instanceof Enum) { writeEnum((Enum) obj, desc, unshared); } else if (obj instanceof Serializable) { writeOrdinaryObject(obj, desc, unshared); } else { if (extendedDebugInfo) { throw new NotSerializableException( cl.getName() + "\n" + debugInfoStack.toString()); } else { throw new NotSerializableException(cl.getName()); } } } finally { depth--; bout.setBlockDataMode(oldMode); } }
备注:
(1)将对象实例相关的类元数据输出。
(2)递归地输出类的超类描述直到不再有超类。
(3)类元数据完了以后,开始从最顶层的超类开始输出对象实例的实际数据值。
(4)从上至下递归输出实例的数据
//反序列化源码
/** * Read an object from the ObjectInputStream. The class of the object, the * signature of the class, and the values of the non-transient and * non-static fields of the class and all of its supertypes are read. * Default deserializing for a class can be overriden using the writeObject * and readObject methods. Objects referenced by this object are read * transitively so that a complete equivalent graph of objects is * reconstructed by readObject. * * <p>The root object is completely restored when all of its fields and the * objects it references are completely restored. At this point the object * validation callbacks are executed in order based on their registered * priorities. The callbacks are registered by objects (in the readObject * special methods) as they are individually restored. * * <p>Exceptions are thrown for problems with the InputStream and for * classes that should not be deserialized. All exceptions are fatal to * the InputStream and leave it in an indeterminate state; it is up to the * caller to ignore or recover the stream state. * * @throws ClassNotFoundException Class of a serialized object cannot be * found. * @throws InvalidClassException Something is wrong with a class used by * serialization. * @throws StreamCorruptedException Control information in the * stream is inconsistent. * @throws OptionalDataException Primitive data was found in the * stream instead of objects. * @throws IOException Any of the usual Input/Output related exceptions. */ public final Object readObject() throws IOException, ClassNotFoundException { if (enableOverride) { return readObjectOverride(); } // if nested read, passHandle contains handle of enclosing object int outerHandle = passHandle; try { Object obj = readObject0(false); handles.markDependency(outerHandle, passHandle); ClassNotFoundException ex = handles.lookupException(passHandle); if (ex != null) { throw ex; } if (depth == 0) { vlist.doCallbacks(); } return obj; } finally { passHandle = outerHandle; if (closed && depth == 0) { clear(); } } } /** * Underlying readObject implementation. */ private Object readObject0(boolean unshared) throws IOException { boolean oldMode = bin.getBlockDataMode(); if (oldMode) { int remain = bin.currentBlockRemaining(); if (remain > 0) { throw new OptionalDataException(remain); } else if (defaultDataEnd) { /* * Fix for 4360508: stream is currently at the end of a field * value block written via default serialization; since there * is no terminating TC_ENDBLOCKDATA tag, simulate * end-of-custom-data behavior explicitly. */ throw new OptionalDataException(true); } bin.setBlockDataMode(false); } byte tc; while ((tc = bin.peekByte()) == TC_RESET) { bin.readByte(); handleReset(); } depth++; try { switch (tc) { case TC_NULL: return readNull(); case TC_REFERENCE: return readHandle(unshared); case TC_CLASS: return readClass(unshared); case TC_CLASSDESC: case TC_PROXYCLASSDESC: return readClassDesc(unshared); case TC_STRING: case TC_LONGSTRING: return checkResolve(readString(unshared)); case TC_ARRAY: return checkResolve(readArray(unshared)); case TC_ENUM: return checkResolve(readEnum(unshared)); case TC_OBJECT: return checkResolve(readOrdinaryObject(unshared)); case TC_EXCEPTION: IOException ex = readFatalException(); throw new WriteAbortedException("writing aborted", ex); case TC_BLOCKDATA: case TC_BLOCKDATALONG: if (oldMode) { bin.setBlockDataMode(true); bin.peek(); // force header read throw new OptionalDataException( bin.currentBlockRemaining()); } else { throw new StreamCorruptedException( "unexpected block data"); } case TC_ENDBLOCKDATA: if (oldMode) { throw new OptionalDataException(true); } else { throw new StreamCorruptedException( "unexpected end of block data"); } default: throw new StreamCorruptedException( String.format("invalid type code: %02X", tc)); } } finally { depth--; bin.setBlockDataMode(oldMode); } }
下面是知识扩展:有兴趣的同学可以看看,非常的不错。
美团技术团队:序列化与反序列化