Kryo and FST Serialization

Using efficient Java serialization (Kryo and FST) in Dubbo

Table of contents

  • Serialization talk
  • Enable Kryo and FST
  • Register the class to be serialized
  • No parameter constructor and Serializable interface
  • Serialization performance analysis and testing
    • Test environment
    • Test script
    • Comparison of byte sizes generated by different serializations in Dubbo RPC
    • Comparison of response time and throughput of different serializations in Dubbo RPC
  • Future

Serialization talk

Dubbo RPC is the core high-performance, high-throughput remote call mechanism in the Dubbo system. I like to describe it as multiplexed calls over long-lived TCP connections. Simply put:

  • Long-lived connections: avoid creating a new TCP connection for every call, which improves the response speed of each call
  • Multiplexing: a single TCP connection can carry multiple interleaved request and response messages, which reduces connection idle time, lowers the number of network connections needed at a given level of concurrency, and improves system throughput
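
The multiplexing idea can be sketched as follows. This is a minimal illustration, not Dubbo's actual implementation: it assumes every request carries a numeric ID that the response echoes back, and the class and method names (MultiplexedChannel, Call, send, onResponse) are hypothetical. The network itself is replaced by direct method calls.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: matches responses to requests that share one connection.
final class MultiplexedChannel {

    // One in-flight call: its request ID and the future that receives the reply.
    static final class Call {
        final long id;
        final CompletableFuture<String> reply = new CompletableFuture<>();

        Call(long id) {
            this.id = id;
        }
    }

    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Records the call as pending. A real client would now write (id, payload)
    // to the shared connection without waiting for earlier replies.
    Call send(String payload) {
        Call call = new Call(nextId.incrementAndGet());
        pending.put(call.id, call.reply);
        return call;
    }

    // Invoked when a response frame arrives; replies may arrive in any order.
    void onResponse(long id, String payload) {
        CompletableFuture<String> reply = pending.remove(id);
        if (reply != null) {
            reply.complete(payload);
        }
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        MultiplexedChannel channel = new MultiplexedChannel();
        Call first = channel.send("request-1");
        Call second = channel.send("request-2");
        // The second reply arrives before the first on the same connection.
        channel.onResponse(second.id, "reply-2");
        channel.onResponse(first.id, "reply-1");
        System.out.println(first.reply.join() + ", " + second.reply.join());
    }
}
```

Because each caller waits only on its own future, a slow reply does not block other calls that share the connection.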

Dubbo RPC is mainly used for remote calls between two Dubbo systems and is especially suited to Internet scenarios with high concurrency and small payloads.

Serialization also plays a vital role in the response speed, throughput, and network bandwidth consumption of remote calls, and is one of the most critical factors in improving the performance of distributed systems.

Dubbo RPC supports several serialization methods, for example:

  1. Dubbo serialization: Alibaba's own efficient Java serialization implementation, which is not yet mature; Alibaba does not recommend using it in a production environment
  2. Hessian2 serialization: Hessian is an efficient cross-language binary serialization method. What Dubbo uses is not the original hessian2 serialization, however, but hessian lite, a version modified by Alibaba, and it is the default serialization method of Dubbo RPC
  3. JSON serialization: there are currently two implementations, one using Alibaba's fastjson library and the other using a simple JSON library implemented by Dubbo itself. Neither implementation is particularly mature, and JSON, being a text format, generally performs worse than the two binary serializations above
  4. Java serialization: implemented mainly with the serialization built into the JDK; its performance is not ideal

In general, the performance of these four serialization methods decreases from top to bottom. Dubbo RPC pursues high-performance remote calls, so only the two efficient methods, 1 and 2, are really suitable. Since dubbo serialization (1) is still immature, only hessian2 (2) is usable in practice, which is why Dubbo RPC uses hessian2 serialization by default.
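
To give a feel for why the JDK's built-in serialization sits at the bottom of this ranking, the sketch below compares the bytes it produces for a tiny object with the bytes of the field values alone. It uses only the JDK; the Point class is a hypothetical payload, and the comparison illustrates metadata overhead only, not the behavior of hessian2 or dubbo serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class JdkSerializationOverhead {

    // Hypothetical payload class used only for this illustration.
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Number of bytes produced by the JDK's built-in serialization,
    // which includes a stream header and a class descriptor.
    static int jdkSize(Point p) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // Number of bytes needed for the field values alone (two 4-byte ints).
    static int rawFieldSize(Point p) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4);
        System.out.println("JDK serialization: " + jdkSize(p) + " bytes");
        System.out.println("Field data only:   " + rawFieldSize(p) + " bytes");
    }
}
```

For small objects, the class metadata written by JDK serialization outweighs the data itself, which matters in the high-concurrency, small-payload scenarios described above.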

But hessian is an older serialization implementation, and because it is cross-language, it is not optimized for Java alone. Dubbo RPC is in practice a remote call from Java to Java, so there is no real need for cross-language serialization (although cross-language serialization is certainly not ruled out).

In recent years, new and efficient serialization methods have emerged one after another, steadily raising the upper limit of serialization performance. The most typical ones include:

  • Specifically for the Java language: Kryo, FST, etc.
  • Cross-language: Protostuff, ProtoBuf, Thrift, Avro, MsgPack, etc.

Most of these serialization methods perform significantly better than hessian2 (and even than the still immature dubbo serialization).

In view of this, we are introducing two efficient Java serialization implementations, Kryo and FST, into Dubbo to gradually replace hessian2.

Of the two, Kryo is a very mature serialization implementation that is widely used at Twitter, Groupon, and Yahoo and in many well-known open source projects (such as Hive and Storm). FST is a newer implementation that still lacks enough mature use cases, but I think it is very promising.

For production applications, I recommend Kryo as the preferred choice for now.

Enable Kryo and FST

Using Kryo and FST is simple. First add the corresponding dependencies (more plugins are listed under Dubbo SPI Extensions).

Then set the serialization attribute in the XML configuration of Dubbo RPC, choosing either Kryo or FST:

<dubbo:protocol name="dubbo" serialization="kryo"/>
<dubbo:protocol name="dubbo" serialization="fst"/>

Register the class to be serialized

To get the full performance benefit of Kryo and FST, it is best to register the classes that need to be serialized with the Dubbo system. For example, we can implement the following callback interface:

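import java.util.Collection;
import java.util.LinkedList;
import java.util.List;

// SerializationOptimizer comes from Dubbo; the domain classes come from your own project.
public class SerializationOptimizerImpl implements SerializationOptimizer {

    // Returns the classes that should be registered for serialization.
    public Collection<Class> getSerializableClasses() {
        List<Class> classes = new LinkedList<Class>();
        classes.add(BidResponse.class);
        classes.add(Device.class);
        classes.add(Geo.class);
        classes.add(Impression.class);
        return classes;
    }
}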

Then reference the optimizer in the XML configuration:

<dubbo:protocol name="dubbo" serialization="kryo" optimizer="org.apache.dubbo.demo.SerializationOptimizerImpl"/>

After these classes are registered, serialization performance may improve greatly, especially when serializing a small number of nested objects.
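
One reason registration helps is that a registered class can be identified on the wire by a short numeric ID instead of its full class name. The sketch below illustrates this with a made-up encoding built on the JDK only. ClassTagWriter and its byte layout are hypothetical and are not the actual wire format of Kryo or FST.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class registry; the byte layout is invented for illustration.
final class ClassTagWriter {
    private final Map<Class<?>, Integer> ids = new HashMap<>();

    // Assigns the next free ID to the class unless it is already registered.
    void register(Class<?> type) {
        ids.putIfAbsent(type, ids.size());
    }

    // Returns the bytes that identify the class of one serialized object.
    byte[] writeClassTag(Class<?> type) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            Integer id = ids.get(type);
            if (id != null) {
                out.writeBoolean(true);        // marker: registered class
                out.writeShort(id);            // compact numeric ID
            } else {
                out.writeBoolean(false);       // marker: unregistered class
                out.writeUTF(type.getName());  // full class name
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        ClassTagWriter writer = new ClassTagWriter();
        writer.register(java.util.ArrayList.class);
        System.out.println("registered:   "
                + writer.writeClassTag(java.util.ArrayList.class).length + " bytes");
        System.out.println("unregistered: "
                + writer.writeClassTag(java.util.LinkedList.class).length + " bytes");
    }
}
```

In this toy encoding the registered class costs 3 bytes per object, while the unregistered one costs 23 bytes because its 20-character name is written out. When the objects themselves are small, that difference is a large share of the message.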

Of course, serializing one class may pull in many others, such as the Java collection classes. To handle this, we automatically register common JDK classes, so you don’t need to register them again (registering them again has no effect), including:

GregorianCalendar
StringBuilder

Since registering classes is purely a performance optimization, it doesn’t matter if you forget to register some of them. In fact, even without registering any classes, Kryo and FST generally perform better than hessian and dubbo serialization.

Someone may ask why these classes are not registered through configuration files. First, there are often many classes to register, which makes configuration files lengthy. Second, without good IDE support, writing and refactoring configuration files is much more troublesome than working with Java classes. Finally, the registered classes generally do not need to be changed dynamically after the project is compiled and packaged.

Some people also find manual registration cumbersome and ask whether classes could be marked with an annotation and then discovered and registered automatically. The limitation of annotations is that they can only mark classes you are able to modify, and many classes referenced during serialization are likely to be ones you cannot modify (such as classes from third-party libraries, the JDK, or other projects). Adding annotations also slightly “pollutes” the code and makes the application code a little more dependent on the framework.

Besides annotations, we can also consider other ways to register serialized classes automatically, such as scanning the classpath, discovering classes that implement the Serializable interface (or even Externalizable), and registering them. Since there may be a great many Serializable classes on the classpath, we can also consider using package prefixes to limit the scanning range.

With any automatic registration mechanism, it is especially important to ensure that the service provider and the consumer register classes in the same order (or with the same IDs) to avoid misalignment. After all, the set of classes that can be discovered and registered may differ between the two ends.
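
One possible way to make the order independent of discovery order is to sort the discovered classes by name before assigning IDs. The sketch below is a minimal illustration under that assumption; DeterministicRegistration and assignIds are hypothetical names, not part of Dubbo, Kryo, or FST.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: derives registration IDs from sorted class names.
final class DeterministicRegistration {

    // Assigns IDs in class-name order so that the order in which classes
    // were discovered does not influence the result.
    static Map<Class<?>, Integer> assignIds(Collection<Class<?>> discovered) {
        List<Class<?>> sorted = new ArrayList<>(discovered);
        sorted.sort(Comparator.comparing((Class<?> type) -> type.getName()));
        Map<Class<?>, Integer> ids = new LinkedHashMap<>();
        for (Class<?> type : sorted) {
            ids.putIfAbsent(type, ids.size()); // duplicates keep their first ID
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Class<?>> provider = new ArrayList<>();
        provider.add(String.class);
        provider.add(java.util.Date.class);
        provider.add(Integer.class);

        List<Class<?>> consumer = new ArrayList<>();
        consumer.add(Integer.class);
        consumer.add(String.class);
        consumer.add(java.util.Date.class);

        System.out.println(assignIds(provider).equals(assignIds(consumer)));
    }
}
```

Sorting only removes the dependence on discovery order. If the two ends discover different sets of classes, the IDs still diverge, so the sets themselves also have to be kept in agreement.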

No parameter constructor and Serializable interface

If the class to be serialized does not have a no-argument constructor, Kryo’s serialization performance drops sharply, because in that case we transparently fall back to Java serialization under the hood. It is therefore a best practice to add a no-argument constructor to every serialized class wherever possible (a Java class that defines no constructor of its own has a no-argument constructor by default).
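
A quick way to find classes that would trigger this fallback is to check for a no-argument constructor with reflection. The following is a minimal sketch using only the JDK; NoArgConstructorCheck and the two nested sample classes are hypothetical.

```java
// Hypothetical helper for spotting classes without a no-argument constructor.
final class NoArgConstructorCheck {

    // Returns true if the class declares a constructor without parameters
    // (including the implicit default constructor), regardless of visibility.
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // No constructor declared, so the compiler adds a no-argument one.
    static final class WithDefault {
    }

    // Declaring a constructor with parameters removes the implicit one.
    static final class WithoutDefault {
        final int value;

        WithoutDefault(int value) {
            this.value = value;
        }
    }

    public static void main(String[] args) {
        System.out.println("WithDefault:    " + hasNoArgConstructor(WithDefault.class));
        System.out.println("WithoutDefault: " + hasNoArgConstructor(WithoutDefault.class));
    }
}
```

Such a check could be run over the classes returned by a SerializationOptimizer implementation, for example in a unit test.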

Kryo and FST do not require serialized classes to implement the Serializable interface, but we still recommend that every serialized class implement it, because doing so maintains compatibility with Java serialization and dubbo serialization. It also makes it possible to adopt some of the automatic registration mechanisms described above in the future.

Serialization performance analysis and testing

This article is mainly about serialization, but for performance analysis and testing we do not measure each serialization method in isolation. Instead, we compare them inside Dubbo RPC, because that is closer to real usage.

Test environment

Roughly as follows:

  • Two independent servers
  • Quad-core Intel(R) Xeon(R) CPU E5-2603 0 @ 1.80GHz
  • 8 GB memory
  • The network between virtual machines passes through a 100 Mbps switch
  • CentOS 5
  • JDK 7
  • Tomcat 7
  • JVM parameters -server -Xms1g -Xmx1g -XX:PermSize=64M -XX:+UseConcMarkSweepGC

The test environment is limited, so the current results may not be very authoritative or representative.

Test script

The test stays close to Dubbo’s own benchmarks:

10 concurrent clients continuously making requests:

  • Pass in a nested complex object (with only a small amount of data in each field), do no processing, and return it as is
  • Pass in a 50K string, do no processing, and return it as is (TODO: the result has not been listed yet)

Run a 5-minute performance test. (Quoting Dubbo’s own test considerations: “It mainly examines the performance of serialization and network IO, so the server does not have any business logic. The reason for using 10 concurrent clients is that, under high concurrency, the RPC protocol may use enough CPU to hit a bottleneck.”)
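
The shape of such a test can be sketched with the JDK alone. This is a simplified illustration and not the script used for the results in this article: ConcurrentLoadTest is a hypothetical name, and the Runnable stands in for the remote call, which a real test would replace with an invocation of the Dubbo service.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical load-test skeleton: N clients call back to back for a fixed time.
final class ConcurrentLoadTest {

    static final class Result {
        final long calls;
        final double avgResponseMillis;
        final double tps;

        Result(long calls, double avgResponseMillis, double tps) {
            this.calls = calls;
            this.avgResponseMillis = avgResponseMillis;
            this.tps = tps;
        }
    }

    static Result run(int clients, long durationMillis, Runnable call) {
        LongAdder calls = new LongAdder();
        LongAdder totalNanos = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        long start = System.nanoTime();
        long deadline = start + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        for (int i = 0; i < clients; i++) {
            pool.execute(() -> {
                // Each client issues the next request as soon as the last returns.
                while (System.nanoTime() - deadline < 0) {
                    long before = System.nanoTime();
                    call.run();
                    totalNanos.add(System.nanoTime() - before);
                    calls.increment();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(durationMillis + 10_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        double elapsedSeconds = (System.nanoTime() - start) / 1e9;
        long count = calls.sum();
        double avgMillis = count == 0 ? 0.0 : totalNanos.sum() / 1e6 / count;
        return new Result(count, avgMillis, count / elapsedSeconds);
    }

    public static void main(String[] args) {
        // Stand-in workload; the article's test used 10 clients for 5 minutes.
        Result result = run(10, 1_000, () -> Math.sqrt(System.nanoTime()));
        System.out.printf("calls=%d avg=%.4f ms tps=%.0f%n",
                result.calls, result.avgResponseMillis, result.tps);
    }
}
```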

Comparison of byte sizes generated by different serializations in Dubbo RPC

The number of bytes generated by serialization is a relatively deterministic metric, and it determines the network transmission time and bandwidth consumption of a remote call.

The results for complex objects are as follows (lower numbers are better):

Serialization Implementation    Request Bytes    Response Bytes
Dubbo Serialization             430              186
Java Serialization              963              630

Comparison of response time and throughput of different serializations in Dubbo RPC

Remote call method            Average response time    Average TPS (transactions per second)
REST: Jetty + JSON            7.806                    1280
REST: Tomcat + JSON           2.082                    4796
REST: Netty + JSON            2.182                    4576
Dubbo: FST                    1.211                    8244
Dubbo: kryo                   1.182                    8444
Dubbo: dubbo serialization    1.43                     6982
Dubbo: hessian2               1.49                     6701
Dubbo: fastjson               1.572                    6352
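
The two columns are closely related. With a fixed number of clients that each send requests back to back, throughput is roughly the number of clients divided by the average response time. The sketch below checks this against two rows of the table, assuming the response times are in milliseconds (the unit is not stated in the table) and using the 10 concurrent clients from the test script. ThroughputCheck is a hypothetical name.

```java
// Hypothetical consistency check between response time and TPS.
final class ThroughputCheck {

    // Expected requests per second when each of `clients` callers issues
    // its next request as soon as the previous one returns.
    static double expectedTps(int clients, double avgResponseMillis) {
        return clients / (avgResponseMillis / 1000.0);
    }

    // Relative difference between the measured and the expected value.
    static double relativeError(double measured, double expected) {
        return Math.abs(measured - expected) / measured;
    }

    public static void main(String[] args) {
        int clients = 10; // from the test script
        double kryo = expectedTps(clients, 1.182);     // measured TPS: 8444
        double hessian2 = expectedTps(clients, 1.49);  // measured TPS: 6701
        System.out.printf("kryo:     expected %.0f, error %.2f%%%n",
                kryo, 100 * relativeError(8444, kryo));
        System.out.printf("hessian2: expected %.0f, error %.2f%%%n",
                hessian2, 100 * relativeError(6701, hessian2));
    }
}
```

Under this assumption the estimates land within about 1% of the measured TPS values, so in this test a lower response time translates almost directly into higher throughput.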



Test Summary

Based on the current results, Kryo and FST are a significant improvement over the original serialization methods in Dubbo RPC in generated byte size, average response time, and average TPS alike.


Future

In the future, once Kryo or FST is mature enough in Dubbo, we will probably change the default serialization of Dubbo RPC from hessian2 to one of them.

Last modified January 2, 2023: Enhance en docs (#1798) (95a9f4f6c)