Beyond socket options: making the Linux TCP stack truly extensible

The Transmission Control Protocol (TCP) is one of the most important protocols in today’s Internet. It was designed to be extensible for various use cases. A client can propose the use of an extension over a given TCP connection by sending an option that identifies this extension during the three-way handshake, while a few other options, such as the one defined in RFC 5482, can be sent directly without negotiation. That is the theory that networking students learn from textbooks. In practice, deploying a TCP extension is much more difficult: the maintainers of client stacks often wait until servers implement a given extension, and server maintainers wait for clients in the same manner. It often takes several years to deploy an option at a large scale. In this paper, we focus on the Linux TCP stack since it is one of the most widely used TCP stacks, given its use on servers and Android devices.
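
To make the option mechanism concrete, here is a minimal sketch of how the options area of a TCP header can be walked. It is not taken from the paper; the function name and the sample bytes are ours. It relies only on the standard option layout: kind 0 ends the list, kind 1 is a one-byte padding option, and every other option carries a kind byte, a length byte, and a value.

```python
from typing import List, Tuple

KIND_EOL = 0  # End of Option List
KIND_NOP = 1  # No-Operation, used for padding


def parse_tcp_options(data: bytes) -> List[Tuple[int, bytes]]:
    """Return (kind, value) pairs found in the raw TCP options bytes."""
    options = []
    i = 0
    while i < len(data):
        kind = data[i]
        if kind == KIND_EOL:
            break  # nothing meaningful follows
        if kind == KIND_NOP:
            i += 1  # single-byte option, no length field
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated option: missing length byte")
        length = data[i + 1]  # length covers kind, length and value
        if length < 2 or i + length > len(data):
            raise ValueError("invalid option length")
        options.append((kind, data[i + 2:i + length]))
        i += length
    return options


# Sample SYN options: MSS=1460, NOP, window scale=7, SACK permitted, padding
syn_options = bytes.fromhex("020405b40103030704020000")
parsed = parse_tcp_options(syn_options)
```

A stack that does not recognise a given kind can skip it thanks to the length byte, which is what lets a client propose an extension that the server may not support.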

Thanks to eBPF, a powerful technology in recent Linux kernel versions, a different deployment model for TCP options is possible. This is illustrated in the figure below. If an application wants to use a specific TCP extension, it can inject the corresponding code into the underlying TCP stack.

We demonstrate that this technique makes it possible to deploy new TCP extensions. As an illustration, here are two of the TCP extensions discussed in the paper.

The Linux TCP implementation supports a wide range of congestion control schemes as pluggable modules. Most servers select one of these congestion controllers and use it for all their connections. The first TCP extension described in this paper is a simple TCP option that can be sent by a client to request the use of a specific congestion control scheme by the server. As the TCP option is exchanged during the three-way handshake, it affects the entire connection. The figure below shows the impact of a specific congestion control scheme on the round-trip time experienced by a long TCP connection.
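
The following user-space sketch models what such an option could look like on the wire and how a server could act on it. It is an illustration under our own assumptions, not the encoding used in the paper: we assume the experimental option kind 253 with a 16-bit experiment identifier (the layout described in RFC 6994), and the identifier value, the numbering of the algorithms, and all function names are hypothetical. In the paper, the equivalent logic runs as eBPF code inside the kernel.

```python
import struct
from typing import Optional, Set

EXPERIMENTAL_KIND = 253     # experimental TCP option kind (RFC 6994)
HYPOTHETICAL_EXID = 0xCC01  # made-up experiment identifier, for illustration

# Hypothetical numbering of congestion control schemes
CC_IDS = {1: "reno", 2: "cubic", 3: "bbr", 4: "vegas"}


def build_cc_option(cc_id: int) -> bytes:
    """Client side: kind (1) + length (1) + ExID (2) + scheme id (1) = 5 bytes."""
    return struct.pack("!BBHB", EXPERIMENTAL_KIND, 5, HYPOTHETICAL_EXID, cc_id)


def requested_cc(option: bytes) -> Optional[str]:
    """Server side: decode the option, or return None if it is not ours."""
    if len(option) != 5:
        return None
    kind, length, exid, cc_id = struct.unpack("!BBHB", option)
    if kind != EXPERIMENTAL_KIND or length != 5 or exid != HYPOTHETICAL_EXID:
        return None
    return CC_IDS.get(cc_id)


def select_cc(option: bytes, available: Set[str], default: str) -> str:
    """Honour the client's request only if the scheme is available locally."""
    name = requested_cc(option)
    return name if name in available else default


syn_option = build_cc_option(3)
chosen = select_cc(syn_option, {"cubic", "bbr"}, default="cubic")
```

Falling back to the server default when the request is unknown or unavailable keeps the option safe to send to any server.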

Another example is a TCP option that is used by a client, e.g. a smartphone, during the three-way handshake to bound the value of the initial congestion window on the server. This feature could be used to let a smartphone advertise the congestion window that servers should use depending on the network conditions (2G, 3G, 4G, …). Android smartphones already tune their TCP windows based on the current characteristics of the link layer, but this information is not communicated to the server. The figure below compares the page load time with different values of the option proposed by the client.
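
The server-side decision for this second option can be sketched in a few lines. Again, this is a model under our own assumptions and not the paper's code: the function names are hypothetical, the bound is assumed to be expressed in segments, and the per-network values in the table are placeholders chosen for illustration, not measurements.

```python
from typing import Optional

# Placeholder bounds (in segments) a client might advertise; illustrative only
HYPOTHETICAL_BOUNDS = {"2G": 2, "3G": 4, "4G": 10}


def effective_initcwnd(server_default: int, client_bound: Optional[int]) -> int:
    """The option can only lower the server's initial window, never raise it."""
    if client_bound is None or client_bound < 1:
        return server_default  # option absent or nonsensical: ignore it
    return min(server_default, client_bound)


def first_flight_bytes(initcwnd: int, mss: int) -> int:
    """Upper bound on the data the server sends in the first round-trip."""
    return initcwnd * mss


# A server with an initial window of 10 segments and a client on 3G
cwnd = effective_initcwnd(10, HYPOTHETICAL_BOUNDS["3G"])
burst = first_flight_bytes(cwnd, mss=1460)
```

Taking the minimum of the two values means a client cannot use the option to make a server send a larger initial burst than it would on its own.
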
With the approach described in this paper, it becomes possible to innovate again in the Linux TCP stack. With such flexible TCP options, it could even be possible to deploy applications that perform A/B testing with the underlying TCP stack or adapt it to their needs.
 

This paper received the second-place Best Paper award at IFIP NETWORKING 2019, held May 20-22, 2019 in Warsaw, Poland.

About the authors:

Hoang Viet TRAN is a researcher within the IP Networking Lab at UCLouvain.

Olivier Bonaventure, Professor of Computer Science, UCLouvain, Department of Computing Science and Engineering of the Institute

Published on June 12, 2019