RC04: Typechecking TCP states in Rust

November 25, 2023

I'm attending the Fall 2 batch at Recurse Center! Posts in this series cover things I'm working on or find interesting during my time here.

I’ve been writing a toy TCP implementation in Rust to learn more about how TCP works (and to become more proficient in Rust). Unlike UDP, TCP is connection-oriented. Before a TCP socket can send or receive data, it first needs to establish a connection with the remote host through a three-way handshake. The socket is also expected to close the connection once there’s no more data to send.

What do TCP states tell us?

To model the connection establishment, data transfer, and connection termination phases of a socket, TCP defines a set of states in RFC 793 that indicate where the connection is in its lifecycle. The current state determines which operations the socket is allowed to perform and which packets it should expect from the remote host. For example, a TCP socket cannot send application data to the remote host before it is in the Established state. State transitions occur in response to a well-defined set of events: either an action by the application (such as opening or closing the connection) or an incoming segment from the remote host.
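To make the idea of event-driven transitions concrete, here is a minimal sketch of a transition function covering only a few states on the active-open path. The names `State`, `Event`, and `next_state` are made up for illustration, and the table is far from the full RFC 793 diagram.

```rust
/// A small subset of the RFC 793 connection states (active-open path only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Closed,
    SynSent,
    Established,
    FinWait1,
}

/// Hypothetical event names: application calls and incoming segments.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    ActiveOpen, // application opens the connection; a SYN is sent
    RecvSynAck, // SYN+ACK arrives from the remote host; an ACK is sent
    Close,      // application closes the connection
}

/// Returns the next state, or `None` if this sketch does not model
/// the given (state, event) pair.
fn next_state(state: State, event: Event) -> Option<State> {
    match (state, event) {
        (State::Closed, Event::ActiveOpen) => Some(State::SynSent),
        (State::SynSent, Event::RecvSynAck) => Some(State::Established),
        (State::SynSent, Event::Close) => Some(State::Closed),
        (State::Established, Event::Close) => Some(State::FinWait1),
        _ => None,
    }
}
```

For instance, `next_state(State::Closed, Event::ActiveOpen)` yields `Some(State::SynSent)`, while a SYN+ACK arriving in the Closed state is not a modelled transition and yields `None`.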

[Figure: TCP state diagram]

A simplified TCP state diagram. Source: Wikipedia’s page on TCP.

A first approach

My initial instinct for implementing TCP states was to use a Rust enum. I’d then use an if or match expression to check the current state before proceeding.

enum TcpState {
    Closed,
    SynSent,
    Established,
    // .. & other states.
}

struct TcpStream {
    source: u16,
    state: TcpState,
}

impl TcpStream {
    fn write(&mut self) -> Result<usize, &'static str> {
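        match self.state {
            // Variants need the enum path; a bare `Closed` would be a catch-all binding.
            TcpState::Closed => Err("Invalid state."),
            TcpState::Established => todo!("Implement writing data here."),
            // .. & other states
            _ => Err("Invalid state."),
        }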
    }
}

This works, but it requires the code to check the TCP connection’s state every time before performing an action. That means extra nesting in the code and extra work at runtime. And if a user calls a method that isn’t valid in the current state, they’ll get a runtime error, which is almost always less fun to deal with than a compile-time error.
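A small self-contained sketch of that failure mode is below. It is a trimmed-down variant of the enum approach (the `write` here only pretends to send data), and it compiles without complaint even when the caller writes to a closed stream.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TcpState {
    Closed,
    SynSent,
    Established,
}

struct TcpStream {
    state: TcpState,
}

impl TcpStream {
    /// Pretends to send `buf`; only the Established state accepts data.
    fn write(&mut self, buf: &[u8]) -> Result<usize, &'static str> {
        match self.state {
            TcpState::Established => Ok(buf.len()),
            // The mistake is only reported once this line runs.
            TcpState::Closed | TcpState::SynSent => Err("Invalid state."),
        }
    }
}
```

Calling `write` on a stream whose state is `TcpState::Closed` returns `Err("Invalid state.")`, but nothing stops that call from being written and shipped in the first place.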

Modelling TCP states using type parameters

While reading Rust for Rustaceans, I encountered an intriguing pattern that encodes which methods are allowed in each state and lets the type checker enforce it¹. This is done using type parameters.

use std::io;
use std::net::{SocketAddrV4, ToSocketAddrs};

pub struct Closed;
pub struct Established {
    source: u16,
    // & other data.
}

pub struct TcpStream<State> {
    socket_addr_v4: SocketAddrV4,
    state: State,
}

impl<T> TcpStream<T> {
    pub fn peer_addr(&self) -> io::Result<SocketAddrV4> {
        Ok(self.socket_addr_v4)
    }
}

impl TcpStream<Closed> {
    pub fn connect<T: ToSocketAddrs>(
        addr: T,
    ) -> io::Result<TcpStream<Established>> {
        // Connect to remote server.
        todo!()
    }
}

impl TcpStream<Established> {
    fn close(&mut self) -> Result<(), &'static str> {
        // Close connection
        todo!()
    }
}

impl io::Write for TcpStream<Established> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Send data to remote host.
        todo!()
    }

    // io::Write also requires flush.
    fn flush(&mut self) -> io::Result<()> {
        todo!()
    }
}

In the example code above, TcpStream is a struct that takes a type parameter State, which we use to represent the TCP states Closed and Established. The method peer_addr is available on a TcpStream in any state, while close can only be called on a TcpStream<Established>. Attempting to call close on a TcpStream in any other state results in a type error during compilation.

This is rather neat since we’ve eliminated the runtime checking of TCP state altogether and it’s now much clearer which methods are allowed in which TcpStream states. An additional bonus is that the allowed state transitions are now enforced by the typechecker.
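As a rough illustration of transitions being enforced by the type checker, here is a self-contained sketch in which `connect` and `close` take `self` by value and return the stream in its new state. This differs from the signatures above, and the `new` constructor, the `bytes_sent` field, and the fake `write` are assumptions added only so the example compiles and runs without a network.

```rust
use std::net::SocketAddrV4;

pub struct Closed;
pub struct Established {
    bytes_sent: usize, // hypothetical state-specific data
}

pub struct TcpStream<State> {
    peer: SocketAddrV4,
    state: State,
}

impl<S> TcpStream<S> {
    /// Available in every state.
    pub fn peer_addr(&self) -> SocketAddrV4 {
        self.peer
    }
}

impl TcpStream<Closed> {
    pub fn new(peer: SocketAddrV4) -> Self {
        TcpStream { peer, state: Closed }
    }

    /// Consumes the closed stream; a real implementation would perform
    /// the three-way handshake here and could fail.
    pub fn connect(self) -> TcpStream<Established> {
        TcpStream { peer: self.peer, state: Established { bytes_sent: 0 } }
    }
}

impl TcpStream<Established> {
    /// Pretends to send `buf` and records how many bytes were "sent".
    pub fn write(&mut self, buf: &[u8]) -> usize {
        self.state.bytes_sent += buf.len();
        buf.len()
    }

    pub fn bytes_sent(&self) -> usize {
        self.state.bytes_sent
    }

    /// Consumes the established stream, so it cannot be used after closing.
    pub fn close(self) -> TcpStream<Closed> {
        TcpStream { peer: self.peer, state: Closed }
    }
}
```

With these definitions, calling `write` on the value returned by `TcpStream::<Closed>::new(addr)` is rejected at compile time because no such method exists for `TcpStream<Closed>`. Using a stream after `close` is also rejected, since the value has been moved.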

Thus far, one issue that I’ve encountered with this approach is that the compiler prohibits implementing the Drop trait for only one specialization of a generic type². For example, writing

impl Drop for TcpStream<Established> {
    fn drop(&mut self) {
        // Implementation here.
    }
}

will result in a compile error (E0366: Drop impls cannot be specialized). There seem to be good reasons for this (see here and here), and I haven’t yet figured out a workaround that I like.
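One possible direction, sketched here under assumptions rather than offered as a recommendation, is to implement Drop on the state type itself instead of on TcpStream<Established>. When the stream is dropped, its state field is dropped too, so the cleanup runs only for established streams. The `TEARDOWNS` counter is a stand-in added purely to make the effect observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many Established states were torn down (demonstration only).
static TEARDOWNS: AtomicUsize = AtomicUsize::new(0);

pub struct Closed;
pub struct Established;

pub struct TcpStream<State> {
    state: State,
}

// Drop lives on the state type, which is not generic, so nothing is specialized.
impl Drop for Established {
    fn drop(&mut self) {
        // A real implementation might begin closing the connection here.
        TEARDOWNS.fetch_add(1, Ordering::SeqCst);
    }
}
```

The obvious limitation is that `drop` only sees the fields of `Established`, not the rest of the `TcpStream`, so anything the cleanup needs would have to live inside the state struct.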

If this sounds interesting, Cliff Biffle gives a more detailed and thorough walkthrough on this pattern in his blog post here.
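When the states carry no data of their own, the same pattern is often written with std::marker::PhantomData, so the marker types never need to be instantiated. The sketch below is a minimal version of that variant; the `port` field and the `new` constructor are assumptions made up for the example.

```rust
use std::marker::PhantomData;

// Marker types: used only as type parameters, never constructed.
pub struct Closed;
pub struct Established;

pub struct TcpStream<State> {
    port: u16,
    // Zero-sized; records the state in the type without storing a value.
    _state: PhantomData<State>,
}

impl TcpStream<Closed> {
    pub fn new(port: u16) -> Self {
        TcpStream { port, _state: PhantomData }
    }

    /// Consumes the closed stream and returns it in the Established state.
    pub fn connect(self) -> TcpStream<Established> {
        TcpStream { port: self.port, _state: PhantomData }
    }
}

impl TcpStream<Established> {
    pub fn port(&self) -> u16 {
        self.port
    }
}
```

Because PhantomData is zero-sized, the state marker adds nothing to the size of the struct.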

¹ See Chapter 3 of the book, under the section “Type System Guidance”. In it (and in Biffle’s blog post above), the author also gives an example using PhantomData, which is useful if there’s no state-specific data to store.
² As far as I’m aware, this only applies to the Drop trait. Implementing io::Write for TcpStream<Established>, as in the example above, doesn’t raise any issues.

Hi! I’m Stacey. Welcome to my technical blog. Outside of computers, I also love brewing Japanese teas 🍵, reading fiction, and discovering random word origins.