Difference between session.timeout.ms and max.poll.interval.ms for Kafka

Recent versions of Kafka have two consumer timeout settings: session.timeout.ms and max.poll.interval.ms. Prior to Kafka 0.10.1 there was only session.timeout.ms. So why does Kafka now have both?

Initially, Kafka used session.timeout.ms to check both the consumer's heartbeats and its calls to poll(). Heartbeats were sent only as part of poll(), so the two were tightly coupled.

The problem with tight coupling

If heartbeats and calls to poll() are coupled, a single session.timeout.ms has to cover both cases. Let's say a message takes longer than 2 minutes to process; then session.timeout.ms must be set to more than 2 minutes, or the consumer will be considered dead while it is still working. But now, if the consumer really dies, it also takes more than 2 minutes to detect the failure.
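The trade-off above can be put into numbers with a minimal sketch. This is not Kafka code: the function name coupled_session_timeout_ms and the 5-second safety margin are assumptions made only for illustration.

```python
def coupled_session_timeout_ms(max_processing_ms, safety_margin_ms=5_000):
    """Smallest session.timeout.ms that avoids a false eviction when
    heartbeats are only sent from poll() (the coupled model).

    Hypothetical helper; the safety margin is an arbitrary assumption.
    """
    return max_processing_ms + safety_margin_ms


# A message can take 2 minutes to process, so the timeout must exceed that.
session_timeout_ms = coupled_session_timeout_ms(120_000)

# In the coupled model the same value also bounds how long a crashed
# consumer can go unnoticed.
crash_detection_ms = session_timeout_ms

print(session_timeout_ms, crash_detection_ms)
```

The point of the sketch is that one number does two jobs here: raising it to tolerate slow processing raises the failure detection time by the same amount.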

The solution: decoupling heartbeats from polling

To solve this issue, Kafka decoupled polling from heartbeats and gave each its own setting: session.timeout.ms and max.poll.interval.ms. The consumer now runs two threads, a heartbeat thread and a processing thread (the one that calls poll()), and each has its own timeout. session.timeout.ms applies to the heartbeat thread and max.poll.interval.ms applies to the processing thread.
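As a rough illustration, the two settings might be assembled like this. The helper consumer_timeouts is hypothetical; only the two property names come from Kafka, and how the resulting dictionary is handed to a consumer depends on the client library in use.

```python
def consumer_timeouts(session_timeout_s, max_poll_interval_s):
    """Build the two consumer timeout properties, converting seconds to ms.

    Hypothetical helper for illustration; not part of any Kafka client.
    """
    return {
        # Heartbeat thread: how long without heartbeats before the
        # consumer is considered dead.
        "session.timeout.ms": session_timeout_s * 1000,
        # Processing thread: maximum allowed gap between calls to poll().
        "max.poll.interval.ms": max_poll_interval_s * 1000,
    }


config = consumer_timeouts(session_timeout_s=10, max_poll_interval_s=120)
for key, value in config.items():
    print(f"{key}={value}")
```

Keeping the two values separate is the whole design: the first can stay small for fast failure detection while the second is sized to the slowest expected batch.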

Now suppose we set session.timeout.ms to 10 seconds for the heartbeat thread and max.poll.interval.ms to 2 minutes for the processing thread. If the processing thread dies or gets stuck while the heartbeat thread keeps running, it takes 2 minutes to detect this. However, if the whole consumer dies (and a dying processing thread will most likely crash the whole consumer, including the heartbeat thread), it takes only 10 seconds to detect it.
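The two cases can be summarized in a small model. This is a simplified sketch, not how the broker is implemented: the function detection_time_ms and the failure labels "consumer_crash" and "processing_stuck" are assumptions made for illustration, and the returned values are approximate upper bounds.

```python
def detection_time_ms(failure, session_timeout_ms, max_poll_interval_ms):
    """Approximate worst-case time to notice a failure in the decoupled model.

    Hypothetical model: 'consumer_crash' means the heartbeat thread stopped
    too; 'processing_stuck' means heartbeats continue but poll() is not called.
    """
    if failure == "consumer_crash":
        return session_timeout_ms
    if failure == "processing_stuck":
        return max_poll_interval_ms
    raise ValueError(f"unknown failure type: {failure}")


session_ms, poll_ms = 10_000, 120_000
print(detection_time_ms("consumer_crash", session_ms, poll_ms))
print(detection_time_ms("processing_stuck", session_ms, poll_ms))
```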