Symptoms

A backup to a remote server fails with the following error:

-1 : Pipe read error: 110 (Connection timed out)

Cause

The backup session was terminated because a TCP connection timeout was reached. This may be caused either by the configuration of a network Hardware Firewall (such as Cisco ASA or similar) or by insufficient TCP KeepAlive settings on the nodes.

  • Scenario 1: A backup session is waiting for some local operation to complete and not sending any data. After the TCP timeout is reached, the Hardware Firewall closes communication for that session.
  • Scenario 2: A backup session is waiting for some local operation to complete and not sending any data. After tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl seconds without a response, the neighbor node closes the session.

Resolution

TCP Keepalive settings should be tuned on both source and destination nodes.

The default values are:

# cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
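
As a rough illustration of the Scenario 2 formula, the sketch below computes how long an idle, unresponsive session survives with the default values. The function name keepalive_total is hypothetical and used only for this example.

```shell
#!/bin/sh
# Hypothetical helper: total seconds before an unresponsive peer is dropped,
# i.e. tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl.
# $1 = tcp_keepalive_time, $2 = tcp_keepalive_probes, $3 = tcp_keepalive_intvl
keepalive_total() {
    echo $(( $1 + $2 * $3 ))
}

# Default values listed above: 7200, 9 and 75
keepalive_total 7200 9 75   # prints 7875
```

With the defaults, the first probe is sent only after 7200 seconds (2 hours), which is far longer than a typical firewall idle timeout.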

tcp_keepalive_time should be the same on both nodes, and it should be less than the TCP timeout configured on the Hardware Firewall. This makes the nodes start KeepAlive communication before the Hardware Firewall timeout expires.

For example, if the Hardware Firewall is configured with a 10-minute TCP timeout:

[root@pcs ~]# echo 540 > /proc/sys/net/ipv4/tcp_keepalive_time

The above command will instruct the kernel to start sending KeepAlive probes after 9 minutes.
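
A minimal sketch of a sanity check for this rule, assuming the firewall timeout is known in seconds. The function name keepalive_before_firewall and the variable fw_timeout are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when KeepAlive probing starts before
# the firewall's TCP timeout expires.
# $1 = tcp_keepalive_time in seconds, $2 = firewall TCP timeout in seconds
keepalive_before_firewall() {
    [ "$1" -lt "$2" ]
}

fw_timeout=600   # 10 minutes, as in the example above
if keepalive_before_firewall 540 "$fw_timeout"; then
    echo "OK: probes start before the firewall timeout"
else
    echo "WARNING: the firewall may drop the idle session first"
fi
```

To check a live node, the first argument could be replaced with "$(cat /proc/sys/net/ipv4/tcp_keepalive_time)".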

To increase the Keepalive duration:

[root@pcs ~]# echo 100 > /proc/sys/net/ipv4/tcp_keepalive_probes
[root@pcs ~]# echo 100 > /proc/sys/net/ipv4/tcp_keepalive_intvl

Please note, tcp_keepalive_intvl should be the same on both nodes, and it should be less than the TCP timeout configured on the Hardware Firewall.

The value of tcp_keepalive_probes should be chosen depending on the application needs. 100 KeepAlive probes at a 100-second interval keep an unresponsive connection open for 10,000 seconds (more than two and a half hours) before it is dropped, which should be sufficient for most cases.
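
The arithmetic behind that figure can be checked with a short sketch. The function name fmt_duration is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: format a number of seconds as "Xh Ym Zs".
fmt_duration() {
    printf '%dh %dm %ds\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

probes=100   # tcp_keepalive_probes
intvl=100    # tcp_keepalive_intvl, in seconds
fmt_duration $(( probes * intvl ))   # prints "2h 46m 40s"
```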

Note: edit /etc/sysctl.conf to make these changes permanent:

net.ipv4.tcp_keepalive_time = 540
net.ipv4.tcp_keepalive_probes = 100
net.ipv4.tcp_keepalive_intvl = 100
