Symptoms
A backup to a remote server fails with the following error:
-1 : Pipe read error: 110 (Connection timed out)
Cause
The backup session was terminated because a TCP connection timeout was reached. This may be caused either by the configuration of a network Hardware Firewall (such as Cisco ASA) or by insufficient TCP KeepAlive settings on the nodes.
- Scenario 1: A backup session is waiting for some local operation to complete and not sending any data. After the TCP timeout is reached, the Hardware Firewall closes communication for that session.
- Scenario 2: A backup session is waiting for some local operation to complete and not sending any data. After tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl seconds, the neighbor node closes the session.
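The formula in Scenario 2 can be checked with a small calculation. The sketch below assumes a POSIX shell; keepalive_deadline is a hypothetical helper name, not part of any product, and it takes the three kernel parameters as arguments in the order time, probes, interval.

```shell
# keepalive_deadline TIME PROBES INTVL   (hypothetical helper)
# Prints tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl,
# the number of seconds after which the neighbor node closes the session.
keepalive_deadline() {
    echo $(( $1 + $2 * $3 ))
}

# With the default Linux values (7200, 9, 75):
keepalive_deadline 7200 9 75    # prints 7875
```

To evaluate the values currently set on a node, the three arguments can be read from the files under /proc/sys/net/ipv4/.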
Resolution
TCP Keepalive settings should be tuned on both source and destination nodes.
The default values are:
# cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
tcp_keepalive_time should be the same on both nodes, and it should be less than the TCP timeout configured on the Hardware Firewall. This instructs the nodes to start KeepAlive communication before the Hardware Firewall timeout expires.
For example, if the Hardware Firewall is configured with a 10-minute TCP timeout:
[root@pcs ~]# echo 540 > /proc/sys/net/ipv4/tcp_keepalive_time
The above command instructs the kernel to start sending KeepAlive probes after 9 minutes (540 seconds) without traffic on the connection.
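A quick sanity check for the chosen value might look like the following. This is only a sketch: keepalive_time_ok is a hypothetical helper, and the 600-second firewall timeout is the example figure used above, not a value read from any device.

```shell
# keepalive_time_ok KEEPALIVE_TIME FIREWALL_TIMEOUT   (both in seconds)
# Succeeds only if KeepAlive probing starts before the firewall timeout.
keepalive_time_ok() {
    [ "$1" -lt "$2" ]
}

if keepalive_time_ok 540 600; then
    echo "OK: probes start before the firewall timeout"
else
    echo "WARNING: the firewall may drop the session before probing starts"
fi
```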
To increase the Keepalive duration:
[root@pcs ~]# echo 100 > /proc/sys/net/ipv4/tcp_keepalive_probes
[root@pcs ~]# echo 100 > /proc/sys/net/ipv4/tcp_keepalive_intvl
Please note that tcp_keepalive_intvl should be the same on both nodes, and it should be less than the TCP timeout configured on the Hardware Firewall.
The value of tcp_keepalive_probes should be chosen depending on the application needs. 100 KeepAlive packets with a 100-second interval keep the connection alive for 10,000 seconds (more than two and a half hours), which should be sufficient for most cases.
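The arithmetic behind that figure can be reproduced as follows. This is a minimal sketch for a POSIX shell, and keepalive_probe_window is a hypothetical helper name.

```shell
# keepalive_probe_window PROBES INTVL   (hypothetical helper)
# Prints how long probing continues, in seconds and as h/m/s.
keepalive_probe_window() {
    total=$(( $1 * $2 ))
    printf '%d s = %dh %dm %ds\n' "$total" \
        $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

keepalive_probe_window 100 100    # prints: 10000 s = 2h 46m 40s
```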
Note: edit /etc/sysctl.conf to make these changes permanent:
net.ipv4.tcp_keepalive_time = 540
net.ipv4.tcp_keepalive_probes = 100
net.ipv4.tcp_keepalive_intvl = 100
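Settings in /etc/sysctl.conf are normally read at boot. As a sketch of how they could be loaded right away and verified, assuming the standard sysctl utility is available and the commands are run as root (reload_keepalive is a hypothetical helper name):

```shell
# reload_keepalive   (hypothetical helper; must be run as root)
# Loads /etc/sysctl.conf, then prints the three KeepAlive values in effect.
reload_keepalive() {
    sysctl -p /etc/sysctl.conf >/dev/null || return 1
    sysctl -n net.ipv4.tcp_keepalive_time \
              net.ipv4.tcp_keepalive_probes \
              net.ipv4.tcp_keepalive_intvl
}

# Usage (as root): reload_keepalive
```

If the file contains the three lines above, the call should print 540, 100 and 100, one value per line. Run it on both the source and the destination node.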