First published in the column 全栈猎人
Using setsockopt() and SO_KEEPALIVE to keep connections alive


Originally published at:

setsockopt, SO_KEEPALIVE and Heartbeats (holmeshe.me)

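Heartbeat packets serve two main purposes. For a backend application, they monitor client state, so that when a client disconnects the server can promptly release the connection and the system and business resources tied to it. For a client, they stop intermediate nodes (a NAT, for example) from reclaiming the connection's resources, which keeps the connection alive.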

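This article explains how to send heartbeats by configuring the socket options SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT with setsockopt(), and discusses general principles for keeping connections alive with heartbeats.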

Test environment:

OS: Ubuntu 16.04

gcc: 5.4.0

Keeping connections alive

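In many cases a disconnection goes unnoticed, for example when a NAT entry times out. A NAT entry holds a 4-tuple (source IP, source port, destination IP, destination port). Because memory is limited, a routing device may delete entries used for address translation at any time once they have timed out. So even though neither side has sent a FIN or RST, a connection that has lost its NAT entry is effectively cut.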
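To make the timeout mechanics concrete, here is a toy model of a single NAT entry. It is only a sketch: the struct, the function name nat_forward and the timeout value in the usage note are made up for illustration and do not describe any particular router.

```c
#include <stdint.h>
#include <time.h>

/* Toy model of one NAT mapping. All names here are illustrative. */
struct nat_entry {
    uint32_t src_ip, dst_ip;     /* the 4-tuple that identifies the flow */
    uint16_t src_port, dst_port;
    time_t   last_used;          /* when a packet last matched this entry */
};

/* Returns 1 and refreshes the entry if a packet arriving at `now` still
 * finds a live mapping, 0 if the entry has already timed out. */
int nat_forward(struct nat_entry *e, time_t now, time_t idle_timeout_s)
{
    if (now - e->last_used > idle_timeout_s)
        return 0;                /* mapping is gone: the packet is dropped */
    e->last_used = now;          /* any traffic, heartbeats included, resets the timer */
    return 1;
}
```

With an assumed 60-second timeout, packets at t=50 and t=100 both get through because each one refreshes the entry, while a packet that only arrives at t=161 is dropped. A heartbeat sent more often than the timeout is what keeps the entry from expiring.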

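Reconnecting has a cost. First, the user has to sit and wait while the three-way handshake completes (three packets, at least one round trip before any data can flow). Second, resuming the session seamlessly after a reconnect takes extra development effort.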

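To avoid extra handshakes and the RTTs they introduce, HTTP added the keep-alive mechanism, so that multiple short HTTP exchanges can reuse one long-lived TCP connection. But that is another story.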

Next, two pairs of programs illustrate how keepalive heartbeats work in detail. First, the server code:

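#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>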

#define BUF_SIZE 256

int main(int argc, char *argv[])
{
  int sfd, rfd, portno;
  socklen_t clilen; /* accept() expects a socklen_t *, not an int * */
  char buffer[BUF_SIZE];
  struct sockaddr_in serv_addr, cli_addr;
  int n;

  if (argc < 2) { fprintf(stderr, "usage: %s port\n", argv[0]); exit(1); }

  sfd = socket(AF_INET, SOCK_STREAM, 0);
  if (sfd < 0) { perror("ERROR: socket()"); exit(1); }

  bzero((char *) &serv_addr, sizeof(serv_addr));
  portno = atoi(argv[1]);
  serv_addr.sin_family = AF_INET;
  serv_addr.sin_addr.s_addr = INADDR_ANY;
  serv_addr.sin_port = htons(portno);

  if (bind(sfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) { perror("ERROR: bind()"); exit(1); }

  listen(sfd, 5);

  while (1) {
    clilen = sizeof(cli_addr);
    rfd = accept(sfd, (struct sockaddr *) &cli_addr, &clilen);
    if (rfd < 0) { perror("ERROR: accept()"); exit(1); }

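    while (1) {
      n = read(rfd, buffer, BUF_SIZE - 1);
      if (n <= 0) { printf("read() ends\n"); break; }
      buffer[n] = '\0'; /* read() does not NUL-terminate the data */

      printf("received: %s %d\n", buffer, n);
      snprintf(buffer, BUF_SIZE, "got it");
      n = write(rfd, buffer, strlen(buffer)); /* send only the reply text */
      if (n < 0) { perror("ERROR: write()"); exit(1); }
    }
    close(rfd); /* release this connection before accepting the next one */
  }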

  return 0;
}

For simplicity, no I/O multiplexing is used here, so the server can only handle one client connection at a time.

The client:

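#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <netdb.h>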

int main(int argc, char *argv[]) {
  int sfd, portno, n;

  struct sockaddr_in srvaddr;
  struct hostent *host;

  if (argc < 3) { fprintf(stderr,"usage: %s ip port\n", argv[0]); exit(0); }

  portno = atoi(argv[2]);
  sfd = socket(AF_INET, SOCK_STREAM, 0);
  if (sfd < 0) { perror("ERROR: socket()"); exit(0); }

  /* Turn keepalive probes on for this socket. */
  int flags = 1;
  if (setsockopt(sfd, SOL_SOCKET, SO_KEEPALIVE, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), SO_KEEPALIVE"); exit(1); }

  /* Seconds the connection must stay idle before a probe is sent. */
  flags = 10;
  if (setsockopt(sfd, SOL_TCP, TCP_KEEPIDLE, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), TCP_KEEPIDLE"); exit(1); }

  host = gethostbyname(argv[1]);
  if (host == NULL) { fprintf(stderr,"ERROR: host does not exist"); exit(0); }

  bzero((char *) &srvaddr, sizeof(srvaddr));
  srvaddr.sin_family = AF_INET;
  srvaddr.sin_port = htons(portno);

  bcopy((char *)host->h_addr, (char *)&srvaddr.sin_addr.s_addr, host->h_length);

  if (connect(sfd, (struct sockaddr *)&srvaddr, sizeof(srvaddr)) < 0) { perror("ERROR: connect()"); exit(0); }

  sleep(100000);

  return 0;
}

With the socket options above set, the client completes the three-way handshake with connect() and then simply calls sleep() to give up the CPU.

If you are not very familiar with network programming, you may want to read this article first.

Next, let's look at the network traffic:

sudo tcpdump -i wlp3s0 dst net 192.168.1.71 or src net 192.168.1.71 and not dst port 22 and not src port 22


// ========================> start handshakes
12:21:42.437163 IP 192.168.1.66.43066 > 192.168.1.71.6666: Flags [S], seq 3002564942, win 29200, options [mss 1460,sackOK,TS val 7961984 ecr 0,nop,wscale 7], length 0
12:21:42.439960 IP 192.168.1.71.6666 > 192.168.1.66.43066: Flags [S.], seq 3450454053, ack 3002564943, win 28960, options [mss 1460,sackOK,TS val 2221927 ecr 7961984,nop,wscale 7], length 0
12:21:42.440088 IP 192.168.1.66.43066 > 192.168.1.71.6666: Flags [.], ack 1, win 229, options [nop,nop,TS val 7961985 ecr 2221927], length 0
// ========================> end handshakes
12:21:52.452057 IP 192.168.1.66.43066 > 192.168.1.71.6666: Flags [.], ack 1, win 229, options [nop,nop,TS val 7964488 ecr 2221927], length 0
12:21:52.454443 IP 192.168.1.71.6666 > 192.168.1.66.43066: Flags [.], ack 1, win 227, options [nop,nop,TS val 2224431 ecr 7961985], length 0
12:22:02.468056 IP 192.168.1.66.43066 > 192.168.1.71.6666: Flags [.], ack 1, win 229, options [nop,nop,TS val 7966992 ecr 2224431], length 0
12:22:02.470458 IP 192.168.1.71.6666 > 192.168.1.66.43066: Flags [.], ack 1, win 227, options [nop,nop,TS val 2226935 ecr 7961985], length 0
12:22:12.484119 IP 192.168.1.66.43066 > 192.168.1.71.6666: Flags [.], ack 1, win 229, options [nop,nop,TS val 7969496 ecr 2226935], length 0
12:22:12.489786 IP 192.168.1.71.6666 > 192.168.1.66.43066: Flags [.], ack 1, win 227, options [nop,nop,TS val 2229440 ecr 7961985], length 0

I have removed the irrelevant ARP output here. If you are not familiar with tcpdump, you may want to read this article first.

With some hands-on experience in place, we can now explain the keepalive mechanism.

1) SO_KEEPALIVE turns heartbeats on (or off);

int flags = 1;
if (setsockopt(sfd, SOL_SOCKET, SO_KEEPALIVE, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), SO_KEEPALIVE"); exit(1); }

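2) The side that enabled heartbeats (the client here) sends ACK packets as heartbeats (see Flags [.] in the capture);

3) On receiving a heartbeat, the other side (the server) replies with an ACK (see Flags [.]);

4) TCP_KEEPIDLE sets how long the connection must stay idle before a heartbeat is sent. Each answered heartbeat restarts the idle timer, so on a quiet, healthy connection it is also the interval between heartbeats (see the timestamps, 10 seconds apart).

flags = 10;
if (setsockopt(sfd, SOL_TCP, TCP_KEEPIDLE, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), TCP_KEEPIDLE"); exit(1); }

Note that the server's read() stays blocked throughout, which shows that heartbeats are transparent to the receiving application.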
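The option calls can be gathered into one small helper. The sketch below is my own arrangement, not code from the original programs: the names enable_keepalive and get_int_opt are hypothetical, and it uses IPPROTO_TCP, the portable spelling of the level that the listings call SOL_TCP on Linux. It also sets TCP_KEEPINTVL and TCP_KEEPCNT, the two remaining Linux timing options named in the introduction.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turns keepalive on and sets the three Linux timing
 * options in one place. Returns 0 on success, or -1 as soon as one
 * setsockopt() call fails (errno is left set by that call). */
int enable_keepalive(int fd, int idle_s, int intvl_s, int cnt)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0) return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0) return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0) return -1;
    return 0;
}

/* Reads one integer option back so a setting can be checked.
 * Returns the option value, or -1 if getsockopt() fails. */
int get_int_opt(int fd, int level, int name)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, level, name, &value, &len) < 0) return -1;
    return value;
}
```

Calling enable_keepalive(sfd, 10, 5, 5) before connect() would give the client a 10-second idle time, and get_int_opt(sfd, IPPROTO_TCP, TCP_KEEPIDLE) should then report 10.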

Detecting disconnection

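Besides NAT entry timeouts, connections can break in many other ways (a loose network cable, for instance). The server needs to catch this promptly so that it can release resources, run its cleanup routines and/or notify other clients. This is why heartbeats are usually enabled on the server side.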

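5) TCP_KEEPINTVL sets the interval between probes once the peer has stopped answering;

6) TCP_KEEPCNT sets the number of unanswered probes after which the connection is dropped.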
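Taken together, the three timing options bound how long a dead peer can go unnoticed. The function below is a back-of-the-envelope sketch of that arithmetic as I understand the Linux behaviour: one idle period, then the probes spaced one interval apart, with the connection dropped one interval after the last unanswered probe. The function name is hypothetical.

```c
/* Rough number of seconds between the last packet received from the peer
 * and the moment the kernel gives up on the connection, assuming every
 * probe goes unanswered. */
int keepalive_detection_seconds(int idle_s, int intvl_s, int cnt)
{
    return idle_s + intvl_s * cnt;
}
```

With an idle time of 10 seconds, an interval of 5 seconds and a threshold of 5, this gives 10 + 5 * 5 = 35 seconds.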

Next we modify the server and client slightly to verify this behaviour.

On the server side, we add the socket options described above:

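#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>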

#define BUF_SIZE 256

int main(int argc, char *argv[])
{
  int sfd, rfd, portno;
  socklen_t clilen; /* accept() expects a socklen_t *, not an int * */
  char buffer[BUF_SIZE];
  struct sockaddr_in serv_addr, cli_addr;
  int n;
  int flags = 1;

  if (argc < 2) { fprintf(stderr, "usage: %s port\n", argv[0]); exit(1); }

  sfd = socket(AF_INET, SOCK_STREAM, 0);
  if (sfd < 0) { perror("ERROR: socket()"); exit(0); }

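  /* Timing options set on the listening socket are inherited by accepted sockets on Linux. */
  flags = 10; /* idle seconds before the first probe */
  if (setsockopt(sfd, SOL_TCP, TCP_KEEPIDLE, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), TCP_KEEPIDLE"); exit(1); }

  flags = 5; /* unanswered probes before the connection is dropped */
  if (setsockopt(sfd, SOL_TCP, TCP_KEEPCNT, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), TCP_KEEPCNT"); exit(1); }

  flags = 5; /* seconds between unanswered probes */
  if (setsockopt(sfd, SOL_TCP, TCP_KEEPINTVL, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), TCP_KEEPINTVL"); exit(1); }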

  bzero((char *) &serv_addr, sizeof(serv_addr));
  portno = atoi(argv[1]);
  serv_addr.sin_family = AF_INET;
  serv_addr.sin_addr.s_addr = INADDR_ANY;
  serv_addr.sin_port = htons(portno);

  if (bind(sfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) { perror("ERROR: bind()"); exit(0); }

  listen(sfd, 5);
  while (1) {
    clilen = sizeof(cli_addr);
    rfd = accept(sfd, (struct sockaddr *) &cli_addr, &clilen);
    if (rfd < 0) { perror("ERROR: accept()"); exit(0); }

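    /* Enable keepalive once per accepted connection. */
    flags = 1;
    if (setsockopt(rfd, SOL_SOCKET, SO_KEEPALIVE, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), SO_KEEPALIVE"); exit(1); }

    while (1) {
      n = read(rfd, buffer, BUF_SIZE - 1);
      if (n <= 0) { printf("read() ends\n"); break; }
      buffer[n] = '\0'; /* read() does not NUL-terminate the data */

      printf("received: %s %d\n", buffer, n);
      snprintf(buffer, BUF_SIZE, "got it");
      n = write(rfd, buffer, strlen(buffer)); /* send only the reply text */
      if (n < 0) { perror("ERROR: write()"); exit(1); }
    }
    close(rfd); /* release this connection before accepting the next one */
  }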

  return 0;
} 

Then change the client to:

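#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>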

int main(int argc, char *argv[]) {
  int sfd, portno, n;
  struct sockaddr_in srvaddr;
  struct hostent *host;
  
  if (argc < 3) { fprintf(stderr,"usage: %s ip port\n", argv[0]); exit(0); }
  
  portno = atoi(argv[2]);
  sfd = socket(AF_INET, SOCK_STREAM, 0);
  if (sfd < 0) { perror("ERROR: socket()"); exit(0); }
  
  host = gethostbyname(argv[1]);
  if (host == NULL) { fprintf(stderr,"ERROR: host does not exist"); exit(0); }
  
  bzero((char *) &srvaddr, sizeof(srvaddr));
  srvaddr.sin_family = AF_INET;
  srvaddr.sin_port = htons(portno);
  bcopy((char *)host->h_addr, (char *)&srvaddr.sin_addr.s_addr, host->h_length);
  
  if (connect(sfd, (struct sockaddr *)&srvaddr, sizeof(srvaddr)) < 0) { perror("ERROR: connect()"); exit(0); }
  
  sleep(100000);
  
  return 0;
}

tcpdump output:

// ========================> handshakes are omitted here
20:04:12.535386 IP 192.168.1.66.49232 > 192.168.1.71.6666: Flags [.], ack 1, win 229, options [nop,nop,TS val 12312604 ecr 9154395], length 0
20:04:22.538591 IP 192.168.1.71.6666 > 192.168.1.66.49232: Flags [.], ack 1, win 227, options [nop,nop,TS val 9161936 ecr 12312604], length 0
20:04:22.570817 IP 192.168.1.66.49232 > 192.168.1.71.6666: Flags [.], ack 1, win 229, options [nop,nop,TS val 12315113 ecr 9154395], length 0
// ========================> we unplug the network connection here
20:04:32.586590 IP 192.168.1.71.6666 > 192.168.1.66.49232: Flags [.], ack 1, win 227, options [nop,nop,TS val 9164448 ecr 12315113], length 0
20:04:37.594590 IP 192.168.1.71.6666 > 192.168.1.66.49232: Flags [.], ack 1, win 227, options [nop,nop,TS val 9165700 ecr 12315113], length 0
20:04:42.602590 IP 192.168.1.71.6666 > 192.168.1.66.49232: Flags [.], ack 1, win 227, options [nop,nop,TS val 9166952 ecr 12315113], length 0
20:04:47.610591 IP 192.168.1.71.6666 > 192.168.1.66.49232: Flags [.], ack 1, win 227, options [nop,nop,TS val 9168204 ecr 12315113], length 0
20:04:52.618596 IP 192.168.1.71.6666 > 192.168.1.66.49232: Flags [.], ack 1, win 227, options [nop,nop,TS val 9169456 ecr 12315113], length 0

Here the interval between probes to an unresponsive peer is set to 5 seconds (see the timestamps) and the threshold to 5:

flags = 5;
if (setsockopt(sfd, SOL_TCP, TCP_KEEPCNT, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), TCP_KEEPCNT"); exit(1); }

flags = 5;
if (setsockopt(sfd, SOL_TCP, TCP_KEEPINTVL, (void *)&flags, sizeof(flags))) { perror("ERROR: setsockopt(), TCP_KEEPINTVL"); exit(1); }

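So once 5 probes have gone unanswered, the call that had been blocking all along,

n = read(rfd, buffer, BUF_SIZE);

returns (on Linux with -1, and errno typically set to ETIMEDOUT). So unlike the heartbeats themselves, a detected disconnection is reported to the application waiting on the socket.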
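Because the server treats every n <= 0 the same way, it cannot tell an orderly close from a keepalive timeout. A minimal sketch of how the cases could be told apart follows. The enum and the function name classify_read are hypothetical, and the ETIMEDOUT mapping is an assumption about typical Linux behaviour: other errno values are possible and fall through to the generic error case.

```c
#include <errno.h>
#include <sys/types.h>

enum conn_event {
    CONN_DATA,              /* read() delivered payload bytes */
    CONN_PEER_CLOSED,       /* read() returned 0: the peer sent FIN */
    CONN_KEEPALIVE_TIMEOUT, /* probes went unanswered */
    CONN_RETRY,             /* interrupted by a signal, call read() again */
    CONN_ERROR              /* anything else */
};

/* n is the value read() returned, err is errno saved right after the call. */
enum conn_event classify_read(ssize_t n, int err)
{
    if (n > 0) return CONN_DATA;
    if (n == 0) return CONN_PEER_CLOSED;
    if (err == ETIMEDOUT) return CONN_KEEPALIVE_TIMEOUT;
    if (err == EINTR) return CONN_RETRY;
    return CONN_ERROR;
}
```

In the server loop this would be called as classify_read(n, errno) immediately after read(), before any other call can overwrite errno.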

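Some thoughts

When to disable heartbeats

On mobile networks, sending packets periodically keeps the radio in its active state for long periods. When an app running in the background does this, the unexplained jump in battery drain may alarm users. So in that situation I would choose to reconnect instead.

When heartbeats are unnecessary

When the backend is always busy, the business packets themselves prove that the connection is still up. So it is enough for the server to kill a connection only when no business packet has arrived for a long time.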
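One possible way to implement "kill the connection after a long silence" without heartbeats is a receive timeout on the accepted socket. This is a technique I am swapping in for illustration, not something the original programs do, and the helper name set_read_timeout is hypothetical.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper: makes a blocking read() on fd give up after
 * `seconds` without incoming data. read() then returns -1 with errno set
 * to EAGAIN or EWOULDBLOCK, and the server can close the connection.
 * Returns 0 on success, -1 on failure (errno set by setsockopt()). */
int set_read_timeout(int fd, long seconds)
{
    struct timeval tv;

    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}
```

For example, set_read_timeout(rfd, 300) right after accept() would make the server's read() return after five minutes without business packets, at which point the server can close rfd.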

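For finer control, you can also turn heartbeats on only after a client has been silent for a long time. Note that to enable the heartbeat option at run time you must use rfd, the file descriptor returned by accept(), which represents the established connection. The values of the other options can be inherited from sfd.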
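A sketch of that finer-grained policy is below. It assumes the server records when each client was last heard from; that bookkeeping, the enum and the function name enable_keepalive_if_quiet are all hypothetical.

```c
#include <sys/socket.h>
#include <time.h>

enum idle_step { IDLE_NOTHING, IDLE_KEEPALIVE_ON, IDLE_FAILED };

/* Hypothetical policy helper: once the connection behind rfd has been
 * silent for at least quiet_s seconds, switch keepalive on for it.
 * rfd must be the descriptor returned by accept(). */
enum idle_step enable_keepalive_if_quiet(int rfd, time_t last_seen,
                                         time_t now, time_t quiet_s)
{
    int on = 1;

    if (now - last_seen < quiet_s)
        return IDLE_NOTHING;    /* still chatty: business packets prove liveness */
    if (setsockopt(rfd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return IDLE_FAILED;     /* errno describes the failure */
    return IDLE_KEEPALIVE_ON;   /* the kernel probes from here on */
}
```

The idle time, interval and threshold do not need to be set again here if they were already configured on the listening socket.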

System parameters

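The system-wide defaults behind these options can also be changed through procfs or sysctl. They apply to sockets that enable SO_KEEPALIVE without setting the per-socket options.

TCP_KEEPIDLE -> /proc/sys/net/ipv4/tcp_keepalive_time
TCP_KEEPCNT -> /proc/sys/net/ipv4/tcp_keepalive_probes
TCP_KEEPINTVL -> /proc/sys/net/ipv4/tcp_keepalive_intvl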
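To check the current system-wide values from a program, the procfs files can simply be read as text. A small sketch, with a hypothetical helper name:

```c
#include <stdio.h>

/* Reads one integer from a procfs file.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
long read_proc_long(const char *path)
{
    long value = -1;
    FILE *f = fopen(path, "r");

    if (f == NULL) return -1;
    if (fscanf(f, "%ld", &value) != 1) value = -1;
    fclose(f);
    return value;
}
```

For example, read_proc_long("/proc/sys/net/ipv4/tcp_keepalive_time") returns the default idle time in seconds, which is typically 7200 on an untuned Linux system.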

