Common Kernel Network Parameters for Linux Instances and Troubleshooting of Common Problems


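This article summarizes common Linux kernel parameters and the problems related to them. Before you modify kernel parameters:

  • Start from an actual need, preferably backed by relevant data. Adjusting kernel parameters arbitrarily is not recommended.
  • Understand exactly what each parameter does, and note that kernel parameters may differ across environment types and versions.
  • Back up the important data on the ECS instance. See the document "Create a snapshot".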

Viewing and modifying kernel parameters on a Linux instance

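Method 1: Through the /proc/sys/ directory

View a kernel parameter: use cat to view the content of the corresponding file. For example, run the following command to view the value of net.ipv4.tcp_tw_recycle:

    cat /proc/sys/net/ipv4/tcp_tw_recycle

Modify a kernel parameter: use echo to write to the file that corresponds to the parameter. For example, run the following command to set net.ipv4.tcp_tw_recycle to 0:

    echo "0" > /proc/sys/net/ipv4/tcp_tw_recycle

Note

  • The /proc/sys/ directory is a pseudo-directory that the Linux kernel generates after startup. Its net subdirectory holds all the network kernel parameters currently enabled on the system, and the directory tree mirrors the full parameter name. For example, net.ipv4.tcp_tw_recycle corresponds to the file /proc/sys/net/ipv4/tcp_tw_recycle, and the content of the file is the parameter value.

  • Values changed with Method 1 take effect only for the current boot. After a restart the system reverts to the previous values, so this method is generally used to verify the effect of a change temporarily. For a permanent change, see Method 2.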
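The mapping between a parameter name and its file under /proc/sys/ can be sketched as a small shell helper. The function names sysctl_name_to_path and read_kernel_param are illustrative and not part of any tool. The sketch assumes that no path component itself contains a dot, and it reports a missing parameter instead of failing silently (for example, tcp_tw_recycle no longer exists on Linux 4.12 and later).

```shell
#!/bin/sh
# Convert a dotted sysctl name into its /proc/sys path.
# Assumption: no component of the name contains a literal dot.
sysctl_name_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print the current value of a parameter, or report that it is absent.
read_kernel_param() {
    _path="$(sysctl_name_to_path "$1")"
    if [ -r "$_path" ]; then
        cat "$_path"
    else
        echo "not available on this kernel: $1" >&2
        return 1
    fi
}

# Example (run on the instance):
#   read_kernel_param net.ipv4.tcp_fin_timeout
```

Writing a value works the same way in reverse: redirect echo into the path that sysctl_name_to_path prints, as root.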

Method 2: Through the sysctl.conf file

View kernel parameters: run sysctl -a to list all parameters currently in effect on the system, for example:

    net.ipv4.tcp_app_win = 31
    net.ipv4.tcp_adv_win_scale = 2
    net.ipv4.tcp_tw_reuse = 0
    net.ipv4.tcp_frto = 2
    net.ipv4.tcp_frto_response = 0
    net.ipv4.tcp_low_latency = 0
    net.ipv4.tcp_no_metrics_save = 0
    net.ipv4.tcp_moderate_rcvbuf = 1
    net.ipv4.tcp_tso_win_divisor = 3
    net.ipv4.tcp_congestion_control = cubic
    net.ipv4.tcp_abc = 0
    net.ipv4.tcp_mtu_probing = 0
    net.ipv4.tcp_base_mss = 512
    net.ipv4.tcp_workaround_signed_windows = 0
    net.ipv4.tcp_challenge_ack_limit = 1000
    net.ipv4.tcp_limit_output_bytes = 262144
    net.ipv4.tcp_dma_copybreak = 4096
    net.ipv4.tcp_slow_start_after_idle = 1
    net.ipv4.cipso_cache_enable = 1
    net.ipv4.cipso_cache_bucket_size = 10
    net.ipv4.cipso_rbm_optfmt = 0
    net.ipv4.cipso_rbm_strictvalid = 1

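Modify kernel parameters

  1. Run /sbin/sysctl -w kernel.parameter="example" to change a parameter, for example:

         sysctl -w net.ipv4.tcp_tw_recycle="0"

  2. Run vi /etc/sysctl.conf and modify the parameter in the /etc/sysctl.conf file.

  3. Run /sbin/sysctl -p to make the configuration take effect.

Note: after kernel parameters are adjusted, the kernel may be in an unstable state, so restarting the instance is recommended.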
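Editing /etc/sysctl.conf by hand can leave duplicate or conflicting lines for the same parameter. The following is a minimal sketch of a helper that rewrites a sysctl.conf-style file so that it holds exactly one assignment for a given key. The name set_sysctl_conf is hypothetical, and the helper assumes the simple "key = value" line format shown above.

```shell
#!/bin/sh
# set_sysctl_conf FILE KEY VALUE
# Rewrite FILE so that it contains exactly one "KEY = VALUE" line.
# Commented-out lines and all other keys are left untouched.
set_sysctl_conf() {
    _file="$1"; _key="$2"; _value="$3"
    _tmp="$(mktemp)" || return 1
    if [ -f "$_file" ]; then
        # Drop existing assignments to KEY, keep every other line.
        awk -v key="$_key" '
            { line = $0; sub(/^[ \t]+/, "", line); split(line, kv, /[ \t]*=/) }
            kv[1] != key { print }
        ' "$_file" > "$_tmp"
    fi
    printf '%s = %s\n' "$_key" "$_value" >> "$_tmp"
    # Copy the content back so the file keeps its owner and permissions.
    cat "$_tmp" > "$_file" && rm -f "$_tmp"
}

# Example (as root; back up the file first):
#   set_sysctl_conf /etc/sysctl.conf net.ipv4.tcp_fin_timeout 30 && /sbin/sysctl -p
```

Trying the helper on a copy of the file before touching /etc/sysctl.conf keeps the change easy to review.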

Common problems caused by Linux network kernel parameters and how to handle them

Packet loss on an ECS instance because the NAT hash table of the Linux instance is full

Kernel parameters involved:

  • net.netfilter.nf_conntrack_buckets
  • net.nf_conntrack_max

Symptom

The ECS Linux instance loses packets intermittently and cannot be connected to. Troubleshooting with tools such as tracert and mtr shows no anomaly on the external network. Meanwhile, the system log repeatedly shows a large number of "table full, dropping packet." errors, as shown below:

    Feb  6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
    Feb  6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
    Feb  6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
    Feb  6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.

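Cause analysis

ip_conntrack is the module in the Linux system that tracks connection entries for NAT. The ip_conntrack module uses a hash table to record established TCP connections. When this hash table is full, the error nf_conntrack: table full, dropping packet is reported. Linux allocates a space to maintain every TCP connection, and the size of this space is related to nf_conntrack_buckets and nf_conntrack_max. The default value of the latter is 4 times the former, and the former cannot be modified after the system starts, so the usual recommendation is to increase nf_conntrack_max.

Note: maintaining connections consumes a fair amount of memory. Increase nf_conntrack_max when the system is idle and has sufficient memory, and choose the value according to the condition of the system.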

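Solution

  1. Log on to the instance by using the management terminal.
  2. Run vi /etc/sysctl.conf to edit the kernel configuration.
  3. Modify the maximum number of hash table entries:

         net.netfilter.nf_conntrack_max = 655350

  4. Modify the timeout parameter (the default timeout is 432000 seconds):

         net.netfilter.nf_conntrack_tcp_timeout_established = 1200

  5. Run sysctl -p to make the configuration take effect.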
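Before and after raising nf_conntrack_max, it can help to see how full the table actually is. The sketch below compares the current entry count with the limit. The function names are illustrative, and it assumes the files nf_conntrack_count and nf_conntrack_max under /proc/sys/net/netfilter/, which are present only while the nf_conntrack module is loaded.

```shell
#!/bin/sh
# conntrack_usage_pct COUNT MAX
# Print the integer percentage of the connection tracking table in use.
conntrack_usage_pct() {
    if [ "$2" -le 0 ]; then
        echo "max must be positive" >&2
        return 1
    fi
    echo $(( $1 * 100 / $2 ))
}

# Read the live values and print a one-line summary.
report_conntrack_usage() {
    _dir=/proc/sys/net/netfilter
    if [ -r "$_dir/nf_conntrack_count" ] && [ -r "$_dir/nf_conntrack_max" ]; then
        _count="$(cat "$_dir/nf_conntrack_count")"
        _max="$(cat "$_dir/nf_conntrack_max")"
        echo "conntrack: $_count of $_max entries in use ($(conntrack_usage_pct "$_count" "$_max")%)"
    else
        echo "nf_conntrack is not loaded or its files are not readable" >&2
        return 1
    fi
}

# Example (run on the instance):
#   report_conntrack_usage
```

A usage figure that stays close to 100% suggests that the limit, the established-connection timeout, or both need attention.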

Time wait bucket table overflow error

Kernel parameter involved:

  • net.ipv4.tcp_max_tw_buckets

Symptom

The /var/log/messages log of the Linux instance is filled with errors like kernel: TCP: time wait bucket table overflow, which indicate that the time wait bucket table has overflowed, as shown below:

    Feb 18 12:28:38 i-*** kernel: TCP: time wait bucket table overflow
    Feb 18 12:28:44 i-*** kernel: printk: 227 messages suppressed.
    Feb 18 12:28:44 i-*** kernel: TCP: time wait bucket table overflow
    Feb 18 12:28:52 i-*** kernel: printk: 121 messages suppressed.
    Feb 18 12:28:52 i-*** kernel: TCP: time wait bucket table overflow
    Feb 18 12:28:53 i-*** kernel: printk: 351 messages suppressed.
    Feb 18 12:28:53 i-*** kernel: TCP: time wait bucket table overflow
    Feb 18 12:28:59 i-*** kernel: printk: 319 messages suppressed.

Running the following command to count the TCP connections in TIME_WAIT state shows that there are a very large number of them:

    netstat -ant|grep TIME_WAIT|wc -l

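Cause analysis

The net.ipv4.tcp_max_tw_buckets parameter sets the maximum number of TIME_WAIT connections that the kernel manages. When the number of connections already in TIME_WAIT state plus those about to enter TIME_WAIT state exceeds net.ipv4.tcp_max_tw_buckets, the messages log reports the time wait bucket table overflow error and the kernel closes the TCP connections beyond the limit. Raise net.ipv4.tcp_max_tw_buckets as appropriate for your situation, and also improve how the application handles TCP connections.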

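Solution

  1. Run netstat -anp |grep tcp |wc -l to count the TCP connections.

  2. Run vi /etc/sysctl.conf and look up the net.ipv4.tcp_max_tw_buckets parameter. If connection usage is confirmed to be high, the limit is easily exceeded.

  3. Increase net.ipv4.tcp_max_tw_buckets to raise the limit.

  4. Run sysctl -p to make the configuration take effect.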
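Rather than counting one state at a time, the netstat output can be summarized per state in a single pass, which makes it easier to compare the TIME_WAIT count with the configured limit. This is a minimal sketch: count_tcp_states is an illustrative name, and it assumes the column layout of netstat -ant, where the state is the sixth field.

```shell
#!/bin/sh
# Read `netstat -ant` output on stdin and print "STATE COUNT" for each TCP state.
count_tcp_states() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print s, n[s] }' | sort
}

# Example (run on the instance):
#   netstat -ant | count_tcp_states
#   cat /proc/sys/net/ipv4/tcp_max_tw_buckets
```

If the TIME_WAIT figure is regularly close to the value of net.ipv4.tcp_max_tw_buckets, the overflow message is expected to appear.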

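Too many TCP connections in FIN_WAIT2 state on a Linux instance

Kernel parameter involved:

  • net.ipv4.tcp_fin_timeout

Symptom

There are too many TCP connections in FIN_WAIT2 state.

Cause analysis

  • In an HTTP service, the server may actively close a connection for some reason, for example when KEEPALIVE times out. As the side that actively closes the connection, the server then enters FIN_WAIT2 state.
  • The TCP/IP protocol stack has the concept of a half-closed connection, and FIN_WAIT2 state is not subject to a timeout. If the client does not close the connection, the FIN_WAIT2 state persists until the system restarts, and a growing number of FIN_WAIT2 connections can crash the kernel.
  • Lowering net.ipv4.tcp_fin_timeout is recommended, so that the system closes TCP connections in FIN_WAIT2 state sooner.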

Solution

  1. Run vi /etc/sysctl.conf and modify or add the following:

         net.ipv4.tcp_syncookies = 1
         net.ipv4.tcp_fin_timeout = 30
         net.ipv4.tcp_max_syn_backlog = 8192
         net.ipv4.tcp_max_tw_buckets = 5000

  2. Run sysctl -p to make the configuration take effect.

     Note: because TCP connections in FIN_WAIT2 state go on to enter TIME_WAIT state, also see the section "Time wait bucket table overflow error".

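A large number of TCP connections in CLOSE_WAIT state on a Linux instance

Symptom

Running the following command shows that a very large number of TCP connections on the system are in CLOSE_WAIT state:

    netstat -atn|grep CLOSE_WAIT|wc -l

Cause analysis

When a TCP connection is closed, either end can initiate the close. If the peer has initiated the close but the local end has not closed the connection, the connection stays in CLOSE_WAIT state. Although the connection is only half-closed, it can no longer be used to communicate with the peer and should be released promptly. At the application level, check promptly whether a connection has been closed by the peer, that is, make the program logic detect this and close the connection.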

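Solution

The read and write functions of a programming language generally let a program detect a TCP connection that is in CLOSE_WAIT state, for example:

Java

  1. Use the read method to check the I/O. When read returns -1, the end of the stream has been reached.

  2. Use the close method to close the connection.

C

  1. Check the return value of read.

    • If it is 0, the connection can be closed.
    • If it is less than 0, check errno. If errno is not EAGAIN, the connection can likewise be closed.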
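Before changing program logic, it is useful to know which process holds the CLOSE_WAIT connections. The following sketch groups them by owner. The name close_wait_by_process is illustrative, and it assumes the column layout of netstat -antp, where the sixth field is the state and the seventh is PID/Program name (shown as "-" when netstat is not run as root).

```shell
#!/bin/sh
# Read `netstat -antp` output on stdin and print "COUNT PID/PROGRAM"
# for connections in CLOSE_WAIT state, largest count first.
close_wait_by_process() {
    awk '$1 ~ /^tcp/ && $6 == "CLOSE_WAIT" { n[$7]++ }
         END { for (p in n) print n[p], p }' | sort -rn
}

# Example (as root, so that the owning process is visible):
#   netstat -antp | close_wait_by_process
```

The process at the top of the list is the first candidate for the read and close checks described above.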

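Client still cannot access a remote ECS or RDS server after NAT is configured

Kernel parameters involved:

  • net.ipv4.tcp_tw_recycle
  • net.ipv4.tcp_timestamps

Symptom

After NAT is configured on the client side, the client cannot access the remote ECS or RDS instance. This includes ECS instances in a VPC with SNAT configured. The client also cannot connect to other ECS instances, RDS instances, or other cloud products. A packet capture shows that the remote end does not respond to the SYN packets sent by the client.

Cause analysis

If both net.ipv4.tcp_tw_recycle and net.ipv4.tcp_timestamps are set to 1 on the remote server, the remote server checks the timestamp in every packet and does not respond to a packet whose timestamp is not increasing. After NAT is configured, the remote server sees the same source IP for different clients, but the clocks of the clients behind the NAT may differ, so the timestamps in the packets do not increase monotonically.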

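Solution

  • If the remote server is an ECS instance, set net.ipv4.tcp_tw_recycle to 0 on it.

  • If the remote server is a PaaS service such as RDS, its kernel parameters cannot be modified directly, so set net.ipv4.tcp_tw_recycle and net.ipv4.tcp_timestamps to 0 on the client.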
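The condition described in the cause analysis (both parameters set to 1) can be checked with a short script on the remote ECS instance. This is a sketch under assumptions: the function names are illustrative, and a missing tcp_tw_recycle file is treated as 0 because the parameter was removed in Linux 4.12.

```shell
#!/bin/sh
# nat_syn_drop_risk TW_RECYCLE TIMESTAMPS
# Print "risk" when both values are 1, otherwise "ok".
nat_syn_drop_risk() {
    if [ "$1" = "1" ] && [ "$2" = "1" ]; then
        echo "risk"
    else
        echo "ok"
    fi
}

# Read the live values; a missing file is treated as 0.
check_tw_recycle_settings() {
    _dir=/proc/sys/net/ipv4
    if [ -r "$_dir/tcp_tw_recycle" ]; then _r="$(cat "$_dir/tcp_tw_recycle")"; else _r=0; fi
    if [ -r "$_dir/tcp_timestamps" ]; then _t="$(cat "$_dir/tcp_timestamps")"; else _t=0; fi
    nat_syn_drop_risk "$_r" "$_t"
}

# Example (run on the remote server):
#   check_tw_recycle_settings
```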

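Linux kernel parameters covered in this document

Parameter: description

  • net.ipv4.tcp_max_syn_backlog: Sets the upper limit on the number of TCP connections in SYN_RECV state. SYN_RECV is the stage in which the system has received a SYN, has replied with SYN+ACK, and is waiting for the peer to send the final ACK of the three-way handshake.

  • net.ipv4.tcp_syncookies: Indicates whether TCP SYN cookies (SYN_COOKIES) are enabled. The kernel must be compiled with CONFIG_SYN_COOKIES. SYN cookies prevent a socket from being overloaded when too many connection attempts arrive. The default value 0 means disabled (the default may differ by kernel version and distribution). When the parameter is set to 1 and the SYN_RECV queue is full, the kernel changes how it replies to SYN packets: the initial sequence number in the SYN+ACK reply is a value computed from five inputs, namely the source IP and port, the destination IP and port, and the time. An attacker that never received the SYN+ACK cannot return an ACK that acknowledges this computed value, whereas a legitimate requester responds correctly based on the SYN+ACK it received. After net.ipv4.tcp_syncookies is enabled, net.ipv4.tcp_max_syn_backlog is ignored.

  • net.ipv4.tcp_synack_retries: Specifies how many times the SYN+ACK packet is retransmitted while the connection is in SYN_RECV state.

  • net.ipv4.tcp_abort_on_overflow: When this parameter is set to 1 and the system receives a large number of requests in a short time that the application cannot handle, the kernel sends Reset packets to terminate these connections directly. Improving the efficiency of the application to increase its processing capacity is recommended over simply resetting connections. Default value: 0.

  • net.core.somaxconn: Defines the maximum length of the listen queue for each port on the system. It is a global parameter. It is related to net.ipv4.tcp_max_syn_backlog: the latter is the upper limit on half-open connections that are still in the three-way handshake, whereas this parameter is the upper limit on connections in ESTABLISHED state. If the workload on your ECS instance is heavy, raise this parameter. The backlog argument of the listen(2) function likewise specifies the upper limit on ESTABLISHED connections for the listening port. When backlog is greater than net.core.somaxconn, net.core.somaxconn prevails.

  • net.core.netdev_max_backlog: When the kernel processes packets more slowly than the NIC receives them, the excess packets are held in the receive queue of the NIC. This parameter sets the upper limit of that queue.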
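The rule that net.core.somaxconn prevails when the backlog passed to listen(2) is larger amounts to taking the smaller of the two values. A one-function sketch, with effective_backlog as an illustrative name:

```shell
#!/bin/sh
# effective_backlog REQUESTED_BACKLOG SOMAXCONN
# Print the queue length that actually applies: the smaller of the two values.
effective_backlog() {
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

# Example (run on the instance, for an application that requests 1024):
#   effective_backlog 1024 "$(cat /proc/sys/net/core/somaxconn)"
```

If the printed value is lower than what the application requests, raising only the application setting has no effect until net.core.somaxconn is raised as well.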

