Permalink: 2012-12-14 10:32:15 by ning in misc tags: all

An Argument for Increasing TCP’s Initial Congestion Window

http://code.google.com/speed/articles/tcp_initcwnd_paper.pdf

init_cwnd=10 => 提升10%

慢启动阶段, 拥塞窗口一直增加, 直到丢包, 丢包通常是由于 路由器buffer overflow

  • 平均网页大小384KB.
  • 大多数连接,都是短连接.
  • 使用iptables 命令可以修改init_cwnd.

理论

  1. 90%的网页在 网页 在16KB 以内. 所以改为10, 90%的网页都可以在一个RTT内发出
  2. init_cwnd 比较大, 如果拥塞, 能够尽快的快速重传, 而不是等待超时. 一下发送10个包, 如果丢了5个, 发送方很可能收到dup ack, 所以出发 Fast Restransmit.

注意 除了init_cwnd 之外, client 的通告窗口(rwnd)如果太小, 也不能一下发送很多包,

效果

提升8%-16%

其它方法

cached cwnd. (可能是记录每个client ip 的最佳cwnd)

负面影响

  1. 重传增加 (增加很少)
  2. 浏览器本身会并发访问, 通常并发是3-6.

zetaTCP

对于小公司, 网页下载, 不需要这种技术, 直接调大init_cwnd 就行.

工具

修改init_cwnd

Step 1: check route settings.:

sajal@sajal-desktop:~$ ip route show
192.168.1.0/24 dev eth0  proto kernel  scope link  src 192.168.1.100  metric 1
169.254.0.0/16 dev eth0  scope link  metric 1000
default via 192.168.1.1 dev eth0  proto static
sajal@sajal-desktop:~$
Make a note of the line starting with default.

Step 2: Change the default settings:

Paste the current settings for default and add initcwnd 10 to it.

sajal@sajal-desktop:~$ sudo ip route change default via 192.168.1.1 dev eth0  proto static initcwnd 10

Step 3: Verify:

sajal@sajal-desktop:~$ ip route show
192.168.1.0/24 dev eth0  proto kernel  scope link  src 192.168.1.100  metric 1
169.254.0.0/16 dev eth0  scope link  metric 1000
default via 192.168.1.1 dev eth0  proto static  initcwnd 10

我试试:

# ip route show
10.26.140.0/24 dev eth1  proto kernel  scope link  src 10.26.140.32
169.254.0.0/16 dev eth1  scope link  metric 1003
192.168.0.0/16 via 10.26.140.1 dev eth1
172.16.0.0/12 via 10.26.140.1 dev eth1  initcwnd 10
10.0.0.0/8 via 10.26.140.1 dev eth1

# ip route change 172.16.0.0/12 via 10.26.140.1 dev eth1 initcwnd 20

# ip route show
10.26.140.0/24 dev eth1  proto kernel  scope link  src 10.26.140.32
169.254.0.0/16 dev eth1  scope link  metric 1003
192.168.0.0/16 via 10.26.140.1 dev eth1
172.16.0.0/12 via 10.26.140.1 dev eth1  initcwnd 20
10.0.0.0/8 via 10.26.140.1 dev eth1

initcwnd_check

需要apt-get 安装

libnetpacket-perl, libnet-pcap-perl, libnet-rawip-perl
(ENV)ning@ning-laptop ~/test/initcwnd_check$ sudo ./initcwnd_check.pl wlan0 http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
+ connected from 172.22.146.48:6031 to 74.125.31.95:80
* ajax.googleapis.com (74.125.31.95) - init_cwnd: 10 (13800 byte), init_rwnd: 10 (14300 byte)

(ENV)ning@ning-laptop ~/test/initcwnd_check$ sudo ./initcwnd_check.pl wlan0 http://10.26.140.32:8080/500K.data
+ connected from 172.22.146.48:4648 to 10.26.140.32:8080
* 10.26.140.32 (10.26.140.32) - init_cwnd: 10 (14280 byte), init_rwnd: 10 (14600 byte)

(ENV)ning@ning-laptop ~/test/initcwnd_check$ sudo ./initcwnd_check.pl wlan0 http://10.26.138.25:8080/500K.data
+ connected from 172.22.146.48:6988 to 10.26.138.25:8080
* 10.26.138.25 (10.26.138.25) - init_cwnd: 44 (62832 byte), init_rwnd: 44 (63443 byte)                                                  //44.

case

case win

虽然调大了,但是下面这个case 的还是只能发送4个, 因为接收放说win 5840:

(ENV)ning@ning-laptop ~/test/initcwnd_check$ sudo tcpdump -i wlan0 -nn 'host 123.125.115.254'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlan0, link-type EN10MB (Ethernet), capture size 96 bytes

19:36:42.621338 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [S], seq 2478746345, win 5840, options [mss 1460,sackOK,TS val 19574584 ecr 0,nop,wscale 7], length 0
19:36:42.638841 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [S.], seq 3088442681, ack 2478746346, win 5840, options [mss 1380,sackOK,nop,nop,nop,nop,nop,nop,nop,nop,nop,nop,nop,nop,nop,nop], length 0
19:36:42.638883 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [.], ack 1, win 5840, length 0
19:36:42.639081 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [P.], seq 1:122, ack 1, win 5840, length 121
19:36:42.642216 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [.], ack 122, win 14600, length 0
19:36:42.642969 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [.], seq 1:1025, ack 122, win 14600, length 1024
19:36:42.642999 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [.], ack 1025, win 7168, length 0
19:36:42.644692 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [.], seq 1025:2049, ack 122, win 14600, length 1024
19:36:42.644720 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [.], ack 2049, win 9216, length 0
19:36:42.645531 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [.], seq 2049:3073, ack 122, win 14600, length 1024
19:36:42.645548 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [.], ack 3073, win 11264, length 0
19:36:42.648202 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [P.], seq 3073:4097, ack 122, win 14600, length 1024
19:36:42.648229 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [.], ack 4097, win 13312, length 0

19:36:42.654387 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [.], seq 4097:5121, ack 122, win 14600, length 1024
19:36:42.654416 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [.], ack 5121, win 15360, length 0
19:36:42.659431 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [.], seq 5121:6145, ack 122, win 14600, length 1024
19:36:42.659455 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [.], ack 6145, win 17408, length 0
19:36:42.660544 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [P.], seq 6145:7169, ack 122, win 14600, length 1024
19:36:42.660558 IP 172.22.146.48.41102 > 123.125.115.254.80: Flags [.], ack 7169, win 19456, length 0
19:36:42.677194 IP 123.125.115.254.80 > 172.22.146.48.41102: Flags [.], seq 7169:8193, ack 122, win 14600, length 1024

case tso

我们在使用 x工具 之后, 关闭它之后. 用上面的initcwnd_check 工具查看,发现init_cwnd变成了6,很奇怪:

(ENV)ning@ning-laptop ~/test/initcwnd_check$ sudo ./initcwnd_check.pl wlan0 http://10.26.138.25:8080/50K.data
+ connected from 172.22.146.48:3832 to 10.26.138.25:8080
* 10.26.138.25 (10.26.138.25) - init_cwnd: 6 (8568 byte), init_rwnd: 10 (14600 byte)

抓包看:

# tcpdump -nn 'host 172.22.146.48'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes

17:28:32.410069 IP 172.22.146.48.2608 > 10.26.138.25.8080: Flags [S], seq 2380923257:2380923262, win 65535, options [mss 1460], length 5
17:28:32.410108 IP 10.26.138.25.8080 > 172.22.146.48.2608: Flags [S.], seq 1345408678, ack 2380923258, win 14600, options [mss 1460], length 0
17:28:32.411585 IP 172.22.146.48.2608 > 10.26.138.25.8080: Flags [.], ack 1, win 65535, options [mss 1460], length 0
17:28:32.412577 IP 172.22.146.48.2608 > 10.26.138.25.8080: Flags [P.], seq 1:218, ack 1, win 65535, options [mss 1460], length 217
17:28:32.412611 IP 10.26.138.25.8080 > 172.22.146.48.2608: Flags [.], ack 218, win 15544, length 0
17:28:32.412969 IP 10.26.138.25.8080 > 172.22.146.48.2608: Flags [.], seq 1:1461, ack 218, win 15544, length 1460
17:28:32.412983 IP 10.26.138.25.8080 > 172.22.146.48.2608: Flags [.], seq 1461:2921, ack 218, win 15544, length 1460
17:28:32.412991 IP 10.26.138.25.8080 > 172.22.146.48.2608: Flags [.], seq 2921:4381, ack 218, win 15544, length 1460
17:28:32.412999 IP 10.26.138.25.8080 > 172.22.146.48.2608: Flags [.], seq 4381:5841, ack 218, win 15544, length 1460
17:28:32.413005 IP 10.26.138.25.8080 > 172.22.146.48.2608: Flags [.], seq 5841:7301, ack 218, win 15544, length 1460
17:28:32.413012 IP 10.26.138.25.8080 > 172.22.146.48.2608: Flags [.], seq 7301:8761, ack 218, win 15544, length 1460
17:28:32.614498 IP 10.26.138.25.8080 > 172.22.146.48.2608: Flags [.], seq 1:1461, ack 218, win 15544, length 1460
17:28:32.622373 IP 172.22.146.48.2608 > 10.26.138.25.8080: Flags [R], seq 2380923475, win 65535, options [mss 1460], length 0

和一台正常机器对比:

# tcpdump -nn 'host 172.22.146.48'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
17:28:30.780922 IP 172.22.146.48.1634 > 10.26.140.32.8080: Flags [S], seq 1797559918:1797559923, win 65535, options [mss 1460], length 5
17:28:30.780961 IP 10.26.140.32.8080 > 172.22.146.48.1634: Flags [S.], seq 1660446299, ack 1797559919, win 14600, options [mss 1460], length 0
17:28:30.782374 IP 172.22.146.48.1634 > 10.26.140.32.8080: Flags [.], ack 1, win 65535, options [mss 1460], length 0
17:28:30.783062 IP 172.22.146.48.1634 > 10.26.140.32.8080: Flags [P.], seq 1:219, ack 1, win 65535, options [mss 1460], length 218
17:28:30.783096 IP 10.26.140.32.8080 > 172.22.146.48.1634: Flags [.], ack 219, win 15544, length 0
17:28:30.783548 IP 10.26.140.32.8080 > 172.22.146.48.1634: Flags [.], seq 1:2921, ack 219, win 15544, length 2920
17:28:30.783567 IP 10.26.140.32.8080 > 172.22.146.48.1634: Flags [.], seq 2921:7301, ack 219, win 15544, length 4380
17:28:30.783576 IP 10.26.140.32.8080 > 172.22.146.48.1634: Flags [.], seq 7301:11681, ack 219, win 15544, length 4380
17:28:30.783589 IP 10.26.140.32.8080 > 172.22.146.48.1634: Flags [.], seq 11681:14601, ack 219, win 15544, length 2920
17:28:30.985323 IP 10.26.140.32.8080 > 172.22.146.48.1634: Flags [.], seq 1:1461, ack 219, win 15544, length 1460
17:28:30.991513 IP 172.22.146.48.1634 > 10.26.140.32.8080: Flags [R], seq 1797560137, win 65535, options [mss 1460], length 0

同样是发了6个包. 客户端认为一个是6, 一个是10. 正常机器有大包, 怀疑是tso, 果然是两台机器tso 都开启:

# sudo ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on            *****
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off

怀疑是x工具把tso 状态搞坏了.

Comments