idea of how the CPUs map to processors, cores and threads:
^ CPU ^ processor ^ core ^ thread ^
| 0-5 | 0 | 0-5 | 0 |
| 6-11 | 1 | 0-5 | 0 |
| 12-17 | 0 | 0-5 | 1 |
| 18-23 | 1 | 0-5 | 1 |
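
One quick way to check this mapping on the machine itself (a sketch, not part of the original bench setup) is to read the topology fields straight from /proc/cpuinfo:

  # print processor / physical id / core id side by side for each logical CPU
  grep -E '^(processor|physical id|core id)' /proc/cpuinfo | paste - - -
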
How to split? Let's try different splittings.
  /root/inject -p 24 -d 60 -u 500 -s 20 -f small-24.txt -S 10.140.0.0-10.140.15.255:1024-65535
  241193 hits/s
| + | |||
| + | ====== dual ====== | ||
| + | |||
| + | To check on which side we have a bottle neck, lets try to have 2 | ||
| + | servers, or 2 clients. | ||
| + | |||
| + | Tests done with the lastest configurations (client and server) which | ||
| + | could give 240k hits/s. | ||
| + | |||
| + | ===== dual servers ===== | ||
| + | |||
| + | We get a second server with the same configuration, and checked it also | ||
| + | can handle the 240k/s. Then, we change the scenario to hit the 24 IPs | ||
| + | from both servers. | ||
| + | |||
| + | New input file: dual-24.txt | ||
| + | new page0a 0 | ||
| + | get 10.128.0.0:80 / | ||
| + | new page0b 0 | ||
| + | get 10.132.0.0:80 / | ||
| + | new page1a 0 | ||
| + | get 10.128.0.1:80 / | ||
| + | new page1b 0 | ||
| + | get 10.132.0.1:80 / | ||
| + | [...] | ||
| + | new page23a 0 | ||
| + | get 10.128.0.23:80 / | ||
| + | new page23b 0 | ||
| + | get 10.132.0.23:80 / | ||
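
The omitted entries follow the same pattern, so a small shell loop (a sketch, not from the original setup; it assumes the pattern simply continues up to page23) can generate the whole file:

  # generate dual-24.txt: two pages per index, one for each server
  for i in $(seq 0 23); do
      printf 'new page%da 0\nget 10.128.0.%d:80 /\n' "$i" "$i"
      printf 'new page%db 0\nget 10.132.0.%d:80 /\n' "$i" "$i"
  done > dual-24.txt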
| + | |||
| + | /root/inject -p 24 -d 60 -u 500 -s 20 -f dual-24.txt -S 10.140.0.0-10.140.15.255:1024-65535 | ||
| + | 401391 hits/s | ||
| + | |||
| + | Though the client seems to use all its CPU for 240k/s, it still can go | ||
| + | up and handle 400k hits/s. The bottle neck is probably not really on | ||
| + | that side. | ||
| + | |||
| + | ===== dual client ===== | ||
| + | |||
| + | We get a second client with the same configuration, and checked it also | ||
| + | can generate the 240k/s. | ||
| + | |||
| + | To launch both clients at the same time, cssh is very nice :) | ||
| + | |||
| + | /root/inject -p 24 -d 60 -u 500 -s 20 -f small-24.txt -S 10.140.0.0-10.140.15.255:1024-65535 | ||
| + | 123016 hits/s | ||
| + | 121312 hits/s | ||
| + | total: 244328 hits/s | ||
| + | |||
| + | Ok, client is clearly not the limitation, as with two clients, we get | ||
| + | the same total. | ||
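
For reference, a minimal sketch of launching the same command on both clients at roughly the same time without cssh (the client1/client2 hostnames are assumptions, not names from the original setup):

  # run inject on both clients in parallel over ssh, then wait for both to finish
  for h in client1 client2; do
      ssh "$h" '/root/inject -p 24 -d 60 -u 500 -s 20 -f small-24.txt -S 10.140.0.0-10.140.15.255:1024-65535' &
  done
  wait
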
====== conclusions ======
  * on a high-load configuration, reducing the number of processes to just one per used core is better
  * 240k connections / second is doable with a single host
| + | |||
| + | For some unknown reason (at the time of writing that documentation), the | ||
| + | connections highly drops for 1-2s, as can be seen on | ||
| + | [[http://www.hagtheil.net/files/system/benches10gbps/direct/bench-bad/nginx-bad/elastiques-nginx/|bench-bad/nginx-bad]] | ||
| + | graphs. I tried to avoid using results triggering such behaviour. Any ideas/hints on what could produce such are welcome. | ||
| + | |||
| + | ====== post-bench ====== | ||
| + | |||
| + | After publishing the first benches, someone adviced to use httpterm, instead of nginx. Unlike nginx, httpterm is aimed at only doing stress bench, and not serve real pages. | ||
| + | |||
| + | Bench using multi-process httpterm directly shows some bug. It still sends header, but fails to send data. Getting down to 1 process keep it running, but obviously not using all cores. | ||
| + | |||
| + | As we have 16 core for the web server, so 16 process with 1 IP each were launched, pinned with taskset on a cpu each. | ||
| + | |||
| + | file-0.cfg: | ||
| + | # taskset 000010 ./httpterm -D -f file-0.cfg | ||
| + | global | ||
| + | maxconn 30000 | ||
| + | ulimit-n 500000 | ||
| + | nbproc 1 | ||
| + | quiet | ||
| + | | ||
| + | listen proxy1 10.128.0.0:80 | ||
| + | object weight 1 name test1 code 200 size 200 | ||
| + | clitimeout 10000 | ||
| + | |||
That gives us more connections per second: 278765


That helps get even more requests per second, but we still get some stalls at times.