These benches start with the inject / httpterm configuration from the
direct benches, with 270-290k connections/s between a client and a server.
Monitoring graphs for the different benches can be found
[[http://www.hagtheil.net/files/system/benches10gbps/firewall/|here]].
====== Gateway ======

A gateway should be as neutral as possible to the network traffic going
through it. If we can get 270k hits/s with a client and a server talking
directly, it would be nice to keep that rate while the traffic transits
via our gateway.

Having 6 servers, the most we can do is probably having 2 clients
hitting 3 servers.
===== Baseline =====

For a first baseline, we start with 2 clients hitting 3 servers,
directly. No gateway involved.

If we check the different graphs to get an idea of the traffic going on,
we have (approximate readings on the graphs):

^ what ^ per client ^ per server ^ total ^ graph ^
^ conn/s | 400k | 266k | 800k | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/00-baseline-direct/douves-client/tcp_stats_conn_out.png|cli1]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/00-baseline-direct/muraille-client/tcp_stats_conn_out.png|cli2]] |
^ Gbps from cli/srv | 1.1/1.7 | 0.75/1.12 | 2.4/3.4 | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/00-baseline-direct/douves-client/interfaces_eth1_bps.png|cli1]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/00-baseline-direct/muraille-client/interfaces_eth1_bps.png|cli2]] |
^ Mpkt/s from cli/srv | 1.2/1.62 | 0.8/1.08 | 2.4/3.24 | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/00-baseline-direct/douves-client/interfaces_eth1_pkt.png|cli1]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/00-baseline-direct/muraille-client/interfaces_eth1_pkt.png|cli2]] |

Well, we can get a little over 3Gbps, with 800k connections/s. That
might not be enough to reach the limit of a 10Gbps gateway, but it
should already be enough to give us a hint of some limits.

===== Gateway =====

Now that we have an idea of the traffic we can generate, let's see how
it gets handled by a single gateway.

For our first test, the gateway will use the same interface in and out.
That should theoretically give us 5.6Gbps in and out of it.

We make sure our gateway forwards the packets, and doesn't send any
redirects (I use the same subnet, so the defaults might send redirects
to avoid the useless gateway hop).

  gateway#sysctl.conf
  net.ipv4.ip_forward = 1
  net.ipv4.conf.all.accept_redirects = 0
  net.ipv4.conf.all.send_redirects = 0
  net.ipv4.conf.default.accept_redirects = 0
  net.ipv4.conf.default.send_redirects = 0
  net.ipv4.conf.eth0.accept_redirects = 0
  net.ipv4.conf.eth0.send_redirects = 0
  net.ipv4.conf.eth1.accept_redirects = 0
  net.ipv4.conf.eth1.send_redirects = 0
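
Once added to the file, those settings can be applied without a reboot,
with standard sysctl usage:

  sysctl -p /etc/sysctl.conf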

And something "heard" as being a good idea (checked in a later bench):

  gateway#/etc/rc.local
  ethtool -G eth1 rx 4096
  ethtool -G eth1 tx 4096
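
The hardware limits and current values of those ring buffers can be
checked with the lower-case option (a quick check, not part of the
original bench):

  ethtool -g eth1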

As we don't have any process running here, only the kernel handling the
interrupts, the irq affinity is spread among all the processor threads:

  eth1-TxRx-0 0
  eth1-TxRx-1 1
  eth1-TxRx-2 2
  [...]
  eth1-TxRx-22 22
  eth1-TxRx-23 23
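
That spreading boils down to writing one CPU number per queue into the
matching /proc entry. A minimal sketch of the idea, assuming the queue
names appear in /proc/interrupts as above (the exact layout differs
between drivers and kernels):

<code>
# pin each eth1-TxRx-N queue on CPU N
grep eth1-TxRx- /proc/interrupts | while read irq rest ; do
  irq=${irq%:}                 # strip the trailing colon from the irq number
  cpu=${rest##*eth1-TxRx-}     # the queue number is at the end of the line
  echo $cpu > /proc/irq/$irq/smp_affinity_list
done
</code>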

Results seem to indicate something very near the total of in/out we had
earlier, which is explained by the fact that both directions go in and
out of our gateway via the same interface.

[[http://www.hagtheil.net/files/system/benches10gbps/firewall/01-baseline-gateway/rempart-firewall/interfaces_eth1_bps.png|bps]]
[[http://www.hagtheil.net/files/system/benches10gbps/firewall/01-baseline-gateway/rempart-firewall/interfaces_eth1_pkt.png|pkt]]

===== no rules =====

OK, without doing anything but forwarding, the traffic gets through
pretty nicely. Let's just check that it doesn't change anything if we
have the firewall up, but without any rules.

  iptables -L
  iptables -t mangle -L
  iptables -t raw -L
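
Listing a table is enough to load its kernel module; whether the
modules really got pulled in can be verified afterwards (a quick check,
not part of the bench):

  lsmod | grep -E 'iptable_(filter|mangle|raw)'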

With no rule, and the 3 tables (filter, raw and mangle) loaded, we
already drop from 5.7 to 4.9 (both Gbps and Mpkt/s). That's down from
800k to just under 700k conn/s.

[[http://www.hagtheil.net/files/system/benches10gbps/firewall/02-baseline-norule/rempart-firewall/interfaces_eth1_bps.png|bps]]
[[http://www.hagtheil.net/files/system/benches10gbps/firewall/02-baseline-norule/rempart-firewall/interfaces_eth1_pkt.png|pkt]]

====== Firewall ======

===== Conntrack =====

[...]
values.

  # nat
  iptables -t nat -L
  # load the conntrack modules (ipv4)
  iptables -I FORWARD -m state --state ESTABLISHED
  iptables -D FORWARD -m state --state ESTABLISHED
  # increase the max conntrack (default: 256k)
  echo 33554432 > /proc/sys/net/netfilter/nf_conntrack_max
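
When raising nf_conntrack_max that much, the conntrack hash table size
usually has to follow, or every lookup walks long hash buckets; a
hedged suggestion (the value is an assumption, a common rule of thumb
being max/8):

  echo 4194304 > /sys/module/nf_conntrack/parameters/hashsize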

[...]
has on our connection rate.

  average connection rate over 5 minutes: 150976

But looking at the graph, we see a breakdown. It starts at 180k for 120
seconds, then there is a drastic drop to an average of 135k for the rest
of the time.

[[http://www.hagtheil.net/files/system/benches10gbps/firewall/03-firewall-conntrack/rempart-firewall/interfaces_eth1_bps.png|bps]]
[[http://www.hagtheil.net/files/system/benches10gbps/firewall/03-firewall-conntrack/rempart-firewall/interfaces_eth1_pkt.png|pkt]]

As the conntrack count increases linearly up to about 21.2M, and
decreases from that point, the drop seems to happen when the first
timeouts start to hit.

[...]
Having conntracking gives a performance hit. Having tracked connections
time out gives another performance hit.

To make sure it is related, we checked which timeouts were set to 120
seconds, and changed them.

  cd /proc/sys/net/netfilter
  grep 120 /proc/sys/net/netfilter/*
  nf_conntrack_tcp_timeout_fin_wait:120
  nf_conntrack_tcp_timeout_syn_sent:120
  nf_conntrack_tcp_timeout_time_wait:120
  echo 150 > nf_conntrack_tcp_timeout_fin_wait
  echo 180 > nf_conntrack_tcp_timeout_syn_sent
  echo 60 > nf_conntrack_tcp_timeout_time_wait
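
While a bench runs, the live tracking count can be followed from the
standard /proc counter (the 1-second interval is just a choice):

  watch -n1 cat /proc/sys/net/netfilter/nf_conntrack_count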

Testing with those values allowed us to see the break move to 60
seconds. Connections get into time_wait, and expire after 60s instead
of 120s.

[[http://www.hagtheil.net/files/system/benches10gbps/firewall/04-firewall-conntrack2/rempart-firewall/interfaces_eth1_bps.png|bps]]
[[http://www.hagtheil.net/files/system/benches10gbps/firewall/04-firewall-conntrack2/rempart-firewall/interfaces_eth1_pkt.png|pkt]]

Testing with nf_conntrack_tcp_timeout_time_wait set to 1s gives the low
performance right away, even though the conntrack count stays under
200k instead of a few million.

[[http://www.hagtheil.net/files/system/benches10gbps/firewall/05-firewall-conntrack3/rempart-firewall/interfaces_eth1_bps.png|bps]]
[[http://www.hagtheil.net/files/system/benches10gbps/firewall/05-firewall-conntrack3/rempart-firewall/interfaces_eth1_pkt.png|pkt]]

For our heavy connection loads, we clearly need to be able to *not*
track them.

===== notrack =====

Obviously, not tracking those requests would probably be a good idea.
Let's add the rules to do just that, and see if it helps.

  *raw
  -A PREROUTING -d 10.128.0.0/16 -p tcp -m tcp --dport 80 -j NOTRACK
  -A PREROUTING -s 10.128.0.0/16 -p tcp -m tcp --sport 80 -j NOTRACK
  -A PREROUTING -d 10.132.0.0/16 -p tcp -m tcp --dport 80 -j NOTRACK
  -A PREROUTING -s 10.132.0.0/16 -p tcp -m tcp --sport 80 -j NOTRACK
  -A PREROUTING -d 10.148.0.0/16 -p tcp -m tcp --dport 80 -j NOTRACK
  -A PREROUTING -s 10.148.0.0/16 -p tcp -m tcp --sport 80 -j NOTRACK

That lets us get about the same results as without the firewall modules
loaded. CPU usage on the firewall seems slightly higher on the 2
threads that handle most of the interrupts (I'd say from about 15-17%
to 18-20%).

Here, one of the CPUs is used at 100%, with only about 4.1Gbps and
4.2Mpkt/s, a total of about 590k conn/s instead of our 800k without the
firewall.

[[http://www.hagtheil.net/files/system/benches10gbps/firewall/06-firewall-notrack/rempart-firewall/interfaces_eth1_bps.png|bps]]
[[http://www.hagtheil.net/files/system/benches10gbps/firewall/06-firewall-notrack/rempart-firewall/interfaces_eth1_pkt.png|pkt]]

Going down to a single notrack rule gets us slightly better
performance:

  *raw
  -A PREROUTING -j NOTRACK

That gives us about 620k conn/s.

[[http://www.hagtheil.net/files/system/benches10gbps/firewall/07-firewall-notrack2/rempart-firewall/interfaces_eth1_bps.png|bps]]
[[http://www.hagtheil.net/files/system/benches10gbps/firewall/07-firewall-notrack2/rempart-firewall/interfaces_eth1_pkt.png|pkt]]
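
Note that the *raw snippets above are iptables-restore fragments; to
load one as-is, it needs its COMMIT line. A minimal sketch (the file
path is an assumption):

<code>
# /etc/iptables.rules (hypothetical path)
*raw
-A PREROUTING -j NOTRACK
COMMIT
</code>

loaded with ''iptables-restore < /etc/iptables.rules''.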

===== simple rules =====

Let's get back to a configuration without nat loaded, and see how a few
matching rules can affect our CPU usage and decrease the rate we get.

Earlier, we tried with filter, raw and mangle loaded directly.

Let's try with just filter loaded and no rules, then add useless
matches, like checking the source IP against IPs we are not even using.

  # with n being the number of matching rules...
  # (as written, the loop only generates valid addresses up to n=255)
  n=64
  iptables -F ; for ((i=0;i<n;++i)) ; { iptables -A FORWARD -s 10.0.0.$i ; }
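
A quick sanity check that the chain really holds n rules (''iptables
-S'' prints the chain policy first, hence the tail):

  iptables -S FORWARD | tail -n +2 | wc -l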

^ match rules ^ conn/s ^ pkt/s ^ graph ^
| 0 | 800k | 5.7M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0000/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0000/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| 16 | 780k | 5.6M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0010/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0010/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| 64 | 730k | 5.1M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0040/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0040/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| 256 | 480k | 3.38M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0100/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0100/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| 1024 | 148k | 1.05M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0400/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/08-firewall-simple-rules_0400/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |

===== other matches =====

Source matching has an impact on the rates. Let's check other kinds of
matches.

Tests done with 256 match rules.

^ match rule ^ conn/s ^ pkt/s ^ graph ^
| -m u32 --u32 "0xc&0xffffffff=0xa0000`printf %02x $i`" | 67k | 480k | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-u32-src/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-u32-src/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| -p udp -m udp --dport 53 | 315k | 2.4M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-udp/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-udp/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| -p tcp -m tcp --dport 443 | 155k | 1.1M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-tcp-https/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-tcp-https/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| -p tcp -m tcp --dport 80 (does match) | 140k | 990k | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-tcp-http/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-tcp-http/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| -d 10.0.0.$i | 460k | 3.2M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-dst/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/09-firewall-rule-dst/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |

Different kinds of matches have a different impact. -d and -s have
about the same impact.

===== other configs =====

Tests done with 256 -s xxx matches, as that's the match that gave the
best performance so far.

^ config ^ conn/s ^ pkt/s ^ graph ^
| default | 480k | 3.38M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/10-firewall-txqueuelen1k/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/10-firewall-txqueuelen1k/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| ethtool -G eth1 {tx/rx} 512 | 505k | 3.6M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/10-firewall-ethtool512/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/10-firewall-ethtool512/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| ethtool -G eth1 {tx/rx} 64 | 450k | 3.2M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/10-firewall-ethtool64/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/10-firewall-ethtool64/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| ip link set eth1 txqueuelen 10000 | 470k | 3.3M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/10-firewall-txqueuelen10k/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/10-firewall-txqueuelen10k/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |

txqueuelen: no effect.

The rx/tx ring parameters do have an effect, and neither too big nor
too small is best.

===== tree search =====

netfilter allows using chains: with a few matches, you can jump to a
chain, and skip everything inside it when the match fails.

As seen previously, having a lot of matches in a single chain means the
packet is tested against every possible match.

Let's see how we could have a per-IP match for a whole /13 (yes, that
means 512k different IPs).

Using iptables to generate that many entries is just too slow, it would
take days. Generating the ruleset as text and loading it with
iptables-restore performs far better (5-10 minutes instead of days).
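
As a rough illustration of that approach, a generator can emit the
whole ruleset on stdout and load it in a single call. A minimal sketch
covering just one /24 (the addresses and the RETURN target are
assumptions, not the exact script used for the bench):

<code>
#!/bin/sh
# build the ruleset as plain text, then load everything with a single
# iptables-restore call -- one iptables invocation per rule is what
# takes days
(
  echo "*filter"
  for i in $(seq 0 255) ; do
    echo "-A FORWARD -s 10.139.5.$i -j RETURN"
  done
  echo "COMMIT"
) | iptables-restore --noflush
</code>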

Having 512k rules to match for each packet would slow down the traffic
a lot. A way around it is to use a few matches to jump to a specific
chain, then match a few more bits there, and jump again.

The ideas for that come from Jesper Dangaard Brouer's slides about
[[http://www.slideshare.net/brouer/netfilter-making-large-iptables-rulesets-scale|Making large iptables rulesets scale]].
Unfortunately, the perl library has not been updated yet to build with
the wheezy iptables version.

Example of the rules that would be checked/matched for 10.139.5.43:

<code>
-A FORWARD -s 10.128.0.0/12 -j cidr_12_176160768 (match)
-A cidr_12_176160768 -s 10.136.0.0/14 -j cidr_14_176685056 (match)
-A cidr_14_176685056 -s 10.136.0.0/16 -j cidr_16_176685056
-A cidr_14_176685056 -s 10.137.0.0/16 -j cidr_16_176750592
-A cidr_14_176685056 -s 10.138.0.0/16 -j cidr_16_176816128
-A cidr_14_176685056 -s 10.139.0.0/16 -j cidr_16_176881664 (match)
-A cidr_16_176881664 -s 10.139.0.0/18 -j cidr_18_176881664 (match)
-A cidr_18_176881664 -s 10.139.0.0/20 -j cidr_20_176881664 (match)
-A cidr_20_176881664 -s 10.139.0.0/22 -j cidr_22_176881664
-A cidr_20_176881664 -s 10.139.4.0/22 -j cidr_22_176882688 (match)
-A cidr_22_176882688 -s 10.139.4.0/24 -j cidr_24_176882688
-A cidr_22_176882688 -s 10.139.5.0/24 -j cidr_24_176882944 (match)
-A cidr_24_176882944 -s 10.139.5.0/26 -j cidr_26_176882944 (match)
-A cidr_26_176882944 -s 10.139.5.0/28 -j cidr_28_176882944
-A cidr_26_176882944 -s 10.139.5.16/28 -j cidr_28_176882960
-A cidr_26_176882944 -s 10.139.5.32/28 -j cidr_28_176882976 (match)
-A cidr_28_176882976 -s 10.139.5.32/30 -j cidr_30_176882976
-A cidr_28_176882976 -s 10.139.5.36/30 -j cidr_30_176882980
-A cidr_28_176882976 -s 10.139.5.40/30 -j cidr_30_176882984 (match)
-A cidr_30_176882984 -s 10.139.5.40/32 -j cidr_32_176882984
-A cidr_30_176882984 -s 10.139.5.41/32 -j cidr_32_176882985
-A cidr_30_176882984 -s 10.139.5.42/32 -j cidr_32_176882986
-A cidr_30_176882984 -s 10.139.5.43/32 -j cidr_32_176882987 (match)
-A cidr_32_176882987 ...
-A cidr_28_176882976 -s 10.139.5.44/30 -j cidr_30_176882988
-A cidr_26_176882944 -s 10.139.5.48/28 -j cidr_28_176882992
-A cidr_24_176882944 -s 10.139.5.64/26 -j cidr_26_176883008
-A cidr_24_176882944 -s 10.139.5.128/26 -j cidr_26_176883072
-A cidr_24_176882944 -s 10.139.5.192/26 -j cidr_26_176883136
-A cidr_22_176882688 -s 10.139.6.0/24 -j cidr_24_176883200
-A cidr_22_176882688 -s 10.139.7.0/24 -j cidr_24_176883456
-A cidr_20_176881664 -s 10.139.8.0/22 -j cidr_22_176883712
-A cidr_20_176881664 -s 10.139.12.0/22 -j cidr_22_176884736
-A cidr_18_176881664 -s 10.139.16.0/20 -j cidr_20_176885760
-A cidr_18_176881664 -s 10.139.32.0/20 -j cidr_20_176889856
-A cidr_18_176881664 -s 10.139.48.0/20 -j cidr_20_176893952
-A cidr_16_176881664 -s 10.139.64.0/18 -j cidr_18_176898048
-A cidr_16_176881664 -s 10.139.128.0/18 -j cidr_18_176914432
-A cidr_16_176881664 -s 10.139.192.0/18 -j cidr_18_176930816
-A cidr_12_176160768 -s 10.140.0.0/14 -j cidr_14_176947200
</code>

With at most 39 checks and 11 jumps, any IP within the /13 arrives in
its own chain (or a merged chain, if several IPs need the same rules).
Anything not even in the /12 gets just one check, and moves on to the
next entries.

^ bits matched per level ^ checks ^ jumps ^ conn/s ^ pkt/s ^ graph ^
| 2 | 39 | 11 | 560k | 3.9M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/11-fw-sourcetree-2/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/11-fw-sourcetree-2/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| 3 | 51 | 8 | 595k | 4.2M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/11-fw-sourcetree-3/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/11-fw-sourcetree-3/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| 4 | 73 | 6 | 580k | 4.0M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/11-fw-sourcetree-4/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/11-fw-sourcetree-4/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |
| 5 | 113 | 5 | 575k | 4.0M | [[http://www.hagtheil.net/files/system/benches10gbps/firewall/11-fw-sourcetree-5/rempart-firewall/interfaces_eth1_bps.png|bps]] [[http://www.hagtheil.net/files/system/benches10gbps/firewall/11-fw-sourcetree-5/rempart-firewall/interfaces_eth1_pkt.png|pkt]] |

Note: such a high number of rules uses a lot of memory, 20GB+ of RAM
here.

===== nat =====

Earlier, we already noticed that conntracking all our connections would
be too much. What if we had a plain 1:1 mapping that would not require
any tracking?

Well, iptables NOTRACK prevents any form of nat, so that can't be
done...

We will have to look for other solutions.

===== ipset =====

Some people mentioned ipset. Let's bench that.

<code>
# let's create some sets we might use
ipset create ip hash:ip
ipset create net hash:net
ipset create ip,port hash:ip,port
ipset create net,port hash:net,port
</code>
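
The sets then have to be filled before the rules reference them; for
example (the entries are arbitrary illustrations):

<code>
ipset add ip 10.136.0.1
ipset add net 10.136.0.0/16
ipset add ip,port 10.136.0.1,80
ipset add net,port 10.136.0.0/16,80
</code>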

Rules used for the different tests:
<code>
-A FORWARD -m set --match-set ip src
-A FORWARD -m set --match-set net src
-A FORWARD -m set --match-set net,port src,src
-A FORWARD -m set --match-set ip,port src,dst
</code>

Let's see how a few matches against hash:ip affect our traffic:

^ # rules ^ conn/s ^ pkt/s ^
| 1 | 570k | 3.6M |
| 2 | 340k | 2.05M |
| 3 | 240k | 1.45M |
| 4 | 184k | 1.1M |

OK, so just a few ipset matches affect us A LOT. What about the other
hash types?

(tests done with 2 matches)

^ ipset ^ conn/s ^ pkt/s ^
| hash:ip | 340k | 2.05M |
| hash:net | 350k | 2.1M |
| hash:ip,port | 330k | 2M |
| hash:net,port | 330k | 2M |

Matching nets or ips doesn't change much, and including the port is
only a light overhead, considering the overhead we already have.

What about ipset bitmaps?

<code>
ipset create bip0 bitmap:ip range 10.136.0.0-10.136.255.255
ipset create bip1 bitmap:ip range 10.140.0.0-10.140.255.255
</code>
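
A bitmap set is filled the same way, and accepts single addresses,
ranges or CIDR blocks within its declared range (illustrative entries):

<code>
ipset add bip0 10.136.0.1
ipset add bip0 10.136.5.0/24
</code>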

^ # rules ^ conn/s ^ pkt/s ^
| 2 | 550k | 3.5M |
| 4 | 320k | 1.9M |

Considering ipset is limited to 65k entries, and considering these
results, I would advise against using it, unless you really need the
easy-to-manage sets.

===== interface irq affinity =====

FIXME: add irq affinity matches with results

====== Conclusion ======

  * A lot of matching reduces performance.
  * u32 matches are costly.
  * If you can, match and segregate into different subchains, with about 8 to 16 matches per chain (for src/dst matches; maybe fewer with heavier matches).
  * irq affinity can change performance under high load.