Monday, August 13, 2007

Big Gateways with iptables

At HOTorNOT, all the servers run on an internal network. Only a select few, such as load balancers, mail machines and gateways, have a physical connection to the outside world. All the other servers reach the internet via the gateway. Here's why serious companies do this:
  • internet - many machines need outbound internet access. For example, webservers may pull API data from external sources.
  • security - you shouldn't be able to touch a database machine via its external IP
  • scarcity - being able to buy thousands of machines doesn't necessarily mean you can get thousands of IPs (at least until IPv6)
A few lessons we learned while working on the Facebook Platform:
  • It can get very popular *very* fast.
  • Facebook API calls must happen on the webservers
  • Our gateway machine got overloaded
The problem is that TCP/IP has some built-in connection timeouts which we must adhere to, and the webservers were making thousands of API calls, resulting in thousands of iptables-tracked connections. We were basically running out of ports on the gateway machine. While we weren't doing 65k active concurrent connections, the TCP/IP timeouts prevented a port from being reused for 5 minutes.
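To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes every connection goes to the same remote IP and port (so each one needs its own source port), treats all 65,535 ports as usable (the kernel's real ephemeral range is smaller), and uses the 5-minute hold time mentioned above. The function name and the choice of 6 IPs are only for illustration.

```python
# Estimate how many new outbound connections per second a gateway can
# sustain before it runs out of source ports.

USABLE_PORTS = 65535    # upper bound; the real ephemeral port range is smaller
HOLD_SECONDS = 5 * 60   # the 5-minute hold time quoted in the text


def max_new_connections_per_second(source_ips,
                                   ports_per_ip=USABLE_PORTS,
                                   hold_seconds=HOLD_SECONDS):
    """Steady-state rate at which ports free up across all source IPs."""
    return source_ips * ports_per_ip / hold_seconds


if __name__ == "__main__":
    # 6 is an illustrative number of external IPs, not a measured value.
    for ips in (1, 6):
        rate = max_new_connections_per_second(ips)
        print(f"{ips} source IP(s): about {rate:.0f} new connections/second")
```

Under these assumptions a single external IP tops out at a little over 200 new connections per second, which a handful of busy webservers can easily exceed.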

The solution.

Every IP can run concurrent connections on all 65k ports. What if you need to track more? Spread the love. The iptables component on Linux lets us load balance outbound connections over many source IPs. Here's a sample configuration for Fedora 6 (/etc/sysconfig/iptables):

Note: eth0 is internal (10.0.0.0/8) and eth1 is external.

*filter
:FORWARD DROP [0:0]
# Forward anything from the internal network; let only reply traffic back in.
-A FORWARD -i lo -j ACCEPT
-A FORWARD -i eth0 -s 10.0.0.0/8 -j ACCEPT
-A FORWARD -i eth1 -m state --state ESTABLISHED,RELATED -j ACCEPT
COMMIT

*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A POSTROUTING -o eth1 -j SNAT --to-source 4.2.2.0-4.2.2.5
COMMIT

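To make this work, the machine needs to have an alias for every IP address you want to spread load over. In this case, we created 5 aliases, eth1:0 through eth1:4; together with the primary address on eth1, that accounts for the six addresses in the SNAT range above.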
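As a quick sanity check that the aliases match the SNAT range, the sketch below expands a range into individual addresses and pairs them with interface names. It assumes the first address sits on eth1 itself and the rest on eth1:0 onward; that mapping, and the helper names, are illustrative guesses rather than the exact layout we used.

```python
import ipaddress


def snat_range(first, last):
    """Expand an inclusive first-last IPv4 range into a list of address strings."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    return [str(ipaddress.IPv4Address(n)) for n in range(start, end + 1)]


def interface_plan(device, addresses):
    """Pair the first address with the device and the rest with its aliases."""
    names = [device] + [f"{device}:{i}" for i in range(len(addresses) - 1)]
    return list(zip(names, addresses))


if __name__ == "__main__":
    # Range taken from the --to-source option in the sample configuration.
    for name, addr in interface_plan("eth1", snat_range("4.2.2.0", "4.2.2.5")):
        print(f"{name:8} {addr}")
```

If the number of rows printed does not match the number of addresses configured on the box, some of the SNAT source addresses will not be answered by the gateway.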

Before we figured this out, we had to throw in more externally facing machines and manually change the default gateways for our webservers.
