Alexander ([info]dj_alexander) wrote,
@ 2008-03-05 14:26:00
LazyWebHosting
Dear LazyWeb,

1 x Domain Name (e.g. www.domain.co.uk)
3 x Web Servers (e.g. www1.domain.co.uk [100.100.100.1] / www2.domain.co.uk [100.100.100.2] / www3.domain.co.uk [100.100.100.3])

What's the best way to load balance web sites these days? So that anyone looking at www.domain.co.uk gets a round robin or whatever to one of the physical web servers...? Does it require a network config or can it all be done via DNS?




[info]learath
2008-03-05 02:29 pm UTC (link)
It all depends on what you need. Round robin DNS would mostly work. If they are all in the same data center you could use a product like a BigIP F5 or a Cisco ACE load balancer (both of which start at 5 figures and go up).



[info]dj_alexander
2008-03-05 02:39 pm UTC (link)
They might be in the same data centre but at the moment they're not. I'd rather do it all via DNS as such to get it all set in stone and working now.

Obviously round robin DNS works as long as all the servers are up, but if one of the servers goes down we get screwed. I'm actually wondering what's the best way to do it so that if one server goes down nobody notices.

I suppose setting up a round robin DNS and then just removing the downed server listing is one answer... just wondering if there is actually a better way.



[info]learath
2008-03-05 02:51 pm UTC (link)
Remember that DNS changes may take up to 72 hours to propagate.



[info]pir
2008-03-05 02:54 pm UTC (link)
If you set the TTLs low enough to start with then it's only very broken clients that won't see the changes fairly quickly.



[info]dj_alexander
2008-03-05 02:58 pm UTC (link)
That's basically what I've come up with so far. Set the TTL to 1 hour, make the changes as they happen and hope for the best.



[info]mr_pete
2008-03-05 06:45 pm UTC (link)
1 hour? 15 mins or less! You can change them back afterwards.

Anyway, no-one else appears to have asked a sensible question: is anything interactive on these servers? If you have anything session/state based, and you're not running anything cunning (e.g. something like MS Application Server) then your user could visit, get a cookie specific to server 1, then 5 mins later, get shunted to server 2.

I'd be checking with your web admins that any BROWSER based tracking (cookies) is domain level (i.e. not www1, www2) and that any backend DB knows about all front end cookies and matches the state tracking.

Otherwise you'll have plenty of amused users....
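A minimal sketch of what "domain level" cookies could look like, using Python's standard http.cookies module. The cookie name "session" and the helper session_cookie_header are made up for illustration; the point is only that the Domain attribute names the shared site domain rather than www1 or www2, so the browser sends the same cookie to whichever server it lands on. The backend still has to share or look up the session state.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str, site_domain: str) -> str:
    """Build a Set-Cookie header scoped to the whole site domain.

    Hypothetical helper: the cookie name "session" is an assumption,
    not something taken from the original setup.
    """
    cookie = SimpleCookie()
    cookie["session"] = session_id
    # Scope the cookie to the shared domain, not to one web server's hostname.
    cookie["session"]["domain"] = site_domain
    cookie["session"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")


if __name__ == "__main__":
    print(session_cookie_header("abc123", "domain.co.uk"))
```

A cookie issued with the default host-only scope by www1.domain.co.uk would not be sent to www2.domain.co.uk, which is the failure described above.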

[EDIT : Ok, spikylau did ask at the end, I missed it!]



Edited at 2008-03-05 06:46 pm UTC



[info]giolla
2008-03-05 07:55 pm UTC (link)
If you're expecting outages you may as well go down as low as 30s, which is the F5 3DNS default. DNS traffic isn't terribly high bandwidth after all.
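To put rough numbers on the TTL values mentioned in this thread (1 hour, 15 minutes, 30 seconds), here is a back-of-envelope sketch. It assumes well-behaved caching resolvers that re-query once per TTL, and the 60 second detection delay and the count of 10,000 resolvers are invented figures for illustration only.

```python
def worst_case_stale_seconds(ttl_seconds: int, detection_seconds: int) -> int:
    """Longest time a well-behaved resolver can keep handing out a dead address.

    The record may have been cached just before the failure was noticed and
    the DNS entry removed, so the worst case is detection delay plus one TTL.
    """
    return ttl_seconds + detection_seconds


def refresh_queries_per_second(resolver_count: int, ttl_seconds: int) -> float:
    """Approximate steady-state query rate if every resolver re-asks once per TTL."""
    return resolver_count / ttl_seconds


if __name__ == "__main__":
    DETECTION_SECONDS = 60   # assumed time to notice the outage and update DNS
    RESOLVERS = 10_000       # assumed number of caching resolvers asking about the site
    for ttl in (3600, 900, 30):
        stale = worst_case_stale_seconds(ttl, DETECTION_SECONDS)
        rate = refresh_queries_per_second(RESOLVERS, ttl)
        print(f"TTL {ttl:>4}s: stale for up to {stale}s, about {rate:.1f} queries/s")
```

Resolvers that ignore low TTLs fall outside this estimate.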

As the stated set-up is one FQDN www.example.com and 3 servers, it's a reasonable assumption that something is already directing the traffic to the appropriate server so either:
1) There's nothing stateful
or
2) Once the client goes to wwwX that server provides a new url containing wwwX, or does something else clever to keep the client there.

if 1 there's no problem
if 2 then dropping or changing the DNS for a server that's out of service will confuse existing sessions but will mean new clients will work so that's a plus. Something must after all be controlling how clients get to and stay with ( if needed ) the appropriate server at present.



[info]poggs
2008-03-05 08:04 pm UTC (link)
ISTR there are some ISPs' nameservers which ignore 'too low' TTLs.



[info]giolla
2008-03-05 08:08 pm UTC (link)
You can only worry so much about other people's broken setups.



[info]poggs
2008-03-05 08:11 pm UTC (link)
Agreed.

Thankfully many SPs don't mess with Layer 3; they like to tinker further up the OSI model.



[info]mr_pete
2008-03-05 09:29 pm UTC (link)
Something must after all be controlling how clients get to and stay with ( if needed ) the appropriate server at present.

It's surprising how many sites don't do that....even base proxy servers corporates use don't do that - so if you have multiple exit points, you can get screwed.....



[info]giolla
2008-03-05 09:46 pm UTC (link)
If they don't have it now, then not worrying about it for the question at hand doesn't make things any worse, you keep the same level of service.



[info]dj_alexander
2008-03-05 07:58 pm UTC (link)
There would indeed be issues, mostly with live synching of logs and zope data, but I'm planning to get around this by running a syslog server. But that's a discussion for another day!



[info]pir
2008-03-05 02:53 pm UTC (link)
If you need to deal with one of the servers being down then you're no longer just load balancing and you can't do it automatically with just DNS (although there are DNS based load balancing systems, of which I think the F5 is one but it's years since I've looked at them).

Also it's normal to try and keep the same source IP's requests going to the same destination machine if you have any non-static content on the webservers.
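One common way to get that kind of affinity is to hash the client's source address and use the result to pick a backend. The sketch below is an illustration of the idea only, not how any particular load balancer implements it; pick_backend and the backend list are hypothetical names, with the addresses borrowed from the original post.

```python
import hashlib
from typing import Sequence

# Addresses from the original post, used here purely as sample data.
BACKENDS = ["100.100.100.1", "100.100.100.2", "100.100.100.3"]


def pick_backend(client_ip: str, backends: Sequence[str]) -> str:
    """Map a client IP to one backend so repeat requests land on the same machine."""
    if not backends:
        raise ValueError("no backends available")
    # A stable hash (unlike Python's built-in hash()) gives the same answer
    # across processes and restarts.
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    for ip in ("192.0.2.10", "192.0.2.11", "192.0.2.10"):
        print(ip, "->", pick_backend(ip, BACKENDS))
```

With plain modulo hashing, removing a backend reshuffles many clients onto different machines, so existing sessions can still break when a server drops out.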

For either of these you need something external, like the aforementioned BigIP F5 or netscalers. I know the netscalers can be put in multiple locations and run in pairs (pair per location) so you can fail over to another location. I'm not sure about two independent front ends.

To do this properly isn't a simple problem: the more fault cases you want to deal with and the better the redundancy level, the more complicated it becomes, so there's a tradeoff between cost/complexity and redundancy.



[info]giolla
2008-03-05 02:59 pm UTC (link)
The F5 3DNS/GTM units do this; the BigIP devices are for local load balancing.

As you say how complex it is all depends on how stateful the sessions are.

If there are 3 servers at present the load must be distributed between them somehow.



[info]dj_alexander
2008-03-05 03:02 pm UTC (link)
What it comes down to is that we can't do it at the moment, but because we're moving to a single data centre it'll be entirely possible to do some decent load balancing.

Essentially I'm trying to figure out whether "some decent load balancing" that I can manage pretty simply, and that can be done on a four figure budget, is possible. Hence my query a few months ago regarding Zyxel Zywall 1050s which look nice and fit both categories. Obviously it won't provide a complete redundancy solution but it's good enough for a company with only 20 people in it.



[info]giolla
2008-03-05 02:34 pm UTC (link)
It can be done with DNS, with multiple A records, which is the easy/cheap way to do it. Or you can do it better with a dedicated load balancer so that if one server is down traffic doesn't go to it. Proper load balancers like F5's BigIP/LTM or 3DNS/GTM devices are a lot of fun and let you do some quite clever stuff.
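As a rough illustration of the multiple A records approach, the sketch below simulates a nameserver that rotates the order of the records on each answer and a client that simply uses the first address it is given. This is a toy model with hypothetical function names; real nameservers and resolvers may shuffle, cache or reorder answers differently.

```python
from collections import Counter, deque
from typing import Dict, Iterator, List

# The three A records for www.domain.co.uk from the original post.
ADDRESSES = ["100.100.100.1", "100.100.100.2", "100.100.100.3"]


def round_robin_answers(addresses: List[str], queries: int) -> Iterator[List[str]]:
    """Yield one answer per query, rotating the record order each time."""
    ring = deque(addresses)
    for _ in range(queries):
        yield list(ring)
        ring.rotate(-1)  # the next answer starts with the next address


def first_answer_counts(addresses: List[str], queries: int) -> Dict[str, int]:
    """Count how often each address is listed first, i.e. picked by a naive client."""
    return dict(Counter(answer[0] for answer in round_robin_answers(addresses, queries)))


if __name__ == "__main__":
    print(first_answer_counts(ADDRESSES, 9))
```

Nothing in this model notices a dead server: an address that stops responding keeps getting its share of clients until it is removed from the record set.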

Or you can do it with clever network stuff and a shared IP address.

All valid, all in use, take your pick. Partly it depends on how separated the servers are: if they're all in the same rack on the same subnet you've more choices than if they're in multiple locations.



[info]dj_alexander
2008-03-05 02:42 pm UTC (link)
They're in multiple locations at the moment but they'll all be in the same location soon enough. But it's between then and now that I'm worried about, especially as one of the servers is being taken down for an hour on Saturday.



[info]giolla
2008-03-05 02:48 pm UTC (link)
In which case it has to be done by DNS as you suspect, so short of spending quite a few pennies for an F5 3DNS/GTM device which monitors servers and returns DNS accordingly you're basically stuck with round robin ( setting up multicast/anycast type stuff won't be viable ).

What you can do to get round the server going down problem, other than just manually drop it from DNS, is use dynamic DNS and have a monitoring script which adds/removes servers from DNS as they stop/start responding. A poor man's version of what the F5 kit does. Wouldn't be too tricky to set up and if you have monitoring tools in place already that can run scripts it's even easier.
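A minimal sketch of the checking half of such a monitoring script, in Python. Everything here is an assumption for illustration: the function names are made up, the health check is a plain TCP connect to port 80, and the step that actually pushes the change (a dynamic DNS update or your provider's interface) is deliberately left out because it depends on your DNS setup.

```python
import socket
from typing import Callable, Dict, List, Tuple

# Hypothetical pool, using the hostnames and addresses from the original post.
SERVERS: Dict[str, str] = {
    "www1.domain.co.uk": "100.100.100.1",
    "www2.domain.co.uk": "100.100.100.2",
    "www3.domain.co.uk": "100.100.100.3",
}


def is_up(ip: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_addresses(servers: Dict[str, str],
                      check: Callable[[str], bool] = is_up) -> List[str]:
    """Return the addresses that should currently be published as A records."""
    alive = [ip for ip in servers.values() if check(ip)]
    # If every check fails, the monitor itself may be cut off, so keep the
    # full list rather than publishing an empty record set.
    return alive or list(servers.values())


def plan_changes(published: List[str], desired: List[str]) -> Tuple[List[str], List[str]]:
    """Return (to_add, to_remove) needed to turn the published set into the desired one."""
    to_add = sorted(set(desired) - set(published))
    to_remove = sorted(set(published) - set(desired))
    return to_add, to_remove


if __name__ == "__main__":
    desired = healthy_addresses(SERVERS)
    print(plan_changes(list(SERVERS.values()), desired))
```

Run from cron or an existing monitoring tool, the output of plan_changes is what you would feed to whatever mechanism updates the zone. Keeping the TTL low matters here, since a removal only helps once caches expire.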



[info]andialan
2008-03-05 04:50 pm UTC (link)
I concur.

Incidentally, I have a Nortel / Alteon AD3 load balancer sitting around taking up space if you want a long term loan of it, or even buy the bloody thing off me (though I see an identical one on eBay for only £150 - http://cgi.ebay.co.uk/NORTEL-ALTEON-AD3_W0QQitemZ360029162942QQcmdZViewItem). It's an older machine but still a fine bit of kit. I don't need it anymore as I've just replaced it with a couple of Cisco 11501 css things.



[info]aenikata
2008-03-05 05:36 pm UTC (link)
Depending on what you're wanting to do, other things to consider include having a single webserver and multiple database servers set up for clustering, or some other configuration that spreads the load between multiple systems. Those setups can have a certain amount of fault tolerance, and where you have a single point of failure in the front-end system, it's possible to consider something like Heartbeat for Linux (not familiar with the equivalent options for Windows), which provides availability monitoring and fail-over.



[info]spikylau
2008-03-05 05:38 pm UTC (link)
DNS round robin sounds like the job, but the webservers have to be completely stateless from request to request and sometimes all customers of an ISP can wind up using one address and flattening a machine.

If you're just balancing web traffic and using Apache then mod_backhand is supposed to be quite good, though I've never used it.



[info]mr_pete
2008-03-05 06:46 pm UTC (link)
Almost what I said :)


