1. Introduction

1.1 ChangeLog

The LVS-HOWTO is posted to the LVS website http://www.linuxvirtualserver.org

1.2 Purpose of this HOWTO

To enable you to understand how a Linux Virtual Server (LVS) works. Another document, the LVS-mini-HOWTO, tells you how to set up and install an LVS without needing to understand how it works. The mini-HOWTO was extracted from this HOWTO; eventually the redundant material will be dropped from this HOWTO.

The material here covers directors and real-servers with 2.2 kernels. For 2.4 kernels the coverage is less extensive. In general, ipvsadm commands and services have not changed between kernels; where they differ, each section gives separate information for 2.2 and 2.4 kernels. If there's nothing on 2.4 kernels, it's probably because I haven't written it yet.

1.3 Nomenclature/Abbreviations

Preferred names:

   o LVS: the whole cluster of director and real-servers, which appears to the client as a single virtual server.
   o ipvs (or IPVS): the code that patches the linux kernel.
   o director (synonym: load balancer): the machine that accepts the client's connection and forwards the packets to the real-servers.
   o real-server (other names: "real server", "the server"): a machine in the cluster that provides the service the client sees.

The real-servers sometimes are front-ends to other (back-end) servers, e.g. a web server on a real-server fetching content from a back-end database server. (The client does not connect to the back-end servers and they are not in the ipvsadm table.)

Nomenclature is not uniform

Anarchy still rules nomenclature on the mailing list. People call both the director and the real-servers "the server" (in the ipvsadm man page the director is called the "LVS server/host" - this is a historical accident which will be fixed), the real-server is called the "real server" and the whole system the "virtual server". Because of this ambiguity, I would rather avoid the term "the server" with LVS. Most people use "real server" for what I call "real-server"; I use the hyphenated form because I despair of finding a reference to a real server in a webpage using the search keys "real" and "server". Until a standardised nomenclature is adopted, be prepared for anything on the mailing list.

IPs

The following abbreviations are used in the HOWTO (and in the mailing list).

client IP = CIP
virtual IP = VIP (the IP on the director that the client connects to)
director IP = DIP 
real-server IP = RIP (and RIP1, RIP2...)
director GW = DGW
real-server GW = RGW

1.4 What is an LVS?

A Linux Virtual Server (LVS) is a piece of software which makes a cluster of servers appear to be one server to an outside client. This apparent single server is called here a "virtual server". The individual servers (real-servers) are under the control of a director (or load balancer), which runs a patched Linux kernel. The client is connected to one particular real-server for the duration of its connection to the LVS; this connection is chosen and managed by the director. The real-servers provide services (eg ftp, http, dns, telnet, nntp, smtp) such as are found in /etc/services or inetd.conf. The LVS presents one IP on the director (the virtual IP, VIP) to clients.

An LVS is useful when the number of clients is more than one server can handle, or when you want to present services from several machines as a single server.

The client/server relationship is preserved in an LVS, since the client connects to what appears to be a single machine (the VIP) and the real-servers handling the connection are invisible to the client.

Here is the basic setup for an LVS.



                        ________
                       |        |
                       | client | (local or on internet)
                       |________|
                           |
                        (router)
                           |
--                         |
L                      Virtual IP
i                      ____|_____
n                     |          | (director can have 1 or 2 NICs)
u                     | director |
x                     |__________|
                           |
V                          |
i                          |
r         ----------------------------------
t         |                |               |
u         |                |               |
a         |                |               |
l    _____________   _____________   _____________
    |             | |             | |             |
S   | real-server | | real-server | | real-server |
e   |_____________| |_____________| |_____________|
r
v
e
r
---

In the computer bestiary, the director is a layer 4 (L4) switch. The director makes decisions at the IP layer and just sees a stream of packets going between the client and the real-servers. In particular an L4 switch makes decisions based on the IP information in the headers of the packets.

Here's a description of an L4 switch from Super Sparrow Global Load Balancer documentation

" Layer 4 Switching: Determining the path of packets based on information available at layer 4 of the OSI 7 layer protocol stack. In the context of the Internet, this implies that the IP address and port are available as is the underlying protocol, TCP/IP or UCP/IP. This is used to effect load balancing by keeping an affinity for a client to a particular server for the duration of a connection. "

The director does not inspect the content of the packets and cannot make decisions based on the content of the packets (e.g. if the packet contains a cookie, the director doesn't know about it and doesn't care). The director doesn't know anything about the application generating the packets or what the application is doing. Because the director does not inspect the content of the packets (layer 7, L7) it is not capable of session management or providing service based on packet content. L7 capability would be a useful feature for LVS and perhaps this will be developed in the future.

The director is basically a router, with routing tables set up for the LVS function. These tables allow the director to forward packets to real-servers for services that are being LVS'ed. If http (port 80) is a service that is being LVS'ed then the director will forward those packets. The director does not have a socket listener on VIP:80 (i.e. netstat won't see a listener).
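For example, here is a minimal sketch of setting up forwarding for an LVS'ed http service and confirming that the director has no listener on the VIP (the VIP 192.168.1.110, the real-server IP and the hostnames are made up; the rr scheduler is just one choice):

        director:~ # ipvsadm -A -t 192.168.1.110:80 -s rr                  # add virtual service, round robin
        director:~ # ipvsadm -a -t 192.168.1.110:80 -r 192.168.1.11:80 -g  # add a real-server (direct routing)
        director:~ # netstat -an | grep LISTEN | grep ':80'
        director:~ #                                        # no listener on VIP:80 - packets are only forwarded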

John Cronin jsc3@havoc.gtf.org (19 Oct 2000) calls these types of servers (i.e. lots of little boxes appearing to be one machine) "RAILS" (Redundant Arrays of Inexpensive Linux|Little|Lightweight|L* Servers). Lorn Kay lorn_kay@hotmail.com calls them RAICs (C=computer), pronounced "rake".

The director uses 3 different methods of forwarding: VS-NAT (network address translation), VS-DR (direct routing) and VS-Tun (IP-IP tunneling). A sketch follows the next paragraph.

Some modification of the real-server's ifconfig and routing tables will be needed for VS-DR and VS-Tun forwarding. For VS-NAT the real-servers only need a functioning tcpip stack (ie the real-server can be a networked printer).
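As a hedged sketch (made-up addresses again; the hidden flag assumes a 2.2.x real-server kernel carrying the ipvs hidden patch), the forwarding method is chosen per real-server with the -g, -i and -m flags of ipvsadm, and a VS-DR real-server additionally gets the VIP on a hidden loopback alias:

        # on the director: one real-server added with each forwarding method
        director:~ # ipvsadm -a -t 192.168.1.110:80 -r 192.168.1.11:80 -g   # VS-DR (gatewaying)
        director:~ # ipvsadm -a -t 192.168.1.110:80 -r 192.168.1.12:80 -i   # VS-Tun (ipip encapsulation)
        director:~ # ipvsadm -a -t 192.168.1.110:80 -r 192.168.1.13:80 -m   # VS-NAT (masquerading)

        # on a VS-DR real-server: accept packets for the VIP without answering arp for it
        realserver:~ # ifconfig lo:0 192.168.1.110 netmask 255.255.255.255 up
        realserver:~ # echo 1 > /proc/sys/net/ipv4/conf/all/hidden
        realserver:~ # echo 1 > /proc/sys/net/ipv4/conf/lo/hidden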

LVS works with all services tested so far (single and 2 port services), except that VS-DR and VS-Tun cannot work with services that initiate connections from the real-servers (so far: identd and rsh).

The real-servers can be identical, presenting the same service (eg http, ftp) and working off file systems which are kept in sync for content. This type of LVS increases the number of clients able to be served. Or the real-servers can be different, presenting a range of services from machines with different services or operating systems, enabling the virtual server to present a total set of services not available on any one server. The real-servers can be local/remote, running Linux (any kernel) or other OSs. Some methods for setting up an LVS have fast packet handling (eg VS-DR, which is good for http and ftp) while others are easier to set up (eg transparent proxy) but have slower packet throughput. In the latter case, if the service is CPU or I/O bound, the slower packet throughput may not be a problem.

For any one service (eg httpd at port 80) all the real-servers must present identical content since the client could be connected to any one of them and over many connections/reconnections, will cycle through the real-servers. Thus if the LVS is providing access to a farm of web, database, file or mail servers, all real-servers must have identical files/content. You cannot split up a database amongst the real-servers and access pieces of it with LVS.

An LVS requires a Linux director (Intel and Alpha versions are known to work; other architectures, eg Sparc, are expected to work but haven't been tested).

There are differences in the coding for LVS for the 2.0.x, 2.2.x and 2.4.x kernels. Development of LVS on 2.0.36 kernels has stopped (May 99). Code for 2.2.x kernels is production level and this HOWTO is up to date for 2.2.18 kernels. Code for 2.4.x kernels is relatively new and this HOWTO may not contain all the information necessary to run 2.4.x kernel ipvs (check on the mailing list).

The 2.0.x and 2.2.x code is based on the masquerading code. Even if you don't explicitly use ipchains (eg with VS-DR or VS-Tun), you will see masquerading entries with `ipchains -M -L` (or `netstat -M`). Because 2.2.x kernels are not SMP for kernel code, LVS can only use one processor at a time. Code for 2.4.x kernels was rewritten to be compatible with the netfilter code (i.e. its entries will show up in netfilter tables). It is production level for VS-DR and VS-Tun. Because of incompatibilities with VS-NAT, LVS for 2.4.x is at development level (Jan 2001); the problems are expected to be resolved shortly. Directors running 2.4.x kernel LVSs will run in SMP mode in the kernel, allowing a higher throughput of packets than with a 2.2.x kernel running on the same (SMP) hardware.
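For example, on a 2.2.x director you can see these entries with either of the commands just mentioned (output omitted; the hostname is made up):

        director:~ # ipchains -M -L -n          # list masquerading entries, numeric addresses
        director:~ # netstat -M                 # the same table via netstat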

If you are setting up an LVS for the first time, then set up the director with the latest 2.2.x kernel or 2.4.x kernel. You can have almost any OS on the real-servers.

LVS works on ethernet. There are some limitations on using ATM.

LVS is continually being developed and usually only the more recent kernel and kernel patches are supported. Usually development is incremental, but with the 2.2.14 kernels the entries in the /proc file system changed and all subsequent 2.2.x versions were incompatible with previous versions.

For more documentation, look at the LVS web site (eg a talk I gave on how LVS works on 2.0.36 kernel directors)

1.5 Minimal knowledge required

(from Ratz ratz@tac.ch)

To be able to set up and maintain an LVS, it would be helpful if you can patch and compile a kernel and are familiar with the basic networking tools used throughout this HOWTO (ifconfig, route, tcpdump, ipchains, ipvsadm).

1.6 LVS Failure

The LVS itself does not provide high availability. Other software (eg mon, ldirectord, or the Linux HA code) is used in conjunction with LVS to provide high availability (i.e. to switch out a failed real-server/service or a failed director).
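As a minimal sketch of what such monitoring software does when it detects a dead service (made-up addresses; a weight of 0 quiesces a real-server so it receives no new connections):

        # quiesce the failed real-server; existing connections drain, no new ones arrive
        director:~ # ipvsadm -e -t 192.168.1.110:80 -r 192.168.1.12:80 -g -w 0
        # or remove it from the ipvsadm table altogether
        director:~ # ipvsadm -d -t 192.168.1.110:80 -r 192.168.1.12:80
        # when the service recovers, add it back
        director:~ # ipvsadm -a -t 192.168.1.110:80 -r 192.168.1.12:80 -g -w 1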

Another package, keepalived, is designed to work with LVS, watching the health of services, but no-one has reported using it. Julian has written Netparse, which is suffering the same fate.

There are two types of failures with an LVS: failure of the director, and failure of a service on a real-server.

In both cases (failure of director, or failure of a service), the client's session with the real-server will be lost (as would happen in the case of a single server). With failover however, the client will be presented with a new connection when they reconnect.

1.7 Thanks

Contributions to this HOWTO came from the mailing list and are attributed to the poster (with e-mail address). Postings may have been edited to fit them into the flow of the HOWTO.

The LVS logo (Tux with 3 lighter shaded penguins behind him representing a director and 3 real-servers) is by Mike Douglas spike@bayside.net

The LVS homepage in Germany is hosted by Lars lmb@suse.de

1.8 Mailing list, subscribing, unsubscribing and searchable archives

Mailing list details (subscribing, unsubscribing) are on the LVS website http://www.linuxvirtualserver.org

Thanks to Hank Leininger for the mailing list archive which is searchable not only by subject and author, but by strings in the body. Hank's resource has been of great help in assembling this HOWTO.

1.9 Getting technical help

1.10 Posting problems/questions to the mailing list

There's always new ideas and questions being posted on the mailing list. We don't expect this to stop.

If you have a problem with your LVS not working, before you come up on the mailing list, please check the searchable archives (see above) and gather the information described below.

If you don't understand your problem well, here's a suggested submission format from Roberto Nibali ratz@tac.ch:

  1. System information, such as kernel, tools and their versions. Example:
            hog:~ # uname -a
            Linux hog 2.2.18 #2 Sun Dec 24 15:27:49 CET 2000 i686 unknown
    
            hog:~ # ipvsadm -L -n | head -1
            IP Virtual Server version 1.0.2 (size=4096)
    
            hog:~ # ipvsadm -h | head -1
            ipvsadm v1.13 2000/12/17 (compiled with popt and IPVS v1.0.2)
            hog:~ # 
            
    
  2. Short description and maybe a sketch of what you intended to set up. Example (LVS-DR):
            o Using LVS-DR, gatewaying method.
            o Load balancing port 80 (http) non-persistent.
            o Network Setup:
    
                            ________
                           |        |
                           | client |
                           |________|
                               | CIP
                               |
                               |
                               |
                            (router)
                               |
                               |
                               |
                               | GEP
                     (packetfilter, firewall)
                               | GIP
                               |
                               |       __________
                               |  DIP |          |
                               +------+ director |
                               |  VIP |__________|
                               |
                               |
                               |
             +-----------------+----------------+
             |                 |                |
             |                 |                |
         RIP1, VIP         RIP2, VIP        RIP3, VIP
        ____________      ____________    ____________
       |            |    |            |  |            |
       |real-server1|    |real-server2|  |real-server3|
       |____________|    |____________|  |____________|
    
    
            CIP  = 212.23.34.83
            GEP  = 81.23.10.2       (external gateway, eth0)
            GIP  = 192.168.1.1      (internal gateway, eth1, masq or NAT)
            DIP  = 192.168.1.2      (eth0)
        VIP1 = 192.168.1.10     (eth0:4 on director, lo:1 on realserver)
            RIP1 = 192.168.1.11     (belonging to VIP1)
            RIP2 = 192.168.1.12     (belonging to VIP1)
            RIP3 = 192.168.1.13     (belonging to VIP1)
        DGW  = 192.168.1.1      (GIP for all realservers)
    
            o ipvsadm -L -n
    
    hog:~ # ipvsadm -L -n
    IP Virtual Server version 1.0.2 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
    TCP  192.168.1.10:80 wlc
      -> 192.168.1.13:80             Route   0      0          0         
      -> 192.168.1.12:80             Route   0      0          0         
      -> 192.168.1.11:80             Route   0      0          0         
    hog:~ # 
    
    The output from ifconfig from all machines (abbreviated, just need the
    IP, netmask etc), and the output from netstat -rn.
    
            
    
  3. What doesn't work. Show some output, e.g. from tcpdump, ipchains, ipvsadm and the kernel log. We may even ask you for more detailed configuration, like the routing table, OS version or interface setup on some of the machines in your setup. Tell us what you expected. (A script that collects all of this in one pass is sketched after this list.) Example:
            o ipchains -L -M -n
            o echo 9 > /proc/sys/net/ipv4/vs/debug_level && tail -f /var/log/kernlog
            o tcpdump -n -e -i eth0 tcp port 80
            o route -n
            o netstat -an
            o ifconfig -a
            
    
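One way to collect all of the information above in a single pass is a short script like this (just a convenience sketch; the output file name is arbitrary):

        #!/bin/sh
        # gather the system and LVS information requested above into one file
        (
                uname -a
                ipvsadm -L -n
                ifconfig -a
                route -n
                netstat -rn
                netstat -an
        ) > lvs-report.txt 2>&1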

1.11 ToDo List

Combining HA and LVS (e.g. Ultramonkey).

I realise that information in here isn't all that easy to locate yet (there's no index and you'll have to search with your editor) and that the ordering of sections could be improved.

I'll work on it as I have time.

1.12 Other load balancing solutions

From lvs@spiderhosting.com, a list of load balancers.

1.13 Software/Information useful/related to LVS

Ultra Monkey is LVS and HA combined.

From lvs@spiderhosting.com, Super Sparrow: Global Load Balancing using BGP routing information.

From ratz, there's a write-up on load imbalance with persistence and sticky bits from our friends at M$.

From ratz, zero copy patches to the kernel to speed up network throughput: Dave Miller's patches, Rik van Riel's vm-patches and more of Rik van Riel's patches. The zero copy patches may not work with LVS and may not work with netfilter either (from Katejohn@antefacto.com).

From Michael Brown michael_e_brown@dell.com, the TUX kernel level webserver.

From Lars lmb@suse.de mod_backhand, a method of balancing apache httpd servers that looks like ICP for web caches.

procstatd, a lightweight and simple web-based cluster monitoring tool designed for beowulfs; the latest version was 1.3.4 (you'll have to look around on this page).

From Putchong Uthayopas pu@ku.ac.th, KCAP, a heavyweight (lots of bells and whistles) cluster monitoring tool.

