HOWTO BONDING
NIC Bonding
This is sometimes called port trunking, and it goes by other names too, but here we will use the term bonding. So what is bonding? In short, it makes a number of NICs work as one, with the purpose of increasing throughput (HT), increasing network availability (HA), or a combination of both.
It's possible to use different brands and models of NICs, and in an HA setup the NICs can even have different speeds (the bond will adapt to the slowest). Even if a NIC supports jumbo frames on its own, jumbo frames may not always work well once the NIC is part of a bond.
Before you begin setting up your bond, check that all of the components used in it are working properly, because broken hardware and bad cables are slightly more difficult to detect when you are setting up a bond for the first time.
Example
This example includes 3 servers, each using 3 NICs for its bond (the servers could have more NICs and/or bonds). They run a Red Hat-like Linux which uses network-scripts to configure network settings.
- Server 1: NFS server (ip: 10.0.0.1)
- Server 2: NFS client (ip: 10.0.0.2)
- Server 3: NFS client (ip: 10.0.0.3)
You have to decide whether to use MII or ARP monitoring of the "ports". MII monitoring is done locally and won't detect if something stops working remotely. ARP monitoring has the disadvantage that not all NIC drivers support the features needed for it to work.
You also need to pick the mode your bond should work in. Modes 0 - 3 should work with most switches, mode 4 requires features you won't find in home switches, and modes 5 - 6 require that your NIC driver has ethtool support.
In the /etc/modprobe.conf file add the following (mii):
alias bond0 bonding
options bond0 miimon=80 mode=0
In the /etc/modprobe.conf file add the following (arp, server 1):
alias bond0 bonding
options bond0 arp_interval=80 arp_ip_target=10.0.0.2,10.0.0.3 mode=0
You must specify between 1 and 16 IP addresses, separated by commas. The more addresses listed in arp_ip_target, the lower the risk that the "port" will be taken down when a remote machine reboots.
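The rule above (1 to 16 comma-separated addresses) can be checked before the line is written to the file. This is only a minimal sketch: the helper name arp_options_line and its parameters are made up for this example and are not part of any bonding tool.

```python
import ipaddress

MAX_ARP_TARGETS = 16  # upper limit stated in the text above


def arp_options_line(bond, targets, interval_ms=80, mode=0):
    """Build an 'options' line for ARP monitoring (hypothetical helper)."""
    if not 1 <= len(targets) <= MAX_ARP_TARGETS:
        raise ValueError("arp_ip_target needs between 1 and 16 addresses")
    # IPv4Address raises a ValueError subclass on malformed input.
    checked = [str(ipaddress.IPv4Address(t)) for t in targets]
    return "options {} arp_interval={} arp_ip_target={} mode={}".format(
        bond, interval_ms, ",".join(checked), mode)


# Server 1 from the example monitors the two NFS clients.
print(arp_options_line("bond0", ["10.0.0.2", "10.0.0.3"]))
```

For server 1 this prints the same options line as the one shown above.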
Create the /etc/sysconfig/network-scripts/ifcfg-bond0 (server 1):
DEVICE=bond0
IPADDR=10.0.0.1
NETMASK=255.255.255.0
NETWORK=10.0.0.0
BROADCAST=10.0.0.255
GATEWAY=
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
The ifcfg-bond0 file doesn't really differ from a traditional ifcfg-eth0, and it may have a gateway specified.
Change the /etc/sysconfig/network-scripts/ifcfg-eth1 to (all servers):
DEVICE=eth1
HWADDR=c6:73:4b:1b:ba:45
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
Always specify the hardware address, or else you will never know which NIC is eth0, eth1 and so on, which will cause you problems if you have more than one bond or a NIC that is not part of the bond. Make a similar modification for eth2 and eth3.
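To collect the hardware addresses for the HWADDR lines, one option is to read them from sysfs. This is a rough sketch that assumes a Linux system exposing /sys/class/net/<interface>/address. The function name list_hwaddrs is made up for this example. Run it before the NICs are part of a bond, since enslaved NICs may report the bond's address instead of their own.

```python
import os


def list_hwaddrs(sys_net="/sys/class/net"):
    """Return {interface: MAC} read from sysfs; empty if the path is missing."""
    result = {}
    if not os.path.isdir(sys_net):
        return result
    for name in sorted(os.listdir(sys_net)):
        addr_file = os.path.join(sys_net, name, "address")
        try:
            with open(addr_file) as fh:
                result[name] = fh.read().strip()
        except OSError:
            continue  # entry without a readable address file
    return result


# Print a ready-to-paste HWADDR line for every interface found.
for nic, mac in list_hwaddrs().items():
    print("{}: HWADDR={}".format(nic, mac))
```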
Now you can restart the network service, and you will have a new entry, bond0, when you run ifconfig. It will have the same MAC address as eth1, and the same applies to eth2 and eth3. If you want to change the mode used, you need to unload the bonding module, change the setting and then load the module again. This can cause some problems if you do it remotely.
If you decide to remove a NIC from the bond, either take it down manually with ifconfig, or stop the network, change the ifcfg-ethX file so that the NIC is no longer part of the bond, and then start the network again. If you change the file first and then restart the network, the NIC will still be part of the bond.
Modes
The mode can be given either by number or by name when selecting it in the kernel module option.
0 or balance-rr
Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance. (This is the default mode if no mode is specified.)
1 or active-backup
Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch.
In bonding version 2.6.2 or later, when a failover occurs in active-backup mode, bonding will issue one or more gratuitous ARPs on the newly active slave. One gratuitous ARP is issued for the bonding master interface and each VLAN interface configured above it, provided that the interface has at least one IP address configured. Gratuitous ARPs issued for VLAN interfaces are tagged with the appropriate VLAN id.
This mode provides fault tolerance. The primary option, documented in the Linux Ethernet Bonding Driver HOWTO, affects the behavior of this mode.
2 or balance-xor
XOR policy: Transmit based on the selected transmit hash policy. The default policy is a simple [(source MAC address XOR'd with destination MAC address) modulo slave count]. Alternate transmit policies may be selected via the xmit_hash_policy option, described in the Linux Ethernet Bonding Driver HOWTO.
This mode provides load balancing and fault tolerance.
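The default policy formula can be worked through in a few lines. This is an illustration of the formula as written above, not the driver's code: it XORs the full 48-bit addresses, while the real driver may use only part of each address. The function names and the peer MAC addresses are made up for this example.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def xor_slave_index(src_mac, dst_mac, slave_count):
    """(source MAC XOR destination MAC) modulo slave count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % slave_count


# Bond with 3 slaves, as in the example setup; the peers are hypothetical.
bond_mac = "c6:73:4b:1b:ba:45"
for peer in ("02:00:00:00:00:01", "02:00:00:00:00:02", "02:00:00:00:00:03"):
    print(peer, "-> slave index", xor_slave_index(bond_mac, peer, 3))
```

The point to notice is that a given source/destination pair always maps to the same slave, so traffic to a single peer does not spread over several NICs in this mode.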
3 or broadcast
Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.
4 or 802.3ad
IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification.
Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option, documented in the Linux Ethernet Bonding Driver HOWTO. Note that not all transmit policies may be 802.3ad compliant, particularly with regard to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard. Differing peer implementations will have varying tolerances for noncompliance.
Prerequisites:
1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave.
2. A switch that supports IEEE 802.3ad Dynamic link aggregation.
Most switches will require some type of configuration to enable 802.3ad mode.
5 or balance-tlb
Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave.
Prerequisite:
Ethtool support in the base drivers for retrieving the speed of each slave.
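The idea of "load computed relative to the speed" can be pictured with a small model. This is one plausible reading of the description above, not the driver's actual algorithm: it sends the next flow to the slave whose current load is the smallest fraction of its link speed. The function name and all numbers are made up for this example, and speeds are assumed to be greater than zero.

```python
def pick_tlb_slave(slaves):
    """slaves maps name -> (speed_mbps, current_load_mbps).

    Return the name with the lowest load relative to its speed.
    """
    return min(slaves, key=lambda name: slaves[name][1] / slaves[name][0])


# Hypothetical snapshot: two gigabit NICs and one 100 Mbit NIC.
snapshot = {
    "eth1": (1000, 400),  # 40% loaded
    "eth2": (1000, 100),  # 10% loaded
    "eth3": (100, 20),    # 20% loaded
}
print(pick_tlb_slave(snapshot))
```

This also shows why the speed of each slave has to be readable through ethtool: without it the relative load cannot be computed.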
6 or balance-alb
Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and does not require any special switch support. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server.
Receive traffic from connections created by the server is also balanced. When the local system sends an ARP Request the bonding driver copies and saves the peer's IP information from the ARP packet. When the ARP Reply arrives from the peer, its hardware address is retrieved and the bonding driver initiates an ARP reply to this peer assigning it to one of the slaves in the bond. A problematic outcome of using ARP negotiation for balancing is that each time that an ARP request is broadcast it uses the hardware address of the bond. Hence, peers learn the hardware address of the bond and the balancing of receive traffic collapses to the current slave. This is handled by sending updates (ARP Replies) to all the peers with their individually assigned hardware address such that the traffic is redistributed. Receive traffic is also redistributed when a new slave is added to the bond and when an inactive slave is re-activated. The receive load is distributed sequentially (round robin) among the group of highest speed slaves in the bond.
When a link is reconnected or a new slave joins the bond, the receive traffic is redistributed among all active slaves in the bond by initiating ARP Replies with the selected MAC address to each of the clients. The updelay parameter (detailed in the Linux Ethernet Bonding Driver HOWTO) must be set to a value equal to or greater than the switch's forwarding delay so that the ARP Replies sent to the peers will not be blocked by the switch.
Prerequisites:
1. Ethtool support in the base drivers for retrieving the speed of each slave.
2. Base driver support for setting the hardware address of a device while it is open. This is required so that there will always be one slave in the team using the bond hardware address (the curr_active_slave) while having a unique hardware address for each slave in the bond. If the curr_active_slave fails its hardware address is swapped with the new curr_active_slave that was chosen.
Use together with KVM
This isn't something you are meant to do within your KVM guests, but on the host. There you can assign the bridge to the bond instead of the traditional eth0; this way you will have an HA, HT or HA/HT setup.
Problem with Bridge + Bonding
There is a known ARP problem with a bridge on a bonded interface. References:
- https://bugzilla.redhat.com/show_bug.cgi?id=584872
- https://lists.linux-foundation.org/pipermail/bridge/2007-April/005376.html
Please let me know if you know a solution.
Read more
Here are some useful external links on how to set up your bond on other Linux distributions, and of course the more detailed Linux Ethernet Bonding Driver HOWTO, where you can read about different examples of how to build your network with one switch (single point of failure).
- Linux Ethernet Bonding Driver HOWTO
- Gentoo bonding HOWTO
- Ubuntu 6 Bonding
--Trizt 13:57, 16 August 2009 (EDT)