MikroTik RouterOS Interface Bonding

5 01 2008

I have two separate Metro Ethernet links (over fiber) from the datacenter to the NOC. Each link is 10Mbps. I need to use both links together (bonding) and make sure that if one of them goes down I won’t lose half of my packets (redundancy). Bonding and redundancy are my goals.

Initially I tried Cisco Catalyst’s EtherChannel feature, since I learned about EtherChannel when I was doing my CNAP. Unfortunately EtherChannel does not fit this scenario because of my Metro Ethernet provider’s network setup. They use Cisco Catalyst 3750 switches to aggregate customers’ links from each POP to their headquarters. My first attempt was to establish a trunk mode EtherChannel (802.1q) with a Cisco Catalyst 2950 on one side and a Cisco Catalyst Express 500 on the other. Later I found out that this is not doable, for two reasons. First, the 802.1q tag adds 4 bytes to each frame, so trunking needs an MTU of 1504, while my provider strictly limits the MTU to 1500. Second, the negotiation between my two switches to establish the trunk couldn’t work because my switches’ BPDU packets were “intercepted” by my provider’s switches. Basically my Cisco switches were trying to establish a VLAN trunk with my provider’s directly connected switches when they were supposed to be negotiating with each other.

I consulted a few experienced people, including an employee of the provider, and they told me to use access mode EtherChannel instead of trunk mode EtherChannel. This is not possible with the Cisco Catalyst Express 500, which only offers trunk mode EtherChannel. I bought a Cisco Catalyst 2960 to replace the Cisco Catalyst Express 500, hoping that access mode EtherChannel would work. It didn’t. Even if it had worked, it wouldn’t have been aware of link state changes, since my switches do not connect directly to the fiber cables. There is a fiber-to-ethernet bridge on each side of each link, so my switches will always see both links as up as long as the bridges are up.

Since link state cannot be used to detect failures in this scenario, I had to find another way. MikroTik RouterOS offers not only a bonding feature, but a fail-over mechanism too! The fail-over mechanism uses ARP packets to detect link failures. It is far from perfect, but at least it works.
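
A rough sketch of what such a setup might look like on one of the two routers, written as a RouterOS CLI fragment. The interface names (ether1, ether2), the bond name, the addresses and the balance-rr mode are all assumptions for illustration and are not taken from my actual configuration. The other router would mirror this with the addresses swapped.

```
# Assumed: ether1 and ether2 are the ports facing the two Metro Ethernet links.
# balance-rr is an assumed mode choice; it spreads packets across both slaves.
/interface bonding add name=bond1 slaves=ether1,ether2 mode=balance-rr

# Assumed addressing: this side is 172.16.0.1, the far router is 172.16.0.2.
/ip address add address=172.16.0.1/30 interface=bond1

# Monitor the links with ARP instead of physical link state, using the far
# router's bond address as the ARP target.
/interface bonding set bond1 link-monitoring=arp arp-ip-targets=172.16.0.2
```

With ARP monitoring, a slave should be taken out of the bond when the target stops answering over it, even though the fiber-to-ethernet bridge keeps the port physically up.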

I will add examples later, but for now have a look at this. Hopefully I will discuss EoIP and EoIP over PPTP too.

References:
http://www.mikrotik.com/testdocs/ros/3.0/interface/bonding.php



LVM recovery on Fedora Core 6 with Fedora 8 Rescue CD

5 01 2008

One of the hard disks in my Fedora Core 6 server nearly failed yesterday. It sounded as if it was losing power every now and then. This hard disk is my primary one: it has the /boot partition and an LVM partition, and it holds at least 36GB of the 300GB+ LVM Volume Group. Had it died completely, most of the OS would have been gone along with some of my data. Luckily I was still able to boot from this nearly-dead hard disk a couple of times.

I downloaded Fedora Core 6’s Rescue CD ISO and burned it. Whenever the system managed to boot without a problem, I rebooted immediately and booted from the Rescue CD instead. I was hoping that I could move the LVM PEs (physical extents) off the broken hard disk ASAP.

During my first attempt, the faulty hard disk ‘disappeared’ while I was running e2fsck. I had to shut the system down for about 15 minutes to let it cool down. This trick worked, and I made another attempt. This time e2fsck finished without a problem, and I ran pvmove to move the PEs off the faulty disk. Unfortunately my kernel is the latest version but the device-mapper and lvm2 packages are not. pvmove printed errors (“device-mapper: reload ioctl failed: Invalid argument”) no matter how many times I tried. Initially I thought that the faulty hard disk might be too damaged, but since I could still boot the system without a hitch, I guessed it couldn’t be that damaged.

This post has a solution, but I didn’t use it. I downloaded the Fedora 8 Rescue CD ISO and used that instead of Fedora Core 6’s. This time pvmove didn’t show any errors and the process completed as expected without any data loss. I was then able to use vgreduce to remove the faulty hard disk from the LVM Volume Group.
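
The pvmove and vgreduce steps above can be sketched as a small shell script. The device name (/dev/sda2) and Volume Group name (VolGroup00) are hypothetical placeholders, not the ones from my server. The script only prints the commands unless DRY_RUN is set to 0. Depending on the rescue environment, the LVM tools may need to be called through the lvm wrapper (for example "lvm pvmove").

```shell
#!/bin/sh
# Sketch only: the device and Volume Group names are hypothetical placeholders.
FAULTY_PV="${FAULTY_PV:-/dev/sda2}"   # physical volume on the failing disk
VG="${VG:-VolGroup00}"                # Volume Group that contains it
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print commands, 0 = run them

# Print a command, and execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

evacuate_pv() {
    # Activate the Volume Group so its logical volumes are available.
    run vgchange -ay "$VG"
    # Show how much space is used on each physical volume; the other PVs
    # need enough free extents to take the data from the failing one.
    run pvs -o +pv_used
    # Move all allocated extents off the failing physical volume.
    run pvmove "$FAULTY_PV"
    # Remove the now-empty physical volume from the Volume Group.
    run vgreduce "$VG" "$FAULTY_PV"
}

evacuate_pv
```

Run it once as-is to check the printed commands, then run it again with DRY_RUN=0 and the real names once they look right. pvmove can take a long time on a struggling disk.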

If you experience the same problem, try downloading a newer Rescue CD and give it a try. Hopefully it will address problems that are present on older Rescue CDs. Good luck! :)