Sunday, March 31, 2013

Ubuntu networking hell.

Or how to stop hating routing and get on with your life.

As some of you may know, when our love, Ubuntu, hits a problem setting up network interfaces at boot, Upstart tries to "failsafe". As a result, it waits two minutes and tells you very little about what went wrong.

In most cases where DHCP is configuring the interface, this is the correct thing to do. However, when the interface is static, no amount of waiting is going to save your sorry butt.

Unfortunately, the networking subsystem (ifup/ifdown) can be less than scrutable in its error messages.

Take the following config as an example of a fail.

auto eth0
iface eth0 inet static
 address 10.10.10.200
 netmask 255.255.0.0
 network 10.10.0.0
 broadcast 10.10.255.255
 gateway 10.20.0.1
 dns-nameservers 10.20.0.1
 up ip route add 10.20.0.0/16 proto static dev eth0
 down ip route del 10.20.0.0/16 dev eth0

Seems legit? OK, so yeah, this system is on a different subnet than the gateway, but that's FINE. If the route exists, the packets flow. The problem here is the gateway line, but at this point we don't know that.

This config results in the two-minute wait at startup because ifup returns the error "no such process".

Figuring out what was causing the problem was a bit... hard.

Looking into how upstart configures the interfaces is what worked.

Upstart uses the ifup/ifdown utilities and, if it detects an error, calls failsafe.

OK, that we can work with. Let's bring the interface down and try again. EXCEPT the interface isn't really up, so use --force.

ifdown --force eth0

After that, it's a matter of going through the config line by line, running ifup and ifdown --force each time, until the error goes away.
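The retry loop above can be wrapped up as a small helper. This is only a sketch: the function name retry_iface is made up for illustration, it has to be run as root, and it assumes the stock ifupdown tools, where -v makes ifup print each command as it runs it.

```shell
# Hypothetical helper (the name is an assumption, not part of ifupdown).
# Forces the interface down, then brings it back up verbosely so the
# failing command is visible in the output.
retry_iface() {
    iface="$1"
    # --force is needed because the interface never came up cleanly
    ifdown --force "$iface"
    # -v (verbose) prints each command ifup executes
    ifup -v "$iface"
}

# Usage, as root, after editing /etc/network/interfaces:
#   retry_iface eth0
```

Comment out one line of the stanza, run the helper, and repeat until ifup stops complaining.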

In my case it was the gateway line. Turns out you need the route to the gateway BEFORE the gateway can be configured, but the line that adds the route runs POST up.
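In hindsight, the mismatch is easy to check for before touching ifup at all. Here is a rough sketch in plain shell that tests whether a gateway sits inside the interface's own subnet. The function names ip_to_int and gateway_on_link are invented for this example, and it assumes well-formed dotted-quad IPv4 addresses with no leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    # Split the address on dots into $1..$4
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeeds (exit 0) when the gateway is inside the interface's subnet.
# Arguments: address, netmask, gateway
gateway_on_link() {
    addr=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    gw=$(ip_to_int "$3")
    [ $(( addr & mask )) -eq $(( gw & mask )) ]
}

# Values taken from the failing config
if gateway_on_link 10.10.10.200 255.255.0.0 10.20.0.1; then
    echo "gateway is on-link"
else
    echo "gateway is off-link: it needs a route before it can be the default gateway"
fi
```

If the check reports off-link, a plain gateway line is likely to fail the same way it did here, and the gateway has to be added after the route to its subnet exists.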

Since we can't add a route pre-up, we must configure the gateway post-up. It could be done at up, but there might not be a guarantee of execution order, so keeping it explicit seems like a good idea.

The new config, which works:
auto eth0
iface eth0 inet static
 address 10.10.10.200
 netmask 255.255.0.0
 network 10.10.0.0
 broadcast 10.10.255.255
 dns-nameservers 10.20.0.1
 up ip route add 10.20.0.0/16 proto static dev eth0
 post-up route add default gw 10.20.0.1 eth0
 down ip route del 10.20.0.0/16 dev eth0