VXLAN on Ravello between Google and Amazon EC2

v2Ravello_Logo_largeA week or so ago I did a post around a cross vCenter vMotion lab that I had setup utilizing both Amazon EC2 and Google Cloud through Ravello Systems new beta which allows us to run nested ESXi.  It was a fun project to work on, migrating VMs back and forth through the clouds, but I tried to keep a lot of the technical detail out of the post – focusing more on what Ravello had to offer.  One key aspect of the setup was creating a VXLAN tunnel in order to bridge the two VM networks inside of each cloud – allowing me to complete the vMotion without performing any additional network configuration on my VMs once they had migrated.  Anyways, I thought I’d go into a little more detail on how I accomplished this.

Now, keep in mind here I don’t claim to be any sort of network guy – it’s probably my biggest lack in terms of IT skill-sets, therefore I could be going about this all wrong – any advice, comments, just leave the in the box below – I always appreciate any feedback I get.  I also appreciate any help I get, and had a lot with this project – CTO of Ravello Systems Alex Fishman had a couple of calls with me offering up his experience (very very smart guy).  Also, there’s a blog post on Ravello’s blog which goes over the setup as well – that said, I thought go into a little more detail in the case that someone else at the same level of network knowledge as myself might be looking for help.

So to start let’s have a brief look at the Ravello setup.  Firstly we need a couple of applications, one published to the EC2 cloud and one published to the Google Cloud.  Each application (in this case) contains two VMs – one to act as a client and one to act as the gateway/vxlan bridge.  I’ve used Ubuntu 14.04 server for these but you could use any OS you like, so long as the vxlan module is loaded and supported.  The table below outlines each VM and the networking associated with it in my test setup.

Cloud VM Network IP Notes
Amazon EC2 ec2-vxlan eth0 IP: 192.168.0.1
SN: 255.255.255.0
GW: None
Internal
Network Gateway
eth1 IP: 10.0.0.1
SN: 255.255.255.248
GW: 10.0.0.3
External network w/ Elastic IP attached
ec2-client eth0 IP: 192.168.0.100
SN: 255.255.255.0
GW: 192.168.0.1 (in OS) None in Ravello
Internal
Google google-vxlan eth0 IP: 192.168.0.1
SN: 255.255.255.0
GW: None
Internal Network Gateway
eth1 IP: 10.0.0.2
SN: 255.255.255.248
GW: 10.0.0.3
External Network w/ Elastic IP attached
google-client eth0 IP: 192.168.0.200
SN: 255.255.255.0
GW: 192.168.0.1 (in OS) None in Ravello
Internal

I’ve used a /29 subnet on the external networks as I really only need 3 total IPs available, one for each of the vxlan VMs as well as a third for a gateway – Honestly again you could use whatever you wanted here.  I understand that sometimes a picture is worth a thousand words so here is a side by side of both the Amazon and Google network canvas.

ec2networking googlenet

So a pretty simply setup when looking at the canvas – essentially the vxlan VM will need two NICs, one connected to the internal lan (eth0 in this case) and one connected to an externally routed network (eth1).  Before we finish up within the Ravello canvas and establish a tunnel let’s first look at the EC2 side of things so we can better understand the settings on the vxlan and client VMs that end up making our network canvas look like the above.

ec2-nic1 ec2-nic2

Looking closer at the network configuration of the ec2-vxlan VM (sorry, couldn’t get it all on one screen cap so I put them side-by-side) we can notice a couple of things; eth0, the internal lan (which will act as a gateway for the client VMs) is setup on the 192.168.0.1/24 network with no gateway.  Also, we have selected public IP for this nic under external access but left the ‘Even without external services’ unchecked.  What this does is ensure that this VM can only be accessed by routing through our vxlan tunnel and cannot be accessed directly from the internet.  The second nic (eth1) is the nic we will use to establish our vxlan bridge.  This nic is subnetted in a way that allows very few IPs within the network, as we only need three anyways (one for ec2, one for google, and one for a gateway).  This nic has an Elastic IP tied to it within the External Access settings.  Since we will need to use this public IP later when we establish the tunnel it’s best that it not change often, and an elastic IP will never change, thus why we used it.

externalservices

Another note in regards to the vxlan VM is the external services provided.  For this case I’ve simply allowed all external traffic on all ports into eth1 – probably not the greatest feat in terms of security but it sure does ensure I get the communication I need. (might be able to change to just other IP).

Note that you will need to setup the Google side of things inside Ravello exactly the same as shown above, obviously replacing the IP addresses of eth1 with those shown in the table earlier.  Once done with that we have completed our setup within Ravello and it’s now time to setup the tunnel.

Again, let’s start with EC2 and work our way over to google.  Take note of both your EC2 and Google elastic IP’s as we will need them for the configuration below.

First is our network configuration (/etc./network/interfaces).  Below is a shot of mine – the key here is that even though we specified an IP in the Ravello interface for eth0 we are not doing it within Ubuntu – we will be using this IP on our bridge instead, however we still want eth0 to be active so we set it as manual.  So as far as network interfaces your setup should be similar to below – of course with address being 10.0.0.2 on the Google cloud.

# The primary network interface
auto eth0
iface eth0 inet manual
auto eth1
iface eth1 inet static
        address 10.0.0.1
        netmask 255.255.255.248
        gateway 10.0.0.3

Once we have our network interfaces set up we can continue with the setup of the vxlan and bridge.  First we will walk through the EC2 side and then move to google.

We begin by adding the vxlan device with “ip link add mtu <MTUSIZE> <VXLAN DEVICE NAME> type vxlan id <VXLAN ID>”.  For example on the EC2 side I ran..

ip link add mtu 65000 vxlan1 type vxlan id 1

Then let’s create a new forwarding database entry on our vxlan device to allow all traffic through using “bridge fdb append <MAC> dev <VXLAN DEVICE NAME> dst <DESTINATION ADDRESS>”.  For example, I used the following on my EC2 instance – ensuring I use the public IP of the Google Cloud.

bridge fdb append 00:00:00:00:00:00 dev vxlan1 dst 31.220.68.229

From here we can add our bridge using “brctl addbr <BRIDGE NAME>” – again, I ran…

brctl addbr br0

Then add both our vxlan and our internal interface to the newly created bridge with “brctl addif <BRIDGE NAME> <INTERFACE> <INTERFACE>”

brctl addif br0 vxlan1 eth0

Now we can assign our internal IP that we wish to use for the internal lans gateway to our bridge, and then simply bring up our bridge and vxlan interfaces.  Lastly, I ran these commands…

ifconfig br0 192.168.0.1/24
ifconfig br0 up
ifconfig vxlan1 up

That does it for the Amazon configuration, the Google configuration is exactly the same, except substituting the Amazon public IP as our destination when adding the forwarding entry.  Once we do this you should be able to ping back and forth between the test client VMs, each located in their separate cloud.  Now think of the possibilities – nested ESXi in each cloud, with a layer 2 VM Network that stretches through the vxlan tunnel – pretty sweet stuff!

One thing that could be a PIA is that this config won’t persist across reboots.  I’ve tried in many ways to get the interfaces added to the persistent rules, but found the quickest way to get this to run on boot is to simply create a small bash script containing the commands; both Google and Amazons are shown below.

ec2script googlescript

Save each bash file in their respective /etc./init.d/ directories and make them executable.  I called the configVXLAN Then run the following on both Amazon and Google to ensure it gets ran on startup

update-rc.d /etc/init.d/configVXLAN defaults 99

And that is it!  A functioning stretched Layer 2 network between Google Cloud and Amazon EC2 using Ravello!  The possibilities are endless…  Again, I’m not a big networking guy, so if you know of any way I can improve this (use NSX?) just let me know…  Thanks for reading!