What we can learn from data centres on redundancy

Submitted by fredrik.nyman on Tue, 02/05/2019 - 09:23

Last week I wrote about redundancy in FTTH networks and the fact that a layer 3 architecture makes redundant topologies easier to implement and operate. One area of modern networking that has recently embraced layer 3 is the data center.

The past few years have seen a lot of development in the data center world. New companies have emerged and new technologies have changed the way network equipment and architectures in data centers are built. I'm thinking about technology such as SDN with whitebox switches and control plane separation. OpenFlow was one such protocol that gained some momentum for a few years thanks to its fine-grained control of traffic.

In data centers with thousands of virtual machines, the network architecture quickly becomes complex: these virtual machines need to communicate in private or semi-private networks, while a redundant network architecture is needed to provide high availability. The old VLAN and spanning-tree protocols do not scale well enough to handle the criss-crossing of connectivity between different virtual machines and even between different data centres. In the really big data centres we are talking about tens of thousands of virtual machines, so VLAN and MAC-address scalability quickly becomes an issue with a layer 2 topology.

In response, a protocol called VXLAN emerged to offer layer 2 connectivity over layer 3.

At first glance VXLAN is just another tunnelling protocol. We have seen this before. L2TP was, and still is, used for this particular purpose. GRE in some implementations also supports a layer 2 payload. A variant of GRE (NVGRE) was actually also on the table, but VXLAN seems to have emerged the victor. The IETF is currently working on a standardized version (Geneve). The concept is called layer 2 overlay - a layer 2 network over a layer 3 topology.

As networks grow in scale they tend to go down the MPLS route to create and separate private networks over the same infrastructure. So why one more protocol? What was wrong with the multiple options that already existed?

Well, VXLAN has a couple of key benefits. First, it separates the control plane and the forwarding plane in a way that lets the forwarding plane be implemented easily in hardware ASICs. L2TP has a lot of signalling, which makes hardware acceleration difficult, or at least ties it much more tightly to the control plane. With VXLAN the packet format is simple, so the lower-cost ASICs used in data centre switches can implement it in hardware and gain performance.
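To give a feel for how simple the packet format is, here is a minimal Python sketch of the 8-byte VXLAN header as I understand it from RFC 7348: one flags byte, a 24-bit network identifier (VNI) and reserved bits, carried over UDP. The function names are my own and purely illustrative. A real switch does this in the ASIC, not in software.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header for a 24-bit network identifier.

    Illustrative helper (hypothetical name), not taken from any library.
    """
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8
```

The 24-bit VNI gives room for about 16 million segments, compared with the 12-bit VLAN ID, and the fixed layout is what makes the header easy to add and remove in hardware.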

Secondly, the VXLAN tunnels are stateless. In MPLS you need to configure every end-point to make it part of a private connection. This requires either manual, hands-on work or advanced automation systems to handle reconfiguration of equipment. It's also a bit complex to mix and match end-point configurations in MPLS.

In VXLAN, the forwarding is based on Ethernet-like tables. A MAC-address+VLAN is associated with a VXLAN end-point. So when a frame arrives the switch looks up in its forwarding table where the frame should be bridged. If the destination is over a VXLAN tunnel the switch creates the necessary header dynamically and sends the packet. No need to pre-configure the tunnel end-points - the end-points need not be aware of each other's existence until there is traffic to pass. This reduces the strain on configuration and control.
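The lookup described above can be sketched in a few lines of Python. This is a toy model under my own assumptions: the table layout, the `Destination` type and the "flood" fallback for unknown addresses are hypothetical, and real switches populate and handle the table through their control plane.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Destination:
    """Where a (MAC, VLAN) pair can be reached (hypothetical structure)."""
    local_port: Optional[str] = None   # set when the host is attached locally
    remote_vtep: Optional[str] = None  # IP of the remote VXLAN end-point
    vni: Optional[int] = None          # VXLAN network identifier for the tunnel


ForwardingTable = Dict[Tuple[str, int], Destination]


def forward(table: ForwardingTable, dst_mac: str, vlan: int) -> str:
    """Decide what to do with a frame, based only on the forwarding table."""
    entry = table.get((dst_mac.lower(), vlan))
    if entry is None:
        # Unknown destination: what happens here depends on the control plane.
        return "flood"
    if entry.remote_vtep is not None:
        # No pre-configured tunnel: the header is built when the frame is sent.
        return f"encapsulate vni={entry.vni} to {entry.remote_vtep}"
    return f"bridge to {entry.local_port}"


table: ForwardingTable = {
    ("00:11:22:33:44:55", 10): Destination(local_port="port1"),
    ("66:77:88:99:aa:bb", 10): Destination(remote_vtep="192.0.2.7", vni=5010),
}
```

Calling `forward(table, "66:77:88:99:AA:BB", 10)` picks the tunnel entry, while the first MAC address is bridged locally. Nothing about the remote end-point exists on the switch beyond that one table entry.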

Different mechanisms, including EVPN, have emerged as options for signalling the control-plane information (the MAC+VLAN forwarding table entries), which also means that VXLAN can integrate nicely with BGP for signalling. Other options also exist, because for some topologies even BGP becomes overcomplicated.

To solve those two problems of VLAN/MAC scalability and redundancy in the data centre leaf-spine, VXLAN technology allows the whole data centre to operate on layer 3 with well-proven redundancy and load-balancing solutions. Routing also means that no MAC addresses need to be kept in the spine for the thousands of virtual machines connected to the network. This reduces load and simplifies the spine. Finally, routing means that the cross-connections between data centres can be routed while still allowing layer 2 connectivity between virtual machines in different data centres, which supports scaling.

So the whole data centre infrastructure rests on a routed network that is scalable and easy to operate through routing protocols. Full layer 2 connectivity is still preserved, with hardware-accelerated performance, through the VXLAN protocol. The cost is a little more overhead on the links, but that's a bargain for the operational benefits in the form of flexibility, redundancy and scalability.
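How much is "a little more overhead"? A rough calculation in Python, assuming an IPv4 underlay without IP options and an untagged outer Ethernet header (both are my assumptions; an IPv6 underlay or an outer VLAN tag adds more bytes):

```python
# Bytes added in front of every original (inner) Ethernet frame.
OUTER_ETHERNET = 14  # outer MAC header, assuming no 802.1Q tag
OUTER_IPV4 = 20      # assuming IPv4 without options
OUTER_UDP = 8
VXLAN_HEADER = 8

ENCAP_OVERHEAD = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER


def overhead_fraction(inner_frame_bytes: int) -> float:
    """Share of the transmitted bytes that is encapsulation overhead."""
    return ENCAP_OVERHEAD / (inner_frame_bytes + ENCAP_OVERHEAD)


# A full-size inner frame: 14-byte Ethernet header + 1500-byte payload.
full_size = overhead_fraction(1514)
```

That is 50 extra bytes per frame, or roughly 3% for full-size frames and considerably more for small ones. It also means the routed network underneath has to carry frames larger than the standard 1500-byte payload.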

So what can FTTH learn from data centres? Well, why not use VXLAN for bitstream/wholesale services in fibre networks? Why not use VXLAN instead of customer VLANs to connect each and every customer to the central BNG? Why not build a stable, reliable, easy to operate layer 3 routed infrastructure in the access and still provide the layer 2 connectivity and services needed to enable the full service portfolio?

VXLAN applied in fibre networks has great potential to provide the same operational benefits to FTTH as it has to data centres.

Blog posts

Turn on automation of your FTTH network

Submitted by fredrik.nyman on Mon, 04/01/2019 - 09:08

The distributed nature of a fiber to the home network means that you will have equipment spread out and you might not always do the on-site installation yourself. If every switch has to pass your desk for pre-configuration before being deployed into the field, you will need to deal with the logistics of getting the units from your warehouse via your desk, packing and unpacking, and clearly marking them so that the right unit goes into the right location.


Submitted by fredrik.nyman on Thu, 03/21/2019 - 09:51

I love acronyms. You got three of them in the title of this post.

In recent years we got Software Defined Networking (SDN) and Network Function Virtualization (NFV). Many of the large telcos have invested millions into research of these subjects and are pushing the industry in this direction. Telefonica has expressed high ambitions to move to a completely SDN/NFV enabled network in record time. All the big ones are involved.

Keeping product lines around

Submitted by fredrik.nyman on Fri, 03/15/2019 - 09:50

Building fibre to the home networks is different from building any traditional enterprise or telecommunications network. One of the main differences is the time it takes to complete the network. You make a plan, design an architecture with VLANs and redundancy and imagine how this will scale as the number of connected customers increases. But then the years go by, because building a fibre network to connect every home in the community can take decades.

Save the planet - work from home

Submitted by fredrik.nyman on Thu, 03/07/2019 - 10:30

In my last post I revealed how dirty a fiber network can be depending on the source of electricity powering the network. I showed how a typical 24-port access switch might contribute anything between 23 kg and 485 kg of carbon dioxide per year to the atmosphere depending on the electricity mix, and how that can be reduced with low-power optical modules.

How do you troubleshoot IoT devices?

Submitted by fredrik.nyman on Fri, 02/15/2019 - 13:00

Continuing on the subject of troubleshooting the network. Troubleshooting MPEG video has the benefit of a user who can tell you if it doesn't work, and you can simply ask that user if the problem persists once you have fixed it. But what if there isn't any obvious way to determine if things are working? For example, is that trashcan really signalling that it's full, or does the temperature device really update the building climate control properly?

How to see what your users see

Submitted by fredrik.nyman on Mon, 02/11/2019 - 10:21

Live broadcast TV is one of the most popular services in fibre networks. You can get high quality pictures because there is enough bandwidth to send video uncompressed. But the nature of broadcast media is that it is very sensitive to packet loss or jitter. There is no retransmission of packets because it is live – you can’t hold the stream to get a lost packet back.