Tuesday, September 25, 2012

vSphere 5 Host Network Design - 10GbE & 1Gb Hybrid Ent+ Design




This design comes from multiple requests for a 10GbE and 1Gb hybrid environment. Many organizations want to make use of their existing investment in 1Gb infrastructure for as long as they can, so this has been a common request.

This design makes use of Enterprise Plus licenses to enable bandwidth control on the 10GbE network. If you do not have Ent+ then consider pinning VM Network to vmnic4 and Storage to vmnic5 using a single portgroup per traffic type.
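To make the non-Ent+ alternative concrete, here is a minimal Python sketch of the pinning idea. Only vmnic4 and vmnic5 come from the design. Using the other 10GbE uplink as standby, and the port group names, are my assumptions for illustration, so adjust them to your own failover policy.

```python
# Assumed explicit failover order per port group (standby choice is an assumption).
FAILOVER_ORDER = {
    "VM Network": {"active": ["vmnic4"], "standby": ["vmnic5"]},
    "Storage": {"active": ["vmnic5"], "standby": ["vmnic4"]},
}


def uplink_in_use(portgroup, failed=frozenset(), order=FAILOVER_ORDER):
    """Return the first healthy uplink in failover order, or None if all have failed."""
    policy = order[portgroup]
    for uplink in policy["active"] + policy["standby"]:
        if uplink not in failed:
            return uplink
    return None


# Normal operation: each traffic type stays on its own 10GbE uplink.
print(uplink_in_use("VM Network"))
print(uplink_in_use("Storage"))
# With vmnic5 down, Storage falls back to sharing vmnic4 with VM Network.
print(uplink_in_use("Storage", failed={"vmnic5"}))
```

The point of the sketch is that without NIOC the two traffic types only share an uplink after a failure, which is why pinning is a reasonable substitute for bandwidth control.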

Please note that this and all my other designs should be considered a starting point for your own specific design. There is no one-size-fits-all configuration, only good practices to minimise risk, maximise throughput and get the best return on investment.

It is assumed that each host has two 10GbE NICs provided by a single PCI-x Dual Port expansion card and four 1Gb uplinks provided by a single quad port card. Whilst this isn't the best configuration for redundancy, it does represent what a lot of organizations are currently doing. So whilst I like to build optimal configurations, there is little point in designing one that the budget will never allow.

Management and vMotion are done from a single vSS using four uplinks going to two separate switches. However, this design could be made slightly less complex by separating Management and vMotion onto their own standard switches. What we gain in this design is 2Gbps throughput for both vMotion and Management (think cold migrations) on a single switch. It is important to note that each of the Management vmkernel ports should go to a separate physical switch, and the same applies to the vMotion uplinks. This will ensure that traffic is balanced across the physical network.
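The rule about spreading each traffic type across both physical switches can be sanity-checked with a few lines of Python. This is only a sketch: the vmnic numbers, switch names and port group names below are assumptions for illustration, so take the real mapping from your own cabling.

```python
# Hypothetical cabling: which physical switch each 1Gb uplink plugs into.
UPLINK_TO_SWITCH = {
    "vmnic0": "switch-a",
    "vmnic1": "switch-b",
    "vmnic2": "switch-a",
    "vmnic3": "switch-b",
}

# Hypothetical vmkernel port groups and the uplink each one is active on.
ACTIVE_UPLINK = {
    "Management-1": "vmnic0",
    "Management-2": "vmnic1",
    "vMotion-1": "vmnic2",
    "vMotion-2": "vmnic3",
}


def switches_per_traffic_type(active_uplink, uplink_to_switch):
    """Group port groups by traffic type (the name before the last '-')."""
    usage = {}
    for portgroup, uplink in active_uplink.items():
        traffic_type = portgroup.rsplit("-", 1)[0]
        usage.setdefault(traffic_type, set()).add(uplink_to_switch[uplink])
    return usage


def single_switch_traffic_types(active_uplink, uplink_to_switch):
    """Return traffic types whose vmkernel ports all land on one physical switch."""
    usage = switches_per_traffic_type(active_uplink, uplink_to_switch)
    return sorted(t for t, switches in usage.items() if len(switches) < 2)


# An empty list means every traffic type spans both physical switches.
print(single_switch_traffic_types(ACTIVE_UPLINK, UPLINK_TO_SWITCH))
```

If both vMotion port groups were cabled to the same switch, the function would return that traffic type, flagging a single point of failure.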

It isn't shown in the diagram and I probably don't need to mention it, but somewhere in the back end the 10GbE and 1Gb networks will need to be able to communicate with each other.

VM Networking, iSCSI and Fault Tolerance traffic is assigned to a single distributed virtual switch (FT is not shown in this design but could be included by adding the FT vmkernel ports onto dvSwitch1). Only VM Networking makes use of Load Based Teaming (LBT), in conjunction with Network IO Control (NIOC) and Storage IO Control (SIOC). A good writeup on how to configure NIOC shares can be found on the VMware Networking Blog. LBT is a teaming policy only available when using a virtual Distributed Switch (vDS). I have used two iSCSI vmkernel ports and configured them using a specific failover order. Storage does make use of NIOC and SIOC but does not use LBT. Because LBT moves VM Networking traffic based on uplink utilisation, it takes the iSCSI load into account and works around it without issue.
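For anyone unsure how NIOC shares translate into bandwidth, the arithmetic is simple: when an uplink is congested, each active traffic type gets its share value divided by the sum of all active share values. The share values below are made up for illustration and are not a recommendation.

```python
def nioc_bandwidth_gbps(shares, link_gbps=10.0):
    """Split one uplink's bandwidth between the traffic types competing for it.

    Shares only matter under contention; an idle uplink lets any traffic
    type use more than the figure calculated here.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one traffic type needs a positive share value")
    return {name: link_gbps * value / total for name, value in shares.items()}


# Hypothetical share values for the three traffic types on dvSwitch1.
example_shares = {"vm_network": 100, "iscsi": 100, "fault_tolerance": 50}

for name, gbps in nioc_bandwidth_gbps(example_shares).items():
    print(f"{name}: {gbps:.1f} Gbps under contention")
```

With these example values, VM Networking and iSCSI would each be guaranteed 4 Gbps of a saturated 10GbE uplink, and FT would be guaranteed 2 Gbps.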

LACP is not used as it wouldn't be a good design choice for this configuration. LACP has had some major improvements with vSphere 5.1, but for now I am still not including it in my designs. A valid use case for LACP could be made when using the Nexus 1000V as LBT is not available for this type of switch.

In order to gain the performance increase of Jumbo Frames for the storage layer, all networking components will need to have Jumbo Frames enabled. This requires end-to-end configuration from the hosts through the network and to the storage arrays. There is definitely a performance increase by incorporating Jumbo Frames and this is outlined in the following blog post. It is important to note that enabling Jumbo Frames on the single switch will allow all traffic to transmit at an MTU of 9000. This means that FT and Storage will all use Jumbo Frames. VMs will not use Jumbo Frames unless this feature is enabled on the network adapter inside the OS of the VM.
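The end-to-end requirement boils down to this: the usable MTU of a path is the smallest MTU of any hop on it. A minimal Python sketch of that check follows; the hop names and MTU values are hypothetical and should be replaced with what you actually read off your hosts, switches and arrays.

```python
JUMBO_MTU = 9000

# Hypothetical path for one iSCSI vmkernel port; names and values are illustrative.
ISCSI_PATH = [
    ("iSCSI vmkernel port", 9000),
    ("dvSwitch1", 9000),
    ("10GbE physical switch", 9216),
    ("storage array port", 9000),
]


def effective_mtu(path):
    """The usable MTU of a path is limited by its smallest hop."""
    return min(mtu for _, mtu in path)


def hops_blocking_jumbo(path, required_mtu=JUMBO_MTU):
    """Return the names of hops whose MTU is below the required value."""
    return [name for name, mtu in path if mtu < required_mtu]


print(effective_mtu(ISCSI_PATH))
print(hops_blocking_jumbo(ISCSI_PATH))
```

A single hop left at 1500 drags the whole path down, which is why one forgotten switch port can quietly undo the benefit.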

Trunking needs to be configured on the 1Gb physical switches to allow for Management and vMotion traffic. Trunking also needs to be configured on the 10GbE physical switches to allow FT, VM Networking and Storage traffic. Trunking at the physical switch will enable the definition of multiple allowable VLANs at the virtual switch layer.
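A quick way to reason about trunking is to compare the VLAN each port group tags against the VLANs the physical trunk allows. The sketch below does this in Python. All VLAN IDs and port group names are invented for illustration and do not come from the design.

```python
# Hypothetical VLAN IDs per port group on the 10GbE distributed switch.
PORTGROUP_VLANS = {
    "VM Network": 100,
    "iSCSI-1": 200,
    "iSCSI-2": 200,
    "FT": 300,
}

# Hypothetical list of VLANs allowed on the 10GbE physical trunk ports.
TRUNK_ALLOWED = {100, 200, 300}


def missing_vlans(portgroup_vlans, trunk_allowed):
    """Return {port group: VLAN ID} for port groups the trunk does not carry."""
    return {
        portgroup: vlan
        for portgroup, vlan in portgroup_vlans.items()
        if vlan not in trunk_allowed
    }


# An empty dict means every port group's VLAN is allowed on the trunk.
print(missing_vlans(PORTGROUP_VLANS, TRUNK_ALLOWED))
```

The same check applies to the 1Gb side with the Management and vMotion VLANs.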

If you need to present iSCSI straight up to a VM then this can be enabled by adding the Storage VLAN to the list of VLANs that can be accessed within the VM Network portgroup. I try my best to avoid doing this as it opens up the Storage layer to attack, but sometimes this is a requirement for some organizations.

When running Cisco equipment there is the potential to use the Rapid Spanning Tree Protocol (802.1w) standard. This means there is no requirement to configure trunk ports with Portfast or disable STP, as the physical switches will automatically identify these functions correctly. If running any other type of equipment, the safest option would probably be to disable STP and enable Portfast on each trunk port, but please refer to the switch manufacturer's manual for confirmation.

Since this design makes use of standard virtual switches, it is acceptable to have vCenter as a VM on the same hardware that is being managed. However, it is always good practice to have a separate management cluster if that option is available.

*** Updates ***
5th of November 2012 - Fixed minor issues with the iSCSI port groups.