vSAN ROBO Project – Design & Config


Introduction

I recently presented at a VMUG event where I outlined the entire process of designing, deploying, configuring, and managing a Remote Office/Branch Office (ROBO) vSAN environment with one of my customers.  I figured since I put in all the work to present this for a small VMUG audience, I might as well throw it on my blog to share the love with the greater vCommunity. My hope is that this design/configuration can be leveraged by many organizations for their ROBO use cases in a “cookie cutter” manner. You can download the slide deck below if you are interested:

Use Case / Business Challenge

My specific customer has multiple remote sites that require a small amount of infrastructure to host some essential workloads.  Many of their remote sites either have no existing infrastructure because they are new, or they are due for a hardware refresh.  The types of workloads required to run onsite are things like a Domain Controller hosting DNS/DHCP, a print server, file server(s), and various other applications.

Use Case / Business Challenge

Requirements

From a requirements perspective, we had the luxury of being relatively vague… meaning we did not need to identify hard numbers for performance, network latency, capacity, etc.  Therefore, we kept it simple and identified the following items we needed from our new ROBO infrastructure design.

Business & Technical Requirements

Solution

Ultimately, we decided upon a ROBO vSAN configuration as it satisfies all of the requirements identified above.  This solution leverages two physical hosts that reside at the remote site and a vSAN witness appliance that resides in the customer’s primary datacenter.  We chose a Hybrid configuration (SSDs for the cache tier, traditional spinning disks for the capacity tier) rather than an All-Flash configuration to keep the cost low and the capacity high.  As you will see below, despite the Hybrid configuration, the performance of the solution has proven to be quite good.

vSAN ROBO Solution

Licensing

One of the great things about a ROBO vSAN solution is that the licensing is extremely reasonable from a cost perspective.  We chose the “vSAN Standard for Retail and Branch Offices (VMs)” license since we are leveraging a Hybrid configuration and cannot do Dedupe/Compression or Erasure Coding (RAID5/RAID6) anyway.  Some things to know about ROBO vSAN licensing:

  • Licenses are priced per-virtual machine (per-VM) and sold in packages of 25 licenses.  A 25-pack of licenses can be shared across multiple locations; for example, five remote offices each running five virtual machines.
  • Each remote office is limited to a maximum of 25 VMs. If more than 25 VMs are running at a remote office, vSAN Standard, Advanced, or Enterprise licensing must be used.
  • It is important to note there is no upgrade/conversion path from vSAN for ROBO per-VM licenses to vSAN Standard, Advanced, or Enterprise per-CPU licenses.
Licensing
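
The licensing rules above boil down to a bit of arithmetic. Here is a minimal Python sketch of it, using only the numbers from the bullets (25 licenses per pack, 25 VMs max per site). The function name and the site names are made up for illustration, and this is not a substitute for checking current VMware licensing terms.

```python
import math

PACK_SIZE = 25          # licenses per pack (from the text above)
MAX_VMS_PER_SITE = 25   # per-site VM cap for ROBO licensing (from the text above)


def robo_packs_needed(vms_per_site):
    """Return how many 25-packs cover all sites.

    Raises ValueError if any site exceeds the per-site cap, since such a
    site would need per-CPU vSAN licensing instead.
    """
    over_limit = {site: n for site, n in vms_per_site.items()
                  if n > MAX_VMS_PER_SITE}
    if over_limit:
        raise ValueError(f"sites over the ROBO limit: {over_limit}")
    # Packs can be shared across sites, so only the total matters.
    return math.ceil(sum(vms_per_site.values()) / PACK_SIZE)


# Hypothetical example: five remote offices running five VMs each.
sites = {f"office-{i}": 5 for i in range(1, 6)}
print(robo_packs_needed(sites))
```

With five offices at five VMs each, a single 25-pack is enough, which matches the example in the first bullet.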

Hardware Configuration

We leveraged two vSAN ReadyNodes (https://vsanreadynode.vmware.com/RN/RN) for our solution to ensure the hardware components were compatible with vSAN.  My customer’s hardware vendor of choice is HPE; however, you can leverage many different vendors based on your infrastructure standards.  Below is the hardware configuration of each host:

  • HPE ProLiant DL380 Gen10 24SFF
  • (2) Intel Xeon-Gold 6142 (2.6GHz/16-core/150W)
  • 192GB (32GB x 6) DDR4-2666 Memory
  • (4) 1GbE Ports
  • (4) 10GbE Ports
  • (2) 480GB SSD (ESXi)
  • (2) 800GB SSD (Write Intensive)
  • (7) 1.2TB 10K HDD

This hardware configuration resulted in (2) vSAN disk groups per host with an overall capacity of 16.8TB Raw / ~8.4TB Usable!  More than enough for a handful of VMs.
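
For anyone who wants to redo the math for their own hardware, here is a rough Python sketch of how those capacity numbers fall out. It assumes only the HDDs count toward capacity (the SSDs are cache in a Hybrid configuration) and that every object is mirrored once across the two hosts; it ignores slack space and other overhead, so treat the usable figure as approximate.

```python
HOSTS = 2
HDDS_PER_HOST = 7      # capacity disks per host (from the hardware list)
HDD_TB = 1.2           # size of each capacity disk in TB
MIRROR_COPIES = 2      # assumption: RAID-1 mirroring across the two hosts


def raw_tb(hosts, hdds_per_host, hdd_tb):
    """Raw capacity: every capacity disk on every host."""
    return hosts * hdds_per_host * hdd_tb


def usable_tb(raw, copies):
    """Approximate usable capacity once each object is stored `copies` times."""
    return raw / copies


raw = raw_tb(HOSTS, HDDS_PER_HOST, HDD_TB)
print(f"{raw:.1f} TB raw, ~{usable_tb(raw, MIRROR_COPIES):.1f} TB usable")
```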

ESXi Host – Front
ESXi Host – Back

Network Configuration

One of the best features of a ROBO vSAN configuration is that you can connect the two nodes directly to each other via 10GbE.  This direct connection is leveraged for the vSAN data and vMotion traffic while the traditional ESXi Management and VM traffic traverses the 1GbE uplinks to a physical switch.  This greatly reduces the overall cost of the solution because you do not need to have a 10GbE switch!

We kept the design simple and leveraged Virtual Standard Switches (VSS) rather than a Virtual Distributed Switch (VDS); however, the VDS is absolutely supported for this type of configuration and even provides a bit more functionality.  Below are diagrams of the physical networking environment as well as the vSphere networking within ESXi:

Networking – Physical
Networking – vSphere

Scalability

In the event that the environment needs to be scaled up/out in the future, this configuration provides the following options:

  • Add Disk(s) (Scale Up) – This is the cheapest option if additional disk capacity is required.  By purchasing an additional (14) disks (seven per host) to fill the existing Disk Groups, we can double our vSAN datastore capacity (from 16.8TB to 33.6TB Raw).
  • Add Disk Group (Scale Up) – This option is more expensive since we need to purchase additional cache disks and potentially controllers; however, it provides increased disk performance as well as additional disk capacity.  The total disk capacity would now be 50.4TB Raw which is an enormous amount for a two node ROBO configuration.
  • Add Host & Switch (Scale Out) – The third and most expensive option to scale would be to add a host to the configuration; however, this also requires a 10GbE switch.  This option increases the amount of CPU/Memory available to the cluster and increases disk performance and capacity.  If we performed the first two options above and then added a third host, we could increase the overall disk capacity to 75.6TB Raw!
Scalability
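
The three raw capacity figures above can be reproduced with a short Python sketch. It assumes each disk group is filled with seven 1.2TB capacity disks, which is what the numbers in the bullets imply; the scenario labels are mine.

```python
HDD_TB = 1.2
DISKS_PER_FULL_GROUP = 7   # assumption implied by the figures above


def raw_capacity_tb(hosts, disk_groups_per_host,
                    disks_per_group=DISKS_PER_FULL_GROUP):
    """Raw capacity when every disk group on every host is fully populated."""
    return hosts * disk_groups_per_host * disks_per_group * HDD_TB


# (hosts, disk groups per host) for each scaling option
scenarios = {
    "fill existing disk groups": (2, 2),
    "add a third disk group": (2, 3),
    "add a third host as well": (3, 3),
}

for name, (hosts, groups) in scenarios.items():
    print(f"{name}: {raw_capacity_tb(hosts, groups):.1f} TB raw")
```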

Implementation

We split up the solution deployment process into three phases: Pre-vSAN Configuration, vSAN Configuration, and Post-vSAN Configuration.

Deployment Phases

Validation Testing

As part of our deployment we included two unique types of testing: Basic Validation Testing and a vSAN Test Plan.  Since this was my customer’s first experience with vSAN, we needed to ensure the environment was stable and would perform as expected in the event of a component failure.

Basic Validation Testing
vSAN Test Plan

Performance Testing

We also wanted to ensure the environment met our performance expectations and we wanted to capture a performance benchmark before we began deploying workloads.  This performance benchmark can also be leveraged when deploying additional ROBO vSAN environments in the future.

HCIBench (https://flings.vmware.com/hcibench) was leveraged to perform the performance testing and proved to be very simple and efficient.  The environment performed as follows:

  • Avg IOPS – 86.5K
  • Peak IOPS – ~120K
  • Avg Throughput – 346MB/s
  • Avg Read Latency – 3.63ms
  • Avg Write Latency – 2.46ms
HCIBench Testing
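
One quick sanity check on benchmark numbers like these is to divide throughput by IOPS to get the average I/O size. The Python sketch below does that, assuming the average IOPS and average throughput figures come from the same test run and that MB means 10^6 bytes; neither is stated above, so take the result as a rough indication only.

```python
AVG_IOPS = 86_500            # "Avg IOPS - 86.5K" from the results above
AVG_THROUGHPUT_MB_S = 346    # "Avg Throughput" from the results above


def avg_io_size_kb(throughput_mb_s, iops):
    """Average I/O size in KB (decimal units: 1 MB = 1000 KB)."""
    return throughput_mb_s * 1000 / iops


print(f"{avg_io_size_kb(AVG_THROUGHPUT_MB_S, AVG_IOPS):.1f} KB per I/O")
```

Under those assumptions the result is about 4 KB per I/O, which would be consistent with a small-block test profile.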

Day 2 Operations – Management / Patching / Monitoring

One of the most amazing things about vSAN is that the management, patching, and monitoring of the solution uses the same tools customers are already used to: vCenter, Update Manager, and vRealize Operations!  That means there are no new tools that need to be implemented/operationalized and organizations do not need to spend time and money learning how to use them.

Management-Patching
Monitoring – vCenter
Monitoring – vROps-vCenter
Monitoring – vROps

Day 2 Operations – Support

From a support perspective, customers can take advantage of vSAN Support Insight (https://storagehub.vmware.com/t/vmware-vsan/vsan-support-insight/), which provides VMware Support with rich vSAN environment data that can be leveraged to assist with quickly resolving issues.  Customers only need to turn on the Customer Experience Improvement Program (CEIP) within vCenter to take advantage of it.

vSAN Support Insight

Business Outcomes / Conclusion

Overall, this solution turned out to be exactly what my customer needed: it satisfied all of their business and technical requirements, which led to positive business outcomes!

  • High Performance – Happy Customers
  • Highly Available – Less Downtime
  • Highly Scalable – Future Proof
  • Easy to Operate – Familiar Tools
  • Cost Efficient – Budget Friendly
Business Outcomes

Resources
