Use Case: Designing a Resilient Unified Access Gateway (UAG) Architecture for Global Remote Access

The Challenge: Global, Always-On Access

For any modern enterprise, providing secure and reliable remote access to internal resources is paramount. For a global organization with employees spread across North America, Europe, and Asia, the challenge grows with every region and time zone it has to serve. The organization in this use case needed an architecture for its VMware Horizon and Workspace ONE environment that could withstand a regional outage and give users the best possible connection performance, regardless of their geographic location.

This use case details the design and implementation of a highly available, geographically distributed architecture using VMware Unified Access Gateway (UAG) that serves as a blueprint for enterprise-scale remote access solutions.

Figure: Global Network Architecture Diagram

Architectural Goals (Pre-Day 0)

Before designing the solution, the organization established clear architectural requirements:

Primary Requirements:

  • High Availability (HA): The failure of a single UAG appliance must not impact user access. Target: 99.9% uptime
  • Disaster Recovery (DR): A complete failure of a primary data center (e.g., North America) should allow users to connect via a secondary site (e.g., Europe) with minimal disruption
  • Performance Optimization: Users should connect to the data center closest to them to minimize latency and maximize throughput
  • Scalability: The architecture must support growth from 5,000 to 15,000 concurrent users over three years

Secondary Requirements:

  • Security: All external connections must be encrypted and authenticated
  • Compliance: Meet SOC 2 Type II and ISO 27001 requirements
  • Monitoring: Comprehensive visibility into connection health and performance
  • Cost Optimization: Efficient use of bandwidth and infrastructure resources

The Solution: Multi-Site, Geographically Load-Balanced Design

The solution involved deploying clusters of UAGs in three primary data centers: Virginia (US-East), Frankfurt (EU-Central), and Singapore (APAC). A Global Server Load Balancer (GSLB) was implemented to intelligently direct traffic based on user location and site health.

Architecture Overview:

The architecture consists of four main layers:

  1. Global DNS Layer: GSLB for intelligent traffic routing
  2. Regional Load Balancing: Local load balancers in each data center
  3. UAG Appliance Clusters: Multiple UAG instances per region
  4. Backend Infrastructure: Horizon Connection Servers and Workspace ONE components

Component Breakdown and Implementation

1. Global Server Load Balancer (GSLB)

Role: The GSLB serves as the primary entry point for all users worldwide. It owns the public DNS name (e.g., remote.globalcorp.com) and makes intelligent routing decisions.

Implementation Details:

  • Platform: F5 Global Traffic Manager (GTM) with DNS Express
  • Routing Logic:
    • Primary: Geographic proximity (Geo-IP based routing)
    • Secondary: Site health and capacity
    • Tertiary: Round-robin within healthy sites
  • Health Monitoring: Synthetic transactions every 30 seconds to each regional cluster
  • Failover Logic: Automatic failover to secondary region if primary site health drops below 80%

Geographic Routing Rules:

User Location | Primary Data Center    | Secondary Data Center  | Tertiary Data Center
North America | Virginia (US-East)     | Frankfurt (EU-Central) | Singapore (APAC)
Europe/Africa | Frankfurt (EU-Central) | Virginia (US-East)     | Singapore (APAC)
Asia/Pacific  | Singapore (APAC)       | Frankfurt (EU-Central) | Virginia (US-East)
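
The routing rules above amount to an ordered preference list per user region combined with a health gate. The following is a minimal Python sketch of that decision, not the F5 GTM configuration itself. The `select_site` function and the idea of a site health score between 0.0 and 1.0 are assumptions made for illustration; the source only states that failover occurs when site health drops below 80%.

```python
from typing import Dict, List, Optional

# Preference order per user region, taken from the routing rules above.
ROUTING_TABLE: Dict[str, List[str]] = {
    "North America": ["Virginia", "Frankfurt", "Singapore"],
    "Europe/Africa": ["Frankfurt", "Virginia", "Singapore"],
    "Asia/Pacific": ["Singapore", "Frankfurt", "Virginia"],
}

# Failover threshold from the text: a site below 80% health is skipped.
HEALTH_THRESHOLD = 0.80


def select_site(user_region: str, site_health: Dict[str, float]) -> Optional[str]:
    """Return the first site in preference order that meets the health threshold.

    site_health maps a site name to a score in [0.0, 1.0]. How that score is
    computed is an assumption of this sketch; the source does not define it.
    """
    for site in ROUTING_TABLE[user_region]:
        if site_health.get(site, 0.0) >= HEALTH_THRESHOLD:
            return site
    return None  # no site is healthy enough to receive traffic


if __name__ == "__main__":
    health = {"Virginia": 0.50, "Frankfurt": 1.0, "Singapore": 1.0}
    print(select_site("North America", health))
```

In this example a North American user is sent to Frankfurt because Virginia falls below the threshold, which mirrors the secondary column of the routing rules.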

2. Regional Load Balancers

Role: Within each data center, regional load balancers distribute traffic across the cluster of UAG appliances and provide local high availability.

Implementation Details:

  • Platform: NSX Advanced Load Balancer (formerly Avi Networks)
  • Deployment: Active-Active configuration with session persistence
  • Health Checks: Layer 7 health checks every 10 seconds
  • SSL Termination: Handles SSL/TLS termination and re-encryption to UAG appliances
  • DDoS Protection: Built-in protection against common attack patterns
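
To make the Layer 7 health check concrete, here is a small Python sketch of the kind of probe a load balancer performs against each appliance. It assumes the commonly used UAG health check URL `/favicon.ico`, which is expected to return HTTP 200 when the appliance is in service; treat that path, the `is_healthy` helper, and the hostname in the usage note as assumptions to verify against your own UAG and load balancer documentation.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Fetch url and return the HTTP status code, or 0 on a connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # for example 503 from an appliance taken out of service
    except (urllib.error.URLError, OSError):
        return 0  # DNS failure, refused connection, timeout, TLS error


def is_healthy(host: str, fetch: Callable[[str], int] = http_status) -> bool:
    """Treat an appliance as healthy only if the probe URL returns HTTP 200.

    The /favicon.ico path is an assumption of this sketch (see lead-in).
    """
    return fetch(f"https://{host}/favicon.ico") == 200
```

Calling `is_healthy("uag1.example.internal")` (a hypothetical hostname) every 10 seconds would approximate the check interval described above. The `fetch` parameter exists so the logic can be exercised without a live appliance.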

Load Balancing Algorithms:

  • Primary: Least connections (for optimal resource utilization)
  • Session Persistence: Source IP-based persistence for active sessions
  • Failover: Immediate removal of failed UAG appliances from rotation
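
The interaction between these three behaviors is easier to see in code. The sketch below is an illustrative Python model, not the NSX Advanced Load Balancer implementation: the class name, method names, and appliance names are hypothetical. It picks the appliance with the fewest active connections, pins each client IP to its first choice, and discards both the appliance and its persistence entries when it fails.

```python
from typing import Dict, Iterable


class LeastConnectionsBalancer:
    """Least-connections selection with source-IP persistence (illustrative)."""

    def __init__(self, appliances: Iterable[str]) -> None:
        self.active: Dict[str, int] = {name: 0 for name in appliances}
        self.persistence: Dict[str, str] = {}  # client IP -> appliance name

    def mark_failed(self, appliance: str) -> None:
        """Remove a failed appliance from rotation and drop its persistence entries."""
        self.active.pop(appliance, None)
        self.persistence = {
            ip: target for ip, target in self.persistence.items() if target != appliance
        }

    def route(self, client_ip: str) -> str:
        """Return the appliance that should receive a new connection from client_ip."""
        if not self.active:
            raise RuntimeError("no healthy appliances in rotation")
        target = self.persistence.get(client_ip)
        if target is None:
            # Iterating over sorted names keeps tie-breaking deterministic.
            target = min(sorted(self.active), key=lambda name: self.active[name])
            self.persistence[client_ip] = target
        self.active[target] += 1
        return target
```

One consequence worth noting: source-IP persistence can skew the distribution when many users share one public address (for example, behind a corporate NAT), because persistence takes priority over the least-connections choice.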

3. UAG Appliance Clusters

Role: The UAG appliances provide secure remote access to internal Horizon and Workspace ONE resources.

Cluster Configuration per Data Center:

  • Appliance Count: 4 UAG appliances (N+1 redundancy with room for growth)
  • Sizing: Large appliances (4 vCPU, 16 GB RAM), each supporting 2,000 concurrent sessions
  • Network Placement: DMZ network with dedicated VLANs for external and internal traffic
  • Configuration Synchronization: Automated configuration deployment using PowerShell DSC
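
A quick capacity check shows how the cluster sizing relates to the 15,000-user growth target. The Python sketch below uses only figures stated in this use case (4 appliances per site, 2,000 sessions per appliance, 3 sites); the function names are hypothetical, and it assumes every appliance can actually sustain its rated session count.

```python
APPLIANCES_PER_SITE = 4
SESSIONS_PER_APPLIANCE = 2_000
SITES = 3
TARGET_CONCURRENT_USERS = 15_000


def site_capacity(appliances: int = APPLIANCES_PER_SITE) -> int:
    """Rated concurrent sessions for one site with the given appliance count."""
    return appliances * SESSIONS_PER_APPLIANCE


def surviving_capacity(failed_sites: int, failed_appliances_per_site: int = 0) -> int:
    """Rated capacity left after losing whole sites and/or appliances per site."""
    healthy_sites = SITES - failed_sites
    healthy_appliances = APPLIANCES_PER_SITE - failed_appliances_per_site
    return healthy_sites * site_capacity(healthy_appliances)


if __name__ == "__main__":
    scenarios = {
        "all healthy": surviving_capacity(0),
        "one site down": surviving_capacity(1),
        "one site down, one appliance down per surviving site": surviving_capacity(1, 1),
    }
    for label, capacity in scenarios.items():
        status = "meets" if capacity >= TARGET_CONCURRENT_USERS else "falls short of"
        print(f"{label}: {capacity} sessions, {status} the target")
```

Under these assumptions, all three sites together provide 24,000 rated sessions and losing one site leaves 16,000, which still covers the 15,000-user target. Losing a site and one appliance in each surviving site leaves 12,000, so the year-three load would need additional appliances to tolerate that combined failure.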

Security Configuration:

  • Certificate Management: Wildcard certificates with automated renewal
  • Authentication: Integration with Active Directory and multi-factor authentication
  • Network Security: Strict firewall rules allowing only necessary traffic
  • Logging: Comprehensive logging to SIEM for security monitoring

4. Backend Infrastructure Integration

Each regional UAG cluster connects to both local and remote backend infrastructure to support disaster recovery scenarios.

Connection Server Pools:

  • Primary Connections: UAGs connect to local Connection Servers for optimal performance
  • Secondary Connections: Cross-region connections configured for DR scenarios
  • Load Balancing: Connection Server load balancing for optimal resource utilization

Implementation Timeline and Phases

Phase 1: Foundation (Weeks 1-4)

  • Deploy GSLB infrastructure and configure basic routing
  • Install and configure regional load balancers
  • Deploy initial UAG appliances in primary data center
  • Establish monitoring and alerting

Phase 2: Regional Expansion (Weeks 5-8)

  • Deploy UAG clusters in secondary and tertiary data centers
  • Configure cross-region connectivity and failover
  • Implement automated configuration management
  • Conduct initial failover testing

Phase 3: Optimization and Testing (Weeks 9-12)

  • Performance tuning and optimization
  • Comprehensive disaster recovery testing
  • Security hardening and penetration testing
  • User acceptance testing and training

Performance and Capacity Planning

Capacity Metrics:

Metric                        | Target      | Current Performance
Concurrent Sessions per UAG   | 2,000       | 1,850 (peak)
Connection Establishment Time | < 5 seconds | 3.2 seconds (average)
Bandwidth per Session         | 1-2 Mbps    | 1.4 Mbps (average)
CPU Utilization               | < 70%       | 55% (peak)
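
These metrics can be combined into two planning numbers: the aggregate bandwidth a single appliance carries and the remaining session headroom. The Python sketch below is a rough estimate only. It multiplies the peak session count by the average per-session bandwidth, which is an assumption of this example because the source does not report peak bandwidth; the function names are hypothetical.

```python
def aggregate_bandwidth_mbps(sessions: int, mbps_per_session: float) -> float:
    """Estimated total throughput for one appliance, in Mbps."""
    return sessions * mbps_per_session


def session_headroom(target: int, observed_peak: int) -> float:
    """Fraction of the session target still unused at the observed peak."""
    return (target - observed_peak) / target


if __name__ == "__main__":
    per_uag = aggregate_bandwidth_mbps(1_850, 1.4)
    print(f"Per appliance at peak sessions: {per_uag / 1000:.2f} Gbps")
    print(f"Session headroom: {session_headroom(2_000, 1_850):.1%}")
```

With the figures above this works out to roughly 2.6 Gbps per appliance and 7.5% session headroom, which suggests that session count rather than CPU (55% at peak) is the first limit a cluster would reach.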

Scaling Considerations:

  • Horizontal Scaling: Additional UAG appliances can be added to each cluster as needed
  • Vertical Scaling: Appliance sizing can be increased for higher session density
  • Geographic Expansion: Additional regional clusters can be added for new markets

Monitoring and Operations

Monitoring Stack:

  • Infrastructure Monitoring: VMware vRealize Operations for infrastructure health
  • Application Monitoring: VMware vRealize Log Insight for application logs
  • Network Monitoring: SolarWinds NPM for network performance
  • User Experience Monitoring: Synthetic transactions to measure end-user experience

Key Performance Indicators (KPIs):

  • Availability: 99.95% uptime achieved (exceeding 99.9% target)
  • Performance: Average connection time reduced by 40% compared to previous solution
  • User Satisfaction: 94% user satisfaction score in quarterly surveys
  • Security: Zero security incidents related to remote access infrastructure
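
To put the availability figures in operational terms, the short Python sketch below converts an uptime percentage into a yearly downtime budget. It assumes a 365-day year and counts all downtime equally, planned or unplanned; the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct: float) -> float:
    """Maximum minutes of downtime per year allowed at the given availability."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for pct in (99.9, 99.95):
        minutes = downtime_budget_minutes(pct)
        print(f"{pct}% uptime allows {minutes:.1f} minutes ({minutes / 60:.2f} hours) per year")
```

The 99.9% target corresponds to about 525.6 minutes (8.76 hours) of downtime per year, while the achieved 99.95% corresponds to about 262.8 minutes (4.38 hours).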

Disaster Recovery Testing Results

Quarterly DR tests validate the architecture’s resilience:

Test Scenarios:

  1. Single UAG Failure: Automatic failover within 30 seconds, no user impact
  2. Regional Load Balancer Failure: Failover to secondary load balancer within 60 seconds
  3. Complete Data Center Outage: GSLB redirects traffic to secondary region within 2 minutes
  4. Network Partition: Users automatically reconnect to available regions

Lessons Learned:

  • DNS TTL Optimization: Reduced DNS TTL to 60 seconds for faster failover
  • Session Persistence: Implemented session state replication for seamless failover
  • Monitoring Sensitivity: Tuned health check thresholds to reduce false positives
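
The DNS TTL matters because a client keeps using a cached answer until it expires, even after the GSLB has detected a failure. The Python sketch below estimates a worst-case redirection time as detection time plus TTL. The number of consecutive failed probes required before a site is marked down is an assumption of this example (the source does not state it), as is the 300-second TTL used for comparison; the function name is hypothetical.

```python
def worst_case_failover_seconds(
    probe_interval_s: int, failures_required: int, dns_ttl_s: int
) -> int:
    """Upper bound on redirection time: failure detection plus cached DNS lifetime."""
    detection_s = probe_interval_s * failures_required
    return detection_s + dns_ttl_s


if __name__ == "__main__":
    # 30-second GSLB probes and a 60-second TTL come from this use case;
    # failures_required=2 and the 300-second TTL are illustrative assumptions.
    print(worst_case_failover_seconds(30, 2, 60))
    print(worst_case_failover_seconds(30, 2, 300))
```

Under these assumptions, a 60-second TTL gives a worst case of 120 seconds, while a 300-second TTL would give 360 seconds. Real clients and resolvers sometimes cache longer than the TTL, so measured failover times can exceed this bound.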

Security Considerations and Compliance

Security Measures:

  • Network Segmentation: UAGs deployed in isolated DMZ networks
  • Certificate Management: Automated certificate lifecycle management
  • Access Controls: Role-based access controls for administrative functions
  • Audit Logging: Comprehensive logging for compliance and forensics

Compliance Achievements:

  • SOC 2 Type II: Successfully passed annual audit with no findings
  • ISO 27001: Certified for information security management
  • Industry Standards: Meets NIST Cybersecurity Framework requirements

Cost Analysis and ROI

Implementation Costs:

  • Infrastructure: $450,000 (hardware, software, networking)
  • Professional Services: $150,000 (design, implementation, testing)
  • Training and Documentation: $50,000
  • Total Initial Investment: $650,000

Operational Benefits:

  • Reduced Downtime: $2.1M annual savings from improved availability
  • Improved Productivity: $1.8M annual value from better user experience
  • Reduced Support Costs: $400K annual savings from fewer connectivity issues
  • Total Annual Benefits: $4.3M
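  • ROI: Approximately 562%, measured as first-year benefits against the initial investment: ($4.3M − $0.65M) / $0.65M, excluding recurring operating costs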
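
The cost and benefit figures can be checked with simple arithmetic. The Python sketch below uses the formula ROI = (benefits − investment) / investment and ignores recurring operating costs, which is an assumption of this example because the source lists only the initial investment. The function name is hypothetical.

```python
def simple_roi(total_benefit: float, investment: float) -> float:
    """Return ROI as a fraction: (benefit - investment) / investment."""
    return (total_benefit - investment) / investment


# All amounts in thousands of US dollars, taken from the lists above.
INVESTMENT = 450 + 150 + 50          # infrastructure + services + training
ANNUAL_BENEFIT = 2_100 + 1_800 + 400  # downtime + productivity + support

if __name__ == "__main__":
    print(f"Initial investment: ${INVESTMENT}K")
    print(f"Annual benefit: ${ANNUAL_BENEFIT}K")
    print(f"ROI after one year: {simple_roi(ANNUAL_BENEFIT, INVESTMENT):.1%}")
    print(f"ROI after three years: {simple_roi(3 * ANNUAL_BENEFIT, INVESTMENT):.1%}")
    print(f"Payback period: {INVESTMENT / ANNUAL_BENEFIT * 12:.1f} months")
```

With the stated figures, one year of benefits yields an ROI of about 562%, and the initial investment is recovered in under two months. Accumulating three years of benefits against the same one-time investment gives a much higher figure, so any three-year calculation should state which recurring costs it includes.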

Future Enhancements and Roadmap

Planned Improvements:

  • SD-WAN Integration: Integration with SD-WAN for optimized routing
  • Cloud Integration: Hybrid deployment with cloud-based UAG instances
  • AI-Powered Optimization: Machine learning for predictive scaling and routing
  • Zero Trust Enhancement: Integration with Zero Trust Network Access (ZTNA) solutions

Conclusion: Building for Resilience

By combining a GSLB for geographic routing and disaster recovery with regional load balancers for high availability, this architecture provides a robust, scalable, and high-performance remote access solution. The implementation demonstrates that with proper planning and execution, organizations can achieve enterprise-grade resilience while maintaining optimal user experience.

Key success factors include:

  • Comprehensive Planning: Detailed requirements gathering and architectural design
  • Phased Implementation: Gradual rollout with extensive testing at each phase
  • Continuous Monitoring: Proactive monitoring and optimization
  • Regular Testing: Quarterly disaster recovery testing and validation

This architecture lets employees remain productive from anywhere in the world and lets the business ride out a significant regional outage with minimal disruption, since traffic is redirected to a secondary region within about two minutes. It moves beyond basic remote access and creates a resilient enterprise service that scales with business growth.

“This UAG architecture has transformed our remote access capabilities. We’ve gone from worrying about single points of failure to having confidence that our global workforce can access resources reliably, regardless of where they are or what might happen to our infrastructure.” – Chief Technology Officer, Global Manufacturing Company
