Introduction: Infrastructure as Code Meets Virtualization
In my years of working with enterprise virtualization environments, I’ve seen VM provisioning evolve from manual builds to sophisticated automation platforms. Terraform has emerged as one of the most powerful tools for managing infrastructure as code, and its vSphere integration makes it practical to automate virtual machine deployments at scale.
This implementation guide walks you through building a comprehensive Terraform-based VM deployment system for vSphere environments. Whether you’re looking to automate development environment provisioning, implement self-service infrastructure, or standardize your VM deployment processes, this guide provides the practical foundation you need.
Prerequisites and Environment Setup
Required Components
Before diving into the implementation, ensure you have the following components properly configured:
Infrastructure Requirements:
- vSphere Environment: vCenter Server 6.7 or later
- Terraform: Version 1.0 or later
- vSphere Provider: Latest version from HashiCorp
- Service Account: Dedicated vSphere account with appropriate permissions
- Network Access: Connectivity from Terraform host to vCenter
vSphere Permissions Setup
Creating a dedicated service account with only the permissions it needs is crucial for security. Here’s the permission set I recommend as a starting point; remove any privileges your deployments never use:
# PowerShell script to create vSphere service account permissions
# Connect to vCenter first
Connect-VIServer -Server vcenter.domain.com -User administrator@vsphere.local
# Create custom role for Terraform
$terraformRole = "Terraform-Automation"
$permissions = @(
"Datastore.AllocateSpace",
"Datastore.Browse",
"Datastore.FileManagement",
"Network.Assign",
"Resource.AssignVMToPool",
"VirtualMachine.Config.AddExistingDisk",
"VirtualMachine.Config.AddNewDisk",
"VirtualMachine.Config.AddRemoveDevice",
"VirtualMachine.Config.AdvancedConfig",
"VirtualMachine.Config.Annotation",
"VirtualMachine.Config.CPUCount",
"VirtualMachine.Config.Disk",
"VirtualMachine.Config.DiskExtend",
"VirtualMachine.Config.DiskLease",
"VirtualMachine.Config.EditDevice",
"VirtualMachine.Config.Memory",
"VirtualMachine.Config.MksControl",
"VirtualMachine.Config.QueryFTCompatibility",
"VirtualMachine.Config.QueryUnownedFiles",
"VirtualMachine.Config.RawDevice",
"VirtualMachine.Config.ReloadFromPath",
"VirtualMachine.Config.RemoveDisk",
"VirtualMachine.Config.Rename",
"VirtualMachine.Config.ResetGuestInfo",
"VirtualMachine.Config.Resource",
"VirtualMachine.Config.Settings",
"VirtualMachine.Config.SwapPlacement",
"VirtualMachine.Config.ToggleForkParent",
"VirtualMachine.Config.UpgradeVirtualHardware",
"VirtualMachine.Interact.AnswerQuestion",
"VirtualMachine.Interact.Backup",
"VirtualMachine.Interact.ConsoleInteract",
"VirtualMachine.Interact.CreateScreenshot",
"VirtualMachine.Interact.CreateSecondary",
"VirtualMachine.Interact.DefragmentAllDisks",
"VirtualMachine.Interact.DeviceConnection",
"VirtualMachine.Interact.DisableSecondary",
"VirtualMachine.Interact.DnD",
"VirtualMachine.Interact.EnableSecondary",
"VirtualMachine.Interact.GuestControl",
"VirtualMachine.Interact.MakePrimary",
"VirtualMachine.Interact.Pause",
"VirtualMachine.Interact.PowerOff",
"VirtualMachine.Interact.PowerOn",
"VirtualMachine.Interact.PutUsbScanCodes",
"VirtualMachine.Interact.Record",
"VirtualMachine.Interact.Replay",
"VirtualMachine.Interact.Reset",
"VirtualMachine.Interact.SESparseMaintenance",
"VirtualMachine.Interact.SetCDMedia",
"VirtualMachine.Interact.SetFloppyMedia",
"VirtualMachine.Interact.Suspend",
"VirtualMachine.Interact.TerminateFaultTolerantVM",
"VirtualMachine.Interact.ToolsInstall",
"VirtualMachine.Interact.TurnOffFaultTolerance",
"VirtualMachine.Inventory.Create",
"VirtualMachine.Inventory.CreateFromExisting",
"VirtualMachine.Inventory.Delete",
"VirtualMachine.Inventory.Move",
"VirtualMachine.Inventory.Register",
"VirtualMachine.Inventory.Unregister",
"VirtualMachine.Provisioning.Clone",
"VirtualMachine.Provisioning.CloneTemplate",
"VirtualMachine.Provisioning.CreateTemplateFromVM",
"VirtualMachine.Provisioning.Customize",
"VirtualMachine.Provisioning.DeployTemplate",
"VirtualMachine.Provisioning.DiskRandomAccess",
"VirtualMachine.Provisioning.DiskRandomRead",
"VirtualMachine.Provisioning.FileRandomAccess",
"VirtualMachine.Provisioning.GetVmFiles",
"VirtualMachine.Provisioning.MarkAsTemplate",
"VirtualMachine.Provisioning.MarkAsVM",
"VirtualMachine.Provisioning.ModifyCustSpecs",
"VirtualMachine.Provisioning.PromoteDisks",
"VirtualMachine.Provisioning.PutVmFiles",
"VirtualMachine.Provisioning.ReadCustSpecs"
)
# Create the role
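# New-VIRole expects privilege objects, so resolve the IDs with Get-VIPrivilege first
New-VIRole -Name $terraformRole -Privilege (Get-VIPrivilege -Id $permissions)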
# Create service account (this would typically be done in AD)
Write-Host "Create service account 'terraform-svc' in your domain with a strong password"
Write-Host "Then assign the role to this account on the appropriate vSphere objects"
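If you would rather keep the role definition under version control, the vSphere provider also offers a `vsphere_role` resource. The following is a minimal sketch, assuming it is applied with an administrative account (the service account cannot bootstrap its own role). The resource label `terraform_automation` is a placeholder, and the privilege list is shortened for illustration rather than being a complete set.

```hcl
# Sketch: manage the custom role with Terraform instead of PowerCLI.
# Assumption: applied under an administrative identity. The privilege list
# is abbreviated for illustration and is not a complete set.
resource "vsphere_role" "terraform_automation" {
  name = "Terraform-Automation"

  role_privileges = [
    "Datastore.AllocateSpace",
    "Datastore.Browse",
    "Network.Assign",
    "Resource.AssignVMToPool",
    "VirtualMachine.Inventory.Create",
    "VirtualMachine.Inventory.Delete",
    "VirtualMachine.Provisioning.Clone",
    "VirtualMachine.Provisioning.Customize",
    "VirtualMachine.Provisioning.DeployTemplate",
  ]
}
```

Keeping the role in a separate configuration from the VM deployments avoids mixing administrative credentials with the day-to-day service account.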
Terraform Installation and Configuration
Let me walk you through setting up Terraform with the vSphere provider. I’ll show you both Windows and Linux installation approaches since I’ve worked with both in enterprise environments.
Windows Installation:
# PowerShell script for Windows Terraform installation
# Download and install Terraform
$terraformVersion = "1.6.0"
$downloadUrl = "https://releases.hashicorp.com/terraform/${terraformVersion}/terraform_${terraformVersion}_windows_amd64.zip"
$installPath = "C:\terraform"
# Create installation directory
New-Item -ItemType Directory -Path $installPath -Force
# Download Terraform
Invoke-WebRequest -Uri $downloadUrl -OutFile "$installPath\terraform.zip"
# Extract
Expand-Archive -Path "$installPath\terraform.zip" -DestinationPath $installPath -Force
# Add to PATH
$currentPath = [Environment]::GetEnvironmentVariable("PATH", "Machine")
if ($currentPath -notlike "*$installPath*") {
[Environment]::SetEnvironmentVariable("PATH", "$currentPath;$installPath", "Machine")
}
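# Verify installation (call the binary directly: the current session has not picked up the new machine PATH yet)
& (Join-Path $installPath "terraform.exe") version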
Linux Installation:
#!/bin/bash
# Linux Terraform installation script
set -euo pipefail
# Install prerequisites first (unzip is needed to extract the release archive)
sudo apt-get update
sudo apt-get install -y unzip wget git curl jq
# Download and install Terraform
TERRAFORM_VERSION="1.6.0"
cd /tmp
wget "https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip"
unzip -o "terraform_${TERRAFORM_VERSION}_linux_amd64.zip"
sudo mv terraform /usr/local/bin/
# Verify installation
terraform version
Building the Terraform Configuration
Project Structure and Organization
Proper project organization is crucial for maintainable Terraform code. Here’s the structure I use for vSphere automation projects:
terraform-vsphere/
├── main.tf              # Main configuration
├── variables.tf         # Variable definitions
├── outputs.tf           # Output definitions
├── terraform.tfvars     # Variable values (gitignored)
├── versions.tf          # Provider version constraints
├── modules/
│   ├── vm/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── network/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── environments/
│   ├── dev/
│   ├── staging/
│   └── prod/
└── templates/
    ├── windows-2019.json
    ├── ubuntu-20.04.json
    └── centos-8.json
Core Provider Configuration
Let’s start with the foundational configuration files. The provider configuration is where most connectivity issues arise, so I’ll show you a robust setup that handles common authentication scenarios.
versions.tf – Provider Version Constraints:
terraform {
required_version = ">= 1.0"
required_providers {
vsphere = {
source = "hashicorp/vsphere"
version = "~> 2.5.0"
}
random = {
source = "hashicorp/random"
version = "~> 3.4.0"
}
local = {
source = "hashicorp/local"
version = "~> 2.4.0"
}
}
}
# Configure the vSphere Provider
provider "vsphere" {
user = var.vsphere_user
password = var.vsphere_password
vsphere_server = var.vsphere_server
allow_unverified_ssl = var.allow_unverified_ssl
# Debug logging and session file locations (the session paths apply only when persist_session is enabled)
client_debug = var.client_debug
rest_session_path = var.rest_session_path
vim_session_path = var.vim_session_path
}
variables.tf – Variable Definitions:
# vSphere connection variables
variable "vsphere_user" {
description = "vSphere username"
type = string
sensitive = true
}
variable "vsphere_password" {
description = "vSphere password"
type = string
sensitive = true
}
variable "vsphere_server" {
description = "vSphere server FQDN or IP"
type = string
}
variable "allow_unverified_ssl" {
description = "Allow unverified SSL certificates"
type = bool
default = false
}
variable "client_debug" {
description = "Enable client debugging"
type = bool
default = false
}
variable "rest_session_path" {
description = "Directory for persisted REST API session files (null uses the provider default)"
type = string
default = null
}
variable "vim_session_path" {
description = "Directory for persisted VIM (SOAP) API session files (null uses the provider default)"
type = string
default = null
}
# Infrastructure variables
variable "datacenter_name" {
description = "vSphere datacenter name"
type = string
}
variable "cluster_name" {
description = "vSphere cluster name"
type = string
}
variable "datastore_name" {
description = "vSphere datastore name"
type = string
}
variable "network_name" {
description = "vSphere network name"
type = string
}
variable "template_name" {
description = "VM template name"
type = string
}
variable "resource_pool_name" {
description = "Resource pool name"
type = string
default = ""
}
# VM configuration variables
variable "vm_configurations" {
description = "List of VM configurations to deploy"
type = list(object({
name = string
cpu_count = number
memory_mb = number
disk_size_gb = number
network_name = string
folder_path = string
annotation = string
tags = map(string)
# Guest OS customization
guest_os_type = string
computer_name = string
domain = string
domain_user = string
domain_password = string
admin_password = string
time_zone = string
# Network configuration
ipv4_address = string
ipv4_netmask = number
ipv4_gateway = string
dns_servers = list(string)
}))
default = []
}
# Environment and tagging
variable "environment" {
description = "Environment name (dev, staging, prod)"
type = string
default = "dev"
}
variable "project_name" {
description = "Project name for resource naming"
type = string
}
variable "owner" {
description = "Resource owner"
type = string
}
variable "cost_center" {
description = "Cost center for billing"
type = string
default = ""
}
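Input validation catches typos before Terraform ever contacts vCenter. As a hedged sketch, two of the declarations above could be tightened as follows. These would replace the plain declarations rather than sit alongside them, and the accepted environment names are taken from the variable's own description.

```hcl
# Sketch: stricter versions of two declarations from variables.tf.
# These replace the plain declarations; they are not additional variables.
variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
  default     = "dev"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be one of: dev, staging, prod."
  }
}

variable "vsphere_server" {
  description = "vSphere server FQDN or IP"
  type        = string

  validation {
    # Reject values that include a scheme or path, such as https://host/sdk
    condition     = !can(regex("/", var.vsphere_server))
    error_message = "Provide only the host name or IP address, without a scheme or path."
  }
}
```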
Data Sources and Resource Discovery
Data sources are essential for discovering existing vSphere resources. Here’s how I structure the data source configuration for maximum flexibility:
main.tf – Data Sources:
# Data sources for vSphere objects
data "vsphere_datacenter" "datacenter" {
name = var.datacenter_name
}
data "vsphere_compute_cluster" "cluster" {
name = var.cluster_name
datacenter_id = data.vsphere_datacenter.datacenter.id
}
data "vsphere_datastore" "datastore" {
name = var.datastore_name
datacenter_id = data.vsphere_datacenter.datacenter.id
}
data "vsphere_network" "network" {
name = var.network_name
datacenter_id = data.vsphere_datacenter.datacenter.id
}
data "vsphere_virtual_machine" "template" {
name = var.template_name
datacenter_id = data.vsphere_datacenter.datacenter.id
}
# Resource pool (optional)
data "vsphere_resource_pool" "pool" {
count = var.resource_pool_name != "" ? 1 : 0
name = var.resource_pool_name
datacenter_id = data.vsphere_datacenter.datacenter.id
}
# Default resource pool if none specified
data "vsphere_resource_pool" "default_pool" {
count = var.resource_pool_name == "" ? 1 : 0
name = format("%s/Resources", data.vsphere_compute_cluster.cluster.name)
datacenter_id = data.vsphere_datacenter.datacenter.id
}
# Local values for computed resources
locals {
resource_pool_id = var.resource_pool_name != "" ? data.vsphere_resource_pool.pool[0].id : data.vsphere_resource_pool.default_pool[0].id
# Generate VM configurations with defaults
vm_configs = [
for vm in var.vm_configurations : {
name = vm.name
cpu_count = vm.cpu_count
memory_mb = vm.memory_mb
disk_size_gb = vm.disk_size_gb
network_name = vm.network_name != "" ? vm.network_name : var.network_name
folder_path = vm.folder_path
annotation = vm.annotation
tags = merge(vm.tags, {
Environment = var.environment
Project = var.project_name
Owner = var.owner
CostCenter = var.cost_center
ManagedBy = "Terraform"
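# Note: timestamp() is re-evaluated on every run, so Terraform reports a change to this tag on each plan
CreatedDate = formatdate("YYYY-MM-DD", timestamp())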
})
# Guest OS customization
guest_os_type = vm.guest_os_type
computer_name = vm.computer_name != "" ? vm.computer_name : vm.name
domain = vm.domain
domain_user = vm.domain_user
domain_password = vm.domain_password
admin_password = vm.admin_password
time_zone = vm.time_zone
# Network configuration
ipv4_address = vm.ipv4_address
ipv4_netmask = vm.ipv4_netmask
ipv4_gateway = vm.ipv4_gateway
dns_servers = vm.dns_servers
}
]
}
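The `vsphere_compute_cluster` data source also exports the cluster's root resource pool as `resource_pool_id`, so the default pool can be resolved without a second lookup by name. A minimal sketch, where the local name `resource_pool_id_simple` is illustrative only:

```hcl
# Sketch: fall back to the cluster's root resource pool directly.
# Assumption: resource_pool_id_simple is a placeholder name.
locals {
  resource_pool_id_simple = (
    var.resource_pool_name != ""
    ? data.vsphere_resource_pool.pool[0].id
    : data.vsphere_compute_cluster.cluster.resource_pool_id
  )
}
```

With this form the `default_pool` data source is no longer needed.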
VM Deployment Module
Creating a Reusable VM Module
Modular design is crucial for maintainable Terraform code. Let me show you how to create a comprehensive VM module that handles various deployment scenarios I’ve encountered in enterprise environments.
modules/vm/main.tf:
# VM Module - Main Configuration
terraform {
required_providers {
vsphere = {
source = "hashicorp/vsphere"
}
# Needed for the random_password resource below
random = {
source = "hashicorp/random"
}
}
}
# Generate random password if not provided
resource "random_password" "admin_password" {
count = var.admin_password == "" ? 1 : 0
length = 16
special = true
}
# Create VM folder if specified
resource "vsphere_folder" "vm_folder" {
count = var.folder_path != "" ? 1 : 0
path = var.folder_path
type = "vm"
datacenter_id = var.datacenter_id
}
# Main VM resource
resource "vsphere_virtual_machine" "vm" {
name = var.vm_name
resource_pool_id = var.resource_pool_id
datastore_id = var.datastore_id
folder = var.folder_path
# VM specifications
num_cpus = var.cpu_count
memory = var.memory_mb
cpu_hot_add_enabled = var.cpu_hot_add_enabled
memory_hot_add_enabled = var.memory_hot_add_enabled
# Guest OS settings
guest_id = data.vsphere_virtual_machine.template.guest_id
firmware = data.vsphere_virtual_machine.template.firmware
scsi_type = data.vsphere_virtual_machine.template.scsi_type
scsi_bus_sharing = data.vsphere_virtual_machine.template.scsi_bus_sharing
scsi_controller_count = data.vsphere_virtual_machine.template.scsi_controller_count
# Enable features
enable_disk_uuid = true
enable_logging = var.enable_logging
cpu_performance_counters_enabled = var.cpu_performance_counters_enabled
# VM Tools and hardware
sync_time_with_host = var.sync_time_with_host
run_tools_scripts_after_power_on = var.run_tools_scripts_after_power_on
run_tools_scripts_after_resume = var.run_tools_scripts_after_resume
run_tools_scripts_before_guest_shutdown = var.run_tools_scripts_before_guest_shutdown
run_tools_scripts_before_guest_standby = var.run_tools_scripts_before_guest_standby
# Wait for guest network
wait_for_guest_net_timeout = var.wait_for_guest_net_timeout
wait_for_guest_ip_timeout = var.wait_for_guest_ip_timeout
wait_for_guest_net_routable = var.wait_for_guest_net_routable
# Annotation and metadata
annotation = var.annotation
# Tags (the resource takes a list of tag IDs rather than nested blocks)
tags = [for key in keys(var.tags) : vsphere_tag.vm_tags[key].id]
# Network interfaces
dynamic "network_interface" {
for_each = var.network_interfaces
content {
network_id = network_interface.value.network_id
adapter_type = network_interface.value.adapter_type
}
}
# Disks
dynamic "disk" {
for_each = var.disks
content {
label = disk.value.label
size = disk.value.size_gb
unit_number = disk.value.unit_number
thin_provisioned = disk.value.thin_provisioned
eagerly_scrub = disk.value.eagerly_scrub
datastore_id = disk.value.datastore_id != "" ? disk.value.datastore_id : var.datastore_id
storage_policy_id = disk.value.storage_policy_id
io_limit = disk.value.io_limit
io_reservation = disk.value.io_reservation
io_share_level = disk.value.io_share_level
io_share_count = disk.value.io_share_count
}
}
# Clone configuration
clone {
template_uuid = data.vsphere_virtual_machine.template.id
linked_clone = var.linked_clone
timeout = var.clone_timeout
# Customization
dynamic "customize" {
for_each = var.customize_guest ? [1] : []
content {
timeout = var.customization_timeout
# Linux customization
dynamic "linux_options" {
for_each = var.guest_os_family == "linux" ? [1] : []
content {
host_name = var.computer_name
domain = var.domain
hw_clock_utc = var.hw_clock_utc
time_zone = var.time_zone
}
}
# Windows customization
dynamic "windows_options" {
for_each = var.guest_os_family == "windows" ? [1] : []
content {
computer_name = var.computer_name
admin_password = var.admin_password != "" ? var.admin_password : random_password.admin_password[0].result
# join_domain and workgroup are mutually exclusive: join the domain when one is given
join_domain = var.domain != "" ? var.domain : null
domain_admin_user = var.domain != "" ? var.domain_user : null
domain_admin_password = var.domain != "" ? var.domain_password : null
workgroup = var.domain == "" ? var.workgroup : null
full_name = var.full_name
organization_name = var.organization_name
product_key = var.product_key
# Windows expects the numeric sysprep time zone index here, not a zone name
time_zone = var.time_zone
auto_logon = var.auto_logon
auto_logon_count = var.auto_logon_count
# Commands to run once at first logon (a plain list argument)
run_once_command_list = var.run_once_commands
}
}
# Network interface customization
dynamic "network_interface" {
for_each = var.network_customization
content {
ipv4_address = network_interface.value.ipv4_address
ipv4_netmask = network_interface.value.ipv4_netmask
dns_server_list = network_interface.value.dns_servers
dns_domain = network_interface.value.dns_domain
# The gateway is not a per-interface argument; it is set once in the customize block
}
}
# Global network settings
ipv4_gateway = var.ipv4_gateway
dns_server_list = var.dns_servers
dns_suffix_list = var.dns_suffix_list
}
}
}
# Lifecycle management
lifecycle {
ignore_changes = [
annotation,
clone[0].template_uuid,
clone[0].customize[0].windows_options[0].admin_password
]
}
depends_on = [
vsphere_folder.vm_folder,
vsphere_tag.vm_tags
]
}
# Create tags for the VM
resource "vsphere_tag_category" "vm_category" {
for_each = toset(keys(var.tags))
name = each.key
cardinality = "SINGLE"
description = "Tag category for ${each.key}"
associable_types = [
"VirtualMachine",
]
}
resource "vsphere_tag" "vm_tags" {
for_each = var.tags
name = each.value
category_id = vsphere_tag_category.vm_category[each.key].id
description = "Tag for ${each.key}: ${each.value}"
}
# Data source for template
data "vsphere_virtual_machine" "template" {
name = var.template_name
datacenter_id = var.datacenter_id
}
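One caveat with creating tag categories inside the module: the module is instantiated once per VM, so every instance would try to create categories with the same names, and vCenter is likely to reject the duplicates. One alternative, sketched below under the assumption that the categories and tags already exist in vCenter, is to look them up with data sources. The labels `existing` and `existing_tag_ids` are placeholders.

```hcl
# Sketch: look up pre-existing tag categories and tags instead of creating them.
# Assumption: every key in var.tags is an existing category name and every
# value is an existing tag in that category.
data "vsphere_tag_category" "existing" {
  for_each = toset(keys(var.tags))
  name     = each.key
}

data "vsphere_tag" "existing" {
  for_each    = var.tags
  name        = each.value
  category_id = data.vsphere_tag_category.existing[each.key].id
}

locals {
  # List of tag IDs suitable for the VM resource's tags argument
  existing_tag_ids = [for tag in data.vsphere_tag.existing : tag.id]
}
```

Another option is to create the categories and tags once in the root configuration and pass only the tag IDs into the module.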
modules/vm/variables.tf:
# VM Module Variables
variable "vm_name" {
description = "Name of the virtual machine"
type = string
}
variable "datacenter_id" {
description = "vSphere datacenter ID"
type = string
}
variable "resource_pool_id" {
description = "vSphere resource pool ID"
type = string
}
variable "datastore_id" {
description = "vSphere datastore ID"
type = string
}
variable "template_name" {
description = "Template name to clone from"
type = string
}
variable "folder_path" {
description = "VM folder path"
type = string
default = ""
}
# VM specifications
variable "cpu_count" {
description = "Number of CPUs"
type = number
default = 2
}
variable "memory_mb" {
description = "Memory in MB"
type = number
default = 4096
}
variable "cpu_hot_add_enabled" {
description = "Enable CPU hot add"
type = bool
default = true
}
variable "memory_hot_add_enabled" {
description = "Enable memory hot add"
type = bool
default = true
}
# Guest OS configuration
variable "guest_os_family" {
description = "Guest OS family (windows or linux)"
type = string
default = "windows"
validation {
condition = contains(["windows", "linux"], var.guest_os_family)
error_message = "Guest OS family must be either 'windows' or 'linux'."
}
}
variable "computer_name" {
description = "Computer name for guest OS"
type = string
default = ""
}
variable "domain" {
description = "Domain to join"
type = string
default = ""
}
variable "domain_user" {
description = "Domain user for joining domain"
type = string
default = ""
}
variable "domain_password" {
description = "Domain password for joining domain"
type = string
default = ""
sensitive = true
}
variable "admin_password" {
description = "Local administrator password"
type = string
default = ""
sensitive = true
}
variable "time_zone" {
description = "Time zone for guest OS (Linux: tz database name such as UTC; Windows: numeric time zone index such as 85)"
type = string
default = "UTC"
}
# Network configuration
variable "network_interfaces" {
description = "Network interface configurations"
type = list(object({
network_id = string
adapter_type = string
}))
default = []
}
variable "network_customization" {
description = "Network customization settings"
type = list(object({
ipv4_address = string
ipv4_netmask = number
dns_servers = list(string)
dns_domain = string
ipv4_gateway = string
}))
default = []
}
variable "ipv4_gateway" {
description = "IPv4 gateway"
type = string
default = ""
}
variable "dns_servers" {
description = "DNS servers"
type = list(string)
default = []
}
variable "dns_suffix_list" {
description = "DNS suffix list"
type = list(string)
default = []
}
# Disk configuration
variable "disks" {
description = "Disk configurations"
type = list(object({
label = string
size_gb = number
unit_number = number
thin_provisioned = bool
eagerly_scrub = bool
datastore_id = string
storage_policy_id = string
io_limit = number
io_reservation = number
io_share_level = string
io_share_count = number
}))
default = []
}
# Clone settings
variable "linked_clone" {
description = "Create linked clone"
type = bool
default = false
}
variable "clone_timeout" {
description = "Clone timeout in minutes"
type = number
default = 30
}
# Customization settings
variable "customize_guest" {
description = "Enable guest customization"
type = bool
default = true
}
variable "customization_timeout" {
description = "Customization timeout in minutes"
type = number
default = 20
}
# Windows-specific settings
variable "full_name" {
description = "Full name for Windows customization"
type = string
default = "Administrator"
}
variable "organization_name" {
description = "Organization name for Windows customization"
type = string
default = "Organization"
}
variable "product_key" {
description = "Windows product key"
type = string
default = ""
sensitive = true
}
variable "auto_logon" {
description = "Enable auto logon"
type = bool
default = false
}
variable "auto_logon_count" {
description = "Auto logon count"
type = number
default = 1
}
variable "workgroup" {
description = "Workgroup name"
type = string
default = "WORKGROUP"
}
variable "run_once_commands" {
description = "Commands to run once after customization"
type = list(string)
default = []
}
# Linux-specific settings
variable "hw_clock_utc" {
description = "Hardware clock uses UTC"
type = bool
default = true
}
# VM features
variable "enable_logging" {
description = "Enable VM logging"
type = bool
default = false
}
variable "cpu_performance_counters_enabled" {
description = "Enable CPU performance counters"
type = bool
default = false
}
variable "sync_time_with_host" {
description = "Sync time with host"
type = bool
default = true
}
variable "run_tools_scripts_after_power_on" {
description = "Run tools scripts after power on"
type = bool
default = true
}
variable "run_tools_scripts_after_resume" {
description = "Run tools scripts after resume"
type = bool
default = true
}
variable "run_tools_scripts_before_guest_shutdown" {
description = "Run tools scripts before guest shutdown"
type = bool
default = true
}
variable "run_tools_scripts_before_guest_standby" {
description = "Run tools scripts before guest standby"
type = bool
default = true
}
# Wait settings
variable "wait_for_guest_net_timeout" {
description = "Wait for guest network timeout"
type = number
default = 5
}
variable "wait_for_guest_ip_timeout" {
description = "Wait for guest IP timeout"
type = number
default = 5
}
variable "wait_for_guest_net_routable" {
description = "Wait for guest network routable"
type = bool
default = true
}
# Metadata
variable "annotation" {
description = "VM annotation"
type = string
default = ""
}
variable "tags" {
description = "Tags to apply to the VM"
type = map(string)
default = {}
}
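The project layout lists a modules/vm/outputs.tf that is not shown here. A minimal sketch of what it might contain follows; the output names are illustrative choices, while the attributes they read (`id`, `name`, `default_ip_address`, `guest_ip_addresses`) are exported by the `vsphere_virtual_machine` resource.

```hcl
# modules/vm/outputs.tf (sketch; output names are illustrative)
output "vm_id" {
  description = "UUID of the virtual machine"
  value       = vsphere_virtual_machine.vm.id
}

output "vm_name" {
  description = "Name of the virtual machine"
  value       = vsphere_virtual_machine.vm.name
}

output "default_ip_address" {
  description = "Primary IP address reported by VMware Tools"
  value       = vsphere_virtual_machine.vm.default_ip_address
}

output "guest_ip_addresses" {
  description = "All IP addresses reported by VMware Tools"
  value       = vsphere_virtual_machine.vm.guest_ip_addresses
}

output "admin_password" {
  description = "Local administrator password (generated when none was supplied)"
  value       = var.admin_password != "" ? var.admin_password : one(random_password.admin_password[*].result)
  sensitive   = true
}
```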
Advanced Configuration Patterns
Multi-Environment Deployment
In enterprise environments, you’ll often need to deploy the same infrastructure across multiple environments. Here’s how I structure multi-environment deployments:
environments/dev/terraform.tfvars:
# Development environment configuration
vsphere_server = "vcenter-dev.domain.com"
datacenter_name = "Datacenter-Dev"
cluster_name = "Cluster-Dev"
datastore_name = "datastore-dev-01"
network_name = "VM Network Dev"
template_name = "windows-2019-template-dev"
environment = "dev"
project_name = "webapp"
owner = "development-team"
cost_center = "IT-DEV-001"
vm_configurations = [
{
name = "webapp-dev-web01"
cpu_count = 2
memory_mb = 4096
disk_size_gb = 60
network_name = "VM Network Dev"
folder_path = "Development/WebApp"
annotation = "Development web server"
tags = {
Role = "WebServer"
Tier = "Frontend"
}
guest_os_type = "windows9Server64Guest"
computer_name = "webapp-web01"
domain = "dev.domain.com"
domain_user = "svc-terraform"
domain_password = "SecurePassword123!"
admin_password = "LocalAdminPass123!"
time_zone = "35" # Windows time zone index for Eastern Standard Time
ipv4_address = "192.168.10.10"
ipv4_netmask = 24
ipv4_gateway = "192.168.10.1"
dns_servers = ["192.168.10.5", "192.168.10.6"]
},
{
name = "webapp-dev-db01"
cpu_count = 4
memory_mb = 8192
disk_size_gb = 100
network_name = "VM Network Dev"
folder_path = "Development/WebApp"
annotation = "Development database server"
tags = {
Role = "DatabaseServer"
Tier = "Backend"
}
guest_os_type = "windows9Server64Guest"
computer_name = "webapp-db01"
domain = "dev.domain.com"
domain_user = "svc-terraform"
domain_password = "SecurePassword123!"
admin_password = "LocalAdminPass123!"
time_zone = "35" # Windows time zone index for Eastern Standard Time
ipv4_address = "192.168.10.11"
ipv4_netmask = 24
ipv4_gateway = "192.168.10.1"
dns_servers = ["192.168.10.5", "192.168.10.6"]
}
]
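The passwords in the example above are stored in plain text, which is risky even when terraform.tfvars is gitignored. A hedged alternative is to declare them as sensitive root variables and supply the values through `TF_VAR_` environment variables, which Terraform reads automatically. The variable names below are illustrative.

```hcl
# Sketch: keep credentials out of terraform.tfvars.
# Assumption: the variable names are placeholders. Values come from the
# environment, for example TF_VAR_domain_join_password.
variable "domain_join_password" {
  description = "Password for the domain join account"
  type        = string
  sensitive   = true
}

variable "local_admin_password" {
  description = "Local administrator password for new VMs"
  type        = string
  sensitive   = true
}

# In the module call, reference these variables instead of per-VM values:
#   domain_password = var.domain_join_password
#   admin_password  = var.local_admin_password
```

In a CI pipeline these values would typically be defined as masked variables so they never appear in the repository or job logs.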
Using the VM Module
Now let’s put it all together in the main configuration that uses our VM module:
main.tf – Using the Module:
# Main Terraform configuration using the VM module
module "virtual_machines" {
source = "./modules/vm"
for_each = { for vm in local.vm_configs : vm.name => vm }
# Basic VM settings
vm_name = each.value.name
datacenter_id = data.vsphere_datacenter.datacenter.id
resource_pool_id = local.resource_pool_id
datastore_id = data.vsphere_datastore.datastore.id
template_name = var.template_name
folder_path = each.value.folder_path
# VM specifications
cpu_count = each.value.cpu_count
memory_mb = each.value.memory_mb
# Guest OS configuration
# VMware guest IDs for Windows begin with "win"; treat everything else as Linux
guest_os_family = can(regex("^win", lower(each.value.guest_os_type))) ? "windows" : "linux"
computer_name = each.value.computer_name
domain = each.value.domain
domain_user = each.value.domain_user
domain_password = each.value.domain_password
admin_password = each.value.admin_password
time_zone = each.value.time_zone
# Network configuration
network_interfaces = [
{
network_id = data.vsphere_network.network.id
adapter_type = "vmxnet3"
}
]
network_customization = each.value.ipv4_address != "" ? [
{
ipv4_address = each.value.ipv4_address
ipv4_netmask = each.value.ipv4_netmask
dns_servers = each.value.dns_servers
dns_domain = each.value.domain
ipv4_gateway = each.value.ipv4_gateway
}
] : []
# Disk configuration
disks = [
{
label = "disk0"
size_gb = each.value.disk_size_gb
unit_number = 0
thin_provisioned = true
eagerly_scrub = false
datastore_id = ""
storage_policy_id = ""
io_limit = -1
io_reservation = 0
io_share_level = "normal"
io_share_count = 0
}
]
# Metadata
annotation = each.value.annotation
tags = each.value.tags
# Global network settings
ipv4_gateway = each.value.ipv4_gateway
dns_servers = each.value.dns_servers
}
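Because the module is called with `for_each`, `module.virtual_machines` is a map keyed by VM name, which makes it straightforward to surface per-VM details in the root outputs.tf. This sketch assumes the module declares outputs named `default_ip_address` and `vm_id`; those names are hypothetical and must match whatever the module actually exposes.

```hcl
# outputs.tf (sketch). Assumes modules/vm declares outputs named
# default_ip_address and vm_id; those names are hypothetical.
output "vm_ip_addresses" {
  description = "Primary IP address of each VM, keyed by VM name"
  value       = { for name, vm in module.virtual_machines : name => vm.default_ip_address }
}

output "vm_ids" {
  description = "UUID of each VM, keyed by VM name"
  value       = { for name, vm in module.virtual_machines : name => vm.vm_id }
}
```

After an apply, `terraform output vm_ip_addresses` prints the map, and `terraform output -json` makes it consumable by other tooling.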
Automation and CI/CD Integration
GitLab CI/CD Pipeline
Integrating Terraform with CI/CD pipelines is essential for enterprise automation. Here’s a GitLab CI configuration I use for Terraform deployments:
.gitlab-ci.yml:
stages:
  - validate
  - plan
  - apply
  - destroy
variables:
  TF_ROOT: ${CI_PROJECT_DIR}
  TF_IN_AUTOMATION: "true"
  TF_INPUT: "false"
  TF_CLI_ARGS: "-no-color"
cache:
  key: "${CI_COMMIT_REF_SLUG}"
  paths:
    - ${TF_ROOT}/.terraform
before_script:
  - cd ${TF_ROOT}
  - terraform --version
  # Run init once with all backend settings; separate init calls would not combine them
  - >
    terraform init
    -backend-config="address=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${CI_ENVIRONMENT_NAME}"
    -backend-config="lock_address=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${CI_ENVIRONMENT_NAME}/lock"
    -backend-config="unlock_address=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${CI_ENVIRONMENT_NAME}/lock"
    -backend-config="username=gitlab-ci-token"
    -backend-config="password=${CI_JOB_TOKEN}"
    -backend-config="lock_method=POST"
    -backend-config="unlock_method=DELETE"
    -backend-config="retry_wait_min=5"
validate:
  stage: validate
  script:
    - terraform validate
    - terraform fmt -check=true -diff=true
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'
plan:
  stage: plan
  script:
    - terraform plan -var-file="environments/${CI_ENVIRONMENT_NAME}/terraform.tfvars" -out="planfile"
  artifacts:
    name: plan
    paths:
      - ${TF_ROOT}/planfile
    expire_in: 1 week
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'
apply:
  stage: apply
  script:
    - terraform apply -input=false "planfile"
  dependencies:
    - plan
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
  environment:
    name: ${CI_ENVIRONMENT_NAME}
destroy:
  stage: destroy
  script:
    - terraform destroy -var-file="environments/${CI_ENVIRONMENT_NAME}/terraform.tfvars" -auto-approve
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
  environment:
    name: ${CI_ENVIRONMENT_NAME}
    action: stop
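The `-backend-config` flags in the pipeline only take effect if the configuration declares a matching backend. For GitLab-managed state that is the `http` backend, left empty so the pipeline can fill in the address, lock settings and credentials at init time. A minimal sketch; the file name backend.tf is a convention, not a requirement:

```hcl
# backend.tf (sketch): an empty HTTP backend block.
# The address, lock settings and credentials are supplied at init time
# through the -backend-config flags used in the pipeline.
terraform {
  backend "http" {}
}
```

Keeping the block empty means no state URL or token is ever committed to the repository.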
PowerShell Automation Scripts
For Windows-centric environments, PowerShell scripts can provide additional automation capabilities:
Deploy-Infrastructure.ps1:
# PowerShell script for automated Terraform deployment
param(
[Parameter(Mandatory=$true)]
[ValidateSet("dev", "staging", "prod")]
[string]$Environment,
[Parameter(Mandatory=$true)]
[ValidateSet("plan", "apply", "destroy")]
[string]$Action,
[string]$ProjectPath = ".",
[switch]$AutoApprove,
[switch]$Detailed
)
# Set error handling
$ErrorActionPreference = "Stop"
# Configuration
$TerraformPath = "terraform.exe"
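# Resolve the log directory to an absolute path so logging still works after Set-Location
$LogPath = Join-Path (Get-Location).Path "logs"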
$Timestamp = Get-Date -Format "yyyyMMdd-HHmmss"
# Create log directory
if (-not (Test-Path $LogPath)) {
New-Item -ItemType Directory -Path $LogPath -Force
}
# Logging function
function Write-Log {
param([string]$Message, [string]$Level = "INFO")
$LogMessage = "$(Get-Date -Format 'yyyy-MM-dd HH:mm:ss') [$Level] $Message"
Write-Host $LogMessage
Add-Content -Path (Join-Path $LogPath "terraform-$Timestamp.log") -Value $LogMessage
}
# Validate Terraform installation
try {
$TerraformVersion = & $TerraformPath version
Write-Log "Terraform version: $($TerraformVersion[0])"
} catch {
Write-Log "Terraform not found or not executable" "ERROR"
exit 1
}
# Change to project directory
Set-Location $ProjectPath
Write-Log "Working directory: $(Get-Location)"
# Initialize Terraform
Write-Log "Initializing Terraform..."
try {
& $TerraformPath init -upgrade
if ($LASTEXITCODE -ne 0) {
throw "Terraform init failed"
}
} catch {
Write-Log "Terraform initialization failed: $($_.Exception.Message)" "ERROR"
exit 1
}
# Validate configuration
Write-Log "Validating Terraform configuration..."
try {
& $TerraformPath validate
if ($LASTEXITCODE -ne 0) {
throw "Terraform validation failed"
}
Write-Log "Configuration validation successful"
} catch {
Write-Log "Terraform validation failed: $($_.Exception.Message)" "ERROR"
exit 1
}
# Execute requested action
switch ($Action) {
"plan" {
Write-Log "Creating Terraform plan for environment: $Environment"
$PlanFile = "terraform-$Environment-$Timestamp.tfplan"
try {
if ($Detailed) {
& $TerraformPath plan -var-file="environments/$Environment/terraform.tfvars" -out=$PlanFile -detailed-exitcode
} else {
& $TerraformPath plan -var-file="environments/$Environment/terraform.tfvars" -out=$PlanFile
}
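if ($LASTEXITCODE -eq 0) {
# Without -detailed-exitcode, exit code 0 only means the plan succeeded
if ($Detailed) {
Write-Log "No changes detected"
} else {
Write-Log "Plan succeeded and saved to: $PlanFile"
}
} elseif ($Detailed -and $LASTEXITCODE -eq 2) {
Write-Log "Changes detected and plan saved to: $PlanFile"
} else {
throw "Plan failed with exit code: $LASTEXITCODE"
}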
} catch {
Write-Log "Terraform plan failed: $($_.Exception.Message)" "ERROR"
exit 1
}
}
"apply" {
Write-Log "Applying Terraform configuration for environment: $Environment"
# Check for existing plan file
$PlanFiles = Get-ChildItem -Filter "terraform-$Environment-*.tfplan" | Sort-Object LastWriteTime -Descending
if ($PlanFiles.Count -gt 0 -and -not $AutoApprove) {
$LatestPlan = $PlanFiles[0].Name
Write-Log "Using existing plan file: $LatestPlan"
try {
& $TerraformPath apply $LatestPlan
if ($LASTEXITCODE -ne 0) {
throw "Apply failed"
}
Write-Log "Apply completed successfully"
} catch {
Write-Log "Terraform apply failed: $($_.Exception.Message)" "ERROR"
exit 1
}
} else {
Write-Log "No plan file found or auto-approve enabled, applying directly"
try {
if ($AutoApprove) {
& $TerraformPath apply -var-file="environments/$Environment/terraform.tfvars" -auto-approve
} else {
& $TerraformPath apply -var-file="environments/$Environment/terraform.tfvars"
}
if ($LASTEXITCODE -ne 0) {
throw "Apply failed"
}
Write-Log "Apply completed successfully"
} catch {
Write-Log "Terraform apply failed: $($_.Exception.Message)" "ERROR"
exit 1
}
}
}
"destroy" {
Write-Log "Destroying Terraform-managed infrastructure for environment: $Environment" "WARNING"
if (-not $AutoApprove) {
$Confirmation = Read-Host "Are you sure you want to destroy all resources in $Environment? (yes/no)"
if ($Confirmation -ne "yes") {
Write-Log "Destroy operation cancelled by user"
exit 0
}
}
try {
if ($AutoApprove) {
& $TerraformPath destroy -var-file="environments/$Environment/terraform.tfvars" -auto-approve
} else {
& $TerraformPath destroy -var-file="environments/$Environment/terraform.tfvars"
}
if ($LASTEXITCODE -ne 0) {
throw "Destroy failed"
}
Write-Log "Destroy completed successfully"
} catch {
Write-Log "Terraform destroy failed: $($_.Exception.Message)" "ERROR"
exit 1
}
}
}
# Generate summary report
Write-Log "Generating deployment summary..."
$StateFile = "terraform.tfstate"
if (Test-Path $StateFile) {
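# Read the file as a single string so the JSON parses reliably
$State = Get-Content $StateFile -Raw | ConvertFrom-Json
$ResourceCount = @($State.resources).Count
Write-Log "Total resources in state: $ResourceCount"
# List managed VMs only; data sources (e.g. templates) share the same type name
$VMs = $State.resources | Where-Object { $_.mode -eq "managed" -and $_.type -eq "vsphere_virtual_machine" }
if ($VMs) {
Write-Log "Virtual Machines:"
foreach ($VM in $VMs) {
# Resources using count or for_each hold one instance per VM
foreach ($Instance in $VM.instances) {
Write-Log " - $($Instance.attributes.name)"
}
}
}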
}
Write-Log "Operation completed successfully"
Monitoring and Maintenance
State Management and Backup
Proper state management is crucial for production Terraform deployments. Here’s how I handle state backup and recovery:
State Backup Script:
# PowerShell script for Terraform state backup
param(
[string]$BackupPath = "backups",
[int]$RetentionDays = 30
)
$Timestamp = Get-Date -Format "yyyyMMdd-HHmmss"
$StateFile = "terraform.tfstate"
$BackupFile = Join-Path $BackupPath "terraform-state-$Timestamp.tfstate"
# Create backup directory
if (-not (Test-Path $BackupPath)) {
New-Item -ItemType Directory -Path $BackupPath -Force
}
# Backup current state
if (Test-Path $StateFile) {
Copy-Item $StateFile $BackupFile
Write-Host "State backed up to: $BackupFile"
# Compress backup
Compress-Archive -Path $BackupFile -DestinationPath "$BackupFile.zip" -Force
Remove-Item $BackupFile
Write-Host "Backup compressed: $BackupFile.zip"
} else {
Write-Host "No state file found to backup"
}
# Clean old backups
$CutoffDate = (Get-Date).AddDays(-$RetentionDays)
$OldBackups = Get-ChildItem -Path $BackupPath -Filter "terraform-state-*.zip" | Where-Object { $_.LastWriteTime -lt $CutoffDate }
if ($OldBackups) {
Write-Host "Removing $($OldBackups.Count) old backup(s)"
$OldBackups | Remove-Item -Force
}
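The script above covers the backup half; the recovery half could look roughly like the sketch below. It assumes a local state file and the archive naming used by the backup script. `Restore-TerraformState` and the `.pre-restore` suffix are hypothetical names chosen for illustration, not part of Terraform. With a remote backend, `terraform state pull` and `terraform state push` would be the place to start instead.

```powershell
# Hypothetical helper: restores the most recent zipped state backup (local state only)
function Restore-TerraformState {
    param(
        [string]$BackupPath = "backups",
        [string]$StateFile = "terraform.tfstate"
    )

    # Pick the newest archive produced by the backup script
    $Latest = Get-ChildItem -Path $BackupPath -Filter "terraform-state-*.zip" |
        Sort-Object LastWriteTime -Descending |
        Select-Object -First 1
    if (-not $Latest) {
        throw "No state backups found in '$BackupPath'"
    }

    # Extract into a scratch folder so the live state is untouched until the last step
    $Scratch = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
    Expand-Archive -Path $Latest.FullName -DestinationPath $Scratch -Force
    $Restored = Get-ChildItem -Path $Scratch -Filter "*.tfstate" | Select-Object -First 1
    if (-not $Restored) {
        throw "Archive '$($Latest.Name)' does not contain a .tfstate file"
    }

    # Keep a copy of the current state before overwriting it
    if (Test-Path $StateFile) {
        Copy-Item $StateFile "$StateFile.pre-restore" -Force
    }
    Copy-Item $Restored.FullName $StateFile -Force
    Remove-Item $Scratch -Recurse -Force

    # Report which archive was restored
    return $Latest.Name
}
```

Calling `Restore-TerraformState` from the project directory restores the newest archive; running `terraform plan` afterwards is a sensible way to confirm the restored state still matches what is actually deployed.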
Troubleshooting Common Issues
Authentication and Connectivity Problems
Based on my experience, here are the most common issues and their solutions:
Issue 1: SSL Certificate Verification Failures
# Diagnostic script for SSL issues
$vCenterServer = "vcenter.domain.com"
$Port = 443
# Test basic connectivity
$Connection = Test-NetConnection -ComputerName $vCenterServer -Port $Port
if (-not $Connection.TcpTestSucceeded) {
Write-Host "❌ Cannot connect to $vCenterServer on port $Port" -ForegroundColor Red
exit 1
}
# Check SSL certificate
try {
$Request = [System.Net.WebRequest]::Create("https://$vCenterServer")
$Request.Timeout = 10000
$Response = $Request.GetResponse()
$Certificate = $Request.ServicePoint.Certificate
# Release the connection once the certificate has been captured
$Response.Close()
Write-Host "✅ SSL connection successful" -ForegroundColor Green
Write-Host "Certificate Subject: $($Certificate.Subject)"
Write-Host "Certificate Issuer: $($Certificate.Issuer)"
Write-Host "Certificate Expires: $($Certificate.GetExpirationDateString())"
} catch {
Write-Host "❌ SSL certificate validation failed: $($_.Exception.Message)" -ForegroundColor Red
Write-Host " For lab use only, consider allow_unverified_ssl = true in the provider configuration; in production, trust the vCenter CA certificate instead" -ForegroundColor Yellow
}
Issue 2: Insufficient Permissions
# PowerShell script to test vSphere permissions
# This would be run from a machine with PowerCLI installed
Import-Module VMware.PowerCLI
$vCenterServer = "vcenter.domain.com"
$Username = "terraform-svc@domain.com"
try {
# Prompt for the password instead of hard-coding it in the script
$Credential = Get-Credential -UserName $Username -Message "Enter the Terraform service account password"
Connect-VIServer -Server $vCenterServer -Credential $Credential -Force
Write-Host "✅ Authentication successful" -ForegroundColor Green
# Test basic operations
$Datacenters = Get-Datacenter
Write-Host "✅ Can list datacenters: $($Datacenters.Count) found" -ForegroundColor Green
$VMs = Get-VM | Select-Object -First 5
Write-Host "✅ Can list VMs: $($VMs.Count) found" -ForegroundColor Green
# Test VM creation permissions (dry run)
$TestCluster = Get-Cluster | Select-Object -First 1
if ($TestCluster) {
Write-Host "✅ Can access cluster: $($TestCluster.Name)" -ForegroundColor Green
}
} catch {
Write-Host "❌ Permission test failed: $($_.Exception.Message)" -ForegroundColor Red
} finally {
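# Suppress the error that would otherwise occur if the connection was never established
Disconnect-VIServer -Server $vCenterServer -Confirm:$false -ErrorAction SilentlyContinue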
}
Best Practices and Recommendations
Security Considerations
Security should be built into your Terraform workflow from day one. Here are the key practices I follow:
Secrets Management:
- Never commit secrets to version control
- Use environment variables or secret management systems
- Implement proper RBAC for Terraform state access
- Regularly rotate service account credentials
Environment Variables Setup:
# PowerShell script to set environment variables securely
$env:TF_VAR_vsphere_user = "terraform-svc@domain.com"
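# Read the password without echoing it, then pass Terraform the plain-text value it expects
# (ConvertFrom-SecureString would produce an encrypted string that vCenter rejects)
$SecurePassword = Read-Host -Prompt "Enter vSphere password" -AsSecureString
$env:TF_VAR_vsphere_password = [System.Net.NetworkCredential]::new("", $SecurePassword).Password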
$env:TF_VAR_vsphere_server = "vcenter.domain.com"
# For Linux/bash:
# export TF_VAR_vsphere_user="terraform-svc@domain.com"
# read -rsp "Enter vSphere password: " TF_VAR_vsphere_password; export TF_VAR_vsphere_password
# export TF_VAR_vsphere_server="vcenter.domain.com"
Performance Optimization
Large-scale deployments require careful attention to performance:
Parallel Execution:
- Set Terraform’s -parallelism flag deliberately; the default is 10 concurrent operations
- Consider vSphere resource limits when setting parallelism
- Monitor vCenter performance during large deployments
# Terraform commands with parallelism control
terraform plan -parallelism=5
terraform apply -parallelism=5
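To avoid repeating the flag on every run, one option is Terraform's `TF_CLI_ARGS_<command>` environment variables, which append arguments to the named subcommand. The fragment below is a sketch for a single PowerShell session; the value 5 is an assumed starting point rather than a recommendation.

```powershell
# Assumed starting value; tune it against observed vCenter load
$env:TF_CLI_ARGS_plan  = "-parallelism=5"
$env:TF_CLI_ARGS_apply = "-parallelism=5"
```

Any `terraform plan` or `terraform apply` started from the same session then picks up the limit, including runs launched through a wrapper script.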
Conclusion: Building Scalable VM Automation
Terraform with vSphere provides a powerful foundation for infrastructure automation, but success depends on proper planning, modular design, and adherence to best practices. The configuration examples and patterns I’ve shared here represent years of refinement based on real-world enterprise deployments.
Key takeaways for successful implementation:
- Start with proper permissions and security – Get the foundation right before building complexity
- Use modular design patterns – Reusable modules save time and reduce errors
- Implement comprehensive testing – Validate configurations before production deployment
- Plan for scale – Consider performance implications from the beginning
- Maintain proper state management – Backup and version control are essential
With these foundations in place, you’ll be able to build sophisticated VM deployment automation that scales with your organization’s needs while maintaining security and reliability standards.