Background:
Some naive systems engineer (yours truly) provisioned a boatload of Hyper-V servers and added them to a cluster, unfortunately without adequate knowledge transfer from the valued SysAdmins who have since departed. All functionality seems to work fine, except for an anomaly in which client DNS records appear to be dropped from Active Directory at random. This causes problems for users, as Active Directory requires each machine to have a resolvable A record to remain joined to the domain. Moreover, certain services, such as the DHCP servers, sometimes cannot reach their replication partners. Worse yet, since DHCP serves multiple VLANs, many users are affected whenever those services go offline. An investigation has been called to address this problem.
Issues:
Certain Windows virtual machine clients on Hyper-V cannot reach Active Directory, and their DNS records are purged by the AD-integrated DNS. Moreover, the DHCP servers are placed in the same VLAN as the AD/DNS servers, and at times those services are unreachable as well.
Immediate Causes:
– Virtual machines cannot reach DNS servers that are on the same VLAN
– DNS registrations are 'dynamic'; hence, DNS records get scavenged after a certain period of time if DNS clients do not contact the DNS server to refresh them
– The DHCP server settings 'Discard A and PTR records when lease is deleted' and 'Dynamically update DNS records for DHCP clients that do not request updates (for example, clients running Windows NT 4.0)' cause client DNS records to be purged
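– The current aging/scavenging and DHCP dynamic-update settings can be checked before anything is changed. A read-only sketch, assuming the DnsServer and DhcpServer RSAT modules are available and using 'corp.example.com' as a placeholder zone name:
    # Sketch only: zone name is a placeholder; run on (or against) the DNS/DHCP servers
    Import-Module DnsServer, DhcpServer
    # Server-wide scavenging interval and per-zone aging (no-refresh/refresh windows)
    Get-DnsServerScavenging
    Get-DnsServerZoneAging -Name 'corp.example.com'
    # DHCP-driven dynamic DNS behaviour, including the two options quoted above
    Get-DhcpServerv4DnsSetting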
Root Cause:
– Because the DNS virtual machine resides in a VLAN that is outside the scope of the main switch's subnet or supernet, its MAC address is not advertised within the VM's VLAN. Hyper-V does, however, keep the DNS server in its MAC table, so it can route traffic from other VLANs toward the DNS server, yet it cannot switch/bridge layer-2 frames between VMs in the same VLAN as the DNS server.
– Hyper-V ships with a built-in protocol suite named NetworkVirtualization that is adequate for most purposes; however, it does not bridge out-of-scope VLANs automatically
– The Docker-for-Windows suite named Containers includes an important networking component, the "Bridge Driver", which can bridge all VLANs within the virtual switches
– NetworkVirtualization (a component of Hyper-V) cannot be installed alongside l2bridge.sys (a component of Containers)
– Thus, NetworkVirtualization must be removed and the Containers (Docker) component installed alongside Hyper-V to enable intra-VLAN bridging
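– The install state of the features mentioned above can be confirmed up front. A quick check using the ServerManager module, with the feature names as used in the steps below:
    # Which of the relevant roles/features are currently installed on this node?
    Get-WindowsFeature -Name Hyper-V, NetworkVirtualization, Containers |
        Format-Table Name, InstallState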
Summary of proposed changes:
- Record IP config prior to changes:
IPv4 Address. . . . . . . . . . . : 172.16.20.98(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 172.16.20.1
DNS Servers . . . . . . . . . . . : 172.16.10.30
                                    172.16.10.31
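- Optionally capture the pre-change configuration to a file as well (a sketch; the output path is arbitrary):
    # Save the current IP configuration before any changes are made
    ipconfig /all | Out-File -FilePath C:\Temp\ipconfig-before.txt
    Get-NetIPConfiguration -Detailed | Out-File -FilePath C:\Temp\ipconfig-before.txt -Append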
- Create a local account: New-LocalUser 'backupadmin' -Password $(Read-Host -AsSecureString)
- Add account into Administrators group: Add-LocalGroupMember -Group "Administrators" -Member 'backupadmin'
- Include FailoverClusters module into PowerShell: Import-Module FailoverClusters
- Put node in maintenance mode (cordon): Suspend-ClusterNode -Name $env:computername
- Stop clustering services on this node: Stop-ClusterNode -Name $env:computername
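- Verify the node is drained before touching the vSwitch (run from another cluster node, since the local cluster service is now stopped):
    # The suspended node should show as Paused, then Down after Stop-ClusterNode
    Get-ClusterNode | Format-Table Name, State
    # No cluster roles should still list the node being serviced as their owner
    Get-ClusterGroup | Format-Table Name, State, OwnerNode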
- Remove vSwitch: Remove-VMSwitch -Name 'External-Connection' -Force
- Remove the conflicting feature: Remove-WindowsFeature NetworkVirtualization -Force
- Install 'Bridge Drivers' software suite: Install-WindowsFeature -Name Containers
- Reboot node: Restart-Computer -Force
- Run ncpa.cpl to verify that the vEthernet adapter is purged (a PowerShell alternative follows this step)
- If Not purged:
- Run netcfg -d to clean up the network configuration
- Also open devmgmt.msc and remove the leftover virtual Ethernet adapter
- Remove Hyper-V to force purging of vSwitches: Remove-WindowsFeature Hyper-V,NetworkVirtualization -Force -Restart
- Reinstall Hyper-V: Install-WindowsFeature Hyper-V,Containers -Restart
- If Purged: go to the next step
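- The purge check can also be scripted instead of using ncpa.cpl (a sketch; adapter names follow the usual vEthernet convention):
    # Any leftover Hyper-V virtual adapters in the management OS will show up here;
    # an empty result means the vEthernet adapters were purged
    Get-NetAdapter -Name 'vEthernet*' -ErrorAction SilentlyContinue |
        Format-Table Name, InterfaceDescription, Status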
- Install virtual network adapter and switch:
- Install vSwitch: New-VMSwitch -Name 'Trunk' -NetAdapterName "TEAM1" -AllowManagementOS $true
- Add Management VLAN: Get-VMNetworkAdapter -SwitchName 'Trunk' -ManagementOS | Set-VMNetworkAdapterVlan -Access -VlanId 101
# Script
$switchName  = 'Trunk'
$adapterName = 'NIC1'   # change this value to reflect the correct interface
$vlanId      = 100      # change this to the correct VLAN

function addVirtualSwitch($switchName, $adapterName, $vlanId) {
    # Create the external vSwitch and share the physical adapter with the management OS
    New-VMSwitch -Name $switchName -NetAdapterName $adapterName -AllowManagementOS $true
    # Enable the Windows Filtering Platform extension on the new switch
    Enable-VMSwitchExtension -VMSwitchName $switchName -Name "Microsoft Windows Filtering Platform"
    # Tag the management OS vNIC with the management VLAN, if one was supplied
    if ($vlanId) {
        Get-VMNetworkAdapter -SwitchName $switchName -ManagementOS |
            Set-VMNetworkAdapterVlan -Access -VlanId $vlanId
    }
}

addVirtualSwitch $switchName $adapterName $vlanId
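- Verify the new switch and the management VLAN tag (read-only check):
    # Confirm the vSwitch exists and the management OS vNIC carries the expected VLAN tag
    Get-VMSwitch -Name 'Trunk' | Format-Table Name, SwitchType, NetAdapterInterfaceDescription
    Get-VMNetworkAdapterVlan -ManagementOS | Format-Table ParentAdapter, OperationMode, AccessVlanId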
- Configure vEthernet adapter (if necessary)
- Enable these protocols (bindings); a PowerShell sketch follows this list:
Client for Microsoft Networks
File and Printer Sharing for Microsoft Networks
QoS Packet Scheduler
Bridge Driver
Internet Protocol Version 4 (TCP/IPv4)
Microsoft LLDP Protocol Driver
Internet Protocol Version 6 (TCP/IPv6)
Link-Layer Topology Discovery Responder
Link-Layer Topology Discovery Mapper I/O Driver
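- The bindings above can also be reviewed and toggled from PowerShell (a sketch; the adapter alias 'vEthernet (Trunk)' and the component IDs shown are assumptions based on the switch name used earlier):
    # List every binding on the management vNIC and its current state
    Get-NetAdapterBinding -Name 'vEthernet (Trunk)' | Format-Table DisplayName, ComponentID, Enabled
    # Example: enable the IPv4 and Client for Microsoft Networks bindings if they are off
    Enable-NetAdapterBinding -Name 'vEthernet (Trunk)' -ComponentID ms_tcpip, ms_msclient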
- Set static IP
- Set default gateway
- Set DNS1/DNS2
- Register with DNS Server: ipconfig /registerdns
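- Equivalent PowerShell for the IP/DNS steps above, reusing the values recorded at the start (a sketch; the interface alias 'vEthernet (Trunk)' is an assumption based on the switch name):
    $alias = 'vEthernet (Trunk)'   # adjust to the actual vEthernet adapter alias
    # Static IP, /24 mask and default gateway as recorded before the change
    New-NetIPAddress -InterfaceAlias $alias -IPAddress 172.16.20.98 -PrefixLength 24 -DefaultGateway 172.16.20.1
    # Primary and secondary DNS servers
    Set-DnsClientServerAddress -InterfaceAlias $alias -ServerAddresses 172.16.10.30, 172.16.10.31
    # Equivalent of 'ipconfig /registerdns'
    Register-DnsClient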
- Ensure that all auto-start services are running
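- A quick check for automatic services that did not come back after the reboot (requires PowerShell 5.0+ for the StartType property):
    # Automatic services that are not currently running
    Get-Service | Where-Object { $_.StartType -eq 'Automatic' -and $_.Status -ne 'Running' } |
        Format-Table Name, DisplayName, Status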
- Start clustering services on this node: Start-ClusterNode -Name $env:computername
- Wait a few minutes for the node to rejoin the cluster
- Resume the suspended node and fail its roles back: Resume-ClusterNode -Name $env:computername -Failback Immediate
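- Final verification that the node is Up and that roles have failed back (read-only check):
    # All nodes should be Up and roles back on their preferred owners
    Get-ClusterNode | Format-Table Name, State
    Get-ClusterGroup | Format-Table Name, State, OwnerNode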