I ran into an interesting issue while rebuilding a two-node Hyper-V cluster in my lab the other day. It caught me a little off guard because I hadn't run into that specific issue before. Considering that I work as a consultant and deploy a new MS Private Cloud environment every couple of weeks, that's fairly significant, although I don't always deploy using a converged network. I wish I had thought to capture some screenshots to document the issue.
I had done a clean install of Windows Server 2012 Datacenter and patched it up. For this particular exercise I was following the steps outlined in "Windows Server 2012 Hyper-V Installation and Configuration Guide" for creating a converged network. I stepped through the instructions and everything worked perfectly.
I had previously created the NIC team on both servers. Following the book's instructions, I created a virtual switch, a management vNIC, a Cluster Shared Volumes (CSV) vNIC and a Live Migration vNIC. I finished up by setting reasonable QoS settings.
Next I set IP addresses on all of the newly created vNICs on both hosts. I also added my servers to my lab domain. Everything seemed to be working great, until I tested communication on my CSV and Live Migration networks. Nothing. The hosts couldn't even ping each other on those networks.
I went through everything. I checked the switch configuration to make sure my LACP teams were correct. I made sure the trunks had the correct VLANs assigned. I stepped through the VLAN assignments and QoS settings for each vNIC that I created. Everything was correct.
In a final attempt to figure things out, I ran the cluster validation wizard. The following lines in the validation report finally led me to the answer:
Found duplicate physical address 00-15-5D-0A-B4-00 on node server1 adapter vEthernet (CSV) and node server2 adapter vEthernet (CSV).
Found duplicate physical address 00-15-5D-0A-B4-01 on node server1 adapter vEthernet (Live Migration) and node server2 adapter vEthernet (Live Migration).
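If you have a long validation report to dig through, lines like these are easy to pick out with a script. The following is only a rough sketch: it assumes the report has been saved as plain text and that the duplicate-address lines look exactly like the two quoted above. The function name find_duplicates is my own and is not part of any Microsoft tool.

```python
import re
from collections import defaultdict

# Matches lines shaped like the two report lines quoted above.
# Assumption: node names contain no spaces; adapter names may.
DUPLICATE_LINE = re.compile(
    r"Found duplicate physical address "
    r"(?P<mac>(?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2}) "
    r"on node (?P<node_a>\S+) adapter (?P<adapter_a>.+?) "
    r"and node (?P<node_b>\S+) adapter (?P<adapter_b>.+?)\.?$"
)


def find_duplicates(report_text):
    """Map each duplicated MAC to the (node, adapter) pairs that share it."""
    duplicates = defaultdict(list)
    for line in report_text.splitlines():
        match = DUPLICATE_LINE.search(line.strip())
        if match:
            duplicates[match["mac"].upper()].extend([
                (match["node_a"], match["adapter_a"]),
                (match["node_b"], match["adapter_b"]),
            ])
    return dict(duplicates)


sample = (
    "Found duplicate physical address 00-15-5D-0A-B4-00 on node server1 "
    "adapter vEthernet (CSV) and node server2 adapter vEthernet (CSV).\n"
    "Found duplicate physical address 00-15-5D-0A-B4-01 on node server1 "
    "adapter vEthernet (Live Migration) and node server2 adapter "
    "vEthernet (Live Migration).\n"
)

for mac, owners in find_duplicates(sample).items():
    print(mac, owners)
```

Anything this prints is a MAC address that more than one node is using, which is exactly what was breaking my CSV and Live Migration networks.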
It turns out that both hosts had the same default MAC address range for their virtual switches. Since the host vNICs were attached to the virtual switch on each host, they received the first couple of MAC addresses from that range, which were identical on both hosts. I have no idea why the Management LAN vNICs were okay.
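To illustrate why identical ranges lead to identical addresses, here is a small simulation. It is a sketch under assumptions, not a description of how Hyper-V really allocates addresses: I am assuming addresses are handed out in order starting from the bottom of the range, and that the range on both hosts started at 00-15-5D-0A-B4-00 (inferred from the report, not read from the host settings). The helper names are made up for this example.

```python
def mac_to_int(mac):
    """Convert a dash-separated MAC such as 00-15-5D-0A-B4-00 to an integer."""
    return int(mac.replace("-", ""), 16)


def int_to_mac(value):
    """Convert an integer back to an upper-case, dash-separated MAC."""
    raw = f"{value:012X}"
    return "-".join(raw[i:i + 2] for i in range(0, 12, 2))


def assign_sequentially(range_minimum, adapters):
    """Assumed behaviour: give each adapter the next address in the range."""
    start = mac_to_int(range_minimum)
    return {name: int_to_mac(start + offset)
            for offset, name in enumerate(adapters)}


# Assumption: both hosts ended up with the same range minimum.
ASSUMED_RANGE_MINIMUM = "00-15-5D-0A-B4-00"
adapters = ["vEthernet (CSV)", "vEthernet (Live Migration)"]

server1 = assign_sequentially(ASSUMED_RANGE_MINIMUM, adapters)
server2 = assign_sequentially(ASSUMED_RANGE_MINIMUM, adapters)

# Adapters that received the same MAC on both hosts.
collisions = {name: mac for name, mac in server1.items()
              if server2[name] == mac}
print(collisions)
```

Under those assumptions, both adapters collide, with the same two addresses that appear in the validation report.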
I resolved the issue by taking the following steps:
- Removed the virtual switch from one of the hosts.
- Modified the MAC address range in Hyper-V for that host.
- Recreated the virtual switch.
- Recreated all of the management OS vNICs.
- Set the QoS settings.
- Applied the IP addresses.
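When changing the MAC address range in the steps above, the important part is that the new range does not overlap the range on the other host. A quick way to sanity-check a pair of ranges is sketched below. The range values are hypothetical examples chosen for illustration; they are not the exact values from my lab, and the function names are my own.

```python
def mac_to_int(mac):
    """Convert a MAC written with dashes or colons to an integer."""
    return int(mac.replace("-", "").replace(":", ""), 16)


def ranges_overlap(range_a, range_b):
    """Return True if two (minimum, maximum) MAC ranges share any address."""
    a_min, a_max = (mac_to_int(m) for m in range_a)
    b_min, b_max = (mac_to_int(m) for m in range_b)
    return a_min <= b_max and b_min <= a_max


# Hypothetical ranges: the untouched host and the reconfigured host.
untouched_host = ("00-15-5D-0A-B4-00", "00-15-5D-0A-B4-FF")
reconfigured_host = ("00-15-5D-0A-B5-00", "00-15-5D-0A-B5-FF")

print(ranges_overlap(untouched_host, untouched_host))     # same range on both hosts
print(ranges_overlap(untouched_host, reconfigured_host))  # after the change
```

If the second check reports an overlap, pick a different range before recreating the virtual switch and vNICs.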
After making those changes, I was able to communicate between my Hyper-V hosts on all of my networks, and the Failover Cluster Validation Wizard completed successfully.