How to install sFlow Packet Filter virtual switch extension in SCVMM

I recently had a request from our network team to install an sFlow virtual switch extension package into our Hyper-V clusters so that virtual machine traffic can be monitored.  Naturally, the best and most controlled way to accomplish this deployment is by leveraging the logical switch extension capabilities included in SCVMM.  I have documented my experience installing Host sFlow from InMon in this blog post.

  1. Download “sFlowAgentVMM-win-(version)-x64.msi” and “hsflowd-win-(version)-x64.msi” from the InMon Host sFlow site (see References)
  2. Place the hsflowd-win-(version)-x64.msi on the SCVMM Management server(s) at “C:\ProgramData\Switch Extension Drivers”
  3. Install “sFlowAgentVMM-win-(version)-x64.msi” on SCVMM Management server(s)
  4. Restart the System Center Virtual Machine Manager service on the management server(s)
  5. In the SCVMM console, navigate to Fabric and choose Add Resources > Network Service
  6. Name the Network Service (for instance, sFlow Packet Filter) and click Next.  Verify that the Manufacturer and Model are Inmon Corp and sFlowFilter, respectively, and click Next.  Specify credentials via a Run As account and click Next.  On the Connection String screen, enter the IP address or FQDN of the sFlow collector and click Next.  Choose the applicable Host Group(s), click Next, then Finish.

  7. Navigate to Settings > Configuration Providers and verify that the virtual switch extension for sFlow is installed.

    sFlow Verification 1

  8. Navigate to Fabric > Networking > Network Service.  Right click on the sFlow Packet Filter and choose Properties.  On the Extensions screen, click Add Property and add the following properties.
    1. SFLOW_DNSSD = off
      1. This is required; otherwise, the agent defaults to the DNSSD configuration and ignores the remainder of this manual configuration.
    2. SFLOW_COLLECTOR = (IP address or FQDN of the sFlow collector service)
      1. Set this to the hostname or IP address of the sFlow collector to send sFlow to. This can also be a comma separated list of multiple collectors, if required.
    3. SFLOW_POLLING = 20
      1. Set this to the required polling interval in seconds (e.g., 20 for every 20 seconds, the default value).
    4. SFLOW_SAMPLING = 256
      1. Set this to the required sampling rate (e.g., 256 for 1-in-256 sampling, the default rate).
  9. *** NOTE *** – Once you deploy a logical switch with these settings to a host, you cannot add or edit these properties.  Once the settings have been verified on the Extensions screen, click OK and verify successful job completion in the Jobs window.
  10. Navigate to Fabric > Networking > Logical Switches.  Right click on the appropriate logical switch and choose Properties.  On the Extensions screen, select “sFlow agent for Hyper-V virtual switch” and click OK.  Repeat this step if you wish to install the sFlow extension into multiple logical switches.

    sFlow Documentation 7

  11. Refresh the host/cluster in SCVMM
  12. Navigate to Fabric > Networking > Logical Switches.  By selecting the “Hosts” button on the top ribbon bar, the compliance status of each host’s switches can be viewed.  For each logical switch that is in a “Not Compliant” status, right click on the logical switch and choose Remediate.  Note – it appears that you MUST right click on the logical switch to remediate.  You cannot select the host and remediate all non-compliant logical switches at one time, even though the option appears to be there.  Review the successful remediation status in the Jobs pane.  Wait until all remediation jobs have completed before proceeding.
  13. Refresh the host/cluster in SCVMM and wait until the refresh job has completed
  14. Navigate to Fabric > Networking > Logical Switches.  All hosts and logical switches should now be fully compliant.  If there are any non-compliant hosts, verify that you successfully completed step #2 in this guide and that the RunAs account that was utilized has access to that directory/file.
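For repeatable deployments across multiple management servers, steps 2 through 4 above can be scripted.  This is a rough sketch only – the version string and the source share are placeholders I have invented for illustration, not values from this guide; substitute your own.

```powershell
# Steps 2-4 as a script, run on each SCVMM management server.
$version = "X.Y.Z"                         # placeholder - use the version you downloaded
$source  = "\\fileserver\installs\sflow"   # hypothetical staging share

# Step 2: stage the switch extension driver where SCVMM expects to find it
Copy-Item "$source\hsflowd-win-$version-x64.msi" `
          -Destination "C:\ProgramData\Switch Extension Drivers"

# Step 3: install the VMM configuration provider silently
Start-Process msiexec.exe -Wait `
    -ArgumentList "/i `"$source\sFlowAgentVMM-win-$version-x64.msi`" /qn"

# Step 4: restart the VMM service so it loads the new configuration provider
Restart-Service SCVMMService
```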

Reference

http://www.sflow.net/host-sflow-scvmm.php

http://www.danielstechblog.de/inmon-sflow-fur-hyper-v/

http://blogs.technet.com/b/scvmm/archive/2013/02/04/fixing-non-compliant-virtual-switches-in-system-center-2012-virtual-machine-manager.aspx


SCVMM shows no available memory for a Hyper-V host

Background

System Center Virtual Machine Manager 2012 R2 shows no available memory for a Hyper-V host.  This can be observed as a “0 KB” Available Memory count in many areas in SCVMM.

  • Fabric > Servers > All Hosts > HostGroup > Server/Cluster > Fabric Resources screen.  Shows “0 KB” in the Available Memory column.
  • VMs and Services > All Hosts > HostGroup > Server/Cluster > Overview screen.  Hovering over the “Memory (GB)” section displays 0 for Available on Host.

Host Memory Status1

  • (Get-SCVMHost -Computername server01).AvailableMemory
    • Result is 0

Troubleshooting

My initial assumption was that there was most likely a WMI malfunction somewhere on the Hyper-V host.  Some brief research led me to this article.

On the afflicted host, I ran lodctr /R followed by winmgmt /resyncperf from C:\Windows\SysWOW64, which rebuilds the performance counters and re-registers them with WMI, and then I restarted the WMI service.  This didn’t improve the situation, and in fact may have indirectly impeded solving this problem more quickly.
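For reference, the counter rebuild sequence looks like this when run from an elevated PowerShell prompt on the host (standard Windows tooling; this did not fix my particular issue, but it is a common first step):

```powershell
# Rebuild the performance counter registry and re-register it with WMI.
Set-Location C:\Windows\SysWOW64
lodctr /R            # rebuild counter settings from the last-known-good backup
winmgmt /resyncperf  # re-register the rebuilt counters with WMI
Restart-Service Winmgmt -Force   # -Force also stops dependent services; start them again if needed
```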

The next suggestion was to reinstall the SCVMM agent.  I have covered this sub-topic in this post.

After reinstalling the SCVMM agent did not resolve the issue, I started looking in PerfMon – specifically at \\computername – Memory – Available MBytes.  I was able to confirm that performance counters were indeed tracking available memory in MBytes, so the break had to be somewhere in the path from the performance counters into SCVMM.

Based on some recent (unrelated) experience where SQL Configuration Manager refused to work properly and would only generate a WMI error (covered here), I decided to check for any SCVMM-related MOFs that may need to be re-registered.  Hiding in “%systemdrive%\Program Files\Microsoft System Center 2012 R2\Virtual Machine Manager\setup”, I was able to find five separate .mof files.  These files are lanSanDeployment.mof, NPIV.mof, scvmmswitchportsettings.mof, VMMAgent.mof, and VMMVirtualization.mof.  I *believe* that the only .mof that we are interested in for this particular problem is VMMVirtualization.mof, but just for good measure, I recompiled all of the MOFs located in this same directory.

Note – Recompiling a few of these will produce a warning like

WARNING: File C:\Program Files\Microsoft System Center 2012\Virtual Machine Manager\setup\NPIV.mof does not contain #PRAGMA AUTORECOVER.  If the WMI repository is rebuilt in the future, the contents of this MOF file will not be included in the new WMI repository.  To include this MOF file when the WMI Repository is automatically reconstructed, place the #PRAGMA AUTORECOVER statement on the first line of the MOF file.

Solution

Run Notepad as Administrator.  Open each .mof file under “%systemdrive%\Program Files\Microsoft System Center 2012 R2\Virtual Machine Manager\setup” and add #PRAGMA AUTORECOVER as the first line after the comment block (delimited by /*++ and --*/).  Save each modified file.  On the Hyper-V host, run a command prompt as Administrator, change directory to “%systemdrive%\Program Files\Microsoft System Center 2012 R2\Virtual Machine Manager\setup”, and execute

mofcomp (filename).mof
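Rather than running the command once per file, the whole directory can be recompiled in one pass from an elevated PowerShell prompt (path assumes a default SCVMM 2012 R2 installation):

```powershell
# Recompile every SCVMM MOF in the setup directory in one pass.
$setup = "$env:SystemDrive\Program Files\Microsoft System Center 2012 R2\Virtual Machine Manager\setup"
Get-ChildItem "$setup\*.mof" | ForEach-Object {
    mofcomp $_.FullName   # registers the MOF's classes with the WMI repository
}
```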

In my case, I did not have to do anything else.  The SCVMM console and related PoSH cmdlets immediately reflected correct Available Memory for the host.  If your SCVMM infrastructure does not immediately reflect correct information, I would suggest restarting WinRM on the Hyper-V host.

Solution Update

I had this issue occur again in a different environment.  This time I was able to narrow the fix down to recompiling VMMAgent.mof and VMMVirtualization.mof and restarting the Windows Management Instrumentation service (and dependent services) on the Hyper-V host.
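The narrowed-down fix can be expressed as a short script, again assuming the default SCVMM 2012 R2 setup path:

```powershell
# Targeted fix: recompile only the two MOFs that mattered in the second
# occurrence, then restart WMI.
$setup = "$env:SystemDrive\Program Files\Microsoft System Center 2012 R2\Virtual Machine Manager\setup"
mofcomp "$setup\VMMAgent.mof"
mofcomp "$setup\VMMVirtualization.mof"
Restart-Service Winmgmt -Force   # -Force also stops dependent services; restart those afterwards
```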

References

http://www.miru.ch/unveiling-cluster-overcommit-in-scvmm-2012-hyper-v/

Manually reinstall SCVMM Agent (2012 R2) on Hyper-V cluster node

Background

I was recently exposed to a troubleshooting scenario where it was necessary to manually uninstall and reinstall the SCVMM agent from a node in a Hyper-V cluster.  Since the node is clustered and managed by SCVMM, there is no option presented in the SCVMM console to be able to gracefully uninstall and reinstall the SCVMM agent.  Therefore, manual removal and re-installation is necessary.

Uninstall

Removal of the SCVMM agent is straightforward provided that you are not running Server Core on the parent partition.  Just use the Uninstall option in Add/Remove programs.  Remotely, or locally while running Server Core, you could use something like

(Get-WmiObject -Class Win32_Product -Filter "Name='Microsoft System Center Virtual Machine Manager Agent (x64)'").Uninstall()
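One caveat: any query against Win32_Product triggers a consistency check (and potential repair) of every MSI package on the machine, which is slow.  A gentler alternative, assuming the agent’s entry is present under the standard Uninstall registry key, is to look up the product code and hand it to msiexec:

```powershell
# Find the agent's MSI product code in the registry and uninstall it
# silently, avoiding the expensive Win32_Product provider.
$app = Get-ChildItem "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall" |
    Get-ItemProperty |
    Where-Object { $_.DisplayName -like "*Virtual Machine Manager Agent*" }

# PSChildName for MSI-installed products is the product code GUID
msiexec /x $app.PSChildName /qn
```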

Reinstall

Conventional instruction would have you locate the agent installation via the original System Center Virtual Machine Manager installation media.  If you have not installed any Update Rollups in your SCVMM environment, follow this link for guidance.  If you have installed Update Rollups in the SCVMM infrastructure, you will need to install the latest version of the agent, which can be located on your SCVMM Management role server(s) under “%systemdrive%\Program Files\Microsoft System Center 2012 R2\Virtual Machine Manager\agents”.  From here, select the server architecture and then select the latest agent version directory.  If you simply run vmmagent.msi, you will get an error stating that the Microsoft System Center Virtual Machine Manager Agent (x64) Setup Wizard ended prematurely.

SCVMM Agent Installation Error

Instead, copy the latest agent version directory from the remote location to a local directory.  Run a command prompt as Administrator, and change directory to the latest agent version directory that you have copied locally.  Execute

msiexec /I vmmAgent.msi

The SCVMM Agent should reinstall successfully.  In SCVMM, refresh the Hyper-V cluster and verify that there are no outstanding WMI issues on the (Host) > Properties > Status screen.
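If the installation still ends prematurely, msiexec’s standard verbose-logging switch usually reveals the failing action (the log path below is arbitrary; the folder must already exist):

```powershell
# Reinstall with a verbose MSI log for troubleshooting.
msiexec /i vmmAgent.msi /l*v C:\Temp\vmmAgent-install.log
```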

Reference

http://www.ivobeerens.nl/2015/01/14/scvmm-2012-r2-agent-update-error/

https://danielthomasclarke.wordpress.com/2014/09/09/vmm-2012-r2-uninstall-vmm-agent-from-windows-server-2012-r2-core/

NUMA is disabled

Background

While troubleshooting a completely unrelated issue recently, I discovered that one of our Hyper-V hosts was not like the others.  There were actually two problems with this particular host, unrelated in the end, but possibly related at first glance.

  1. Even though the server in question has two sockets, Windows is only reporting one NUMA node.
  2. SCVMM is incorrectly reporting that there is no available memory on the host

In scope for this post is problem number 1 – NUMA being reported incorrectly to the operating system.

Manifestation of this issue can be observed via two primary methods in Windows Server 2012 R2, as referenced in the (example, not factual) screenshot below.

  • Task Manager > Performance tab > CPU > “Change Graph To” displays NUMA nodes as a grayed-out, unselectable option
  • Get-VMHostNumaNode shows only one NUMA node (0)

Numa Node Problem Manifestation
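The symptom can also be spotted from PowerShell: on a healthy two-socket host, the number of NUMA nodes visible to the parent partition should normally match the socket count (assuming no other BIOS options that split or merge nodes).  A quick sketch of that check:

```powershell
# Compare socket count to the NUMA nodes the parent partition can see.
$sockets = (Get-CimInstance Win32_ComputerSystem).NumberOfProcessors
$numa    = (Get-VMHostNumaNode | Measure-Object).Count

if ($numa -lt $sockets) {
    Write-Warning "Only $numa NUMA node(s) visible on a $sockets-socket host - check BIOS Node Interleaving."
}
```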

Troubleshooting

As I was fairly certain that this was a BIOS or firmware problem and not an operating system problem, my first step was to run the vendor Update Manager software to verify that I was running the latest BIOS and chipset driver versions.  After confirmation, I started to dig around in the BIOS and came across a curious setting buried several layers deep called “Node Interleaving”, which had been set to Enabled.  I recommend checking with your vendor or manual for the exact location of this setting in the BIOS.  In my case, the path was –

System Utilities > System Configuration > BIOS/Platform Configuration (RBSU) > Performance Options > Advanced Performance Tuning Options > Node Interleaving

Extensive documentation exists about this setting, but if you don’t know what you’re looking for in the first place, you wouldn’t know where to start.  Bottom line – Setting “Node Interleaving” to Enabled means you are, in effect, disabling NUMA; while setting “Node Interleaving” to Disabled effectively enables NUMA.

Solution

Set Node Interleaving to Disabled, reboot the host, and verify via Task Manager and Get-VMHostNumaNode that the host parent partition sees the correct number of NUMA nodes.

Lessons Learned

Never accept a BIOS setting at face value.
Never assume that enabling a feature equates to a performance enhancement.
Always standardize and document BIOS settings in your hypervisor build guide, and never put a server that is not fresh out of the box into production without first combing through all applicable settings.