Archive for the ‘Virtualization’ Category

vSphere snapshot hang – How to (force) kill a VM process

April 16th, 2015

I had a frozen VM because a snapshot hung at 95%. I could not do anything to the VM because its process was locked to the host. I couldn’t stop it, and I couldn’t cancel the task either. To release the lock and force kill the VM, I had to do the following:
- Restart the management agents
- Force stop the VM
- Consolidate the snapshot (if necessary)
- Restart the VM

1. Restart the management agents
These 2 commands should do

2. Force stop the VM
a. The VMware way

World ID: 46664
Process ID: 0
VMX Cartel ID: 46640
UUID: 42 24 e8 f3 28 35 e1 77-dd 56 40 46 d2 a4 16 43
Display Name: plsw-ts2012-fe1
Config File: /vmfs/volumes/5156099e-0e41f131c77b4/VM-NAME/VM-NAME.vmx
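
Picking the World ID out of a listing like the one above can be scripted. The following is a minimal Python sketch, assuming the listing is plain "Key: value" text with a blank line between VMs; the function names parse_vm_listing and world_id_for are made up for illustration and are not part of any VMware tool.

```python
def parse_vm_listing(text):
    """Turn 'Key: value' lines into one dict per VM (blank line = next VM)."""
    vms, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if current:
                vms.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that are not 'Key: value'
            current[key.strip()] = value.strip()
    if current:
        vms.append(current)
    return vms


def world_id_for(text, display_name):
    """Return the World ID of the VM with the given display name, or None."""
    for vm in parse_vm_listing(text):
        if vm.get("Display Name") == display_name:
            return int(vm["World ID"])
    return None


# Sample copied from the listing in this post.
SAMPLE = """\
World ID: 46664
Process ID: 0
VMX Cartel ID: 46640
UUID: 42 24 e8 f3 28 35 e1 77-dd 56 40 46 d2 a4 16 43
Display Name: plsw-ts2012-fe1
Config File: /vmfs/volumes/5156099e-0e41f131c77b4/VM-NAME/VM-NAME.vmx
"""

print(world_id_for(SAMPLE, "plsw-ts2012-fe1"))  # 46664
```

Matching on the display name rather than on position keeps the lookup safe when the host runs more than one VM.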

Then collect the “World ID” and run either of these commands

At this point the VM should be stopped and the lock released. You might need to remove and re-add it. If the VM is still locked, we will need to force stop it the Linux way

~ # ps | grep 52173320
52173320 52173320 vmx /bin/vmx
52173323 52173320 vmx-vthread-4:VM-NAME /bin/vmx
52174736 52173320 vmx-vthread-5:VM-NAME /bin/vmx
52174737 52173320 vmx-mks:VM-NAME /bin/vmx
52174738 52173320 vmx-svga:VM-NAME /bin/vmx
52174741 52173320 vmx-vcpu-0:VM-NAME /bin/vmx

The second column is the master process ID. Run this command to kill it
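
Reading the second column can also be done programmatically. This is a small Python sketch, assuming the ps output has the same shape as the listing above (thread ID first, master process ID second); master_pid is a hypothetical helper name.

```python
from collections import Counter


def master_pid(ps_output):
    """Return the ID that appears most often in the second column, or None."""
    parents = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # Only count rows whose first two fields are numeric IDs.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            parents[fields[1]] += 1
    if not parents:
        return None
    return int(parents.most_common(1)[0][0])


# Sample copied from the ps listing in this post.
PS_SAMPLE = """\
52173320 52173320 vmx /bin/vmx
52173323 52173320 vmx-vthread-4:VM-NAME /bin/vmx
52174736 52173320 vmx-vthread-5:VM-NAME /bin/vmx
52174737 52173320 vmx-mks:VM-NAME /bin/vmx
52174738 52173320 vmx-svga:VM-NAME /bin/vmx
52174741 52173320 vmx-vcpu-0:VM-NAME /bin/vmx
"""

print(master_pid(PS_SAMPLE))  # 52173320
```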

VMware KB 1004340 describes some more methods, but these two are usually good enough.

Packet capture with Nexus 1000V

January 20th, 2015

Today I thought I’d take a look at creating a SPAN session on the 1000v to monitor traffic. I found it really easy to do! SPAN is one of those things that takes longer to read about and understand than to actually configure. I find that true of a lot of Cisco products: FabricPath, OTV, LISP, etc.

SPAN is “Switched Port Analyzer”. It’s basically port monitoring: you capture the traffic going through one port and mirror it to another. This is one of the benefits you get out of the box with the 1000v, because the VMs are no longer a big black box to the network administrator.

First I need to see which vethernet is assigned to which VM. This command can help you do that


Then create a monitor session with the following commands

And confirm the monitor session with the command


In this case, we have an error: the state is “Down”. That is because VMTEST1 and VMTEST2 are on two different VM hosts. After moving them to the same host, the state will change to “Up”.
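
The same-host condition can be checked before creating the session. Here is a rough Python sketch of that check, assuming you already have a mapping from each vethernet port to the host it currently lives on; the port names and host names below are invented for illustration, and the check covers only the host mismatch described above, not other reasons a session might stay down.

```python
def session_can_come_up(placement, sources, destination):
    """True when every source port and the destination port share one host."""
    hosts = {placement[port] for port in sources}
    hosts.add(placement[destination])
    return len(hosts) == 1


# Hypothetical placement: which host each vethernet port is on.
placement = {
    "Vethernet1": "esx-host-a",  # e.g. the port of VMTEST1
    "Vethernet2": "esx-host-b",  # e.g. the port of VMTEST2
}

print(session_can_come_up(placement, ["Vethernet1"], "Vethernet2"))  # False

# After moving the second VM to the same host:
placement["Vethernet2"] = "esx-host-a"
print(session_can_come_up(placement, ["Vethernet1"], "Vethernet2"))  # True
```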

Private VLAN, Nexus 1000v and UCS Configuration

September 23rd, 2014

Before we start, here are a few things to remember:

  • Only isolated ports are supported in UCS. With the N1K incorporated, you can use community VLANs, but the promiscuous port must be on the N1K as well.
  • A server virtual Network Interface Controller (vNIC) in UCS cannot carry both a regular and an isolated VLAN.
  • There is no support for promiscuous ports/trunks, community ports/trunks, or isolated trunks.
  • Promiscuous ports need to be outside the UCS domain, such as an upstream switch/router or a downstream N1K.

Now consider this scenario:

The 4900 switch is a pVLAN-aware switch. It has isolated ports on VLAN 210 and promiscuous ports on VLAN 200.
The Nexus 5K represents a network or a bunch of switches that are not pVLAN aware.
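
As a reminder of what the port types in this scenario mean, here is a small Python model of the usual private VLAN forwarding rules: a promiscuous port can talk to everything, community ports talk only within their own community, and isolated ports talk only to promiscuous ports. It is an illustrative sketch, not switch configuration; the can_talk function and the tuple layout are made up for this example.

```python
def can_talk(port_a, port_b):
    """Each port is (kind, community); community is None unless kind == 'community'."""
    kind_a, community_a = port_a
    kind_b, community_b = port_b
    if "promiscuous" in (kind_a, kind_b):
        return True  # promiscuous ports reach every port in the pVLAN
    if kind_a == kind_b == "community":
        return community_a == community_b  # same community only
    return False  # isolated ports reach nothing but promiscuous ports


isolated_vm_1 = ("isolated", None)   # e.g. a VM on isolated VLAN 210
isolated_vm_2 = ("isolated", None)
gateway = ("promiscuous", None)      # e.g. a promiscuous port on VLAN 200

print(can_talk(isolated_vm_1, gateway))        # True
print(can_talk(isolated_vm_1, isolated_vm_2))  # False
```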

First, we need to make the UCS aware of the pVLAN structure. After defining the VLANs, we need to change their properties.

Next, you have to dedicate a vNIC to carry the pVLAN traffic in VMware. Because of the UCS limitations, each vNIC can carry only one pVLAN. In this case we add the isolated VLAN only, and it is not a native VLAN.

Next, add 2 new VLANs to the Nexus 1000v switch, and define the private VLAN properties.

Then finally, we just have to add the vmnic to the pVLAN_uplinks port profile.

For more information on Private VLAN and Cisco UCS integration, please refer to Cisco Document ID 116310.

Script to remove VMs that have been off more than 30 days

August 14th, 2014

My users developed a habit of keeping their unused VMs for too long, and it was slowly eating up our storage. I needed a way to enforce our 30-day retention policy. After a bit of searching I ended up with this script. Oh, and it sends emails too.
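
The core of such a retention check is a date comparison. The Python sketch below shows only that step and is not the script mentioned above; it assumes you already have, for each powered-off VM, the time it was switched off (how to obtain that from vCenter is left out), and the VM names and dates are invented.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # the 30-day policy from this post


def expired_vms(powered_off_at, now, retention=RETENTION):
    """Return the names of VMs that have been off longer than the retention period."""
    return sorted(
        name for name, off_time in powered_off_at.items()
        if now - off_time > retention
    )


# Hypothetical inventory: VM name -> time it was powered off.
inventory = {
    "dev-old": datetime(2014, 7, 1),
    "dev-recent": datetime(2014, 8, 1),
}

print(expired_vms(inventory, now=datetime(2014, 8, 14)))  # ['dev-old']
```

Passing the current time in as a parameter makes the check easy to dry-run against a fixed date before anything is actually deleted.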

vSphere 5.5 Single Sign On the easy way

July 14th, 2014

A few days ago I posted this article. For it to work, the firewall has to open 9 TCP and 9 UDP ports. That’s a lot of open ports, not to mention the troubleshooting along the way.

With VMware vCenter Server Appliance (vCSA) v5.5, VMware has added a new option for Single Sign On authentication, “Active Directory as a LDAP Server”. Things get much easier with this option: you don’t need to join the vCSA to the domain, and only one port needs to be open, tcp:389, which is for LDAP. Surprisingly, no one seems to have mentioned it on the Internet.

First, download and install vCSA with the default options. No fancy options yet. Active Directory is disabled because you don’t need to join the vCSA server to the domain.

Then click on the SSO option and change the default password for the Administrator@vsphere.local account. Note that this is required to set up SSO.

Then log in to the web client as Administrator@vsphere.local and go to Administration\Single Sign-On\Configuration. You have to log in with the Administrator account to see this option; the root account doesn’t work here. (This alone took me 2 hours to figure out.)

Then add a new Identity Source. Fill out the remaining fields as follows:

Name: Your AD domain name; e.g. “corp.local”
Base DN for users: Split your domain name into pieces along the dots (“.”) and prefix each part with “dc=”. Place commas (“,”) between the parts; e.g. “dc=corp,dc=local”
Domain name: Your AD domain name; e.g. “corp.local”
Domain alias: The NetBIOS name of your AD domain; e.g. “CORP”
Base DN for groups: Same as the Base DN for users; e.g. “dc=corp,dc=local”
Primary Server URL: The Active Directory server as a URL with the protocol “ldap://” and the port 389; e.g. ldap://
Secondary Server URL: Another Active Directory server or domain controller as a URL if you have one. Otherwise leave it blank; e.g. ldap://
Username: An Active Directory username in NetBIOS notation with privileges to read all users and groups; e.g. “CORP\Administrator”
Password: The password of the above user.
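
The Base DN rule above (split on dots, prefix each part with “dc=”, join with commas) is mechanical enough to script. A minimal Python sketch; base_dn is a hypothetical helper name.

```python
def base_dn(domain):
    """Build a Base DN from a DNS domain name, e.g. corp.local -> dc=corp,dc=local."""
    parts = [part for part in domain.strip().split(".") if part]
    return ",".join("dc=" + part for part in parts)


print(base_dn("corp.local"))  # dc=corp,dc=local
```

The same value goes into both “Base DN for users” and “Base DN for groups”.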

Hit the test button and that should be it. If it doesn’t work, make sure you have tcp:389 open on the domain controller.

Required ports for adding the ESX/ESXi host to an Active Directory domain

July 12th, 2014

You need to open both TCP and UDP for the following ports:

Port 88 – Kerberos authentication
Port 123 – NTP
Port 135 – RPC
Port 137 – NetBIOS Name Service
Port 139 – NetBIOS Session Service (SMB)
Port 389 – LDAP
Port 445 – Microsoft-DS Active Directory, Windows shares (SMB over TCP)
Port 464 – Kerberos password change
Port 3268 – Global Catalog search
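
To hand this list to whoever manages the firewall, it can help to expand it into explicit protocol/port pairs. The Python sketch below does only that expansion (9 ports, TCP and UDP each, so 18 rules); AD_PORTS and required_rules are names invented for this example, and turning the pairs into actual firewall syntax is left out because it depends on the firewall.

```python
# Ports from the list in this post, keyed by port number.
AD_PORTS = {
    88: "Kerberos authentication",
    123: "NTP",
    135: "RPC",
    137: "NetBIOS Name Service",
    139: "NetBIOS Session Service (SMB)",
    389: "LDAP",
    445: "Microsoft-DS Active Directory, Windows shares (SMB over TCP)",
    464: "Kerberos password change",
    3268: "Global Catalog search",
}


def required_rules(ports=AD_PORTS):
    """Expand each port into a TCP rule and a UDP rule."""
    return [
        (proto, port, description)
        for port, description in sorted(ports.items())
        for proto in ("tcp", "udp")
    ]


for proto, port, description in required_rules():
    print(f"{proto}/{port}  {description}")
```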

Cisco Nexus 1000v: VEM needs a VMK interface to connect to VSM

June 26th, 2014

A VMK (VMkernel) interface is a virtual interface that ESXi itself uses to connect to the outside world. When we first set up an ESXi host, VMK0 is set up as the management interface.

When you install the Nexus 1000v, the VEM modules need a way to communicate with the VSM modules, and we need a VMK interface on the ESXi hosts to do this.

If you choose to use the management VMK interface (normally VMK0) for layer 3 control, that VMK will need to be moved over to the Nexus 1000V, where it will sit ‘behind’ the VEM. Until the VMK interface is moved to the VEM, the VSM will not ‘see’ the VEM (i.e. it won’t appear in the output of ‘sh mod’).

Personally, I prefer to keep the VMK0 interface “out of band”. I leave VMK0 on vSwitch0 and create a new VMK1 interface for the VEM communication.

If you choose this option, you may need to configure static routes on the ESXi host if the two VMK interfaces are in different VLANs. For example, a default gateway would be configured via VMK0, while a more specific static route would be configured via VMK1 towards the VSM IP address.
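
The reason the more specific route wins is longest-prefix matching. The Python sketch below illustrates that selection with the standard ipaddress module; it simplifies a real routing table by mapping prefixes directly to interface names instead of gateways, and the VSM address 10.10.20.5 is a made-up example.

```python
import ipaddress


def pick_interface(routes, destination):
    """Return the interface of the matching route with the longest prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, interface in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, interface))
    if not matches:
        return None
    return max(matches)[1]  # highest prefix length wins


# Hypothetical routing table: default route via vmk0, host route to the VSM via vmk1.
ROUTES = [
    ("0.0.0.0/0", "vmk0"),
    ("10.10.20.5/32", "vmk1"),
]

print(pick_interface(ROUTES, "10.10.20.5"))  # vmk1 (traffic to the VSM)
print(pick_interface(ROUTES, "192.0.2.10"))  # vmk0 (everything else)
```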