The self-healing XenDesktop VM

At Synergy this year, I submitted an idea for the “GeekOvation” award, which took place during Geek Speak Live. My idea was based on the concept of making a XenDesktop VM self-aware and self-healing. The environment that I support consists of tens of thousands of 1:1 persistent Win 7 VMs. XenDesktop VMs need to be registered with the DDC (Desktop Delivery Controller) in order for users to be able to access them. Registration can and does fail, though, for a variety of reasons. If even 1% of these VMs are unregistered with XenDesktop, that could mean several hundred users who are unable to access their resources, resulting in calls to the support desk.

One way to get ahead of these unregistered devices was to run a PowerShell script against a DDC in each of my three farms, get back a list of any unregistered VMs and then try to correct them. While this would work, it was slow and inefficient. I wanted to find a better way, and the thought occurred to me: why can’t the VM know its own registration status with the DDC? If the VM was able to become “self-aware”, it could take its own corrective actions to try and get itself registered again! (I’m pretty sure historians will look back and mark this as the beginning of SkyNet!)

To enable the VM to query its own registration status, I went about creating an agent to run on the VM. It would be very lightweight, just a few KB in size, and it could be installed in any number of ways, such as a system service or a scheduled task. At regular intervals of your choosing, say 30 minutes, the agent would contact a DDC and discover its registration status. How the VM would actually communicate with the DDC became the challenge, however. The obvious choice was to use PowerShell, but this would require the broker snap-in to be installed on each VM, and that really wasn’t practical. Enter the Monitor Service OData API.

The Monitor Service API allows you to perform queries against the monitoring database on your SQL server. This is the same database that Citrix Director uses to get its information. There are two endpoints that are available for use, Methods and Data. Methods exposes the operations that are used by Director to perform trends and groupings. Data gives read-only access to the database entities. The Methods API URI is: http://{dc-host}/Citrix/Monitor/OData/v1/Methods. The Data API URI is: http://{dc-host}/Citrix/Monitor/OData/v1/Data. I am using the Data API as I am just querying a single piece of information, the VM in question.
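As a rough illustration of how the two endpoint URIs are put together, here is a minimal Python sketch (the agent itself is a C# project; Python is used here only to keep the example short). The helper name monitor_endpoints and the host ddc01.example.local are made up for the example. The $metadata path is the standard OData way to list the entities a service exposes, and Machines is the table queried later in this post.

```python
def monitor_endpoints(dc_host: str) -> dict[str, str]:
    """Return the Monitor Service OData URIs for one Delivery Controller."""
    base = f"http://{dc_host}/Citrix/Monitor/OData/v1"
    data = f"{base}/Data"
    return {
        "methods": f"{base}/Methods",     # operations used by Director
        "data": data,                     # read-only access to the entities
        "metadata": f"{data}/$metadata",  # standard OData schema document
        "machines": f"{data}/Machines",   # entity set used in this post
    }


if __name__ == "__main__":
    # Hypothetical host name, for illustration only.
    for name, uri in monitor_endpoints("ddc01.example.local").items():
        print(f"{name:9} {uri}")
```

Opening the metadata URI in a browser is a quick way to see which entities and properties your version of the service actually exposes.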

Using the Data API URI, I add a service reference in my C# project for my database context. I have called the service reference XDWebService. There are a couple of pieces of information that I need to provide for it: a DDC, the VM name and credentials to authenticate with. The username that is specified here is a service account that has read-only permissions on my XenDesktop farm.

[Image: Code]
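For readers who cannot see the screenshot, the same inputs (a DDC, the VM name and a service account) can be sketched in Python. This is not the C# service reference described above, only an illustration of what the agent needs to know. The class AgentSettings, the function machine_query_url, the example values and the assumption that the VM is matched on a property called Name are all hypothetical. How the credentials are actually presented to the service (typically Windows authentication) is left out of the sketch.

```python
from dataclasses import dataclass, field
from urllib.parse import quote


@dataclass(frozen=True)
class AgentSettings:
    ddc_host: str                     # Delivery Controller to query
    machine_name: str                 # this VM's name as stored in the Machines table
    username: str                     # read-only service account
    password: str = field(repr=False) # keep the secret out of logs and reprs


def machine_query_url(settings: AgentSettings) -> str:
    """Build a Data API URL that returns only this VM's row."""
    # OData string literals use single quotes; an embedded quote is doubled.
    literal = settings.machine_name.replace("'", "''")
    # Assumption: the machine is matched on a property called "Name".
    odata_filter = quote(f"Name eq '{literal}'", safe="")
    return (
        f"http://{settings.ddc_host}/Citrix/Monitor/OData/v1/Data/Machines"
        f"?$filter={odata_filter}"
    )


if __name__ == "__main__":
    # All values below are placeholders.
    settings = AgentSettings("ddc01.example.local", "CORP\\VM001", "svc_monitor", "secret")
    print(machine_query_url(settings))
```

Filtering on the server side keeps the response down to a single row, which matters when every VM in a large farm is asking the same DDC about itself.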

So now that we have our connection set up to perform our query, I can build a simple LINQ query which runs against the Machines table and finds the VM that we want. We can then check the CurrentRegistrationStatus of that VM. If it is 2, then the VM is unregistered. If you were to view the result in a browser, this is what you would see:

[Image: Browser]
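The check described above can be sketched in Python as follows. It is not the LINQ query from the C# agent. It assumes the query result has already been decoded into a list of dictionaries, one per machine, and the sample rows are invented. The property name CurrentRegistrationStatus and the value 2 are taken from the text above; it is worth confirming the exact property name against the service's $metadata document. Matching on a Name property is an assumption.

```python
from typing import Any, Iterable, Mapping, Optional

UNREGISTERED = 2                            # value given above for an unregistered VM
STATUS_FIELD = "CurrentRegistrationStatus"  # property name as written above


def registration_status(rows: Iterable[Mapping[str, Any]],
                        machine_name: str) -> Optional[int]:
    """Return the status code for machine_name, or None if no row matches."""
    wanted = machine_name.lower()
    for row in rows:
        # Assumption: rows carry the machine name in a "Name" property.
        if str(row.get("Name", "")).lower() == wanted:
            return int(row[STATUS_FIELD])
    return None


def needs_healing(rows: Iterable[Mapping[str, Any]], machine_name: str) -> bool:
    """True only when the VM was found and reports the unregistered code."""
    return registration_status(rows, machine_name) == UNREGISTERED


if __name__ == "__main__":
    # Invented sample rows standing in for a decoded query result.
    sample = [
        {"Name": "CORP\\VM001", STATUS_FIELD: 1},
        {"Name": "CORP\\VM002", STATUS_FIELD: 2},
    ]
    print(needs_healing(sample, "CORP\\VM002"))
```

A VM that is missing from the result is deliberately not treated as unregistered here, since an empty answer may just mean the query or the name was wrong.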

 

I then call a registerVM() method to perform some corrective actions on the VM to try and get it to register itself. These can include items like restarting the Citrix Desktop and ICA services, flushing the DNS cache and verifying that the DDC entries in the registry are correct. After performing those items, if the VM is still unregistered, you could take it a step further and have the VM send a notification to your helpdesk to let support staff know it is in distress, so they can attempt to fix it before there is any downtime impact to the end user.
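The actual registerVM() code is not shown in this post, so the following Python sketch is only a guess at its shape. The service names BrokerAgent and PorticaService, the idea that the DDC list is stored as a space-separated string, and every function name are assumptions to verify in your own environment. The command runner is injectable so the logic can be exercised without touching a real VM.

```python
import subprocess
from typing import Callable, Sequence

# Assumption: Windows service names for the Citrix Desktop and ICA services.
SERVICES = ("BrokerAgent", "PorticaService")


def ddc_list_matches(configured: str, expected: Sequence[str]) -> bool:
    """Compare a space-separated DDC list read from the VM with the expected one.

    Assumption: the registry value holds DDC names separated by spaces.
    Reading the registry itself is left to the caller.
    """
    have = {name.lower() for name in configured.split()}
    want = {name.lower() for name in expected}
    return have == want


def corrective_commands(services: Sequence[str] = SERVICES) -> list[list[str]]:
    """Flush DNS first, then restart each service so it starts with a clean cache."""
    commands = [["ipconfig", "/flushdns"]]
    for service in services:
        commands.append(["net", "stop", service, "/y"])
        commands.append(["net", "start", service])
    return commands


def register_vm(run: Callable = subprocess.run) -> list[int]:
    """Run each corrective command in order and return the exit codes."""
    return [run(cmd).returncode for cmd in corrective_commands()]
```

On a real VM, register_vm() would be called with the default runner from an elevated context, and a False result from ddc_list_matches would be a reason to correct the registry entry before restarting anything.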

So the end result is that I now have VMs that are aware of their own registration status with XenDesktop and can try to fix themselves. My custom agent is very lightweight, only about 60K in size. There is no need for any PowerShell or additional snap-ins to be installed on the VM. All it requires to run is the .NET Framework, which should already be installed for the VDA to function. Actually, you’d have to wonder why this kind of functionality is not already built in to the VDA in some form or another. (Hey Citrix, I’m available for consulting on my idea!)
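Putting the pieces together, one pass of such an agent (check, try to heal, check again, escalate) might look like the Python sketch below. It is an outline under assumptions, not the C# agent: the three callables stand in for the status query, the corrective actions and the helpdesk notification, and all names are hypothetical. The 30-minute default mirrors the interval suggested earlier.

```python
import time
from typing import Callable, Optional


def run_cycle(is_unregistered: Callable[[], bool],
              heal: Callable[[], None],
              notify: Callable[[str], None]) -> str:
    """One agent pass: check, try to heal, re-check, escalate if still broken."""
    if not is_unregistered():
        return "registered"
    heal()
    if not is_unregistered():
        return "healed"
    notify("VM is still unregistered after corrective actions")
    return "escalated"


def run_agent(is_unregistered: Callable[[], bool],
              heal: Callable[[], None],
              notify: Callable[[str], None],
              interval_seconds: float = 30 * 60,
              sleep: Callable[[float], None] = time.sleep,
              max_cycles: Optional[int] = None) -> None:
    """Repeat run_cycle every interval_seconds (forever when max_cycles is None)."""
    done = 0
    while max_cycles is None or done < max_cycles:
        run_cycle(is_unregistered, heal, notify)
        done += 1
        sleep(interval_seconds)
```

In practice the re-check would need a short pause after healing, since a restarted service takes some time to register again. When the agent is installed as a scheduled task, the scheduler provides the interval and only run_cycle is needed.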

In the end, I came second in the GeekOvation award final, but I still think this was a really cool idea. Since then, I have been using the Monitor Service API a lot more, but that is for future posts. You can see my short PowerPoint from my two-minute presentation on this at Synergy here.

Questions or comments, feel free to contact me. Email: sasponto(AT)gmail(DOT)com Twitter: @SasPonto

9 Comments

  • Chris Doran says:

    Saw this at Synergy…loved it…reading it again. Great trick and nice work!

  • Stan says:

    Hey there. An excellent article! Thank you! However I have a question regarding the “registerVM()” item. What exactly is that? I ask because your article gave me an idea for a power shell script to run on the VM’s. I have it querying the registration state, but I don’t know what powershell module “registerVM” translates to. Can you give me a hint?

    Thanks in advance,
    Stan 🙂

    • Shane O'Neill says:

      Hi Stan – thanks for the feedback! That is a separate custom function that then performs a couple of different tasks to try and get the VM to register, it’s not an actual PowerShell module. These include starting and stopping services, clearing the DNS cache and validating the DDC entry in the registry. I’ll happily post a “Part 2” about this topic and go in to details about that custom function and the tasks that it performs.

      Shane…

  • Sascha Meyes says:

    Hi Shane,

    incredible solution for larger vdi infrastructures, congrats!

    i read also the synergy geekovation presentation, impressive.

    Are the sources\Agent available for free download?

    thanks

    best regards sascha

    • Shane O'Neill says:

      Hi Sascha,

      Unfortunately, I can’t share the entire source code… I am just able to share the concept of it and some of the key components… Hopefully that will allow people to develop their own solution based on it…

      Shane…

  • Stan says:

    So far I’ve got a powershell script that flushes DNS cache and restarts the Citrix Broker Agent service, triggered on Citrix Desktop Service events 1002, 1017 & 1022. I’m just looking for more stuff for the powershell script to do, in an effort to get the VDI session re-registered. Honestly, it’d probably better to make this a CMD file (speed & efficiency reasons), but it’s still a work in progress. 🙂 Looking forward to part 2! Thanks again!

  • shan says:

    Great article..
    looking forward for the 2nd part.
    Just another thought of utilizing scom and orchestrator to monitor events on DDCs and deal the bad machines on the fly (not a batch of failed machines)
    like triggering a healing script on the VDIs or something.
