Archive for the 'OpsMgr2007' Category


OpsMgr Eventlog analysis with Powershell

Wednesday, December 16th, 2009

The following technique should already be familiar to any powersheller. Here we focus on Operations Manager log entries, although the data mining technique shown is entirely possible (and encouraged :-)) with any other event log.

Let’s start by getting our eventlog into a variable called $evt:

PS  >> $evt = Get-Eventlog "Operations Manager"

The above only works locally in POSH v1.

In POSH v2 you can go remotely by using the “-computername” parameter:

PS  >> $evt = Get-Eventlog "Operations Manager" -computername RMS.domain.com

Anyhow, you can get to this remotely also in POSHv1 with this other more “dotNET-tish” syntax, passing the machine name as a second constructor argument:

PS >> $evt = (New-Object System.Diagnostics.Eventlog -ArgumentList "Operations Manager", "RMS.domain.com").get_Entries()

you could even export this (or any of the above) to a CLIXML file:

PS >> (New-Object System.Diagnostics.Eventlog -ArgumentList "Operations Manager").get_Entries() | export-clixml -path c:\evt\Evt-OpsMgr-RMS.MYDOMAIN.COM.xml

and then you could reload your eventlog to another machine:

PS  >> $evt = import-clixml c:\evt\Evt-OpsMgr-RMS.MYDOMAIN.COM.xml

whatever way you used to populate your $evt  variable, be it from a “live” eventlog or by re-importing it from XML, you can then start analyzing it:

PS  >> $evt | where {$_.Entrytype -match "Error"} | select EventId,Source,Message | group eventid

Count Name                      Group
----- ----                      -----
1510 4509                      {@{EventID=4509; Source=HealthService; Message=The constructor for the managed module type "Microsoft.EnterpriseManagement.Mom.DatabaseQueryModules.GroupCalculatio.
   15 20022                     {@{EventID=20022; Source=OpsMgr Connector; Message=The health service {7B0E947B-2055…
    3 26319                     {@{EventID=26319; Source=OpsMgr SDK Service; Message=An exception was thrown while p…
    1 4512                      {@{EventID=4512; Source=HealthService; Message=Converting data batch to XML failed w…

the above is functionally identical to the following:

PS  >> $evt | where {$_.Entrytype -eq 1} | select EventID,Source,Message | group eventid

Count Name                      Group
----- ----                      -----
1510 4509                      {@{EventID=4509; Source=HealthService; Message=The constructor for the managed modul…
   15 20022                     {@{EventID=20022; Source=OpsMgr Connector; Message=The health service {7B0E947B-2055…
    3 26319                     {@{EventID=26319; Source=OpsMgr SDK Service; Message=An exception was thrown while p…
    1 4512                      {@{EventID=4512; Source=HealthService; Message=Converting data batch to XML failed w…
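
When only the counts matter, a variation of the one-liner above can drop the Group column and sort by frequency. What follows is a minimal sketch that runs against a handful of made-up sample objects ($sample and its entries are invented for illustration, they are not real log data); against a real log you would pipe $evt instead.

```powershell
# Made-up sample entries standing in for $evt (illustration only)
$sample = @(
    (New-Object PSObject -Property @{ EventID = 4509;  EntryType = 'Error' }),
    (New-Object PSObject -Property @{ EventID = 4509;  EntryType = 'Error' }),
    (New-Object PSObject -Property @{ EventID = 4509;  EntryType = 'Error' }),
    (New-Object PSObject -Property @{ EventID = 20022; EntryType = 'Error' }),
    (New-Object PSObject -Property @{ EventID = 20022; EntryType = 'Error' }),
    (New-Object PSObject -Property @{ EventID = 4512;  EntryType = 'Error' }),
    (New-Object PSObject -Property @{ EventID = 2110;  EntryType = 'Information' })
)

# Keep only errors, count them per event ID, most frequent first
$errorCounts = $sample |
    Where-Object { $_.EntryType -eq 'Error' } |
    Group-Object EventID -NoElement |
    Sort-Object Count -Descending

$errorCounts | Format-Table Count, Name -AutoSize
```

The -NoElement switch keeps Group-Object from carrying every grouped entry along, which also helps a bit with memory on large logs.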

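Note that the entry types used in these filters take the values 0, 1 and 2, similarly to OpsMgr health states, but beware that their order is not the same, as shown in the following table. (The documented System.Diagnostics.EventLogEntryType enum defines Error = 1, Warning = 2 and Information = 4, so on other logs informational events may need -eq 4, or -match "Information", instead of -eq 0.)

Code  OpsMgr States  Events EntryType
0     Not Monitored  Information
1     Success        Error
2     Warning        Warning
3     Critical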

Let’s now look at Information Events (Entrytype -eq 0)

PS  >> $evt | where {$_.Entrytype -eq 0} | select EventID,Source,Message | group eventid

Count Name                      Group
----- ----                      -----
4135 2110                      {@{EventID=2110; Source=HealthService; Message=Health Service successfully transferr…
1548 21025                     {@{EventID=21025; Source=OpsMgr Connector; Message=OpsMgr has received new configura…
4644 7026                      {@{EventID=7026; Source=HealthService; Message=The Health Service successfully logge…
1548 7023                      {@{EventID=7023; Source=HealthService; Message=The Health Service has downloaded sec…
1548 7025                      {@{EventID=7025; Source=HealthService; Message=The Health Service has authorized all…
1548 7024                      {@{EventID=7024; Source=HealthService; Message=The Health Service successfully logge…
1548 7028                      {@{EventID=7028; Source=HealthService; Message=All RunAs accounts for management gro…
   16 20021                     {@{EventID=20021; Source=OpsMgr Connector; Message=The health service {7B0E947B-2055…
   13 7019                      {@{EventID=7019; Source=HealthService; Message=The Health Service has validated all …
    4 4002                      {@{EventID=4002; Source=Health Service Script; Message=Microsoft.Windows.Server.Logi…

 

And “Warning” events (Entrytype -eq 2):

PS  >> $evt | where {$_.Entrytype -eq 2} | select EventID,Source,Message | group eventid

Count Name                      Group
----- ----                      -----
1511 1103                      {@{EventID=1103; Source=HealthService; Message=Summary: 1 rule(s)/monitor(s) failed …
  501 20058                     {@{EventID=20058; Source=OpsMgr Connector; Message=The Root Connector has received b…
    5 29202                     {@{EventID=29202; Source=OpsMgr Config Service; Message=OpsMgr Config Service could …
  421 31501                     {@{EventID=31501; Source=Health Service Modules; Message=No primary recipients were …
   18 10103                     {@{EventID=10103; Source=Health Service Modules; Message=In PerfDataSource, could no…
    1 29105                     {@{EventID=29105; Source=OpsMgr Config Service; Message=The request for management p…

 

 

Ok now let’s see those 20022 events, for example… so we get an idea of which healthservices they are referring to (20022 indicates a “heartbeat failure”, btw):

PS  >> $evt | where {$_.eventid -eq 20022} | select message

Message
-------
The health service {7B0E947B-2055-C12A-B6DB-DD6B311ADF39} running on host webapp3.domain1.mydomain.com and s…
The health service {E3B3CCAA-E797-4F08-860F-47558B3DA477} running on host SERVER1.domain2.mydomain.com and serving…
The health service {E3B3CCAA-E797-4F08-860F-47558B3DA477} running on host SERVER1.domain2.mydomain.com and serving…
The health service {E3B3CCAA-E797-4F08-860F-47558B3DA477} running on host SERVER1.domain2.mydomain.com and serving…
The health service {52E16F9C-EB1A-9FAF-5B9C-1AA9C8BC28E3} running on host DC4WK3.domain1.mydomain.com and se…
The health service {F96CC9E6-2EC4-7E63-EE5A-FF9286031C50} running on host VWEBDL2.domain1.mydomain.com and s…
The health service {71987EE0-909A-8465-C32D-05F315C301CC} running on host VDEVWEBPROBE2.domain2.mydomain.com….
The health service {BAF6716E-54A7-DF68-ABCB-B1101EDB2506} running on host XP2SMS002.domain2.mydomain.com and serving mana…
The health service {30C81387-D5E0-32D6-C3A3-C649F1CF66F1} running on host stgweb3.domain3.mydomain.com and…
The health service {3DCDD330-BBBB-B8E8-4FED-EF163B27DE0A} running on host VWEBDL1.domain1.mydomain.com and s…
The health service {13A47552-2693-E774-4F87-87DF68B2F0C0} running on host DC2.domain4.mydomain.com and …
The health service {920BF9A8-C315-3064-A5AA-A92AA270529C} running on host FSCLU2 and serving management group Pr…
The health service {FAA3C2B5-C162-C742-786F-F3F8DC8CAC2F} running on host WEBAPP4.domain1.mydomain.com and s…
The health service {3DCDD330-BBBB-B8E8-4FED-EF163B27DE0A} running on host WEBDL1.domain1.mydomain.com and s…
The health service {3DCDD330-BBBB-B8E8-4FED-EF163B27DE0A} running on host WEBDL1.domain1.mydomain.com and s…

 

or let’s look at some warnings for the Config Service:

PS  >> $evt | where {$_.Eventid -eq 29202}

   Index Time          EntryType   Source                 InstanceID Message
   ----- ----          ---------   ------                 ---------- -------
5535065 Dec 07 21:18  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…
5543960 Dec 09 16:39  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…
5545536 Dec 10 01:06  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…
5553119 Dec 11 08:24  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…
5555677 Dec 11 10:34  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…

Having seen those, can you remember any particular load you had on those days that would explain the instance space changing so quickly that the Config Service couldn’t keep up?

 

Or let’s group those events with ID 21025 by day, so we know how many Config recalculations we’ve had (which, if many, might indicate Config Churn):

PS  >> $evt | where {$_.Eventid -eq 21025} | select TimeGenerated | % {$_.TimeGenerated.ToShortDateString()} | group

Count Name                      Group
----- ----                      -----
   39 12/7/2009                 {12/7/2009, 12/7/2009, 12/7/2009, 12/7/2009…}
  203 12/8/2009                 {12/8/2009, 12/8/2009, 12/8/2009, 12/8/2009…}
  217 12/9/2009                 {12/9/2009, 12/9/2009, 12/9/2009, 12/9/2009…}
  278 12/10/2009                {12/10/2009, 12/10/2009, 12/10/2009, 12/10/2009…}
  259 12/11/2009                {12/11/2009, 12/11/2009, 12/11/2009, 12/11/2009…}
  224 12/12/2009                {12/12/2009, 12/12/2009, 12/12/2009, 12/12/2009…}
  237 12/13/2009                {12/13/2009, 12/13/2009, 12/13/2009, 12/13/2009…}
   91 12/14/2009                {12/14/2009, 12/14/2009, 12/14/2009, 12/14/2009…}
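
If a day turns out to be too coarse, the same idea works per hour by grouping on a formatted timestamp instead of the short date. This is only a sketch: the $sample objects and their timestamps are made up for illustration, and the 'yyyy-MM-dd HH' format string is just one possible choice of bucket.

```powershell
# Made-up timestamps standing in for the TimeGenerated values of 21025 events
$sample = @(
    (New-Object PSObject -Property @{ EventID = 21025; TimeGenerated = [datetime]'2009-12-07T21:05:00' }),
    (New-Object PSObject -Property @{ EventID = 21025; TimeGenerated = [datetime]'2009-12-07T21:40:00' }),
    (New-Object PSObject -Property @{ EventID = 21025; TimeGenerated = [datetime]'2009-12-07T22:10:00' }),
    (New-Object PSObject -Property @{ EventID = 21025; TimeGenerated = [datetime]'2009-12-08T09:15:00' })
)

# Group on "date + hour"; the sortable format keeps the buckets in time order
$byHour = $sample |
    Where-Object { $_.EventID -eq 21025 } |
    Group-Object { $_.TimeGenerated.ToString('yyyy-MM-dd HH') } -NoElement |
    Sort-Object Name

$byHour | Format-Table Count, Name -AutoSize
```

With a real log, replace $sample with $evt and the rest stays the same.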

 

Event ID 21025 shows that there is a new configuration for the Management Group.

Event ID 29103 has a similar wording, but shows that there is a new configuration for a given Healthservice. There should normally be many more of these events, unless your only Health Service is the RMS, which is unlikely…

If we search the event description (“message”) for the name (or even the GUID, as both are present) of our RMS, as follows, then the counts should be the same as those of the 21025 events above:

PS  >> $evt | where {$_.Eventid -eq 29103} | where {$_.message -match "myrms.domain.com"} | select TimeGenerated | % {$_.TimeGenerated.ToShortDateString()} | group

Count Name                      Group
----- ----                      -----
   39 12/7/2009                 {12/7/2009, 12/7/2009, 12/7/2009, 12/7/2009…}
  203 12/8/2009                 {12/8/2009, 12/8/2009, 12/8/2009, 12/8/2009…}
  217 12/9/2009                 {12/9/2009, 12/9/2009, 12/9/2009, 12/9/2009…}
  278 12/10/2009                {12/10/2009, 12/10/2009, 12/10/2009, 12/10/2009…}
  259 12/11/2009                {12/11/2009, 12/11/2009, 12/11/2009, 12/11/2009…}
  224 12/12/2009                {12/12/2009, 12/12/2009, 12/12/2009, 12/12/2009…}
  237 12/13/2009                {12/13/2009, 12/13/2009, 12/13/2009, 12/13/2009…}
   91 12/14/2009                {12/14/2009, 12/14/2009, 12/14/2009, 12/14/2009…}

 

Going back to the initial counts of events by their IDs: when showing the errors, the counts above revealed a lonely 4512 event, which might have gone undetected if just browsing the eventlog with the GUI, since it only occurred once.

Let’s take a look at it:

PS  >> $evt | where {$_.eventid -eq 4512}

   Index Time          EntryType   Source                 InstanceID Message
   ----- ----          ---------   ------                 ---------- -------
5560756 Dec 12 11:18  Error       HealthService          3221229984 Converting data batch to XML failed with error …

Now, when it is about counts, Powershell is great.  But sometimes Powershell makes it difficult to actually READ the (long) event messages (descriptions) in the console. For example, our event ID 4512 is difficult to read in its entirety, as it gets truncated with trailing dots…

We can of course increase the window size and/or select only THAT one field to read it better:

PS  >> $evt | where {$_.eventid -eq 4512} | select message

Message
-------
Converting data batch to XML failed with error "Not enough storage is available to complete this operation." (0x8007000E) in rule "Microsoft.SystemCenter.ConfigurationService.CollectionRule.Event.ConfigurationChanged" running for instance "RMS.MYDOMAIN.COM" with id:"{04F4ADED-2C7F-92EF-D620-9AF9685F736F}" in management group "SCOMPROD"

Or, worst case, if it still does not fit, we can still go and search for it in the actual, usual eventlog application… but at least we will have spotted it!
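
Before falling back to the GUI, two other options are worth a try, since it is the table view that cuts long values: a list view, or pulling out the bare string. The sketch below uses a single made-up object ($sample, with a shortened copy of the message above) in place of the real event.

```powershell
# Made-up object standing in for the 4512 entry in $evt
$sample = New-Object PSObject -Property @{
    EventID = 4512
    Message = 'Converting data batch to XML failed with error "Not enough storage is available to complete this operation." (0x8007000E)'
}

# Option 1: a list view wraps long values instead of cutting them
$sample | Where-Object { $_.EventID -eq 4512 } | Format-List EventID, Message

# Option 2: expand the property to get the plain string (easy to pipe to a file)
$text = $sample | Where-Object { $_.EventID -eq 4512 } | Select-Object -ExpandProperty Message
$text
```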

 

The above is meant to give you an idea of what is easily accomplished with some simple one-liners, and how it can be a useful aid in analyzing/digging into Eventlogs.

All of the above is ALSO possible with Logparser, which would actually be lighter on memory usage and quicker, to be honest!

I just like Powershell syntax a lot more, and its ubiquity, which makes it a better option for me. Your mileage may vary, of course.

Invoking Methods on the Xplat agent with WINRM

Monday, October 26th, 2009

So I was testing other stuff tonight, to be honest, but I got pinged on Instant Messenger by my geek friend and colleague Stefan Stranger who pointed me at his request for help here http://friendfeed.com/sstranger/4571f39b/help-needed-on-winrs-or-winrm-and-openwsman-to

He wanted to use WINRM or any other command line utility to interact with the Xplat agent, and call methods on the Unix machine from windows. This could be very useful to – for example – restart a service (in fact it is what the RECOVERY actions in the Xplat Management Packs do, btw).

At first I told him I had only tested enumerations, such as on this other post http://www.muscetta.com/2009/06/01/using-the-scx-agent-with-wsman-from-powershell-v2/ … but the question intrigued me, so I checked out the help for winrm’s INVOKE verb:

clip_image002

Which told me that you can pass in the parameters for the method to be called/invoked either as a hashtable @{KEY="value";KEY2="value"}, or as an input XML file. I first tried the XML file but I could not get its format right.

After a few more minutes of trying, I figured out the right syntax.

This one works, for example:

winrm invoke ExecuteCommand http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx @{command="ps";timeout="60"} -username:root -password:password -auth:basic -r:https://virtubuntu.huis.dom:1270/wsman -skipCACheck -encoding:UTF-8

clip_image004

Happy remote management of your unix systems from Windows :-)

The mystery of the lost registry values

Thursday, September 10th, 2009

During the OpsMgr Health Check engagement we use custom code to assess the customer’s Management group, as I wrote here already. Given that the customer tells us which machine is the RMS, one of the very first things that we do in our tool is to connect to the RMS’s registry, and check the values under HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup to see which machine holds the database. It is a rather critical piece of information for us, as we run a number of queries afterward… so we need to know where the db is, obviously :-)

I learned from here http://mybsinfo.blogspot.com/2007/01/powershell-remote-registry-and-you-part.html how to access registry remotely thru powershell, by using .Net classes. This is also one of the methods illustrated in this other article on Technet Script Center http://www.microsoft.com/technet/scriptcenter/resources/qanda/jan09/hey0105.mspx 

Therefore the “core” instructions of the function I was using to access the registry looked like the following

Function GetValueFromRegistry ([string]$computername, $regkey, $value)
{
    # opens HKLM on the remote machine
    $reg = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey('LocalMachine', $computername)
    # opens the subkey, then reads the requested value from it
    $key = $reg.OpenSubKey("$regkey")
    $result = $key.GetValue("$value")
    return $result
}

 

[Note: the actual function is bigger, and contains error handling, and logging, and a number of other things that are unnecessary here]

Therefore, the function was called as follows:
GetValueFromRegistry $RMS "SOFTWARE\\Microsoft\\Microsoft Operations Manager\\3.0\\Setup" "DatabaseServerName"
Now so far so good.

In theory.

 

Now, for some reason that I could not immediately explain, we had noticed that this piece of code performing registry access, while working most of the time, on SOME occasions was giving errors about not being able to open the registry value…

image

When you are onsite with a customer conducting an assessment, the PFE engineer does not always have the time to troubleshoot the error… as time is critical, we have usually resorted to just running the assessment from ANOTHER machine, and this “solved” the issue… but it always left me wondering WHY this was giving an error. I had suspected an issue with permissions first, but it could not be, as the permissions were obviously right: performing the assessment from another machine but with the same user was working!

A few days ago my colleague and buddy Stefan Stranger figured out that this was related to the platform architecture:

  • X64 client to x64 RMS was working
  • X64 client to x86 RMS was working
  • X86 client to x86 RMS was working
  • X86 client to x64 RMS was NOT working
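
To know which of the four cases above applies, it helps to check the bitness of the session you are actually running in. A minimal sketch, based on the pointer size of the current process (the variable name $bitness is mine):

```powershell
# 8-byte pointers mean a 64-bit process, 4-byte pointers a 32-bit one
$bitness = [IntPtr]::Size * 8
"This PowerShell session is a ${bitness}-bit process"
```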

You don’t need to use our custom code to reproduce this, REGEDIT shows the behavior as well.

If, from a 64-bit server, you open a remote registry connection to 64-bit RMS server, you can see all OpsMgr registry keys:

clip_image002

If, anyhow, from a 32-bit server, you open a remote registry connection to 64-bit RMS server, you don’t see ALL – but only SOME – OpsMgr registry keys:
clip_image004

So here’s the reason! This is what was happening! How could I not think of this before? It was nothing related to permissions, but to registry redirection! The issue was happening because the 32-bit machine uses the 32-bit registry editor, which defaults to the Wow6432Node location in the registry when accessing a 64-bit machine. On a 64-bit machine not all OpsMgr data is in the Wow6432Node location, only some.

So, just like regedit, the 32bit powershell and the 32bit .Net framework were being redirected to the 32bit-compatibility registry keys… not finding the stuff we needed, whereas a 64bit application could find that. Any 32bit application by default gets redirected to a 32bit-safe registry.

So, after finally UNDERSTANDING what the issue was, I started wondering: ok… but how can I access the REAL “HKLM\SOFTWARE\Microsoft” key on a 64bit machine when running this FROM a 32bit machine, WITHOUT being redirected to “HKLM\SOFTWARE\Wow6432Node\Microsoft” ? What if my application CAN deal just fine with those values and actually NEEDs to access them?

The answer wasn’t as easy as the question. I did a bit of digging on this, and still I have NOT yet found a way to do this with the .Net classes. It seems that in a lot of situations, Powershell or even .Net classes are nice and sweet wrappers on the underlying Windows APIs… but for how sweet and easy they are, they are very often not very complete wrappers – letting you do just about enough for most situations, but not quite everything you would or could with the APi underneath. But I digress, here…

The good news is that I did manage to get this working, but I had to resort to using dear old WMI StdRegProvider… There are a number of locations on the Internet mentioning the issue of accessing 32bit registry from 64bit machines or vice versa, but all examples I have found were using VBScript. But I needed it in Powershell. Therefore I started with the VBScript example code that is present here, and I ported it to Powershell.

Handling the WMI COM object from Powershell was slightly less intuitive than in VBScript, and it took me a couple of hours to figure out how to change some stuff, especially this bit that sets the parameters collection:

Set Inparams = objStdRegProv.Methods_("GetStringValue").Inparameters

Inparams.Hdefkey = HKLM

Inparams.Ssubkeyname = RegKey

Inparams.Svaluename = RegValue

Set Outparams = objStdRegProv.ExecMethod_("GetStringValue", Inparams,,objCtx)

INTO this:

$Inparams = ($objStdRegProv.Methods_ | where {$_.name -eq "GetStringValue"}).InParameters.SpawnInstance_()

($Inparams.Properties_ | where {$_.name -eq "Hdefkey"}).Value = $HKLM

($Inparams.Properties_ | where {$_.name -eq "Ssubkeyname"}).Value = $regkey

($Inparams.Properties_ | where {$_.name -eq "Svaluename"}).Value = $value

$Outparams = $objStdRegProv.ExecMethod_("GetStringValue", $Inparams, "", $objNamedValueSet)

 

I have only done limited testing at this point and, even if the actual work now requires nearly 15 lines of code to be performed vs. the previous 3 lines in the .Net implementation, it at least seems to work just fine.

What follows is the complete code of my replacement function, in all its ugly glory:

 

Function GetValueFromRegistryThruWMI([string]$computername, $regkey, $value)
{
    #constant for HKLM (VBScript-style hex notation, passed as a string)
    $HKLM = "&h80000002"

    #creates an SWbemNamedValueSet object
    $objNamedValueSet = New-Object -COM "WbemScripting.SWbemNamedValueSet"

    #adds the actual value that requests the target to provide 64bit-registry info
    $objNamedValueSet.Add("__ProviderArchitecture", 64) | Out-Null

    #back to all the other usual COM objects for WMI that you have used a zillion times in VBScript
    $objLocator = New-Object -COM "Wbemscripting.SWbemLocator"
    $objServices = $objLocator.ConnectServer($computername,"root\default","","","","","",$objNamedValueSet)
    $objStdRegProv = $objServices.Get("StdRegProv")

    # Obtain an InParameters object specific to the method.
    $Inparams = ($objStdRegProv.Methods_ | where {$_.name -eq "GetStringValue"}).InParameters.SpawnInstance_()

    # Add the input parameters
    ($Inparams.Properties_ | where {$_.name -eq "Hdefkey"}).Value = $HKLM
    ($Inparams.Properties_ | where {$_.name -eq "Ssubkeyname"}).Value = $regkey
    ($Inparams.Properties_ | where {$_.name -eq "Svaluename"}).Value = $value

    #Execute the method
    $Outparams = $objStdRegProv.ExecMethod_("GetStringValue", $Inparams, "", $objNamedValueSet)

    #reads the return value (0 means success) without writing it to the function output
    $returnValue = ($Outparams.Properties_ | where {$_.name -eq "ReturnValue"}).Value

    if ($returnValue -eq 0)
    {
        write-host "it worked"
        $result = ($Outparams.Properties_ | where {$_.name -eq "sValue"}).Value
        write-host "Result: $result"
        return $result
    }
    else
    {
        write-host "nope (ReturnValue: $returnValue)"
    }
}

 

which can be called similarly to the previous one:
GetValueFromRegistryThruWMI $RMS "SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup" "DatabaseServerName"

[Note: you don’t need the doubled (escaped) backslashes here, compared to the .Net implementation]
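
For readers on newer systems: .NET Framework 4 and later add a RegistryView parameter to OpenRemoteBaseKey, which lets the caller ask explicitly for the 64-bit view. The following is a minimal sketch under that assumption (so PowerShell 3.0 or later, which runs on .NET 4); the function name GetValueFromRegistry64 is made up for illustration, it is not part of the original tool and has not been tested against an OpsMgr setup.

```powershell
Function GetValueFromRegistry64 ([string]$computername, [string]$regkey, [string]$value)
{
    # Explicitly ask for the 64-bit view, whatever the bitness of the calling process
    $hive = [Microsoft.Win32.RegistryHive]::LocalMachine
    $view = [Microsoft.Win32.RegistryView]::Registry64
    $reg  = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey($hive, $computername, $view)
    try
    {
        $key = $reg.OpenSubKey($regkey)
        # OpenSubKey returns null when the key does not exist
        if ($key -eq $null) { return $null }
        return $key.GetValue($value)
    }
    finally
    {
        $reg.Close()
    }
}

# Usage, with the same arguments as the WMI version:
# GetValueFromRegistry64 $RMS "SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup" "DatabaseServerName"
```

On PowerShell v1/v2 (CLR 2.0) this overload is not available, so the WMI approach above remains the way to go there.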

Enjoy your cross-architecture registry access: from 32bit to 64bit – and back!

SCX Evolutions

Sunday, July 19th, 2009

During the beta of the Cross-Platform extensions and of System Center Operations Manager 2007 R2, the product team had promised to eventually release the SCX Providers' source code.

Now that this promise has been kept, and the SCX providers have been released on Codeplex at http://xplatproviders.codeplex.com/ it should finally be possible to build your own unsupported agent package entirely from source code, without having to modify the original package as I have shown earlier on this blog.
Of course this will still be unsupported by Microsoft Product support, but it may well work just fine!
This is an extraordinary event in my opinion, as it is not common for Microsoft to release code as open source, especially when it is part of one of the products it sells. I suspect we will see more of this going forward.

Also, at R2 release time, some official documentation about building Cross-Platform Management Packs has been published on Technet.

Anyway, I have in the past published a number of posts on my blog under this tag http://www.muscetta.com/tag/xplat/ (I will continue to use that tag going forward) which describe how I hacked/modified both the existing MPs AND the SCX agent package to let it run on unsupported distributions. I think they are still useful, as they show a number of techniques to test, understand and troubleshoot the Xplat agent a bit. In fact, I first learned how to understand and modify the RedHat MPs to monitor CentOS, and eventually even modified the RPM package to run on Ubuntu (which also works on Debian 5/Lenny), as you can see from the fact that I am now using it to monitor, from home and across the Internet, the machine running this blog:

www.muscetta.com Performance in OpsMgr

Or even, with or without OpsMgr 2007 R2, you could write your own scripts to interact with those providers, by using your favourite Scripting Language.

After all, those experiments with Xplat earned me a reputation as a "Unix expert at Microsoft" (this expression still makes me laugh), as I was tweeting here:
Unix expert at Microsoft

But really, I have never hidden my interest for interoperability and the fact that I have been using Linux quite a bit in the past, and still do.

Also, one more related piece of information: the fine people at Xandros have released their Bridgeways Management Packs and at the same time also started their own blog at http://blog.xplatxperts.com/ where they discuss some troubleshooting techniques for the Xplat agent, both similar to what I have been writing about here and also, of course, specific to their own providers, which are in their XSM namespace.

Disclaimer

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I'VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION. The solution presented here IS NOT SUPPORTED by Microsoft.

Using the SCX Agent with WSMan from Powershell v2

Monday, June 1st, 2009

So Powershell v2 adds a nice bunch of Ws-Man related cmdlets. Let’s see how we can use them to interact with OpenPegasus’s WSMan on a SCX Agent.

PS C:\maint> test-wsman -computer virtubuntu.huis.dom -port 1270 -authentication basic -credential (get-credential) -usessl

cmdlet Get-Credential at command pipeline position 1
Supply values for the following parameters:
Credential

image

But we do get this error:

Test-WSMan : The server certificate on the destination computer (virtubuntu.huis.dom:1270) has the following errors:
The SSL certificate could not be checked for revocation. The server used to check for revocation might be unreachable.

The SSL certificate is signed by an unknown certificate authority.
At line:1 char:11
+ test-wsman <<<<  -computer virtubuntu.huis.dom -port 1270 -authentication basic -credential (get-credential) -usessl
+ CategoryInfo          : InvalidOperation: (:) [Test-WSMan], InvalidOperationException
+ FullyQualifiedErrorId : WsManError,Microsoft.WSMan.Management.TestWSManCommand

The credentials above have to be a unix login. Which we typed correctly. But we still can't get thru, as the certificate used by the agent is not trusted by our workstation. This seems to be the “usual” issue I first faced when testing SCX with WINRM in beta1. At the time I simply dismissed it with the following sentence

[…] Of course you have to solve some other things such as DNS resolution AND trusting the self-issued certificates that the agent uses, first. Once you have done that, you can run test queries from the Windows box towards the Unix ones by using WinRM. […]

and I sincerely thought that it would explain pretty well… but eventually a lot of people got confused by this and did not know what to do, especially for the part that goes about trusting the certificate.  Anyway, in the following posts I figured out you could pass the –skipCACheck parameter to WINRM… which solved the issue with having to trust the certificate (which is fine for testing, but I would not use that for automations and scripts running in production… as it might expose your credentials to man-in-the-middle attacks).

So it seems that with the Powershell cmdlets we are back to that issue, as I can’t find a parameter to skip the CA check. Maybe it is there, but with PSv2 not having been released yet, I don't know everything about it, and the CTP documentation is not yet complete. Therefore, back to trusting the certificate.

Trusting the certificate is actually very simple, but it can be a bit tricky when passing those certs back and forth from unix to windows. So let's make the process a bit clearer.

All of the SCX-agents certificates are ultimately signed by a key on the Management server that has discovered them, but I don't currently know where that certificate/key is stored on the management server. Anyway, you can get it from the agent certificate – as you only really need the public key, not the private signing key.

Use WinSCP or any other utility to copy the certificate off one of the agents.
You can find that in the /etc/opt/microsoft/scx/ssl location:

image

that scx-host-computername.pem is your agent certificate.

Copy it to the Management server and change its extension from .pem to .cer. Now Windows will be happy to show it to you with the usual Certificate interface:

image

We need to go to the “Certification Path” tab, select the ISSUER certificate (the one called “SCX-Certificate”):

image

then go to the “Details” tab, and use the “Copy to File” button to export the certificate.

After you have the certificate in a .CER file, you can add it to the “trusted root certification authorities” store on the computer you are running your powershell tests from.

image
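
The same import can also be scripted instead of clicking through the certificate wizard. This is a rough sketch using the .NET X509Store class; the function name Add-TrustedRootCertificate and the example path are made up for illustration, it needs an elevated prompt, and since it changes what the machine trusts it is meant for test boxes only.

```powershell
Function Add-TrustedRootCertificate ([string]$path)
{
    # Load the exported .CER file
    $cert  = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2 $path

    # "Root" in LocalMachine is the "Trusted Root Certification Authorities" store
    $store = New-Object System.Security.Cryptography.X509Certificates.X509Store 'Root', 'LocalMachine'
    $store.Open('ReadWrite')
    try     { $store.Add($cert) }
    finally { $store.Close() }
}

# Usage (hypothetical path to the certificate exported in the previous step):
# Add-TrustedRootCertificate 'C:\maint\SCX-Certificate.cer'
```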

So after you have trusted it, the same command as above actually works now:

PS C:\maint> test-wsman -computer virtubuntu.huis.dom -port 1270 -authentication basic -credential (get-credential) -usessl

cmdlet Get-Credential at command pipeline position 1
Supply values for the following parameters:
Credential

wsmid           : http://schemas.dmtf.org/wbem/wsman/identify/1/wsmanidentity.xsd
lang            :
ProtocolVersion : http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd
ProductVendor   : Microsoft System Center Cross Platform
ProductVersion  : 1.0.4-248

Ok, we can talk to it! Now we can do something more fun, like actually returning instances and/or calling methods:

PS C:\maint> Get-WSManInstance -computer virtubuntu.huis.dom -authentication basic -credential (get-credential) -port 1270 -usessl -enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx

image
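
As for the missing equivalent of -skipCACheck: as far as I know, the released Powershell v2 exposes it through New-WSManSessionOption, whose result can be handed to Get-WSManInstance with -SessionOption. The sketch below wraps that in a function; the name Get-ScxOperatingSystem is made up for illustration, and skipping the certificate checks carries the same man-in-the-middle caveat mentioned above, so keep it to testing.

```powershell
Function Get-ScxOperatingSystem ([string]$computername, $credential)
{
    # For testing only: skips the CA, CN and revocation checks on the agent certificate
    $option = New-WSManSessionOption -SkipCACheck -SkipCNCheck -SkipRevocationCheck

    Get-WSManInstance -ComputerName $computername -Port 1270 -UseSSL `
        -Authentication Basic -Credential $credential -SessionOption $option `
        -Enumerate -ResourceURI 'http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx'
}

# Usage (prompts for a unix login):
# Get-ScxOperatingSystem 'virtubuntu.huis.dom' (Get-Credential)
```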

This is far from exhaustive, but should get you started on a world of possibilities about automating diagnostics and responses with Powershell v2 towards the OpsMgr 2007 R2 Cross-Platform machines. Enjoy!


Installing the OpsMgr 2007 R2 SCX Agent on Ubuntu

Saturday, May 30th, 2009

You know since the beta1 of Xplat I have been busy with modifying the Redhat management pack and monitor CentOS with OpsMgr. Now, CentOS is a distribution that is pretty similar to RedHat, so the RPM package just runs, and it is only a matter of hacking a modified MP.

I never went really further in my experiments, mostly due to lack of time… but then yesterday I got a comment to this older post asking about Ubuntu. Of course I know about Ubuntu, and have been using Debian-based distributions for years. I actually even prefer them over RPM-based distributions such as RedHat or SuSE (personal preference). Heck, even this weblog is running on Debian!

Anyway, I never really tried to see if one of the existing RPM packages for RedHat or SuSE could be modified to run on Ubuntu. I will eventually test this on Debian too, but for now I used Ubuntu, which tends to have slightly newer packages and libraries overall. The machine I tested on is an Ubuntu Server 8.04.2. Older/newer versions might differ slightly.

BEWARE THAT ALL THAT FOLLOWS BELOW IS NOT SUPPORTED BY MICROSOFT. It is only described here for EXPERIMENTAL (==fun) purpose. DO NOT USE THIS IN A PRODUCTION ENVIRONMENT.

So, you are warned. Now let’s hack it.

The first thing to do is to copy the RedHat agent’s RPM package from the “usual” path on your OpsMgr2007 R2 server, “C:\Program Files\System Center Operations Manager 2007\AgentManagement\UnixAgents”. Let’s grab the RHEL5 agent, which is called scx-1.0.4-248.rhel.5.x86.rpm in R2 RTM.

First we need to CONVERT the RPM package to the DEB package format used by Ubuntu, by using the ALIEN package:

sudo apt-get update
sudo apt-get install alien
sudo bash
alien -k scx-1.0.4-248.rhel.5.x86.rpm --scripts
dpkg -i scx_1.0.4-248_i386.deb

image

The converted package will install… but the script execution will fail in a few places – most notably in the generation of the certificate, as it is not able to locate the right openssl libraries, as shown in the screenshot above.

If the libssl.so.6 file cannot be found, you might be missing the “libssl-dev” package, which you can install as follows:

apt-get install libssl-dev

But even if it is installed, the files will still appear to be missing. They are actually there, but on Ubuntu they have different names than on RedHat, that’s all. You can therefore create symbolic links with the “right” names, so that the files are aliased and get found afterwards:

cd /usr/lib
ln -s libcrypto.so.0.9.8 libcrypto.so.6
ln -s libssl.so.0.9.8 libssl.so.6
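If you want to repeat the aliasing trick more carefully, the two commands above can be wrapped in a small helper. This is only a sketch: the function name link_compat_lib is hypothetical, and taking the library directory as a parameter is my own choice so that it can be tried in a scratch directory before touching /usr/lib.

```shell
#!/bin/sh
# Hypothetical helper (not part of the SCX agent): create a compatibility
# symlink only when the real library exists and the alias is not there yet.
link_compat_lib() {
    local libdir="$1" real="$2" alias="$3"

    if [ ! -e "$libdir/$real" ]; then
        echo "missing: $libdir/$real" >&2
        return 1
    fi
    if [ -e "$libdir/$alias" ] || [ -L "$libdir/$alias" ]; then
        echo "already present, leaving alone: $libdir/$alias"
        return 0
    fi
    # Relative link, same result as "cd $libdir; ln -s $real $alias".
    ln -s "$real" "$libdir/$alias"
    echo "linked: $libdir/$alias -> $real"
}
```

As root you would then call it as link_compat_lib /usr/lib libcrypto.so.0.9.8 libcrypto.so.6 and again for libssl.so.0.9.8 and libssl.so.6.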

So now when installing the package, the certificate generation will work:

image

You are nearly ready to go. You have to start the service by using the init script directly: the “service” command is RedHat-specific and will fail here.

/etc/init.d/scx-cimd start is the “standard” way of starting daemons from init on Unix.

But it still fails, as it seems that the init script provided in the RedHat package is really searching for a file called “functions” which is present on RedHat and on CentOS, which provides re-usable functions for startup scripts to include:

image

How do you fix this? I just copied the /etc/init.d/functions file from a CentOS box to my Ubuntu box.

I copied it via SCP from the CentOS box I have:

cd /etc/init.d

scp root@centos.huis.dom:/etc/init.d/functions .

You can probably also find and fetch the file from the Internet (both CentOS and RedHat should have accessible repositories with all the files in their distributions, since it is open sourced).
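Before copying files around, it can help to see which files an init script tries to include and which of them are absent. The following is a rough sketch, not something shipped with the agent: the function name missing_sourced_files is hypothetical, and it only recognises simple lines of the form ". /path" or "source /path" with an absolute path.

```shell
#!/bin/sh
# Hypothetical helper: print every absolute path that a script sources
# (". /path" or "source /path") but that does not exist on this machine.
missing_sourced_files() {
    local script="$1" path

    sed -n -E 's#^[[:space:]]*(\.|source)[[:space:]]+(/[^[:space:];]+).*#\2#p' "$script" |
    while read -r path; do
        [ -e "$path" ] || echo "$path"
    done
}
```

Running missing_sourced_files /etc/init.d/scx-cimd on the Ubuntu box should then list the “functions” file if it is the only include that cannot be found.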

After you have the file in place, the init script will be able to include it, will find the functions it needs, and the daemon/service will now start (even if with minor errors I have not investigated for now, but that don’t seem to be causing troubles):

image

and here you can see it is finally running:

image

So let’s try to issue a few queries as shown in a previous post:

image

IT WORKS!!!

But… there is a “but”: not all classes actually return instances and values just yet. Most notably the “SCX_OperatingSystem” class does not seem to return anything right away. That is a very important class, because it is the one we would use to first discover the Operating System object in the Management Packs. So we need to fix it. The reason why the class does not return anything is that the SCX provider looks into the /etc/redhat-release file to determine what OS version/distribution the machine is running, and that file is obviously not there on Ubuntu.

Most Linux distributions have a similar file, called /etc/issue, which we can copy under the other name to trick the provider into working:

cd /etc

cp issue redhat-release
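A slightly more defensive variant of the same trick is sketched below. The function name fake_redhat_release is hypothetical, and the optional directory argument exists only so the helper can be tried outside /etc; it refuses to overwrite an existing redhat-release file.

```shell
#!/bin/sh
# Hypothetical helper: copy "issue" to "redhat-release" in the given
# directory (default /etc), without overwriting an existing file.
fake_redhat_release() {
    local etc="${1:-/etc}"

    if [ ! -r "$etc/issue" ]; then
        echo "no readable $etc/issue to copy" >&2
        return 1
    fi
    if [ -e "$etc/redhat-release" ]; then
        echo "$etc/redhat-release already exists, not overwriting"
        return 0
    fi
    cp "$etc/issue" "$etc/redhat-release"
}
```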

And NOW, the SCX_OperatingSystem Class also returns an instance:

image

The next step would be “cooking” an MP to discover Ubuntu. More on this on a later post (maybe). I did not test all classes and their implementation… you can try to poke at them by following the instructions and commands on my previous post here. But this should get you started.


Cross Platform in OpsMgr 2007 R2 Release Candidate

Friday, March 27th, 2009

You have heard it all over the place, System Center Operations Manager 2007 R2 has reached the Release Candidate milestone and the RC bits have been made available on connect.microsoft.com.

As it is becoming a tradition for me with each new release, I want to take a look at the Unix Monitoring stuff like I did since beta1 of Xplat, passing thru beta2. I am an integration freak and I have always insisted that interoperability is key. I will leave the most obvious “release notes” kind of things out of here, such as saying that there are now agents for the x64 version of linux distro’s, and so on…. you can read this stuff in the release notes already and in a zillion of other places.

Let’s instead look at my first impression ( = I am amazed: this product is really getting awesome) and let’s do a bit of digging, mostly to note what changed since my previous posts on Xplat (which, by the way, is the MOST visited post on this blog I ever published) – of course there is A LOT more that has changed under the hood… but those are code changes, improvements, polishing of the product itself… while that would be interesting from a code perspective, here I am more interested in what the final user (the System Administrator) will ultimately interact with directly, and what he might need to troubleshoot and understand how the pieces fit together to realize Unix Monitoring in OpsMgr.

After having hacked the RedHat MP to work on my CentOS box (as usual), I started to take a look at what is installed on the Linux box. Here are the new services:

ps -Af | grep scx

You will notice the daemons have changed names and get launched with new parameters.

Of course when you see who uses port 1270 everything becomes clearer:

netstat -anp | grep 1270

Therefore I can place the two new names and understand that SCXCIMSERVER is the WSMAN implementation, while SCXCIMPROVAGT is the CIM/WBEM implementation.
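If you would rather have the listening process picked out for you than read the whole netstat listing, a small filter like the one below can do it. This is a sketch: the function name listeners_on_port is hypothetical, and it assumes the usual Linux column layout of netstat -anp for TCP sockets (local address in column 4, state in column 6, PID/program in column 7).

```shell
#!/bin/sh
# Hypothetical filter: read "netstat -anp" output on stdin and print the
# PID/program of every socket LISTENing on the given port.
listeners_on_port() {
    local port="$1"

    awk -v port="$port" '$6 == "LISTEN" && $4 ~ (":" port "$") { print $7 }'
}
```

Usage would be netstat -anp | listeners_on_port 1270, run as root so that the PID/program column is filled in.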

There is one more difference at the “service” (or “daemon”) level: the fact that there is only ONE init script now: /etc/init.d/scx-cimd

/etc/init.d/scx-cimd

So basically the SCX “Agent” will start and stop as a single thing, even if it is composed of multiple executables that will spawn various processes.

Another difference: if we look in “familiar” locations like /etc/opt/microsoft/scx/conf we see that a number of configuration files are either empty (0 bytes) or missing (like the one described on Anders’ blog to enable verbose logging of WSMan requests), when compared to earlier versions:

/etc/opt/microsoft/scx/conf

But that is because I have been told we now have a nice new tool called scxadmin under /opt/microsoft/scx/bin/tools/ , which will let you configure those things:

/opt/microsoft/scx/bin/tools/scxadmin

Therefore you would enable VERBOSE logging for all components by issuing the command

./scxadmin -log-set all verbose

and you will bring it back to a less noisy setting of logging only errors with

./scxadmin -log-set all errors

the logs will be written under /var/opt/microsoft/scx/log just like they did before.
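When troubleshooting it is easy to forget to turn verbose logging off again, so the two commands above can be paired around whatever you want to test. This is a sketch only: the function name with_verbose_logging and the SCXADMIN variable are my own, and the default path simply combines the tools directory and tool name mentioned above.

```shell
#!/bin/sh
# Path to the tool; override SCXADMIN if it lives elsewhere (assumption).
SCXADMIN="${SCXADMIN:-/opt/microsoft/scx/bin/tools/scxadmin}"

# Hypothetical wrapper: switch all components to verbose logging, run the
# given command, then go back to logging errors only.
with_verbose_logging() {
    local rc=0

    "$SCXADMIN" -log-set all verbose || return 1
    "$@" || rc=$?
    "$SCXADMIN" -log-set all errors
    return "$rc"
}
```

For example, with_verbose_logging sleep 60 would give you a minute of verbose logs under /var/opt/microsoft/scx/log while you reproduce a problem from the Management Server.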

Other than this, a lot of the troubleshooting techniques I showed in one of my previous posts, like how to query CIM classes directly or thru WSMAN remotely by using winrm – they should really stay the same. I will mention them again here for reference.

SCXCIMCLI is a useful and simple tool used to query CIM directly. You can roughly compare it to wbemtest.exe in the Windows world (other than not having a UI). This utility can also be found in /opt/microsoft/scx/bin/tools.

A couple of examples of the most common/useful things you would do with scxcimcli:

1) Enumerate all Classes whose name contains “SCX_” in the root/scx namespace (the classes our Management packs use):

./scxcimcli nc -n root/scx -di |grep SCX_ | sort

2) Execute a Query

./scxcimcli xq "select * from SCX_OperatingSystem" -n root/scx
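The two examples can be combined to get a quick overview of which SCX_ classes return anything at all. The sketch below makes some assumptions: the function names and the SCXCIMCLI variable are hypothetical, the class listing is assumed to print one class name per line, and “returned data” only means that the query printed something, not that the output was inspected.

```shell
#!/bin/sh
# Path to the tool; override SCXCIMCLI if it lives elsewhere (assumption).
SCXCIMCLI="${SCXCIMCLI:-/opt/microsoft/scx/bin/tools/scxcimcli}"

# List the SCX_ classes in root/scx, exactly as in example 1.
scx_list_classes() {
    "$SCXCIMCLI" nc -n root/scx -di | grep SCX_ | sort
}

# Hypothetical helper: run the query from example 2 against every class
# and report whether it produced any output.
scx_probe_classes() {
    local class

    scx_list_classes | while read -r class; do
        if [ -n "$("$SCXCIMCLI" xq "select * from $class" -n root/scx 2>/dev/null)" ]; then
            echo "$class: returned data"
        else
            echo "$class: no output"
        fi
    done
}
```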

Also another thing that you might want to test when troubleshooting discoveries, is running the same queries through WS-Man (possibly from the same Management Server that will or should be managing that unix box). I already showed this in the past, it is the following command:

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://linuxbox.mydomain.com:1270/wsman -auth:basic -skipCACheck

but if you launch it that way it will now return an error like the following (or at least it did in my test lab):

Fault
    Code
        Value = SOAP-ENV:Sender
        Subcode
            Value = wsman:EncodingLimit
    Reason
        Text = UTF-16 is not supported; Please use UTF-8
    Detail
        FaultDetail = http://schemas.dmtf.org/wbem/wsman/1/wsman/faultDetail/CharacterSet

Error number:  -2144108468 0x8033804C
The WS-Management service does not support the character set used in the request. Change the request to use UTF-8 or UTF-16.

the error message is pretty self explanatory: you need to specify the UTF-8 Character set. You can do it by adding the “-encoding” qualifier:

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://linuxbox.mydomain.com:1270/wsman -auth:basic -skipCACheck -encoding:UTF-8
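The long resource URI in the command above is the part that is easiest to mistype. Below is a small sketch that builds it from a class name. The function name scx_resource_uri is hypothetical, and it is an assumption on my side that the other SCX_ classes are addressed with the same pattern as SCX_OperatingSystem.

```shell
#!/bin/sh
# Hypothetical helper: build the WS-Man resource URI for a CIM class,
# following the pattern used for SCX_OperatingSystem above.
scx_resource_uri() {
    local class="$1" namespace="${2:-root/scx}"

    printf 'http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/%s?__cimnamespace=%s\n' \
        "$class" "$namespace"
}
```

Calling scx_resource_uri SCX_OperatingSystem prints the same URI that is passed to winrm enumerate above.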

Hope the above is useful to figure out the differences between the earlier beta releases of the System Center CrossPlatform extensions and the version built in OpsMgr 2007 R2 Release Candidate.

There are obviously a million of other things in R2 worth writing about (either related to the Unix monitoring or to everything else) and I am sure posts will start to appear on the many, more active, blogs out there (they have already started appearing, actually). I have not had time to dig further, but will likely do so AFTER Easter – as the next couple of weeks I will be travelling, working some of the time (but without my test environment and good connectivity) AND visiting relatives the rest of the time.

One last thing I noticed about the Unix/Cross Platform Management Packs in R2 Release Candidate… their current “release date” exposed by the MP Catalog Web Service is the 20th of March

image

…which happens to be my Birthday – therefore they must be a present for me! :-)


Early Adoptions, Health Checks and New Year Rants.

Tuesday, December 30th, 2008

Generations

Two days ago I read the following Tweet by Hugh MacLeod:

"[...] Early Adopter Problem: How to differentiate from the bandwagon, once the bandwagon starts moving faster than you are [...]"

That makes me think of early adoption of a few technologies I have been working with, and how the community around those evolved. For example:

Operations Manager… early adoption meant that I have been working with it since the beta, and had posted one of the earliest posts about how to use a script in a Unit Monitor back in May 2007 (the product was released in April 2007 and there was NO documentation back then, so we had to really try to figure out everything…), but someone seems to think it is worth repeating the very same lesson in November 2008, with not a lot of changes, as I wrote here. I don't mean to be rude to Anders… repeating things will surely help the late adopters find the information they need, of course.

Also, I started playing early with Powershell. I posted my first (and only) cmdlet back in 2006. It was not a lot more than a test for myself to learn how to write one, but that's just to say that I started playing early with it. I have been using it to automate tasks for example.

Going back to the quote above, everyone gets on the bandwagon posting examples and articles. I had been asked a few times about writing articles on OpsMgr and Powershell usage (for example by www.powershell.it) but I declined, as I was too busy using this knowledge to do stuff for work (where “work” is defined as in “work that pays your mortgage”), rather than seeking personal prestige through articles and blogs. Anyway, that kind of article is now appearing all over the Internet and the blogosphere. The above examples made me think of early adoption, and the bandwagon that follows later on… but even as an early adopter, I was never very noisy or visible.

Now, going back to what I do for work (which I mentioned here and here in the past), I work in the Premier Field Engineering organization of Microsoft Services, which provides Premier services to customers. Microsoft Premier customers have a wide range of Premier agreement features and components that they can use to support their people, improve their processes, and improve the productive use of the Microsoft technology they have purchased. Some of the services we provide are known to the world as “Health Checks”, some as “Risk Assessment Programs” (or RAPs for short). These are basically services where one of our technology experts goes to the customer site and uses a custom, private Microsoft tool to gather a huge amount of data from the product we mean to look at (be it SQL, Exchange, AD or anything else…). The Health Check or RAP tool collects the data and outputs a draft of the report that will be delivered to the customer later on, with all the right sections and chapters. This is done so that every report of the same kind will look consistent, even if the engagement is performed by a different engineer in a different part of the world. The engineer will of course analyze the collected data and write recommendations about what is configured properly and/or about what could or should be changed and/or improved in the implementation to make it adhere to Best Practices. To make sure only the right people actually go onsite to do this job we have a strict internal accreditation process that must be followed; only accredited resources that know the product well enough and know exactly how to interpret the data that the tool collects are allowed to use it, deliver the engagement, and present/write the findings to the customer.

So why am I telling you this here, and what have I been using my early knowledge of OpsMgr and Powershell for?

I have used that to write the Operations Manager Health Check, of course!

We had a MOM 2005 Health Check already, but since the technology has changed so much from MOM to OpsMgr, we had to write a completely new tool. Jeff (the original MOM2005 author, who does not have a blog that I can link to) and I are the main coders of this tool… and the tool itself is A POWERSHELL script. A longish one, of course (7000 lines, more or less), but nothing more than a Powershell script, at the end of the day. There are a few more colleagues that helped shape the features and tested the tool, including Kevin Holman. Some of the database queries on Kevin’s blog are in fact what we use to extract some of the data (beware that some of those queries have recently been updated, in case you saved them and are using your local copy!), while other information is extracted with internal and/or custom queries. Some other times we use OpsMgr cmdlets or go to the SDK service, but a lot of times we query the database directly (we really should use the SDK all the time, but for certain stuff direct database access is way faster). It took most of the past year to write it, test it, troubleshoot it, fix it, and deliver the first engagements as “beta” to some customers to help iron out the process… and now the delivery is available! If a year seems like a long time, you have to consider this is all work that gets done next to what we all normally have to do with customers, not replacing it (i.e. I am not free to sit on my butt all day and just write the tool… I still have to deliver services to customers day in day out, in the meantime).

Occasionally, during this past calendar year, that is approaching its end, I have been willing and have found some extra time to disclose some bits and pieces, techniques and prototypes of how to use Powershell and OpsMgr together, such as innovative ways to use Powershell in OpsMgr against beta features, but in general most of my early adopter’s investment went into the private tool for this engagement, and that is one of the reasons I couldn’t blog or write much about it, being it Microsoft Intellectual Property.

But it is also true that I did not care to write other stuff when I considered it too easy or it could be found in the documentation. I like writing of ideas, thoughts, rants OR things that I discover and that are not well documented at the time I study them… so when I figure out things I might like leaving a trail for some to follow. But I am not here to spoon feed people like some in the bandwagon are doing. Now the bandwagon is busy blogging and writing continuously about some aspect of OpsMgr (known or unknown, documented or not), and the answer to the original question of Hugh is, in my opinion, that it does not really matter what the bandwagon is doing right now. I was never here to do the same thing. I think that is my differentiator. I am not saying that what a bunch of colleagues and enthusiasts is doing is not useful: blogging and writing about various things they experiment with is interesting and it will be useful to people. But blogs are useful until a certain limit. I think that blogs are best suited for conversations and thoughts (rather than for "howto's"), and what I would love to see instead is: less marketing hype when new versions are announced and more real, official documentation.

But I think I should stop caring about what the bandwagon is doing, because that's just another ego trip at the end of the day. What I should more sensibly do, would be listening to my horoscope instead:

[…] "How do you slay the dragon?" journalist Bill Moyers asked mythologist Joseph Campbell in an interview. By "dragon," he was referring to the dangerous beast that symbolizes the most unripe and uncontrollable part of each of our lives. In reply to Moyers, Campbell didn't suggest that you become a master warrior, nor did he recommend that you cultivate high levels of sleek, savage anger. "Follow your bliss," he said simply. Personally, I don't know if that's enough to slay the dragon — I'm inclined to believe that you also have to take some defensive measures — but it's definitely worth an extended experiment. Would you consider trying that in 2009? […]

Programmatically Check for Management Pack updates in OpsMgr 2007 R2

Saturday, November 29th, 2008

One of the cool new features of System Center Operations Manager 2007 R2 is the possibility to check and update Management Packs from the catalog on the Internet directly from the Operators Console:

Select Management Packs from Catalog

Even if the backend for this feature is not yet documented, I was extremely curious to see how this had actually been implemented. Especially since it took a while to have this feature available for OpsMgr, I had the suspicion that it could not be as simple as one downloadable XML file, like the old MOM2005's MPNotifier had been using in the past.

Therefore I observed the console's traffic through the lens of my proxy, and got my answer:

ISA Server Log

So that was it: a .Net Web Service.

I tried to ask the web service itself for discovery information, but failed:

WSDL

Since there is no WSDL available, but I badly wanted to interact with it, I had to figure out: what kind of requests would be allowed to it, how should they be written, what methods could they call and what parameters should I pass in the call. In order to get started on this, I thought I could just observe its network traffic. And so I did… I fired up Network Monitor and captured the traffic:

Microsoft Network Monitor 3.2

Microsoft Network Monitor is beautiful and useful for this kind of stuff, as it lets you easily identify which application a given stream of traffic belongs to, just like in the picture above. After I had isolated just the traffic from the Operations Console, I saved those captured packets in CAP format and opened the file in Wireshark for a different kind of analysis, "Follow TCP Stream":

Wireshark: Follow TCP Stream

This showed me the reassembled conversation, and what kind of request was actually done to the Web Service. That was the information I needed.

Ready to rock at this point, I came up with this Powershell script (to be run in OpsMgr Command Shell) that will:

1) connect to the web service and retrieve the complete MP list for R2 (this part is also useful on its own, as it shows how to interact with a SOAP web service in Powershell, invoking a method of the web service by issuing a specially crafted POST request. To give due credit, for this part I first looked at this PERL code, which I then adapted and ported to Powershell);

2) loop through the results of the "Get-ManagementPack" opsmgr cmdlet and compare each MP found in the Management Group with those pulled from the catalog;

3) display a table of all imported MPs with both the version imported in your Management Group AND the version available on the catalog:

Script output in OpsMgr Command Shell
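The comparison in steps 2 and 3 boils down to comparing dotted version numbers field by field. The following is not the Powershell script described above, just a plain shell sketch of that comparison logic: the function names version_lt and report_updates are hypothetical, and the input is assumed to be one "name installed-version catalog-version" triple per line.

```shell
#!/bin/sh
# Hypothetical helper: succeed when dotted version $1 is lower than $2.
# Missing trailing fields count as 0, so "1.0" is lower than "1.0.1".
version_lt() {
    local a="$1" b="$2" x y

    while [ -n "$a" ] || [ -n "$b" ]; do
        x="${a%%.*}"; y="${b%%.*}"          # first remaining field of each
        if [ "${x:-0}" -lt "${y:-0}" ]; then return 0; fi
        if [ "${x:-0}" -gt "${y:-0}" ]; then return 1; fi
        # Drop the field that was just compared.
        case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
        case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
    done
    return 1
}

# Read "name installed catalog" lines on stdin and print only the entries
# for which the catalog version is newer than the installed one.
report_updates() {
    local name installed catalog

    while read -r name installed catalog; do
        if version_lt "$installed" "$catalog"; then
            echo "$name: $installed -> $catalog"
        fi
    done
}
```

Note that the fields are compared as numbers, not as text, which is why 6.0.6278.9 correctly counts as older than 6.0.6278.10.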

Remember that this is just SAMPLE code: it is not meant to be used in a production environment, and it is worth mentioning again that OpsMgr2007 R2 is BETA software at the time of writing, therefore this functionality (and its implementation) might change at any time, and the script will break. Also, at present, the MP Catalog web service still returns slightly older MP versions and is not yet kept in sync with MP releases, but it will be ready and with complete/updated content by the time R2 gets released.


CentOS discovery in OpsMgr2007 R2 beta

Sunday, November 23rd, 2008

Here we go again. Now that the OpsMgr2007 R2 beta is out, with an improved and revamped version of the System Center Cross Platform Extensions, I faced the issue of how to upgrade my test lab.

I have to say that OpsMgr2007 R2 beta release notes explain the known issues, and I had no trouble whatsoever upgrading the windows part. It just took its time (I am running virtual machines in my test lab, that don't have the best performance), but it went smoothly and without a glitch. In a couple of hours I had everything upgraded: databases, RMS, reporting, agents, gateway. All right then. The new purple icons in System Center look cute, and the new UI has some great stuff, such as a long-awaited way to update your management packs directly from the Internet, better display of Overrides (kind of what we used to rely on Override Explorer for)… and  A LOT more new stuff that I won't be wasting my Sunday writing about since everybody else has already done it two days ago:

opsmgr aggregated feed on Twitter

Therefore let's get back to my upgrade, which is a lot more interesting (to me) than the marketing tam-tam :-)

As part of the upgrade to R2, I had to first uninstall the Xplat beta refresh bits, which I had installed, including all Unix Management Packs. Including my CentOS Management Pack I had improvised.

So this is the new start page of the integrated Discovery Wizard:

Discovery Wizard

Looks nice and integrates the functionality of discovering and deploying Windows machines, SNMP Devices, and Unix/Linux machines.

Of course, my CentOS machine would not be discovered, and showed up as an unsupported platform. Of course my old Management Pack I had hacked together in XPlat Beta 1 did not work anymore. Therefore, I figured out I had to see what changes were there, and how to make it work again (of course it IS possible – It is NOT SUPPORTED, but I don't care, as long as it works).

Since the existing agent could not be discovered, the first step I took was logging on to the Linux box, uninstalling the old agent, and installing the new one:

XPlat Agent RPM Install on CentOS

There I tried to discover again, but of course it still failed.

At that point I started taking a look at the new layout of things on the unix side. Most stuff is located in the same directories where beta1 was installed, and there are a bunch of useful commands under /opt/microsoft/scx/bin/tools.
You can check out the Open Pegasus version used:

[root@centos tools]# ./scxcimconfig --version
Version 2.7.0

Let's take a look at what SCX classes we have available:

./scxcimcli nc -n root/scx -di |grep SCX | sort

Nice. That's the stuff we will be querying over WS-Man from the Management Server.

So let's look at the OS Discovery, and we test it from the OpsMgr 2007 box:

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://centos:1270/wsman -auth:basic -skipCACheck

it returns results:

OS WS-Man Query

At first I assumed this worked like in Beta1, therefore I exported the RedHat management pack and made my own version of it, replacing the strings it expects to find so that it discovers CentOS instead of RedHat.

While the MP was syntactically correct and would import fine, the Discovery wizard still didn't work.

I took one more look at the discoveries in the MP, and I found there are two more, targeted to Management Server, which is probably what gets used by the Discovery Wizard to understand what kind of agent kit needs to be deployed.

MP XML - Discoveries

So basically this discovery checks for the returned value from the module to determine if the discovered platform is a supported one:

Discovery Settings

But how does the module get its data?

Look at the layout of the /AgentManagement/UnixAgents folder on the Management Server:

/AgentManagement/unixAgents

That's it: GetOSVersion.sh – a shell script. A nice, open, clear text, hackable shell script. Let's take a look at it:

Discovery Script Hack

So that's it, and that is what my modification looks like. What happens during the discovery wizard is that we probably copy the script over SCP to the box, execute it, look at a number of things, and return the discovery data we need.

If you do those steps manually, you see how the script returns something very similar to a PropertyBag, just like discoveries done by VBScript on Windows machines:

Discovery Script Output

So after modifying the script… here we go. The Wizard now thinks CentOS is Red Hat, and can install an agent on it:

Discovery Wizard

Deploying Agent

Only when the Management Server discovery finally considers the CentOS machine worth managing do the other discoveries that use WS-Man queries start kicking in, like the old one did, and find the OS objects and all the other hosted objects. In order for this to work you don't only need to hack the shell script, but also to have a hacked MP: the "regular" Red Hat one won't find CentOS, which is and remains an UNSUPPORTED platform.

CentOS Health Model
