Cross Platform in OpsMgr 2007 R2 Release Candidate

You have heard it all over the place, System Center Operations Manager 2007 R2 has reached the Release Candidate milestone and the RC bits have been made available on connect.microsoft.com.

As it is becoming a tradition for me with each new release, I want to take a look at the Unix Monitoring stuff like I did since beta1 of Xplat, passing thru beta2. I am an integration freak and I have always insisted that interoperability is key. I will leave the most obvious “release notes” kind of things out of here, such as saying that there are now agents for the x64 version of linux distro’s, and so on…. you can read this stuff in the release notes already and in a zillion of other places.

Let’s instead look at my first impression ( = I am amazed: this product is really getting awesome) and let’s do a bit of digging, mostly to note what changed since my previous posts on Xplat (which, by the way, is the MOST visited post on this blog I ever published) – of course there is A LOT more that has changed under the hood… but those are code changes, improvements, polishing of the product itself… while that would be interesting from a code perspective, here I am more interested in what the final user (the System Administrator) will ultimately interact with directly, and what he might need to troubleshoot and understand how the pieces fit together to realize Unix Monitoring in OpsMgr.

After having hacked the RedHat MP to work on my CentOS box (as usual), I started to take a look at what is installed on the Linux box. Here are the new services:

ps -Af | grep scx

You will notice the daemons have changed names and get launched with new parameters.

Of course when you see who uses port 1270 everything becomes clearer:

netstat -anp | grep 1270

Therefore I can place the two new names and understand that SCXCIMSERVER is the WSMAN implementation, while SCXCIMPROVAGT is the CIM/WBEM implementation.

There is one more difference at the “service” (or “daemon”) level: the fact that there is only ONE init script now: /etc/init.d/scx-cimd

/etc/init.d/scx-cimd

So basically the SCX “Agent” will start and stop as a single thing, even if it is composed of multiple executables that will spawn various processes.

Another difference: if we look in “familiar” locations like /etc/opt/microsoft/scx/bin/tools/ we see that a number of configuration files is either empty (0 bytes) or missing (like the one described on Ander’s blog to enable verbose logging of WSMan requests), when compared to earlier versions:

/etc/opt/microsoft/scx/conf

But that is because I have been told we now have a nice new tool called scxadmin under /opt/microsoft/scx/bin/tools/ , which will let you configure those things:

/opt/microsoft/scx/bin/tools/scxadmin

Therefore you would enable VERBOSE logging for all components by issuing the command

./scxadmin -log-set all verbose

and you will bring it back to a less noisy setting of logging only errors with

./scxadmin -log-set all errors

the logs will be written under /var/opt/microsoft/scx/log just like they did before.

Other than this, a lot of the troubleshooting techniques I showed in one of my previous posts, like how to query CIM classes directly or thru WSMAN remotely by using winrm – they should really stay the same. I will mention them again here for reference.

SCXCIMCLI is a useful and simple tool used to query CIM directly. You can roughly compare it to wbemtest.exe in the WIndows world (other than not having a UI). This utility can also be found in /opt/microsoft/scx/bin/tools

A couple of examples of the most common/useful things you would do with scxcimcli:

1) Enumerate all Classes whose name contains “SCX_” in the root/scx namespace (the classes our Management packs use):

./scxcimcli nc -n root/scx -di |grep SCX_ | sort

./scxcimcli nc -n root/scx -di |grep SCX | sort

2) Execute a Query

./scxcimcli xq “select * from SCX_OperatingSystem” -n root/scx

./scxcimcli xq "select * from SCX_OperatingSystem" -n root/scx

Also another thing that you might want to test when troubleshooting discoveries, is running the same queries through WS-Man (possibly from the same Management Server that will or should be managing that unix box). I already showed this in the past, it is the following command:

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://linuxbox.mydomain.com:1270/wsman -auth:basic –skipCACheck

but if you launch it that way it will now return an error like the following (or at least it did in my test lab):

Fault
Code
Value = SOAP-ENV:Sender
Subcode
Value = wsman:EncodingLimit
Reason
Text = UTF-16 is not supported; Please use UTF-8
Detail
FaultDetail = http://schemas.dmtf.org/wbem/wsman/1/wsman/faultDetail/CharacterSet

Error number:  -2144108468 0x8033804C
The WS-Management service does not support the character set used in the request
. Change the request to use UTF-8 or UTF-16.

the error message is pretty self explanatory: you need to specify the UTF-8 Character set. You can do it by adding the “-encoding” qualifier:

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://linuxbox.mydomain.com:1270/wsman -auth:basic –skipCACheck –encoding:UTF-8

Hope the above is useful to figure out the differences between the earlier beta releases of the System Center CrossPlatform extensions and the version built in OpsMgr 2007 R2 Release Candidate.

There are obviously a million of other things in R2 worth writing about (either related to the Unix monitoring or to everything else) and I am sure posts will start to appear on the many, more active, blogs out there (they have already started appearing, actually). I have not had time to dig further, but will likely do so AFTER Easter – as the next couple of weeks I will be travelling, working some of the time (but without my test environment and good connectivity) AND visiting relatives the rest of the time.

One last thing I noticed about the Unix/Cross Platform Management Packs in R2 Release Candidate… their current “release date” exposed by the MP Catalog Web Service is the 20th of March

image

…which happens to be my Birthday – therefore they must be a present for me! 🙂

Disclaimer

The information in this weblog is provided “AS IS” with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided “AS IS” without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I’VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION. The solution presented here IS NOT SUPPORTED by Microsoft.

CentOS discovery in OpsMgr2007 R2 beta

Here we go again. Now that the OpsMgr2007 R2 beta is out, with an improved and revamped version of the System Center Cross Platform Extensions, I faced the issue of how to upgrade my test lab.

I have to say that OpsMgr2007 R2 beta release notes explain the known issues, and I had no trouble whatsoever upgrading the windows part. It just took its time (I am running virtual machines in my test lab, that don’t have the best performance), but it went smoothly and without a glitch. In a couple of hours I had everything upgraded: databases, RMS, reporting, agents, gateway. All right then. The new purple icons in System Center look cute, and the new UI has some great stuff, such as a long-awaited way to update your management packs directly from the Internet, better display of Overrides (kind of what we used to rely on Override Explorer for)… and  A LOT more new stuff that I won’t be wasting my Sunday writing about since everybody else has already done it two days ago:

opsmgr aggregated feed on Twitter

Therefore let’s get back to my upgrade, which is a lot more interesting (to me) than the marketing tam-tam 🙂

As part of the upgrade to R2, I had to first uninstall the Xplat beta refresh bits, which I had installed, including all Unix Management Packs. Including my CentOS Management Pack I had improvised.

So this is the new start page of the integrated Discovery Wizard:

Discovery Wizard

Looks nice and integrates the functionality of discovering and deploying Windows machines, SNMP Devices, and Unix/Linux machines.

Of course, my CentOS machine would not be discovered, and showed up as an unsupported platform. Of course my old Management Pack I had hacked together in XPlat Beta 1 did not work anymore. Therefore, I figured out I had to see what changes were there, and how to make it work again (of course it IS possible – It is NOT SUPPORTED, but I don’t care, as long as it works).

Since the existing agent could not be discovered, the first step I took was logging on the Linux box, un-install the old agent, and install the new one:

XPlat Agent RPM Install on CentOS

There I tried to discover again, but of course it still failed.

At that point I started taking a look at the new layout of things on the unix side. Most stuff is located in the same directories where beta1 was installed, and there are a bunch of useful commands under /opt/microsoft/scx/bin/tools.
You can check out the Open Pegasus version used:

[root@centos tools]# ./scxcimconfig –version
Version 2.7.0

Let’s take a look at what SCX classes we have available:

./scxcimcli nc -n root/scx -di |grep SCX | sort

./scxcimcli nc -n root/scx -di |grep SCX | sort

Nice. That’s the stuff we will be querying over WS-Man from the Management Server.

So let’s look at the OS Discovery, and we test it from the OpsMgr 2007 box:

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://centos:1270/wsman -auth:basic -skipCACheck

it returns results:

OS WS-Man Query

At first I assumed this worked like in Beta1, therefore I exported RedHat management pack and I made my own version of it, replacing the strings it is expecting to find to discover CentOS instead than Redhat.

While the MP was syntactically correct and would import fine, the Discovery wizard still didn’t work.

I took one more look at the discoveries in the MP, and I found there are two more, targeted to Management Server, which is probably what gets used by the Discovery Wizard to understand what kind of agent kit needs to be deployed.

MP XML - Discoveries

So basically this discovery checks for the returned value from the module to determine if the discovered platform is a supported one:

Discovery Settings

But how does the module get its data?

Look at the layout of the /AgentManagement/UnixAgents folder on the Management Server:

/AgentManagement/unixAgents

That’s it: GetOSVersion.sh – a shell script. A nice, open, clear text, hackable shell script. Let’s take a look at it:

Discovery Script Hack

So that’s it, and how my modification looks like. What happens during the discovery wizard is that we probably copy the script over SCP to the box, execute it, look at a number of things, and return the discovery data we need.

If you do those steps manually, you see how the script returns something very similar to a PropertyBag, just like discoveries done by VBScript on Windows machines:

Discovery Script Output

So after modifying the script… here we go. The Wizard now thinks CentOS is Red Hat, and can install an agent on it:

Discovery Wizard

Deploying Agent

Only when the Management Server discovery finally considers the CentOS machine worth managing, then the other discoveries that use WS-Man queries start kicking in, like the old one did, and find the OS objects and all the other hosted objects. In order for this to work you don’t only need to hack the shell script, but to have a hacked MP – the “regular” Red Har one won’t find CentOS, which is and remains an UNSUPPORTED platform.

CentOS Health Model

Disclaimer

The information in this weblog is provided “AS IS” with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided “AS IS” without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I’VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION. The solution presented here IS NOT SUPPORTED by Microsoft.

Testing System Center Cross Plaform Extentions

I am testing the beta bits of the cross-platform extensions that were released on Microsoft Connect 

This post wants to describe my limited testing so far – I hope this can benefit/help everyone testing the beta for some stuff that might currently not be incredibly clear – unless you attended the MMS class, at least :-))

I started out with the White Paper that has been posted on the web, which describes the architecture pretty well, but from a higher level (with diagrams and the like). Then I downloaded the beta bits, which contain another document about setting the thing up. It is pretty well done, to be honest (especially if you consider that it is beta documentation for a beta product!), but it does not really go all the way down to troubleshooting things a lot, yet. I will try to cover some of that here.

I installed the agent manually – it’s just a RPM package, not much that can go wrong with that. There is a reason why I did not use the push discovery and deployment of the agent, which you will figure out reading later on. Once installed, I tried to figure out how things were looking like on the linux machine. It is all pretty understandable, after all, if you look around on the machine (documented or not, linux and open source stuff is easy to figure out by reading configuration files and the like, and by searching on the web).

Basically the “agent” is not properly an “agent” the way the windows agent is, since it does not really “sends” stuff to the Management Server on its own: It consists of a  couple of services/daemons, based on existing opensource projects, but configured in their own folder, with their own name, and using different ports than a standard install of those,  not to conflict with possible existing ones on those machines.

The Management Service uses these services remotely (similar to doing agentless monitoring towards a windows box) using these services. The two services are:

 scx-services commands

It is easy to figure out how they are layed out. Even if undocumented, you look at the processes

SCX processes

and you can figure out WHERE they live (/opt/microsoft/scx/bin/….) and where their configuration files are located (/etc/opt/microsoft/scx/conf …).

SCX Configuration

The files are self explanatory, and the documentation of the opensource projects can be found on the Internet: 

for wsmand

for cimd

 

I still have to delve into them properly as I would like to, but I already figured out a bunch of interesting things by quickly looking at them.

Agent Communication someone must have decided to “recycle” the 1270 port number that was used in MOM2005 🙂 Basically openwsman listens as a SSL listener (with basic auth – connected via PAM module with the “regular” unix /etc/passwd users, so you can authenticate as those without having to define specific users for the service). So all that happens is that the Management Server asks things/executes WS-Man queries and commands on this channel. The Management Server connects every time to the agent on port 1270 using SSL, authenticates as “root” (or as the specified “Action Account”) and does its stuff, or asks the agent to do it. So the communication is happening from the Management Server to the agent… not the other way around like it happens with Windows “agents”. That’s why it feels to me more like an “agentless” thing, at least for what concerns the “direction” of traffic and who does the actual querying.

For the rest, the provided Management Packs have “normal” discoveries and “normal” monitors. Pretty much like the Windows Management Packs often discover thing by querying WMI, here they use WS-Man to run CIM queries against the Unix boxes.

The Service Model is totally cool to actually *SEE* in action, don’t you think so ?

Service Model

 

A few more debugging/troubleshooting information:

I searched a bit and found the openwsman.org documentation and forum to be useful to figure some things out. For example I banged my head a few times before managing to actually TEST a query from windows to linux using WINRM. This document helped a lot.

Of course you have to solve some other things such as DNS resolution AND trusting the self-issued certificates that the agent uses, first. Once you have done that, you can run test queries from the Windows box towards the Unix ones by using WinRM.

For example, this is how I tested what the discovery for a Linux RedHat Computer type should be returning (I read that by opening the MP in authoring console, as one would usually do for any MP):

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://centos:1270/wsman -auth:basic

If you need to test the query directly *ON* the linux box (querying the CIMD instead than WSMAND), the WBEMEXEC utility is packaged with the agent (under /opt/microsoft/scx/bin/tools ). It is not as easy as some windows administrators (that have used WBEMTEST or WMI Tools in the past) would hope, but not even that bad. Just to run a few queries to the CIM daemon locally it is not really interactive, so you need to create a XML file that looks like the following (basically you build the RAW request the way the CIMD accepts it):

 

 

<?xml version=”1.0″ ?>

<CIM CIMVERSION=”2.0″ DTDVERSION=”2.0″>

<MESSAGE ID=”50000″ PROTOCOLVERSION=”1.0″>

<SIMPLEREQ>

<IMETHODCALL NAME=”EnumerateInstanceNames”>

<LOCALNAMESPACEPATH>

<NAMESPACE NAME=”root”/>

<NAMESPACE NAME=”scx”/>

</LOCALNAMESPACEPATH>

<IPARAMVALUE NAME=”ClassName”>

<CLASSNAME NAME=”SCX_OperatingSystem”/>

</IPARAMVALUE>

</IMETHODCALL>

</SIMPLEREQ>

</MESSAGE>

</CIM>

 

 

Once you have made such a file, you can execute the query in the file with the tool like the following:

./wbemexec -d2 query.xml

 

As you can see from here, CIMD uses HTTP already. This differs from Windows’ WMI that uses RPC/DCOM. In a way, this is much simpler to troubleshoot, and more firewall-friendly.

 

I have not really found an activity or debug log for any of those components, yet… but in the end they are not doing anything ON THEIR OWN, unless asked by the MS…. So the “healthservice” logic is all on the MS anyway. Errors about failed discoveries, permissions of the Action Account user, and anything else will be logged by the HealthService on the Windows machine (the Management Server) that is actually performing monitoring towards the Unix box.

It really is *just* getting the WMI and WinRM-equivalent layer on linux/Unix up and running– after that, everything is done from windows anyway!

After this common management infrastructure has been provided, 3rd parties will be facilitated in writing *just* MPs, without having to worry about the TRANSPORT of information anymore.

 

As you have probably noticed from the screenshots and commandlines, I don’t have a “real” Redhat Enterprise Linux or “supported” linux distribution… Therefore I started my testing using CentOS 5 (which is very similar to RHEL 5) – the agent installed fine as you can see, but I was not getting anything really “discovered” – the MP had only found a “linux computer” but was not finding any “RedHat” or “SuSe” or any other “Operating System” instances… and if you are somewhat familiar with the way Operations Manager targeting works, you would understand that monitors are targeted at object classes. If I don’t have any instance of those objects being discovered, NO MONITORING actually happens, even if the infrastructure is in place and the pieces are talking to each other:

 CentOS not discovered

Therefore my machine was not being monitored.

In the end, I actually even got it to work, but I had to create a new Management Pack (exporting and modifying the RHEL5 one as a base) that would actually search for different Property values and discover CentOS instead as if it were RedHat:

CentOS Discovered 

After importing my hacked Management Pack the machine started to be monitored. Here you can see Health Explorer in all of its glory:

image008

Of course this is a hack I made just to have a test setup somewhat working and to familiarize myself with the SCX components. It is not guaranteed that my Management pack actually works on CentOS the way it is supposed to work and that there aren’t other – more subtle – differences between RedHat and CentOS that will make it fail. I only modified a couple of Discoveries to let it discover the “Operating System” instance… everything else should follow, but not necessarily. One difference you see already in the screenshot above is that I am not yet seeing the hardware being monitored, so my hack is already only partially working and it is definitely something that won’t be supported, so I cannot provide it here. Also, this is a beta, so I I think that the Management Packs will be re-released with following beta versions, and this change is something that would need to be re-done all over again. Also, the unsupported distribution is the reason why I installed the agent manually in the first place, as the “Discovery Wizard” would not really “agree” to go and let me install the agent remotely on an unsupported “platform!”.

But I could not wait to see this working, while waiting two business days (we are on a weekend!) for confirmation that I am allowed to actually download a 30-day-unsupported-Trial of the “real” RedHat Enteprise Linux, so I cheated 🙂

 

 

Disclaimer

The information in this weblog is provided “AS IS” with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided “AS IS” without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I’VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION.

On this website we use first or third-party tools that store small files (cookie) on your device. Cookies are normally used to allow the site to run properly (technical cookies), to generate navigation usage reports (statistics cookies) and to suitable advertise our services/products (profiling cookies). We can directly use technical cookies, but you have the right to choose whether or not to enable statistical and profiling cookies. Enabling these cookies, you help us to offer you a better experience.