Archive for the 'MOM' Category

RSS Feed for the 'MOM' Category

Using the SCX Agent with WSMan from Powershell v2

Monday, June 1st, 2009

So Powershell v2 adds a nice bunch of Ws-Man related cmdlets. Let’s see how we can use them to interact with OpenPegasus’s WSMan on a SCX Agent.

PS C:\maint> test-wsman -computer virtubuntu.huis.dom -port 1270 -authentication basic -credential (get-credential) -usessl

cmdlet Get-Credential at command pipeline position 1
Supply values for the following parameters:
Credential

image

But we do get this error:

Test-WSMan : The server certificate on the destination computer (virtubuntu.huis.dom:1270) has the following errors:
The SSL certificate could not be checked for revocation. The server used to check for revocation might be unreachable.

The SSL certificate is signed by an unknown certificate authority.
At line:1 char:11
+ test-wsman <<<<  -computer virtubuntu.huis.dom -port 1270 -authentication basic -credential (get-credential) -usessl
+ CategoryInfo          : InvalidOperation: (:) [Test-WSMan], InvalidOperationException
+ FullyQualifiedErrorId : WsManError,Microsoft.WSMan.Management.TestWSManCommand

The credentials above have to be a unix login. Which we typed correctly. But we still can't get thru, as the certificate used by the agent is not trusted by our workstation. This seems to be the “usual” issue I first faced when testing SCX with WINRM in beta1. At the time I simply dismissed it with the following sentence

[…] Of course you have to solve some other things such as DNS resolution AND trusting the self-issued certificates that the agent uses, first. Once you have done that, you can run test queries from the Windows box towards the Unix ones by using WinRM. […]

and I sincerely thought that it would explain pretty well… but eventually a lot of people got confused by this and did not know what to do, especially for the part that goes about trusting the certificate.  Anyway, in the following posts I figured out you could pass the –skipCACheck parameter to WINRM… which solved the issue with having to trust the certificate (which is fine for testing, but I would not use that for automations and scripts running in production… as it might expose your credentials to man-in-the-middle attacks).

So it seems that with the Powershell cmdlets we are back to that issue, as I can’t find a parameter to skip the CA check. Maybe it is there, but with PSv2 not having been released yet, I don't know everything about it, and the CTP documentation is not yet complete. Therefore, back to trusting the certificate.

Trusting the certificate is actually very simple, but it can be a bit tricky when passing those certs back and forth from unix to windows. So let's make the process a bit clearer.

All of the SCX-agents certificates are ultimately signed by a key on the Management server that has discovered them, but I don't currently know where that certificate/key is stored on the management server. Anyway, you can get it from the agent certificate – as you only really need the public key, not the private signing key.

Use WinSCP or any other utility to copy the certificate off one of the agents.
You can find that in the /etc/opt/microsoft/scx/ssl location:

image

that scx-host-computername.pem is your agent certificate.

Copy it to the Management server and change its extension from .pem to .cer. Now Windows will be happy to show it to you with the usual Certificate interface:

image

We need to go to the “Certification Path” tab, select the ISSUER certificate (the one called “SCX-Certificate”):

image

then go to the “Details” tab, and use the “Copy to File” button to export the certificate.

After you have the certificate in a .CER file, you can add it to the “trusted root certification authorities” store on the computer you are running your powershell tests from.

image

So after you have trusted it, the same command as above actually works now:

PS C:\maint> test-wsman -computer virtubuntu.huis.dom -port 1270 -authentication basic -credential (get-credential) -usessl

cmdlet Get-Credential at command pipeline position 1
Supply values for the following parameters:
Credential

wsmid           : http://schemas.dmtf.org/wbem/wsman/identify/1/wsmanidentity.xsd
lang            :
ProtocolVersion : http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd
ProductVendor   : Microsoft System Center Cross Platform
ProductVersion  : 1.0.4-248

Ok, we can talk to it! Now we can do something funnier, like actually returning instances and/or calling methods:

PS C:\maint> Get-WSManInstance -computer virtubuntu.huis.dom -authentication basic -credential (get-credential) -port 1270 -usessl -enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx

image

This is far from exhaustive, but should get you started on a world of possibilities about automating diagnostics and responses with Powershell v2 towards the OpsMgr 2007 R2 Cross-Platform machines. Enjoy!

Disclaimer

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I'VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION. The solution presented here IS NOT SUPPORTED by Microsoft.

Installing the OpsMgr 2007 R2 SCX Agent on Ubuntu

Saturday, May 30th, 2009

You know since the beta1 of Xplat I have been busy with modifying the Redhat management pack and monitor CentOS with OpsMgr. Now, CentOS is a distribution that is pretty similar to RedHat, so the RPM package just runs, and it is only a matter of hacking a modified MP.

I never went really further in my experiments, mostly due to lack of time… but then yesterday I got a comment to this older post asking about Ubuntu. Of course I know about Ubuntu, and have been using Debian-based distributions for years. I actually even prefer them over RPM-based distributions such as RedHat or SuSE (personal preference). Heck, even this weblog is running on Debian!

Anyway, I never really tried to see if one of the existing RPM packages for RedHat or SuSE could be modified to run on Ubuntu. I will eventually test this on Debian too, but for now I used Ubuntu which tends to have slightly newer packages and libraries, overall. The machine I tested on is a Ubuntu Server 8.04.2. Older/newer versions might slightly differ.

BEWARE THAT ALL THAT FOLLOWS BELOW IS NOT SUPPORTED BY MICROSOFT. It is only described here for EXPERIMENTAL (==fun) purpose. DO NOT USE THIS IN A PRODUCTION ENVIRONMENT.

So, you are warned. Now let’s hack it.

The first thing to do is to copy the Redhat agent’s RPM package off your OpsMgr2007 R2 server in the “usual” path “C:Program FilesSystem Center Operations manager 2007AgentManagementUnixAgents”. Let’s grab the RHEL5 agent, which is called scx-1.0.4-248.rhel.5.x86.rpm in R2 RTM.

First we need to CONVERT the RPM package to the DEB package format used by Ubuntu, by using the ALIEN package:

sudo apt-get update
sudo apt-get install alien
sudo bash
alien -k scx-1.0.4-248.rhel.5.x86.rpm –scripts
dpkg -i scx_1.0.4-248_i386.deb

image

The converted package will install… but the script execution will fail in a few places – most notably in the generation of the certificate, as it is not able to locate the right openssl libraries, as shown in the screenshot above.

If the libssl.so.6 file cannot be found, you might be missing the “libssl-dev” package, which you can install as follows:

apt-get install libssl-dev

But even if it is installed, you will find that the files are still missing. This is not really true: actually, the files are there, but on Ubuntu they have a different name than on RedHat, that’s all. You can therefore create hardlinks to the “right” files, so that they are aliased and get found afterwards:

cd /usr/lib
ln -s libcrypto.so.0.9.8 libcrypto.so.6
ln -s libssl.so.0.9.8 libssl.so.6

So now when installing the package, the certificate generation will work:

image

You are nearly ready to go. You have to start the service by using the init scripts – the “service” command is RedHat-specific, that will still fail.

/etc/init.d/scx-cimd start is the “standard” way of starting daemons from init on Unix.

But it still fails, as it seems that the init script provided in the RedHat package is really searching for a file called “functions” which is present on RedHat and on CentOS, which provides re-usable functions for startup scripts to include:

image

How do you fix this? I just copied the /etc/init.d/functions file from a CentOS box to my Ubuntu box.

I copied it via SCP from the CentOS box I have:

cd /etc/init.d

scp root@centos.huis.dom:/etc/init.d/functions .

You can probably also find and fetch the file from the Internet (both CentOS and RedHat should have accessible repositories with all the files in their distributions, since it is open sourced).

After you have the file in place, the init script will be able to include it, will find the functions it needs, and the daemon/service will now start (even if with minor errors I have not investigated for now, but that don’t seem to be causing troubles):

image

and here you can see it is finally running:

image

So let’s try to issue a few queries as shown in a previous posts:

image

IT WORKS!!!

But… there is a “but”: not all classes actually return instances and values just yet. Most notably the “SCX_OperatingSystem” class does not seem to return anything right awy. That is a very important class, because is the one we would use to first discover the Operating System object in the Management Packs. So we need to fix it. The reason why the class does not return anything, is that the SCX provider is looking into the /etc/redhat-release file to return what OS version/distribution the machine is running. And the file is obviously not there on Ubuntu.

On all Linuxes there is a similar file, called /etc/issue… which again, we can copy with the other name and trick the provider into working:

cd /etc

cp issue redhat-release

And NOW, the SCX_OperatingSystem Class also returns an instance:

image

The next step would be “cooking” an MP to discover Ubuntu. More on this on a later post (maybe). I did not test all classes and their implementation… you can try to poke at them by following the instructions and commands on my previous post here. But this should get you started.

Disclaimer

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I'VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION. The solution presented here IS NOT SUPPORTED by Microsoft.

Early Adoptions, Health Checks and New Year Rants.

Tuesday, December 30th, 2008

Generations

Two days ago I read the following Tweet by Hugh MacLeod:

"[...] Early Adopter Problem: How to differentiate from the bandwagon, once the bandwagon starts moving faster than you are [...]"

That makes me think of early adoption of a few technologies I have been working with, and how the community around those evolved. For example:

Operations Manager… early adoption meant that I have been working with it since the beta, had posted one of the earliest posts about how to use a script in a Unit Monitor back in may 2007 (the product was released in April 2007 and there was NO documentation back then, so we had to really try to figure out everything…), but someone seems to think it is worth repeating the very same lesson in November 2008, with not a lot of changes, as I wrote here. I don't mean being rude to Anders… repeating things will surely help the late adopters finding the information they need, of course.

Also, I started playing early with Powershell. I posted my first (and only) cmdlet back in 2006. It was not a lot more than a test for myself to learn how to write one, but that's just to say that I started playing early with it. I have been using it to automate tasks for example.

Going back to the quote above, everyone gets on the bandwagon posting examples and articles. I had been asked a few times about writing articles on OpsMgr and Powershell usage (for example by www.powershell.it) but I declined, as I was too busy using this knowledge to do stuff for work (where “work” is defined as in “work that pays your mortgage”), rather than seeking personal prestige through articles and blogs. Anyway, that kind of articles are appearing now all over the Internet and the blogosphere now. The above examples made me think of early adoption, and the bandwagon that follows later on… but even as an early adopter, I was never very noisy or visible.

Now, going back to what I do for work, (which I mentioned here and here in the past), I work in the Premier Field Engineering organization of Microsoft Services, which provides Premier services to customers. Microsoft Premier customer have a wide range of Premier agreement features and components that they can use to support their people, improve their processes, and improve the productive use of the Microsoft technology they have purchased. Some of these services we provide are known to the world as “Health Checks”, some as “Risk Assessment Programs” (or, shortly, RAPs). These are basically services where one of our technology experts goes on the customer site and there he uses a custom, private Microsoft tool to gather a huge amount of data from the product we mean to look at (be it SQL, Exchange, AD or anything else….). The Health Check or RAP tool collects the data and outputs a draft of the report that will be delivered to the customer later on, with all the right sections and chapters. This is done so that every report of the same kind will look consistent, even if the engagement is performed by a different engineer in a different part of the world. The engineer will of course analyze the collected data and write recommendations about what is configured properly and/or about what could or should be changed and/or improved in the implementation to make it adhere to Best Practices. To make sure only the right people actually go onsite to do this job we have a strict internal accreditation process that must be followed; only accredited resources that know the product well enough and know exactly how to interpret the data that the tool collects are allowed to use it and to deliver the engagement, and present/write the findings to the customer.

So why am I telling you this here, and how have I been using my early knowledge of OpsMgr and Powershell for ?

I have used that to write the Operations Manager Health Check, of course!

We had a MOM 2005 Health Check already, but since the technology has changed so much, from MOM to OpsMgr, we had to write a completely new tool. Jeff  (the original MOM2005 author, who does not have a blog that I can link to) and me are the main coders of this tool… and the tool itself is A POWERSHELL script. A longish one, of course (7000 lines, more or less), but nothing more than a Powershell script, at the end of the day. There are a few more colleagues that helped shape the features and tested the tool, including Kevin Holman. Some of the database queries on Kevin’s blog are in fact what we use to extract some of the data (beware that some of those queries have recently been updated, in case you saved them and using your local copy!), while some other information are using internal and/or custom queries. Some other times we use OpsMgr cmdlets or go to the SDK service, but a lot of times we query the database directly (we really should use the SDK all the times, but for certain stuff direct database access is way faster). It took most of the past year to write it, test it, troubleshoot it, fix it, and deliver the first engagements as “beta” to some customers to help iron out the process… and now the delivery is available! If a year seems like a long time, you have to consider this is all work that gets done next to what we all have to normally do with customers, not replacing it (i.e. I am not free to sit on my butt all day and just write the tool… I still have to deliver services to customers day in day out, in the meantime).

Occasionally, during this past calendar year, that is approaching its end, I have been willing and have found some extra time to disclose some bits and pieces, techniques and prototypes of how to use Powershell and OpsMgr together, such as innovative ways to use Powershell in OpsMgr against beta features, but in general most of my early adopter’s investment went into the private tool for this engagement, and that is one of the reasons I couldn’t blog or write much about it, being it Microsoft Intellectual Property.

But it is also true that I did not care to write other stuff when I considered it too easy or it could be found in the documentation. I like writing of ideas, thoughts, rants OR things that I discover and that are not well documented at the time I study them… so when I figure out things I might like leaving a trail for some to follow. But I am not here to spoon feed people like some in the bandwagon are doing. Now the bandwagon is busy blogging and writing continuously about some aspect of OpsMgr (known or unknown, documented or not), and the answer to the original question of Hugh is, in my opinion, that it does not really matter what the bandwagon is doing right now. I was never here to do the same thing. I think that is my differentiator. I am not saying that what a bunch of colleagues and enthusiasts is doing is not useful: blogging and writing about various things they experiment with is interesting and it will be useful to people. But blogs are useful until a certain limit. I think that blogs are best suited for conversations and thoughts (rather than for "howto's"), and what I would love to see instead is: less marketing hype when new versions are announced and more real, official documentation.

But I think I should stop caring about what the bandwagon is doing, because that's just another ego trip at the end of the day. What I should more sensibly do, would be listening to my horoscope instead:

[…] "How do you slay the dragon?" journalist Bill Moyers asked mythologist Joseph Campbell in an interview. By "dragon," he was referring to the dangerous beast that symbolizes the most unripe and uncontrollable part of each of our lives. In reply to Moyers, Campbell didn't suggest that you become a master warrior, nor did he recommend that you cultivate high levels of sleek, savage anger. "Follow your bliss," he said simply. Personally, I don't know if that's enough to slay the dragon — I'm inclined to believe that you also have to take some defensive measures — but it's definitely worth an extended experiment. Would you consider trying that in 2009? […]

CentOS discovery in OpsMgr2007 R2 beta

Sunday, November 23rd, 2008

Here we go again. Now that the OpsMgr2007 R2 beta is out, with an improved and revamped version of the System Center Cross Platform Extensions, I faced the issue of how to upgrade my test lab.

I have to say that OpsMgr2007 R2 beta release notes explain the known issues, and I had no trouble whatsoever upgrading the windows part. It just took its time (I am running virtual machines in my test lab, that don't have the best performance), but it went smoothly and without a glitch. In a couple of hours I had everything upgraded: databases, RMS, reporting, agents, gateway. All right then. The new purple icons in System Center look cute, and the new UI has some great stuff, such as a long-awaited way to update your management packs directly from the Internet, better display of Overrides (kind of what we used to rely on Override Explorer for)… and  A LOT more new stuff that I won't be wasting my Sunday writing about since everybody else has already done it two days ago:

opsmgr aggregated feed on Twitter

Therefore let's get back to my upgrade, which is a lot more interesting (to me) than the marketing tam-tam :-)

As part of the upgrade to R2, I had to first uninstall the Xplat beta refresh bits, which I had installed, including all Unix Management Packs. Including my CentOS Management Pack I had improvised.

So this is the new start page of the integrated Discovery Wizard:

Discovery Wizard

Looks nice and integrates the functionality of discovering and deploying Windows machines, SNMP Devices, and Unix/Linux machines.

Of course, my CentOS machine would not be discovered, and showed up as an unsupported platform. Of course my old Management Pack I had hacked together in XPlat Beta 1 did not work anymore. Therefore, I figured out I had to see what changes were there, and how to make it work again (of course it IS possible – It is NOT SUPPORTED, but I don't care, as long as it works).

Since the existing agent could not be discovered, the first step I took was logging on the Linux box, un-install the old agent, and install the new one:

XPlat Agent RPM Install on CentOS

There I tried to discover again, but of course it still failed.

At that point I started taking a look at the new layout of things on the unix side. Most stuff is located in the same directories where beta1 was installed, and there are a bunch of useful commands under /opt/microsoft/scx/bin/tools.
You can check out the Open Pegasus version used:

[root@centos tools]# ./scxcimconfig –version
Version 2.7.0

Let's take a look at what SCX classes we have available:

./scxcimcli nc -n root/scx -di |grep SCX | sort

./scxcimcli nc -n root/scx -di |grep SCX | sort

Nice. That's the stuff we will be querying over WS-Man from the Management Server.

So let's look at the OS Discovery, and we test it from the OpsMgr 2007 box:

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://centos:1270/wsman -auth:basic -skipCACheck

it returns results:

OS WS-Man Query

At first I assumed this worked like in Beta1, therefore I exported RedHat management pack and I made my own version of it, replacing the strings it is expecting to find to discover CentOS instead than Redhat.

While the MP was syntactically correct and would import fine, the Discovery wizard still didn't work.

I took one more look at the discoveries in the MP, and I found there are two more, targeted to Management Server, which is probably what gets used by the Discovery Wizard to understand what kind of agent kit needs to be deployed.

MP XML - Discoveries

So basically this discovery checks for the returned value from the module to determine if the discovered platform is a supported one:

Discovery Settings

But how does the module get its data?

Look at the layout of the /AgentManagement/UnixAgents folder on the Management Server:

/AgentManagement/unixAgents

That's it: GetOSVersion.sh – a shell script. A nice, open, clear text, hackable shell script. Let's take a look at it:

Discovery Script Hack

So that's it, and how my modification looks like. What happens during the discovery wizard is that we probably copy the script over SCP to the box, execute it, look at a number of things, and return the discovery data we need.

If you do those steps manually, you see how the script returns something very similar to a PropertyBag, just like discoveries done by VBScript on Windows machines:

Discovery Script Output

So after modifying the script… here we go. The Wizard now thinks CentOS is Red Hat, and can install an agent on it:

Discovery Wizard

Deploying Agent

Only when the Management Server discovery finally considers the CentOS machine worth managing, then the other discoveries that use WS-Man queries start kicking in, like the old one did, and find the OS objects and all the other hosted objects. In order for this to work you don't only need to hack the shell script, but to have a hacked MP – the "regular" Red Har one won't find CentOS, which is and remains an UNSUPPORTED platform.

CentOS Health Model

Disclaimer

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I'VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION. The solution presented here IS NOT SUPPORTED by Microsoft.

Protecting custom Resolution State in OpsMgr 2007

Saturday, September 13th, 2008

In System Center Operations Manager 2007, you can add and remove resolution states for your alerts at will. Other than states "0" ("New") and "255" ("Closed") you can create other 254 resolution states to suit your needs. This is a simple feature that was already present in previous MOM versions, and it is very useful to do a kind of tricks with your alerts. The amount of possible states you can create should be able to satisfy any kind of alert and incident management process you might have in place, and any kind of filtering or forwarding or escalation need you might want to perform by using resolution states.

image

By default, only OpsMgr Administrators can change these settings, with the exception of the two built-in states of "New" and "Closed": those two states are REQUIRED if you want the product to continue working, therefore the GUI won't let you change, edit or delete them. Which is good.

This is not true for your own resolution states, which can be edited or even deleted any time. All that is really saved in an alert when you change an alert's resolution state is the NUMBER associated with it. In fact you even use that number when querying for alerts in the Command Shell:

get-alert | where {$_.resolutionstate -eq 0}

That means that if by accident you delete a resolution state you have defined, you won't see its description anymore in the GUI. Also, if you try to re-organize your resolution state, you can easily change the IDs for existing ones… Sure, you need to have the permissions in order to change or delete them, but what if you have implemented your important Alert and Incident management process by using resolution states and you want a bit of extra protection from mistakes or unintended deletion for them?

Then you can protect them by making the product think they were "built-in" too, just like "New" and "Closed".

How would you do this? In an UNSUPPORTED WAY: editing the database :-) In fact, those resolution states are written in a table in the database, called "ResolutionState" (who would have guessed it?), that looks like the following picture:

dbo.ResolutionState

Can you see the "IsPredefined" column? That can be set to "True" or "False" and that value is used by the SDK service to tell the GUI if that Resolution State can be edited/deleted or not.

Of course changing the database directly IS NOT SUPPORTED by Microsoft. You do this at your own risk, and if it was me, I would *NEVER* touch, change or remove the default two states ("New" and "Closed") as THAT really would BREAK the product. For example, Alerts that are not set to "Closed" (255) won't be ever groomed. And that is VERY BAD. NEVER, NEVER DO THAT.

On the other end, changing a custom Resolution State to make the product believe it is Predefined/Built-in has not had any negative impact in my (limited) testing so far, and has added the advantage of "protecting" my resolution state from unintended deletion, as shown below:

image

As usual, do this at your own risk. Remember what's written in my Disclaimer:

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MICROSOFT, AND IT ONLY REPRESENT SOMETHING WHICH I'VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS HACK.

CentOS 5 Management Pack for OpsMgr SCX

Tuesday, May 13th, 2008

As I mentioned here, I have been testing the SCX beta.

Not having one of the "supported" platforms pushed me into playing with the provided Management Packs, and in turn I managed to use the MP for Red Hat Enterprise Linux 5 as a base, and replaced a couple of strings in the discoveries in order to get a working CentOS 5 Management Pack.

CentOS_HealthExplorer01_NEW

I still have not looked into the "hardware" monitors and health model / service model, so those are not currently monitored. But it is a start.

A lot of people have asked me a lot of information and would like to get the file – both in the blog's comment, on the newsgroup, or via mail. I am sorry, but I cannot provide you with the file, because it has not been throughly tested and might render your systems unstable, and also because there might be licensing and copyright issues that I have not checked within Microsoft.

Keep also in mind that using CentOS as a monitored platform is NOT a SUPPORTED scenario/platform for SCX. I only used it because I did not have a Suse or Redhat handy that day, and because I wanted to understand how the Management Packs using WS-Man worked.

This said, should you wish to try to do the same "MP Hacking" I did,  I pretty much explained all you need to know in my previous post and its comments, so that should not be that difficult.

Actually, I still think that the best way to figure out how things are done is by looking at the actual implementation, so I encourage you to look at the management packs and figure out how those work. There are a few mature tools out there that will help you author/edit Management Packs if you don't want to edit the XML directly: the Authoring Console, and Silect MP Studio Lite, for example. If you want to delve in the XML details, instead, then I suggest you read the Authoring Guide and peek at Steve Wilson's AuthorMPs.com site.

Disclaimer
The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I'VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS PROGRAM.

Testing System Center Cross Plaform Extentions

Sunday, May 4th, 2008

I am testing the beta bits of the cross-platform extensions that were released on Microsoft Connect 

This post wants to describe my limited testing so far – I hope this can benefit/help everyone testing the beta for some stuff that might currently not be incredibly clear – unless you attended the MMS class, at least :-) )

I started out with the White Paper that has been posted on the web, which describes the architecture pretty well, but from a higher level (with diagrams and the like). Then I downloaded the beta bits, which contain another document about setting the thing up. It is pretty well done, to be honest (especially if you consider that it is beta documentation for a beta product!), but it does not really go all the way down to troubleshooting things a lot, yet. I will try to cover some of that here.

I installed the agent manually – it’s just a RPM package, not much that can go wrong with that. There is a reason why I did not use the push discovery and deployment of the agent, which you will figure out reading later on. Once installed, I tried to figure out how things were looking like on the linux machine. It is all pretty understandable, after all, if you look around on the machine (documented or not, linux and open source stuff is easy to figure out by reading configuration files and the like, and by searching on the web).

Basically the “agent” is not properly an "agent" the way the windows agent is, since it does not really "sends" stuff to the Management Server on its own: It consists of a  couple of services/daemons, based on existing opensource projects, but configured in their own folder, with their own name, and using different ports than a standard install of those,  not to conflict with possible existing ones on those machines.

The Management Service uses these services remotely (similar to doing agentless monitoring towards a windows box) using these services. The two services are:

 scx-services commands

It is easy to figure out how they are layed out. Even if undocumented, you look at the processes

SCX processes

and you can figure out WHERE they live (/opt/microsoft/scx/bin/….) and where their configuration files are located (/etc/opt/microsoft/scx/conf …).

SCX Configuration

The files are self explanatory, and the documentation of the opensource projects can be found on the Internet: 

for wsmand

for cimd

 

I still have to delve into them properly as I would like to, but I already figured out a bunch of interesting things by quickly looking at them.

Agent Communication someone must have decided to “recycle” the 1270 port number that was used in MOM2005 :-) Basically openwsman listens as a SSL listener (with basic auth – connected via PAM module with the “regular” unix /etc/passwd users, so you can authenticate as those without having to define specific users for the service). So all that happens is that the Management Server asks things/executes WS-Man queries and commands on this channel. The Management Server connects every time to the agent on port 1270 using SSL, authenticates as “root” (or as the specified "Action Account") and does its stuff, or asks the agent to do it. So the communication is happening from the Management Server to the agent… not the other way around like it happens with Windows "agents". That’s why it feels to me more like an “agentless” thing, at least for what concerns the “direction” of traffic and who does the actual querying.

For the rest, the provided Management Packs have “normal” discoveries and “normal” monitors. Pretty much like the Windows Management Packs often discover thing by querying WMI, here they use WS-Man to run CIM queries against the Unix boxes.

The Service Model is totally cool to actually *SEE* in action, don’t you think so ?

Service Model

 

A few more debugging/troubleshooting information:

I searched a bit and found the openwsman.org documentation and forum to be useful to figure some things out. For example I banged my head a few times before managing to actually TEST a query from windows to linux using WINRM. This document helped a lot.

Of course you have to solve some other things such as DNS resolution AND trusting the self-issued certificates that the agent uses, first. Once you have done that, you can run test queries from the Windows box towards the Unix ones by using WinRM.

For example, this is how I tested what the discovery for a Linux RedHat Computer type should be returning (I read that by opening the MP in authoring console, as one would usually do for any MP):

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://centos:1270/wsman -auth:basic

If you need to test the query directly *ON* the linux box (querying the CIMD instead than WSMAND), the WBEMEXEC utility is packaged with the agent (under /opt/microsoft/scx/bin/tools ). It is not as easy as some windows administrators (that have used WBEMTEST or WMI Tools in the past) would hope, but not even that bad. Just to run a few queries to the CIM daemon locally it is not really interactive, so you need to create a XML file that looks like the following (basically you build the RAW request the way the CIMD accepts it):

 

 

<?xml version="1.0" ?>

<CIM CIMVERSION="2.0" DTDVERSION="2.0">

<MESSAGE ID="50000" PROTOCOLVERSION="1.0">

<SIMPLEREQ>

<IMETHODCALL NAME="EnumerateInstanceNames">

<LOCALNAMESPACEPATH>

<NAMESPACE NAME="root"/>

<NAMESPACE NAME="scx"/>

</LOCALNAMESPACEPATH>

<IPARAMVALUE NAME="ClassName">

<CLASSNAME NAME="SCX_OperatingSystem"/>

</IPARAMVALUE>

</IMETHODCALL>

</SIMPLEREQ>

</MESSAGE>

</CIM>

 

 

Once you have made such a file, you can execute the query in the file with the tool like the following:

./wbemexec -d2 query.xml

 

As you can see from here, CIMD uses HTTP already. This differs from Windows' WMI that uses RPC/DCOM. In a way, this is much simpler to troubleshoot, and more firewall-friendly.

 

I have not really found an activity or debug log for any of those components, yet… but in the end they are not doing anything ON THEIR OWN, unless asked by the MS…. So the “healthservice” logic is all on the MS anyway. Errors about failed discoveries, permissions of the Action Account user, and anything else will be logged by the HealthService on the Windows machine (the Management Server) that is actually performing monitoring towards the Unix box.

It really is *just* getting the WMI and WinRM-equivalent layer on linux/Unix up and running– after that, everything is done from windows anyway!

After this common management infrastructure has been provided, 3rd parties will be facilitated in writing *just* MPs, without having to worry about the TRANSPORT of information anymore.

 

As you have probably noticed from the screenshots and commandlines, I don’t have a “real” Redhat Enterprise Linux or “supported” linux distribution… Therefore I started my testing using CentOS 5 (which is very similar to RHEL 5) – the agent installed fine as you can see, but I was not getting anything really “discovered” – the MP had only found a “linux computer” but was not finding any “RedHat” or “SuSe” or any other "Operating System" instances… and if you are somewhat familiar with the way Operations Manager targeting works, you would understand that monitors are targeted at object classes. If I don't have any instance of those objects being discovered, NO MONITORING actually happens, even if the infrastructure is in place and the pieces are talking to each other:

 CentOS not discovered

Therefore my machine was not being monitored.

In the end, I actually even got it to work, but I had to create a new Management Pack (exporting and modifying the RHEL5 one as a base) that would actually search for different Property values and discover CentOS instead as if it were RedHat:

CentOS Discovered 

After importing my hacked Management Pack the machine started to be monitored. Here you can see Health Explorer in all of its glory:

image008

Of course this is a hack I made just to have a test setup somewhat working and to familiarize myself with the SCX components. It is not guaranteed that my Management pack actually works on CentOS the way it is supposed to work and that there aren't other – more subtle – differences between RedHat and CentOS that will make it fail. I only modified a couple of Discoveries to let it discover the "Operating System" instance… everything else should follow, but not necessarily. One difference you see already in the screenshot above is that I am not yet seeing the hardware being monitored, so my hack is already only partially working and it is definitely something that won't be supported, so I cannot provide it here. Also, this is a beta, so I I think that the Management Packs will be re-released with following beta versions, and this change is something that would need to be re-done all over again. Also, the unsupported distribution is the reason why I installed the agent manually in the first place, as the "Discovery Wizard" would not really "agree" to go and let me install the agent remotely on an unsupported "platform!".

But I could not wait to see this working, while waiting two business days (we are on a weekend!) for confirmation that I am allowed to actually download a 30-day-unsupported-Trial of the "real" RedHat Enteprise Linux, so I cheated :-)

 

 

Disclaimer

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I'VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION.

Looking at OpsMgr2007 Alert trend with Command Shell

Friday, January 25th, 2008

It's friday night, I am quite tired and I can't be asked of writing a long post. But I have not written much all week, not even updated my Twitter, and now I want to finish the week with at least some goodies. So this is the turn of a couple of Powershell commands/snippets/scripts that will count alerts and events generated each day: this information could help you understand the trends of events and alerts over time in a Management Group. It is nothing fancy at all, but they can still be useful to someone out there. In the past (MOM 2005) I used to gather this kind of information with SQL Queries against the operations database. But now, with Powershell, everything is exposed as objects and it is much easier to get information without really getting your hands dirty with the database :-)

#Number of Alerts per day

$alerttimes = Get-Alert | Select-Object TimeRaised
$array=@()

foreach ($datetime in $alerttimes){
$array += $datetime.timeraised.date
}

$array | Group-Object Date

#Number of Events per day

$eventtimes = Get-Event | Select-Object TimeGenerated
$array=@()

foreach ($datetime in $eventtimes){
$array += $datetime.timegenerated.date
}

$array | Group-Object Date

Beware that these "queries" might take a long time to execute (especially the events one) depending on the amount of data and your retention policy.

This is of course just scratching the surface of the amount of amazing things you can do with Powershell in Operations Manager 2007. For this kind of information you might want to keep an eye on the official "System Center Operations Manager Command Shell" blog: http://blogs.msdn.com/scshell/

Simply Works

Thursday, December 27th, 2007

Simply Works

Simply Works, uploaded by Daniele Muscetta on Flickr.

I don't know about other people, but I do get a lot to think when the end of the year approaches: all that I've done, what I have not yet done, what I would like to do, and so on…

And it is a period when memories surface.

I found the two old CD-ROMs you can see in the picture. And those are memories.
missioncritical software was the company that invented a lot of stuff that became Microsoft's products: for example ADMT and Operations Manager.

The black CD contains SeNTry, the "enterprise event manager", what later became Operations Manager.
On the back of the CD, the company motto at the time: "software that works simply and simply works".
So true. I might digress on this concept, but I won't do that right now.

I have already explained in my other blog what I do for work. Well, that was a couple of years ago anyway. Several things have changed, and we are moving towards offering services that are more measurable and professional. So, since it happens that in a certain job you need to be an "expert" and "specialize" in order to be "seen" or "noticed".
You know I don't really believe in specialization. I have written it all over the place. But you need to make other people happy as well and let them believe what they want, so when you "specialize" they are happier. No, really, it might make a difference in your carrer :-)

In this regard, I did also mention my "meeting again" with Operations Manager.
That's where Operations manager helped me: it let me "specialize" in systems and applications management… a field where you need to know a bit of everything anyway: infrastructure, security, logging, scripting, databases, and so on… :-)
This way, everyone wins.

Don't misunderstand me, this does not mean I want to know everything. One cannot possibly know everything, and the more I learn the more I believe I know nothing at all, to be honest. I don't know everything, so please don't ask me everything – I work with mainframes :-)
While that can be a great excuse to avoid neighbours and relatives annoyances with their PCs though, on the serious side I still believe that any intelligent individual cannot be locked into doing a narrow thing and know only that one bit just because it is common thought that you have to act that way.

If I would stop where I have to stop I would be the standard "IT Pro". I would be fine, sure, but I would get bored soon. I would not learn anything. But I don't feel I am the standard "IT Pro". In fact, funnily enough, on some other blogs out there I have been referenced as a "Dev" (find it on your own, look at their blogrolls :-) ). But I am not a Dev either then… I don't write code for work. I would love to, but I rarely actually do, other than some scripts. Anyway, I tend to escape the definition of the usual "expert" on something… mostly because I want to escape it. I don't see myself represented by those generalization.

As Phil puts it, when asked "Are software developers – engineers or artists?":

"[...] Don’t take this as a copout, but a little of both. I see it more as craftsmanship. Engineering relies on a lot of science. Much of it is demonstrably empirical and constrained by the laws of physics. Software is less constrained by physics as it is by the limits of the mind. [...]"

Craftmanship. Not science.
And stop calling me an "engineer". I am not an engineer. I was even crap in math, in school!

Anyway, what does this all mean? In practical terms, it means that in the end, wether I want it or not, I do get considered an "expert" on MOM and OpsMgr… and that I will mostly work on those products for the next year too. But that is not bad, because, as I said, working on that product means working on many more things too. Also, I can point to different audiences: those believing in "experts" and those going beyond schemes. It also means that I will have to continue teaching a couple of scripting classes (both VBScript and PowerShell) that nobody else seems to be willing to do (because they are all *expert* in something narrow), and that I will still be hacking together my other stuff (my facebook apps, my wordpress theme and plugins, my server, etc) and even continue to have strong opinions in those other fields that I find interesting and where I am not considered an *expert* ;-)

Well, I suppose I've been ranting enough for today…and for this year :-)
I really want to wish everybody again a great beginning of 2008!!! What are you going to be busy with, in 2008 ?

Monitoring Syslog with OpsMgr 2007

Friday, November 9th, 2007

I had missed it… finally guidance on how to collect and monitor UNIX syslog in System Center Operations Manager 2007 has been published!

This is much more sysadmin-oriented than what was availble before (that remais of course still relevant, but more from a Management Pack developer's point of view, who wants to know how things work "behind the hood").