I am testing the beta bits of the cross-platform extensions that were released on Microsoft Connect.
This post describes my limited testing so far. I hope it helps everyone testing the beta with some things that might not be incredibly clear at the moment – unless you attended the MMS class, at least :-))
I started out with the White Paper that has been posted on the web, which describes the architecture pretty well, but from a higher level (with diagrams and the like). Then I downloaded the beta bits, which contain another document about setting the thing up. It is pretty well done, to be honest (especially if you consider that it is beta documentation for a beta product!), but it does not really go all the way down to troubleshooting things a lot, yet. I will try to cover some of that here.
I installed the agent manually – it’s just an RPM package, not much that can go wrong with that. There is a reason why I did not use the push discovery and deployment of the agent, which you will figure out reading later on. Once installed, I tried to figure out what things looked like on the Linux machine. It is all pretty understandable, after all, if you look around on the machine (documented or not, Linux and open source stuff is easy to figure out by reading configuration files and the like, and by searching on the web).
Basically the “agent” is not really an “agent” the way the Windows agent is, since it does not “send” anything to the Management Server on its own. It consists of a couple of services/daemons, based on existing open source projects, but configured in their own folder, with their own names, and using different ports than a standard install of those, so as not to conflict with existing instances on those machines.
The Management Server uses these services remotely (similar to agentless monitoring of a Windows box). The two services are:
- scx-cimd which implements the CIM daemon (openpegasus.org)
- scx-wsmand which implements the WS-Man daemon (openwsman.org)
It is easy to figure out how they are laid out. Even if undocumented, you can look at the running processes and figure out WHERE they live (/opt/microsoft/scx/bin/….) and where their configuration files are located (/etc/opt/microsoft/scx/conf …).
The files are self-explanatory, and the documentation of the open source projects can be found on the Internet:
for wsmand
- at openwsman.org
for cimd
- at the openpegasus site (http://www.openpegasus.org/documents.tpl?CALLER=doc.tpl&dcat= )
- on the openpegasus wiki (http://wiki.opengroup.org/pegasus-wiki/doku.php?id=start )
- at the IBM Linux management page (http://www.ibm.com/developerworks/linux/library/os-ltc-systemsmanagement/ )
I still have to delve into them properly as I would like to, but I already figured out a bunch of interesting things by quickly looking at them.
Agent Communication
Someone must have decided to “recycle” the 1270 port number that was used in MOM2005 🙂 Basically openwsman listens as an SSL listener, with basic auth connected via a PAM module to the “regular” Unix /etc/passwd users, so you can authenticate as those without having to define specific users for the service. The Management Server connects to the agent on port 1270 using SSL every time, authenticates as “root” (or as the specified “Action Account”), and runs WS-Man queries and commands over this channel. So the communication goes from the Management Server to the agent, not the other way around as happens with Windows “agents”. That’s why it feels to me more like an “agentless” thing, at least as far as the “direction” of traffic and who does the actual querying are concerned.
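To make that a bit more concrete, here is a minimal Python sketch of the two things any client needs before it can talk to the agent: the HTTPS endpoint on port 1270 and an HTTP Basic Authorization header. The function names are my own (hypothetical), the host name and credentials are placeholders, and the sketch only builds the values. It does not open a connection, and it is not a description of what the Management Server does internally.

```python
import base64

SCX_WSMAN_PORT = 1270  # port on which the agent's openwsman listener accepts SSL connections


def wsman_endpoint(host: str, port: int = SCX_WSMAN_PORT) -> str:
    """Return the HTTPS URL of the agent's WS-Man listener."""
    return f"https://{host}:{port}/wsman"


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header for a regular Unix user."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Placeholder host and credentials, for illustration only.
    print(wsman_endpoint("centos"))
    print(basic_auth_header("root", "password"))
```

Keep in mind that Basic auth only encodes the credentials, it does not encrypt them, which is why it matters that the listener is SSL-only.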
For the rest, the provided Management Packs have “normal” discoveries and “normal” monitors. Much like the Windows Management Packs often discover things by querying WMI, here they use WS-Man to run CIM queries against the Unix boxes.
The Service Model is totally cool to actually *SEE* in action, don’t you think so?
Some more debugging/troubleshooting information:
I searched a bit and found the openwsman.org documentation and forum to be useful to figure some things out. For example I banged my head a few times before managing to actually TEST a query from windows to linux using WINRM. This document helped a lot.
Of course you have to solve some other things such as DNS resolution AND trusting the self-issued certificates that the agent uses, first. Once you have done that, you can run test queries from the Windows box towards the Unix ones by using WinRM.
For example, this is how I tested what the discovery for a Linux RedHat Computer type should be returning (I read that by opening the MP in authoring console, as one would usually do for any MP):
winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx -username:root -password:password -r:https://centos:1270/wsman -auth:basic
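If you want to test more than one class or more than one box, it gets tedious to retype that command. Here is a small Python sketch that assembles the same WinRM command line from its parts. The function names are hypothetical helpers of mine; the schema prefix, the root/scx namespace and the switches are simply the ones from the command above, and the host and credentials are placeholders.

```python
WSCIM_SCHEMA_BASE = "http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2"


def scx_resource_uri(cim_class: str, namespace: str = "root/scx") -> str:
    """Resource URI for a CIM class in the given CIM namespace."""
    return f"{WSCIM_SCHEMA_BASE}/{cim_class}?__cimnamespace={namespace}"


def winrm_enumerate_args(host, cim_class, username, password, port=1270):
    """Argument list for a 'winrm enumerate' call against the agent."""
    return [
        "winrm",
        "enumerate",
        scx_resource_uri(cim_class),
        f"-username:{username}",
        f"-password:{password}",
        f"-r:https://{host}:{port}/wsman",
        "-auth:basic",
    ]


if __name__ == "__main__":
    # Prints the command; paste it into a prompt on the Windows box to run it.
    print(" ".join(winrm_enumerate_args("centos", "SCX_OperatingSystem", "root", "password")))
```

Swapping the class name (for example to SCX_UnixProcess) gives you the test query for a different discovery or monitor.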
If you need to test the query directly *ON* the Linux box (querying the CIMD rather than WSMAND), the WBEMEXEC utility is packaged with the agent (under /opt/microsoft/scx/bin/tools ). It is not as easy as some Windows administrators (who have used WBEMTEST or WMI Tools in the past) would hope, but it is not that bad either. It is not really interactive, so just to run a few queries against the CIM daemon locally you need to create an XML file that looks like the following (basically you build the RAW request the way the CIMD accepts it):
<?xml version="1.0" ?>
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
  <MESSAGE ID="50000" PROTOCOLVERSION="1.0">
    <SIMPLEREQ>
      <IMETHODCALL NAME="EnumerateInstanceNames">
        <LOCALNAMESPACEPATH>
          <NAMESPACE NAME="root"/>
          <NAMESPACE NAME="scx"/>
        </LOCALNAMESPACEPATH>
        <IPARAMVALUE NAME="ClassName">
          <CLASSNAME NAME="SCX_OperatingSystem"/>
        </IPARAMVALUE>
      </IMETHODCALL>
    </SIMPLEREQ>
  </MESSAGE>
</CIM>
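If you would rather generate such a request than type it by hand, the following Python sketch builds the same XML with the standard library. The function name and its parameters are my own invention (hypothetical); the element and attribute names are exactly the ones in the request above, and I have only tried to mirror that one EnumerateInstanceNames call, not the whole CIM-XML protocol.

```python
import xml.etree.ElementTree as ET


def enumerate_instance_names_request(class_name, namespace=("root", "scx"), message_id="50000"):
    """Build a raw CIM-XML EnumerateInstanceNames request as a string."""
    cim = ET.Element("CIM", CIMVERSION="2.0", DTDVERSION="2.0")
    message = ET.SubElement(cim, "MESSAGE", ID=message_id, PROTOCOLVERSION="1.0")
    simple_req = ET.SubElement(message, "SIMPLEREQ")
    call = ET.SubElement(simple_req, "IMETHODCALL", NAME="EnumerateInstanceNames")

    # One NAMESPACE element per path component: root/scx -> root, scx
    path = ET.SubElement(call, "LOCALNAMESPACEPATH")
    for part in namespace:
        ET.SubElement(path, "NAMESPACE", NAME=part)

    param = ET.SubElement(call, "IPARAMVALUE", NAME="ClassName")
    ET.SubElement(param, "CLASSNAME", NAME=class_name)

    return '<?xml version="1.0" ?>\n' + ET.tostring(cim, encoding="unicode")


if __name__ == "__main__":
    # Redirect the output to a file to get the request file described in the text.
    print(enumerate_instance_names_request("SCX_OperatingSystem"))
```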
Once you have made such a file, you can execute the query in the file with the tool like the following:
As you can see from here, CIMD uses HTTP already. This differs from Windows’ WMI that uses RPC/DCOM. In a way, this is much simpler to troubleshoot, and more firewall-friendly.
I have not really found an activity or debug log for any of those components yet… but in the end they are not doing anything ON THEIR OWN unless asked by the MS. So the “healthservice” logic is all on the MS anyway. Errors about failed discoveries, permissions of the Action Account user, and anything else will be logged by the HealthService on the Windows machine (the Management Server) that is actually performing monitoring towards the Unix box.
It really is *just* getting the WMI and WinRM-equivalent layer on Linux/Unix up and running – after that, everything is done from Windows anyway!
After this common management infrastructure has been provided, 3rd parties will be facilitated in writing *just* MPs, without having to worry about the TRANSPORT of information anymore.
As you have probably noticed from the screenshots and commandlines, I don’t have a “real” Redhat Enterprise Linux or “supported” linux distribution… Therefore I started my testing using CentOS 5 (which is very similar to RHEL 5) – the agent installed fine as you can see, but I was not getting anything really “discovered” – the MP had only found a “linux computer” but was not finding any “RedHat” or “SuSe” or any other “Operating System” instances… and if you are somewhat familiar with the way Operations Manager targeting works, you would understand that monitors are targeted at object classes. If I don’t have any instance of those objects being discovered, NO MONITORING actually happens, even if the infrastructure is in place and the pieces are talking to each other:
Therefore my machine was not being monitored.
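If the targeting logic is new to you, this toy Python model shows why nothing happened. It is NOT how Operations Manager is implemented and it uses no OpsMgr API; the class and monitor names are made up for illustration. The only point is that a monitor runs once per discovered instance of its target class, so zero instances means zero monitoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Monitor:
    name: str
    target_class: str  # the object class this monitor is targeted at


def monitors_that_run(discovered, monitors):
    """Return (monitor name, instance) pairs: one per instance of the target class."""
    runs = []
    for monitor in monitors:
        for instance in discovered.get(monitor.target_class, []):
            runs.append((monitor.name, instance))
    return runs


if __name__ == "__main__":
    # Hypothetical monitor and class names, for illustration only.
    monitors = [Monitor("OS availability", "RedHat Operating System")]

    # What I had at first: only the generic computer object was discovered.
    before = {"Linux Computer": ["centos"]}
    print(monitors_that_run(before, monitors))

    # What is needed: an instance of the class the monitors are targeted at.
    after = {"Linux Computer": ["centos"], "RedHat Operating System": ["centos"]}
    print(monitors_that_run(after, monitors))
```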
In the end, I actually even got it to work, but I had to create a new Management Pack (exporting and modifying the RHEL5 one as a base) that would actually search for different Property values and discover CentOS instead as if it were RedHat:
After importing my hacked Management Pack the machine started to be monitored. Here you can see Health Explorer in all of its glory:
Of course this is a hack I made just to have a test setup somewhat working and to familiarize myself with the SCX components. It is not guaranteed that my Management Pack actually works on CentOS the way it is supposed to work, or that there aren’t other – more subtle – differences between RedHat and CentOS that will make it fail. I only modified a couple of Discoveries to let it discover the “Operating System” instance… everything else should follow, but not necessarily. One difference you see already in the screenshot above is that I am not yet seeing the hardware being monitored, so my hack is already only partially working, and it is definitely something that won’t be supported, so I cannot provide it here. Also, this is a beta, so I think that the Management Packs will be re-released with the following beta versions, and this change would need to be redone all over again. Finally, the unsupported distribution is the reason why I installed the agent manually in the first place, as the “Discovery Wizard” would not really “agree” to let me install the agent remotely on an unsupported “platform”!
But I could not wait to see this working, while waiting two business days (we are on a weekend!) for confirmation that I am allowed to actually download a 30-day-unsupported-Trial of the “real” RedHat Enterprise Linux, so I cheated 🙂
Disclaimer
The information in this weblog is provided “AS IS” with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided “AS IS” without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENTS SOMETHING WHICH I’VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION.
Thanks Steve, I will check that out!
You can also use the cimcli command under the tools directory. The syntax would be:
cimcli xq "select * from SCX_UnixProcess" -n root/scx
Could you post this xml? I am having trouble importing the XML mgmt pack after I changed a couple things:
‘The service threw an unknown exception. See inner exception for details. The size necessary to buffer the XML content exceeded the buffer quota.’
After importing my hacked Management Pack…
Craig, I won’t post my XML right now as it really is a hack at the moment – it will find CentOS instead of RedHat, but it won’t find RedHat anymore!
I might post something later on – if and when I manage to make a complete NEW one that can work on CentOS, but that won’t make you REPLACE the current RedHat one.
My goal is still to use all of Microsoft’s sealed MPs on the Mgmt Group… I would like to ADD something, not *replace* it.
In the meantime, if you want to fiddle with the XML of MPs, I highly suggest you use the Authoring Console: it will be much easier to produce good, working MPs and understand existing ones. You can download it here http://www.microsoft.com/downloads/details.aspx?FamilyID=6c8911c3-c495-4a03-96df-9731c37aa6d7&DisplayLang=en
Thanks for the link. I was more curious as to where in the XML it actually ‘discovers’ the OS. I got a hacked MP to import that finds RHEL during the discovery by changing the redhat-release file. But after verifying it goes back to not being monitored. Apparently there is another string it is searching for somewhere else as well.
No, you should not change stuff on the box or on any file.
You just have to change the discoveries in the MP.
It is written up here – look at the query for the SCX_OperatingSystem class and what it is trying to find.
I was able to unseal the Microsoft.Linux.RedHat.Library.mp and convert it to XML to use as a baseline or a starting point. Can you give a hint as to what you changed to make this work?
I got all this stuff working thx to your article. (I used real rh5)
You say that you exported one of MSFT’s management packs. How? They are all ‘sealed’ and so cannot be exported. I want to experiment with tweaking their MP
Eric: I cannot show you the XML yet, but I gave plenty of hints up here already. You might also want to check out http://www.authormps.com to learn more about MP format and how they are written.
Paul: “sealed” means pretty much “digitally SIGNED”. That means the actual content is not ENCRYPTED or anything: it is just SIGNED, so you know that it really came from Microsoft or [insert vendor here]… anyway, that it came from a source you trust and that it has not been tampered with.
In a way it is a security feature: imagine a Management Pack coming from someone you do NOT know and trust, that deploys a SCRIPT on all of your machines… now that would pretty much look like a worm to me. So MPs are sealed to prove their authenticity. And once they are sealed, the GUI recognizes it, and won’t let you CHANGE anything in them, as that would of course break the signature. This does NOT mean they are encrypted. So you can still poke at them and see what they look like “from the inside”.
There are a couple of techniques to get to the RAW xml. Boris blogged about one of those in the past, for example: http://blogs.msdn.com/boris_yanushpolsky/archive/2007/08/16/unsealing-a-management-pack.aspx
Hi Eric,
With regards to this comment you made, “Of course you have to solve some other things such as DNS resolution AND trusting the self-issued certificates that the agent uses, first.”, how did you go about accomplishing this?
I’m having an issue where the Hardware Availability and OS Availability rollups aren’t being monitored or showing any info. Same with the Performance node. The setup guide does not mention having to do anything to trust the self-signed cert, but I’m wondering if this is what is causing the problem.
Thanks
Jay, the self-signed certs get used automatically by OpsMgr and that should just work.
The step of trusting the certificate is only required for TESTING the WS-Man calls/queries using Windows tools, such as WINRM, which would otherwise complain and refuse to connect.
Stuff not showing up as monitored only requires a lot of patience on your side: all the discoveries run every 24 hours by default, so stuff will show as unmonitored for a few days until all discoveries are finished. In a test environment you could set overrides to speed up the discoveries, though.
Hi !
Thanks a lot for post !
After installing an agent with the cross-platform MP for AIX 5.3 (version 1.0.2-104 agent), the scxopenwsmand process is using a lot of CPU – from 75% to 85%. Who knows where to dig 🙂 ? Thank you !
Hi Corwin, I have also seen high CPU usage in some cases on some of my test linux boxes, but it is not a constant or normal condition.
As you have understood, that process is responsible for the WS-Man implementation, so it is the process that is queried by the management server. I would suspect that in that case there is a monitor or a discovery that runs too often, so I would look there at first… but that’s just my idea.
Also keep in mind that this is BETA software, so it has not been optimized yet for production use… true, openwsman and openpegasus are existing components, but the CIM provider used underneath is new and under active development.
Nonetheless, I would suggest that you report this on the product newsgroups so you will make sure that the product team knows about this issue 😉
Hi Daniele !
Thanks a lot !
I already wrote in the Connect newsgroups, thank you for the advice. I will explore the symptoms further.
Since most people land on this page from the Internet, I thought it made sense to let them know that my experiments with Xplat have continued, and that there is a series of other posts on this blog about the topic:
http://www.muscetta.com/2008/11/23/centos-discovery-in-opsmgr2007-r2-beta/
http://www.muscetta.com/2009/03/27/cross-platform-in-opsmgr-2007-r2-release-candidate/
http://www.muscetta.com/2009/05/30/installing-the-opsmgr-2007-r2-scx-agent-on-ubuntu/