Archive for the 'Microsoft' Category

RSS Feed for the 'Microsoft' Category

Microsoft Way

Sunday, July 18th, 2010

Microsoft Way

In the last couple of weeks we have been driving thru America from the east coast (New York) to the west coast (Seattle).

I figured out I needed to show my family the Microsoft campus too. Of course they know I work at Microsoft… but having only seen the office of a subsidiary – the one in Rome, with about 250 people at its max – might not have given them (especially the kids) an idea of the actual size of the company.

OpsMgr Event IDs Spreadsheet

Tuesday, June 22nd, 2010

I work in support (mostly with System Center Operations Manager, as you know), and I work with event logs every day. The following are typical situations:

  1. I get a colleague or a customer telling me “I am having a problem and the SCOM agent is showing 21037 events and 20002 events.  What’s wrong with it?”   
  2. I want to tune an OpsMgr environment and reduce load on the database by turning off a few event collections, as my friend Kevin Holman suggests here http://blogs.technet.com/kevinholman/archive/2009/11/25/tuning-tip-turning-off-some-over-collection-of-events.aspx .
  3. I am analyzing, sorting and grouping Events with Powershell like I have written on my blog lately http://www.muscetta.com/2009/12/16/opsmgr-eventlog-analysis-with-powershell/ but I can’t read those long descriptions properly.
  4. I exported an EVT from a customer environment and I load it on a machine that does not have OpsMgr message DLLs installed – all I see are EventIDs and type (Warning, Error) – but no real description – and I still want to figure out what those events are trying to tell me.

Getting to the point: I, like everyone – don’t have every OpsMgr event memorized.

This is why I thought of building this spreadsheet, and I hope it might come in handy to more people.

The spreadsheet contains an “AllEvents” list – and then the same events are broken down by event source as well:

clip_image002

When you want to search for an events (in one of the situations described above) just open up the spreadsheet, go to the “AllEvents” tab, hit CTRL+F (“Find”) and type in the Event ID you are searching for:

clip_image004

And this will take you to the row containing the event, so you can look up its description:

clip_image006

The description shows the event standard text (which is in the message DLL, therefore is the part you will not see if opening an EVT on another machine that does not have OpsMgr installed), and where the event parameters are (%1, %2, etc – which will be the strings you see in the EVT anyway).

That way you can get an understanding of what the original message would have looked like on the original machine.

This is just one possible usage pattern of this reference. It can also be useful to just read/study the events, learning about new ones you have never encountered, or remembering those you HAVE seen in the past but did not quite remember. And of course you can also find other creative ways to use it.

You can get it from here.

 

A few last words to give due credit: this spreadsheet has been compiled by using Eventlog Explorer (http://blogs.technet.com/momteam/archive/2008/04/02/eventlog-explorer.aspx ) to extract the event information out of the message DLLs on a OpsMgr2007 R2 installation. That info has been then copied and pasted in Excel in order to have an “offline” reference. Also I would like to thank Kevin Holman for pointing me to Eventlog Explorer first, and then for insisting I should not keep this spreadsheet in my drawer, as it could be useful to more people!

How to convert (and fixup) the RedHat RPM to run on Debian/Ubuntu

Monday, June 21st, 2010

In an earlier post I had shown how I got the Xplat agent running on Ubuntu. I perfected the technique over time, and what follows is a step-by-step process on how to convert and change the RedHat package to run on Debian/Ubuntu. Of course this is still a hack… but some people asked me to detail it a bit more. At the same time, the cross platform team is working to update the the source code on codeplex with extra bits that will make more straightforward to grab it, modify it and re-compile it than it is today. Until then, here is how I got it to work.

I assume you have already copied the right .RPM package off the OpsMgr server’s /AgentManagement directory to the Linux box here. The examples below refer to the 32bit package, but of course the same identical technique would work for the 64bit version.

We start by converting the RPM package to DEB format:

root# alien -k scx-1.0.4-258.rhel.5.x86.rpm –scripts

scx_1.0.4-258_i386.deb generated

 

Then we need to create a folder where we will extract the content of the package, modify stuff, and repackage it:

root# mkdir scx_1.0.4-258_i386

root# cd scx_1.0.4-258_i386

root# ar -x ../scx_1.0.4-258_i386.deb

root# mkdir debian

root# cd debian

root# mkdir DEBIAN

root# cd DEBIAN

root# cd ../..

root# rm debian-binary

root# mv control.tar.gz debian/DEBIAN/

root# mv data.tar.gz debian/

root# cd debian

root# tar -xvzf data.tar.gz

root# rm data.tar.gz

root# cd DEBIAN/

root# tar -xvzf control.tar.gz

root# rm control.tar.gz

Now we have the “skeleton” of the package easily laid out on the filesystem and we are ready to modify the package and add/change stuff to and in it.

 

First, we need to add some stuff to it, which is expected to be found on a redhat distro, but is not present in debian. In particular:

1. You should copy the file “functions” (that you can get from a redhat/centos box under /etc/init.d) under the debian/etc/init.d folder in our package folder. This file is required/included by our startup scripts, so it needs to be deployed too.

Then we need to chang some of the packacge behavior by editing files under debian/DEBIAN:

2. edit the “control” file (a file describing what the package is, and does):

clip_image002

3. edit the “preinst” file (pre-installation instructions): we need to add instructions to copy the “issue” file onto “redhat-release” (as the SCX_OperatingSystem class will look into that file, and this is hard-coded in the binary, we need to let it find it):

clip_image004

these are the actual command lines to add for both packages (DEBIAN or UBUNTU):

# symbolic links for libaries called differently on Ubuntu and Debian vs. RedHat

ln -s /usr/lib/libcrypto.so.0.9.8 /usr/lib/libcrypto.so.6

ln -s /usr/lib/libssl.so.0.9.8 /usr/lib/libssl.so.6

the following bit would be Ubuntu-specific:

#we need this file for the OS provider relies on it, so we convert what we have in /etc/issue

#this is ok for Ubuntu (“Ubuntu 9.0.4 \n \l” becomes “Ubuntu 9.0.4”)

cat /etc/issue | awk '/\\n/ {print $1, $2}' > /etc/redhat-release

while the following bit is Debian-specific:

#this is ok for Debian (“Debian GNU/Linux 5.0 \n \l” becomes “Debian GNU/Linux 5.0”)

cat /etc/issue | awk '/\\n/ {print $1, $2, $3}' > /etc/redhat-release

 

4. Then we edit/modify the “postinst” file (post-installation instructions) as follows:

a. remove the 2nd and 3rd lines which look like the following

RPM_INSTALL_PREFIX=

export RPM_INSTALL_PREFIX

as they are only useful for the RPM system, not DEB/APT, so we don’t need them.

b. change the following 2 functions which contain RedHat-specific commands:

configure_pegasus_service() {

           /usr/lib/lsb/install_initd /etc/init.d/scx-cimd

}

start_pegasus_service() {

           service scx-cimd start

}

c. We need to change in the Debian equivalents for registering a service in INIT and starting it:

configure_pegasus_service() {

               update-rc.d scx-cimd defaults

}

start_pegasus_service() {

              /etc/init.d/scx-cimd start

}

5. Modify the “prerm” file (pre-removal instructions):

a. Just like “postinst”, remove the lines

RPM_INSTALL_PREFIX=

export RPM_INSTALL_PREFIX

b. Locate the two functions stopping and un-installing the service

stop_pegasus_service() {

         service scx-cimd stop

}

unregister_pegasus_service() {

          /usr/lib/lsb/remove_initd /etc/init.d/scx-cimd

}

c. Change those two functions with the Debian-equivalent command lines

stop_pegasus_service() {

           /etc/init.d/scx-cimd stop

}

unregister_pegasus_service() {

           update-rc.d -f scx-cimd remove

}

At this point the change we needed have been put in place, and we can re-build the DEB package.

Move yourself in the main folder of the application (the scx_1.0.4-258_i386 folder):

root# cd ../..

Create the package starting from the folders

root# dpkg-deb –build debian

dpkg-deb: building package `scx' in `debian.deb'.

Rename the package (for Ubuntu)

root# mv debian.deb scx_1.0.4-258_Ubuntu_9_i386.deb

Rename the package (for Debian)

root# mv debian.deb scx_1.0.4-258_Debian_5_i386.deb

Install it

root# dpkg -i scx_1.0.4-258_Platform_Version_i386.deb

All done! It should install and work!

 

Next step would be creating a Management Pack to monitor Debian and Ubuntu. It is pretty similar to what Robert Hearn has described step by step for CentOS, but with some different replacements of strings, as you can imagine. I have done this but have not written down the procedure yet, so I will post another article on how to do this as soon as I manage to get it standardized and reliable. There is a bit more work involved for Ubuntu/Debian… as some of the daemons/services have different names, and certain files too… but nothing terribly difficult to change so you might want to try it already and have a go at it!

In the meantime, as a teaser, here’s my server’s (http://www.muscetta.com) performance, being monitored with this “hack”:

image

 

Disclaimer

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I'VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION. The solution presented here IS NOT SUPPORTED by Microsoft.

Audit Collection Services Database Partitions Size Report

Wednesday, May 5th, 2010

A number of people I have talked to liked my previous post on ACS sizing. One thing that was not extremely easy or clear to them in that post was *how* exactly I did one thing I wrote:

[…] use the dtEvent_GUID table to get the number of events for that day, and use the stored procedure “sp_spaceused”  against that same table to get an overall idea of how much space that day is taking in the database […]

To be completely honest, I do not expect people to do this manually a hundred times if they have a hundred partitions. In fact, I have been doing this for a while with a script which will do the looping for me and run that sp_spaceused for me a number of time. I cannot share that script, but I do realize that this automation is very useful, therefore I wrote a “stand-alone” SQL query which, using a couple of temporary tables, produces a similar type of output. I also went a step further and packaged it into a SQL Server Reporting Services Report for everyone’s consumption. The report should look like the following screenshot, featuring a chart and the table with the numerical information about each and every partition in the database:

ACS Partitions Report

You can download the report from here.

You need to upload it to your report server, and change the data source to the shared Data Source that also the built-in ACS Reports use, and it should work.

Enjoy!

A few thoughts on sizing Audit Collection System

Thursday, March 18th, 2010

People were already collecting logs with MOM, so why not the security log? Some people were doing that, but it did not scale enough; for this reason, a few years ago Eric Fitzgerald announced that he was working on Microsoft Audit Collection System. Anyhow, the tool as it was had no interface… and the rest is history: it has been integrated into System Center Operations Manager. Anyhow, ACS remains a lesser-known component of OpsMgr.

There are a number of resources on the web that is worth mentioning and linking to:

and, of course, many more, I cannot link them all.

As for myself, I have been playing with ACS since those early beta days (before I joined Microsoft and before going back to MOM, when I was working in Security), but I never really blogged about this piece.

Since I have been doing quite a lot of work around ACS lately, again, I thought it might be worth consolidating some thoughts about it, hence this post.

Anatomy of an “Online” Sizing Calculation

What I would like to explain here is the strategy and process I go thru when analyzing the data stored in a ACS database, in order to determine a filtering strategy: what to keep and what not to keep, by applying a filter on the ACS Collector.

So, the first thing I usually start with is using one of the many “ACS sizer” Excel spreadsheets around… which usually tell you that you need more space than it really is necessary… basically giving you a “worst case” scenario. I don’t know how some people can actually do this from a purely theoretical point of view, but I usually prefer a bottom up approach: I look at the actual data that the ACS is collecting without filters, and start from there for a better/more accurate sizing.

In the case of a new install this is easy – you just turn ACS on, set the retention to a few days (one or two weeks maximum), give the DB plenty of space to make sure it will make it, add all your forwarders… sit back and wait.

Then you come back 2 weeks later and start looking at the data that has been collected.

What/How much data are we collecting?

First of all, if we have not changed the default settings, the grooming and partitioning algorithm will create new partitioned tables every day. So my first step is to see how big each “partition” is.

But… what is a partition, anyway? A partition is a set of 4 tables joint together:

  1. dtEvent_GUID
  2. dtEventData_GUID
  3. dtPrincipal_GUID
  4. dtSTrings_GUID

where GUID is a new GUID every day, and of course the 4 tables that make up a daily partition will have the same GUID.

The dtPartition table contains a list of all partitions and their GUIDs, together with their start and closing time.

Just to get a rough estimate we can ignore the space used by the last three tables – which are usually very small – and only use the dtEvent_GUID table to get the number of events for that day, and use the stored procedure “sp_spaceused”  against that same table to get an overall idea of how much space that day is taking in the database.

By following this process, I come up with something like the following:

Partition ID Status Partition Start Time Partition Close Time Rows Reserved  KB Total GB
9b45a567_c848_4a32_9c35_39b402ea0ee2 0 2/1/2010 2:00 2/1/2010 2:00 29,749,366 7,663,488 7,484
8d8c8ee1_4c5c_4dea_b6df_82233c52e346 2 1/31/2010 2:00 2/1/2010 2:00 28,067,438 9,076,904 8,864
34ce995b_689b_46ae_b9d3_c644cfb66e01 2 1/30/2010 2:00 1/31/2010 2:00 30,485,110 9,857,896 9,627
bb7ea5d3_f751_473a_a835_1d1d42683039 2 1/29/2010 2:00 1/30/2010 2:00 48,464,952 15,670,792 15,304
ee262692_beae_4d81_8079_470a54567946 2 1/28/2010 2:00 1/29/2010 2:00 48,980,178 15,836,416 15,465
7984b5b8_ddea_4e9c_9e51_0ee7a413b4c9 2 1/27/2010 2:00 1/28/2010 2:00 51,295,777 16,585,408 16,197
d93b9f0e_2ec3_4f61_b5e0_b600bbe173d2 2 1/26/2010 2:00 1/27/2010 2:00 53,385,239 17,262,232 16,858
8ce1b69a_7839_4a05_8785_29fd6bfeda5f 2 1/25/2010 2:00 1/26/2010 2:00 55,997,546 18,105,840 17,681
19aeb336_252d_4099_9a55_81895bfe5860 2 1/24/2010 2:00 1/24/2010 2:00 28,525,304 7,345,120 7,173
1cf70e01_3465_44dc_9d5c_4f3700dc408a 2 1/23/2010 2:00 1/23/2010 2:00 26,046,092 6,673,472 6,517
f5ec207f_158c_47a8_b15f_8aab177a6305 2 1/22/2010 2:00 1/22/2010 2:00 47,818,322 12,302,208 12,014
b48dabe6_a483_4c60_bb4d_93b7d3549b3e 2 1/21/2010 2:00 1/21/2010 2:00 55,060,150 14,155,392 13,824
efe66c10_0cf2_4327_adbf_bebb97551c93 2 1/20/2010 2:00 1/20/2010 2:00 58,322,217 15,029,216 14,677
0231463e_8d50_4a42_a834_baf55e6b4dcd 2 1/19/2010 2:00 1/19/2010 2:00 61,257,393 15,741,248 15,372
510acc08_dc59_482e_a353_bfae1f85e648 2 1/18/2010 2:00 1/18/2010 2:00 64,579,122 16,612,512 16,223

If you have just installed ACS and let it run without filters with your agents for a couple of weeks, you should get some numbers like those above for your “couple of weeks” of analysis. If you graph your numbers in Excel (both size and number of rows/events per day) you should get some similar lines that show a pattern or trend:

Trend: Space user by day

Trend: Number of events by day

So, in my example above, we can clearly observe a “weekly” pattern (monday-to-friday being busier than the weekend) and we can see that – for that environment – the biggest partition is roughly 17GB. If we round this up to 20GB – and also considering the weekends are much quieter – we can forecast 20*7 = 140GB per week. This has an excess “buffer” which will let the system survive event storms, should they happen. We also always recommend having some free space to allow for re-indexing operations.

In fact, especially when collecting everything without filters, the daily size is a lot less predictable: imagine worms “trying out” administrator account’s passwords, and so on… those things can easily create event storms.

Anyway, in the example above, the customer would have liked to keep 6 MONTHS (180days) of data online, which would become 20*180 = 3600GB = THREE TERABYTE and a HALF! Therefore we need a filtering strategy – and badly – to reduce this size.

[edited on May 7th 2010 - if you want to automate the above analysis and produce a table and graphs like those just shown, you should look at my following post.]

Filtering Strategies

Ok, then we need to look at WHAT actually comprises that amount of events we are collecting without filters. As I wrote above, I usually run queries to get this type of information.

I will not get into HOW TO write a filter here – a collector’s filter is a WMI notification query and it is already described pretty well elsewhere how to configure it.

Here, instead, I want to walk thru the process and the queries I use to understand where the noise comes from and what could be filtered – and get an estimate of how much space we could be saving if filter one way or another.

Number of Events per User

–event count by User (with Percentages)
declare @total float
select @total = count(HeaderUser) from AdtServer.dvHeader
select count(HeaderUser),HeaderUser, cast(convert(float,(count(HeaderUser)) / (convert(float,@total)) * 100) as decimal(10,2))
from AdtServer.dvHeader
group by HeaderUser
order by count(HeaderUser) desc

In our example above, over the 14 days we were observing, we obtained percentages like the following ones:

#evt HeaderUser Account Percent
204,904,332 SYSTEM 40.79 %
18,811,139 LOCAL SERVICE 3.74 %
14,883,946 ANONYMOUS LOGON 2.96 %
10,536,317 appintrauser 2.09 %
5,590,434 mossfarmusr

Just by looking at this, it is pretty clear that filtering out events tracked by the accounts “SYSTEM”, “LOCAL SERVICE” and “ANONYMOUS”, we would save over 45% of the disk space!

Number of Events by EventID

Similarly, we can look at how different Event IDs have different weights on the total amount of events tracked in the database:

–event count by ID (with Percentages)
declare @total float
select @total = count(EventId) from AdtServer.dvHeader
select count(EventId),EventId, cast(convert(float,(count(EventId)) / (convert(float,@total)) * 100) as decimal(10,2))
from AdtServer.dvHeader
group by EventId
order by count(EventId) desc

We would get some similar information here:

Event ID Meaning Sum of events Percent
538 A user logged off 99,494,648 27.63
540 Successful Network Logon 97,819,640 27.16
672 Authentication Ticket Request 52,281,129 14.52
680 Account Used for Logon by (Windows 2000) 35,141,235 9.76
576 Specified privileges were added to a user's access token. 26,154,761 7.26
8086 Custom Application ID 18,789,599 5.21
673 Service Ticket Request 10,641,090 2.95
675 Pre-Authentication Failed 7,890,823 2.19
552 Logon attempt using explicit credentials 4,143,741 1.15
539 Logon Failure – Account locked out 2,383,809 0.66
528 Successful Logon 1,764,697 0.49

Also, do not forget that ACS provides some report to do this type of analysis out of the box, even if for my experience they are generally slower – on large datasets – than the queries provided here. Also, a number of reports have been buggy over time, so I just prefer to run queries and be on the safe side.

Below an example of such report (even if run against a different environment – just in case you were wondering why the numbers were not the same ones :-) ):Event Counts ACS Default Report

The numbers and percentages we got from the two queries above should already point us in the right direction about what we might want to adjust in either our auditing policy directly on Windows and/or decide if there is something we want to filter out at the collector level (here you should ask yourself the question: “if they aren’t worth collecting are they worth generating?” – but I digress).

Also, a permutation of the above two queries should let you see which user is generating the most “noise” in regards to some events and not other ones… for example:

–event distribution for a specific user (change the @user) – with percentages for the user and compared with the total #events in the DB
declare @user varchar(255)
set @user = 'SYSTEM'
declare @total float
select @total = count(Id) from AdtServer.dvHeader
declare @totalforuser float
select @totalforuser = count(Id) from AdtServer.dvHeader where HeaderUser = @user
select count(Id), EventID, cast(convert(float,(count(Id)) / convert(float,@totalforuser) * 100) as decimal(10,2)) as PercentageForUser, cast(convert(float,(count(Id)) / (convert(float,@total)) * 100) as decimal(10,2)) as PercentageTotal
from AdtServer.dvHeader
where HeaderUser = @user
group by EventID
order by count(Id) desc

The above is particularly important, as we might want to filter out a number of events for the SYSTEM account (i.e. logons that occur when starting and stopping services) but we might want to keep other events that are tracked by the SYSTEM account too, such as an administrator having wiped the Security Log clean – which might be something you want to keep:

Event ID 517 Audit Log was cleared

of course the amount of EventIDs 517 over the total of events tracked by the SYSTEM account will not be as many, and we can still filter the other ones out.

Number of Events by EventID and by User

We could also combine the two approaches above – by EventID and by User:

select count(Id),HeaderUser, EventId

from AdtServer.dvHeader

group by HeaderUser, EventId

order by count(Id) desc

This will produce a table like the following one

SQL Query: Events by EventID and by User

which can be easily copied/pasted into Excel in order to produce a pivot Table:

Pivot Table

Cluster EventLog Replication

One more aspect that is less widely known, but I think is worth showing, is the way that clusters behave when in ACS. I don’t mean all clusters… but if you keep the “eventlog replication” feature of clusters enabled (you should disable it also from a monitoring perspective, but I digress), each cluster node’s security eventlog will have events not just for itself, but for all other nodes as well.

Albeit I have not found a reliable way to filter out – other than disabling eventlog replication altogether.

Anyway, just to get an idea of how much this type of “duplicate” events weights on the total, I use the following query, that tells you how many events for each machine are tracked by another machine:

–to spot machines that are cluster nodes with eventlog repliation and write duplicate events (slow)

select Count(Id) as Total,replace(right(AgentMachine, (len(AgentMachine) – patindex('%\%',AgentMachine))),'$',") as ForwarderMachine, EventMachine

from AdtServer.dvHeader

–where ForwarderMachine <> EventMachine

group by EventMachine,replace(right(AgentMachine, (len(AgentMachine) – patindex('%\%',AgentMachine))),'$',")

order by ForwarderMachine,EventMachine

Cluster Events

Those presented above are just some of the approaches I usually look into at first. Of course there are a number more. Here I am including the same queries already shown in action, plus a few more that can be useful in this process.

I have even considered building a page with all these queries – a bit like those that Kevin is collecting for OpsMgr (we actually wrote some of them together when building the OpsMgr Health Check)… shall I move the below queries on such a page? I though I’d list them here and give some background on how I normally use them, to start off with.

Some more Useful Queries

–top event ids
select count(EventId), EventId
from AdtServer.dvHeader
group by EventId
order by count(EventId) desc

–event count by ID (with Percentages)
declare @total float
select @total = count(EventId) from AdtServer.dvHeader
select count(EventId),EventId, cast(convert(float,(count(EventId)) / (convert(float,@total)) * 100) as decimal(10,2))
from AdtServer.dvHeader
group by EventId
order by count(EventId) desc

–which machines have ever written event 538
select distinct EventMachine, count(EventId) as total
from AdtServer.dvHeader
where EventID = 538
group by EventMachine

–machines
select * from dtMachine

–machines (more readable)
select replace(right(Description, (len(Description) – patindex('%\%',Description))),'$',")
from dtMachine

–events by machine
select count(EventMachine), EventMachine
from AdtServer.dvHeader
group by EventMachine

–rows where EventMachine field not available (typically events written by ACS itself for chekpointing)
select *
from AdtServer.dvHeader
where EventMachine = 'n/a'

–event count by day
select convert(varchar(20), CreationTime, 102) as Date, count(EventMachine) as total
from AdtServer.dvHeader
group by convert(varchar(20), CreationTime, 102)
order by convert(varchar(20), CreationTime, 102)

–event count by day and by machine
select convert(varchar(20), CreationTime, 102) as Date, EventMachine, count(EventMachine) as total
from AdtServer.dvHeader
group by EventMachine, convert(varchar(20), CreationTime, 102)
order by convert(varchar(20), CreationTime, 102)

–event count by machine and by date (distinuishes between AgentMachine and EventMachine
select convert(varchar(10),CreationTime,102),Count(Id),EventMachine,AgentMachine
from AdtServer.dvHeader
group by convert(varchar(10),CreationTime,102),EventMachine,AgentMachine
order by convert(varchar(10),CreationTime,102) desc ,EventMachine

–event count by User
select count(Id),HeaderUser
from AdtServer.dvHeader
group by HeaderUser
order by count(Id) desc

–event count by User (with Percentages)
declare @total float
select @total = count(HeaderUser) from AdtServer.dvHeader
select count(HeaderUser),HeaderUser, cast(convert(float,(count(HeaderUser)) / (convert(float,@total)) * 100) as decimal(10,2))
from AdtServer.dvHeader
group by HeaderUser
order by count(HeaderUser) desc

–event distribution for a specific user (change the @user) – with percentages for the user and compared with the total #events in the DB
declare @user varchar(255)
set @user = 'SYSTEM'
declare @total float
select @total = count(Id) from AdtServer.dvHeader
declare @totalforuser float
select @totalforuser = count(Id) from AdtServer.dvHeader where HeaderUser = @user
select count(Id), EventID, cast(convert(float,(count(Id)) / convert(float,@totalforuser) * 100) as decimal(10,2)) as PercentageForUser, cast(convert(float,(count(Id)) / (convert(float,@total)) * 100) as decimal(10,2)) as PercentageTotal
from AdtServer.dvHeader
where HeaderUser = @user
group by EventID
order by count(Id) desc

–to spot machines that write duplicate events (such as cluster nodes with eventlog replication enabled)
select Count(Id),EventMachine,AgentMachine
from AdtServer.dvHeader
group by EventMachine,AgentMachine
order by EventMachine

–to spot machines that are cluster nodes with eventlog repliation and write duplicate events (better but slower)
select Count(Id) as Total,replace(right(AgentMachine, (len(AgentMachine) – patindex('%\%',AgentMachine))),'$',") as ForwarderMachine, EventMachine
from AdtServer.dvHeader
–where ForwarderMachine <> EventMachine
group by EventMachine,replace(right(AgentMachine, (len(AgentMachine) – patindex('%\%',AgentMachine))),'$',")
order by ForwarderMachine,EventMachine

–which user and from which machine is target of elevation (network service doing "runas" is a 552 event)
select count(Id),EventMachine, TargetUser
from AdtServer.dvHeader
where HeaderUser = 'NETWORK SERVICE'
and EventID = 552
group by EventMachine, TargetUser
order by count(Id) desc

–by hour, minute and user
–(change the timestamp)… this query is useful to search which users are active in a given time period…
–helpful to spot "peaks" of activities such as password brute force attacks, or other activities limited in time.
select datepart(hour,CreationTime) as Hours, datepart(minute,CreationTime) as Minutes, HeaderUser, count(Id) as total
from AdtServer.dvHeader
where CreationTime < '2010-02-22T16:00:00.000'
and CreationTime > '2010-02-22T15:00:00.000'
group by datepart(hour,CreationTime), datepart(minute,CreationTime),HeaderUser
order by datepart(hour,CreationTime), datepart(minute,CreationTime),HeaderUser

OpsMgr Eventlog analysis with Powershell

Wednesday, December 16th, 2009

The following technique should already be understood by any powersheller. Here we focus on Operations Manager log entries, even if the data mining technique shows is entirely possibly – and encouraged :-) – with any other event log.

Let’s start by getting our eventlog into a variable called $evt:

PS  >> $evt = Get-Eventlog “Operations Manager”

The above only works locally in POSH v1.

In POSH v2 you can go remotely by using the “-computername” parameter:

PS  >> $evt = Get-Eventlog “Operations Manager” –computername RMS.domain.com

Anyhow, you can get to this remotely also in POSHv1 with this other more “dotNET-tish” syntax:

PS >> $evt = (New-Object System.Diagnostics.Eventlog -ArgumentList "Operations Manager").get_Entries()

you could even export this (or any of the above) to a CLIXML file:

PS >> (New-Object System.Diagnostics.Eventlog -ArgumentList "Operations Manager").get_Entries() | export-clixml -path c:\evt\Evt-OpsMgr-RMS.MYDOMAIN.COM.xml

and then you could reload your eventlog to another machine:

PS  >> $evt = import-clixml c:\evt\Evt-OpsMgr-RMS.MYDOMAIN.COM.xml

whatever way you used to populate your $evt  variable, be it from a “live” eventlog or by re-importing it from XML, you can then start analyzing it:

PS  >> $evt | where {$_.Entrytype -match "Error"} | select EventId,Source,Message | group eventid

Count Name                      Group
—– —-                      —–
1510 4509                      {@{EventID=4509; Source=HealthService; Message=The constructor for the managed module type "Microsoft.EnterpriseManagement.Mom.DatabaseQueryModules.GroupCalculatio.
   15 20022                     {@{EventID=20022; Source=OpsMgr Connector; Message=The health service {7B0E947B-2055…
    3 26319                     {@{EventID=26319; Source=OpsMgr SDK Service; Message=An exception was thrown while p…
    1 4512                      {@{EventID=4512; Source=HealthService; Message=Converting data batch to XML failed w…

the above is functionally identical to the following:

PS  >> $evt | where {$_.Entrytype -eq 1} | select EventID,Source,Message | group eventid

Count Name                      Group
—– —-                      —–
1510 4509                      {@{EventID=4509; Source=HealthService; Message=The constructor for the managed modul…
   15 20022                     {@{EventID=20022; Source=OpsMgr Connector; Message=The health service {7B0E947B-2055…
    3 26319                     {@{EventID=26319; Source=OpsMgr SDK Service; Message=An exception was thrown while p…
    1 4512                      {@{EventID=4512; Source=HealthService; Message=Converting data batch to XML failed w…

Note that Eventlog Entries’ type is an ENUM that has values of 0,1,2 – similarly to OpsMgr health states – but beware that their order is not the same, as shown in the following table:

Code OpsMgr States Events EntryType
0 Not Monitored Information
1 Success Error
2 Warning Warning
3 Critical

Let’s now look at Information Events (Entrytype –eq 0)

PS  >> $evt | where {$_.Entrytype -eq 0} | select EventID,Source,Message | group eventid

Count Name                      Group
—– —-                      —–
4135 2110                      {@{EventID=2110; Source=HealthService; Message=Health Service successfully transferr…
1548 21025                     {@{EventID=21025; Source=OpsMgr Connector; Message=OpsMgr has received new configura…
4644 7026                      {@{EventID=7026; Source=HealthService; Message=The Health Service successfully logge…
1548 7023                      {@{EventID=7023; Source=HealthService; Message=The Health Service has downloaded sec…
1548 7025                      {@{EventID=7025; Source=HealthService; Message=The Health Service has authorized all…
1548 7024                      {@{EventID=7024; Source=HealthService; Message=The Health Service successfully logge…
1548 7028                      {@{EventID=7028; Source=HealthService; Message=All RunAs accounts for management gro…
   16 20021                     {@{EventID=20021; Source=OpsMgr Connector; Message=The health service {7B0E947B-2055…
   13 7019                      {@{EventID=7019; Source=HealthService; Message=The Health Service has validated all …
    4 4002                      {@{EventID=4002; Source=Health Service Script; Message=Microsoft.Windows.Server.Logi…

 

And “Warning” events (Entrytype –eq 2):

PS  >> $evt | where {$_.Entrytype -eq 2} | select EventID,Source,Message | group eventid

Count Name                      Group
—– —-                      —–
1511 1103                      {@{EventID=1103; Source=HealthService; Message=Summary: 1 rule(s)/monitor(s) failed …
  501 20058                     {@{EventID=20058; Source=OpsMgr Connector; Message=The Root Connector has received b…
    5 29202                     {@{EventID=29202; Source=OpsMgr Config Service; Message=OpsMgr Config Service could …
  421 31501                     {@{EventID=31501; Source=Health Service Modules; Message=No primary recipients were …
   18 10103                     {@{EventID=10103; Source=Health Service Modules; Message=In PerfDataSource, could no…
    1 29105                     {@{EventID=29105; Source=OpsMgr Config Service; Message=The request for management p…

 

 

Ok now let’s see those event 20022, for example… so we get an idea of which healthservices they are referring to (20022 indicates" “hearthbeat failure”, btw):

PS  >> $evt | where {$_.eventid -eq 20022} | select message

Message
——-
The health service {7B0E947B-2055-C12A-B6DB-DD6B311ADF39} running on host webapp3.domain1.mydomain.com and s…
The health service {E3B3CCAA-E797-4F08-860F-47558B3DA477} running on host SERVER1.domain2.mydomain.com and serving…
The health service {E3B3CCAA-E797-4F08-860F-47558B3DA477} running on host SERVER1.domain2.mydomain.com and serving…
The health service {E3B3CCAA-E797-4F08-860F-47558B3DA477} running on host SERVER1.domain2.mydomain.com and serving…
The health service {52E16F9C-EB1A-9FAF-5B9C-1AA9C8BC28E3} running on host DC4WK3.domain1.mydomain.com and se…
The health service {F96CC9E6-2EC4-7E63-EE5A-FF9286031C50} running on host VWEBDL2.domain1.mydomain.com and s…
The health service {71987EE0-909A-8465-C32D-05F315C301CC} running on host VDEVWEBPROBE2.domain2.mydomain.com….
The health service {BAF6716E-54A7-DF68-ABCB-B1101EDB2506} running on host XP2SMS002.domain2.mydomain.com and serving mana…
The health service {30C81387-D5E0-32D6-C3A3-C649F1CF66F1} running on host stgweb3.domain3.mydomain.com and…
The health service {3DCDD330-BBBB-B8E8-4FED-EF163B27DE0A} running on host VWEBDL1.domain1.mydomain.com and s…
The health service {13A47552-2693-E774-4F87-87DF68B2F0C0} running on host DC2.domain4.mydomain.com and …
The health service {920BF9A8-C315-3064-A5AA-A92AA270529C} running on host FSCLU2 and serving management group Pr…
The health service {FAA3C2B5-C162-C742-786F-F3F8DC8CAC2F} running on host WEBAPP4.domain1.mydomain.com and s…
The health service {3DCDD330-BBBB-B8E8-4FED-EF163B27DE0A} running on host WEBDL1.domain1.mydomain.com and s…
The health service {3DCDD330-BBBB-B8E8-4FED-EF163B27DE0A} running on host WEBDL1.domain1.mydomain.com and s…

 

or let’s look at some warning for the Config Service:

PS  >> $evt | where {$_.Eventid -eq 29202}

   Index Time          EntryType   Source                 InstanceID Message
   —– —-          ———   ——                 ———- ——-
5535065 Dec 07 21:18  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…
5543960 Dec 09 16:39  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…
5545536 Dec 10 01:06  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…
5553119 Dec 11 08:24  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…
5555677 Dec 11 10:34  Warning     OpsMgr Config Ser…   2147512850 OpsMgr Config Service could not retrieve a cons…

Once seen those, can you remember of any particular load you had on those days that justifies the instance space changing so quickly that the Config Service couldn’t keep up?

 

Or let’s group those events with ID 21025 by hour, so we know how many Config recalculations we’ve had (which, if many, might indicate Config Churn):

PS  >> $evt | where {$_.Eventid -eq 21025} | select TimeGenerated | % {$_.TimeGenerated.ToShortDateString()} | group

Count Name                      Group
—– —-                      —–
   39 12/7/2009                 {12/7/2009, 12/7/2009, 12/7/2009, 12/7/2009…}
  203 12/8/2009                 {12/8/2009, 12/8/2009, 12/8/2009, 12/8/2009…}
  217 12/9/2009                 {12/9/2009, 12/9/2009, 12/9/2009, 12/9/2009…}
  278 12/10/2009                {12/10/2009, 12/10/2009, 12/10/2009, 12/10/2009…}
  259 12/11/2009                {12/11/2009, 12/11/2009, 12/11/2009, 12/11/2009…}
  224 12/12/2009                {12/12/2009, 12/12/2009, 12/12/2009, 12/12/2009…}
  237 12/13/2009                {12/13/2009, 12/13/2009, 12/13/2009, 12/13/2009…}
   91 12/14/2009                {12/14/2009, 12/14/2009, 12/14/2009, 12/14/2009…}

 

Event ID 21025 shows that there is a new configuration for the Management Group.

Event ID 29103 has a similar wording, but shows that there is a new configuration for a given Healthservice. These should normally be many more events, unless your only health Service is the RMS, which is unlikely…

If we look at the event description (“message”) in search for the name (or even the GUID, as both are present) or our RMS, as follows, then they should be the same numbers of the 21025 above:

PS  >> $evt | where {$_.Eventid -eq 29103} | where {$_.message -match "myrms.domain.com"} | select TimeGenerated | % {$_.TimeGenerated.ToShortDateString()} | group

Count Name                      Group
—– —-                      —–
   39 12/7/2009                 {12/7/2009, 12/7/2009, 12/7/2009, 12/7/2009…}
  203 12/8/2009                 {12/8/2009, 12/8/2009, 12/8/2009, 12/8/2009…}
  217 12/9/2009                 {12/9/2009, 12/9/2009, 12/9/2009, 12/9/2009…}
  278 12/10/2009                {12/10/2009, 12/10/2009, 12/10/2009, 12/10/2009…}
  259 12/11/2009                {12/11/2009, 12/11/2009, 12/11/2009, 12/11/2009…}
  224 12/12/2009                {12/12/2009, 12/12/2009, 12/12/2009, 12/12/2009…}
  237 12/13/2009                {12/13/2009, 12/13/2009, 12/13/2009, 12/13/2009…}
   91 12/14/2009                {12/14/2009, 12/14/2009, 12/14/2009, 12/14/2009…}

 

Going back to the initial counts of events by their IDs, when showing the errors the counts above had spotted the presence of a lonely 4512 event, which might have gone undetected if just browsing the eventlog with the GUI, since it only occurred once.

Let’s take a look at it:

PS  >> $evt | where {$_.eventid -eq 4512}

   Index Time          EntryType   Source                 InstanceID Message
   —– —-          ———   ——                 ———- ——-
5560756 Dec 12 11:18  Error       HealthService          3221229984 Converting data batch to XML failed with error …

Now, when it is about counts, Powershell is great.  But sometimes Powershell makes it difficult to actually READ the (long) event messages (descriptions) in the console. For example, our event ID 4512 is difficult to read in its entirety and gets truncated with trailing dots…

we can of course increase the window size and/or selecting only THAT one field to read it better:

PS  >> $evt | where {$_.eventid -eq 4512} | select message

Message
——-
Converting data batch to XML failed with error "Not enough storage is available to complete this operation." (0x8007000E) in rule "Microsoft.SystemCenter.ConfigurationService.CollectionRule.Event.ConfigurationChanged" running for instance "RMS.MYDOMAIN.COM" with id:"{04F4ADED-2C7F-92EF-D620-9AF9685F736F}" in management group "SCOMPROD"

Or, worst case, if it still does not fit, we can still go and search for it in the actual, usual eventlog application… but at least we will have spotted it!

 

The above wants to give you an idea of what is easily accomplished with some simple one-liners, and how it can be a useful aid in analyzing/digging into Eventlogs.

All of the above is ALSO be possible with Logparser, and it would actually be even less heavy on memory usage and it will be quicker, to be honest!

I just like Powershell syntax a lot more, and its ubiquity, which makes it a better option for me. Your mileage may vary, of course.

Invoking Methods on the Xplat agent with WINRM

Monday, October 26th, 2009

So I was testing other stuff tonight, to be honest, but I got pinged on Instant Messenger by my geek friend and colleague Stefan Stranger who pointed me at his request for help here http://friendfeed.com/sstranger/4571f39b/help-needed-on-winrs-or-winrm-and-openwsman-to

He wanted to use WINRM or any other command line utility to interact with the Xplat agent, and call methods on the Unix machine from windows. This could be very useful to – for example – restart a service (in fact it is what the RECOVERY actions in the Xplat Management Packs do, btw).

At first I told him I had only tested enumerations – such as on this other post http://www.muscetta.com/2009/06/01/using-the-scx-agent-with-wsman-from-powershell-v2/ … but the question intrigued me, so I check out the help for winrm’s INVOKE verb:

clip_image002

Which told me that you can pass in the parameters for the method to be called/invoked either as an hashtable @{KEY=”value”;KEY2=”value”}, or as an input XML file. I first tried the XML file but I could not get its format right.

After a few more minutes of trying, I figured out the right syntax.

This one works, for example:

winrm invoke ExecuteCommand http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_OperatingSystem?__cimnamespace=root/scx @{command="ps";timeout="60"} -username:root -password:password -auth:basic -r:https://virtubuntu.huis.dom:1270/wsman -skipCACheck -encoding:UTF-8

clip_image004

Happy remote management of your unix systems from Windows :-)

PS> Get-Milk

Thursday, September 17th, 2009

PS> Get-Milk

I printed a tshirt for Sara with a baby-friendly Powershell cmdlet ("Get-Milk").
She already seems to be wondering what script she can write with it.

PS> Get-Milk

PS> Get-Milk

The mystery of the lost registry values

Thursday, September 10th, 2009

During the OpsMgr Health Check engagement we use custom code to assess the customer’s Management group, as I wrote here already. Given that the customer tells us which machine is the RMS, one of the very first things that we do in our tool is to connect to the RMS’s registry, and check the values under HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup to see which machine holds the database. It is a rather critical piece of information for us, as we run a number of queries afterward… so we need to know where the db is, obviously :-)

I learned from here http://mybsinfo.blogspot.com/2007/01/powershell-remote-registry-and-you-part.html how to access registry remotely thru powershell, by using .Net classes. This is also one of the methods illustrated in this other article on Technet Script Center http://www.microsoft.com/technet/scriptcenter/resources/qanda/jan09/hey0105.mspx 

Therefore the “core” instructions of the function I was using to access the registry looked like the following

  1. Function GetValueFromRegistry ([string]$computername, $regkey, $value)   
  2. {  
  3.      $reg = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey('LocalMachine', $computername)  
  4.      $regKey= $reg.OpenSubKey("$regKey")  
  5.      $result = $regkey.GetValue("$value")  
  6.      return $result 
  7. }  

 

[Note: the actual function is bigger, and contains error handling, and logging, and a number of other things that are unnecessary here]

Therefore, the function was called as follows:
GetValueFromRegistry $RMS "SOFTWARE\\Microsoft\\Microsoft Operations Manager\\3.0\\Setup" "DatabaseServerName"
Now so far so good.

In theory.

 

Now for some reason that I could not immediately explain, we had noticed that this piece of code performing registry accessm while working most of the times, only on SOME occasions was giving errors about not being able to open the registry value…

image

When you are onsite with a customer conducting an assessment, the PFE engineer does not always has the time to troubleshoot the error… as time is critical, we have usually resorted to just running the assessment from ANOTHER machine, and this “solved” the issue… but always left me wondering WHY this was giving an error. I had suspected an issue with permissions first, but it could not be as the permissions were obviously right: performing the assessment from another machine but with the same user was working!

A few days ago my colleague and buddy Stefan Stranger figured out that this was related to the platform architecture:

  • X64 client to x64 RMS was working
  • X64 client to x86 RMS was working
  • X86 client to x86 RMS was working
  • X86 client to x64 RMS was NOT working

You don’t need to use our custom code to reproduce this, REGEDIT shows the behavior as well.

If, from a 64-bit server, you open a remote registry connection to 64-bit RMS server, you can see all OpsMgr registry keys:

clip_image002

If, anyhow, from a 32-bit server, you open a remote registry connection to 64-bit RMS server, you don’t see ALL – but only SOME – OpsMgr registry keys:
clip_image004

So here’s the reason! This is what was happening! How could I not think of this before? It was nothing related to permissions, but to registry redirection! The issue was happening because the 32 bit machine is using the 32bit registry editor and what it will do when accessing a 64bit machine will be to default to the Wow6432Node location in the registry. There all OpsMgr data won’t be in the WOW64 location on a 64bit machine, only some.

So, just like regedit, the 32bit powershell and the 32bit .Net framework were being redirected to the 32bit-compatibility registry keys… not finding the stuff we needed, whereas a 64bit application could find that. Any 32bit application by default gets redirected to a 32bit-safe registry.

So, after finally UNDERSTANDING what the issue was, I started wondering: ok… but how can I access the REAL “HLKM\SOFTWARE\Microsoft” key on a 64bit machine when running this FROM a 32bit machine – WITHOUT being redirected to “HKLM\SOFTWARE\Wow6432Node\Microsoft” ? What if my application CAN deal just fine with those values and actually NEEDs to access them?

The answer wasn’t as easy as the question. I did a bit of digging on this, and still I have NOT yet found a way to do this with the .Net classes. It seems that in a lot of situations, Powershell or even .Net classes are nice and sweet wrappers on the underlying Windows APIs… but for how sweet and easy they are, they are very often not very complete wrappers – letting you do just about enough for most situations, but not quite everything you would or could with the APi underneath. But I digress, here…

The good news is that I did manage to get this working, but I had to resort to using dear old WMI StdRegProvider… There are a number of locations on the Internet mentioning the issue of accessing 32bit registry from 64bit machines or vice versa, but all examples I have found were using VBScript. But I needed it in Powershell. Therefore I started with the VBScript example code that is present here, and I ported it to Powershell.

Handling the WMI COM object from Powershell was slightly less intuitive than in VBScript, and it took me a couple of hours to figure out how to change some stuff, especially this bit that sets the parameters collection:

Set Inparams = objStdRegProv.Methods_("GetStringValue").Inparameters

Inparams.Hdefkey = HKLM

Inparams.Ssubkeyname = RegKey

Inparams.Svaluename = RegValue

Set Outparams = objStdRegProv.ExecMethod_("GetStringValue", Inparams,,objCtx)

INTO this:

$Inparams = ($objStdRegProv.Methods_ | where {$_.name -eq "GetStringValue"}).InParameters.SpawnInstance_()

($Inparams.Properties_ | where {$_.name -eq "Hdefkey"}).Value = $HKLM

($Inparams.Properties_ | where {$_.name -eq "Ssubkeyname"}).Value = $regkey

($Inparams.Properties_ | where {$_.name -eq "Svaluename"}).Value = $value

$Outparams = $objStdRegProv.ExecMethod_("GetStringValue", $Inparams, "", $objNamedValueSet)

 

I have only done limited testing at this point and, even if the actual work now requires nearly 15 lines of code to be performed vs. the previous 3 lines in the .Net implementation, it at least seems to work just fine.

What follows is the complete code of my replacement function, in all its uglyness glory:

 

  1. Function GetValueFromRegistryThruWMI([string]$computername, $regkey, $value)  
  2. {  
  3.     #constant for the HLKM  
  4.     $HKLM = "&h80000002" 
  5.  
  6.     #creates an SwbemNamedValueSet object
  7.     $objNamedValueSet = New-Object -COM "WbemScripting.SWbemNamedValueSet" 
  8.  
  9.     #adds the actual value that will requests the target to provide 64bit-registry info
  10.     $objNamedValueSet.Add("__ProviderArchitecture", 64) | Out-Null 
  11.  
  12.     #back to all the other usual COM objects for WMI that you have used a zillion times in VBScript
  13.     $objLocator = New-Object -COM "Wbemscripting.SWbemLocator" 
  14.     $objServices = $objLocator.ConnectServer($computername,"root\default","","","","","",$objNamedValueSet)  
  15.     $objStdRegProv = $objServices.Get("StdRegProv")  
  16.  
  17.     # Obtain an InParameters object specific to the method.  
  18.     $Inparams = ($objStdRegProv.Methods_ | where {$_.name -eq "GetStringValue"}).InParameters.SpawnInstance_()  
  19.   
  20.     # Add the input parameters  
  21.     ($Inparams.Properties_ | where {$_.name -eq "Hdefkey"}).Value = $HKLM 
  22.     ($Inparams.Properties_ | where {$_.name -eq "Ssubkeyname"}).Value = $regkey 
  23.     ($Inparams.Properties_ | where {$_.name -eq "Svaluename"}).Value = $value 
  24.  
  25.     #Execute the method  
  26.     $Outparams = $objStdRegProv.ExecMethod_("GetStringValue", $Inparams, "", $objNamedValueSet)  
  27.  
  28.     #shows the return value  
  29.     ($Outparams.Properties_ | where {$_.name -eq "ReturnValue"}).Value  
  30.  
  31.     if (($Outparams.Properties_ | where {$_.name -eq "ReturnValue"}).Value -eq 0)  
  32.     {  
  33.        write-host "it worked" 
  34.        $result = ($Outparams.Properties_ | where {$_.name -eq "sValue"}).Value  
  35.        write-host "Result: $result" 
  36.        return $result 
  37.     }  
  38.     else 
  39.     {  
  40.         write-host "nope" 
  41.     }  
  42. }  

 

which can be called similarly to the previous one:
GetValueFromRegistryThruWMI $RMS "SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Setup" "DatabaseServerName"

[Note: you don’t need the double\escape backslashes here, compared to the .Net implementation]

Enjoy your cross-architecture registry access: from 32bit to 64bit – and back!

SCX Evolutions

Sunday, July 19th, 2009

During the beta of the Cross-Platform extensions and of System Center Operations Manager 2007 R2, the product team had promised to eventually release the SCX Providers'source code.

Now that this promise has been mantained, and the SCX providers have been released on Codeplex at http://xplatproviders.codeplex.com/ it should be finally possible to entirely build your own unsupported agent package, starting from source code, without having to modify the original package as I have shown earlier on this blog.
Of course this will still be unsupported by Microsoft Product support, but will eventually work just fine!
This is an extraordinary event in my opinion, as it is not a common event that Microsoft releases code as open source, especially when this is part of one of the product it sells. I suspect we will see more of this as we going forward.

Also, at R2 release time, some official documentation about buildilng Cross-Plaform Management Packs has been published on Technet.

Anyway, I have in the past posted a number of posts on my blog under this tag http://www.muscetta.com/tag/xplat/ (I will continue to use that tag going forward) which show/describe how I hacked/modified both the existing MPs AND the SCX agent package to let it run on unsupported distributions (and I think they are still useful as they show a number of techniques about how to test, understand and troubleshoot the Xplat agent a bit. In fact, I have first learned how to understand and modify the RedHat MPs to monitor CentOS and eventually even modified the RPM package to run on Ubuntu (which also works on Debian 5/Lenny), eventually, as you can see because I am now using it to monitor – from home, across the Internet – the machine running this blog:

www.muscetta.com Performance in OpsMgr

Or even, with or without OpsMgr 2007 R2, you could write your own scripts to interact with those providers, by using your favourite Scripting Language.

After all, those experimentations with Xplat got me a fame of being a "Unix expert at Microsoft" (this expression still makes me laugh), as I was tweeting here:
Unix expert at Microsoft

But really, I have never hidden my interest for interoperability and the fact that I have been using Linux quite a bit in the past, and still do.

Also, one more related information is that the fine people at Xandros have released their Bridgeways Management Packs and at the same time also started their own blog at http://blog.xplatxperts.com/ where they discuss some troubleshooting techniques for the Xplat agent, both similar to what I have been writing about here and also – of course – specific to their own providers, that are in their XSM namespace.

Disclaimer

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I'VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION. The solution presented here IS NOT SUPPORTED by Microsoft.