Three quarters of 2015, my IT career and various ramblings

September is over. The first three quarters of 2015 are over.
This has been a very important year so far – difficult, but revealing. Everything has been about change, healing and renewal.

We moved back to Europe first, and you might have now also read my other post about leaving Microsoft, more recently.

This was a hard choice – it took many months to reach the conclusion this is what I needed to do.

Most people have gone thru strong programming: they think you have to be ‘successful’ at something. Success is externally defined, anyhow (as opposed to satisfaction which we define ourselves) and therefore you are supposed to study in college a certain field, then use that at work to build your career in the same field… and keep doing the same thing.

I was never like that – I didn’t go to college, I didn’t study as an ‘engineer’. I just saw there was a market opportunity to find a job when I started, studied on the job, eventually excelled at it. But it never was *the* road. It just was one road; it has served me well so far, but it was just one thing I tried, and it worked out.
How did it start? As a pre-teen, I had been interested in computers, then left that for a while, did ‘normal’ high school (in Italy at the time, this was really non-technological), then I tried to study sociology for a little bit – I really enjoyed the Cultural Anthropology lessons there, and we were smoking good weed with some folks outside of the university, but I really could not be asked to spend the following 5 or 10 years or my life just studying and ‘hanging around’ – I wanted money and independence to move out of my parent’s house.

So, without much fanfare, I revived my IT knowledge: upgraded my skill from the ‘hobbyist’ world of the Commodore 64 and Amiga scene (I had been passionate about modems and the BBS world then), looked at the PC world of the time, rode the ‘Internet wave’ and applied for a simple job at an IT company.

A lot of my friends were either not even searching for a job, with the excuse that there weren’t any, or spending time in university, in a time of change, where all the university-level jobs were taken anyway so that would have meant waiting even more after they had finished studying… I am not even sure they realized this until much later.
But I just applied, played my cards, and got my job.

When I went to sign it, they also reminded me they expected hard work at the simplest and humblest level: I would have to fix PC’s, printers, help users with networking issues and tasks like those – at a customer of theirs, a big company.
I was ready to roll up my sleeves and help that IT department however I would be capable of, and I did.
It all grew from there.

And that’s how my IT career started. I learned all I know of IT on the job and by working my ass off and studying extra hours and watching older/more expert colleagues and making experience.

I am not an engineer.
I am, at most, a mechanic.
I did learn a lot of companies and the market, languages, designs, politics, the human and technical factors in software engineering and the IT marketplace/worlds, over the course of the past 18 years.

But when I started, I was just trying to lend a honest hand, to get paid some money in return – isn’t that what work was about?

Over time IT got out of control. Like Venom, in the Marvel comics, that made its appearance as a costume that SpiderMan started wearing… and it slowly took over, as the ‘costume’ was in reality some sort of alien symbiotic organism (like a pest).

You might be wondering what I mean. From the outside I was a successful Senior Program Manager of a ‘hot’ Microsoft product.
Someone must have mistaken my diligence and hard work for ‘talent’ or ‘desire of career’ – but it never was.
I got pushed up, taught to never turn down ‘opportunities’.

But I don’t feel this is my path anymore.
That type of work takes too much metal energy off me, and made me neglect myself and my family. Success at the expense of my own health and my family’s isn’t worth it. Some other people wrote that too – in my case I stopped hopefully earlier.

So what am I doing now?

First and foremost, I am taking time for myself and my family.
I am reading (and writing)
I am cooking again
I have been catching up on sleep – and have dreams again
I am helping my father in law to build a shed in his yard
We bought a 14-years old Volkswagen van that we are turning into a Camper
I have not stopped building guitars – in fact I am getting setup to do it ‘seriously’ – so I am also standing up a separate site to promote that activity
I am making music and discovering new music and instruments
I am meeting new people and new situations

There’s a lot of folks out there who either think I am crazy (they might be right, but I am happy this way), or think this is some sort of lateral move – I am not searching for another IT job, thanks. Stop the noise on LinkedIn please: I don’t fit in your algorithms, I just made you believe I did, all these years.

Repost: Useful SetSPN tips

I just saw that my former colleague (PFE) Tristan has posted an interesting note about the use of SetSPN “–A” vs SetSPN “–S”. I normally don’t repost other people’s content, but I thought this would be useful as there are a few SPN used in OpsMgr and it is not always easy to get them all right… and you can find a few tricks I was not aware of, by reading his post.

Check out the original post at http://blogs.technet.com/b/tristank/archive/2011/10/10/psa-you-really-need-to-update-your-kerberos-setup-documentation.aspx

I have been chosen; Farewell my friends…

I have been in Premier Field Engineering for nearly 7 years (it was not even called PFE when I joined – it was just “another type of support”…) and I have to admit that it has been a fun, fun ride: I worked with awesome people and managed to make a difference with our products and services for many customers – directly working with some of those customers, as well as indirectly thru the OpsMgr Health Check program – the service I led for the last 3+ years, which nowadays gets delivered hundreds of times a year around the globe by my other fellow PFEs.

But it is time to move on: I have decided to go thru a big life change for me and my family, and I won’t be working as a Premier Field Engineer anymore as of next week.

But don’t panic – I am staying at Microsoft!

I have actually never been closer to Microsoft than now: we are packing and moving to Seattle the coming weekend, and on July 18th I will start working as a Program Manager in the Operations Manager product team, in Redmond. I am hoping this will enable me to make a difference with even more customers.

Exciting times ahead – wish me luck!

 

That said – PFE is hiring! If you are interested in working for Microsoft – we have open positions (including my vacant position in Italy) for almost all the Microsoft technologies. Simply visit http://careers.microsoft.com and search on “PFE”.

As for the OpsMgr Health Check, don’t you worry: it will continue being improved – I left it in the hands of some capable colleagues: Bruno Gabrielli, Stefan Stranger and Tim McFadden – and they have a plan and commitment to update it to OpsMgr 2012.

Improved ACS Partitions Query

This has been sitting on my hard drive for a long time. Long story short, the report I posted at Permanent Link to Audit Collection Services Database Partitions Size Report had a couple of bugs:

  1. it did not consider the size of the dtString_XXX tables but only the size of dtEvent_XXX tables – this would still give you an idea of the trends, but it could lead to quite different SIZE calculations
  2. the query was failing on some instances that have been installed with the wrong (unsupported) Collation settings.

I fixed both bugs, but I don’t have a machine with SQL 2005 and Visual Studio 2005 anymore… so I can’t rebuild my report – but I don’t want to distribute one that only works on SQL 2008 because I know that SQL2005 is still out there. This is partially the reason that held this post back.

Without waiting so much longer, therefore, I decided I’ll just give you the fixed query. Enjoy Smile

--Query to get the Partition Table
--for each partition we launch the sp_spaceused stored procedure to determine the size and other info

--partition list
select PartitionId,Status,PartitionStartTime,PartitionCloseTime 
into #t1
from dbo.dtPartition with (nolock)
order by PartitionStartTime Desc 

--sp_spaceused holder table for dtEvent
create table #t2 (
    PartitionId nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    rows nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    reserved nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    data nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    index_size nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    unused nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS    
)

--sp_spaceused holder table for dtString
create table #t3 (
    PartitionId nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    rows nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    reserved nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    data nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    index_size nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS,
    unused nvarchar(MAX) Collate SQL_Latin1_General_CP1_CI_AS    
)

set nocount on

--vars used for building Partition GUID and main table name
declare @partGUID nvarchar(MAX)
declare @tblName nvarchar(MAX)
declare @tblNameComplete nvarchar(MAX)
declare @schema nvarchar(MAX)
DECLARE @vQuery NVARCHAR(MAX)

--cursor
declare c cursor for 
    select PartitionID from #t1
open c
fetch next from c into @partGUID

--start cursor usage
while @@FETCH_STATUS = 0
begin

--tblName - first usage for dtEvent
set @tblName = 'dtEvent_' + @partGUID

--retrieve the schema name
SET @vQuery = 'SELECT @dbschema = TABLE_SCHEMA from INFORMATION_SCHEMA.tables where TABLE_NAME = ''' + @tblName + ''''
EXEC sp_executesql @vQuery,N'@dbschema nvarchar(max) out, @dbtblName nvarchar(max)',@schema out, @tblname

--tblNameComplete
set @tblNameComplete = @schema + '.' + @tblName

INSERT #t2 
    EXEC sp_spaceused @tblNameComplete

--tblName - second usage for dtString
set @tblName = 'dtString_' + @partGUID

--retrieve the schema name
SET @vQuery = 'SELECT @dbschema = TABLE_SCHEMA from INFORMATION_SCHEMA.tables where TABLE_NAME = ''' + @tblName + ''''
EXEC sp_executesql @vQuery,N'@dbschema nvarchar(max) out, @dbtblName nvarchar(max)',@schema out, @tblname

--tblNameComplete
set @tblNameComplete = @schema + '.' + @tblName

INSERT #t3 
    EXEC sp_spaceused @tblNameComplete

fetch next from c into @partGUID
end
close c
deallocate c

--select * from #t2
--select * from #t3

--results
select #t1.PartitionId, 
    #t1.Status, 
    #t1.PartitionStartTime, 
    #t1.PartitionCloseTime, 
    #t2.rows,
    (CAST(LEFT(#t2.reserved,LEN(#t2.reserved)-3) AS NUMERIC(18,0)) + CAST(LEFT(#t2.reserved,LEN(#t2.reserved)-3) AS NUMERIC(18,0))) as 'reservedKB', 
    (CAST(LEFT(#t2.data,LEN(#t2.data)-3) AS NUMERIC(18,0)) + CAST(LEFT(#t3.data,LEN(#t3.data)-3) AS NUMERIC(18,0)))as 'dataKB', 
    (CAST(LEFT(#t2.index_size,LEN(#t2.index_size)-3) AS NUMERIC(18,0)) + CAST(LEFT(#t3.index_size,LEN(#t3.index_size)-3) AS NUMERIC(18,0))) as 'indexKB', 
    (CAST(LEFT(#t2.unused,LEN(#t2.unused)-3) AS NUMERIC(18,0)) + CAST(LEFT(#t3.unused,LEN(#t3.unused)-3) AS NUMERIC(18,0))) as 'unusedKB'
from #t1
join #t2
on #t2.PartitionId = ('dtEvent_' + #t1.PartitionId)
join #t3
on #t3.PartitionId = ('dtString_' + #t1.PartitionId)
order by PartitionStartTime desc

--cleanup
drop table #t1
drop table #t2
drop table #t3

OpsMgr Agents and Gateways Failover Queries

The following article by Jimmy Harper explains very well how to set up agents and gateways’ failover paths thru Powershell http://blogs.technet.com/b/jimmyharper/archive/2010/07/23/powershell-commands-to-configure-gateway-server-agent-failover.aspx . This is the approach I also recommend, and that article is great – I encourage you to check it out if you haven’t done it yet!

Anyhow, when checking for the actual failover paths that have been configured, the use of Powershell suggested by Jimmy is rather slow – especially if your agent count is high. In the Operations Manager Health Check tool I was also using that technique at the beginning, but eventually moved to the use of SQL queries just for performance reasons. Since then, we have been using these SQL queries quite successfully for about 3 years now.

But this the season of giving… and I guess SQL Queries can be a gift, right? Therefore I am now donating them as Christmas Gift to the OpsMrg community Smile

Enjoy – and Merry Christmas!

 

--GetAgentForWhichServerIsPrimary
SELECT SourceBME.DisplayName as Agent,TargetBME.DisplayName as Server
FROM Relationship R WITH (NOLOCK) 
JOIN BaseManagedEntity SourceBME 
ON R.SourceEntityID = SourceBME.BaseManagedEntityID 
JOIN BaseManagedEntity TargetBME 
ON R.TargetEntityID = TargetBME.BaseManagedEntityID 
WHERE R.RelationshipTypeId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceCommunication() 
AND SourceBME.DisplayName not in (select DisplayName 
from dbo.ManagedEntityGenericView WITH (NOLOCK) 
where MonitoringClassId in (select ManagedTypeId 
from dbo.ManagedType WITH (NOLOCK) 
where TypeName = 'Microsoft.SystemCenter.GatewayManagementServer') 
and IsDeleted ='0') 
AND SourceBME.DisplayName not in (select DisplayName from dbo.ManagedEntityGenericView WITH (NOLOCK) 
where MonitoringClassId in (select ManagedTypeId from dbo.ManagedType WITH (NOLOCK) 
where TypeName = 'Microsoft.SystemCenter.ManagementServer') 
and IsDeleted ='0') 
AND R.IsDeleted = '0'


--GetAgentForWhichServerIsFailover
SELECT SourceBME.DisplayName as Agent,TargetBME.DisplayName as Server
FROM Relationship R WITH (NOLOCK) 
JOIN BaseManagedEntity SourceBME 
ON R.SourceEntityID = SourceBME.BaseManagedEntityID 
JOIN BaseManagedEntity TargetBME 
ON R.TargetEntityID = TargetBME.BaseManagedEntityID 
WHERE R.RelationshipTypeId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceSecondaryCommunication() 
AND SourceBME.DisplayName not in (select DisplayName 
from dbo.ManagedEntityGenericView WITH (NOLOCK) 
where MonitoringClassId in (select ManagedTypeId 
from dbo.ManagedType WITH (NOLOCK) 
where TypeName = 'Microsoft.SystemCenter.GatewayManagementServer') 
and IsDeleted ='0') 
AND SourceBME.DisplayName not in (select DisplayName 
from dbo.ManagedEntityGenericView WITH (NOLOCK) 
where MonitoringClassId in (select ManagedTypeId 
from dbo.ManagedType WITH (NOLOCK) 
where TypeName = 'Microsoft.SystemCenter.ManagementServer') 
and IsDeleted ='0') 
AND R.IsDeleted = '0'


--GetGatewayForWhichServerIsPrimary
SELECT SourceBME.DisplayName as Gateway, TargetBME.DisplayName as Server
FROM Relationship R WITH (NOLOCK) 
JOIN BaseManagedEntity SourceBME 
ON R.SourceEntityID = SourceBME.BaseManagedEntityID 
JOIN BaseManagedEntity TargetBME 
ON R.TargetEntityID = TargetBME.BaseManagedEntityID 
WHERE R.RelationshipTypeId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceCommunication() 
AND SourceBME.DisplayName in (select DisplayName 
from dbo.ManagedEntityGenericView WITH (NOLOCK) 
where MonitoringClassId in (select ManagedTypeId 
from dbo.ManagedType WITH (NOLOCK) 
where TypeName = 'Microsoft.SystemCenter.GatewayManagementServer') 
and IsDeleted ='0') 
AND R.IsDeleted = '0'
    

--GetGatewayForWhichServerIsFailover
SELECT SourceBME.DisplayName As Gateway, TargetBME.DisplayName as Server
FROM Relationship R WITH (NOLOCK) 
JOIN BaseManagedEntity SourceBME 
ON R.SourceEntityID = SourceBME.BaseManagedEntityID 
JOIN BaseManagedEntity TargetBME 
ON R.TargetEntityID = TargetBME.BaseManagedEntityID 
WHERE R.RelationshipTypeId = dbo.fn_ManagedTypeId_MicrosoftSystemCenterHealthServiceSecondaryCommunication() 
AND SourceBME.DisplayName in (select DisplayName 
from dbo.ManagedEntityGenericView WITH (NOLOCK) 
where MonitoringClassId in (select ManagedTypeId 
from dbo.ManagedType WITH (NOLOCK) 
where TypeName = 'Microsoft.SystemCenter.GatewayManagementServer') 
and IsDeleted ='0') 
AND R.IsDeleted = '0'


--xplat agents
select bme2.DisplayName as XPlatAgent, bme.DisplayName as Server
from dbo.Relationship r with (nolock) 
join dbo.RelationshipType rt with (nolock) 
on r.RelationshipTypeId = rt.RelationshipTypeId 
join dbo.BasemanagedEntity bme with (nolock) 
on bme.basemanagedentityid = r.SourceEntityId 
join dbo.BasemanagedEntity bme2 with (nolock) 
on r.TargetEntityId = bme2.BaseManagedEntityId 
where rt.RelationshipTypeName = 'Microsoft.SystemCenter.HealthServiceManagesEntity' 
and bme.IsDeleted = 0 
and r.IsDeleted = 0 
and bme2.basemanagedtypeid in (SELECT DerivedTypeId 
FROM DerivedManagedTypes with (nolock) 
WHERE BaseTypeId = (select managedtypeid 
from managedtype where typename = 'Microsoft.Unix.Computer') 
and DerivedIsAbstract = 0)

Got Orphaned OpsMgr Objects?

Have you ever wondered what would happen if, in Operations Manager, you’d delete a Management Server or Gateway that managed objects (such as network devices) or has agents pointing uniquely to it as their primary server?

The answer is simple, but not very pleasant: you get ORPHANED objects, which will linger in the database but you won’t be able to “see” or re-assign anymore from the GUI.

So the first thing I want to share is a query to determine IF you have any of those orphaned agents. Or even if you know, since you are not able to “see” them from the console, you might have to dig their name out of the database. Here’s a query I got from a colleague in our reactive support team:


-- Check for orphaned health services (e.g. agent).
declare @DiscoverySourceId uniqueidentifier;
SET @DiscoverySourceId = dbo.fn_DiscoverySourceId_User();
SELECT TME.[TypedManagedEntityid], HS.PrincipalName
FROM MTV_HealthService HS
INNER JOIN dbo.[BaseManagedEntity] BHS WITH(nolock)
ON BHS.[BaseManagedEntityId] = HS.[BaseManagedEntityId]
-- get host managed computer instances
INNER JOIN dbo.[TypedManagedEntity] TME WITH(nolock)
ON TME.[BaseManagedEntityId] = BHS.[TopLevelHostEntityId]
AND TME.[IsDeleted] = 0
INNER JOIN dbo.[DerivedManagedTypes] DMT WITH(nolock)
ON DMT.[DerivedTypeId] = TME.[ManagedTypeId]
INNER JOIN dbo.[ManagedType] BT WITH(nolock)
ON DMT.[BaseTypeId] = BT.[ManagedTypeId]
AND BT.[TypeName] = N'Microsoft.Windows.Computer'
-- only with missing primary
LEFT OUTER JOIN dbo.Relationship HSC WITH(nolock)
ON HSC.[SourceEntityId] = HS.[BaseManagedEntityId]
AND HSC.[RelationshipTypeId] = dbo.fn_RelationshipTypeId_HealthServiceCommunication()
AND HSC.[IsDeleted] = 0
INNER JOIN DiscoverySourceToTypedManagedEntity DSTME WITH(nolock)
ON DSTME.[TypedManagedEntityId] = TME.[TypedManagedEntityId]
AND DSTME.[DiscoverySourceId] = @DiscoverySourceId
WHERE HS.[IsAgent] = 1
AND HSC.[RelationshipId] IS NULL;

Once you have identified the agent you need to re-assign to a new management server, this is doable from the SDK. Below is a powershell script I wrote which will re-assign it to the RMS. It has to run from within the OpsMgr Command Shell.
You still need to change the logic which chooses which agent – this is meant as a starting base… you could easily expand it into accepting parameters and/or consuming an input text file, or using a different Management Server than the RMS… you get the point.

  1. $mg = (get-managementgroupconnection).managementgroup
  2. $mrc = Get-RelationshipClass | where {$_.name –like “*Microsoft.SystemCenter.HealthServiceCommunication*”}
  3. $cmro = new-object Microsoft.EnterpriseManagement.Monitoring.CustomMonitoringRelationshipObject($mrc)
  4. $rms = (get-rootmanagementserver).HostedHealthService
  5. $deviceclass = $mg.getmonitoringclass(“HealthService”)
  6. $mc = Get-connector | where {$_.Name –like “*MOM Internal Connector*”}
  7. Foreach ($obj in $mg.GetMonitoringObjects($deviceclass))
  8. {
  9.     #the next line should be changed to pick the right agent to re-assign
  10.     if ($obj.DisplayName -match ‘dsxlab’)
  11.     {
  12.                 Write-host $obj.displayname
  13.                 $imdd = new-object Microsoft.EnterpriseManagement.ConnectorFramework.IncrementalMonitoringDiscoveryData
  14.                 $cmro.SetSource($obj)
  15.                 $cmro.SetTarget($rms)
  16.                 $imdd.Add($cmro)
  17.                 $imdd.Commit($mc)
  18.     }
  19. }

Similarly, you might get orphaned network devices. The script below is used to re-assign all Network Devices to the RMS. This script is actually something I have had even before the other one (yes, it has been sitting in my “digital drawer” for a couple of years or more…) and uses the same concept – only you might notice that the relation’s source and target are “reversed”, since the relationships are different:

  • the Management Server (source) “manages” the Network Device (target)
  • the Agent (source) “talks” to the Management Server (target)

With a bit of added logic it should be easy to have it work for specific devices.

  1. $mg = (get-managementgroupconnection).managementgroup
  2. $mrc = Get-RelationshipClass | where {$_.name –like “*Microsoft.SystemCenter.HealthServiceShouldManageEntity*”}
  3. $cmro = new-object Microsoft.EnterpriseManagement.Monitoring.CustomMonitoringRelationshipObject($mrc)
  4. $rms = (get-rootmanagementserver).HostedHealthService
  5. $deviceclass = $mg.getmonitoringclass(“NetworkDevice”)
  6. Foreach ($obj in $mg.GetMonitoringObjects($deviceclass))
  7. {
  8.                 Write-host $obj.displayname
  9.                 $imdd = new-object Microsoft.EnterpriseManagement.ConnectorFramework.IncrementalMonitoringDiscoveryData
  10.                 $cmro.SetSource($rms)
  11.                 $cmro.SetTarget($obj)
  12.                 $imdd.Add($cmro)
  13.                 $mc = Get-connector | where {$_.Name –like “*MOM Internal Connector*”}
  14.                 $imdd.Commit($mc)
  15. }

Disclaimer

The information in this weblog is provided “AS IS” with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided “AS IS” without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.

Does anyone have a new System Center sticker for me?

Does anyone have a new System Center sticker?

I got this sticker last APRIL at MMS2010 in JUST ONE COPY, and I waited till I got a NEW laptop in SEPTEMBER to actually use that…
It also took a while to stick it on properly (other than to re-install the PC as I wanted…),  but this week they told me that, for an error, I got given the wrong machine (they did it all themselves, tho – I did not ask for any specific one) and this one needs to be replaced!!!!

This is WORSE than any hardware FAILure, as the machine just works very well and I was expecting to keep it for the next two years 🙁

Can anyone be so nice to send me one of those awesome stickers again? 🙂

OpsMgr Event IDs Spreadsheet

I work in support (mostly with System Center Operations Manager, as you know), and I work with event logs every day. The following are typical situations:

  1. I get a colleague or a customer telling me “I am having a problem and the SCOM agent is showing 21037 events and 20002 events.  What’s wrong with it?”   
  2. I want to tune an OpsMgr environment and reduce load on the database by turning off a few event collections, as my friend Kevin Holman suggests here http://blogs.technet.com/kevinholman/archive/2009/11/25/tuning-tip-turning-off-some-over-collection-of-events.aspx .
  3. I am analyzing, sorting and grouping Events with Powershell like I have written on my blog lately http://www.muscetta.com/2009/12/16/opsmgr-eventlog-analysis-with-powershell/ but I can’t read those long descriptions properly.
  4. I exported an EVT from a customer environment and I load it on a machine that does not have OpsMgr message DLLs installed – all I see are EventIDs and type (Warning, Error) – but no real description – and I still want to figure out what those events are trying to tell me.

Getting to the point: I, like everyone – don’t have every OpsMgr event memorized.

This is why I thought of building this spreadsheet, and I hope it might come in handy to more people.

The spreadsheet contains an “AllEvents” list – and then the same events are broken down by event source as well:

clip_image002

When you want to search for an events (in one of the situations described above) just open up the spreadsheet, go to the “AllEvents” tab, hit CTRL+F (“Find”) and type in the Event ID you are searching for:

clip_image004

And this will take you to the row containing the event, so you can look up its description:

clip_image006

The description shows the event standard text (which is in the message DLL, therefore is the part you will not see if opening an EVT on another machine that does not have OpsMgr installed), and where the event parameters are (%1, %2, etc – which will be the strings you see in the EVT anyway).

That way you can get an understanding of what the original message would have looked like on the original machine.

This is just one possible usage pattern of this reference. It can also be useful to just read/study the events, learning about new ones you have never encountered, or remembering those you HAVE seen in the past but did not quite remember. And of course you can also find other creative ways to use it.

You can get it from here.

 

A few last words to give due credit: this spreadsheet has been compiled by using Eventlog Explorer (http://blogs.technet.com/momteam/archive/2008/04/02/eventlog-explorer.aspx ) to extract the event information out of the message DLLs on a OpsMgr2007 R2 installation. That info has been then copied and pasted in Excel in order to have an “offline” reference. Also I would like to thank Kevin Holman for pointing me to Eventlog Explorer first, and then for insisting I should not keep this spreadsheet in my drawer, as it could be useful to more people!

How to convert (and fixup) the RedHat RPM to run on Debian/Ubuntu

In an earlier post I had shown how I got the Xplat agent running on Ubuntu. I perfected the technique over time, and what follows is a step-by-step process on how to convert and change the RedHat package to run on Debian/Ubuntu. Of course this is still a hack… but some people asked me to detail it a bit more. At the same time, the cross platform team is working to update the the source code on codeplex with extra bits that will make more straightforward to grab it, modify it and re-compile it than it is today. Until then, here is how I got it to work.

I assume you have already copied the right .RPM package off the OpsMgr server’s /AgentManagement directory to the Linux box here. The examples below refer to the 32bit package, but of course the same identical technique would work for the 64bit version.

We start by converting the RPM package to DEB format:

root# alien -k scx-1.0.4-258.rhel.5.x86.rpm –scripts

scx_1.0.4-258_i386.deb generated

 

Then we need to create a folder where we will extract the content of the package, modify stuff, and repackage it:

root# mkdir scx_1.0.4-258_i386

root# cd scx_1.0.4-258_i386

root# ar -x ../scx_1.0.4-258_i386.deb

root# mkdir debian

root# cd debian

root# mkdir DEBIAN

root# cd DEBIAN

root# cd ../..

root# rm debian-binary

root# mv control.tar.gz debian/DEBIAN/

root# mv data.tar.gz debian/

root# cd debian

root# tar -xvzf data.tar.gz

root# rm data.tar.gz

root# cd DEBIAN/

root# tar -xvzf control.tar.gz

root# rm control.tar.gz

Now we have the “skeleton” of the package easily laid out on the filesystem and we are ready to modify the package and add/change stuff to and in it.

 

First, we need to add some stuff to it, which is expected to be found on a redhat distro, but is not present in debian. In particular:

1. You should copy the file “functions” (that you can get from a redhat/centos box under /etc/init.d) under the debian/etc/init.d folder in our package folder. This file is required/included by our startup scripts, so it needs to be deployed too.

Then we need to chang some of the packacge behavior by editing files under debian/DEBIAN:

2. edit the “control” file (a file describing what the package is, and does):

clip_image002

3. edit the “preinst” file (pre-installation instructions): we need to add instructions to copy the “issue” file onto “redhat-release” (as the SCX_OperatingSystem class will look into that file, and this is hard-coded in the binary, we need to let it find it):

clip_image004

these are the actual command lines to add for both packages (DEBIAN or UBUNTU):

# symbolic links for libaries called differently on Ubuntu and Debian vs. RedHat

ln -s /usr/lib/libcrypto.so.0.9.8 /usr/lib/libcrypto.so.6

ln -s /usr/lib/libssl.so.0.9.8 /usr/lib/libssl.so.6

the following bit would be Ubuntu-specific:

#we need this file for the OS provider relies on it, so we convert what we have in /etc/issue

#this is ok for Ubuntu (“Ubuntu 9.0.4 \n \l” becomes “Ubuntu 9.0.4”)

cat /etc/issue | awk ‘/\\n/ {print $1, $2}’ > /etc/redhat-release

while the following bit is Debian-specific:

#this is ok for Debian (“Debian GNU/Linux 5.0 \n \l” becomes “Debian GNU/Linux 5.0”)

cat /etc/issue | awk ‘/\\n/ {print $1, $2, $3}’ > /etc/redhat-release

 

4. Then we edit/modify the “postinst” file (post-installation instructions) as follows:

a. remove the 2nd and 3rd lines which look like the following

RPM_INSTALL_PREFIX=

export RPM_INSTALL_PREFIX

as they are only useful for the RPM system, not DEB/APT, so we don’t need them.

b. change the following 2 functions which contain RedHat-specific commands:

configure_pegasus_service() {

           /usr/lib/lsb/install_initd /etc/init.d/scx-cimd

}

start_pegasus_service() {

           service scx-cimd start

}

c. We need to change in the Debian equivalents for registering a service in INIT and starting it:

configure_pegasus_service() {

               update-rc.d scx-cimd defaults

}

start_pegasus_service() {

              /etc/init.d/scx-cimd start

}

5. Modify the “prerm” file (pre-removal instructions):

a. Just like “postinst”, remove the lines

RPM_INSTALL_PREFIX=

export RPM_INSTALL_PREFIX

b. Locate the two functions stopping and un-installing the service

stop_pegasus_service() {

         service scx-cimd stop

}

unregister_pegasus_service() {

          /usr/lib/lsb/remove_initd /etc/init.d/scx-cimd

}

c. Change those two functions with the Debian-equivalent command lines

stop_pegasus_service() {

           /etc/init.d/scx-cimd stop

}

unregister_pegasus_service() {

           update-rc.d -f scx-cimd remove

}

At this point the change we needed have been put in place, and we can re-build the DEB package.

Move yourself in the main folder of the application (the scx_1.0.4-258_i386 folder):

root# cd ../..

Create the package starting from the folders

root# dpkg-deb –build debian

dpkg-deb: building package `scx’ in `debian.deb’.

Rename the package (for Ubuntu)

root# mv debian.deb scx_1.0.4-258_Ubuntu_9_i386.deb

Rename the package (for Debian)

root# mv debian.deb scx_1.0.4-258_Debian_5_i386.deb

Install it

root# dpkg -i scx_1.0.4-258_Platform_Version_i386.deb

All done! It should install and work!

 

Next step would be creating a Management Pack to monitor Debian and Ubuntu. It is pretty similar to what Robert Hearn has described step by step for CentOS, but with some different replacements of strings, as you can imagine. I have done this but have not written down the procedure yet, so I will post another article on how to do this as soon as I manage to get it standardized and reliable. There is a bit more work involved for Ubuntu/Debian… as some of the daemons/services have different names, and certain files too… but nothing terribly difficult to change so you might want to try it already and have a go at it!

In the meantime, as a teaser, here’s my server’s (http://www.muscetta.com) performance, being monitored with this “hack”:

image

 

Disclaimer

The information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
THIS WORK IS NOT ENDORSED AND NOT EVEN CHECKED, AUTHORIZED, SCRUTINIZED NOR APPROVED BY MY EMPLOYER, AND IT ONLY REPRESENT SOMETHING WHICH I’VE DONE IN MY FREE TIME. NO GUARANTEE WHATSOEVER IS GIVEN ON THIS. THE AUTHOR SHALL NOT BE MADE RESPONSIBLE FOR ANY DAMAGE YOU MIGHT INCUR WHEN USING THIS INFORMATION. The solution presented here IS NOT SUPPORTED by Microsoft.

Audit Collection Services Database Partitions Size Report

A number of people I have talked to liked my previous post on ACS sizing. One thing that was not extremely easy or clear to them in that post was *how* exactly I did one thing I wrote:

[…] use the dtEvent_GUID table to get the number of events for that day, and use the stored procedure “sp_spaceused”  against that same table to get an overall idea of how much space that day is taking in the database […]

To be completely honest, I do not expect people to do this manually a hundred times if they have a hundred partitions. In fact, I have been doing this for a while with a script which will do the looping for me and run that sp_spaceused for me a number of time. I cannot share that script, but I do realize that this automation is very useful, therefore I wrote a “stand-alone” SQL query which, using a couple of temporary tables, produces a similar type of output. I also went a step further and packaged it into a SQL Server Reporting Services Report for everyone’s consumption. The report should look like the following screenshot, featuring a chart and the table with the numerical information about each and every partition in the database:

ACS Partitions Report

You can download the report from here.

You need to upload it to your report server, and change the data source to the shared Data Source that also the built-in ACS Reports use, and it should work.

[NOTE/UPDATE May 4th 2011: This report has a few bugs. I have posted the updated query on http://www.muscetta.com/2011/05/04/improved-acs-partitions-query/ . I am sorry I can’t provide a ready made report with the fix right now. Make sure you understand this and don’t implement it without testing.]

Enjoy!