Troubleshooting SCSM 2012 Cubes

Although I could stabilize cube processing through SQL Server Analysis Services, I was not able to stabilize the SCSM 2012 environment and see my nightly runs complete. I could probably get to that state by reinstalling the SCSM 2012 Data Warehouse, but this situation will happen again, and reinstallation is not a long-term solution for every outage.

After I corrected my nightly run schedule issue (discussed here), the DWMaintenance, MPSyncJob, and ETL jobs ran as expected, as did all but one cube. The SystemCenterConfigItemCube job failed with “Object reference not set to an instance of an object.”

Figure 1 Object reference not set to an instance of an object

After opening a ticket with Microsoft, we reviewed these errors from the Operations Manager log on the SCSM 2012 Data Warehouse server:

Event ID 33526 Incorrect user information detected. The domain name, user name or password were not valid. Ignoring the given credentials and proceeding with the default workflow account:

User information detected:
Domain name:
User name:

Per the SCSM Engineers, this is not a problem and can be ignored since the default workflow account has the necessary rights.

Event ID 33566 – An Exception was encountered while trying to process a cube. Cube Name: SystemCenterConfigItemCube Exception Message: Object reference not set to an instance of an object. Stack Trace: at Microsoft.SystemCenter.Warehouse.Olap.OlapCube.GetPartitionsToProcess(Cube asCube) at Microsoft.SystemCenter.Warehouse.Olap.OlapCube.Process(ManagementPackCube mpCube).

The key to the error is “GetPartitionsToProcess.” One of my ConfigItem cube partitions was missing an “object.” To validate the SCSM partitions for a particular cube, I ran this query against DWDataMart:

select * from etl.CubePartition where CubeName = 'SystemCenterConfigItemCube'

This listed all of the associated partitions. Reviewing each one, I noticed that every partition had FACTs for each month since June (the installation date) except the ComputerHostNetworkAdapter monthly partition, which was missing September. I scripted out the missing FACT using one of the other existing month FACTs as a template.
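To make a gap like that easier to spot, a query along these lines can list the partitions in name order so a missing month jumps out. This is a hedged sketch: the column names other than CubeName and WatermarkBatchId are assumptions about the etl.CubePartition schema, so adjust them to whatever the SELECT * above actually returns.

```sql
-- Sketch: list the ConfigItem cube's partitions so a missing month
-- (e.g. no _2012_Sep partition) stands out in the sequence.
-- PartitionName is an assumed column name; verify against SELECT *.
SELECT CubeName, PartitionName, WatermarkBatchId
FROM etl.CubePartition
WHERE CubeName = 'SystemCenterConfigItemCube'
ORDER BY PartitionName;
```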

I then re-ran the cube job successfully via PowerShell.
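For reference, re-running a cube job from PowerShell looks roughly like this. It is a sketch, not a transcript of my session: run it on the Data Warehouse management server, and the module path is an assumption based on a default install location.

```powershell
# Load the Data Warehouse cmdlets (path assumes a default SCSM 2012 install).
Import-Module 'C:\Program Files\Microsoft System Center 2012\Service Manager\Microsoft.EnterpriseManagement.Warehouse.Cmdlets.psd1'

# Start the failed cube processing job, then check its status.
Start-SCDWJob -JobName 'Process.SystemCenterConfigItemCube'
Get-SCDWJob -JobName 'Process.SystemCenterConfigItemCube'
```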

After further research, my theory is that the DWMaintenance job fails or stalls and is stopped. This would explain why I’m missing all the FACTs in a DIM. This happened to me again, and I solved it by recreating the FACTs.

 

Event ID 33545 – MP Sync did not associate the lower version of the Management Pack

This was due to an out-of-sync third party MP that I corrected with the vendor.

Event ID 4000 – A monitoring host is unresponsive or has crashed. The status code for the host failure was 2164195371.

This was occurring every 25 minutes. Under the guidance of a Microsoft engineer, we ran the C:\Program Files\Microsoft System Center 2012\Service Manager\Tools\SMTracing tool, and I provided the resulting capture to the engineer after recording the event occurrence. This was a very difficult issue to fix, and it took continued work with Microsoft to find a solution.

Root cause – The System Center Orchestration Configuration Library deployment never finishes. Check and verify the Deployment Status; it’s probably stuck in a running state.

The fix is to open a ticket with Microsoft Premier Support and ask them to fix the Deploy Sequence in the database. I don’t fully understand what was done, so I won’t post anything until I do. It’s a quick fix, but let the Microsoft professionals walk you through it.

Event ID 11366 – The Microsoft Operations Manager Scheduler Data Source Module failed to initialize because some window has no day of the week set.

The fix for this is discussed here.

Here’s an overview of the process I followed to solve problems with my cubes. In a nutshell, I didn’t find a holistic solution to the failing process, and each experience is probably different. My goal is to give you quicker access to the various solutions scattered across the web.

How I Measured Success – A Goal of Processed Cubes

  1. Able to run Travis Wright’s script successfully end-to-end (this can take over an hour depending on how long it’s been since a good run).
  2. See the following via the \Data Warehouse\Data Warehouse\Data Warehouse Jobs display for each job:
    1. Enabled – Yes
    2. Status – Not Started
  3. See the following via the \Data Warehouse\Data Warehouse\Cubes display for each cube:
    1. Status – PROCESSED
    2. Schema Pending Changes – NO
    3. Last Processed Date – Current, or accurate relative to the time you processed the cube.
  4. The SCSM Data Warehouse jobs that process the cubes each run successfully at their scheduled local times.

Note: Criterion 1 is probably the most important, as it provides a more realistic end-to-end run of what you’ll see from your automated processing.

Things You Should Know Before Troubleshooting SCSM 2012 Cubes (IMHO)

  • Manually processing the cubes from the Analysis Server and then running the SCSM Data Warehouse jobs is not going to cure the problem for the long term and is not recommended as an approach by the experts I talked to.
  • The Data Warehouse server PowerShell command (get-SCDWJobSchedule) reflects job schedules in GMT versus local time.
  • Log on to the SCSM servers with the same service account you use as the data warehouse RUN AS account.
  • Just because a Data Warehouse job is stopped doesn’t mean its modules are. Verify the module status with get-scdwjobmodule –jobname “NAME OF JOB” (e.g., get-scdwjobmodule –jobname “Process.SystemCenterWorkItemsCube”). Your Data Warehouse job will stall in a “Running” state if a module is still running when you restart the process.
  • SCSM 2012 checks the edition of your SQL Server Analysis Services. If it’s Enterprise, SCSM 2012 will process the cubes incrementally (PROCESS INCREMENTAL) once they have already been processed fully (PROCESS FULL), which helps keep the processing jobs from failing in the future. This functionality isn’t available in Standard.
  • Read Danny Chen’s 8-part blog and become an expert on how this process works. 1, 2, 3, 4, 5, 6, 7, and 8.
  • If a cube job started via the console runs longer than an hour or two, that is not normal. Run the cube manually at the SSAS and you’ll probably find a bad DIM. Unprocess bad DIMs, then process them FULL.
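To illustrate the two PowerShell points above, checking the schedules and the module status looks roughly like this sketch (run on the Data Warehouse management server with the SCSM cmdlets loaded):

```powershell
# Schedules come back in GMT, not local time - convert mentally before
# comparing to what you see in the console.
Get-SCDWJobSchedule

# A "stopped" job can still have a running module; verify before restarting.
Get-SCDWJobModule -JobName 'Process.SystemCenterWorkItemsCube' |
    Format-Table -AutoSize
```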

Troubleshooting Steps

Here are the paraphrased steps I followed to fix my cubes (details are in the section below) after they failed:

As with any code, you should run this in a test environment first.

  1. Ensured I was running SQL 2008 Enterprise.
  2. Looked at the Ops Manager event log for error ID 33573 – “The operation has been cancelled due to memory pressure.” If you see this on your SSAS, add more RAM. I have 32 GB of RAM, so I haven’t seen this issue, but there are a number of posts about it.
  3. Apply any pending schema changes (see below for details) via \Data Warehouse\Data Warehouse\Cubes — Apply Schema Changes
  4. Ran Travis Wright’s RUN-ETL.ps1 script. Successful? Then your cubes should run as scheduled. Failed? Follow the steps detailed in the “Manually Verifying SCSM 2012 Cubes” section below.
  5. Let your jobs run overnight and verify everything is working as expected.

Manually verifying SCSM 2012 Cubes

How to ensure your cubes will even process.

 

1. Via the OLAP server, PROCESS FULL all the cubes (process all or just the failed cube) 

Failures here were associated with dimensions missing key data. The error messages show the offending object, and you can correct this by processing the individual dimensions separately (see below). My failed DIMs showed a state of Process Update, so I always Unprocess and then Process Full. Eventually, all your cubes will process.

2. Unprocess the cubes at the SSAS.

This places the cubes in a Process Full requirement for the next processing run; if your watermark batchID is 0, SCSM will complete a full process and update the watermark. Check the properties of the cube for an Unprocessed state.
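If you prefer scripting the un-process step instead of clicking through SSMS, an XMLA ProcessClear command run from an XMLA query window does the same thing. This is a sketch; the database ID here is the DWASDataBase name used elsewhere in this post, so substitute your own if it differs.

```xml
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>DWASDataBase</DatabaseID>
    <CubeID>SystemCenterConfigItemCube</CubeID>
  </Object>
  <Type>ProcessClear</Type>
</Process>
```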

3.  Confirm that the DWDataMart database watermarks are set up for a Process Full.

When you manually process the cubes and then un-process them before your nightly runs, the watermark batchIDs usually need to be reset. To confirm this:

USE DWDataMart
GO

SELECT * FROM etl.CubePartition

This returns a result set showing either a batchID from a completed process or a batchID of 0. SCSM uses this watermark to determine the type of run for the next scheduled event.

Anything other than 0 will result in an incremental build, while 0 runs a full process. To reset all of your cubes, simply update the batchID to 0 after un-processing them all manually:

update etl.CubePartition set WatermarkBatchId = 0 where CubeName = 'SystemCenterCubeNameHereCube'
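To reset every cube in one pass after un-processing them all, the same update can be run without the CubeName filter. A sketch; verify what the table holds before and after, since this forces a full process of everything on the next run.

```sql
-- Force a full process for all cubes on the next scheduled run.
UPDATE etl.CubePartition
SET WatermarkBatchId = 0;

-- Confirm the reset.
SELECT CubeName, WatermarkBatchId FROM etl.CubePartition;
```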

4.  Start the appropriate cube processing job for the failed cube and wait for the results.

I’ve spent a lot of time working through this issue and may have left off a process or tip, so comment below if you need more guidance.

DS


About Doug Sigmon

IT Helpdesk manager in southern California. Love technology, gadgets, and golf.
This entry was posted in SCDW and SCSM 2012.

9 Responses to Troubleshooting SCSM 2012 Cubes

  1. Hi,

    Did you find a solution for error 4000?

    regards,

    • dougsigmon says:

      Working with Microsoft Premier Support, we were able to isolate it to a bad Management Pack that fails during one of the SCSM system processes (there are 9). Looks like it chokes on the DEPLOYMENT process, causing the 4000 event.

      I hope to have a solution rolled out today and tested. If successful, I’ll post the info asap.

      DS

  2. Thanks, wishing you a successful solution.

  3. dougsigmon says:

    Mohamed, just wrapped up the fix for Error 4000.

    Root cause – The System Center Orchestration Configuration Library never finishes. Check and verify the Deployment Status. It’s probably stuck in a running state.

    The fix is to open a ticket with Microsoft Premier Support and ask them to fix the Deploy Sequence in the database. I don’t fully understand what was completed, so I don’t want to post anything until I understand. It’s a quick fix, but let the MS professionals walk you through it. Good luck!

  4. Sam says:

    Hello,
    On Event ID 33566 you said
    “I scripted out the missing FACT using one of the other existing month FACTs.”
    How did you do that?
    I’m having that same problem with SystemCenterConfigItemCube, but in my case I can’t even see the cube under my Analysis Server database. Your query does return results, though, which is why I wanted to know how you scripted it.
    Thanks

    Regards

    • dougsigmon says:

      Hi Sam,

      No worries. Ultimately, you’re expanding each partition (these hold the facts, DIMS, etc).

      1. Using Microsoft SQL Server Management Studio (ensure you select ANALYSIS SERVICES for Server Type), connect to your Data Warehouse OLAP database.
      2. Expand the cubes section under the appropriate database (in my case, the database is DWASDataBase, which may also be the name you use).
      3. Expand the affected Cube (for example SystemCenterConfigItemCube).
      4. Expand the Measure Groups under the Cube.
      5. Now, expand each DIM and the partitions under each DIM. This is where it can get painful. Under the Computer Dim partition, you see Computer Dim. Excellent.
      Look under ComputerHostLogicalDisk. Now you see FACTs (measuring units by a parameter, which in our case is months). Here’s where mine blew up.

      One of my DIMs was missing a month FACT. I simply scripted out an older month FACT, changed the parameters to the missing month, ran the CREATE script, and problem solved. How?

      1. Right click on an existing fact (I’ll use ComputerHostsLogicalDiskFact_2012_Aug in this example).
      2. Select SCRIPT PARTITION AS > CREATE TO > NEW QUERY EDITOR WINDOW
      3. Either manually or using SEARCH AND REPLACE (I use S&R to make sure I get each one), replace Aug with the missing month. In my example below, I use April (APR).
      4. We have to now modify the definition to grab the appropriate calendar days. Change the date range in the XML line.

      The August FACT (the key fields of the scripted partition definition):

      Database: DWASDataBase
      Cube: SystemCenterConfigItemCube
      Measure group: ComputerHostsLogicalDisk
      Partition ID/Name: ComputerHostsLogicalDiskFact_2012_Aug
      Data source: DWDataMart
      Query: SELECT fact.[ComputerDimKey], fact.[ComputerHostsLogicalDevice_LogicalDiskDimKey], fact.[DateKey], fact.[CreatedDate], fact.[DeletedDate], (CASE WHEN fact.DeletedDate IS NULL THEN 0 ELSE 1 END) AS [IsDeleted] FROM dbo.ComputerHostsLogicalDiskFactvw fact WHERE DateKey >= 20120801 AND DateKey <= 20120831
      Storage mode: Molap (Regular partition)
      Proactive caching intervals: -PT1S
      Aggregation design: DefaultAggregationDesign

      Changes for the missing April FACT (all other fields stay the same):

      Partition ID/Name: ComputerHostsLogicalDiskFact_2012_Apr
      Query: SELECT fact.[ComputerDimKey], fact.[ComputerHostsLogicalDevice_LogicalDiskDimKey], fact.[DateKey], fact.[CreatedDate], fact.[DeletedDate], (CASE WHEN fact.DeletedDate IS NULL THEN 0 ELSE 1 END) AS [IsDeleted] FROM dbo.ComputerHostsLogicalDiskFactvw fact WHERE DateKey >= 20120401 AND DateKey <= 20120430

      Now execute the modified script, click refresh, and you’ll see the missing fact.

      Drop me a line at dougsigmon@gmail.com if this isn’t clear.

      DS
