Anyone using SCSM 2012 has probably suffered through some type of challenge with the ETL data warehouse and cube processing jobs. After a few days of troubleshooting, I think I’ve discovered why.
– All Data Warehouse jobs disabled.
– DWMaintenance job never finishes or runs a realllllllly long time.
– Reports don’t show updated data.
– Failed cube processing
– You tried Travis Wright’s awesome ETL running script and get an “Exiting since job is in an unexpected status.” (…if you haven’t tried this yet, you should. It’s the quickest way to tell you have a stalled process).
– Processing is just stalled
– DWMaintenance is stalled on the ManageCubePartitions/ManageCubeTranslations Step
– The data warehouse is just dead.
– The DWMaintenance job has a stalled process. It’s a statusID of 7 (waiting) instead of 3 (not started). Unless you reset this status, the Data Warehouse jobs won’t run and your report data is stale.
Solution – Review the paragraph “DWMaintenance is Hung on the ManageCubePartitions/ManageCubeTranslations Step ” on Danny Chen’s blog exhaustive article detailing cube and SCDW troubleshooting. You can stop here or read on to see how I troubleshot the process to get to his solution.
Troubleshooting Steps – How I Solved It
Don’t try to troubleshoot via the SCSM console.
Use PowerShell on your Data Warehouse server
Also, when the MPSync or DWMaintenance jobs start, they disable all of the ETL jobs. You will see the Is Enabled column set to False for each of the ETL jobs. This means that even if the ETL job status shows it is running, it actually is not running. When the MPSync or DWMaintenance job completes, the ETL jobs are automatically enabled and resume processing. Enabled and Not Running is a good thing. Means the job ran.
- Load the Datawarehouse cmdlets – Use one of these depending on your install path
import-module “C:\Program Files\Microsoft System Center 2012\Service Manager\Microsoft.EnterpriseManagement.Warehouse.Cmdlets”
Import-Module ‘%ProgramFiles%\Microsoft System Center 2012\Service Manager\Microsoft.EnterpriseManagement.Warehouse.Cmdlets.psd1’
- Run Get-SCDWJob and note the status of the jobs. Is DWMaintenance running. Stop it with Stop-SCDWJob –Jobname DWMaintenance. Do the same for MPSyncJob or any other running job.
- Check for Job Module errors. This is important to help us find stalled processes in the jobs. The easiest way to do this is to export the SCDW job modules to a text file – Get-SCDWJobModule >> c:\scripts\jobs.txt – When I exported mine I noticed ModuleID 6 was waiting with an error:
Job Name: DWMaintenance, Batch Id: 20154
ModuleID Module Name Status Error Message
——– ———– —— ————-
18 ResetFailedSystemDerivedMPWorkItems Not Started
3 ApplyUpdatesToCube Not Started
6 DisableProcesses Waiting <Errors><Error
7 EnableProcesses Not Started
…there were other modules too, but all were in a good status.
Now using Danny Chen’s blog, I ran these SQL scripts against the DWStagingAndConfig database.
select processId from infra.process where processname = ‘DWMaintenance’
Returns a value of 1. So, the ProcessID for the DWMaintenance is 1.
select batchid from infra.batch where processId = 1
Returns a batchID of 20161 and 20162. Starting with the first one, I updated the status to 3 to clear the stalled process. I also updated the second one.
select * from infra.workitem(nolock) where BatchId = 20161
select * from infra.workitem(nolock) where BatchId = 20162
Use the BatchID to determine the stalled process (moduleID 6 previously). You’ll get a long list of WorkItemID’s. Look for the one with a status of 7. This is the problem. In my case it was 7429423.
update infra.workitem set statusid = 3 where workitemId = 7429423
I then double-checked the status an the WAITING status of 7 was gone, replaced with statusid of 3.
- Once I finished this, I manually ran my MPSyncJob using powershell – start-SCDWJob -jobname MPSyncJob
- Follow the progress of each of your jobs with the Get-scdwjobmodule -jobname MPSyncJob
- Now follow this process for each of your ETL jobs after MPSyncJob finishes successfully. If MPSyncJob fails, there are a number of troubleshooting blogs to correct this.
- start-SCDWJob -jobname (Enter your name here) and start-SCDWJob -jobname (Enter your job’s name) – Wait for these to finish.
- start-SCDWJob –jobname Transform.Common – Wait for the job to finish.
- start-SCDWJob –jobname Load.Common
All of my ETL jobs finished in under 30-minutes, but yours might vary. Checking my reports showed updated data and all of my SCDW jobs were Enabled and displayed a status Not Running in the management console.
Good to go. Hope this helps and leave me a comment if you need more details or help.
My cubes, however, are still down and I’m tackling those next. More to follow…