Showing posts with label sp1. Show all posts

Wednesday, March 12, 2014

DPM 2012 and Beyond Frustration

All of our Hyper-V clusters (Server 2008 R2 hosts) started having failed backups in our two independent Data Protection Managers. The problem started with one node consistently failing backups of virtual machines while the other hosts kept backing up normally, and it progressed until none of our nodes could successfully back up any virtual machine. Our standalone backups via DPM had no issues. These hosts had been configured and left unchanged for well over a year; the only changes were Windows patches applied months earlier and the anti-virus updates that load continuously.

For the failed backups, DPM kept stating "The VSS application writer or the VSS provider is in a bad state ... (ID 30111: VssError: A function call was made when the object was in an incorrect state for that function (0x80042301))". The local nodes logged VSS 12362 Application Log event errors ("A Shadow Copy LUN was not detected in the system and did not arrive") and VSS 12363 Application Log event errors ("An expected hidden volume arrival did not complete because this LUN was not detected") whenever we attempted to run a full virtual machine backup via a consistency check.

What we tried that didn't work...
  • Power cycling all of the equipment involved: Hyper-V Servers (PowerEdge R710's), the iSCSI SAN (EqualLogic PS4000vx's), the switches connecting them (Catalyst 3750X's), and our DPM server
  • Unregistering and Registering the EqualLogic VSS provider (eqlvss /unregserver and eqlvss /regserver)
  • Removing virtual machines from a protection group (deleting disk data) and adding them back
  • Moving virtual machines to a new protection group
  • Upgrading the EqualLogic Windows Host Integration Toolkits (HIT kits) on the Hyper-V nodes - upgraded from 4.0 to 4.6
  • Installing the EqualLogic HIT kit on one of the virtual machines
  • Patching the Hyper-V nodes with all of the latest Windows Updates, even KB 2908783, released yesterday, which resolves issues with corruption of iSCSI LUNs in Windows Server 2008 R2 and 2012
... and still no success.
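
For anyone chasing the same errors, one quick first check between attempts is whether any VSS writer on a node reports a bad state. Below is a minimal Python sketch that parses the text printed by `vssadmin list writers`. The function names are hypothetical, and the parsing assumes the usual English-language output layout ("Writer name:", "State:", "Last error:" lines). It only looks at writers, not at hardware providers such as the EqualLogic one, so a clean result does not rule out the kind of problem described here.

```python
import re
import subprocess

# Assumed layout of English `vssadmin list writers` output:
#   Writer name: 'Some Writer'
#      State: [1] Stable
#      Last error: No error
WRITER_RE = re.compile(r"^\s*Writer name:\s*'(?P<name>.+)'\s*$")
FIELD_RE = re.compile(r"^\s*(?P<key>State|Last error):\s*(?P<value>.*?)\s*$")


def parse_vss_writers(text):
    """Turn vssadmin output into a list of dicts, one per writer."""
    writers = []
    current = None
    for line in text.splitlines():
        match = WRITER_RE.match(line)
        if match:
            current = {"name": match.group("name")}
            writers.append(current)
            continue
        match = FIELD_RE.match(line)
        if match and current is not None:
            current[match.group("key")] = match.group("value")
    return writers


def unhealthy_writers(writers):
    """Return writers that are not Stable or that report a last error."""
    return [
        w for w in writers
        if "Stable" not in w.get("State", "")
        or w.get("Last error", "") != "No error"
    ]


def list_writers_on_this_host():
    """Run vssadmin locally (needs an elevated prompt on Windows)."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return parse_vss_writers(result.stdout)
```

On a node, `unhealthy_writers(list_writers_on_this_host())` gives the writers worth looking at before and after each change, which makes it easier to tell whether a step actually altered anything.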

After much time wasted on what seemed to be magic potions and DPM's hatred of backing up critical data, a random thought of trying to disable our anti-virus on the cluster nodes resolved the issue! Yeah, I know everything you read everywhere says to disable anti-virus, but we have had Microsoft Forefront Client Security configured and running on these systems since we set up these servers 2+ years ago. Apparently, some change in the definitions, or just its mood, made it start messing with the iSCSI VSS hardware process... and with my sleep over the last two days.

Good luck!


 

 

Wednesday, May 15, 2013

DPM 2012 Not Generating E-mail Reports after Upgrading to SP1

We have been using DPM 2012 for quite a while now, with reports set to be delivered daily and weekly. After we upgraded to SP1, we noticed it was no longer e-mailing us the reports, even though the alerts for errors continued to come. We could still run reports manually, but we got no automatic e-mails.

We went to clear and recreate the report schedule and set it to e-mail us, and we got this awesome nondescript error: "An error occurred causing the reporting job on to fail. The system files may be corrupt. Retry the reporting task. If the problem persists, repair your DPM installation using the steps described in the System Center 2012 Service Pack 1 DPM Deployment Guide. ID: 3014"

 
I checked out the guide, and the basic idea of the "repair" is to uninstall and reinstall. I don't know about you all, but risking losing backup data just to fix reporting didn't sit well with me. So I used SQL Server Profiler to watch what was happening on our system and compared it to our secondary server.
 
After playing around with it for hours, I narrowed it down to a permissions issue with the Reporting Services predefined database role called RSExecRole. I went through the guide Create the RSExecRole (http://technet.microsoft.com/en-us/library/cc281308.aspx), which is meant for recreating permissions during a report database move, and afterwards we were able to recreate the e-mail subscriptions. It looks like there must have been some undetected failure during the SP1 upgrade.
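
To give an idea of what recreating those permissions involves, here is a rough Python sketch that only builds the T-SQL text for the role in master and msdb; it does not connect to SQL Server. The helper name and the dictionary are hypothetical. The object lists are my reading of the linked guide and should be checked against it, and against your own SQL Server version, before anything is run. The guide itself works through Management Studio dialogs, so this script form is an assumption about an equivalent.

```python
# Hypothetical summary of the permissions the "Create the RSExecRole" guide
# describes. Verify every object name against the guide before use.
RSEXECROLE_GRANTS = {
    "master": {
        "EXECUTE": [
            "xp_sqlagent_enum_jobs",
            "xp_sqlagent_is_starting",
            "xp_sqlagent_notify",
        ],
    },
    "msdb": {
        "EXECUTE": [
            "sp_add_category",
            "sp_add_job",
            "sp_add_jobschedule",
            "sp_add_jobserver",
            "sp_add_jobstep",
            "sp_delete_job",
            "sp_help_category",
            "sp_help_job",
            "sp_help_jobschedule",
            "sp_verify_job_identifiers",
        ],
        "SELECT": ["syscategories", "sysjobs"],
    },
}


def build_rsexecrole_script(grants=RSEXECROLE_GRANTS, role="RSExecRole"):
    """Return T-SQL text that creates the role if missing and adds grants."""
    lines = []
    for database, by_permission in grants.items():
        lines.append(f"USE [{database}];")
        # Only create the role when it does not already exist.
        lines.append(
            f"IF DATABASE_PRINCIPAL_ID(N'{role}') IS NULL CREATE ROLE [{role}];"
        )
        for permission, objects in by_permission.items():
            for obj in objects:
                lines.append(f"GRANT {permission} ON dbo.[{obj}] TO [{role}];")
        lines.append("GO")  # batch separator understood by SSMS and sqlcmd
    return "\n".join(lines)
```

Printing `build_rsexecrole_script()` gives text to review and paste into a query window on a test instance first. Keeping the grants in one dictionary makes it easy to diff against what SQL Server Profiler shows failing.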