DPM 2016 U4 - 1.3TB File Server will not complete Consistency Check.
The file server has a couple of volumes that are backing up fine, apart from the D:\ drive. Everything was working until we copied around 250GB of data and DPM ran an automatic consistency check. The file server
is up to date and now has the 2018-04 update installed, which we applied to see if it would fix the issue.
The consistency check starts and data is transferred, but after a while the items scanned count in the DPM console appears to be stuck on the same number. When you log on to the server,
there is no disk activity from the DPMRA.exe image. When you look at the CPU and the associated handles, it is always in the same user’s home folder. The files it has open have not been modified or accessed since 2017, and these are
files that have been on the system for over 6 months.
The DPM log file contains a lot of entries like the following from roughly the time the check appears to get stuck:
0498 0C40 04/16 20:47:59.080 18 fsutils.cpp(3225) A7CB7744-4C43-4826-B0FB-84FAFF40340D WARNING Failed: Hr: = [0x80070057] : GetFileHandleById failed to open file, frn:0x0001000000048244
0498 0C40 04/16 20:47:59.080 18 fsutils.cpp(3225) A7CB7744-4C43-4826-B0FB-84FAFF40340D WARNING Failed: Hr: = [0x80070057] : GetFileHandleById failed to open file, frn:0x0001000000048244
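For reference, the two hex values in these entries can be unpacked by hand. Below is a minimal Python sketch, assuming the standard HRESULT layout (severity bit, 11-bit facility, 16-bit code) and the usual NTFS file reference number layout (16-bit sequence number above a 48-bit MFT record number). The function names are illustrative only and are not part of DPM.

```python
def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its standard fields."""
    return {
        "failure": bool((hr >> 31) & 1),   # severity bit: 1 means failure
        "facility": (hr >> 16) & 0x7FF,    # 11-bit facility code
        "code": hr & 0xFFFF,               # 16-bit error code
    }


def decode_frn(frn: int) -> dict:
    """Split a 64-bit NTFS file reference number (assumed layout)."""
    return {
        "sequence": frn >> 48,               # upper 16 bits: sequence number
        "mft_record": frn & 0xFFFFFFFFFFFF,  # lower 48 bits: MFT record number
    }


# Values copied from the log entries above.
hr_fields = decode_hresult(0x80070057)
frn_fields = decode_frn(0x0001000000048244)
print(hr_fields)
print(frn_fields)
```

Facility 7 is FACILITY_WIN32 and code 87 is ERROR_INVALID_PARAMETER, so the open-by-ID call is being rejected as an invalid parameter rather than, say, access denied. It may also be possible to map the FRN back to a path on the file server with `fsutil file queryfilenamebyid D:\ 0x0001000000048244`, although I have not confirmed what it returns for this FRN.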
These stay at the bottom of the log file for a while, and then about 8 hours later the DPM console shows the following error:
Type: Consistency check
Status: Failed
Description: Task is cancelled because some other task in agent is not responding on SERVERNAME machine. (ID 32557 Details: Internal error code: 0x809909C1)
More information
End time: 17/04/2018 05:07:12
Start time: 16/04/2018 17:18:25
Time elapsed: 11:48:46
Data transferred: 3,160.76 MB
Cluster node -
Source details: D:\
Items scanned: 396593
Items fixed: 1645
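The console figures above can be turned into an average rate, which gives a rough idea of how little the job did over its whole run. This is only a sketch using the values reported above; the variable names are illustrative, and the average includes the time the job spent stalled.

```python
from datetime import datetime

FMT = "%d/%m/%Y %H:%M:%S"  # day/month/year, as shown in the console

start = datetime.strptime("16/04/2018 17:18:25", FMT)
end = datetime.strptime("17/04/2018 05:07:12", FMT)

elapsed_s = (end - start).total_seconds()
transferred_mb = 3160.76   # "Data transferred" from the console
items_scanned = 396593     # "Items scanned" from the console

mb_per_s = transferred_mb / elapsed_s
items_per_s = items_scanned / elapsed_s

print(f"elapsed: {elapsed_s:.0f} s")
print(f"average transfer rate: {mb_per_s:.3f} MB/s")
print(f"average scan rate: {items_per_s:.1f} items/s")
```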
A lot more entries appear in the log after the DPM console reports the error above, ending with the following:
1FD8 0C18 04/17 04:07:12.014 29 radefaultsubtask.cpp(196) [0000027CB613A230] BE0A6B8E-772F-4D6F-9C1A-2465538E84AF WARNING Failed: Hr: = [0x809909b0] : Encountered Failure: : lVal : (HRESULT)0x809909B0
1FD8 0C18 04/17 04:07:12.014 05 defaultsubtask.cpp(944) [0000027CB613A230] BE0A6B8E-772F-4D6F-9C1A-2465538E84AF WARNING Failed: Hr: = [0x809909b0] : Encountered Failure: : lVal : CommandReceivedSpecific(pCommand, pOvl)
1FD8 0C18 04/17 04:07:12.014 05 defaultsubtask.cpp(1149) [0000027CB613A230] BE0A6B8E-772F-4D6F-9C1A-2465538E84AF WARNING Failed: Hr: = [0x809909b0] : Encountered Failure: : lVal : CommandReceived(pAgentOvl)
1FD8 0BA4 04/17 04:12:11.999 03 runtime.cpp(1426) [0000027CB4A96690] NORMAL CDLSRuntime::ProcessIdleTimeout
1FD8 0BA4 04/17 04:12:11.999 03 runtime.cpp(602) [0000027CB4A96690] NORMAL CDLSRuntime::Uninitialize, bForce: 0
1FD8 0BA4 04/17 04:12:11.999 05 genericagent.cpp(273) [0000027CB4A4AA20] NORMAL Agent Can Shutdown if there is only default wokitem active[1]
1FD8 0BA4 04/17 04:12:11.999 29 dpmra.cpp(356) [0000027CB4A4AA20] NORMAL CDPMRA::Shutting down dpmra, force-shutdown :yes
1FD8 0BA4 04/17 04:12:11.999 03 cworkitem.cpp(328) [0000027CB4B2E6A0] NORMAL Timing out WI [0000027CB4B2E6A0], WI GUID = {B71B4544-7067-4A30-B5FB-BA320B10D82A}, ..last DM activity happened 229748828msec back, WI Idle Timeout = 390000msec
1FD8 0BA4 04/17 04:12:11.999 22 genericthreadpool.cpp(684) [0000027CB4AEBAD0] NORMAL CGenericThreadPool: Waiting for threads to exit
1FD8 0BA4 04/17 04:12:14.023 22 genericthreadpool.cpp(684) [0000027CB4A96690] NORMAL CGenericThreadPool: Waiting for threads to exit
1FD8 1018 04/17 04:12:16.047 03 timer.cpp(513) [0000027CB61170C8] ACTIVITY Shutting down timer thread.
1FD8 0BA4 04/17 04:12:16.047 03 service.cpp(81) ACTIVITY CService::StopThisService
1FD8 0BA4 04/17 04:12:16.047 03 service.cpp(281) [000000D39927FC20] ACTIVITY CService::StopService()
1FD8 16FC 04/17 04:12:16.047 03 service.cpp(298) [000000D39927FC20] ACTIVITY CService::AnnounceServiceStatus
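To pin down exactly when the agent went quiet, the gaps between consecutive log timestamps can be measured. Below is a rough Python sketch, assuming each entry starts with two 4-digit hex IDs followed by `MM/DD HH:MM:SS.mmm`, as in the excerpts above. The log has no year field, so the year is passed in as an assumption, and the helper names are illustrative only.

```python
import re
from datetime import datetime

# Two hex IDs, then date and time, as seen in the excerpts above.
LINE_RE = re.compile(
    r"^[0-9A-Fa-f]{4}\s+[0-9A-Fa-f]{4}\s+"
    r"(\d{2}/\d{2})\s+(\d{2}:\d{2}:\d{2}\.\d{3})\s"
)


def parse_timestamp(line, year=2018):
    """Return the entry's timestamp, or None for wrapped/unknown lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    text = f"{year}/{m.group(1)} {m.group(2)}"
    return datetime.strptime(text, "%Y/%m/%d %H:%M:%S.%f")


def largest_gaps(lines, top=3):
    """Return the `top` longest pauses as (seconds, line_before, line_after)."""
    stamped = []
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None:
            stamped.append((ts, line))
    gaps = [
        ((b_ts - a_ts).total_seconds(), a_line, b_line)
        for (a_ts, a_line), (b_ts, b_line) in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]


# Sample lines taken from the excerpts above; in practice, read the agent log file.
sample = [
    "0498 0C40 04/16 20:47:59.080 18 fsutils.cpp(3225) A7CB7744-4C43-4826-B0FB-84FAFF40340D WARNING Failed:",
    "Hr: = [0x80070057] : GetFileHandleById failed to open file,",
    "1FD8 0C18 04/17 04:07:12.014 29 radefaultsubtask.cpp(196) [0000027CB613A230] BE0A6B8E-772F-4D6F-9C1A-2465538E84AF WARNING Failed: Hr: = [0x809909b0] : Encountered Failure: : lVal :",
    "1FD8 0BA4 04/17 04:12:11.999 03 runtime.cpp(1426) [0000027CB4A96690] NORMAL CDLSRuntime::ProcessIdleTimeout",
]

gaps = largest_gaps(sample)
for seconds, before, after in gaps:
    print(f"{seconds / 3600:.2f} h between '{before[:28]}' and '{after[:28]}'")
```

For scale, the idle timeout of 390000 msec in the work item entry is 6.5 minutes, while 229748828 msec is roughly 63.8 hours.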
Daniel Wingfield