Dedupe Store Stop - Awaiting meta cleanup completion

Hi,

We have a Dedupe Store on a Windows Server 2016 VM in use by an RHEL DP 23.4 cell. It's been running for about 9 months. On a monthly basis the Windows VM needs to be patched, so we issue a DPDUtils -shutdown_all command and wait, and then patch, and restart. This month though we've hit an anomaly which I can't find a resolution for online. 

Sequence of events and results:

  • Shut down DP to ensure the Dedupe Store was not in use; it hadn't been written to for over an hour at that point
  • Ran DPDUtils -shutdown_all
    • this came back with an error on the Command Prompt CLI as
      • ERROR: Couldn't shutdown one or more deduplication stores process
  • Hmmm, so went and had a look at the C:\Program Files\OmniBack\sdfs\logs folder for the cfg log file
    • messages from this cfg file:
      • shutting down volume
      • Shutting Down SDFS
      • Stopping FDISK scheduler
      • Flushing and Closing Write Caches
      • Write Caches Flushed and Closed
      • Committing open Files
      • Closing metafilestore
      • Awaiting meta cleanup completion
    • It's the last one, the "Awaiting" message, which is now repeating every 10 seconds and has been doing so for almost 3 hours
  • Have tried re-running the shutdown_all command again, but now it says:
    • ERROR: Couldn't fetch Deduplication store list
    • ERROR: Couldn't shutdown one or more deduplication stores process
  • And yet the DPDUtils -stat store_name comes back just fine saying the store is still running

So now we have a Dedupe Store which won't stop, DP is down, and the log keeps printing messages waiting for the meta cleanup to complete, so we are at an impasse.

Anyone know of a solution to get us out of the loop? Do we just wait, do we reboot the Windows server and hope the Dedupe Store comes back, or something else? We can leave DP down for a day or two, as it's a tiny standalone environment being backed up, but we'd like to know what's going on. For the record, the full meta cleanup message is:

date/timestamp [INFO] [sdfs] [org.opendedup.sdfs.filestore.MetaFileStore] [483] [Thread-54] [] - Awaiting meta cleanup completion
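
For anyone wanting to gauge how long a store has been stuck at this stage, here is a minimal Python sketch. It counts the run of "Awaiting meta cleanup completion" lines at the end of the log and multiplies by the 10-second repeat interval observed above. That interval is only what was seen in this thread, not a documented value, and the function name and sample lines are hypothetical.

```python
AWAITING = "Awaiting meta cleanup completion"
REPEAT_SECONDS = 10  # assumption: interval observed in this thread, not documented


def estimated_wait_seconds(log_lines, interval=REPEAT_SECONDS):
    """Estimate the wait so far from the trailing run of 'Awaiting' lines."""
    count = 0
    for line in reversed(log_lines):
        if not line.strip():
            continue  # ignore blank lines at the end of the log
        if AWAITING in line:
            count += 1
        else:
            break  # the run of 'Awaiting' messages starts after this line
    return count * interval


# Placeholder lines modelled on the message quoted above (timestamps elided).
sample = [
    "date/timestamp [INFO] [sdfs] [...] - Closing metafilestore",
    "date/timestamp [INFO] [sdfs] [...] - Awaiting meta cleanup completion",
    "date/timestamp [INFO] [sdfs] [...] - Awaiting meta cleanup completion",
    "date/timestamp [INFO] [sdfs] [...] - Awaiting meta cleanup completion",
]
print(estimated_wait_seconds(sample))
```

The result is only an estimate: it assumes no messages were dropped and that the interval stays constant.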

Any help or advice gratefully received.

Cheers,

Andy

  • 0  

    Hello Andy

    How big is the store? The cleanup process can take quite some time depending on its size.

    Best Regards

    Jose Maria Basilio

    Although I am an OpenText employee, I am speaking for myself and not for OpenText.
    If you found this post useful, give it a “Like” or click on "Verify Answer" under the "More" button.

  • 0 in reply to   

    Hi Jose,

    Thanks for your reply. To confirm first off, the Dedupe Store did finish and stop itself successfully overnight :-)

    The Dedupe Store is 8.99TB total size, but only has 1.15TB used, according to Windows File Explorer. This is the only environment we have DP Dedupe Stores in, and as far as we can find in the cfg log file, we have not seen these "Awaiting meta cleanup" messages before.

    In the end the meta cleanup ran from approx 10:00 Mon - 01:30 Tue, so approx 15.5 hours for the 1.15TB of used space. That does seem quite excessive. Are we missing anything in the configuration of the Dedupe Store which might reduce the meta cleanup time in future, to speed up shutdown of the store and reboot of the server? Or does this long-running cleanup mean future cleanups should be shorter?

    Just trying to understand how and why the meta cleanup runs. If you've got any documentation online which we can read up on that would be great also.

    Thanks,

    Andy

  • 0 in reply to 

    Well we've actually hit another problem now. We have patched the Windows VM server, and rebooted it, but the Dedupe store won't restart!

    We run:

    dpdutils -restart_store VOL_A

    And it responds:

    ERROR: Couldn't fetch Deduplication store list

    ERROR: Couldn't restart deduplication store process

    So now we can't restart the Dedupe Store. We run the list_stores command and it shows the dedupe store, we run the stat command and it lists the dedupe store as stopped, so to us the dedupe store is there, but for some reason the dpdutils can't find it in some "store list".

    So in this environment we have two Dedupe Stores on two separate VMs, Windows Server 2016 & RHEL 8.8. We are going to point all the backups to the RHEL 8.8 Dedupe Store for now whilst we investigate further, hopefully with help via this forum, to get the Windows Dedupe Store up and running.

    I have a feeling we won't get it started and we'll have to cut our losses and delete it and create a new Windows Dedupe Store, unless someone can help :-)

  • 0 in reply to 

    Well, before I went to move our backups, the store started to start up! The cfg log file now shows it starting the Dedupe Store. It is showing cmap errors, as we've had previously, so it is likely to not be up again for a couple of hours, but it's on its way back.

    My colleague got it going by running the DPDUtils restart -s command, rather than restart_store. I had tried both earlier, though, so I don't know why the restart worked this time, after the 2nd reboot and waiting a while.

    Will hopefully update when it's back up and running.

  • 0 in reply to 

    And after 1 hr 50 mins, the cmap errors finished, and the Dedupe Store is back online and tested ok writing a backup.

    Hmmm, would still be good to confirm what steps we should run when stopping a Dedupe Store prior to planned server reboot. We have our media server and deduplication store server as one and the same entity. So we assume just running DPDUtils -shutdown_all is sufficient prior to any planned reboot activities. And then also checking the output C:\Program Files\OmniBack\sdfs\logs\*volume-cfg.log file to make sure the Dedupe Store is down before we reboot the server. As per:

    https://docs.microfocus.com/doc/Data_Protector/23.4/DeduplicationStore#Shutdown_of_store
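
The log check described above could be scripted along these lines. This is a rough sketch, not a supported procedure: it finds the newest file matching the *volume-cfg.log pattern and reports whether its last non-blank line is still the "Awaiting meta cleanup completion" message. The function names are hypothetical. A False result only means the cleanup message is no longer the latest entry; it does not by itself prove the store is down, so DPDUtils -stat is still worth checking.

```python
import glob
import os

# Path pattern taken from this thread; adjust for your installation.
LOG_GLOB = r"C:\Program Files\OmniBack\sdfs\logs\*volume-cfg.log"
AWAITING = "Awaiting meta cleanup completion"


def newest_log(pattern=LOG_GLOB):
    """Return the most recently modified log matching the pattern, or None."""
    paths = glob.glob(pattern)
    return max(paths, key=os.path.getmtime) if paths else None


def last_nonblank_line(path):
    """Return the last non-blank line of a text file ('' if there is none)."""
    last = ""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip():
                last = line.rstrip("\n")
    return last


def cleanup_still_running(pattern=LOG_GLOB):
    """True if the newest log currently ends with the 'Awaiting' message."""
    path = newest_log(pattern)
    if path is None:
        raise FileNotFoundError(f"no log file matches {pattern!r}")
    return AWAITING in last_nonblank_line(path)
```

Calling cleanup_still_running() before the reboot would then flag the situation from the original post, where the message was still repeating hours after the shutdown command.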

    This is a standalone environment, small, we only keep backups for a couple of weeks, and we can't back up elsewhere, so using a DP Dedupe Store did seem the logical choice to get good dedupe on the limited backup space we have available. But so far, the stop/restart capability of the Dedupe Stores is quite flaky, with the store wanting to do its own thing before and after to keep its integrity.

    But we will persevere, and see how we get on. The real test being when someone wants a recovery of course ;-)

  • Suggested Answer

    0   in reply to 

    Hello Andy

    OK, several topics at once :) Let me try to answer them.

    1º Information or documentation about "Awaiting meta cleanup"

    I don't have, and couldn't find, any more information about it. You can try opening a support case, though to be honest I wouldn't be very optimistic about what information they can provide. In any case, given the time the cleanup took and the other issues you experienced trying to start the Store again, my recommendation is to keep this Store in "read mode" only and build a new one on the same deduplication server.

    2º Could you please share the hardware resources dedicated to the DPD virtual machine? RAM, CPU, disk technology.

    3º This is the right procedure to stop the stores before rebooting the DPD server.

    Before rebooting or stopping the Linux or Windows server, shut down all the stores using the DPDUtils command DPDUtils -shutdown_all on the Deduplication server.

    If, even after rebooting the server, you see any store services still running, perform the following steps to manually shut down the store:

    1. On the Media Agent system, to shut down the deduplication services, run the following commands:
      • For the Windows Media agent (gateway) client: DPDClientUtils -shutdown <DP_Datadir>/config/client/SdfsProxyPort.conf
      • For the Linux Media agent (gateway) client: DPDClientUtils -shutdown /etc/opt/omni/client/SdfsProxyPort.conf
      DPDClientUtils is an internal Data Protector utility and the output isn't displayed.
    2. Delete the SdfsProxyPort.conf and SdfsProxyPort.conf.lck files from the following directory: <DP_Datadir>/config/client/ and /etc/opt/omni/client/ in Windows and Linux respectively.
    3. On the Deduplication server, to shut down the Deduplication store processes, run the following command: DPDUtils.exe -shutdown_all
    4. Delete the pfconfig.json file from the following directory: <DP_Datadir>/config/client/ and /etc/opt/omni/client/ in Windows and Linux respectively
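
The four manual steps above can be written out as an ordered plan. The Python sketch below only builds and prints that plan; it does not run or delete anything. The function name is hypothetical, and the Windows data directory has to be supplied by the caller, since <DP_Datadir> depends on the installation. Per the steps above, step 1 runs on the Media Agent system and step 3 on the Deduplication server.

```python
from pathlib import PurePosixPath, PureWindowsPath


def manual_shutdown_plan(platform, dp_datadir=None):
    """Return the manual shutdown steps as (step, action, target) tuples."""
    if platform == "windows":
        if dp_datadir is None:
            raise ValueError("dp_datadir is required on Windows")
        conf_dir = PureWindowsPath(dp_datadir) / "config" / "client"
    elif platform == "linux":
        conf_dir = PurePosixPath("/etc/opt/omni/client")
    else:
        raise ValueError(f"unsupported platform: {platform!r}")

    proxy_conf = conf_dir / "SdfsProxyPort.conf"
    return [
        # Step 1: shut down the deduplication services on the Media Agent.
        (1, "run", f"DPDClientUtils -shutdown {proxy_conf}"),
        # Step 2: remove the proxy port file and its lock file.
        (2, "delete", str(proxy_conf)),
        (2, "delete", str(conf_dir / "SdfsProxyPort.conf.lck")),
        # Step 3: shut down the store processes on the Deduplication server.
        (3, "run", "DPDUtils -shutdown_all"),
        # Step 4: remove pfconfig.json.
        (4, "delete", str(conf_dir / "pfconfig.json")),
    ]


for step, action, target in manual_shutdown_plan("linux"):
    print(step, action, target)
```

Keeping it as a plan rather than executing it lets the steps be reviewed first, which seems prudent given that two of them delete files.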

    4º If possible, please consider upgrading to the latest version, 24.4. Apart from some important security features, you will be able to use the DPD replication capabilities, which will help you if the Store abruptly goes into a corrupted state. Of course, I'm aware that this means using another VM and double the disk space, so it depends on whether that is possible for you.

    https://docs.microfocus.com/doc/Data_Protector/24.4/ReplicateObjects

    Hope this helps.

    Best Regards

    Jose Maria Basilio
