iManager reports the NCP server object for a cluster resource as Status = down even though the resource is up, on-line and fully accessible.

This is an OES 2018 SP3 two-node cluster with three cluster resources. The cluster resource in question is called ADMDATA. When you go into iManager, browse for objects, and modify the NCP server object for the cluster resource, it reports Status = down. All other resources show Status = up. The change from "up" to "down" seems to have happened on 12/18/2024.

The cluster resource is on-line and in all other respects functioning normally. I have taken it off-line and on-line again and it still reports as down. I migrated it to the second node and it reported a status of "up" for a minute or so, then switched to "down". I migrated it back to the initial node and iManager still reports a status of down even though the resource and volume are up and accessible.

The impact is on Storage Manager, which we use for home directory management. Processes for accounts with home directories on the impacted resource are stuck in a pending state, while those for other resources are fine. Storage Manager uses that up/down status attribute, so even though the resource is on-line and accessible, the "down" status makes Storage Manager think the resource is off-line. The impacted resource hosts the vast majority of our home directories, so this is affecting the creation of new accounts and the disposition of home directories when accounts are inactivated.

    Quick update: the server I was running iManager on is also the server that was reporting the NCP server object for the cluster resource as down. This server has a replica of all our partitions. It is also the server that acts as the engine for Storage Manager, and Storage Manager is configured to point to itself for eDir. I have a second server with replicas of everything, so I decided to check iManager on that server, and it reported the NCP server object as being up. I removed the replica of the servers partition from the initial server (the Storage Manager engine server and core eDir server), checked the NCP server object for the cluster resource again, and it now reports as up. I checked Storage Manager and all the pending events cleared. So the issue was that the two servers holding replicas of the OU containing the object reported the status of that NCP server object differently. I had rebooted the server in question and it had still reported that NCP server object as down. Removing the replica of the servers OU seems to have done the trick. I suspect adding the replica back would be OK, but I am leaving it off the server for now.

  • 0   in reply to 

    Sounds like something is off on your eDirectory. Have you run through the basic checks to make sure there are no errors or problems showing?

    The three easy checks to run in a terminal session on each server are:
       ndsrepair -T
       ndsrepair -E
       ndsrepair -C -Ad -A
    Check that each one ends with  Total errors: 0  before continuing to the next one. If it does not, run that particular check again to see whether the first run cleared the problem; if the errors persist, troubleshoot the error.
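
If you want to script that sequence, something like the following might work as a rough sketch. It assumes each ndsrepair run prints a summary line of the form "Total errors: N" (as described above) and that it is run as root on each server. The function names total_errors, run_check and run_all_checks are made up for this example and are not part of ndsrepair.

```shell
#!/bin/bash
# Sketch only: run the three ndsrepair checks in order and stop at the
# first one that does not end with "Total errors: 0".
# Assumption: the command output contains a "Total errors: N" summary line.

# Print the number from the last "Total errors: N" line in the given text.
total_errors() {
    printf '%s\n' "$1" \
        | sed -n 's/.*Total errors:[[:space:]]*\([0-9][0-9]*\).*/\1/p' \
        | tail -n 1
}

# Run one check (command plus arguments); succeed only on zero errors.
run_check() {
    local output count
    output="$("$@" 2>&1)"
    count="$(total_errors "$output")"
    if [ "$count" = "0" ]; then
        echo "OK: $*"
        return 0
    fi
    # An empty count means no summary line was found in the output.
    echo "PROBLEM: $* reported Total errors: ${count:-unknown}"
    return 1
}

# Run the three checks in the order given above, stopping at the first failure.
run_all_checks() {
    run_check ndsrepair -T &&
    run_check ndsrepair -E &&
    run_check ndsrepair -C -Ad -A
}
```

Source the file and call run_all_checks on each server that holds replicas. It only reports where the sequence stopped; re-running the failed check and troubleshooting the error is still a manual step.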

    ________________________

    Andy of KonecnyConsulting.ca in Toronto
    Please use the "Like" and/or "Verified Answers" as appropriate as that helps us all.