Is it possible to restrict the Agent queue size?

Hi everyone,

we're using OBM 24.2 with its corresponding Agent version.

Last Friday we had a major issue where the Agent couldn't reach OBM. We found out that the Agent had written a file called msgagtq, most likely because it was buffering messages.
This file grew so fast that it eventually reached 98 GB! The server then stopped working properly because it ran out of disk space, and since OBM was installed on the same server, OBM went down as well.
We still don't know what flooded the Agent, but we're now looking for a way to stop the Agent from queueing once this file reaches a certain size. Is there any way to tell the Agent to ignore incoming traffic when certain criteria are met?

Thanks in advance.

  • Verified Answer

    This is the doc that explains how to limit the agent's queue size. It shows different ways to set the agent message buffer size. The default is 2 GB.

    https://docs.microfocus.com/doc/Operations_Agent/12.27/AvoidAgentMessageCorrupt

    Setting OPC_BUFLIMIT_SIZE to a maximum size of 1.9 GB will stop the buffer from reaching the 2 GB limit.

    Set on single agent

    You can set the variable on a single agent by executing the following command:

    ovconfchg -ns eaagt -set OPC_BUFLIMIT_SIZE 1900000

    You can execute the command remotely via ovconfpar if required as shown below:

    ovconfpar -change -host <hostname> -ns eaagt -set OPC_BUFLIMIT_SIZE 1900000

    Set on multiple agents

    You can use the opr-agt command to set the variable on multiple nodes as shown below:


    opr-agt -username <user> -password <passwd> -set_config_var eaagt:OPC_BUFLIMIT_SIZE=1900000 -node_list <myNodeList> | -view_name <myView>
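
    As an extra safety net next to OPC_BUFLIMIT_SIZE, a small watchdog could warn when the queue file itself gets too large. The following is only a minimal sketch, not something from the product docs: the path in QUEUE_FILE is an assumed default location that you should verify on your system, the function names are made up for illustration, and the limit is compared in kilobytes on the assumption that 1900000 corresponds to the 1.9 GB mentioned above.

```shell
#!/usr/bin/env bash
# Sketch: warn when the agent queue file grows past a threshold.
# QUEUE_FILE is an ASSUMED default location; verify it on your system.
QUEUE_FILE="${QUEUE_FILE:-/var/opt/OV/tmp/OpC/msgagtq}"
# Assumed to be in KB, mirroring the OPC_BUFLIMIT_SIZE value used above.
LIMIT_KB="${LIMIT_KB:-1900000}"

# Print the size of a file in kilobytes (rounded up), or 0 if it does not exist.
file_size_kb() {
    local file="$1" bytes
    [ -f "$file" ] || { echo 0; return; }
    bytes=$(wc -c < "$file")
    echo $(( (bytes + 1023) / 1024 ))
}

# Succeed (exit status 0) when the file is larger than the given limit in KB.
queue_over_limit() {
    local file="$1" limit_kb="$2"
    [ "$(file_size_kb "$file")" -gt "$limit_kb" ]
}

if queue_over_limit "$QUEUE_FILE" "$LIMIT_KB"; then
    echo "WARNING: $QUEUE_FILE is larger than ${LIMIT_KB} KB" >&2
fi
```

    Run from cron or a scheduled policy, this only reports the condition. It does not stop the agent from buffering, so it complements the buffer limit rather than replacing it.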

    Hope this helps,

    Chris

  • 0   in reply to   

    Hello,

    Sorry to hear you encountered an issue. Building on Chris's response:

    Event Buffering: If the agent (OA12) cannot connect to the server, events are buffered. Heartbeat polling (HBP) plays a critical role here, as it verifies that the agent is running. However, HBP failure messages generated on the OBM server itself are internal events that do not use OA12, which might explain why no event was created. Running out of disk space on the OBM server, on the other hand, is monitored by the OA12 agent. This means the event might be sitting inside the 98 GB queue file, which is far from ideal.

    Queue File Growth: The rapid growth of the queue file is another concern. One possible explanation could be large annotations generated by automatic actions.

    Event Failure on the OBM Server: The reason for the event failure on the OBM server itself is a separate issue that requires further investigation.

    Stale Event Filtering: To handle outdated events buffered when the server is unavailable, you can configure the OPC_MSGA_STALE_FILTER_INTERVAL variable. This ensures agents only send relevant events to the server within the specified interval, while outdated or irrelevant events are discarded at the agent level.

    Setting the Interval: You can configure a minimum time interval of 1 minute. For example, if the server is unavailable for 5 hours and the stale filter interval is set to 60 minutes, the agent will only send events buffered in the last 60 minutes. Events older than 60 minutes are discarded.

    Run the following command to set a time interval for 60 minutes:
    ovconfchg -ns eaagt -set OPC_MSGA_STALE_FILTER_INTERVAL 60
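
    To make the 5-hour example concrete, here is a small illustration of the rule as described above. It is not the agent's actual implementation, and stale_filter is a hypothetical helper: it takes the interval in minutes followed by the ages of buffered events in minutes, and prints the ages of the events that would still be sent.

```shell
#!/usr/bin/env bash
# Illustration only: models the stale-filter rule as described in this thread,
# not the agent's actual implementation.

# stale_filter INTERVAL_MIN AGE_MIN...
# Prints the ages (in minutes) of the events that would still be sent.
stale_filter() {
    local interval="$1" age kept=()
    shift
    for age in "$@"; do
        # Events older than the interval are discarded.
        if [ "$age" -le "$interval" ]; then
            kept+=("$age")
        fi
    done
    echo "${kept[*]}"
}

# Server down for 5 hours (300 minutes), interval set to 60 minutes.
stale_filter 60 300 180 90 60 45 5   # prints: 60 45 5
```

    Whether an event that is exactly 60 minutes old is kept or dropped is an assumption in this sketch. The description above only says that events older than the interval are discarded.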

    I hope this also helps.

  • 0 in reply to   

    Thank you very much. I hadn't seen this in the docs, as it is listed under the "OBM downtime" section. Such a useful general setting would be better placed somewhere else, I'd say.