Fine-tuning Network Automation

NA 2022.11 on Linux server

The Linux server has around 24 GB of RAM. We have onboarded around 600 devices so far, and our target is 1500 devices.

We restarted the NA server last Friday, and now we see only 5 GB left when we run free -g.
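On the free -g figure: the "free" column counts only memory that is completely unused, while Linux also holds reclaimable page cache, so the "available" value is usually the better indicator of real headroom. A minimal Python sketch, assuming a Linux host that exposes /proc/meminfo (the helper names are made up for illustration), puts both values side by side:

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: kibibytes}."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_report(fields):
    """Convert the fields of interest from KiB to GiB, skipping any that are missing."""
    report = {}
    for key in ("MemTotal", "MemFree", "MemAvailable"):
        if key in fields:
            report[key] = round(fields[key] / (1024 * 1024), 1)
    return report


# Only attempt the read where the file exists (i.e. on Linux).
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        print(memory_report(parse_meminfo(handle.read())))
```

If MemAvailable is much larger than MemFree, most of the "missing" memory is cache rather than something the NA process is holding on to.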

Moreover, the NA application itself also shows low memory, as seen in the output below:

Used Memory (Total-Free): 11640 MB
Free Memory: 327 MB
Total Memory: 11968 MB
Maximum Memory: 11968 MB
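To make the relationship between these four figures explicit, here is a rough Python sketch. It assumes the labels map onto what a JVM normally reports (current heap size, free space inside it, and the configured ceiling); that mapping is an assumption, not something confirmed by the NA documentation. The function names are illustrative only.

```python
import re

NA_OUTPUT = """\
Used Memory (Total-Free): 11640 MB
Free Memory: 327 MB
Total Memory: 11968 MB
Maximum Memory: 11968 MB
"""


def parse_memory(text):
    """Return {label: megabytes} for lines shaped like 'Label: 123 MB'."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?):\s*(\d+)\s*MB\s*$", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def summarize(values):
    """Derive used space, free percentage, and remaining room to grow."""
    total = values["Total Memory"]
    free = values["Free Memory"]
    maximum = values["Maximum Memory"]
    return {
        "used_mb": total - free,
        "free_pct_of_total": round(100.0 * free / total, 1),
        "headroom_mb": maximum - total,  # how far the heap can still grow
    }


summary = summarize(parse_memory(NA_OUTPUT))
print(summary)
```

For the numbers above, Total minus Free is 11641 MB (the reported 11640 MB is presumably rounded), free space is about 2.7% of the heap, and Total equals Maximum, so the heap has already grown to its configured limit.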

Could you please advise whether there are any fine-tuning parameters we can apply, similar to NNM, such as JVM heap memory allocation and garbage collection settings?

    Hi Ramesh,

    So, you mention your NA Core has 24 GB RAM. Did you or someone else do the install? I ask because there are some performance-tuning steps available and I'm curious what might have been done already...

    Performance tuning - Network Automation (microfocus.com)

    Is your NA instance single core or multiple cores?  

    When this happens / happened, what tasks were running?  Anything stuck (running long)?  Do you have any custom tasks (change plans or diagnostics)?  Has this happened more than just this one time?  Like every Monday night, you see this happen on Core 3?  

    Has anyone changed the default task values?  Max Tasks / Max Concurrent Tasks?  Do you have an external DB or is it the embedded one?  

    Also, are the ~600 devices "typical" devices (switches, routers, load balancers, firewalls) or do you have anything that might be more complex (ACI / APIC devices)?  

    Have you looked at the appserver_wrapper.log file? There may be some useful information in it that'll point you to the problem. Perhaps it's old driver(s) or something else, but it's quite possible you can find where this bad behavior begins.

    Lastly, and this is just from my history:

    1) It's tempting to think that if some memory is good, tons of memory is better, and to throw almost all your memory at the JVM. That's not really a good idea. Same with increasing tasks.

    2) Like life, there is a balance here. You can increase your task numbers, but if you do, you need to make sure the JVM is set to handle it and that you have enough DB connections too.

    3) Small steps and use caution. Make changes slowly, document what you had and what you are changing, and then test carefully. You always want to be able to get back to a prior state.
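One way to follow point 3 in practice is to take a timestamped copy of a config file before touching it and to record a diff afterwards. The following is only a sketch: the function names are made up for illustration, and it is not an NA tool.

```python
import difflib
import shutil
import time
from pathlib import Path


def backup_config(path):
    """Copy the file next to itself with a timestamp suffix; return the copy's path."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, target)  # copy2 also preserves modification times
    return target


def describe_change(before_path, after_path):
    """Return a unified diff (one string) between two versions of a config file."""
    before = Path(before_path).read_text().splitlines(keepends=True)
    after = Path(after_path).read_text().splitlines(keepends=True)
    diff = difflib.unified_diff(
        before, after, fromfile=str(before_path), tofile=str(after_path)
    )
    return "".join(diff)
```

Call backup_config() on the file before editing, make one change, then keep the output of describe_change(backup, original) with your change notes. Rolling back is then just copying the backup over the edited file.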

    Good luck!

    -Chris

    Chris

    Yes, the NA server has 24 GB, and only NA and the Operations Agent are running on it.

    The NA instance is a single core.

    No tasks were running when the issue occurred, and there are no stuck tasks. We do not have any custom tasks. This has happened twice.

    Max Tasks has been changed from 20 to 30. Other than that, no changes have been made. We have an external DB.

    We have onboarded only Cisco switches so far.

    I also checked the file /opt/NA/server/ext/wrapper/conf/appserver_wrapper.conf, where we can configure the initial and maximum JVM memory, but I do not see any option for garbage collection like there is in NNM.
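For reviewing what that file currently sets, a small Python sketch can list the JVM-related entries. It assumes the file follows Java Service Wrapper conventions, where heap sizes are typically set with wrapper.java.initmemory and wrapper.java.maxmemory and extra JVM options (which is where a garbage collector flag would normally go) are passed through numbered wrapper.java.additional.<n> lines. Whether NA's file uses exactly these names, and whether changing the collector is supported, is an assumption to confirm against the performance tuning guide or with support. The sample text is hypothetical, not copied from a real install.

```python
# Hypothetical excerpt for illustration only; values are not from a real NA server.
SAMPLE_CONF = """\
# JVM settings
wrapper.java.initmemory=4096
wrapper.java.maxmemory=12288
wrapper.java.additional.1=-XX:+UseG1GC
wrapper.logfile=../logs/appserver_wrapper.log
"""


def jvm_settings(conf_text):
    """Collect wrapper.java.* properties, skipping blanks and '#' comments."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")  # split on the first '=' only
        if sep and key.strip().startswith("wrapper.java."):
            settings[key.strip()] = value.strip()
    return settings


for name, value in jvm_settings(SAMPLE_CONF).items():
    print(f"{name} = {value}")
```

Running the same function over the real file's contents shows at a glance whether any wrapper.java.additional entries already exist and which index numbers are taken.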