
Archiving speed

Hi,

I'm curious how many messages per second is normal on VMware (average hardware: plenty of RAM, CPU, SAN, etc.). I have never seen more than 1.2. Latest Retain / GroupWise; the databases are PostgreSQL / MariaDB.

David

  • 0

    Good morning,

    That is the classic answer: well, it depends. You say average hardware, but there are a ton of variables that come into play.

    CPU, number of cores, disk type, disk response time, etc.

    Also, independent of the underlying infrastructure supporting the hypervisor, what about the disk setup for Retain?

    All on one drive, or several?

    The database: do you have that on a separate disk, segregated across a disk or disks? When I say disks, I am referring to drives.

    Also the VMs themselves: the number of cores and CPU allocated.

    The SAN connection: fiber, 10 Gb/s, 40 Gb/s, or...?

    This is somewhat of a borderline support question that could be checked against configurations we have seen. There should be a little more dialog to help get you to where you want to be.

  • 0   in reply to 
    I was only curious about the numbers. I have multiple customers with totally different hardware and different kinds of setups, from an all-in-one single-server setup to everything separate (DB, Worker, and Server), and the speed is about the same...

    David

  • 0 in reply to   

    Hi David,

    I am getting 1-2 messages per second with the PAM tool; a user archive of 65,150 items is 10 GB and is going to take about 6.5 hours. We have VMware with enough RAM on every guest that there is no swapping, and Fibre Channel that I have seen (with ATTO) run at 800 megabytes per second. The import is so slow that we could have 10 sites with multiple PAM sessions over a WAN and the bottleneck would NOT be the WAN. Something is wonky and very slow, and I do not know how to pinpoint it.

    It may have to do with the Retain system trying to do dedupe on every message as the message is presented. It should be that the batch is pumped to the server, and at night or over the weekend a task runs that dedupes: something like a staging table, where the first stage is stored and NOT deduped, so the intake can finish quickly and a sub-process completes the job later. In human terms, Retain has one stomach to process all the food, whereas we sophisticated humans have a stomach plus small and large intestines. LOL
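    To make the idea concrete, here is a minimal sketch of that staged approach. The names and data structures are made up for illustration; none of this is Retain's internals:

    import hashlib

    staging = []          # stage 1: raw messages, appended as fast as they arrive
    store = {}            # stage 2: deduplicated store, keyed by content hash

    def ingest(message: bytes) -> None:
        # Hot path: just append -- no hashing, no lookups.
        staging.append(message)

    def dedupe_pass() -> None:
        # Off-hours task: hash everything staged and fold duplicates.
        while staging:
            message = staging.pop()
            digest = hashlib.sha256(message).hexdigest()
            store.setdefault(digest, message)   # keep the first copy only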

  • 0   in reply to 

    Joe,

    Retain doesn't do dedupe after archiving, but during the job: every message gets a hash, and that hash is compared with already-stored messages while the archive job runs. Maybe this is what is slowing down the whole process.
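    For contrast, a minimal sketch of inline deduplication as described here (again with illustrative names, not Retain's actual code); each item pays for a hash plus a lookup on the archive job's hot path:

    import hashlib

    store = {}  # illustrative stand-in for the message store

    def archive(message: bytes) -> bool:
        digest = hashlib.sha256(message).hexdigest()
        if digest in store:        # duplicate: skip the write
            return False
        store[digest] = message    # unique: store it
        return True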

    I have a customer with 100 users, and the speed isn't much different from a customer with 2,000 users.

    David

  • 0 in reply to   

    It needs to be user selectable, at least on the PAM import.

    Is this on the roadmap or in a future version?

    Does Micro Focus use Retain, i.e., do they dogfood the software? Their system managers must hate Retain because of this issue: you can only support x users, since they want to archive x messages per day, it takes one second per message, and only after that can the backup start. LOL

  • 0 in reply to 

    I have drawn PM's attention to your post.

    Please allow him time to respond.

    Thanks

    Tarik

  • 0   in reply to 

    Hmm,

    I have large environments that use Retain, i.e., universities. Some of them archive every night, which is pretty fast. Some of them archive on weekends, in which case it is many thousands of items per run. Even if there are more than 100,000 items to archive, it happens within a few hours.

    If you have to archive local GroupWise archives, it is (a lot) slower. But in that case it helps to run more than one Retain Worker. I did this several times, especially when Retain was introduced to replace the good old client-based GroupWise archives.
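    To illustrate the effect (a hedged sketch only: Retain Workers are separate services, not threads, and these names are made up for illustration), mailboxes processed in parallel bring wall-clock time down roughly with the number of workers:

    from concurrent.futures import ThreadPoolExecutor

    def archive_mailbox(mailbox: str) -> str:
        ...                                  # stand-in for dredging one mailbox
        return mailbox

    mailboxes = [f"user{i}" for i in range(100)]
    with ThreadPoolExecutor(max_workers=4) as pool:  # ~4 "workers"
        for done in pool.map(archive_mailbox, mailboxes):
            pass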



  • 0 in reply to   

    Hi Diethmar,

    We archive about 15,000 mail items per night, and it takes about 300 minutes. So that is about 0.83 messages per second. I am told that it is normal for it to be that slow. My PAM imports from GroupWise 18.01 archives are also that slow. I have only been using Retain with GroupWise since the first version of GW18 and Retain 4.7. We have everything running on SUSE Linux per the Micro Focus documentation.
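    As a sanity check on that figure:

    items = 15_000
    seconds = 300 * 60
    print(round(items / seconds, 2))   # 0.83 messages per second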

    Are you running on Windows and all on a single box? Did you find settings you had to tweak?

    Thanks,

    Joe

  • 0   in reply to 

    Hi Joe,

    For example, one site archives about 30,000 items each weekend. Retain needs around 5 hours for this job. They have three post offices and about 4,000 users. They have two classes of users and a different job for each class; the numbers above are for the heavier job. Unfortunately they have only one Retain Worker (not the way I usually set things up). I think Retain's storage in the background is a little more than 1 TB. (They have been on Retain 4.9 for two weeks; 4.7 before, because of the operating system.)

    The other university has more than 10 post offices, and they use 4 or 5 Retain Workers to run jobs each night. They archive every night (I cannot access the numbers right now; maybe I will come back with more information later on). As I remember, none of the jobs needs more than one hour. In the background Retain occupies more than 1.7 TB. (They have been on Retain 4.9 for one week now; 4.7 before, because of the operating system.)

    I did not adjust any options or settings; I used the default Retain values. If there is more than one post office, then I play around with more Retain Workers.



  • 0 in reply to   

    Hi,

    I'm new to this community, and my apologies if it's not done to revive and/or hijack an old thread, but my questions really relate.

    I've inherited an undocumented Retain 4.9.0.1 setup. Every night some 800 messages from 19 mailboxes, about 200 MB in size, are added to Retain and are findable the next day. So the setup works. However, this process takes forever... like 4 to 5 hours on reasonable hardware: an all-(SATA)-flash array and a 6-core VM with 16 GB RAM on an idle vSphere server.

    Several things I would like to disclose and/or point out:

    • this is an old installation which has been upgraded over the years.
    • the filesystem is ext3 and not the recommended XFS, but it has plenty of free space and inodes.
    • the network has only a few percent load, and handles plenty when tested with iperf.
    • the disk usage on the Retain VM is insane. To process these few messages, the delta of the VMDK is 14-16 GB. That's right: not mega- but gigabytes, to store 200 MB of data.
    • from the logs it seems that all messages in GroupWise are processed, not just the new ones.

    Backing up the VM via snapshots is the recommended procedure in the installation guide (page 41). However, these insane deltas result in rather expensive offsite backups.
    I'm coming from Solaris, so I really miss the diagnostic tools (DTrace) to troubleshoot this system. I don't want to install any trace tools that have a known performance impact. All I can see is the amount of writes to the filesystems, plus the IOPS and deltas VMware reports. I cannot see which process issues these writes, or to which files. Is it mysql? Is it Lucene? Is it...?
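    One way to narrow that down without installing anything: on Linux the kernel already keeps per-process write counters in /proc/<pid>/io (tools like pidstat -d or iotop read the same data). A minimal sketch that ranks processes by bytes written; run it as root to see all processes:

    import os

    def writes_by_process():
        totals = {}
        for pid in filter(str.isdigit, os.listdir("/proc")):
            try:
                with open(f"/proc/{pid}/comm") as f:
                    name = f.read().strip()
                with open(f"/proc/{pid}/io") as f:
                    for line in f:
                        if line.startswith("write_bytes:"):
                            totals[name] = totals.get(name, 0) + int(line.split()[1])
            except OSError:
                continue                    # process exited or access denied
        return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    for name, nbytes in writes_by_process()[:10]:
        print(f"{name:20s} {nbytes / 2**20:10.1f} MiB written")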

    To sum it up:

    • Is it normal that Retain needs to rewrite 70-80x as much of its own storage as the amount of actual data being stored?
    • Is it by design that Retain needs to read the entire GroupWise PO instead of just the new messages?

    Any suggestions for improving this would be most appreciated.

    Thank you all for your time and interest,

    Benny

  • 0   in reply to 

    Hi Benny,

    I agree that 4 to 5 hours for this amount of messages is too long.

    I do not see a massive problem in your hardware or software. Your Retain version is not really old; you are only one version behind now (4.9.1.0). Given your Retain version, I do not expect your operating system is too old either. ext3 is maybe a little bit old, but I assume it will not cause a slow Retain.

    We cannot answer your question about the database. It can be MySQL, MariaDB, or PostgreSQL; it depends on how it was installed. Under Server Configuration you will see the answer.

    In the background there are a lot of log files which will tell you more; look at /var/log/retain-tomcat, for example.

    However, I assume your archiving job needs inspection. Maybe this job is checking all entries from the very beginning instead of from the last successful archiving job (you mention something similar).

    In your case I would open a service request to get additional help.



  • 0 in reply to   
    Hi Diethmar,

    thanks for your reply. First off, my bad: it's Retain 4.9.1.0, and the database is MySQL 5.5-something, which is way old, but that's what's installed.
    It seems like it's checking all messages in the PO every day. In the Retain status, the number of duplicates is almost the same as the number of messages.
    Reading the Retain documentation (pages 41, 42 and 196, 197), is the following correct?
    In the "Scope" tab of a profile:
    - Under "Date range", items are only "new" when they are newer than the timestamp set by "Advance timestamp".
    - "New" in the context of "Date range" has nothing to do with the status of an email in GroupWise; it relates only to "Advance timestamp".

    Probably the setting "All items in mailbox" in "Date range to scan" is wrong. "Don't advance timestamp" is NOT checked.

    This system was set up years ago by another company and hardly looked after. I'm by no means a Retain expert and don't want to mess with it too much without having more experience with Retain. I certainly don't want to miss any email...

    Thank you for your time.
    Regards,
    Benny
  • 0   in reply to 

    MySQL 5.5 is not supported with 4.9.1, but that is not the cause of the slowness; it's the "Date Range to Scan" setting. Set it to "New items" and you will be fine. Leave everything else as before, and don't check "Don't advance timestamp": enabling it means the timestamp flag will not be updated, so items that are dredged will still be considered new by Retain the next time the job runs. That is useful when troubleshooting, but is generally not used for normal jobs.
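    In other words, roughly this logic (a sketch with made-up field names, not Retain internals):

    from datetime import datetime

    last_run = datetime(2020, 1, 1)           # the per-mailbox timestamp flag

    def select_items(items, scan_all=False, advance=True):
        global last_run
        if scan_all:                           # "All items in mailbox"
            batch = list(items)                # everything re-checked -> duplicates
        else:                                  # "New items"
            batch = [i for i in items if i["delivered"] > last_run]
        if advance and batch:                  # normal jobs advance the flag;
            last_run = max(i["delivered"] for i in batch)
        return batch                           # "Don't advance" re-dredges next run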

    David

  • 0 in reply to   
    Hi David,
    thank you for your reply. I'm aware of the MySQL version as well as the ext3 filesystem; after reading the planning and install guides, I verified the requirements against the current system.
    I'll set it to "new items" and see how it behaves after that.
    Could you please verify my two statements in my previous post? Just so that I understand things correctly.

    I expect that changing this setting will resolve some of my pain, but not my curiosity. Why all the writes? It's not the logging; that's mostly info level. I don't see the filesystem grow apart from the daily mail ingest and some logging. What process does all the writing when it is handling mostly duplicates? All I know for sure is that it's Retain-related, because Retain and its data are installed on a single filesystem. And when checking the amount of writes in lifetime_writes_kbytes, I can see the immense amount of (re)writes (the filesystem isn't actually growing that much).

    Thank you for your time,
    Benny
  • 0   in reply to 

    The major I/O is from Lucene. In the Maintenance section there is "Enable Index Optimization"; when it is set to daily, it will merge the index every day, and the process will use double the amount of space that the index is using.
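    A back-of-the-envelope illustration, with an assumed, purely illustrative index size:

    index_gb = 50                        # assumed index size, illustrative only
    peak_gb = 2 * index_gb               # old segments and merged copy coexist
    daily_gb = 30 * index_gb             # rewritten per month on a daily schedule
    monthly_gb = 1 * index_gb            # rewritten per month on a monthly schedule
    print(peak_gb, daily_gb, monthly_gb)     # 100 1500 50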

    You can set it to once a month; for so small a system that is enough.

    David

  • 0 in reply to   
    Hi David,
    thanks again. Index backup and rebuild is set to weekly, so that doesn't explain the daily I/O.
    I've changed the setting to only process new items. I'll report back tomorrow about the effects.

    Regards,
    Benny