Gathering OES23.4 In-Place Upgrade issues (and their solutions)

Hi. I wanted to start a thread collecting all real-life in-place upgrade issues along with their possible solutions (where known). I also want to discuss the upgrade process in general, and whether there isn't a lot of potential for improvement, because in my mind there are dozens of unnecessary and dangerous steps the upgrade process performs for no reason.

So let's start:

1. Check /etc/sysconfig/novell/edir_oes2018_sp3 and verify that it contains 'SERVICE_CONFIGURED="yes"'. If it says "no", the upgrade to OES 2018 SP3 never finished properly. Most likely "yast channel-upgrade-oes" was never run, or it ran into an error that was never fixed and now haunts you. Your server is most likely working fine anyway, but your upgrade will fail miserably. (*)

Solution: Edit the file to read 'SERVICE_CONFIGURED="yes"' *before* attempting the upgrade. If you don't do this, the upgrade process will attempt to add your server to the eDir tree instead of upgrading it.
(IMNSHO this is a massive bug. Relying on some freely editable, notoriously unreliable config file that has no meaning whatsoever for the operation of the server to determine whether a server is an active part of an eDir tree or not is insane. Why not just ask eDir instead?)
Also, when this happens, verify the other OES 2018 SP3 files in /etc/sysconfig/novell, as some of the others are most likely wrong too.
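Something like this can be used to check and fix the flags before the upgrade (a sketch; the *oes2018_sp3 file-name glob is my assumption about how the other service files are named, so verify it on your own box first):

    # show the current eDirectory flag
    grep SERVICE_CONFIGURED /etc/sysconfig/novell/edir_oes2018_sp3

    # list any OES 2018 SP3 service file that still says "no"
    grep -l 'SERVICE_CONFIGURED="no"' /etc/sysconfig/novell/*oes2018_sp3

    # flip the eDirectory flag to "yes" before starting the upgrade
    sed -i 's/SERVICE_CONFIGURED="no"/SERVICE_CONFIGURED="yes"/' /etc/sysconfig/novell/edir_oes2018_sp3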

2. The upgrade *will* activate the firewall in all cases, which will block most non-standard traffic. Solution: Obviously, disable the firewall again after the upgrade, or configure it for your needs. I personally consider server-side firewalls a completely broken idea.
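If you want the pre-upgrade behaviour back, something along these lines should do it (a sketch; it assumes the upgrade enabled firewalld, the default on the SLES 15 base, and the port shown is only an example):

    # check whether the upgrade switched the firewall on
    systemctl status firewalld

    # either turn it off again ...
    systemctl disable --now firewalld

    # ... or open what you actually need (example: NCP on TCP 524), then reload
    firewall-cmd --permanent --add-port=524/tcp
    firewall-cmd --reload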

3. The upgrade will alter your timesync config massively. If you have multiple time servers configured now, it will only take over the first one; in addition, it will add a public SUSE NTP pool to your setup without asking. On top of that (and this is nasty), it will stop your server from answering NTP requests, as the /etc/chrony.conf it creates does not contain an "allow" directive. Many installations rely on OES servers as their time sources, and those will no longer work after the upgrade.

Solution: Edit /etc/chrony.conf (or use YaST) and add back all your time servers, plus an "allow" line if other machines sync from this server. Also, remove /etc/chrony.d/pool.conf (that is the public SUSE server pool you may not want).
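A minimal /etc/chrony.conf sketch for a server that should keep serving time to its clients (the server names and the allow subnet are placeholders, adjust them to your environment):

    server ntp1.example.com iburst
    server ntp2.example.com iburst

    # let clients on your LAN query this server again
    allow 192.168.0.0/16

    driftfile /var/lib/chrony/drift

Afterwards, remove the unwanted public pool and restart the daemon:

    rm /etc/chrony.d/pool.conf
    systemctl restart chronyd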

4. Less important, but it may hit you anyway, especially when you run GroupWise: the upgrade will re-enable postfix if it was disabled before. Solution: Disable postfix again if, for example, your GWIA no longer listens on port 25 and you need it to listen on more than one IP.
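To put postfix back into its pre-upgrade state (a sketch; it assumes GWIA is supposed to own port 25 on this box):

    # see whether the upgrade re-enabled postfix and whether it grabbed port 25
    systemctl is-enabled postfix
    ss -tlnp | grep ':25'

    # disable and stop it again so GWIA can bind to the port
    systemctl disable --now postfix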

More to come. Feel free to add whatever you have found to the discussion.

  • 0  

    I forgot:

    7. SNMP is broken after the upgrade. Solution: "zypper in libsnmp40"

    See also: community.microfocus.com/.../
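    To verify the fix took effect (assuming the standard snmpd agent from net-snmp):

        zypper in libsnmp40
        systemctl restart snmpd
        systemctl status snmpd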

  • 0   in reply to   

    I had a fun one today. I just in-place upgraded 10 OES 2018.3 servers to OES 23.4 with virtually no issues (booting the ISO). On the 11th one, I now get this failure:

    Details:

    These are all VMs, so I aborted, rolled back, and checked, and that kernel isn't even installed:

    rpm -qa | grep kernel-default
    kernel-default-4.12.14-122.183.1.x86_64
    kernel-default-4.4.180-94.121.1.x86_64
    kernel-default-4.12.14-122.37.1.x86_64
    kernel-default-4.12.14-122.124.3.x86_64
    kernel-default-4.12.14-122.32.1.x86_64
    kernel-default-4.12.14-122.113.1.x86_64
    kernel-default-4.12.14-122.26.1.x86_64
    kernel-default-4.12.14-122.91.2.x86_64
    kernel-default-4.12.14-122.159.1.x86_64
    kernel-default-4.12.14-122.46.1.x86_64
    kernel-default-4.12.14-122.54.1.x86_64
    kernel-default-4.12.14-122.83.1.x86_64
    kernel-default-4.12.14-122.106.1.x86_64
    kernel-default-4.12.14-122.136.1.x86_64
    kernel-default-4.12.14-122.71.1.x86_64
    kernel-default-4.12.14-122.57.1.x86_64
    kernel-default-4.12.14-122.127.1.x86_64

    I tried it one more time, same issue. So I tried just seeing what happens if I choose Ignore. It finishes the package installation, but then blows up after that. I never get to the Upgrade eDirectory question; instead I get this:

    I hit OK, and the server comes up to the login prompt. So the server still boots, but it never ran any of the OES 23.4 upgrade process. I ended up rolling back again.

    Wondering if anyone has any ideas on that one?

    Matt

  • 0 in reply to   

    This can be a space problem on your /boot partition. There are two solutions for this:

    1. Remove all old kernel files and all *.gz files from /boot (see the sketch below).

    2. Increase the size of the /boot partition (more difficult).

    Then you can successfully install the new version, OES 23.4.

    You can also use PuTTY during the installation and remove the old kernel files from /boot.

    I had this problem, solved it myself, and reported it to OT.
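    A rough sketch of that manual cleanup (the version number is only an example taken from the rpm list above; double-check which kernel you are actually running before deleting anything, and snapshot the VM first):

        # which kernel is currently running?
        uname -r

        # see what is taking up space in /boot
        ls -lh /boot

        # remove images and initrds of kernels you no longer boot, using the
        # exact names that ls shows, for example:
        #   rm /boot/vmlinuz-4.4.180-94.121-default /boot/initrd-4.4.180-94.121-default

        # remove leftover compressed files
        rm -f /boot/*.gz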

  • 0   in reply to 

    That's not the issue; there is plenty of space. I thought of space on /boot as well, and that was the first thing I checked. This server is actually using EFI, so /boot isn't a partition; /boot/efi is, and it has 5.5 GB of space with almost nothing used. In /boot (which I assume must then be on the root partition?), there is just under 1 GB of kernel files. And again, there is plenty of space there (27 GB). Most of the servers I upgraded are configured identically and had no issues whatsoever. Only this one is having an issue.

    I did try removing one of the old kernels (rpm -e kernel-default-4.4.180-94.121.1.x86_64), but I get a huge list of dependency warnings, so I cannot remove it.

    Matt

  • 0 in reply to   

    Your decision! But you can't see the complete disk picture from there; /boot does not have much room.
    I resolved my installation problem by removing exactly that, including the *.gz files.

    Try it and win!

    Keep only the latest kernel files:
    .vmlinuz-5.14.21-150400.24.97-default.hmac
    System.map-5.14.21-150400.24.97-default
    config-5.14.21-150400.24.97-default
    initrd -> initrd-5.14.21-150400.24.97-default
    initrd-5.14.21-150400.24.97-default
    vmlinuz -> vmlinuz-5.14.21-150400.24.97-default
    vmlinuz-5.14.21-150400.24.97-default

    Remove all *.gz files.

    Take a snapshot before and test it!

    Keep only your latest version of the kernel files! You still have old versions; OES 23.4 ships 5.14.21.

  • 0   in reply to   

    Hi Matt,

    You write that the current server is booting via EFI. Can you please check whether your installation works without a bootloader (e.g. GRUB2)? This is possible with certain kernel versions. Please just run mkinitrd without parameters and post the output here; maybe I can already say something about your problem.

    What does zypper purge-kernels actually say (SLES 15)? (Please be careful: back up the initial RAM disk and back up the system beforehand.)

    Run zypper se -si 'kernel*'; I have just seen that there is still a 4.x kernel in your installation. Use zypper rm PACKAGENAME-VERSION to remove the package.

    Claude's advice is helpful; I often do this in the field.
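
    For reference, that sequence looks roughly like this (a sketch; zypper purge-kernels honours the multiversion.kernels setting in /etc/zypp/zypp.conf, so review what it plans to remove before confirming):

        # rebuild the initial RAM disk and show which kernel/initrd it targets
        mkinitrd

        # list every installed kernel package with its version
        zypper se -si 'kernel*'

        # let zypper remove kernels outside the configured keep list
        zypper purge-kernels

        # or remove one specific old kernel by hand (version is only an example):
        #   zypper rm kernel-default-4.4.180-94.121.1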


    George

    “You can't teach a person anything, you can only help them to discover it within themselves.” Galileo Galilei