
High Availability for Sentinel Log Manager in 9 points ...and many more links

This article provides guidelines for the implementation of a High Availability (HA) cluster for Sentinel Log Manager (SLM) with the SUSE Linux Enterprise High Availability Extension (SLE-HAE). It is not intended to be a detailed step-by-step guide to install and configure every single component, but rather it provides some general guidelines for administrators who are familiar with SLM and need to set up and configure a two-node HA cluster. A minimum understanding of Linux and HA technology is nevertheless required.

There is already very good documentation, and there are other very good guides around, about HA clusters with SLES. The real purpose of this article is to provide general guidelines based on my personal experience with HA environments and SLM. Of course, this is not an exact recommendation for designing SLM in a particular environment. There are many ways to do it, given the large number of options and the variety of approaches which SLE-HAE provides.


    • Introduction



    A very important concept to keep in mind throughout the document is that High Availability is not achieved by simply installing the failover software and walking away. Many techniques at different layers are used to make each individual system, and the overall infrastructure around it, as reliable as possible. Typically, this is achieved by using redundancy of system components to eliminate Single Points Of Failure (SPOF), and an HA cluster is one of these techniques, not "the technique". If you are implementing an HA cluster and have never considered the need for redundant power supply units or high-quality network cables, for example, you are probably wasting your money.

    Through the next points we will discuss some of these techniques, probably the most fundamental ones, to provide more reliability for each individual server at the various levels.

    A few words about the design of the cluster are important as well. This is a two-node Active/Passive cluster, which is the most common deployment for an HA cluster since it is the minimum required to provide redundancy. Moreover, the SLM application itself is not designed to run in an Active/Active deployment.

    Each server of the cluster will have local storage for operating system's data and shared storage for SLM's data and Storage-based Fencing.

    Each server will also have four NICs in order to provide redundancy on each network channel.

    Regarding software, SUSE Linux Enterprise Server 11 Service Pack 1 is installed on each server. On top of it, the SUSE Linux Enterprise High Availability Extension will be added as an add-on and Sentinel Log Manager 1.2 will be installed. The next points discuss this in a little more detail.


    • Base Operating System



    There are no particular guidelines about the installation of the base operating system. It is simply a standard installation of SLES11 SP1 on each node of the cluster. If virtual machines are used, the clone functionality can be used to ease the procedure.

    The operating system will be installed on the local storage of each server of the cluster, in accordance with the relevant documentation. Local storage will most likely be configured with RAID technology, preferably hardware rather than software RAID, to add reliability and protection to our data. Partitioning can vary since there are no specific requirements or indications on this point; SLM's data will not reside here. Of course, the simpler it is, the better.

    After the initial installation it will be necessary to configure time synchronization and name resolution among the nodes of the cluster.
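    For example, a minimal sketch of this step could look like the following (the host names node1/node2 and the IP addresses are placeholders for your environment):

        # /etc/hosts on both nodes: make sure each node can resolve the other by name
        192.168.43.101   node1
        192.168.43.102   node2

        # enable and start NTP time synchronization on both nodes
        chkconfig ntp on
        rcntp start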

    Beyond this, the basic common recommendations should be considered as usual: install only the needed software, keep the system up to date, secure the system, keep the overall configuration as simple as possible, and so on.


  • Network


Redundant network channels are required to provide fault tolerance and load balancing across the active connections so that single cable, switch, or network interface failures do not result in network outages. In our HA cluster there will be two separate networks:

  1. the public network, on which the SLM service will be listening

  2. the cluster network, which will be dedicated to the communication between the two cluster nodes

While for the cluster network the redundancy will be created by configuring the cluster itself, for the public network it is necessary to aggregate multiple network interfaces. This is achieved by using the Linux bonding driver.

The Linux bonding driver provides a method for aggregating multiple network interfaces into a single logical "bonded" interface. The behavior of the bonded interfaces depends upon the mode; generally speaking, modes provide either hot standby or load balancing services. Additionally, link integrity monitoring may be performed.

The simplest configuration is likely the active-backup policy, or mode 1. In this mode, only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. More information can be found in the kernel documentation (/usr/src/linux/Documentation/networking/bonding.txt).
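As an illustration only, a bonded interface in active-backup mode on SLES 11 could be defined with an ifcfg file similar to the following (device names, IP address and netmask are assumptions for this example):

    # /etc/sysconfig/network/ifcfg-bond0
    STARTMODE='auto'
    BOOTPROTO='static'
    IPADDR='192.168.43.101'
    NETMASK='255.255.255.0'
    BONDING_MASTER='yes'
    BONDING_MODULE_OPTS='mode=active-backup miimon=100'
    BONDING_SLAVE0='eth0'
    BONDING_SLAVE1='eth1'
    # eth0 and eth1 also need their own ifcfg files prepared for bonding
    # (see the kernel documentation and the TID referenced below)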

Also, a good Novell TID is available at the following link:
http://www.novell.com/support/php/search.do?cmd=displayKC&docType=ex&bbid=TSEBB_1222707479531&url=&stateId=0 0 235832705&dialogID=116390763&docTypeID=DT_TID_1_1&externalId=3929220&sliceId=2&rfId=


  • Multipath I/O



Just like the network channels, redundant data connections are required to provide fault tolerance and load balancing across the active connections so that single cable, switch, or interface failures do not lead to loss of connectivity to the storage.

Multipathing is the ability of a server to communicate with the same physical or logical block storage device across multiple physical paths between the host bus adapters in the server and the storage controllers for the device, typically in Fibre Channel (FC) or iSCSI SAN environments.

Typically, Multipath I/O automatically detects and configures multipath devices.
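On SLES 11, for example, multipathing is handled by the multipathd daemon; a minimal sketch to enable it and verify the detected paths could look like this:

    # enable the multipath services at boot and start them
    chkconfig boot.multipath on
    chkconfig multipathd on
    /etc/init.d/boot.multipath start
    /etc/init.d/multipathd start

    # list the multipath devices and the state of their paths
    multipath -ll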

When using Multipath I/O with Fibre Channel, it is also recommended to review the kernel module settings of the Fibre Channel devices in accordance with the vendor's documentation. This is necessary since they are typically not configured for HA environments, which means that the settings are not tuned to provide fast failure detection and fast failover.

For more information see the "SLES 11 SP1 Storage Administration Guide" available at the following link: http://www.novell.com/documentation/sles11/stor_admin/?page=/documentation/sles11/stor_admin/data/multipathing.html

A good Novell TID is available at the following link:
http://www.novell.com/support/viewContent.do?externalId=3231766&sliceId=1


  • High Availability Extension



After the installation and the configuration of the base operating system and some fundamental components, we can proceed with the installation of the High Availability Extension and the basic cluster setup.

SLE-HAE is a software add-on which needs to be installed on top of each server of the cluster. Both the installation and the initial cluster setup are quite straightforward and are documented step by step in section "3.0 Installation and Basic Setup with YaST" at the following link:
http://www.novell.com/documentation/sle_ha/book_sleha/?page=/documentation/sle_ha/book_sleha/data/sec_ha_installation_inst.html

After the cluster is brought online, it is possible to manage and configure cluster resources and options. This can be done via the GUI, the command line, or the Web Interface. Each of them has its pros and cons. To cut it short, the command line is much more powerful, the GUI is more user-friendly, and the Web Interface is the coolest tool but still needs to be improved.

For those who are using the GUI: before logging in to the cluster, the respective user must be a member of the haclient group. The installation creates a Linux user named hacluster which is a member of the haclient group. Before using the GUI, either set a password for the hacluster user or create a new user which is a member of the haclient group. Do this on every node you will connect to with the GUI.
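For example (a minimal sketch, assuming you reuse the pre-created hacluster user):

    # on every node you will connect to with the GUI
    passwd hacluster      # set a password for the hacluster user
    crm_gui &             # start the Pacemaker GUI and log in as hacluster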


The following is the status of your cluster from the GUI at this point:



Now, the very first configuration change for our HA concerns a couple of Global Cluster Options (the equivalent crm shell commands are sketched below):

  • "No Quorum Policy" needs to be set to ignore. This is needed for a two node cluster because it is impossible to form quorum in a two node cluster.
  • "Default Resource Stickiness" is the global option we have to set to a positive value in order to disable fail back of resources. Basically, it means to leave resources where they are.

The good result at this point should look similar to the following:



Our HA cluster is now ready to run resources, but we prefer to do that after some other components have been configured and the underlying shared storage is finished.


  • Fencing/STONITH



Before we talk about fencing and STONITH, we need to review the need for shared storage, which basically means storage accessible from both nodes of the cluster.

There are multiple possibilities to deploy shared storage for a cluster, but considering the requirements of SLM, the most feasible option for a production environment is likely either a SAN device (iSCSI or FC) when working with physical servers, or unformatted SAN LUNs when using virtual machines on VMware ESX/ESXi.

The procedure to properly set up the SAN environment needs to be discussed with your SAN and VMware administrators.

As a result you will need to have at least one shared storage device big enough to satisfy the minimal storage requirements for SLM data (discussed later at point 7), and one more shared storage device for the storage-based fencing (SBD) discussed here.

Fencing is a very important concept in HA clusters and may be defined as a method to bring an HA cluster to a known state. STONITH is the node-level fencing implementation in SLE-HAE. It is the primary component in the cluster which prevents uncoordinated concurrent access to shared data storage, protecting the integrity of the data from split brain. Split brain is the scenario in which the cluster nodes are divided into two or more groups that do not know of each other (an unknown state of the cluster).

There are several STONITH devices and resources which are supported by SLE-HAE. The Split Brain Detector (SBD) is one of these and can be reliably implemented, along with watchdog support, to avoid split-brain scenarios in environments where shared storage is available. It provides a way to enable STONITH and fencing in clusters without external power switches, but with shared storage, just like Novell Cluster Services, which uses SBD to exchange poison pill messages.

A small partition (1MB) is formatted for use with SBD. After the respective daemon is configured, it is brought online on each node before the rest of the cluster stack is started. It is terminated after all other cluster components have been shut down, thus ensuring that cluster resources are never activated without SBD supervision.

The configuration of the external/sbd agent is pretty straightforward. It basically requires the following steps (a condensed command sketch follows the list):

  1. the creation of the SBD partition on the shared device
  2. setting up the appropriate watchdog*
  3. starting the SBD daemon
  4. testing SBD
  5. configuring the cluster resource (stonith:external/sbd)

*you may want to add the watchdog kernel module to the MODULES_LOADED_ON_BOOT variable in /etc/sysconfig/kernel so that it is loaded at boot
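A condensed sketch of these steps could look like the following (the device path and the use of the softdog module are assumptions to adapt to your environment):

    # 1. initialize the SBD partition on the dedicated shared device
    sbd -d /dev/disk/by-id/your-sbd-partition create
    sbd -d /dev/disk/by-id/your-sbd-partition dump

    # 2. load a watchdog module (softdog if no hardware watchdog is available)
    modprobe softdog

    # 3. point the cluster stack at the SBD device in /etc/sysconfig/sbd, e.g.:
    #    SBD_DEVICE="/dev/disk/by-id/your-sbd-partition"
    # then restart the cluster stack so the sbd daemon is started

    # 4. test SBD by listing the message slots of the nodes
    sbd -d /dev/disk/by-id/your-sbd-partition list

    # 5. configure the STONITH resource (newer versions can read the device
    #    from /etc/sysconfig/sbd instead of the sbd_device parameter)
    crm configure primitive stonith-sbd stonith:external/sbd \
        params sbd_device="/dev/disk/by-id/your-sbd-partition"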

The whole procedure is well detailed at the following link:
http://www.novell.com/documentation/sle_ha/book_sleha/?page=/documentation/sle_ha/book_sleha/data/sec_ha_storage_protect_fencing.html#pro_ha_storage_protect_watchdog

The good result at this point should look similar to the following:



More information at the following links:







  • Shared Storage



Shared storage for SLM data will be managed by Linux Volume Manager 2 (LVM2) and its extension Clustered LVM (cLVM).

LVM2 provides a higher-level view of the disk storage than the traditional view of disks and partitions. It basically gives the system administrator much more flexibility in allocating storage to applications and users.

LVM2, which is widely used to manage local storage, has been extended to support transparent management of volume groups across the whole cluster so that clustered volume groups can be managed using the same commands as local storage. This extension takes the name of cLVM.

For more information about LVM2 see the following link:
http://www.novell.com/documentation/sles11/stor_admin/?page=/documentation/sles11/stor_admin/data/lvm.html

For more information about cLVM see the following link:
http://www.novell.com/documentation/sle_ha/book_sleha/?page=/documentation/sle_ha/book_sleha/data/cha_ha_clvm.html

There are two primary benefits of using cLVM in our HA cluster for SLM:

  • The possibility to expand the file system used by SLM at will; this is especially useful with a growing data store, as in the case of SLM.
  • The possibility to leverage cLVM's exclusive activation option to provide even more protection to the shared data.

The configuration of cLVM may not be as easy as expected, since it involves coordinating different tools (DLM, cLVM and LVM2). The whole procedure can be split into three different parts:

  1. First of all, it is necessary to create in the cluster a cloned base group with DLM (ocf:pacemaker:controld) and cLVM (ocf:lvm2:clvmd), in exactly this order. When creating this clone resource, remember to set the meta attribute "interleave=true" (a crm sketch is shown at the end of this point).

    The good result at this point should look similar to the following:





    More information about the configuration of it at the following link:
    http://www.novell.com/documentation/sle_ha/book_sleha/?page=/documentation/sle_ha/book_sleha/data/sec_ha_clvm_config.html
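    A minimal crm shell sketch of this first part could be (resource names are just examples):

        crm configure primitive dlm ocf:pacemaker:controld \
            op monitor interval="60" timeout="60"
        crm configure primitive clvm ocf:lvm2:clvmd \
            op monitor interval="60" timeout="60"
        crm configure group base-group dlm clvm
        crm configure clone clone-base-group base-group meta interleave="true"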

  2. With the clone-base-group resource online, the following steps need to be performed (on one node of the cluster only; a command sketch follows the list):

    • create a partition (0x8E Linux LVM) on the shared storage device with fdisk
    • create the physical volume (PV) with pvcreate
    • create the Volume Group (VG) with vgcreate (with the "--clustered y" option)
    • create the Logical Volume (LV) with lvcreate
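    A minimal sketch of these commands could be (device and volume names are assumptions for this example; the multipath device name depends on your setup):

        # one partition of type 8E (Linux LVM) is assumed to exist on the shared LUN
        pvcreate /dev/mapper/sentinel-lun_part1
        vgcreate --clustered y vg_sentinel /dev/mapper/sentinel-lun_part1
        lvcreate --name lv_sentinel --extents 100%FREE vg_sentinel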

  3. Now we need to create a new group resource in the cluster, named "sentinel-group" for example, this time not cloned, and its first primitive resource in it, which is the LVM OCF RA (ocf:heartbeat:LVM).

      With it we specify the volume group created above and also, quite importantly, the option "exclusive=true". Then, it is also important to specify with an order constraint the order in which the two groups must be started: in fact, the "clone-base-group" has to start before and stop after the "sentinel-group". A crm sketch of this part follows below.
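      A minimal crm shell sketch of this part could be (resource and volume group names reuse the examples above):

          crm configure primitive sentinel-lvm ocf:heartbeat:LVM \
              params volgrpname="vg_sentinel" exclusive="true" \
              op monitor interval="60" timeout="60"
          crm configure group sentinel-group sentinel-lvm
          crm configure order base-before-sentinel inf: clone-base-group sentinel-group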



    The good result at this point should look similar to the following:



    We are now ready to move on to the file system part.


    • File System



    SLES ships with a number of different file systems from which to choose, including Ext3, Ext2, ReiserFS, and XFS. Each file system has its own advantages and disadvantages. In addition, the High Availability Extension provides the Oracle Cluster File System 2 (OCFS2) for high-performance setups which might require highly available storage systems.
    However, for this scenario the benefit of having a cluster-aware file system like OCFS2 is not required. This is because there will always be one single instance of SLM in the cluster, without the need for concurrent access. Therefore, the best choice is to simply use the file system tested by Novell for SLM, which is Ext3.

    The procedure to create the file system and the relative cluster resource is quite simple and can be summed up in the following steps (a command sketch follows the list):

    1. Creation of the directory which will be the mount point for the created file system on both nodes of the cluster. For example, we could create the directory /sentinel in the root of the file system.

    2. Creation of the ext3 file system on the logical volume previously created. This needs to be done once, from one node of the cluster only, of course. Since the file system will be completely managed (mounted and unmounted) by the cluster itself, /etc/fstab does not need to be configured to mount it automatically at boot.

    3. Testing, as usual. It is always good practice to verify that the file system can be manually mounted and unmounted from both nodes of the cluster.

    4. Creation of the cluster resource (ocf:heartbeat:Filesystem) in the same group along with the LVM resource previously created (sentinel-group)
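    A minimal sketch of these steps could be (names reuse the examples above; the resource is then appended to the existing sentinel-group, for instance with "crm configure edit sentinel-group"):

        # on both nodes: create the mount point
        mkdir /sentinel

        # on one node only: create the ext3 file system on the clustered LV
        mkfs.ext3 /dev/vg_sentinel/lv_sentinel

        # cluster resource for the file system
        crm configure primitive sentinel-fs ocf:heartbeat:Filesystem \
            params device="/dev/vg_sentinel/lv_sentinel" directory="/sentinel" fstype="ext3" \
            op monitor interval="60" timeout="60"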

    The good result at this point should look similar to the following:




    • Sentinel Log Manager



    Before proceeding with the installation of SLM, it is necessary to use a shared IP address, just like in most HA cluster deployments. The shared IP address will be configured as a cluster resource using the resource ocf:heartbeat:IPaddr2. The attribute "nic" will be used to specify the public network interface, which is the bond interface (bond0).
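    For example, a minimal sketch of this resource, reusing the example IP address used later in this article, could be:

        crm configure primitive sentinel-ip ocf:heartbeat:IPaddr2 \
            params ip="192.168.43.103" cidr_netmask="24" nic="bond0" \
            op monitor interval="10" timeout="20"

    Depending on your design, this primitive can also be added to the sentinel-group so that it always moves together with the SLM data.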

    The good result at this point should look similar to the following:



    Now it is high time to install and configure SLM. There are a few configurations to take care of, which are discussed point by point:

    1. The installation needs to be done on one server only and in accordance with the SLM documentation. It is a standard installation, with the exception of using the --location option in order to specify the location of our SLM data. In fact, by default the installation will spread files across the root file system. This is not good for our HA cluster since we want to keep all files on the shared storage. For the rest, the installation can be done as usual.

      NOTE: the same cluster node where SLM is initially installed MUST be used for future upgrades as well. Upgrading SLM from a different cluster node than the one used for the initial installation may wipe out the old installation. This is because the current SLM installer is designed to check for previous versions of itself against the RPM database. Since SLM's RPMs are present only on the node where it was originally installed, when they are not found the installer considers it a new installation, consequently wiping out all the existing data.

    2. After verifying that SLM is working properly on the node where it has been installed, we need to stop it and disable it from starting at boot (chkconfig sentinel_log_mgr off), since it will be started only by the cluster.

    3. We need to fix up all the explicit and automatic IP addresses used by the SLM configuration and scripts.
      • In the file 3rdparty/tomcat/webapps/ROOT/novellsiemdownloads/configuration.xml it is necessary to change the two places that reference an IP address to use the cluster IP address.
      • In the file bin/start_tomcat.sh, just comment out the line that assigns SERVER_IP and add a new line that assigns it the cluster IP address. The following is an example that assigns the IP address 192.168.43.103:

        # dynamic detection of the local IP is disabled; the cluster (shared) IP address is used instead
        #SERVER_IP=$(/sbin/ifconfig | grep bond -A 2 | grep 'inet' | grep -v 'inet6' | grep -v '127.0.0' | head -1 | cut -d: -f2 | awk '{print $1}')
        SERVER_IP=192.168.43.103



      • Start SLM and verify that everything is working fine (WebUI and ESM UI), pointing to the cluster IP address.


    4. Stop SLM. Go to the other node, edit /etc/passwd and add to it the line for the novell user taken from the /etc/passwd file of the node where SLM was installed. Use scp to copy over the novell user's home directory and the file /etc/init.d/sentinel_log_mgr.
    5. Migrate the resource group over to the other node. Start SLM and verify that everything is working fine (WebUI and ESM UI), again pointing to the cluster IP address.
    6. At this point, there is only one cluster resource left to be created, which is the one for SLM itself. SLE-HAE does not come with any Resource Agent for SLM, so we could either use the LSB RA or create our own OCF RA. The latter is the better option and is the one chosen here. Attached to this article there is the archive "SLM.tgz" which contains the file "SLM", a very good example of a working OCF RA for SLM.

      This needs to be copied to each server of the cluster, in a dedicated subdirectory under /usr/lib/ocf/resource.d. For example, you will create the folder /usr/lib/ocf/resource.d/logmanager and then put the OCF RA in it. Remember to add the execute permission to the file.

      After that, the cluster resource ocf:logmanager:SLM is ready to be created as the last primitive in our sentinel-group resource (a command sketch follows).
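      A minimal sketch of these last steps could be (the monitor values are just example figures; the resource is then appended to the sentinel-group, for instance with "crm configure edit sentinel-group"):

          # on each cluster node: deploy the attached OCF RA
          mkdir -p /usr/lib/ocf/resource.d/logmanager
          cp SLM /usr/lib/ocf/resource.d/logmanager/SLM
          chmod +x /usr/lib/ocf/resource.d/logmanager/SLM

          # on one node: create the SLM resource
          crm configure primitive sentinel-slm ocf:logmanager:SLM \
              op monitor interval="60" timeout="300"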

    The good result at this point should look similar to the following:



    More information can be found at the following link:

