Introduction

Service Nodes

In large clusters, it is desirable to have more than one node (the management node, or MN) handle the installation and management of the compute nodes. We call these additional nodes service nodes (SNs).
The management node can delegate all management operations needed for a compute node to the SN that is managing that compute node. You can have one or more service nodes set up to install and manage groups of compute nodes. See for a high-level diagram showing this structure.

With xCAT, you have the choice of either having each service node install/manage a specific set of compute nodes, or having a pool of service nodes, any of which can respond to an installation request from a compute node. (Service node pools must be aligned with the network broadcast domains, because a compute node chooses its SN for that boot by whichever SN responds to its DHCP request broadcast first.) You can also have a hybrid of the two approaches, in which each specific set of compute nodes has two or more SNs in a pool.

Where Did I Come From and Where Am I Going?

This document explains the basics of setting up a hierarchical Linux cluster using service nodes. The reader should already be very familiar with setting up a non-hierarchical xCAT cluster. This document covers only the additional steps needed to make the cluster hierarchical by setting up the SNs.
It is assumed that you have already set up your management node according to the instructions in the relevant xCAT 'cookbook' on the page. For example:

Note: if using P775 hardware, reference the link above.

Once you have used this document to define your service nodes, you should return to the xCAT cookbook you are using to deploy the service nodes and then the compute nodes.

Service Node 101

The SNs each run an instance of xcatd, just like the MN does. The xcatd daemons communicate with each other using the same XML/SSL protocol that the xCAT client uses to communicate with xcatd on the MN. The service nodes also need to communicate with the xCAT database on the management node. They do this by using the remote client capability of the database (that is, they do not go through xcatd for database access).
Therefore the management node must be running one of the daemon-based databases supported by xCAT (PostgreSQL, MySQL, or DB2). (Currently DB2 is only supported in xCAT clusters of Power 775 nodes.) The default SQLite database does not support remote clients and cannot be used in hierarchical clusters. This document includes instructions for migrating your cluster from SQLite to one of the other databases. Since the initial install of xCAT always sets up SQLite, you must migrate to a database that supports remote clients before installing your service nodes.

When xCAT installs your service nodes, it will install on them the xCAT software and other required rpms such as perl, the database client, and other prerequisites. Service nodes require all the same software as the MN (because they can perform all of the same functions), except that there is a special top-level xCAT rpm for SNs called xCATsn vs.
the xCAT rpm that is on the management node. The xCATsn rpm tells the SN that the xcatd on it should behave as an SN, not as the MN.

Setup the MN Hierarchical Database

Before setting up service nodes, you need to set up MySQL, PostgreSQL, or DB2 as the xCAT database on the management node. The database client on the service nodes will be set up later when the SNs are installed. MySQL and PostgreSQL are available with the Linux OS. DB2 is an IBM product that must be purchased, and it is only supported by xCAT in Power 775 clusters. Follow the instructions in one of these documents for setting up the management node to use the selected database:

To use MySQL or MariaDB:
Follow this documentation and be sure to use the xCAT-provided mysqlsetup command to set up the database for xCAT.

To use PostgreSQL:

Follow this documentation and be sure to use the xCAT-provided pgsqlsetup command to set up the database for xCAT.

To use DB2 (Power 775 support only):
Follow the sections on setting up the management node. Be sure to use the xCAT-provided db2sqlsetup command to set up the database for xCAT. At this time, do not do anything to set up the DB2 client on the service nodes; stop after you have run the db2sqlsetup script to set up the management node.

Define the Service Nodes in the Database

This document assumes that you have previously defined your compute nodes in the database. It is also possible at this point that you have generic entries in your database for the nodes you will use as service nodes, as a result of the node discovery process. We are now going to show you how to add all the relevant database data for the service nodes (SNs) so that the SNs can be installed and managed from the management node (MN).
In addition, you will be adding the information to the database that tells xCAT which service nodes (SNs) service which compute nodes (CNs). For this example, we have two service nodes, sn1 and sn2, and we will call our management node mn1. Note: service nodes are, by convention, in a group called service.
Some of the commands in this document will use the group service to update all service nodes. Note: a service node's service node is the management node, so a service node must have a direct connection to the management node. The compute nodes do not have to be directly attached to the management node, only to their service node. This will all have to be defined in your networks table.

Add Service Nodes to the nodelist Table

Define your service nodes (if not defined already); by convention we put them in a group named service. We usually have a group compute for our compute nodes, to distinguish between the two types of nodes. (If you want to use your own group name for service nodes, rather than service, you need to change some defaults in the xCAT database that use the group name service.
For example, in the postscripts table there is by default a group entry for service, with the appropriate postscripts to run when installing a service node. Also, the default kickstart/autoyast template, pkglist, etc. that will be used have file names based on the profile name service.)

chdef -t group service arch=x86_64 os=rhels6.3 nodetype=osi profile=service netboot=xnba installnic=mac primarynic=mac provmethod=rhels6.3-x86_64-install-service

Add Service Nodes to the servicenode Table

An entry must be created in the servicenode table for each service node (or for the service group). This table describes all the services you would like xCAT to set up on the service nodes. (Even if you don't want xCAT to set up any services - unlikely - you must define the service nodes in the servicenode table with at least one attribute set (you can set it to 0); otherwise they will not be recognized as service nodes.) When the xcatd daemon is started or restarted on a service node, it will make sure all of the requested services are configured and started.
(To temporarily avoid this when restarting xcatd, use 'service xcatd reload' instead.)

To set up the minimum recommended services on the service nodes:

chdef -t group -o service setupnfs=1 setupdhcp=1 setuptftp=1 setupnameserver=1 setupconserver=1

See the setup* attributes in the servicenode table for the services available. (The HTTP server is also started when setupnfs is set.) If you are using the setupntp postscript on the compute nodes, you should also set setupntp=1. For clusters with subnetted management networks (i.e. the network between the SN and its compute nodes is separate from the network between the MN and the SNs), you might also want to set setupipforward=1.
Add Service Node Postscripts

By default, xCAT defines the service node group to have the 'servicenode' postscript run when the SNs are installed or diskless booted. This postscript sets up the xcatd credentials and installs the xCAT software on the service nodes. If you have your own postscript that you want run on the SNs during their deployment, put it in /install/postscripts on the MN and add it to the service node postscripts or postbootscripts:

chdef -t group -p service postscripts=

Notes:

For Red Hat type distros, the postscripts will be run before the reboot of a kickstart install, and the postbootscripts will be run after the reboot. Make sure that the servicenode postscript is set to run before the otherpkgs postscript, or you will see errors during the service node deployment. The -p flag automatically adds the specified postscript at the end of the comma-separated list of postscripts (or postbootscripts).

If you are running additional software on the service nodes that needs ODBC to access the database (e.g.
LoadLeveler or TEAL), use this command to add the xCAT-supplied postbootscript called 'odbcsetup':

chdef -t group -p service postbootscripts=odbcsetup

If using DB2, follow the instructions in this document for setting up the postscripts table to enable DB2 during service node installs.

Assigning Nodes to their Service Nodes

The node attributes servicenode and xcatmaster define which SN services a particular node. The servicenode attribute of a compute node defines which SN the MN should send a command to (e.g.
xdsh), and should be set to the hostname or IP address of the service node as the management node contacts it. The xcatmaster attribute of a compute node defines which SN the compute node should boot from, and should be set to the hostname or IP address of the service node as the compute node contacts it. Unless you are using service node pools, you must set the xcatmaster attribute for a node when using service nodes, even if it contains the same value as the node's servicenode attribute.

Host name resolution must have been set up in advance, with /etc/hosts, DNS, or DHCP, to ensure that the names put in this table can be resolved on the management node, the service nodes, and the compute nodes. It is easiest to have a node group of the compute nodes for each service node. For example, if all the nodes in node group compute1 are serviced by sn1 and all the nodes in node group compute2 are serviced by sn2:

chdef -t group compute1 servicenode=sn1 xcatmaster=sn1-c
chdef -t group compute2 servicenode=sn2 xcatmaster=sn2-c

Note: in this example, sn1 and sn2 are the node names of the service nodes (and therefore the hostnames associated with the NICs that the MN uses to talk to them).
The hostnames sn1-c and sn2-c are associated with the SN NICs that communicate with their compute nodes.

Note: the attribute tftpserver defaults to the value of xcatmaster if not set, but in some releases of xCAT it has not defaulted correctly, so it is safer to set it explicitly to the same value as xcatmaster.

The conserver and monserver attributes allow you to specify which service node should run the conserver (console) and monserver (monitoring) daemons for the nodes in the group specified in the command. In this example, each node's primary SN also acts as its conserver and monserver (the most typical setup):

chdef -t group compute1 conserver=sn1 monserver=sn1,sn1-c
chdef -t group compute2 conserver=sn2 monserver=sn2,sn2-c

Service Node Pools

Service node pools are multiple service nodes that service the same set of compute nodes. Having multiple service nodes provides backup service node(s) for a compute node when its primary service node is unavailable, and can also be used for work-load balancing across the service nodes. But note that the selection of which SN will service which compute node is made at compute node boot time.
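The boot-time pool selection described above can be modeled with a small sketch (illustrative only; the function names are invented, and real selection happens in DHCP, not in Python): the SN whose DHCP reply arrives first wins, and the node stays bound to it until the next reboot or an explicit move.

```python
# Hypothetical model of service node pool selection: a compute node
# broadcasts a DHCP request, and whichever SN in the pool answers
# first becomes that node's server for this boot.
def pick_service_node(responses):
    """responses: list of (sn_name, reply_delay_seconds) tuples for
    one DHCP broadcast. The SN with the fastest reply 'wins'."""
    if not responses:
        return None  # no SN answered; the node cannot boot
    return min(responses, key=lambda r: r[1])[0]

# At this boot, sn2 happens to answer first, so the node uses sn2
# for tftp/NFS/xcatmaster duties until it is rebooted.
boot_sn = pick_service_node([("sn1", 0.031), ("sn2", 0.012)])
```

This is why the pool must share a broadcast domain with its compute nodes: the selection is nothing more than "first responder wins".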
After that, the selection of the SN for this compute node is fixed until the compute node is rebooted or the compute node is explicitly moved to another SN using the snmove command.

To use service node pools, you need to architect your network such that all of the compute nodes and service nodes in a particular pool are on the same flat network. If you don't want the management node to respond to or manage some of the compute nodes, it should not be on that same flat network. The site.dhcpinterfaces attribute should be set such that the SNs' DHCP daemon only listens on the NIC that faces the compute nodes, not the NIC that faces the MN. This avoids some timing issues when the SNs are being deployed (so that they don't respond to each other before they are completely ready). You also need to make sure the networks table accurately reflects the physical network structure.

To define a list of service nodes that support a set of compute nodes, set the servicenode attribute to a comma-delimited list of the service nodes. When running an xCAT command like xdsh or updatenode for compute nodes, the list will be processed left to right, picking the first service node on the list to run the command.
If that service node is not available, the next service node on the list will be chosen, and so on until the command succeeds. Errors will be logged. If no service node on the list can process the command, the error will be returned.
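The left-to-right failover just described can be sketched as follows (an illustrative model with invented names, not xCAT code):

```python
# Hypothetical model of how the MN works through a comma-delimited
# servicenode list (e.g. "sn1,sn2") when dispatching a command such
# as xdsh: try each SN left to right, stop at the first success.
def dispatch(servicenode_attr, run_on):
    """run_on(sn) -> True on success, False if that SN is unavailable."""
    failed = []
    for sn in servicenode_attr.split(","):
        if run_on(sn):
            return sn          # first working SN handles the command
        failed.append(sn)      # in xCAT this failure would be logged
    raise RuntimeError("no service node could process the command: %s"
                       % ",".join(failed))

# sn1 is down, so the command falls through to sn2:
chosen = dispatch("sn1,sn2", lambda sn: sn != "sn1")
```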
You can provide some load-balancing by assigning your service nodes as we do below.

When using service node pools, the intent is to have the service node that responds first to the compute node's DHCP request during boot also be the xcatmaster, the tftpserver, and the NFS/HTTP server for that node. Therefore, the xcatmaster and nfsserver attributes for the nodes should not be set. When nodeset is run for the compute nodes, the service node interface on the network to the compute nodes should be defined and active, so that nodeset will default those attribute values to the 'node ip facing' interface on that service node. For example:

rsync -auv --exclude 'autoinst' /install sn1:/

Note: if your service nodes are stateless and site.sharedtftp=0, then when you reboot any service node while using service node pools, any data written to the local /tftpboot directory of that SN is lost.
You will need to run nodeset again for all of the compute nodes serviced by that SN. For additional information about service node pool related settings in the networks table, see.

Conserver and Monserver and Pools

Conserver and monserver are not yet supported with service node pools. You must explicitly assign these functions to a specific service node, using the nodehm.conserver and noderes.monserver attributes as shown above.
Setup Site Table

If you are not using the NFS-based statelite method of booting your compute nodes, set the installloc attribute to '/install'. This instructs the service nodes to mount /install from the management node. (If you don't do this, you have to manually sync /install between the management node and the service nodes.)

chdef -t site clustersite installloc='/install'

For IPMI-controlled nodes, if you want the out-of-band IPMI operations to be done directly from the management node (instead of being sent to the appropriate service node), set site.ipmidispatch=n.

If you want to throttle the rate at which nodes are booted up, you can set the following site attributes:

syspowerinterval
syspowermaxnodes
powerinterval (system p only)

See the for details.
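As a hedged illustration of the throttling attributes above, the following sets example values (the attribute names come from the xCAT site table; the numeric values are invented for illustration, so check the site table documentation for values appropriate to your cluster):

```shell
# Hypothetical example: power on at most 32 nodes per interval,
# with a 2-second interval between batches (values are made up).
chdef -t site clustersite syspowerinterval=2 syspowermaxnodes=32
```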
Setup networks Table

All networks in the cluster must be defined in the networks table. When xCAT was installed, it ran makenetworks, which created an entry in this table for each of the networks the management node is on.
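Because so much depends on these definitions matching the physical network, it can help to sanity-check that each node's IP actually falls inside one of the defined subnets. This illustrative sketch (not an xCAT tool) uses the 10.5.1.0/255.255.255.224 network defined in this section as an example:

```python
# Illustrative sanity check: does a node's IP fall inside a subnet
# defined in the networks table?  Uses Python's stdlib ipaddress.
import ipaddress

# The example compute network from this section (a /27).
net1 = ipaddress.ip_network("10.5.1.0/255.255.255.224")

def in_network(ip, net):
    """True if the address belongs to the given subnet."""
    return ipaddress.ip_address(ip) in net

# 10.5.1.20 is inside the /27 (hosts .0-.31); 10.5.1.40 is not.
```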
You need to add entries for each network the service nodes use to communicate with their compute nodes. For example:

mkdef -t network net1 net=10.5.1.0 mask=255.255.255.224 gateway=

The ipforward attribute should be enabled on all the xcatmaster nodes that will be acting as default gateways. You can set ipforward to 1 in the servicenode table, or add the line 'net.ipv4.ip_forward = 1' to /etc/sysctl.conf and then run 'sysctl -p /etc/sysctl.conf' manually to enable IP forwarding.

Note: if using service node pools, the networks table dhcpserver attribute can be set to any single service node in your pool. The networks tftpserver and nameserver attributes should be left blank.

Verify the Tables

To verify that the tables are set correctly, run lsdef on the service nodes, compute1, and compute2:
lsdef service,compute1,compute2

Add additional adapters configuration script (optional)

It is possible to have additional adapter interfaces automatically configured when the nodes are booted. xCAT provides sample configuration scripts for Ethernet, IB, and HFI adapters. These scripts can be used as-is, or they can be modified to suit your particular environment. The Ethernet sample is /install/postscripts/configeth. When you have the configuration script that you want, you can add it to the 'postscripts' attribute as mentioned above.
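For example (a sketch assuming a compute node group named compute, as used earlier in this document), the sample Ethernet script could be appended to the group's postscript list:

```shell
# Hypothetical example: append the sample configeth script to the
# compute group's postscripts; -p appends rather than replaces.
chdef -t group -p compute postscripts=configeth
```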
Make sure your script is in the /install/postscripts directory and that it is executable.

Note: for system p servers, if you plan to have your service node perform the hardware control functions for its compute nodes, the SN Ethernet network adapters connected to the HW service VLAN must be configured. For Power 775 clusters specifically, see for more information.

Configuring Secondary Adapters

To configure secondary adapters, see.

Gather MAC information for the install adapters

Get the MAC information for the service nodes first. After finishing the OS provisioning for a service node, and creating the connections between the hdwrsvr on the service node and the non-SN CECs, you can get the MAC information for its compute nodes.

Using getmacs

Use the xCAT getmacs command to gather adapter information from the nodes. This command will return the MAC information for each Ethernet or HFI adapter available on the target node.
The command can be used either to display the results or to write the information directly to the database. If there are multiple adapters, the first one will be written to the database and used as the install adapter for that node. The command can also be used to do a ping test on the adapter interfaces to determine which ones could be used to perform the network boot. In this case, the first adapter that can successfully ping the server will be written to the database.

getmacs working with P775

The P775 CEC supports two networks with the getmacs command. The default network on the P775 is the HFI, which is used to communicate among all the P775 octants. There is also support for an Ethernet network, which is used to communicate between the xCAT EMS and other P775 server octants. It is important that all the networks are properly defined in the xCAT networks table. Before running getmacs, you must first run the makeconservercf command.
You need to run makeconservercf any time you add new nodes to the cluster.

Type    Location Code                  MAC Address   Full Path Name             Ping Result   Device Type
ent     U9125.F2A.024C362-V6-C2-T1     fef9dfb7c602  /vdevice/l-lan@30000002    successful    virtual
ent     U9125.F2A.024C362-V6-C3-T1     fef9dfb7c603  /vdevice/l-lan@30000003    unsuccessful  virtual

From this result you can see that fef9dfb7c602 should be used for this service node's MAC address.

To retrieve the HFI MAC address used for a P775 xCAT compute node, your getmacs command does not require the -D flag, since the first HFI adapter recognized will be used.

# Type    Location Code   MAC Address   Full Path Name                       Ping Result
hfi-ent   U78A9.0-P1      04            /hfi-iohub@00002/hfi-ethernet@10     unsuccessful  physical
hfi-ent   U78A9.0-P1      04            /hfi-iohub@00002/hfi-ethernet@11     unsuccessful  physical

For more information on using the getmacs command, see the man page. If you did not have the getmacs command write the MAC addresses directly to the database, you can do it manually using the chdef command:

chdef -t node node01 mac=fef9dfb7c602

Configure DHCP

Add the relevant networks into the DHCP configuration; refer to:

Add the defined nodes into the DHCP configuration; refer to:

Set Up the Service Nodes for Stateful (Diskful) Installation

Any cluster using statelite compute nodes (including P775 clusters) must use stateful (diskful) service nodes.

Note: if you are using diskless service nodes, go to

First, go to the site and download the level of the xCAT tarball you desire.
Then go to and get the latest xCAT dependency tarball. Note: all xCAT service nodes must be at exactly the same xCAT version as the xCAT management node. Copy the files to the management node (MN) and untar them in the appropriate sub-directory of /install/post/otherpkgs. Note: for the appropriate directory below, check the otherpkgdir attribute (e.g. otherpkgdir=/install/post/otherpkgs/rhels6.4/ppc64) of the osimage defined for the service node, for example for the osimage rhels6.4-ppc64-install-service. On Ubuntu, for example:

chdef -t osimage -o ubuntu14.04.1-ppc64el-install-service -p otherpkgdir='trusty main,trusty-updates main,trusty universe, /install/post/otherpkgs/ubuntu14.04.1/ppc64el/xcat-core/ trusty main, /install/post/otherpkgs/ubuntu14.04.1/ppc64el/xcat-dep/ trusty main'

Note: you will be installing the xCAT service node meta-package xCATsn on the service node, not the xCAT management node meta-package. Do not install both.

For Power 775 clusters, you should add the DFM and hdwrsvr packages into the list of packages to be installed on the SN. Refer to:

If you want to set up disk mirroring (RAID1) on the service nodes, see for more details.

Update the powerpc-utils-1.2.2-18.el6.ppc64.rpm in the rhels6 RPM repository (rhels6 only)

This section can be removed once powerpc-utils-1.2.2-18.el6.ppc64.rpm is built into the base rhels6 ISO.
The direct rpm download link is: ftp://linuxpatch.ncsa.uiuc.edu/PERCS/powerpc-utils-1.2.2-18.el6.ppc64.rpm. The update steps are as follows: put the new rpm in the base OS packages.

rnetboot service

Initialize network boot to install Service Nodes in Power 775

Starting with xCAT 2.6, in a Power 775 cluster there are two ways to initialize a network boot. One way is to use the rbootseq command to set the boot device to the network adapter, after which you can issue the rpower command to power on or reset the node so that it boots from the network. The other way is to use rnetboot on the node directly. Comparing the two, the rbootseq/rpower approach does not require console support and does not operate through the console, so it performs better.
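A sketch of the rbootseq/rpower flow just described (the noderange and the hfi boot-device keyword here are assumptions based on typical xCAT P775 usage, not taken verbatim from this document, so verify against the rbootseq man page):

```shell
# Hypothetical sketch: point the boot sequence at the HFI network
# adapter, then reset the nodes so they network-boot from their SN.
rbootseq service hfi
rpower service reset
```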
It is recommended to use rbootseq/rpower to set the boot device to the network adapter and initialize the network boot in a Power 775 cluster. Example of using rbootseq and rpower:

wcons service   # make sure DISPLAY is set to your X server/VNC, or use rcons
tail -f /var/log/messages

Note: we have experienced a problem when installing RHEL6 diskful service nodes with SAS disks: the service node cannot reboot from the SAS disk after the RHEL6 operating system has been installed. We are waiting for a build with fixes from the RHEL6 team; if you hit this problem, you need to manually select the SAS disk as the first boot device and boot from it.

Update Service Node Diskfull Image

If you need to update the service nodes later on with a new version of xCAT and its dependencies, obtain the new xCAT and xCAT dependency rpms (follow the same steps that were followed in). Update the service nodes with the new xCAT rpms:

updatenode -P ospkgs

Setup the Service Node for Stateless Deployment

Note: stateless service nodes are not supported in an Ubuntu hierarchical cluster.
For Ubuntu, please skip this section. If you want, your service nodes can be stateless (diskless). The service node image must contain not only the OS, but also the xCAT software and its dependencies. In addition, a number of files are added to the service node to support PostgreSQL or MySQL database access from the service node to the management node, and ssh access to the nodes that the service node services. (DB2 is not supported on diskless service nodes.) The following sections explain how to accomplish this.

Build the Service Node Stateless Image

This section assumes you can build the stateless image on the management node because the service nodes are the same OS and architecture as the management node.
If this is not the case, you need to build the image on a machine that matches the service node's OS/architecture.

Create an osimage definition. When you run copycds, xCAT creates service node osimage definitions for that distribution. For a stateless service node, use the *-netboot-service definition:

lsdef -t osimage -o rhels6.3-ppc64-netboot-service -i otherpkgdir
Object name: rhels6.3-ppc64-netboot-service
    otherpkgdir=/install/post/otherpkgs/rhels6.3/ppc64

cd /install/post/otherpkgs/rhels6.3/ppc64
mkdir xcat
cd xcat
cp -Rp /xcat-core .
cp -Rp /xcat-dep .

If you installed your management node directly from the Linux online repository, you will need to download the xcat-core and xcat-dep tarballs. First, go to the page and download the level of xCAT tarball you desire. Then go to the page and download the latest xCAT dependency tarball.
Place these into your otherpkgdir directory.

rnetboot service

To diskless boot the service nodes in Power 775

As described in the stateful installation section above, there are two ways to initialize a network boot on Power 775: rbootseq followed by rpower, or rnetboot directly. The rbootseq/rpower approach is recommended because it does not require console support and performs better.
genimage rhels6.3-ppc64-netboot-service
packimage rhels6.3-ppc64-netboot-service
nodeset service osimage=rhels6.3-ppc64-netboot-service
rnetboot service

tail -f /var/log/messages

Test Service Node installation

ssh to the service nodes.

chdef -t site clustersite installloc='/install'

Make compute node syncfiles available on the servicenodes

If you are not using the NFS-based statelite method of booting your compute nodes, and you plan to use the syncfiles postscript to update files on the nodes during install, you must ensure that those files are synced to the service nodes before the install of the compute nodes.
To do this, after your nodes are defined, you will need to run the following whenever the files in your synclist change on the management node:

updatenode -f

At this point you can return to the documentation for your cluster environment to define and deploy your compute nodes. For Power 775 clusters, after the service node has been installed, you should create the connections between the hdwrsvr and the non-SN CECs, and then run other hardware control commands.

Appendix A: Setup backup Service Nodes

For reliability, availability, and serviceability purposes, you may wish to designate backup service nodes in your hierarchical cluster.
The backup service node will be another active service node that is set up to easily take over from the original service node if a problem occurs. This is not an automatic failover feature; you will have to initiate the switch from the primary service node to the backup manually.
The xCAT support will handle most of the setup and transfer of the nodes to the new service node.

mkdef -t group sn1group members=node01-20

Note: normally, backup service nodes are also the primary SNs for other compute nodes.
So, for example, if you have 2 SNs, configure half of the CNs to use the first SN as their primary SN, and the other half of the CNs to use the second SN as their primary SN. Then each SN is configured to be the backup SN for the other half of the CNs.

When you run, it will configure dhcp and tftp on both the primary and backup SNs, assuming they both have network access to the CNs. This makes it possible to do a quick SN takeover, without having to wait for replication, when you need to switch.

xdcp Behaviour with backup servicenodes

The xdcp command in a hierarchical environment must first copy (scp) the files to the service nodes so that they are available to scp to the node from the service node that is its master. The files are placed in the /var/xcat/syncfiles directory by default, or in whatever is set in the site table SNsyncfiledir attribute.
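As an illustrative model (invented function names, not xCAT code) of where those staged copies land when a node has one or more service nodes assigned:

```python
# Hypothetical model of xdcp staging in a hierarchy: a file headed
# for a compute node is first placed under site.SNsyncfiledir
# (default /var/xcat/syncfiles) on every SN assigned to the node.
def staging_paths(servicenode_attr, src_path,
                  snsyncfiledir="/var/xcat/syncfiles"):
    """Return (service_node, staged_path) pairs for one file."""
    return [(sn, snsyncfiledir + src_path)
            for sn in servicenode_attr.split(",")]

# A node served by both service1 and rhsn gets the file staged on both:
paths = staging_paths("service1,rhsn", "/tmp/file1")
```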
If the node has multiple service nodes assigned, then xdcp will copy the file to each of the service nodes assigned to the node. For example, here the files will be copied (scp) to both service1 and rhsn:

lsdef cn4 | grep servicenode

xdcp cn4 /tmp/lissa/file1 /tmp/file1
service1: Permission denied (publickey,password,keyboard-interactive).
service1: Permission denied (publickey,password,keyboard-interactive).
service1: lost connection
The following servicenodes: service1, have errors and cannot be updated

Until the error is fixed, xdcp will not work to nodes serviced by these service nodes.

xdsh cn4 ls /tmp/file1
cn4: /tmp/file1

Synchronizing statelite persistent files

If you are using xCAT's 'statelite' support, you may want to replicate your statelite files to the backup (or new) service node. This would be the case if you are using the service node as the server for the statelite persistent directory. In this case, you need to copy your statelite files and directories to the backup service node and keep them synchronized over time.
An easy and efficient way to do this is to use the rsync command from the primary SN to the backup SN. For example, to copy and/or update the /nodedata directory on the backup service node sn2, you could run the following command on sn1:

rsync -auv /nodedata sn2:/

Note: the xCAT command has a new option, -l, to synchronize statelite files from the primary service node to the backup service node, but it is currently only implemented on AIX. See for details on using the xCAT statelite support.

Monitoring the service nodes

In most cluster environments, it is very important to monitor the state of the service nodes. If an SN fails for some reason, you should switch its nodes to the backup service node as soon as possible. See for details on monitoring your service nodes.

Switch to the backup SN

When an SN fails, or you want to bring it down for maintenance, use this procedure to move its CNs over to the backup SN.

Move the nodes to the new service nodes

Use the xCAT snmove command to make the database updates necessary to move a set of nodes from one service node to another, and to make configuration modifications to the nodes. For example, if you want to switch all the compute nodes that use service node sn1 to the backup SN (sn2), run:

snmove -s sn1

Modified database attributes

The snmove command will check and set several node attribute values.

servicenode: set to either the second server name in the servicenode attribute list or the value provided on the command line.
xcatmaster: set to either the value provided on the command line, or automatically determined from the servicenode attribute.

nfsserver: if the value is set to the source service node, it will be set to the destination service node.
tftpserver:: If this is set to the source service node, it will be reset to the destination service node.

monserver:: If this is set to the source service node, it will be reset to the destination servicenode and xcatmaster values.

conserver:: If this is set to the source service node, it will be reset to the destination service node, and makeconservercf will be run.

Run postscripts on the nodes

If the CNs are up at the time the snmove command is run, snmove will run postscripts on the CNs to reconfigure them for the new SN. The syslog postscript is always run. The mkresolvconf and setupntp scripts will be run if they were included in the nodes' postscript list. You can also specify an additional list of postscripts to run.

Modify system configuration on the nodes

If the CNs are up, the snmove command will also perform some configuration on the nodes, such as setting the default gateway and modifying some configuration files used by xCAT.

Statelite migration

If you are using the xCAT statelite support, you may need to modify the statelite and litetree tables.
This would be necessary if any of the entries in those tables include the name of the primary service node as the server for the file or directory. In that case you would have to change those entries to the name of the backup service node. A better solution, however, is to use the variable $noderes.xcatmaster in the statelite and litetree tables.
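As an illustration, an entry that uses the variable instead of a hard-coded service node name might look like the fragment below. (The column layout shown is an assumption about the statelite table schema and may differ by xCAT release; verify with tabdump statelite on your system. The node group name and directory are placeholders.)

```
#node,image,statemnt,mntopts,comments,disable
"compute",,"$noderes.xcatmaster:/nodedata",,,
```

With this form, when snmove updates a node's xcatmaster attribute, the statelite mount automatically follows the new service node without further table edits.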
See the statelite documentation for details.

Boot the statelite nodes

For statelite nodes that do not use an external NFS server, if the original service node is down, the CNs it manages will be down too.
You must run the nodeset command for those nodes and then boot them after running snmove. For stateless nodes (and in some cases RAMdisk statelite nodes), the nodes will remain up even if the original service node is down. However, make sure to run the nodeset command in case you need to reboot the nodes later.

Note: When moving p775 nodes, use the rbootseq and rpower commands to boot the nodes. If you do not use the rbootseq command, the nodes will still try to boot from the old SN.

Switching back

The process for switching nodes back depends on what must be done to recover the original service node. If the SN had to be reinstalled, you need to set it up as an SN again and make sure the CN images are replicated to it. Once you've done this, or if the SN's configuration was not lost, follow these steps to move the CNs back to their original SN:
1. Use snmove:

   snmove sn1group -d sn1

2. If these are statelite CNs:
   - rsync the persistent files back to the original SN
   - run nodeset for these CNs
   - boot the CNs

Appendix B: Diagnostics

- root ssh keys not set up: If you are prompted for a password when you ssh to the service node, check whether /root/.ssh contains an authorized_keys file. If the directory does not exist or there are no keys, run xdsh service -K on the MN to exchange the ssh keys for root. You will be prompted for the root password, which should be the password set for key=system in the passwd table.

- xCAT rpms not on the SN: On the SN, run rpm -qa | grep xCAT and make sure the appropriate xCAT rpms are installed on the service node.
See the list of xCAT rpms in Set Up the Service Nodes for diskfull Installation. If rpms are missing, check your install setup as outlined in Build the Service Node Stateless Image (for diskless) or Set Up the Service Nodes for diskfull Installation (for diskfull installs).

- otherpkgs (including xCAT rpms) installation failed on the SN: The OS repository is not created on the SN. When the yum command processes dependencies, the rpm packages (including expect, nmap, httpd, etc.) required by xCATsn cannot be found.
In this case, check whether the /install/postscripts/repos/<osver>/<arch>/ directory exists on the MN. If it does not, re-run the copycds command, which will create files under the /install/postscripts/repos/<osver>/<arch>/ directory on the MN. Then re-install the SN, and this issue should be gone.

- Error finding the database/starting xcatd: If, when you run tabdump site on the service node, you get 'Connection failure: IO::Socket::SSL: connect: Connection refused at /opt/xcat/lib/perl/xCAT/Client.pm', restart the xcatd daemon and see if the command then succeeds:

service xcatd restart
If it fails with the same error, check whether the /etc/xcat/cfgloc file exists. It should exist and be identical to /etc/xcat/cfgloc on the MN. If it is not there, copy it from the MN to the SN.
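As a self-contained sketch of that check, the script below uses temporary stand-in files for the MN and SN copies of cfgloc; the connection-string content is made up for illustration (a real cfgloc holds your own database connection details).

```shell
# Stand-in files play the role of /etc/xcat/cfgloc on the MN and the SN.
# The content below is a fabricated example of a database connection string.
mn_cfgloc=$(mktemp)
sn_cfgloc=$(mktemp)
echo 'mysql:dbname=xcatdb;host=mn1|xcatadmin|secret' > "$mn_cfgloc"

# The step the text describes: copy the MN file to the SN.
cp "$mn_cfgloc" "$sn_cfgloc"

# Verify the two copies match before restarting xcatd.
if diff -q "$mn_cfgloc" "$sn_cfgloc" >/dev/null; then
  echo "cfgloc matches"
else
  echo "cfgloc differs"
fi
```

On a real cluster the comparison would be done across the network, for example by first copying the SN's file to the MN with scp and then running diff locally.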
Then run service xcatd restart again. A missing cfgloc indicates the service node postscripts did not complete successfully. Check that your postscripts table was set up correctly in Add Service Nodes postscripts to the postscripts table.

- Error accessing database/starting xcatd (credential failure): If you run tabdump site on the service node and you get 'Connection failure: IO::Socket::SSL: SSL connect attempt failed because of handshake problems error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca at /opt/xcat/lib/perl/xCAT/Client.pm', check /etc/xcat/cert. The directory should contain the files ca.pem and server-cred.pem. These were supposed to be transferred from the MN /etc/xcat/cert directory during the install.
Also check the /etc/xcat/ca directory. This directory should contain most of the files from the /etc/xcat/ca directory on the MN. You can manually copy them from the MN to the SN, recursively. Missing certificates indicate the service node postscripts did not complete successfully. Check that your postscripts table was set up correctly in Add Service Nodes postscripts to the postscripts table.
Then run service xcatd restart and try tabdump site again.

- Missing ssh hostkeys: Check whether /etc/xcat/hostkeys on the SN has the same files as /etc/xcat/hostkeys on the MN. These are the ssh keys that will be installed on the compute nodes, so root can ssh between compute nodes without password prompting. If they are not there, copy them from the MN to the SN. Again, these should have been set up by the service node postscripts.

Appendix C: Migrating a Management Node to a Service Node

If you want to convert an existing Management Node to a Service Node, you need to work with the xCAT team. For now, the recommendation is to back up your database, set up your new Management Server, and restore your database into it. Take the old Management Node, and remove xCAT, all xCAT directories, and your database.
See the relevant documentation, and then follow the process for setting up an SN as if it is a new node.

Appendix D: Set Up Hierarchical Conserver

To allow you to open rcons from the Management Node while the conserver daemon runs on the Service Nodes, do the following:

- Set nodehm.conserver to the service node (using the IP that faces the management node).
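For example, the conserver attribute could be set with chdef and the conserver configuration then regenerated with makeconservercf (the noderange cn1-cn100 and the IP address are placeholders for your own values):

```
chdef -t node -o cn1-cn100 conserver=10.1.0.10
makeconservercf
```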