Thursday, October 18, 2012

JBoss Dynamic Migration of Distributed HA Singletons among Nodes

Document  Version 1.0
Copyright © 2012-2013 beijing.beijing.012@gmail.com


Keywords:
JBoss HA  Service, HA Singleton, JBoss Cluster, JBoss load balancing, load migration, load distribution

A JBoss HA singleton is a cluster-wide singleton that runs on only one node of a JBoss cluster. When the node on which the singleton runs fails, another node (the master node) automatically starts the singleton service. When there are several different HA singleton services, ALL of them are activated on the master node. When the master node fails, another node takes over the role of master node and starts ALL the singleton services. This is the default HA singleton behavior of JBoss.

However, this default behavior can cause problems.
If some or all of the singleton services are resource intensive, running them all simultaneously on one node will hurt performance while the other nodes stay relatively idle.

We will now introduce a solution that dynamically distributes HA singleton services to preferred nodes on cluster startup. When servers fail, the services are migrated and redistributed to the remaining nodes, based on the number of nodes still alive and on the preferred nodes.

Using an example, we will show how 3 HA singleton services can be dynamically migrated among the nodes of a 3-node JBoss cluster.

Rules for distributing services on nodes are: 

if alive node count is 3 -> TestHASvcA on node[0]
                            TestHASvcB on node[1]
                            TestHASvcC on node[2]


if alive node count is 2 -> TestHASvcA on node[0]
                            TestHASvcB on node[0]
                            TestHASvcC on node[1]


if alive node count is 1 -> TestHASvcA on node[0]
                            TestHASvcB on node[0]
                            TestHASvcC on node[0]
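
As a side observation (not something the rules themselves state), the table above happens to match the integer formula serviceIndex * aliveNodeCount / 3. The following minimal sketch prints the table from that formula. The class name PlacementRules and the method nodeIndexFor are hypothetical helpers for illustration and are not part of JBoss.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not a JBoss class: maps a service to a position in the alive-node list. */
final class PlacementRules {
    // Service order as used in the distribution rules.
    static final List<String> SERVICES =
            Arrays.asList("TestHASvcA", "TestHASvcB", "TestHASvcC");

    /** Returns the index into the sorted alive-node list for the given service. */
    static int nodeIndexFor(String service, int aliveNodeCount) {
        int serviceIndex = SERVICES.indexOf(service);
        if (serviceIndex < 0 || aliveNodeCount < 1) {
            throw new IllegalArgumentException("unknown service or empty cluster");
        }
        // Integer division reproduces the rule table for 1, 2 and 3 alive nodes.
        return serviceIndex * aliveNodeCount / SERVICES.size();
    }

    public static void main(String[] args) {
        for (int alive = 3; alive >= 1; alive--) {
            for (String service : SERVICES) {
                System.out.println("alive=" + alive + " " + service
                        + " -> node[" + nodeIndexFor(service, alive) + "]");
            }
        }
    }
}
```

The rules in this post are only defined for 1 to 3 alive nodes; what the formula yields for larger clusters is outside their scope.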

Explanation of the node sequence:
In a JBoss cluster, all nodes share the same sorted list of all alive nodes. The list is sorted by the time at which each node joined the cluster. For a 3-node cluster with node1, node2, and node3, if the nodes are started one after another, the ordered list will be:
                                  node1 --> [0]
                                  node2 --> [1]
                                  node3 --> [2]

In case node1 fails, node2 and node3 will update the list to:
                                   node2 --> [0]
                                   node3 --> [1]

When node1 joins the cluster again (restarted), the shared list of the 3 nodes is changed to:
                                  node2 --> [0]
                                  node3 --> [1]
                                  node1 --> [2]
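
The join-order behavior described above can be modeled with a plain list: a joining node is appended at the end, and a failed node is removed. This is only a simplified stand-in to illustrate the ordering; ClusterViewSketch is a hypothetical class and does not reflect how JBoss maintains its view internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the shared alive-node list; not a JBoss class. */
final class ClusterViewSketch {
    private final List<String> aliveNodes = new ArrayList<String>();

    /** A joining node is appended, so the newest member is always last. */
    void join(String node) {
        if (!aliveNodes.contains(node)) {
            aliveNodes.add(node);
        }
    }

    /** A failed node is removed; the nodes behind it move up one position. */
    void leave(String node) {
        aliveNodes.remove(node);
    }

    /** Returns a copy of the current list, oldest member first. */
    List<String> view() {
        return new ArrayList<String>(aliveNodes);
    }

    public static void main(String[] args) {
        ClusterViewSketch cluster = new ClusterViewSketch();
        cluster.join("node1");
        cluster.join("node2");
        cluster.join("node3");
        System.out.println(cluster.view()); // [node1, node2, node3]
        cluster.leave("node1");
        System.out.println(cluster.view()); // [node2, node3]
        cluster.join("node1");
        System.out.println(cluster.view()); // [node2, node3, node1]
    }
}
```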


Expected HA service migration behavior of our example :


step1. start node1
TestHASvcA on node1
TestHASvcB on node1
TestHASvcC on node1
step2. start node2
TestHASvcA on node1
TestHASvcB on node1
TestHASvcC migrates to node2
step3. start node3
TestHASvcA still on node1
TestHASvcB migrates to node2
TestHASvcC migrates to node3
step4. node1 fails
TestHASvcA migrates to node2
TestHASvcB still on node2
TestHASvcC still on node3
step5. node2 fails
TestHASvcA migrates to node3
TestHASvcB migrates to node3
TestHASvcC still on node3
step6. node1 recovers
TestHASvcA still on node3
TestHASvcB still on node3
TestHASvcC migrates to node1
step7. node2 recovers
TestHASvcA still on node3
TestHASvcB migrates to node1
TestHASvcC migrates to node2
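
The seven steps can be replayed outside JBoss to double-check the expected placements. The sketch below is a plain-Java simulation under the assumptions stated in this post (join-ordered node list, fixed rule table); MigrationReplay and the "+node"/"-node" event notation are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical simulation of the seven steps; not JBoss code. */
final class MigrationReplay {
    static final String[] SERVICES = {"TestHASvcA", "TestHASvcB", "TestHASvcC"};
    // RULES[aliveCount - 1][serviceIndex] = position in the alive-node list.
    static final int[][] RULES = {{0, 0, 0}, {0, 0, 1}, {0, 1, 2}};

    /** Computes where each service runs for the given alive-node list (1 to 3 nodes). */
    static Map<String, String> place(List<String> alive) {
        Map<String, String> placement = new LinkedHashMap<String, String>();
        int[] row = RULES[alive.size() - 1];
        for (int i = 0; i < SERVICES.length; i++) {
            placement.put(SERVICES[i], alive.get(row[i]));
        }
        return placement;
    }

    /** Applies events like "+node1" (join) or "-node1" (fail); returns the placement after each. */
    static List<Map<String, String>> replay(String... events) {
        List<String> alive = new ArrayList<String>();
        List<Map<String, String>> history = new ArrayList<Map<String, String>>();
        for (String event : events) {
            String node = event.substring(1);
            if (event.charAt(0) == '+') {
                alive.add(node);
            } else {
                alive.remove(node);
            }
            history.add(place(alive));
        }
        return history;
    }

    public static void main(String[] args) {
        String[] events = {"+node1", "+node2", "+node3", "-node1", "-node2", "+node1", "+node2"};
        List<Map<String, String>> history = replay(events);
        Map<String, String> previous = new LinkedHashMap<String, String>();
        for (int step = 0; step < events.length; step++) {
            System.out.println("step" + (step + 1) + " (" + events[step] + ")");
            Map<String, String> current = history.get(step);
            for (String service : SERVICES) {
                String node = current.get(service);
                String verb = node.equals(previous.get(service)) ? "still on" : "now on";
                System.out.println("  " + service + " " + verb + " " + node);
            }
            previous = current;
        }
    }
}
```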


Configure and run a 3-nodes JBoss cluster

Please refer to the post "HA Singleton, Cluster Wide Singleton as MBean in JBoss 5, part 1/3" and create a JBoss cluster with 3 nodes:
  • node1
  • node2
  • node3 
Create 3 MBeans TestHASvcA, TestHASvcB, TestHASvcC.
Use your favorite IDE to create a simple Java project "HAServiceDynamicLoadMigration".
Please refer to the post "HA Singleton, Cluster Wide Singleton as MBean in JBoss 5, part 2/3" and create 3 JBoss MBeans TestHASvcA, TestHASvcB and TestHASvcC.


 TestHASvcA.java



package test.ha;
import org.jboss.system.ServiceMBeanSupport;
/**
 * Make sure to use the naming standard for MBeans (If the
 * source class is named Serious, then the interface must be named
 * SeriousMBean).
 * 
 * @author ws
 */
public class TestHASvcA extends ServiceMBeanSupport implements TestHASvcAMBean {

    // Configured as TargetStartMethod in jboss-service.xml.
    public void startHAService() {
        System.out.println("# Starting HA service " + TestHASvcA.class.toString());
    }

    // Configured as TargetStopMethod in jboss-service.xml.
    public void stopHAService() {
        System.out.println("# Stopping HA service " + TestHASvcA.class.toString());
    }
}



TestHASvcAMBean.java



package test.ha;

import org.jboss.system.ServiceMBean;

public interface TestHASvcAMBean extends ServiceMBean {
}




TestHASvcB.java



package test.ha;
import org.jboss.system.ServiceMBeanSupport;
/**
 * Make sure to use the naming standard for MBeans (If the
 * source class is named Serious, then the interface must be named
 * SeriousMBean).
 * 
 * @author ws
 */
public class TestHASvcB extends ServiceMBeanSupport implements TestHASvcBMBean {

    // Configured as TargetStartMethod in jboss-service.xml.
    public void startHAService() {
        System.out.println("#Starting HA service " + TestHASvcB.class.toString());
    }

    // Configured as TargetStopMethod in jboss-service.xml.
    public void stopHAService() {
        System.out.println("#Stopping HA service " + TestHASvcB.class.toString());
    }
}



TestHASvcBMBean.java



package test.ha;

import org.jboss.system.ServiceMBean;

public interface TestHASvcBMBean extends ServiceMBean {
}




TestHASvcC.java



package test.ha;
import org.jboss.system.ServiceMBeanSupport;
/**
 * Make sure to use the naming standard for MBeans (If the
 * source class is named Serious, then the interface must be named
 * SeriousMBean).
 * 
 * @author ws
 */
public class TestHASvcC extends ServiceMBeanSupport implements TestHASvcCMBean {

    // Configured as TargetStartMethod in jboss-service.xml.
    public void startHAService() {
        System.out.println("#Starting HA service " + TestHASvcC.class.toString());
    }

    // Configured as TargetStopMethod in jboss-service.xml.
    public void stopHAService() {
        System.out.println("#Stopping HA service " + TestHASvcC.class.toString());
    }
}



TestHASvcCMBean.java


package test.ha;

import org.jboss.system.ServiceMBean;

public interface TestHASvcCMBean extends ServiceMBean {
}



Customize JBoss HA Singleton Service's Selection Policy 
The key to dynamic HA service distribution and migration is activating each HA singleton service on a selected node at runtime. JBoss uses an "HASingletonElectionPolicy" to decide on which node to activate a given HA singleton service. The default selection policy is "HASingletonElectionPolicySimple", which by default activates the HA service on the "0th" node.
To change the default selection behavior, we need to provide JBoss with a customized HA selection policy, which we call the "TestP" selection policy hereafter:
  • write a "TestPMBean" interface for the "TestP" MBean.
  • write a "TestP" MBean, which implements "HASingletonElectionPolicy" and "TestPMBean".
  • configure JBoss to apply this selection policy to the singleton service MBeans in "jboss-service.xml"


TestPMBean.java



package test.ha;

import org.jboss.system.ServiceMBean;

public interface TestPMBean extends ServiceMBean {
    void setId(String singletonId);
    String getId();
}






TestP.java



package test.ha;

import java.util.List;

import org.jboss.ha.framework.interfaces.ClusterNode;
import org.jboss.ha.framework.interfaces.HASingletonElectionPolicy;
import org.jboss.system.ServiceMBeanSupport;

public class TestP extends ServiceMBeanSupport implements
        HASingletonElectionPolicy, TestPMBean {

    // Identifies which singleton this policy instance serves ("SS_A", "SS_B" or "SS_C").
    // It is set through the "Id" attribute in jboss-service.xml.
    private String id;

    @Override
    public void setId(String singletonId) {
        this.id = singletonId;
    }

    @Override
    public String getId() {
        return this.id;
    }

    // @Override
    public ClusterNode elect(List<ClusterNode> nodes) {
        System.out.println(" ### list all nodes before doing selection ploicy name: ");

        for (ClusterNode tmpNode : nodes) {
            System.out.println(" ## node name: " + tmpNode.getName());
        }

        ClusterNode selectedNode = getNodeSeq(nodes);
        System.out.println(" ## selected node, policy name: " + selectedNode.getName());

        return selectedNode;
    }

    /*
     * This method changes the selection behavior from the default to our
     * customized selection logic. The chosen list position depends on the
     * policy id and on the number of alive nodes.
     */
    private ClusterNode getNodeSeq(List<ClusterNode> activeNodes) {
        System.out.println(" ## getNodeSeq ,  policy name " + id);
        int nodeCount = activeNodes.size();

        // 3 alive nodes: A->0, B->1, C->2
        if (nodeCount == 3) {
            // "SS_A".equals(id) avoids a NullPointerException if the Id attribute was never set.
            if ("SS_A".equals(this.id)) {
                return activeNodes.get(0);
            }
            if ("SS_B".equals(this.id)) {
                return activeNodes.get(1);
            }
            if ("SS_C".equals(this.id)) {
                return activeNodes.get(2);
            }
        }

        // 2 alive nodes: A->0, B->0, C->1
        if (nodeCount == 2) {
            if ("SS_A".equals(this.id)) {
                return activeNodes.get(0);
            }
            if ("SS_B".equals(this.id)) {
                return activeNodes.get(0);
            }
            if ("SS_C".equals(this.id)) {
                return activeNodes.get(1);
            }
        }

        // 1 alive node: A->0, B->0, C->0
        if (nodeCount == 1) {
            if ("SS_A".equals(this.id)) {
                return activeNodes.get(0);
            }
            if ("SS_B".equals(this.id)) {
                return activeNodes.get(0);
            }
            if ("SS_C".equals(this.id)) {
                return activeNodes.get(0);
            }
        }

        // Default: activate the service on the 0th node
        // (unknown id, or a node count not covered by the rules above).
        return activeNodes.get(0);
    }
}
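
One thing to note about the if-chains in getNodeSeq: with more than 3 alive nodes they fall through to the default, so all three services land on the 0th node. A table-driven variant could avoid that. The following is only a sketch under assumptions: TablePolicySketch is a hypothetical class, the type parameter N stands in for ClusterNode so the code runs without JBoss on the classpath, reusing the 3-node row for larger clusters is a design choice of this sketch, and returning null for an empty list is an assumption whose handling by the caller is not covered here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: N stands in for org.jboss.ha.framework.interfaces.ClusterNode. */
final class TablePolicySketch<N> {
    // For each policy id: list position when 1, 2 or 3 nodes are alive.
    private static final Map<String, int[]> TABLE = new HashMap<String, int[]>();
    static {
        TABLE.put("SS_A", new int[] {0, 0, 0});
        TABLE.put("SS_B", new int[] {0, 0, 1});
        TABLE.put("SS_C", new int[] {0, 1, 2});
    }

    private final String id;

    TablePolicySketch(String id) {
        this.id = id;
    }

    N elect(List<N> activeNodes) {
        if (activeNodes == null || activeNodes.isEmpty()) {
            return null; // assumption: nothing to elect
        }
        int[] positions = TABLE.get(id);
        if (positions == null) {
            return activeNodes.get(0); // unknown or unset id: same default as TestP
        }
        // Clusters larger than the table reuse the last (3-node) row.
        int row = Math.min(activeNodes.size(), positions.length) - 1;
        return activeNodes.get(positions[row]);
    }

    public static void main(String[] args) {
        List<String> four = Arrays.asList("node1", "node2", "node3", "node4");
        System.out.println(new TablePolicySketch<String>("SS_C").elect(four));
    }
}
```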



jboss-service.xml

Create a new folder "META-INF" directly in the base folder of the "HAServiceDynamicLoadMigration" project. Create a new "jboss-service.xml" in "META-INF" with the following content:



<?xml version="1.0" encoding="UTF-8"?>
<server>
<mbean code="test.ha.TestHASvcA" name="myexample:service=TestHASvcA"/>
<mbean code="test.ha.TestP" 
      name="myexample:service=SingletonServiceControllerA,type=ElectionPolicy">
<attribute name="Id">SS_A</attribute>
</mbean>
<mbean code="org.jboss.ha.singleton.HASingletonController" name="myexample:service=SingletonServiceControllerA">
<attribute name="HAPartition">
<inject bean="HAPartition" />
</attribute>
<attribute name="ElectionPolicy">
<inject bean="myexample:service=SingletonServiceControllerA,type=ElectionPolicy" />
</attribute>
<attribute name="Target">
<inject bean="myexample:service=TestHASvcA" />
</attribute>
<attribute name="TargetStartMethod">startHAService</attribute>
<attribute name="TargetStopMethod">stopHAService</attribute>
</mbean>

<mbean code="test.ha.TestHASvcB" name="myexample:service=TestHASvcB"/>
<mbean code="test.ha.TestP" name="myexample:service=SingletonServiceControllerB,type=ElectionPolicy">
<attribute name="Id">SS_B</attribute>
</mbean>
<mbean code="org.jboss.ha.singleton.HASingletonController" name="myexample:service=SingletonServiceControllerB">
<attribute name="HAPartition">
<inject bean="HAPartition" />
</attribute>
<attribute name="ElectionPolicy">
<inject bean="myexample:service=SingletonServiceControllerB,type=ElectionPolicy" />
</attribute>
<attribute name="Target">
<inject bean="myexample:service=TestHASvcB" />
</attribute>
<attribute name="TargetStartMethod">startHAService</attribute>
<attribute name="TargetStopMethod">stopHAService</attribute>
</mbean>

<mbean code="test.ha.TestHASvcC" name="myexample:service=TestHASvcC"/>

<mbean code="test.ha.TestP" name="myexample:service=SingletonServiceControllerC,type=ElectionPolicy">
<attribute name="Id">SS_C</attribute>
</mbean>
<mbean code="org.jboss.ha.singleton.HASingletonController" name="myexample:service=SingletonServiceControllerC">
<attribute name="HAPartition">
<inject bean="HAPartition" />
</attribute>
<attribute name="ElectionPolicy">
<inject bean="myexample:service=SingletonServiceControllerC,type=ElectionPolicy" />
</attribute>
<attribute name="Target">
<inject bean="myexample:service=TestHASvcC" />
</attribute>
<attribute name="TargetStartMethod">startHAService</attribute>
<attribute name="TargetStopMethod">stopHAService</attribute>
</mbean>
</server>



Now that the necessary coding is done, we are ready to deploy and test the HA singleton services.

Deployment

Use your IDE to export the "HAServiceDynamicLoadMigration" project as a "HAServiceDynamicLoadMigration.sar" archive to the "farm" folder of node1.

Step by step, we will now simulate starting all the nodes in the cluster, shutting nodes down one by one until only one node is left alive, and then recovering the cluster by restarting the failed servers one after another. Meanwhile, we will keep an eye on the console output of each node to check whether the HA singleton services migrate among the nodes as expected and as programmed in the customized HA selection policy.
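
Instead of reading the consoles by eye, the "Starting HA service" and "Stopping HA service" lines could be scanned programmatically. The following is a minimal sketch that assumes the console line format shown in this post; ConsoleLogTracker is a hypothetical helper, not a JBoss tool.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: derives the services active on one node from its console lines. */
final class ConsoleLogTracker {
    // Matches e.g. "... [STDOUT] #    Starting HA service class test.ha.TestHASvcA".
    private static final Pattern EVENT =
            Pattern.compile("#\\s*(Starting|Stopping) HA service class (\\S+)");

    /** Replays start/stop events in order and returns the classes still running. */
    static Set<String> activeServices(Iterable<String> consoleLines) {
        Set<String> active = new TreeSet<String>();
        for (String line : consoleLines) {
            Matcher m = EVENT.matcher(line);
            if (!m.find()) {
                continue; // not a start/stop line
            }
            if ("Starting".equals(m.group(1))) {
                active.add(m.group(2));
            } else {
                active.remove(m.group(2));
            }
        }
        return active;
    }

    public static void main(String[] args) {
        java.util.List<String> lines = java.util.Arrays.asList(
                "12:33:56,251 INFO  [STDOUT] #   Starting HA service class test.ha.TestHASvcA",
                "12:33:56,337 INFO  [STDOUT] #   Starting HA service class test.ha.TestHASvcC",
                "12:43:33,022 INFO  [STDOUT] #   Stopping HA service class test.ha.TestHASvcC");
        System.out.println(activeServices(lines));
    }
}
```

Feeding it the console lines of one node yields the set of singleton classes that node currently runs, which can then be compared with the expected placement for each step.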

Step1. start node1

Now start node1 and check its console output. You will see output like the following:
----

12:33:39,351 INFO  [GroupMember] I am (127.0.0.1:32902)
12:33:39,352 INFO  [GroupMember] New Members : 1 ([127.0.0.1:32902])
12:33:39,353 INFO  [GroupMember] All Members : 1 ([127.0.0.1:32902])
12:33:39,382 INFO  [STDOUT] 
....
12:33:56,251 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcA
12:33:56,309 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcB
12:33:56,337 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC
----
The above info shows:
  • JBoss is started in clustered mode, but currently with only one node
  • As expected, all 3 HA services are started on node1 

Step2. start node2

Now start node2 and check the console output of node1 and node2.

node1 console:
---
12:43:12,454 INFO  [GroupMember] org.jboss.messaging.core.impl.postoffice.GroupMember$ControlMembershipListener@1a701a8 got new view [127.0.0.1:50418|1] [127.0.0.1:50418, 127.0.0.1:53538], old view is [127.0.0.1:50418|0] [127.0.0.1:50418]                                                                                                                                      
12:43:12,470 INFO  [GroupMember] I am (127.0.0.1:50418)                                                                                                                                   
12:43:12,471 INFO  [GroupMember] New Members : 1 ([127.0.0.1:53538])                                                                                                                      
12:43:12,471 INFO  [GroupMember] All Members : 2 ([127.0.0.1:50418, 127.0.0.1:53538])                                                                                                     
...
12:43:32,899 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:32,977 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:32,977 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:32,977 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
12:43:32,977 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
12:43:32,978 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:32,978 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:32,978 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:32,978 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
12:43:32,986 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
12:43:33,020 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:33,021 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:33,021 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:33,021 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
12:43:33,021 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
12:43:33,022 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcC

---
Node1 has now detected that a new member joined the cluster. It updated the sorted list of currently alive nodes:
                node1 -->[0]
                node2 -->[1]

Because the cluster structure has changed, the JBoss HA controller re-evaluates the HA singleton status with the help of the configured HA selection policy.
"TestHASvcA" is configured to use the HA selection policy with Id "SS_A", which selects the 0th node, i.e. node1. Since "TestHASvcA" is already active on node1, no further action is taken.

"TestHASvcB" is configured to use the HA selection policy with Id "SS_B", which selects the 0th node, i.e. node1. Since "TestHASvcB" is already active on node1, no further action is taken.

"TestHASvcC" is configured to use the HA selection policy with Id "SS_C", which selects the 1st node, i.e. node2. So "TestHASvcC" is stopped on node1.

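The behavior visible in the consoles can be summarized as a per-node decision: every node evaluates the policy, then starts the service if it was elected and is not yet running it, stops the service if it is running it but was not elected, and otherwise does nothing. The sketch below models only that decision; SingletonDecisionSketch is a hypothetical class and is not how HASingletonController is actually implemented.

```java
/** Simplified model of the start/stop decision each node makes; not JBoss code. */
final class SingletonDecisionSketch {
    enum Action { START, STOP, NONE }

    /**
     * @param electedNode    node name returned by the election policy
     * @param localNode      name of the node making the decision
     * @param runningLocally whether the singleton currently runs on the local node
     */
    static Action decide(String electedNode, String localNode, boolean runningLocally) {
        boolean shouldRunHere = localNode.equals(electedNode);
        if (shouldRunHere && !runningLocally) {
            return Action.START;
        }
        if (!shouldRunHere && runningLocally) {
            return Action.STOP;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        String node1 = "127.0.0.1:1199";
        String node2 = "127.0.0.1:1299";
        // Step2, TestHASvcC: the policy elects node2.
        System.out.println("node1: " + decide(node2, node1, true));  // STOP
        System.out.println("node2: " + decide(node2, node2, false)); // START
        // Step2, TestHASvcA: the policy elects node1, where it already runs.
        System.out.println("node1: " + decide(node1, node1, true));  // NONE
    }
}
```
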
node2 console:
---
12:43:12,481 INFO  [GroupMember] I am (127.0.0.1:53538)
12:43:12,482 INFO  [GroupMember] New Members : 2 ([127.0.0.1:50418, 127.0.0.1:53538])
12:43:12,482 INFO  [GroupMember] All Members : 2 ([127.0.0.1:50418, 127.0.0.1:53538])
12:43:12,642 INFO  [STDOUT] 
...
12:43:32,901 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:32,902 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:32,903 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:32,904 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
12:43:32,905 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
12:43:32,967 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:32,968 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:32,969 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:32,969 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
12:43:32,970 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
12:43:33,023 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
12:43:33,023 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
12:43:33,023 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
12:43:33,023 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
12:43:33,023 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
12:43:33,027 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC

---
Node2 joins the cluster and detects that there is already a member in the cluster. It updates the sorted list of currently alive nodes:
                node1 -->[0]
                node2 -->[1]
Since the cluster structure has changed, the JBoss HA controller re-evaluates the HA singleton status.
According to the configured selection policy, "TestHASvcA" and "TestHASvcB" should stay on the 0th node, i.e. node1, so nothing is done for these services on node2.
"TestHASvcC" is configured to run on the 1st node, i.e. node2, so node2 starts "TestHASvcC".

Step3. start node3

Now start node3 and check the console output of node1, node2 and node3.

node1 console:
---
14:52:24,349 INFO  [GroupMember] I am (127.0.0.1:36472)
14:52:24,349 INFO  [GroupMember] New Members : 1 ([127.0.0.1:42881])
14:52:24,350 INFO  [GroupMember] All Members : 3 ([127.0.0.1:36472, 127.0.0.1:44741, 127.0.0.1:42881])
...
14:52:43,676 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,705 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,706 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,706 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,706 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
14:52:43,706 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
14:52:43,752 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,752 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,752 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,753 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,753 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:52:43,753 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:52:43,757 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcB
14:52:43,823 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,825 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,825 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,826 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,826 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
14:52:43,827 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
---

Node3 joins the cluster. Node1 updates the sorted list of currently alive nodes:
                node1 -->[0]
                node2 -->[1]
                node3 -->[2]
Since the cluster structure has changed, the JBoss HA controller re-evaluates the HA singleton status.
According to the configured selection policy, "TestHASvcA" should stay on the 0th node, i.e. node1. "TestHASvcB" should be migrated to the 1st node, i.e. node2, so this service is stopped on node1.

node2 console:
---
[127.0.0.1:36472|1] [127.0.0.1:36472, 127.0.0.1:44741]
14:52:24,298 INFO  [GroupMember] I am (127.0.0.1:44741)
14:52:24,298 INFO  [GroupMember] New Members : 1 ([127.0.0.1:42881])
14:52:24,298 INFO  [GroupMember] All Members : 3 ([127.0.0.1:36472, 127.0.0.1:44741, 127.0.0.1:42881])
14:52:43,675 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,676 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,676 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,676 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,676 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
14:52:43,676 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
14:52:43,748 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,749 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,749 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,749 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,749 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:52:43,749 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:52:43,751 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcB
14:52:43,822 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,822 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,823 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,823 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,823 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
14:52:43,823 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
14:52:43,824 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcC
---
Node2 updates the sorted list of currently alive nodes too:
                node1 -->[0]
                node2 -->[1]
                node3 -->[2]
According to the configured selection policy, "TestHASvcB" should be migrated to the 1st node, i.e. node2, so this service was stopped on node1, as shown in the console of node1, and is started on node2.
"TestHASvcC" no longer belongs to node2, so it is stopped on node2 and will be started on node3.

node3 console:
---
14:52:24,367 INFO  [GroupMember] I am (127.0.0.1:42881)
14:52:24,368 INFO  [GroupMember] New Members : 3 ([127.0.0.1:36472, 127.0.0.1:44741, 127.0.0.1:42881])
14:52:24,369 INFO  [GroupMember] All Members : 3 ([127.0.0.1:36472, 127.0.0.1:44741, 127.0.0.1:42881])
14:52:24,875 INFO  [STDOUT] 
....
14:52:43,679 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,680 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,681 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,681 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,682 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
14:52:43,682 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
14:52:43,755 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,756 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,756 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,756 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,756 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:52:43,757 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:52:43,826 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:52:43,827 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
14:52:43,827 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:52:43,827 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:52:43,827 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
14:52:43,827 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
14:52:43,831 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC
---
Node3 joins the cluster and detects that there are already 2 members in the cluster. It creates a sorted list of currently alive nodes:
                node1 -->[0]
                node2 -->[1]
                node3 -->[2]
Since the cluster structure has changed, the JBoss HA controller re-evaluates the HA singleton status.
According to the configured selection policy, "TestHASvcC" is taken over by node3 from node2.

Step4. node1 fails

Shut down node1 and check the console output of node2 and node3.

console node2:
---
 14:55:37,727 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:55:37,810 INFO  [STDOUT]  ## node name: 127.0.0.1:1299                                                                                                                                       
14:55:37,810 INFO  [STDOUT]  ## node name: 127.0.0.1:1399                                                                                                                                       
14:55:37,810 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C                                                                                                                                  
14:55:37,810 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399                                                                                                                      
14:55:38,000 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name:                                                                                                             
14:55:38,000 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,000 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:55:38,000 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:55:38,001 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:55:38,031 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:55:38,039 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,039 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:55:38,039 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
14:55:38,040 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:55:38,043 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcA
...
14:55:46,615 INFO  [DefaultPartition] I am (127.0.0.1:1299) received membershipChanged event:
14:55:46,615 INFO  [DefaultPartition] Dead members: 1 ([127.0.0.1:1199])
14:55:46,616 INFO  [DefaultPartition] New Members : 0 ([])
14:55:46,616 INFO  [DefaultPartition] All Members : 2 ([127.0.0.1:1299, 127.0.0.1:1399])
---
When node1 is shut down, node2 updates the sorted list of currently alive nodes:
                node2 -->[0]
                node3 -->[1]

With 2 alive nodes, "TestHASvcA" should run on the 0th node. According to the updated alive node list, the 0th node is now node2, so node2 starts "TestHASvcA".

console node3:
---
14:55:37,880 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:55:38,016 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,016 INFO  [STDOUT]  ## node name: 127.0.0.1:1399                                                                         
14:55:38,016 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C                                                                    
14:55:38,017 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399                                                        
14:55:38,017 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name:                                               
14:55:38,017 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,017 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
14:55:38,017 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
14:55:38,017 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
14:55:38,032 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
14:55:38,033 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
14:55:38,033 INFO  [STDOUT]  ## node name: 127.0.0.1:1399                                                                         
14:55:38,033 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A                                                                    
14:55:38,033 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
...
14:55:46,625 INFO  [DefaultPartition] I am (127.0.0.1:1399) received membershipChanged event:
14:55:46,625 INFO  [DefaultPartition] Dead members: 1 ([127.0.0.1:1199])
14:55:46,625 INFO  [DefaultPartition] New Members : 0 ([])
14:55:46,625 INFO  [DefaultPartition] All Members : 2 ([127.0.0.1:1299, 127.0.0.1:1399])
---

Node3 only updates its alive node list. "TestHASvcC" stays on node3.

Step5. node2 fails

When node2 fails now, node3 is left as the only alive node. As shown in node3's console, the alive node list now contains only one node, i.e. node3. The services that were running on node2 are migrated to node3.

console node3:
---
14:58:10,737 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcB
14:58:10,800 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcA
14:58:12,491 INFO  [GroupMember] ....
14:58:14,078 INFO  [DefaultPartition] I am (127.0.0.1:1399) received membershipChanged event:
14:58:14,115 INFO  [DefaultPartition] Dead members: 1 ([127.0.0.1:1299])
14:58:14,115 INFO  [DefaultPartition] New Members : 0 ([])
14:58:14,115 INFO  [DefaultPartition] All Members : 1 ([127.0.0.1:1399])
---

Step6. node1 recovers

Start node1 again and check the console output of node1 and node3.

console node1:
---
15:01:35,189 INFO  [GroupMember] I am (127.0.0.1:45635)
15:01:35,190 INFO  [GroupMember] New Members : 2 ([127.0.0.1:42881, 127.0.0.1:45635])
15:01:35,190 INFO  [GroupMember] All Members : 2 ([127.0.0.1:42881, 127.0.0.1:45635])
15:01:35,354 INFO  [STDOUT] 
---------------------------------------------------------
GMS: address is 127.0.0.1:7900 (cluster=MessagingPostOffice-DATA)
---------------------------------------------------------

15:01:49,131 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,132 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,132 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,132 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:01:49,133 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:01:49,207 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,207 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,207 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,208 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:01:49,208 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:01:49,265 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,266 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,266 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,266 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:01:49,266 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
15:01:49,272 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC
15:01:49,366 INFO  [Http11Protocol] Starting Coyote HTTP/1.1 on http-127.0.0.1-8180
15:01:49,526 INFO  [AjpProtocol] Starting Coyote AJP/1.3 on ajp-127.0.0.1-8109
15:01:49,576 INFO  [ServerImpl] JBoss (Microcontainer) [5.1.0.GA (build: SVNTag=JBoss_5_1_0_GA date=200905221634)] Started in 1m:51s:721ms
---
When node1 recovers, the alive node list becomes:
                node3 -->[0]
                node1 -->[1]
With an alive node count of 2, "TestHASvcA" and "TestHASvcB" should run on the 0th node, so they stay on node3. "TestHASvcC" is migrated to the 1st node, i.e. node1. So we see on the console of node1 that "TestHASvcC" is started.

console node3:
---
15:01:18,345 INFO  [DefaultPartition] I am (127.0.0.1:1399) received membershipChanged event:
15:01:18,345 INFO  [DefaultPartition] Dead members: 0 ([])
15:01:18,345 INFO  [DefaultPartition] New Members : 1 ([127.0.0.1:1199])
15:01:18,345 INFO  [DefaultPartition] All Members : 2 ([127.0.0.1:1399, 127.0.0.1:1199])
...
15:01:49,126 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,127 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,127 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,127 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:01:49,127 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:01:49,206 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,206 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,206 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,206 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:01:49,207 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:01:49,263 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:01:49,270 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcC
15:01:49,280 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:01:49,281 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:01:49,281 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:01:49,281 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
---

Node3 is still the 0th node in the cluster, but "TestHASvcC" is now expected to run on the 1st node. The service therefore no longer belongs to node3 and is shut down there.

Step6. node2 recovers

Start node2 again.

console node1:
---
15:04:44,024 INFO  [GroupMember] I am (127.0.0.1:45635)
15:04:44,024 INFO  [GroupMember] New Members : 1 ([127.0.0.1:39250])
15:04:44,025 INFO  [GroupMember] All Members : 3 ([127.0.0.1:42881, 127.0.0.1:45635, 127.0.0.1:39250])
...
15:05:00,621 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,687 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,688 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,688 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,688 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:05:00,688 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:05:00,688 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,688 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,688 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,701 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,701 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:05:00,701 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
15:05:00,860 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcB
15:05:00,860 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,861 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,861 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,861 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,861 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:05:00,861 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
15:05:00,864 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcC
---
When node2 joins the cluster, there are 3 nodes in the cluster again, but the alive node list differs from the one before the node failures. The list is now ordered differently:

                node3 -->[0]
                node1 -->[1]
                node2 -->[2]
Before node2 joined the cluster, "TestHASvcC" was running on node1. Now that the alive node count has become 3, node1, as the 1st node, runs only "TestHASvcB". Therefore "TestHASvcC" is stopped and "TestHASvcB" is started on node1.

console node2:
---
15:04:44,038 INFO  [GroupMember] I am (127.0.0.1:39250)
15:04:44,038 INFO  [GroupMember] New Members : 3 ([127.0.0.1:42881, 127.0.0.1:45635, 127.0.0.1:39250])
15:04:44,039 INFO  [GroupMember] All Members : 3 ([127.0.0.1:42881, 127.0.0.1:45635, 
...
15:05:00,624 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,625 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,625 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,625 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,625 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:05:00,625 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:05:00,732 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,733 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,733 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,733 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,733 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:05:00,733 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
15:05:00,863 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,863 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,863 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,863 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,863 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:05:00,863 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
15:05:00,868 INFO  [STDOUT] #                   Starting HA service class test.ha.TestHASvcC
---
Node2 joins the cluster as the 2nd node, i.e. element [2] of the alive node list, so only "TestHASvcC" is started on it.

console node3:
---
[127.0.0.1:42881, 127.0.0.1:45635]
15:04:44,019 INFO  [GroupMember] I am (127.0.0.1:42881)
15:04:44,019 INFO  [GroupMember] New Members : 1 ([127.0.0.1:39250])
15:04:44,019 INFO  [GroupMember] All Members : 3 ([127.0.0.1:42881, 127.0.0.1:45635, 127.0.0.1:39250])
...
15:05:00,621 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,689 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,689 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,689 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,689 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_A
15:05:00,689 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1399
15:05:00,704 INFO  [STDOUT] #                   Stopping HA service class test.ha.TestHASvcB
15:05:00,731 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,732 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,788 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,788 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,788 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_B
15:05:00,789 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1199
15:05:00,789 INFO  [STDOUT]  ### list all nodes before doing selection ploicy name: 
15:05:00,789 INFO  [STDOUT]  ## node name: 127.0.0.1:1399
15:05:00,789 INFO  [STDOUT]  ## node name: 127.0.0.1:1199
15:05:00,789 INFO  [STDOUT]  ## node name: 127.0.0.1:1299
15:05:00,789 INFO  [STDOUT]  ## getNodeSeq ,  policy name SS_C
15:05:00,789 INFO  [STDOUT]  ## selected node, policy name: 127.0.0.1:1299
---
Before node2 joined the cluster, node3 was the 0th node in the cluster and had "TestHASvcA" and "TestHASvcB" running on it.

Now that the cluster has 3 nodes, node3 runs only "TestHASvcA". "TestHASvcB" is stopped on it and taken over by node1, as shown in the console of node1.
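
The stop and start actions seen on the three consoles in this step can be summarized as the difference between two service-to-node assignments. The following is a minimal sketch under that assumption; the class MigrationDiff and its diff method are hypothetical helpers and are not part of JBoss or of the example project.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MigrationDiff {
    // Compares two assignments (service name -> node name) and lists the
    // stop/start actions needed to get from the old one to the new one.
    static List<String> diff(Map<String, String> before, Map<String, String> after) {
        List<String> actions = new ArrayList<>();
        for (Map.Entry<String, String> entry : after.entrySet()) {
            String service = entry.getKey();
            String newNode = entry.getValue();
            String oldNode = before.get(service);
            if (newNode.equals(oldNode)) {
                continue; // the service stays where it is
            }
            if (oldNode != null) {
                actions.add("stop " + service + " on " + oldNode);
            }
            actions.add("start " + service + " on " + newNode);
        }
        return actions;
    }

    public static void main(String[] args) {
        // Assignment with 2 alive nodes (node3, node1).
        Map<String, String> before = new LinkedHashMap<>();
        before.put("TestHASvcA", "node3");
        before.put("TestHASvcB", "node3");
        before.put("TestHASvcC", "node1");

        // Assignment with 3 alive nodes (node3, node1, node2).
        Map<String, String> after = new LinkedHashMap<>();
        after.put("TestHASvcA", "node3");
        after.put("TestHASvcB", "node1");
        after.put("TestHASvcC", "node2");

        diff(before, after).forEach(System.out::println);
    }
}
```

For this step the sketch lists four actions: "TestHASvcB" stops on node3 and starts on node1, and "TestHASvcC" stops on node1 and starts on node2. "TestHASvcA" does not move.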

CONGRATULATIONS!
Now we are done with our experiment. The HA singleton services are again distributed across the 3 nodes.

Dynamic distribution and migration of HA singleton services is useful when you have several services for which:
  •  each service should run only once in the cluster
  •  each service should be highly available, i.e. able to survive node failures
  •  the services should not all run on one node, but be distributed among all nodes, for example for performance reasons.
You might ask: in a "client-server" system, what if a client is connected to one node, but the required service runs on another node? We will come to this topic later...




Wednesday, October 3, 2012

HA Singleton with JBoss 7 Cluster

Document  Version 1.0

   Copyright © 2012-2013 beijing.beijing.012@gmail.com

Keywords:
JBoss 7  HA  Service, HA Singleton, JBoss 7 Cluster, load migration, load distribution, HA Service deployment


Start 2 standalone JBoss 7 servers in clustered mode  [Draft]


Download JBoss AS 7.1.1.Final and extract the archive to a temporary location, let's say "JBoss_AS_7_1_1_Final".

Create 2 new folders:
jboss7_node1
jboss7_node2

Copy all folders and files under "JBoss_AS_7_1_1_Final" into jboss7_node1.
Copy all folders and files under "JBoss_AS_7_1_1_Final" into jboss7_node2.

Now we have 2 JBoss nodes ready to be started.


Start "jboss7_node1" in clustered mode using the command:

./standalone.sh -Djboss.node.name=node1 --server-config=standalone-ha.xml

Node1 has started correctly in clustered mode when the following line is shown in the console:


[org.jboss.modcluster.advertise.impl.AdvertiseListenerImpl] (MSC service thread 1-3) Listening to proxy advertisements on 224.0.1.105:23364
....

If you are familiar with older JBoss versions (4.x, 5.x), you may expect to see log output about the cluster or its nodes. JBoss 7 does not show such information yet.


Start "jboss7_node2" in clustered mode using the command:


./standalone.sh -Djboss.node.name=node2 --server-config=standalone-ha.xml -Djboss.socket.binding.port-offset=100

Node2 has started correctly in clustered mode when the following line is shown in the console:


[org.jboss.modcluster.advertise.impl.AdvertiseListenerImpl] (MSC service thread 1-3) Listening to proxy advertisements on 224.0.1.105:23364
....

Here we still cannot see any log output about clusters and nodes.

You could write a small "Hello World" web application and try to deploy it in the cluster.
I will take the "TestWebSec20" web application, add "<distributable/>" to its web.xml file, and put "TestWebSec20.war" into the "deployments" folder of jboss7_node1.

The console of node1 shows:


21:19:36,185 INFO  [org.jboss.as.clustering.impl.CoreGroupCommunicationService.web] (MSC service thread 1-1) JBAS010206: Number of cluster members: 1
....
21:19:37,378 INFO  [org.jboss.as.server] (DeploymentScanner-threads - 2) JBAS018559: Deployed "TestWebSec20.war"

Now cluster information is shown, but where is node2?

Now deploy "TestWebSec20.war" on node2 by putting the war file into the "deployments" folder of jboss7_node2.
The console of node2 shows:


21:26:35,227 INFO  [org.jboss.as.clustering.impl.CoreGroupCommunicationService.web] (MSC service thread 1-3) JBAS010206: Number of cluster members: 2
....
21:26:37,207 INFO  [org.jboss.as.server] (DeploymentScanner-threads - 1) JBAS018559: Deployed "TestWebSec20.war"

Let's take another look at the console of node1:


21:26:31,272 INFO  [org.jboss.as.clustering.impl.CoreGroupCommunicationService.lifecycle.web] (Incoming-1,null) JBAS010247: New cluster view for partition web (id: 1, delta: 1, merge: false) : [node1/web, node2/web]
21:26:31,326 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-1,null) ISPN000094: Received new cluster view: [node1/web|1] [node1/web, node2/web]




Now let's shut down node1. The console of node2 shows:

21:31:10,903 INFO  [org.jboss.as.clustering.impl.CoreGroupCommunicationService.lifecycle.web] (Incoming-6,null) JBAS010247: New cluster view for partition web (id: 2, delta: -1, merge: false) : [node2/web]
21:31:10,961 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-6,null) ISPN000094: Received new cluster view: [node2/web|2] [node2/web]
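
To follow membership changes without reading the console by eye, the "New cluster view" lines can be picked apart with a regular expression. The sketch below assumes the line layout shown above (JBAS010247 on JBoss AS 7.1.1.Final, with the member list on the same line); the class ClusterViewLine is a hypothetical helper, and reading "delta" as the change in member count is an interpretation of the two log lines in this article, not something taken from the JBoss documentation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClusterViewLine {
    // Matches e.g. "... New cluster view for partition web
    // (id: 2, delta: -1, merge: false) : [node2/web]"
    private static final Pattern VIEW = Pattern.compile(
            "New cluster view for partition (\\S+) "
            + "\\(id: (\\d+), delta: (-?\\d+), merge: (true|false)\\) : \\[(.*)\\]");

    final String partition;
    final int id;
    final int delta;
    final List<String> members;

    private ClusterViewLine(String partition, int id, int delta, List<String> members) {
        this.partition = partition;
        this.id = id;
        this.delta = delta;
        this.members = members;
    }

    // Returns the parsed view, or null if the line is not a cluster view line.
    static ClusterViewLine parse(String line) {
        Matcher m = VIEW.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new ClusterViewLine(
                m.group(1),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)),
                Arrays.asList(m.group(5).split(",\\s*")));
    }

    public static void main(String[] args) {
        String line = "21:31:10,903 INFO  [org.jboss.as.clustering.impl."
                + "CoreGroupCommunicationService.lifecycle.web] (Incoming-6,null) "
                + "JBAS010247: New cluster view for partition web "
                + "(id: 2, delta: -1, merge: false) : [node2/web]";
        ClusterViewLine view = parse(line);
        System.out.println(view.partition + " id=" + view.id
                + " delta=" + view.delta + " members=" + view.members);
    }
}
```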












Tuesday, October 2, 2012

HA Singleton, Cluster Wide Singleton as MBean in JBoss 5, part 3/3

Document  Version 1.0
  Copyright © 2012-2013 beijing.beijing.012@gmail.com


Keywords:
JBoss HA  Service, HA Singleton, JBoss Cluster, load migration, load distribution, HA Service deployment


In parts 1 and 2 of "HA Singleton, Cluster Wide Singleton as MBean in JBoss 5", we successfully configured a 2-node JBoss cluster and created a "TestHASingleton" MBean. This bean is ready to be deployed as a JBoss HA singleton. If you have followed part 2, you should now have a "TestHASingleton.sar" file. In this part 3 of the series, we will deploy the "TestHASingleton" MBean in the JBoss cluster. We will also do some experiments to make sure it really is an HA singleton, i.e. it runs ONLY once in the cluster.


Deployment of "TestHASingleton" MBean


Put "TestHASingleton.sar" into "JBOSS_HOME/server/node1/farm/".
You will see the following output in the console of node1:

10:28:09,752 INFO  [STDOUT] ### Starting JbHaSingletonSvcSample Singleton Service..

From the above text we can see that the "startSingletonService" method of "JbHaSingletonSvcSample" is called, i.e. the singleton is started on node1.

When we check the console output of node2, nothing about the singleton is shown. This is correct.


Shutdown node1


Now we shut down node1. The following output appears in the console of node2:

10:40:05,383 INFO  [STDOUT] ### Starting JbHaSingletonSvcSample Singleton Service..


On which node will the singleton be active?


step1. shut down node1 and node2
step2. remove "TestHASingleton.sar" from the farm folders of both nodes
step3. start node1
step4. start node2
step5. deploy "TestHASingleton.sar" in the farm folder of node2.


The console of node2 shows:

10:48:34,880 INFO  [STDOUT] ### Starting JbHaSingletonSvcSample Singleton Service..
10:48:37,657 INFO  [STDOUT] ### Stopping JbHaSingletonSvcSample Singleton Service..


The console of node1 shows:

10:48:37,658 INFO  [STDOUT] ### Starting JbHaSingletonSvcSample Singleton Service..


This is because, by default, an HA singleton is only active on the "master" node of a cluster. JBoss maintains a list of all active nodes, sorted by the time each node joined the cluster, and the first node in the list is chosen as master. Since node1 was started first, it is the master, so the singleton is stopped on node2 and started on node1.
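
The default behavior described here can be modeled with a list kept in join order. This is a simplified sketch of the idea only, not the JBoss implementation; the class JoinOrderView and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class JoinOrderView {
    // Members in the order in which they joined the cluster.
    private final List<String> members = new ArrayList<>();

    void join(String node) {
        if (!members.contains(node)) {
            members.add(node); // a (re)joining node goes to the end of the list
        }
    }

    void leave(String node) {
        members.remove(node);
    }

    // The first node in the list acts as master; null if the cluster is empty.
    String master() {
        return members.isEmpty() ? null : members.get(0);
    }

    List<String> members() {
        return List.copyOf(members);
    }

    public static void main(String[] args) {
        JoinOrderView view = new JoinOrderView();
        view.join("node1");
        view.join("node2");
        System.out.println("master: " + view.master());

        view.leave("node1"); // node1 is shut down
        System.out.println("master: " + view.master());

        view.join("node1"); // node1 comes back, but is no longer first
        System.out.println("members: " + view.members() + ", master: " + view.master());
    }
}
```

In this model it does not matter on which node the archive is deployed: with node1 started before node2, node1 is first in the list and therefore hosts the singleton, as in the experiment above.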

We can also customize how the HA singleton decides on which node the service should be active. In that case a customized "HA selection policy" is needed...

HA Singleton, Cluster Wide Singleton as MBean in JBoss 5, part 2/3

Document  Version 1.0

  Copyright © 2012-2013 beijing.beijing.012@gmail.com


Keywords:
JBoss HA  Service, HA Singleton, JBoss Cluster, load migration, load distribution, HA Service deployment



Write and deploy a JBoss MBean



We will write a JBoss MBean called "JbHaSingletonSvcSample"


Create a simple Java project "TestHASingleton" in Eclipse. 

An MBean needs an interface, an implementation class, and a "jboss-service.xml" file for the MBean description/deployment.



The interface: 

package test.ha;

import org.jboss.system.ServiceMBean;

public interface JbHaSingletonSvcSampleMBean extends ServiceMBean{
}



The implementation class:

package test.ha;

import org.jboss.system.ServiceMBeanSupport;

/**
 * The service itself should not be written as a singleton.
 * @author ws
 */
public class JbHaSingletonSvcSample extends ServiceMBeanSupport implements
        JbHaSingletonSvcSampleMBean {

    // The lifecycle: configured as TargetStartMethod in jboss-service.xml
    public void startSingletonService() throws Exception {
        System.out.println("### Starting JbHaSingletonSvcSample Singleton Service..");
    }

    // Configured as TargetStopMethod in jboss-service.xml
    public void stopSingletonService() throws Exception {
        System.out.println("### Stopping JbHaSingletonSvcSample Singleton Service..");
    }
}



Create a "META-INF" folder directly under project root:

TestHASingleton/
                            src/
                            META-INF/



Create a "jboss-service.xml" file in the "META-INF" folder with the following content:

<?xml version="1.0" encoding="UTF-8"?>

<server>
  <mbean code="test.ha.JbHaSingletonSvcSample" name="myexample:service=testHaSample"/>
    <mbean code="org.jboss.ha.singleton.HASingletonController" name="myexample:service=SingletonServiceControllerA">
        <attribute name="HAPartition"><inject bean="HAPartition" /></attribute>
         <attribute name="Target"><inject bean="myexample:service=testHaSample" /></attribute>
        <attribute name="TargetStartMethod">startSingletonService</attribute>
        <attribute name="TargetStopMethod">stopSingletonService</attribute>
    </mbean>
</server>


The MBean is deployed as a ".sar" archive. A ".sar" file is nothing more than a ".jar" file. To make one, we just export the project binaries as a "jar" file with Eclipse, i.e. "TestHASingleton.jar", and then rename it to "TestHASingleton.sar".
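
To illustrate that a ".sar" is just a zip-format archive with the descriptor under META-INF, the following sketch builds such an archive in memory with the standard java.util.zip classes and lists its entries. It is not a replacement for the Eclipse export: the class SarSketch is hypothetical, and the file contents are placeholders, not a real compiled class or a complete descriptor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class SarSketch {
    // Packs the given files (entry path -> content) into a zip-format archive.
    static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the archive back and returns its entry names in order.
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        // Placeholder contents; a real archive holds the compiled classes
        // and the full jboss-service.xml shown above.
        files.put("META-INF/jboss-service.xml",
                "<server/>".getBytes(StandardCharsets.UTF_8));
        files.put("test/ha/JbHaSingletonSvcSample.class", new byte[0]);
        entryNames(pack(files)).forEach(System.out::println);
    }
}
```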





Monday, October 1, 2012

HA Singleton, Cluster Wide Singleton as MBean in JBoss 5, part 1/3

Document  Version 1.0

   Copyright © 2012-2013 beijing.beijing.012@gmail.com


Keywords:
JBoss HA  Service, HA Singleton, JBoss Cluster, load migration, load distribution, HA Service deployment



What is HA Singleton?


We know that singletons are instances or services that exist only once in an application context. When you write a singleton and deploy it on your server, you get only one instance of this singleton on the server.

The singleton mentioned above is actually a class-loader-wide singleton, i.e. one instance per class loader. In a high-availability cluster, when such a singleton is deployed in the cluster, there is one singleton instance on each cluster node.
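
As a reminder of what such a class-loader-wide singleton looks like, here is a minimal sketch; the class name LocalSingleton and its counter are made up for illustration. The single instance lives in a static field, so it is unique only within the class loader (and JVM) that loaded the class. Each cluster node runs its own JVM and therefore gets its own instance.

```java
class LocalSingleton {
    // One instance per class loader: the static field is initialized
    // once when this class is loaded.
    private static final LocalSingleton INSTANCE = new LocalSingleton();

    private int calls;

    private LocalSingleton() {
        // private constructor prevents other instances in this class loader
    }

    static LocalSingleton getInstance() {
        return INSTANCE;
    }

    // Counts calls; all callers in the same class loader share this state.
    synchronized int call() {
        return ++calls;
    }

    public static void main(String[] args) {
        LocalSingleton a = LocalSingleton.getInstance();
        LocalSingleton b = LocalSingleton.getInstance();
        System.out.println("same instance: " + (a == b));
        a.call();
        System.out.println("calls seen via b: " + b.call());
    }
}
```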

But there are cases where we need a cluster-wide singleton. For example, in a clustered auction system, bid orders can come from different nodes, but the process that handles the final trade settlement should run only once in the whole cluster. When the node on which the singleton service runs fails, another node starts the singleton service automatically. Such a singleton is a so-called cluster-wide singleton, i.e. an HA singleton.


We will now take JBoss as an example to show how to implement an HA singleton.


JBoss supports the deployment of a singleton as an HA singleton. There are generally 2 ways to deploy an HA singleton on JBoss:

  • option 1: just put the deployment archive under "../deploy-hasingleton/", and the deployed service becomes an HA singleton. The disadvantages of this approach are that there is no hot-deployment support and that service startup takes longer after a node failure...
  • option 2: deploy the service as an MBean. MBeans support hot deployment. The singleton MBean is deployed on all nodes, but provides its service on only one node.

We take the second option and show how to deploy an MBean as an HA singleton:
We will configure and run a 2-node JBoss cluster.
We will write a simple MBean.
We will deploy the MBean as an HA singleton.


Configure and run a 2-node JBoss cluster

Download JBoss 5.1.0.GA from www.jboss.org (http://sourceforge.net/projects/jboss/files/JBoss/JBoss-5.1.0.GA). Extract the file to some location, which we will call JBOSS_HOME hereafter.

In the extracted JBoss directory, go to JBOSS_HOME/server/ and create 2 new folders directly there:

node1
node2

Copy all files in JBOSS_HOME/server/all into node1.
Copy all files in JBOSS_HOME/server/all into node2.

Now we have a JBoss cluster with two nodes ready to run.


Start node1:


./run.sh -c node1 -Djboss.service.binding.set=ports-01 -Djboss.messaging.ServerPeerID=1



If you have followed the above steps, you will see console output like this:


09:57:56,551 INFO  [GroupMember] I am (127.0.0.1:56944)
09:57:56,551 INFO  [GroupMember] New Members : 1 ([127.0.0.1:56944])
09:57:56,551 INFO  [GroupMember] All Members : 1 ([127.0.0.1:56944])
09:57:56,556 INFO  [STDOUT] 

.......
09:58:03,364 INFO  [Http11Protocol] Starting Coyote HTTP/1.1 on http-127.0.0.1-8180
09:58:03,377 INFO  [AjpProtocol] Starting Coyote AJP/1.3 on ajp-127.0.0.1-8109
09:58:03,382 INFO  [ServerImpl] JBoss (Microcontainer) [5.1.0.GA (build: SVNTag=JBoss_5_1_0_GA date=200905221634)] Started in 25s:474ms

Node1 is now started.


Start node2:


./run.sh -c node2 -Djboss.service.binding.set=ports-02 -Djboss.messaging.ServerPeerID=2


You will see console lines like these:

|1] [127.0.0.1:56944, 127.0.0.1:55687], old view is null
10:01:57,587 INFO  [GroupMember] I am (127.0.0.1:55687)
10:01:57,587 INFO  [GroupMember] New Members : 2 ([127.0.0.1:56944, 127.0.0.1:55687])
10:01:57,587 INFO  [GroupMember] All Members : 2 ([127.0.0.1:56944, 127.0.0.1:55687])
10:01:57,629 INFO  [STDOUT] 
.....

10:01:59,745 INFO  [Http11Protocol] Starting Coyote HTTP/1.1 on http-127.0.0.1-8280
10:01:59,756 INFO  [AjpProtocol] Starting Coyote AJP/1.3 on ajp-127.0.0.1-8209
10:01:59,762 INFO  [ServerImpl] JBoss (Microcontainer) [5.1.0.GA (build: SVNTag=JBoss_5_1_0_GA date=200905221634)] Started in 17s:400ms


The second node has started and joined the cluster.
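
The port numbers in the two console outputs (HTTP 8180 and AJP 8109 on node1, HTTP 8280 and AJP 8209 on node2) suggest that the binding sets "ports-01" and "ports-02" shift the default ports by 100 and 200. The sketch below works under that assumption; the class BindingSetPorts is a hypothetical helper, and the unshifted defaults 8080 (HTTP) and 8009 (AJP) are assumed rather than taken from this article.

```java
class BindingSetPorts {
    // Assumed default ports of an unshifted JBoss 5 server.
    static final int DEFAULT_HTTP = 8080;
    static final int DEFAULT_AJP = 8009;

    // Offset applied by a binding set name such as "ports-01".
    static int offset(String bindingSet) {
        if ("ports-default".equals(bindingSet)) {
            return 0;
        }
        if (bindingSet.matches("ports-0[1-3]")) {
            return 100 * Integer.parseInt(bindingSet.substring("ports-".length()));
        }
        throw new IllegalArgumentException("unknown binding set: " + bindingSet);
    }

    static int shifted(int defaultPort, String bindingSet) {
        return defaultPort + offset(bindingSet);
    }

    public static void main(String[] args) {
        for (String set : new String[] {"ports-01", "ports-02"}) {
            System.out.println(set + ": HTTP " + shifted(DEFAULT_HTTP, set)
                    + ", AJP " + shifted(DEFAULT_AJP, set));
        }
    }
}
```

Running two nodes on the same machine only works because each node gets its own binding set; with identical ports the second node would fail to bind its listeners.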