This document describes some of the ways to leverage the high availability features of Oracle Real Application Cluster (RAC) databases by moving the load balancing functionality from the Oracle server/Oracle driver to your application. In this article, I discuss the ways to achieve load balancing of connections across RAC nodes and yet maintain the sticky nature to the same RAC node for the all of the participants of a single global transaction. I have used Java examples wherever possible, but most of the programming constructs can be mapped to other sufficiently advanced programming languages.
An Oracle Real Application Cluster database is a clustered database. A cluster is a group of independent servers that cooperate as a single system. Clusters provide improved fault resilience and modular incremental system growth over single, symmetrical multiprocessor systems. Oracle’s RAC page has more information on RAC technology.
A global transaction is a mechanism that allows a set of programming tasks, potentially using more than one resource manager and potentially executing on multiple servers, to be treated as one logical unit. Once a process is in transaction mode, any service requests made to servers may be processed on behalf of the current transaction. The services that are called and join the transaction are referred to as transaction participants. The value returned by a participant may affect the outcome of the transaction. A global transaction may be composed of several local transactions, each accessing the same resource manager. The resource manager is responsible for performing concurrency control and atomicity of updates.
Problems with Global Transactions and Oracle RAC
Problems occur when connections participating in a distributed transaction are routed to different instances of an RAC cluster. A split transaction is a distributed transaction that spans more than one instance of an RAC database. This implies that different branches of the same distributed transaction are located on different instances of an RAC database. This could result in erroneous situations, as follows:
- During normal operation: Neither branch can see changes made by the other branch. This can cause row-level lock conflicts amongst these branches, leading to ORA-2049 errors (timeout waiting for distributed lock).
- During recovery operation: Failures can occur during two-phase commit (2PC). Sometimes 2PC requires its own connection to the database (e.g., an abort). In such cases, a 2PC operation may be attempted on a transaction branch at an instance where that branch does not exist, causing missing transaction errors (ORA-24756). This in turn leaves the branch hanging as an active branch to be cleaned up by the Process Monitor daemon (PMON). While the branch is active, it still holds row-level locks on all rows that it modified. Similarly, if the 2PC operation that failed is a commit, then in-doubt transactions can remain in the database. This can cause ORA-1591 errors when another transaction attempts to access data that has been modified by the in-doubt transaction. You can find more detailed explanation of this problem in Oracle’s “Best Practices for using XA with RAC” (PDF, 219 KB).
Recommended Approach
The following design forms the foundation of the recommended approach:
- Create multiple pools, one for each RAC node in the cluster. I call each pool a node pool. For example, if you have three RAC nodes, create three node pools. All popular pooling components and application servers support creating multiple pools.
- Disable server-side load balancing for each of the pools. Setting the
INSTANCE_NAME
attribute in your JDBC connect descriptor aliases will disable server-side load balancing. For example:jdbc:oracle:thin:@(DESCRIPTION= (ADDRESS_LIST= (ADDRESS=(PROTOCOL=tcp)(HOST=MYDBHOST1)(PORT=1522)) ) (CONNECT_DATA= (SERVICE_NAME=MYDB)(INSTANCE_NAME=MYDB1) ) )
If you miss the
INSTANCE_NAME
attribute in your JDBC connect descriptor, the Oracle TNS Listener could still redirect the connection request to some other instance, depending on the load on the instance in question. - Write a connection factory that does the following when a connection is requested:
- Checks if a node pool is already associated with the thread.
- If a node pool is not associated with the thread, randomly picks a pool from the available pools and associates the node pool with the thread.
- If a pool is associated with the thread, just reuses the pool.
- Depending on your application behavior (discussed later), disassociates the pool from the thread.
Multithreaded Application
In a multithreaded application, multiple threads are spawned to perform concurrent tasks. Usually global transactions have thread affinity; i.e., they are identified and associated by the threads in which they run. All connections participating in the same global transaction must be drawn from the pool or a data source in the same thread. You can maintain affinity to a node pool by associating the thread with the node pool. The connection factory responsible for providing the connection should then inspect the thread and reuse the node pool referenced in the thread space. The connection factory is allowed to randomly pick a pool only if there is no node pool associated with the thread. You can implement this in Java using the ThreadLocal
class. When the connection factory requests a connection, it inspects the thread space to check if a node pool is associated with the thread; if there is, it reuses it. Otherwise, it randomly picks one.
/*
* Method ConnectionFactory.getConnection
*/
Connection getConnection() {
// nodePoolTracker is a ThreadLocal object
// defined in the class definition
NodePoolIndentifier poolId =
(NodePoolIndentifier) nodePoolTracker.get();
if (poolId == null) {
// There is no node pool associated with
// this thread, randomly pick
// one and store it in the thread
NodePool pool = getLeastLoadedInstacePool();
nodePoolTracker.set(
new NodePoolIdentifier(pool.getId());
return pool.getConnection();
}
else {
// get a reference to the pool
NodePool pool = getPoolById(poolId);
return pool.getConnection();
}
}
The above method creates an affinity between the thread and the node pool. If your application spawns a new thread for every new task, you don’t need to manually clean up the thread space. However, for applications that have thread pools or reuse threads (like web servers, app servers, etc.), you need to manually clean up the thread space after each global transaction or task completion. This entirely depends on how you manage threads in your application. If you don’t clean up the thread space, the node pool will get associated with the thread for the complete lifespan of the thread, which may result in suboptimal load balancing across RAC instances.
Web Application
Modern web applications have a distinct advantage over standalone applications. They run in a managed environment which delegates most of the dirty work to the underlying application/web server. Here you can leverage an approach similar to standalone applications–using a thread pool–but by using HTTP filters, you can eliminate the problem with reusable threads by resetting the state of the thread in the HTTP filters.
You can do this using Java servlets as follows:
- Configure custom servlet filters to intercept requests and responses from your applications.
- Pick a node pool and set it in the
ThreadLocal
object in the in-filter before the request is sent to the actual request handler. - After the request handler processes the request (in the out filter), reset the thread state.
All of the connections for a particular request are directed to a single RAC instance. Alternatively, you can also use the request object to store the reference of the node pool rather then the ThreadLocal
object. The downside of this approach is that you need to pass the request object to all of the components that draw a connection to the database.
Known Issues
This approach assumes a specific pattern in the life span of global transactions that happens to work with most popular applications. Some places where this would not work include:
- Global transactions spanning multiple threads.
- Multithreaded application reusing threads: this is kind of tricky, though not very dangerous. In this case, the node pool (RAC node) gets tightly coupled to the thread for the complete lifespan of the thread. You can still work around this problem depending on your application. For example, in a web server, you can have the request filters reset the thread state after each request. You will have to figure out your application-specific logic to reset the thread state after the thread has processed each request.
- What happens if an RAC node goes down while the corresponding node pool is being used by some threads? I would still prefer to let the complete global transaction roll back and report the operation as a failed operation.
http://www.oracle.com/technology/products/database/clustering/pdf/TWP_RAC10gR2.pdf
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/ThreadLocal.html
http://www.oracle.com/technology/products/database/clustering/pdf/bestpracticesforxaandrac.pdf
http://www.oracle.com/database/rac_home.html
http://smartpool.sourceforge.net/
package pooling; import java.sql.*; import java.util.HashMap; /** * This class demonstrates the use of ThreadLocal to pin a particular connection to a RAC node. * */ public class RACConnectionFactory { /** * Instance connect descriptors to the database */ private static final String url1 = "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=MYHOST1)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DEV)(INSTANCE_NAME=DEV1)))"; private static final String url2 = "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=MYHOST2)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DEV)(INSTANCE_NAME=DEV2)))"; /** * Store the list of connections, ideally should store references to node pools */ private HashMap poolMap = new HashMap(); /** * Simple algo to keep track of which node is least used */ private String lastNodePoolUsed = null; /** * Singleton pattern */ private static RACConnectionFactory singletonFactory = null; private static boolean isDebug = true; /* * Our Golden Horse, Thread local. */ private static ThreadLocal nodePoolTracker = new ThreadLocal(); /** * Easy method for logging */ private static void log(String log) { if (isDebug) { System.out.println(log); System.out.flush(); } } /** * Gets the instance name from the V$instance view, the connected user should have the required prevelige * @param conn * @return * @throws Exception */ private String getInstanceName(Connection conn) throws Exception { PreparedStatement stmt = null; ResultSet rs = null; try { stmt = conn.prepareStatement("select instance_name from v$instance"); rs = stmt.executeQuery(); if (rs.next()) { return rs.getString(1); } else { throw new Exception("No Rows found, should never happen"); } } finally { stmt.close(); rs.close(); } } /** * Gets a connection from the database * @param url * @return * @throws Exception */ private Connection getRawConnection(String url) throws Exception { Class.forName("oracle.jdbc.driver.OracleDriver"); return DriverManager.getConnection(url, "sachin", "sachin"); } /** * Just returns the last not used pool * @return */ private String getLeastLoadedPool() { if (lastNodePoolUsed == null) { lastNodePoolUsed = "1"; return "1"; } if (lastNodePoolUsed.equals("1")) { lastNodePoolUsed = "2"; return "2"; } else { lastNodePoolUsed = "1"; return "1"; } } private Connection getConnection(String poolId) { return (Connection)poolMap.get(poolId); } private RACConnectionFactory() throws Exception { Connection conn = getRawConnection(url1); // Validate that this connection is indeed gone to DEV1 if (!(getInstanceName(conn).equalsIgnoreCase("DEV1"))) { throw new Exception("This was supposed to hit DEV1, actual: " + getInstanceName(conn)); } // Not creating any pools, assume pools are of size 1 ;) poolMap.put("1", conn); conn = getRawConnection(url2); // Validate that this connection is indeed gone to DEV2 if (!(getInstanceName(conn).equalsIgnoreCase("DEV2"))) { throw new Exception("This was supposed to hit DEV2, actual: " + getInstanceName(conn)); } poolMap.put("2", conn); } /** * This method does all the work of getting the connection, sticking the poolid in to the thread .... * @return * @throws Exception */ public synchronized static Connection getConnection() throws Exception { if (singletonFactory == null) singletonFactory = new RACConnectionFactory(); String poolId = (String) nodePoolTracker.get(); if (poolId == null) { // one and store it in the thread poolId = singletonFactory.getLeastLoadedPool(); nodePoolTracker.set(poolId); log("No Pool Associated:" + Thread.currentThread().getId() + ", adding: " + poolId); return singletonFactory.getConnection(poolId); } else { log("Pool Associated:" + Thread.currentThread().getId() + ", " + poolId); return singletonFactory.getConnection(poolId); } } public static void main (String args[] ) throws Exception { log("main thread is: " + Thread.currentThread().getId()); TaskProcessor taskProcessor[] = new TaskProcessor[5]; Thread threads[] = new Thread[taskProcessor.length]; for (int i=0; i<taskProcessor.length; i++) { taskProcessor[i] = new TaskProcessor(i); threads[i] = new Thread(taskProcessor[i]); threads[i].start(); } for (int i=0; i<taskProcessor.length; i++) { if (threads[i].isAlive()) { threads[i].join(); } } } /** * Thread to run many concurrent connections */ private static class TaskProcessor implements Runnable { private int number = 0; public TaskProcessor(int number) { this.number = number; } public void run() { try { log("Thread started: " + Thread.currentThread().getId()); // Get a connection Connection conn = RACConnectionFactory.getConnection(); // Sleep and yield so that other threads can run Thread.yield(); //Thread.sleep(1000); Thread.yield(); Connection conn1 = RACConnectionFactory.getConnection(); log("Thread " + Thread.currentThread().getId() + ": First Con: " + conn); log("Thread " + Thread.currentThread().getId() + ": Second Con: " + conn1); //ideally we should be checking if both the connections have come from the same pool, but since this // is just a mockup and we are running a single connection pool, we will directly check if the connections are same references. if (conn1 != conn) throw new Exception("Connections dont match: " + conn + ": " + conn1); log("Thread " + Thread.currentThread().getId() + " over"); } catch (Exception e) { throw new RuntimeException(e); } } } }
Assuming a 2 node Oracle RAC cluster , Give the connect url as jdbc:oracle:thin:@ (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = host1-vip)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = host2-vip)(PORT = 1521)) (LOAD_BALANCE = yes) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = db-service) (FAILOVER_MODE =(TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5)) ) ) where host1-vip is virtual IP of node1 host2-vip is virtual IP of node2 db_service is the Service name of the database Oracle Database Programming using Java and Web Services http://db360.blogspot.com/2007/01/is-your-java-application-failoverproof.html http://www.oracle.com/technology/tech/java/sqlj_jdbc/index.html