Clients wait indefinitely when pool exhausted #638

mohitanchlia · 2013-10-28T17:21:51Z

Recently we saw an issue where all our threads were waiting in waitForConnection and never came out of that state.

    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)
    at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:140)
    at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:108)

It appears that there might be a bug in Hector that is triggered under a very specific condition:

During high load, when the pool is fully utilized.
During this time, if there is a network hiccup or a communication issue with the node, Hector is not able to add the HClient back to the pool and throws a runtime exception. However, borrowClient increments the active client count prematurely, even though cassandraClient may be null.

HClient cassandraClient = availableClientQueue.poll();
int currentActiveClients = activeClientsCount.incrementAndGet();

The above logic then puts it in this condition:

if (currentActiveClients <= cassandraHost.getMaxActive()) {
  cassandraClient = createClient();
} else {
  // We can't grow so let's wait for a connection to become available.
  cassandraClient = waitForConnection();
}

Eventually every caller blocks, because there are no elements in availableClientQueue.

I think the fix is to increment only after cassandraClient != null.

Let me know if this looks OK.
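
To make the difference between the two orderings concrete, here is a minimal, single-threaded sketch. It is a toy model of only the two snippets quoted above, not Hector's ConcurrentHClientPool: the class BorrowCountingSketch, its methods, and the createFails flag are all hypothetical names, and any rollback or cleanup the real borrowClient performs is not modeled.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy, single-threaded model of the quoted counting logic; NOT Hector code. */
final class BorrowCountingSketch {
    private final Queue<String> available = new ArrayDeque<>(); // idle clients
    private final int maxActive;
    private int activeCount = 0;

    BorrowCountingSketch(int maxActive) {
        this.maxActive = maxActive;
    }

    /** Quoted order: count first, then try to obtain a client. */
    String borrowCountingFirst(boolean createFails) {
        String client = available.poll();
        int current = ++activeCount;
        if (client == null) {
            if (current > maxActive) {
                activeCount--; // the model does not block, so undo and report "would wait"
                return null;   // the real pool would call waitForConnection() here
            }
            client = create(createFails); // a throw here leaves activeCount inflated
        }
        return client;
    }

    /** Proposed order: count only once a client is actually in hand. */
    String borrowCountingLast(boolean createFails) {
        String client = available.poll();
        if (client == null) {
            if (activeCount + 1 > maxActive) {
                return null; // would wait; the count was never touched
            }
            client = create(createFails); // a throw here leaves activeCount unchanged
        }
        activeCount++;
        return client;
    }

    /** Hypothetical stand-in for createClient(); fails on demand. */
    private String create(boolean fails) {
        if (fails) {
            throw new IllegalStateException("simulated connection failure");
        }
        return "client";
    }

    int activeCount() {
        return activeCount;
    }

    public static void main(String[] args) {
        BorrowCountingSketch pool = new BorrowCountingSketch(2);
        for (int i = 0; i < 2; i++) {
            try {
                pool.borrowCountingFirst(true);
            } catch (IllegalStateException e) {
                // ignored: we only care about the counter afterwards
            }
        }
        System.out.println("count after two failed borrows: " + pool.activeCount());
        System.out.println("next borrow would wait: " + (pool.borrowCountingFirst(false) == null));
    }
}
```

In this model, two failed creations with the count-first order leave the counter at the limit while no client exists, so the next borrow takes the wait branch with nothing to wait for. With the count-last order the same failures leave the counter at zero. The sketch ignores concurrency entirely; a real pool would also have to keep the poll and the count consistent across threads.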


mohitanchlia · 2013-10-29T16:43:42Z

It appears that

int currentActiveClients = activeClientsCount.incrementAndGet();

should really be

int currentActiveClients = availableClientQueue.size();

and then this condition makes sense:

  if ( cassandraClient == null ) {

    if (currentActiveClients <= cassandraHost.getMaxActive()) {
      cassandraClient = createClient();
    } else {
      // We can't grow so let's wait for a connection to become available.
      cassandraClient = waitForConnection();
    }

  }

Currently we seem to be over-incrementing this count, with the possibility of ending up with a bad pool and not noticing it until errors start happening.
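
When weighing this change, it may help to check what size() reports on a queue of idle clients. The sketch below uses a plain ArrayBlockingQueue of strings as a stand-in for availableClientQueue; QueueSizeSketch and idleAndLent are hypothetical names, and nothing here is Hector code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Stand-in for a queue of idle clients; NOT Hector code. */
final class QueueSizeSketch {
    /**
     * Fills a queue with `total` idle clients, lends out `borrowed` of them,
     * and returns {idle, lentOut}. `total` must be at least 1.
     */
    static int[] idleAndLent(int total, int borrowed) {
        BlockingQueue<String> idle = new ArrayBlockingQueue<>(total);
        for (int i = 0; i < total; i++) {
            idle.offer("client-" + i);
        }
        int lentOut = 0;
        for (int i = 0; i < borrowed; i++) {
            if (idle.poll() != null) {
                lentOut++; // a client left the queue and is now in use
            }
        }
        // size() counts what is still idle, not what has been lent out
        return new int[] { idle.size(), lentOut };
    }

    public static void main(String[] args) {
        int[] result = idleAndLent(3, 3);
        System.out.println("idle=" + result[0] + ", lentOut=" + result[1]);
    }
}
```

Under these assumptions, size() tracks idle entries, so it falls as clients are lent out. Whether that is the right quantity to compare against getMaxActive() depends on what the check is meant to limit.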

mcfongtw · 2017-03-31T05:56:09Z

We tried the change suggested by @mohitanchlia; however, we ran into a different problem: heavy lock contention. availableClientQueue is an ArrayBlockingQueue, which internally guards its critical section with a ReentrantLock. Under heavy database traffic, this lock can degrade the overall query performance seen by a client. We ended up using the socket timeout mechanism to let a thread exit when it has waited too long.
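
The general idea of bounding the wait can be sketched with the timed poll that already appears in the stack trace at the top of this issue. This is not the socket timeout configuration we used and not Hector's code: BoundedWaitSketch, takeOrTimeout, and maxWaitMillis are hypothetical names, and the queue of strings stands in for a queue of clients.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrates a bounded wait on a blocking queue; NOT Hector code. */
final class BoundedWaitSketch {
    /** Waits at most maxWaitMillis for an idle client instead of blocking indefinitely. */
    static <T> T takeOrTimeout(BlockingQueue<T> idle, long maxWaitMillis)
            throws InterruptedException, TimeoutException {
        // poll(timeout, unit) returns null if nothing arrives within the timeout
        T client = idle.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
        if (client == null) {
            throw new TimeoutException("no client available after " + maxWaitMillis + " ms");
        }
        return client;
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<String> idle = new ArrayBlockingQueue<>(1);
        idle.offer("client-1");
        System.out.println("borrowed: " + takeOrTimeout(idle, 50));
        try {
            takeOrTimeout(idle, 50); // queue is now empty, so this gives up after 50 ms
        } catch (TimeoutException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

A bounded wait does not fix a miscounted pool; it only lets a stuck thread fail with an error instead of parking forever, which is the behavior we were after.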
