This repository has been archived by the owner on Oct 10, 2023. It is now read-only.

ProducerMetricsInterceptor fails with IllegalStateException and/or IllegalAccessException #194

Open
gmcrobert opened this issue Dec 7, 2021 · 0 comments
Labels
bug Something isn't working

Comments

@gmcrobert
Member

gmcrobert commented Dec 7, 2021

Issue Description

When producing messages, the following exception is sometimes seen in the Kafka pod logs:

java.lang.IllegalStateException: Queue full
	at java.base/java.util.AbstractQueue.add(AbstractQueue.java:98)
	at java.base/java.util.concurrent.ArrayBlockingQueue.add(ArrayBlockingQueue.java:326)
	at com.ibm.eventstreams.interceptors.metrics.ProducerMetricsQueue.add(ProducerMetricsQueue.java:31)
	at com.ibm.eventstreams.interceptors.metrics.ProducerMetricsInterceptor.intercept(ProducerMetricsInterceptor.java:103)
	at com.ibm.eventstreams.interceptors.metrics.ProducerMetricsInterceptor.intercept(ProducerMetricsInterceptor.java:30)
	at com.ibm.eventstreams.interceptors.framework.FlowRequestResponse.interceptResponse(FlowRequestResponse.java:125)
	at com.ibm.eventstreams.interceptors.framework.InterceptorChain.lambda$new$4(InterceptorChain.java:134)
	at com.ibm.eventstreams.interceptors.framework.InterceptorChain$$Lambda$665/0x0000000099a13e60.handle(Unknown Source)
	at com.ibm.eventstreams.interceptors.framework.InterceptorChain.lambda$intercept$2(InterceptorChain.java:111)
	at com.ibm.eventstreams.interceptors.framework.InterceptorChain$$Lambda$1767/0x000000004801a980.accept(Unknown Source)

This problem occurs when the Java thread that removes metrics from the ArrayBlockingQueue terminates after receiving an exception. Once that thread has terminated, nothing drains the queue, so it fills up and every subsequent produce request receives the exception above. The exception significantly degrades produce performance and prevents any further producer metrics from being captured.
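To illustrate the failure mode, here is a minimal sketch and not the interceptor's actual code. The class MetricsQueueSketch and its methods are hypothetical names; only ArrayBlockingQueue and its add/offer/take calls come from the JDK. It shows that add() throws "Queue full" once the bounded queue has no consumer, that offer() reports the same condition without throwing, and how a drain loop can keep running when handling a single entry fails.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical stand-in for a bounded metrics queue; not the Event Streams class.
public class MetricsQueueSketch {
    private final BlockingQueue<String> queue;
    private final AtomicInteger dropped = new AtomicInteger();

    public MetricsQueueSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Mirrors the reported behaviour: add() throws IllegalStateException when full.
    public void addStrict(String metric) {
        queue.add(metric);
    }

    // Non-throwing variant: offer() returns false when full, so the metric is
    // counted as dropped and the caller's produce path is not interrupted.
    public boolean addLenient(String metric) {
        boolean accepted = queue.offer(metric);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    public int droppedCount() {
        return dropped.get();
    }

    public int size() {
        return queue.size();
    }

    // Drain loop that survives a failure while handling one entry. Only an
    // interrupt stops the thread, so the queue keeps being emptied.
    public Thread startDrainThread(Consumer<String> handler) {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    String metric = queue.take();
                    try {
                        handler.accept(metric);
                    } catch (RuntimeException e) {
                        // Assumption: skipping the bad entry is acceptable here.
                        System.err.println("Skipping metric: " + e);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "metrics-drain");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        MetricsQueueSketch q = new MetricsQueueSketch(2);
        q.addStrict("m1");
        q.addStrict("m2");
        try {
            q.addStrict("m3"); // no consumer is running, so the queue is full
        } catch (IllegalStateException e) {
            System.out.println("add() failed: " + e.getMessage());
        }
        System.out.println("offer() accepted: " + q.addLenient("m3"));
    }
}
```

Whether dropping metrics is preferable to failing the produce path is a design choice for the maintainers; the sketch only shows that the two queue calls behave differently when the consumer is gone.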

The problem is also sometimes seen together with an IllegalAccessException that is thrown while trying to set a final field through reflection. This happens when the producer acks setting is 0 or 1. Once the IllegalAccessException has occurred, it is followed by the IllegalStateExceptions described above.
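The issue does not say which field is involved or how the interceptor sets it, so the following is only a sketch of one well-known way this kind of exception arises in Java. The class FinalFieldReflectionSketch and the field LIMIT are hypothetical. Writing to a static final field through java.lang.reflect.Field.set is rejected with IllegalAccessException even after setAccessible(true).

```java
import java.lang.reflect.Field;

// Hypothetical example; the field and classes are not from the interceptor.
public class FinalFieldReflectionSketch {

    static class Holder {
        static final Integer LIMIT = 10;
    }

    // Attempts to overwrite a static final field and reports what happened.
    public static String trySetFinalField() {
        try {
            Field field = Holder.class.getDeclaredField("LIMIT");
            field.setAccessible(true);
            // Field.set refuses to write a static final field.
            field.set(null, 20);
            return "set succeeded";
        } catch (IllegalAccessException e) {
            return "IllegalAccessException";
        } catch (NoSuchFieldException e) {
            return "NoSuchFieldException";
        }
    }

    public static void main(String[] args) {
        System.out.println(trySetFinalField());
    }
}
```

If an exception like this escapes from the thread that drains the metrics queue, that thread would end, which is consistent with the IllegalStateExceptions that follow.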

Once the queue is full, the only way to restore normal service is to delete the Kafka pods that are logging the exceptions and allow fresh pods to be started.

Environment

  • IBM Event Streams Version: 10.4.0
  • Operating system: Any supported OCP release
@gmcrobert gmcrobert added the bug Something isn't working label Dec 7, 2021