
Add concurrency condition to the soak test using exisiting blocking api #11658

Open · wants to merge 6 commits into master

Conversation


@zbilun zbilun commented Oct 30, 2024


linux-foundation-easycla bot commented Oct 30, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@zbilun zbilun marked this pull request as draft October 30, 2024 16:23
@zbilun zbilun marked this pull request as ready for review October 30, 2024 16:24
@zbilun zbilun marked this pull request as draft October 30, 2024 16:29
@zbilun zbilun marked this pull request as ready for review October 30, 2024 16:49
@zbilun zbilun closed this Oct 30, 2024
@zbilun zbilun reopened this Oct 30, 2024
@zbilun zbilun changed the title Add concurrency condition to the soak test using exisiting blocking api Add concurrency condition to the soak test using exisiting blocking api - PTAL @apolcyn Oct 30, 2024
@zbilun zbilun changed the title Add concurrency condition to the soak test using exisiting blocking api - PTAL @apolcyn Add concurrency condition to the soak test using exisiting blocking api Oct 30, 2024
Histogram latencies = new Histogram(4 /* number of significant value digits */);
long startNs = System.nanoTime();
ManagedChannel soakChannel = createChannel();
TestServiceGrpc.TestServiceBlockingStub soakStub = TestServiceGrpc
.newBlockingStub(soakChannel)
.withInterceptors(recordClientCallInterceptor(clientCallCapture));
List<Thread> threads = new ArrayList<>();
// Only allow up to 10 threads to run concurrently
Semaphore semaphore = new Semaphore(10);
for (int i = 0; i < soakIterations; i++) {
Contributor:

Rather than hardcoding 10 let's make parallelism configurable with a flag.

We can add a new flag, something like --soak_num_threads, where the value of the flag determines the number of threads that independently run the soak test loop.

The flag can default to 1.
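For context, a minimal sketch of how such a flag could be wired in, assuming the interop client's usual --key=value argument style; the field name soakNumThreads and the helper parseSoakArgs are illustrative, not the actual client code:

// Hypothetical wiring for a --soak_num_threads flag; it defaults to 1 so the
// existing single-threaded soak behavior is preserved.
private int soakNumThreads = 1;

private void parseSoakArgs(String[] args) {
  for (String arg : args) {
    if (arg.startsWith("--soak_num_threads=")) {
      soakNumThreads = Integer.parseInt(arg.substring("--soak_num_threads=".length()));
    }
  }
}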

Author:

  • I have updated the code based on your comments. Currently, I am implementing a uniform distribution of the soak iterations across the threads. However, there are several other ways to distribute the iterations dynamically, such as round robin or load balancing; please let me know if you would prefer a different implementation.
  • Also, I tried to run "tools/distrib/sanitize.sh" to format the code, but I could not find a sanitize.sh file under the root directory. Any hint?
  • I may also need to modify the related .py files; I will update those files and submit a change list (CL) for review afterward.

break;
}

case RPC_SOAK_CONCURRENT: {
Contributor:

Rather than introduce new test cases, let's reuse the existing rpc_soak and channel_soak test cases, but add one more parameter for number of threads (passing the value in from the aforementioned --soak_num_threads flag)
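Sketched against the existing dispatch, that might look like the following; the boolean parameter mirrors the current resetChannelPerIteration distinction, and the parameter order is hypothetical:

// Sketch: the existing soak cases forward the thread count instead of a
// dedicated RPC_SOAK_CONCURRENT case owning the concurrency logic.
case RPC_SOAK:
  performSoakTest(false /* resetChannelPerIteration */, soakNumThreads);
  break;
case CHANNEL_SOAK:
  performSoakTest(true /* resetChannelPerIteration */, soakNumThreads);
  break;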

final TestServiceGrpc.TestServiceBlockingStub currSoakStub = soakStub;
if (concurrent) {
semaphore.acquire();
// Create a new thread for each soak iteration
Contributor:

Rather than create a new thread per iteration, I think it would be better to create --num_threads threads at the beginning of the soak test, and then have each thread independently run the soak test loop (e.g. run --soak_iterations RPCs).
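A sketch of that structure, with runOneSoakIteration standing in as a hypothetical helper for the body of a single iteration:

// Sketch: start all workers up front; each independently runs
// soakIterations RPCs, and the main thread joins them at the end.
void runSoakWorkers(int soakNumThreads, int soakIterations) throws InterruptedException {
  Thread[] workers = new Thread[soakNumThreads];
  for (int t = 0; t < soakNumThreads; t++) {
    workers[t] = new Thread(() -> {
      for (int i = 0; i < soakIterations; i++) {
        runOneSoakIteration();  // hypothetical: performs one soak RPC
      }
    });
    workers[t].start();
  }
  for (Thread worker : workers) {
    worker.join();  // wait for every worker to finish its loop
  }
}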

TimeUnit.NANOSECONDS.sleep(remainingNs);
}
Thread[] threads = new Thread[numThreads];
int soakIterationsPerThread = (int) Math.ceil((double) soakIterations / numThreads);
Contributor:

Let's check these flags and fail at startup (before running the test), if --soak_iterations is not evenly divisible by --num_threads.

Then, let's just have each thread independently run soakIterationsPerThread RPCs (no need to calculate startIteration and endIteration)
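Roughly, assuming the flag names from this thread:

// Fail fast at startup if the iterations cannot be split evenly.
if (soakIterations % soakNumThreads != 0) {
  throw new IllegalArgumentException(
      "--soak_iterations must be evenly divisible by --soak_num_threads");
}
int soakIterationsPerThread = soakIterations / soakNumThreads;
// Each thread then runs exactly soakIterationsPerThread iterations, with no
// startIteration/endIteration arithmetic.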

result = performOneSoakIteration(currentStub, soakRequestSize, soakResponseSize);
} catch (Exception e) {
synchronized (this) {
totalFailures.incrementAndGet();
Contributor:

we shouldn't need to examine totalFailures until the thread is done and joined.

We can have a list of per-thread counters that each thread tallies up while running. Then after joining all threads, the main thread can examine all of those counters and create a summary.
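A sketch of that pattern; Thread.join() gives the needed happens-before edge, so plain (non-atomic) per-thread counters are safe to read afterwards. runOneSoakIteration is a hypothetical helper returning whether the RPC succeeded:

int[] perThreadFailures = new int[soakNumThreads];  // one slot per worker
Thread[] workers = new Thread[soakNumThreads];
for (int t = 0; t < soakNumThreads; t++) {
  final int threadInd = t;
  workers[t] = new Thread(() -> {
    for (int i = 0; i < soakIterationsPerThread; i++) {
      if (!runOneSoakIteration()) {
        perThreadFailures[threadInd]++;  // only this worker writes this slot
      }
    }
  });
  workers[t].start();
}
for (Thread worker : workers) {
  try {
    worker.join();
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();  // restore the interrupt flag
  }
}
int totalFailures = 0;  // summarized only after all joins
for (int failures : perThreadFailures) {
  totalFailures += failures;
}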

final int startIteration = threadInd * soakIterationsPerThread;
final int endIteration = Math.min(startIteration + soakIterationsPerThread, soakIterations);
threads[threadInd] = new Thread(() -> {
ManagedChannel currentChannel = soakChannel;
Contributor:

nit: let's put the thread body in its own function

} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
currentChannel = createChannel();
Contributor:

nit: the way we're resetting the channel here might be problematic because, for example, on the first iteration, all threads will shut down the same channel.

I think we need to have all threads start out initially with an independent channel.
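For example, creating the channel inside the thread body gives each worker its own channel from the start (createChannel as in the surrounding code):

threads[threadInd] = new Thread(() -> {
  // Each worker owns its channel, so no thread ever shuts down a channel
  // that another thread is still using.
  ManagedChannel currentChannel = createChannel();
  // ... soak loop; channel_soak may replace currentChannel per iteration ...
});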

StringBuilder logStr = new StringBuilder(
String.format(
Locale.US,
"soak iteration: %d elapsed_ms: %d peer: %s server_uri: %s",
Contributor:

We should include thread ID in this log

Let's do: "thread id: %d soak iteration: %d ...."

We can get the thread ID with Thread.currentThread().getId();

Side note: because we're adding flags and changing the logging format of this test, we'll probably need to send a separate pull request to update the interop test description document as well (https://github.com/grpc/grpc/blob/master/doc/interop-test-descriptions.md#rpc_soak and https://github.com/grpc/grpc/blob/master/doc/interop-test-descriptions.md#channel_soak should mention the new logging format as well as the new flags)
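A sketch of the adjusted log line, assuming the iteration counter, elapsed time, peer, and server URI variables from the surrounding loop and a java.util.logging Logger:

// Requires java.util.Locale and java.util.logging.Level imports.
long threadId = Thread.currentThread().getId();
logger.log(Level.INFO, String.format(
    Locale.US,
    "thread id: %d soak iteration: %d elapsed_ms: %d peer: %s server_uri: %s",
    threadId, i, elapsedMs, peer, serverUri));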

* Runs large unary RPCs in a loop with configurable failure thresholds
* and channel creation behavior.
*/
public void performSoakTest(
Contributor:

Rather than take a resetChannelPerIteration parameter, I think we can clean things up by taking a function parameter here instead.

The signature of the function param can be like this: ManagedChannel maybeCreateNewChannel(ManagedChannel channel)

In the rpc soak test, we can provide a callback that always returns the same channel.

In the channel soak test, we can provide a callback that closes the passed-in channel and returns a new one.
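A sketch of that callback shape, using a small functional interface (java.util.function.UnaryOperator<ManagedChannel> would work equally well); the names are illustrative:

@FunctionalInterface
interface MaybeCreateNewChannel {
  ManagedChannel apply(ManagedChannel channel);
}

// rpc_soak: always keep the same channel.
MaybeCreateNewChannel rpcSoakChannelFn = channel -> channel;

// channel_soak: shut down the passed-in channel and return a fresh one.
MaybeCreateNewChannel channelSoakChannelFn = channel -> {
  channel.shutdownNow();
  return createChannel();
};

performSoakTest would then call channel = fn.apply(channel) at the top of each iteration, keeping the reset policy out of the loop body.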

startNs,
resetChannelPerIteration,
minTimeMsBetweenRpcs,
threadFailures[currentThreadInd],
Contributor:

There are several pieces of data that are output from each of these threads:

  • threadFailures
  • iterationsDone
  • latencies

Rather than pass each of these as separate atomic parameters, I think it would be cleaner to pass a single parameter to each of the threads. We can create a simple wrapper object to wrap each of threadFailures, iterationsDone, and latencies.

In the main thread here, we can create a list of such wrapper objects.

That way, each thread has a separate copy of mutable output data, and there is no need to synchronize anything.

After all threads are joined, the main thread can merge results across the list.
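A sketch of such a wrapper, assuming HdrHistogram's Histogram (whose add() merges another histogram's recorded values); the class and field names are illustrative:

static final class ThreadResults {
  int iterationsDone;
  int threadFailures;
  final Histogram latencies = new Histogram(4 /* significant value digits */);
}

List<ThreadResults> results = new ArrayList<>();
for (int t = 0; t < soakNumThreads; t++) {
  results.add(new ThreadResults());  // one private copy per worker
}
// Worker t mutates only results.get(t), so nothing needs synchronization.
// After all workers are joined, the main thread merges:
int totalIterations = 0;
int totalFailures = 0;
Histogram merged = new Histogram(4 /* significant value digits */);
for (ThreadResults r : results) {
  totalIterations += r.iterationsDone;
  totalFailures += r.threadFailures;
  merged.add(r.latencies);  // Histogram.add merges recorded values
}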
