Hey all,
I've heard various complaints that build times in trunk are taking too long, some as much as 8 hours (the timeout), and this is keeping us from meeting the code freeze deadline.
I took it upon myself to gather some data in Gradle Enterprise to see if there are any outlier tests causing this slowness. Turns out there are, in this particular build - https://ge.apache.org/s/un2hv7n6j374k/ - which took 10 hours and 29 minutes in total.
Here are the top offending tests:
- ✅ - PASS
- 🧪 - FLAKY
- ❌ - FAILED
kafka.api.TransactionsTest#testReadCommittedConsumerShouldNotSeeUndecidedData(String)
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:core:test/details/kafka.api.TransactionsTest/testReadCommittedConsumerShouldNotSeeUndecidedData(String)%5B1%5D?top-execution=1
- logs: https://gist.github.com/stanislavkozlovski/f84f045510fbf1dbd3cd5b321277406a
- error:
org.opentest4j.AssertionFailedError: Can't create topic topic1
kafka.api.SaslScramSslEndToEndAuthorizationTest#testNoProduceWithDescribeAcl(String, boolean) - 2h 12m ❌
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:core:test/details/kafka.api.SaslScramSslEndToEndAuthorizationTest/testNoProduceWithDescribeAcl(String%2C%20boolean)%5B1%5D?top-execution=1
- logs: https://gist.github.com/stanislavkozlovski/6332957c19e6692e01cce4802af51d64
- error:
java.util.concurrent.TimeoutException: Timed out while waiting for the broker metadata publishers to be installed
kafka.server.MetadataRequestTest#testRack(String)
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:core:test/details/kafka.server.MetadataRequestTest/testRack(String)%5B1%5D?top-execution=1
- logs: https://gist.github.com/stanislavkozlovski/cfba15bec612eb552a88c3e35866adbd
- error:
kafka.zookeeper.ZooKeeperClientExpiredException: Session expired either before or while waiting for connection
kafka.api.SaslOAuthBearerSslEndToEndAuthorizationTest#testProduceConsumeWithWildcardAcls(String) - 1h 37m ❌
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:core:test/details/kafka.api.SaslOAuthBearerSslEndToEndAuthorizationTest/testProduceConsumeWithWildcardAcls(String)%5B2%5D
- logs: https://gist.github.com/stanislavkozlovski/6d589f915ce47c2ddb22ef494cc6aa90
- error:
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
kafka.api.PlaintextEndToEndAuthorizationTest#testProduceConsumeViaAssign(String)
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:core:test/details/kafka.api.PlaintextEndToEndAuthorizationTest/testProduceConsumeViaAssign(String)%5B1%5D?top-execution=1
- logs: https://gist.github.com/stanislavkozlovski/39d39438fb8859ec915bf5fee5b3b8d7
- error:
java.util.concurrent.TimeoutException: testProduceConsumeViaAssign(java.lang.String) timed out after 60 seconds
kafka.admin.ResetConsumerGroupOffsetTest#testResetOffsetsExistingTopicAllGroups()
- error:
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:core:test/details/kafka.admin.ResetConsumerGroupOffsetTest/testResetOffsetsExistingTopicAllGroups()?top-execution=1
- logs: https://gist.github.com/stanislavkozlovski/55d4f6057f518412c466a31b114f8463
org.apache.kafka.tiered.storage.integration.TransactionsWithTieredStoreTest#testFencingOnSend(String) - 2h 59m 🧪
- error:
Timed out while waiting for the controller metadata publishers to be installed
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:storage:test/details/org.apache.kafka.tiered.storage.integration.TransactionsWithTieredStoreTest/testFencingOnSend(String)%5B2%5D?top-execution=1
- logs: https://gist.github.com/stanislavkozlovski/c323e681229b84f80bca68f97f21daa7
org.apache.kafka.tiered.storage.integration.TransactionsWithTieredStoreTest#testFailureToFenceEpoch(String)
- error:
java.util.concurrent.TimeoutException: Future timed out after 5 minutes
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:storage:test/details/org.apache.kafka.tiered.storage.integration.TransactionsWithTieredStoreTest/testFailureToFenceEpoch(String)%5B1%5D?top-execution=1
- logs: https://gist.github.com/stanislavkozlovski/92f3e82b51f7ce43520c35480dc54f47
org.apache.kafka.tiered.storage.integration.DeleteSegmentsByRetentionSizeTest#executeTieredStorageTest(String) - 1h 38m 🧪
- error:
java.lang.AssertionError: Failed to close the tear down the test harness
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:storage:test/details/org.apache.kafka.tiered.storage.integration.DeleteSegmentsByRetentionSizeTest/executeTieredStorageTest(String)%5B2%5D?top-execution=1
- logs: https://gist.github.com/stanislavkozlovski/5919238e9c6e6329f1604023fa10771f
org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest#shouldAggregateWindowedWithNoGrace[ON_WINDOW_UPDATE_false] - 1h 2m ❌
- error:
org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:streams:test/details/org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest/shouldAggregateWindowedWithNoGrace%5BON_WINDOW_UPDATE_false%5D?top-execution=1
- logs: https://gist.github.com/stanislavkozlovski/e525b2aefd2a62e72e378e1663a84ec6
org.apache.kafka.clients.consumer.internals.OffsetsRequestManagerTest#testRequestFailsWithRetriableError_RetrySucceeds(Errors)[6] - 40m 58s ✅
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:clients:test/details/org.apache.kafka.clients.consumer.internals.OffsetsRequestManagerTest/testRequestFailsWithRetriableError_RetrySucceeds(Errors)%5B6%5D
org.apache.kafka.clients.consumer.internals.OffsetsRequestManagerTest#testRequestFailsWithRetriableError_RetrySucceeds(Errors)[5] - 37m 49s ✅
- link: https://ge.apache.org/s/un2hv7n6j374k/tests/task/:clients:test/details/org.apache.kafka.clients.consumer.internals.OffsetsRequestManagerTest/testRequestFailsWithRetriableError_RetrySucceeds(Errors)%5B5%5D
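If anyone wants to rank these locally, here is a rough Python sketch. It only covers the tests whose durations are listed above, the shortened test labels are my own, and the helper names (parse_duration, rank_by_duration) are made up for this example. It does not talk to Gradle Enterprise at all; the durations are typed in by hand.

```python
import re

# Durations copied by hand from the list above. The keys are shortened
# labels chosen for readability, not the fully qualified test names.
SLOW_TESTS = {
    "SaslScramSslEndToEndAuthorizationTest#testNoProduceWithDescribeAcl": "2h 12m",
    "SaslOAuthBearerSslEndToEndAuthorizationTest#testProduceConsumeWithWildcardAcls": "1h 37m",
    "TransactionsWithTieredStoreTest#testFencingOnSend": "2h 59m",
    "DeleteSegmentsByRetentionSizeTest#executeTieredStorageTest": "1h 38m",
    "TimeWindowedKStreamIntegrationTest#shouldAggregateWindowedWithNoGrace": "1h 2m",
    "OffsetsRequestManagerTest#testRequestFailsWithRetriableError_RetrySucceeds[6]": "40m 58s",
    "OffsetsRequestManagerTest#testRequestFailsWithRetriableError_RetrySucceeds[5]": "37m 49s",
}

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert a string such as '2h 12m' or '40m 58s' into seconds."""
    parts = re.findall(r"(\d+)\s*([hms])", text)
    if not parts:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(int(amount) * UNIT_SECONDS[unit] for amount, unit in parts)


def rank_by_duration(tests):
    """Return (label, seconds) pairs sorted from slowest to fastest."""
    pairs = [(label, parse_duration(d)) for label, d in tests.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for label, seconds in rank_by_duration(SLOW_TESTS):
        print(f"{seconds / 60:7.1f} min  {label}")
```

Keeping the durations as the original strings makes it easy to paste in more rows from the build scan's test view as they turn up.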
Even some of the test harnesses seem to take a long time just to run. As a clear example, each individual test under shell:test and raft:test takes a minuscule amount of time, yet the overall test task takes 2 hours.