fix: 4 concurrency & null-safety bugs from code review#531
Conversation
CATCH_ANY_INPUT_AS_PROPERTY handler dereferences getLatestData(input:initial) without null check. When a client sends an empty/whitespace-only message, Conversation.storeUserInputInMemory skips storing input:initial, causing NPE that kills the pipeline and puts the conversation in ERROR state. Added null guards for both initialInputData and initialInput. 3 new regression tests cover the blank input scenarios.
HistorizedResourceStore.update() had a non-atomic read-increment-write cycle. Two concurrent edits both reading version N would both write N+1, with the last write silently winning. On PostgreSQL, ON CONFLICT DO UPDATE in history also silently merged duplicates. Introduced storeIfCurrentVersion() default method on IResourceStorage with atomic conditional writes in both backends: - MongoDB: version-conditioned updateOne, catch duplicate-key on history - PostgreSQL: UPDATE WHERE version=?, ON CONFLICT DO NOTHING for history ResourceModifiedException now surfaces as HTTP 409 instead of silent data loss.
ComponentCache is @ApplicationScoped (singleton) but used plain HashMap. computeIfAbsent on HashMap is not thread-safe — concurrent reads (every conversation turn) and writes (lazy agent deployment) can corrupt the bucket structure. ConcurrentHashMap is a thread-safe drop-in replacement. Added concurrent stress test (8 threads, 500 ops each).
When an agent times out, future.cancel(true) sets the interrupt flag but cannot stop threads blocked in non-interruptible I/O (LLM HTTP calls). The completed callable's onComplete callback would fire and do an unconditional replaceOne that overwrites newer conversation state. Added interrupt flag check before onComplete — cancelled executions now route to onFailure with a log warning instead of persisting stale data.
- HistorizedResourceStoreTest: add InOrder verification to prove history is archived BEFORE the conditional store (ordering is the whole point of the race fix). Verify history archived even when concurrent update throws ResourceModifiedException. - BaseRuntimeTest: fix misleading comment — interrupt simulation is post-completion (non-interruptible I/O), not mid-execution cancel. Document pair-test relationship for the no-interruption test.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
💤 Files with no reviewable changes (2)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughThis PR applies four focused fixes with tests: version-checked conditional storage for resources, concurrent-safe component maps, interrupt-aware runtime task completion, and null-safe property setting; plus changelog and CI documentation describing the changes. ChangesConcurrency & Null Safety Fixes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
There was a problem hiding this comment.
Pull request overview
This PR addresses four previously identified defects affecting null-safety and concurrent execution/persistence across the runtime, component lifecycle cache, and resource versioning layer (MongoDB/PostgreSQL), with accompanying regression tests and changelog documentation.
Changes:
- Add null-guards in
PropertySetterTaskforinput:initialwhenCATCH_ANY_INPUT_AS_PROPERTYis active. - Introduce optimistic locking in
HistorizedResourceStore.update()viaIResourceStorage.storeIfCurrentVersion()and implement conditional writes in MongoDB/PostgreSQL backends. - Make
ComponentCachethread-safe by replacingHashMapwithConcurrentHashMap, and prevent “zombie-write” persistence by routing interrupted executions toonFailure()inBaseRuntime.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/main/java/ai/labs/eddi/modules/properties/impl/PropertySetterTask.java | Adds null checks around input:initial to avoid NPEs on blank/missing user input. |
| src/main/java/ai/labs/eddi/datastore/IResourceStorage.java | Adds storeIfCurrentVersion() default method to support optimistic locking in storage backends. |
| src/main/java/ai/labs/eddi/datastore/mongo/MongoResourceStorage.java | Implements conditional update for optimistic locking and hardens history insert against duplicate-key races. |
| src/main/java/ai/labs/eddi/datastore/postgres/PostgresResourceStorage.java | Implements UPDATE ... WHERE version=? optimistic locking and avoids history merge on conflicts. |
| src/main/java/ai/labs/eddi/datastore/HistorizedResourceStore.java | Switches update path to call storeIfCurrentVersion() instead of unconditional store. |
| src/main/java/ai/labs/eddi/engine/lifecycle/internal/ComponentCache.java | Replaces non-thread-safe HashMap usage with ConcurrentHashMap. |
| src/main/java/ai/labs/eddi/engine/runtime/BaseRuntime.java | Prevents interrupted executions from calling onComplete() (avoids stale persistence after cancel/timeout). |
| src/test/java/ai/labs/eddi/modules/properties/impl/PropertySetterTaskTest.java | Adds regression tests for missing/null/empty input:initial cases. |
| src/test/java/ai/labs/eddi/datastore/HistorizedResourceStoreTest.java | Updates tests to assert storeIfCurrentVersion() usage and ordering; adds concurrent-modification test. |
| src/test/java/ai/labs/eddi/engine/lifecycle/internal/ComponentCacheTest.java | Adds concurrent access stress test to validate cache thread-safety. |
| src/test/java/ai/labs/eddi/engine/runtime/BaseRuntimeTest.java | Adds tests ensuring interrupted executions route to onFailure() and non-interrupted executions still complete normally. |
| docs/changelog.md | Documents all four bug fixes with context and affected files/tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/changelog.md`:
- Around line 7-38: Add a new "Design decisions made" subsection to the
2026-06-10 changelog entry that briefly documents the rationale behind each fix
and any trade-offs or alternatives considered: mention why null guards were
chosen for PropertySetterTask (avoid NPE vs. stricter input normalization), why
optimistic locking via IResourceStorage.storeIfCurrentVersion with
MongoResourceStorage/PostgresResourceStorage overrides was used (avoid
DB-specific atomic ops vs. heavier locking), why ComponentCache was switched
from HashMap to ConcurrentHashMap (thread-safety with minimal API changes vs.
full cache redesign), and why BaseRuntime now checks
Thread.currentThread().isInterrupted() before onComplete() (prevent zombie-write
clobber vs. more complex cancellation handling); also include a short "what's
next" note listing follow-ups (e.g., monitor for contention, consider
schema-level constraints, evaluate input normalization) and reference the
affected symbols PropertySetterTask, IResourceStorage, MongoResourceStorage,
PostgresResourceStorage, ComponentCache, and BaseRuntime so reviewers can find
the code.
In `@src/main/java/ai/labs/eddi/engine/runtime/BaseRuntime.java`:
- Around line 116-118: The warning lacks conversation/agent context: update the
log.warn call in BaseRuntime where the cancellation path discards results to
include identifying context (e.g., conversationId, agentId, and any
execution/requestId available in scope) using JBoss Logger placeholders, and
also augment the InterruptedException message passed to callback.onFailure to
include the same identifiers so logs and the exception can be correlated; locate
the block around BaseRuntime where log.warn(...) and callback.onFailure(new
InterruptedException(...)) are invoked and add the context variables to both the
log message and exception text.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d9f82b4e-50a0-4a4b-8b3b-ac5b163ed99d
📒 Files selected for processing (12)
docs/changelog.mdsrc/main/java/ai/labs/eddi/datastore/HistorizedResourceStore.javasrc/main/java/ai/labs/eddi/datastore/IResourceStorage.javasrc/main/java/ai/labs/eddi/datastore/mongo/MongoResourceStorage.javasrc/main/java/ai/labs/eddi/datastore/postgres/PostgresResourceStorage.javasrc/main/java/ai/labs/eddi/engine/lifecycle/internal/ComponentCache.javasrc/main/java/ai/labs/eddi/engine/runtime/BaseRuntime.javasrc/main/java/ai/labs/eddi/modules/properties/impl/PropertySetterTask.javasrc/test/java/ai/labs/eddi/datastore/HistorizedResourceStoreTest.javasrc/test/java/ai/labs/eddi/engine/lifecycle/internal/ComponentCacheTest.javasrc/test/java/ai/labs/eddi/engine/runtime/BaseRuntimeTest.javasrc/test/java/ai/labs/eddi/modules/properties/impl/PropertySetterTaskTest.java
… accuracy, log context - BaseRuntime: return null instead of stale result when thread is interrupted (prevents leaking via Future.get()), add thread name to cancellation warning log for traceability - IResourceStorage: fix misleading Javadoc — default storeIfCurrentVersion does not provide atomicity, clarify override responsibility - HistorizedResourceStoreTest: fix inaccurate ordering comment — real risk is missing history entry, not orphaned new version - changelog.md: add required Design Decisions subsection per guidelines
Strengthens submitCallable_cancelledRoutesToOnFailure test to also assert that future.get() returns null — directly validates the return-null fix that prevents stale result leakage via the Future.
…s in changelog and PropertySetterTaskTest
Summary
Fixes 4 verified bugs found during code review, ranging from HIGH to MEDIUM severity. All fixes include regression tests and were validated by a thorough self-review.
Changes
Fix # 1 — PropertySetterTask NPE on blank input (HIGH)
The
CATCH_ANY_INPUT_AS_PROPERTYhandler dereferencesgetLatestData("input:initial")without a null check. When a client sends an empty/whitespace-only message,Conversation.storeUserInputInMemoryskips storinginput:initial→ NPE → pipeline dies → conversation enters ERROR state.Fix: Added null guards for both
initialInputDataandinitialInput.Fix # 2 — Config version race condition (HIGH PG / MEDIUM Mongo)
HistorizedResourceStore.update()performs a non-atomic read→increment→write cycle. Two concurrent edits both reading version N would both write N+1 — last write wins silently. On PostgreSQL,ON CONFLICT DO UPDATEin history also silently merges duplicates. On MongoDB, historyinsertOnethrows unhandledMongoWriteException(HTTP 500 instead of 409).Fix: Introduced optimistic locking via
storeIfCurrentVersion()onIResourceStoragewith atomic conditional writes in both backends.ResourceModifiedExceptionnow surfaces as HTTP 409.Fix # 3 — ComponentCache HashMap race (MEDIUM)
ComponentCacheis@ApplicationScoped(singleton) using plainHashMap.computeIfAbsentonHashMapis not thread-safe — concurrent reads (every conversation turn) and writes (lazy agent deployment) can corrupt the bucket structure.Fix: Replaced
HashMapwithConcurrentHashMapfor both outer and inner maps.Fix # 4 — Zombie-write snapshot clobber after timeout (MEDIUM)
When an agent times out,
future.cancel(true)sets the interrupt flag but can't stop threads blocked in non-interruptible I/O (LLM HTTP calls). The completed callable'sonCompletecallback fires and does an unconditionalreplaceOnethat overwrites newer conversation state.Fix: Check
Thread.currentThread().isInterrupted()before callingonComplete(). Interrupted executions route toonFailure()instead.Test Coverage
PropertySetterTaskTestHistorizedResourceStoreTestComponentCacheTestBaseRuntimeTestAll tests use
InOrderverification where ordering matters (history archived before conditional store).Files Changed
PropertySetterTask.javainput:initialIResourceStorage.javastoreIfCurrentVersion()default methodMongoResourceStorage.javaupdateOne+ dup-key catchPostgresResourceStorage.javaUPDATE WHERE version=?+ON CONFLICT DO NOTHINGHistorizedResourceStore.javastore()→storeIfCurrentVersion()inupdate()ComponentCache.javaHashMap→ConcurrentHashMapBaseRuntime.javaonComplete()docs/changelog.mdVerification
./mvnw compile— ✅ BUILD SUCCESS./mvnw test— ✅ BUILD SUCCESS (all targeted test classes pass, 0 failures)Type of Change
Checklist
./mvnw clean verify -DskipITs)Summary by CodeRabbit