UN-2771 [FEAT] Report text extraction time in API deployment metrics by athul-rs · Pull Request #2032 · Zipstack/unstract

athul-rs · 2026-06-11T08:09:11Z

What

API deployment responses with include_metrics=True now report the text extraction (LLMWhisperer/X2Text) duration as metrics.extraction["time_taken(s)"], alongside the existing per-output indexing time.

Why

UN-2771 — metrics only included indexing time; the LLMWhisperer call (often the dominant cost for large documents) was invisible, making it hard to attribute slow executions.

How

Reworked per review (@chandrasekharan-zipstack): the original version added timing to tools/structure (deprecated Docker path); that's reverted. The change now lives in the live celery flow: LegacyExecutor._handle_structure_pipeline times the extract step (time.monotonic()) and _finalize_pipeline_result merges {"extraction": {"time_taken(s)": ...}} into result metrics via the existing _merge_pipeline_metrics, exactly mirroring how _index_pipeline_output records indexing time. The Prompt Studio IDE flow (_handle_ide_index) is untouched.

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No. Additive metrics key only; existing indexing metrics and response shape unchanged. Paths that skip extraction (smart-table) leave the dict empty — nothing added, same as today.

Database Migrations

None

Env Config

None

Relevant Docs

N/A

Related Issues or PRs

Jira: UN-2771

Notes on Testing

ruff check/format clean for the changed lines (remaining findings in the file pre-exist on main).
Manual: run an API deployment (structure-tool workflow routed via the worker pipeline) with include_metrics=True → response metrics contain extraction.time_taken(s); smart-table path unchanged.

Screenshots

N/A

Checklist

I have read and understood the Contribution Guidelines.

🤖 Generated with Claude Code

The structure tool timed indexing but not the text extraction (LLMWhisperer/X2Text) call, so API responses with include_metrics=True reported indexing time only. Time dynamic_extraction the same way and merge it into the result metrics as extraction.time_taken(s). Bump structure tool to 0.0.102. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

coderabbitai · 2026-06-11T08:09:29Z

Walkthrough

The structure pipeline executor now tracks extraction timing by initializing and recording metrics during extraction execution, capturing the start time only when extraction is not skipped, measuring the duration after successful completion, and passing the recorded metrics to the pipeline finalization method to be merged with existing metrics.

Changes

Extraction timing metrics in pipeline execution

Layer / File(s)	Summary
Extraction metrics initialization and recording `workers/executor/executors/legacy_executor.py`	`extraction_metrics` dict is initialized before extraction runs, `extraction_start` timestamp is captured when extraction is executed (non-skipped), extraction duration is computed and recorded under the `"extraction"` namespace after successful extraction completion, and the metrics are passed to pipeline finalization.
Finalization method signature and metrics merging `workers/executor/executors/legacy_executor.py`	`_finalize_pipeline_result` now accepts an optional `extraction_metrics` parameter and merges both extraction and indexing metrics with any existing pipeline metrics using `_merge_pipeline_metrics`.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: adding text extraction time reporting to API deployment metrics.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	PR description is comprehensive and follows the template structure with all required sections completed.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch UN-2771-extraction-time-metric

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-06-11T08:13:10Z

Greptile Summary

This PR adds text extraction duration to the API deployment metrics response. When include_metrics=True, callers now see metrics.extraction["time_taken(s)"] alongside the existing per-output indexing times.

legacy_executor.py: wraps _handle_extract with time.monotonic() bookends, stores the result as extraction_metrics = {"extraction": {"time_taken(s)": ...}}, and threads it through _finalize_pipeline_result where _merge_pipeline_metrics is applied twice — once to combine indexing and extraction metrics, then again to merge with any pre-existing metrics from the answer step.
All pipeline paths are handled correctly: the skip-extraction (Excel smart-table) path leaves extraction_metrics = {}, so no key is added, matching the described behaviour.
The refactor of _finalize_pipeline_result changes if index_metrics: to if new_metrics:, making the gate apply to the combined result rather than indexing alone — this is a necessary and correct change.

Confidence Score: 5/5

Safe to merge — purely additive metrics key, all pipeline branches handled, no changes to existing response shape or logic.

The change is a small, self-contained addition: a timer wrapping an existing call and the result threaded through an already-present merge helper. Every code path (skip-extraction, single-pass, summarization, normal) leaves the pipeline behaviour unchanged — only the metrics dict gains a new key when extraction actually ran. The refactored gate in _finalize_pipeline_result correctly broadens from if index_metrics to if new_metrics so extraction-only runs also surface their timing.

No files require special attention.

Important Files Changed

Filename	Overview
workers/executor/executors/legacy_executor.py	Adds extraction-timing metrics alongside existing indexing metrics; logic is correct for all pipeline paths (skip-extraction, summarization, single-pass, normal)

Sequence Diagram

sequenceDiagram
    participant Caller
    participant LegacyExecutor
    participant HandleExtract
    participant HandleAnswer

    Caller->>LegacyExecutor: _run_pipeline()
    Note over LegacyExecutor: extraction_metrics = {}
    alt not skip_extraction
        Note over LegacyExecutor: extraction_start = time.monotonic()
        LegacyExecutor->>HandleExtract: _handle_extract(extract_ctx)
        HandleExtract-->>LegacyExecutor: extract_result
        Note over LegacyExecutor: extraction_metrics = {"extraction": {"time_taken(s)": elapsed}}
    end
    LegacyExecutor->>HandleAnswer: _run_pipeline_answer_step()
    HandleAnswer-->>LegacyExecutor: answer_result
    LegacyExecutor->>LegacyExecutor: _finalize_pipeline_result(index_metrics, extraction_metrics)
    Note over LegacyExecutor: new_metrics = merge(index_metrics, extraction_metrics)
    Note over LegacyExecutor: structured_output[metrics] = merge(existing, new_metrics)
    LegacyExecutor-->>Caller: ExecutionResult with metrics

_{Reviews (2): Last reviewed commit: "UN-2771 Rework: capture extraction time ..." | Re-trigger Greptile}

chandrasekharan-zipstack

@athul-rs I like your intent here however I don't think the tool related code or logic is being used anymore. This is deprecated and in fact we'll be removing this code soon. Can you check if this is handled in the equivalent flow involving the celery tasks and ensure this concern is addressed?
cc: @harini-venkataraman

Per review, the structure tool's Docker path is deprecated — the live flow is the celery-based LegacyExecutor structure pipeline. Time the extract step there and merge {'extraction': {'time_taken(s)': ...}} into the result metrics alongside the existing per-output indexing timing. Structure tool changes reverted (no tool version bump needed). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

athul-rs · 2026-06-11T10:52:51Z

@chandrasekharan-zipstack you were right — reworked. The tools/structure change (and the tool version bump) is reverted; extraction timing now lives in the celery flow: LegacyExecutor._handle_structure_pipeline times the extract step and merges extraction.time_taken(s) into result metrics through _finalize_pipeline_result/_merge_pipeline_metrics, mirroring the existing indexing timing in _index_pipeline_output. Confirmed the worker path is what cloud runs (workflows using the structure tool image are routed to it via _is_structure_tool_workflow). PR is now a 12-line change to legacy_executor.py only.

sonarqubecloud · 2026-06-11T10:53:26Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@workers/executor/executors/legacy_executor.py`:
- Around line 645-647: The current extraction_metrics dict uses a top-level key
"extraction" which can collide with user-defined output/prompt names when merged
later; update this to use a reserved pipeline namespace (e.g., set
extraction_metrics = {"_pipeline": {"extraction": {"time_taken(s)": ...}}}) and
then merge into the main metrics map so pipeline-level metrics live under
metrics["_pipeline"]; adjust the merge logic where metrics and
extraction_metrics are combined (the existing merge around the metrics variable
at lines ~804-811) to preserve the "_pipeline" namespace, or alternatively add a
pre-merge check to disallow user outputs named "extraction" if you prefer the
blocking approach. Ensure you update references to extraction_metrics and the
merge operation accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d5ab6512-472a-43ec-a802-245f014d585e

📥 Commits

Reviewing files that changed from the base of the PR and between ab98fc5 and ec778bc.

📒 Files selected for processing (1)

workers/executor/executors/legacy_executor.py

coderabbitai · 2026-06-11T10:55:29Z

+                extraction_metrics = {
+                    "extraction": {"time_taken(s)": time.monotonic() - extraction_start}
+                }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid top-level metrics key collision with user prompt names

"extraction" is currently injected as a normal top-level metrics key (Line [645]), but top-level metrics keys are also used for output/prompt names. If a user defines an output named "extraction", the merge at Line [804] will silently co-mingle unrelated metrics under the same key.

Use a reserved namespace for pipeline-level metrics (e.g., metrics["_pipeline"]["extraction"]) or enforce/reserve "extraction" as a disallowed output name before merge.

Also applies to: 804-811

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workers/executor/executors/legacy_executor.py` around lines 645 - 647, The current extraction_metrics dict uses a top-level key "extraction" which can collide with user-defined output/prompt names when merged later; update this to use a reserved pipeline namespace (e.g., set extraction_metrics = {"_pipeline": {"extraction": {"time_taken(s)": ...}}}) and then merge into the main metrics map so pipeline-level metrics live under metrics["_pipeline"]; adjust the merge logic where metrics and extraction_metrics are combined (the existing merge around the metrics variable at lines ~804-811) to preserve the "_pipeline" namespace, or alternatively add a pre-merge check to disallow user outputs named "extraction" if you prefer the blocking approach. Ensure you update references to extraction_metrics and the merge operation accordingly.

github-actions · 2026-06-11T10:56:20Z

Unstract test results

Per-group results

Status	Group	Tier	Passed	Failed	Errors	Skipped	Duration (s)
❌	`unit-connectors`	unit	64	12	0	3	16.7
❌	`unit-core`	unit	0	0	2	0	1.2
❌	`unit-platform-service`	unit	9	0	1	0	1.4
✅	`unit-prompt-service`	unit	15	0	0	0	20.2
✅	`unit-rig`	unit	53	0	0	0	3.3
✅	`unit-runner`	unit	11	0	0	0	3.2
✅	`unit-sdk1`	unit	390	0	0	0	20.7
❌	`unit-tool-registry`	unit	0	0	1	0	1.4
⚪	`unit-workers`	unit	0	0	0	0	17.3
	TOTAL		542	12	4	3	85.3

Critical paths

⚠️ Critical paths not yet covered

auth-login — User can log in and obtain a session cookie. (entry: POST /api/v1/auth/login; declared coverage: no groups declared)
adapter-register-llm — Register and validate an LLM adapter. (entry: POST /api/v1/adapter/; declared coverage: no groups declared)
workflow-create-execute — Create a workflow, configure source+destination, execute, poll, fetch result. (entry: POST /api/v1/workflow/{id}/execute/; declared coverage: e2e-workflow)
api-deployment-run — Deploy a workflow as an API, POST a document, receive structured JSON. (entry: POST /deployment/api/{org}/{name}/; declared coverage: e2e-api-deployment)
prompt-studio-fetch-response — Prompt Studio: create project, add prompt, run single-pass, get response. (entry: POST /api/v1/prompt-studio/prompt-studio-tool/{id}/fetch_response/; declared coverage: e2e-prompt-studio)
pipeline-etl-execute — Run an ETL pipeline from source connector to destination. (entry: POST /api/v1/pipeline/{id}/execute/; declared coverage: no groups declared)
usage-token-tracking — Per-execution token usage is recorded and retrievable. (entry: GET /api/v1/usage/get_token_usage/; declared coverage: no groups declared)
workflow-execution-fan-out — Multi-file workflow execution fans out to file-processing workers and rejoins. (entry: internal: backend → rabbitmq → workers/file_processing; declared coverage: no groups declared)
callback-result-delivery — Async results are posted back via the callback worker. (entry: internal: workers/callback → backend /internal endpoints; declared coverage: no groups declared)

✅ Covered critical paths

tool-sandbox-exec — covered by unit-runner

jaseemjaskp

Automated review (PR Review Toolkit) of the extraction-time metric.

Net assessment: the merge logic is correct, empty-dict-safe, and introduces no silent failures — error propagation (LegacyExecutorError re-raise) is byte-for-byte unchanged. No blocking bugs. The substantive open item is the top-level-key collision already flagged by @coderabbitai on L647; the inline notes below are distinct consistency/maintainability/test points.

Biggest gap — test coverage (not inline-able; test_phase5d.py isn't in this diff): the PR adds zero tests for the new behavior. _finalize_pipeline_result / _merge_pipeline_metrics are exercised by test_phase5d.py (test_index_metrics_merged L261, test_merge_* L915-927) but none assert on extraction_metrics. Recommend adding before merge:

test_extraction_metrics_recorded — normal run: result.data["metrics"]["extraction"]["time_taken(s)"] is a float >= 0 and coexists with per-prompt keys.
test_extract_failure_records_no_extraction_metric — metric is computed after the _failure early-return, so a failed extract must record no timing (pin via a _finalize_pipeline_result spy / assert_not_called).
test_index_and_extraction_metrics_merged — both families merge (top-level extraction + per-output indexing).
test_skip_extraction_records_no_extraction_metric — skip_extraction ⇒ no "extraction" key.

jaseemjaskp · 2026-06-18T17:41:42Z

                _absorb(extract_result)
                extracted_text = extract_result.data.get(IKeys.EXTRACTED_TEXT, "")
+                extraction_metrics = {
+                    "extraction": {"time_taken(s)": time.monotonic() - extraction_start}


Maintainability (distinct from the key-collision note on L647): this hardcodes the literal "extraction", but PromptServiceConstants.EXTRACTION = "extraction" already exists and is imported as PSKeys in this file — use PSKeys.EXTRACTION so the metric key and the constant can't drift. Separately, "time_taken(s)" is duplicated here and at L1031 (index_metrics); consider promoting it to a shared constant.

Minor readability: for parity with the indexing path (elapsed = ...; index_metrics[output_name] = {"indexing": {"time_taken(s)": elapsed}}, L1029-1031), capture the duration in a named local first:

extraction_time = time.monotonic() - extraction_start extraction_metrics = {PSKeys.EXTRACTION: {"time_taken(s)": extraction_time}}

jaseemjaskp · 2026-06-18T17:41:42Z

            # ---- Step 1: Extract ----
            if not skip_extraction:
                step += 1
+                extraction_start = time.monotonic()


Clock-source consistency: extraction times the step with time.monotonic(), while the sibling indexing metric uses wall-clock datetime.datetime.now() (L1029). time.monotonic() is the correct choice for duration measurement (immune to NTP / system-clock adjustments), so this code is right — but the two producers of "time_taken(s)" now use different clocks. Worth a follow-up to align the indexing path onto monotonic() too.

jaseemjaskp · 2026-06-18T17:41:42Z

        source_file_name: str,
        extracted_text: str,
        index_metrics: dict,
+        extraction_metrics: dict | None = None,


The new extraction_metrics param isn't reflected in the docstring. Its shape differs from index_metrics in a load-bearing way: extraction_metrics is a top-level {"extraction": {"time_taken(s)": float}} (a sibling of per-output keys), whereas index_metrics is keyed per output {output_name: {"indexing": {...}}}. A short Args: note documenting both shapes (and that this is None/empty when extraction is skipped) would stop a future maintainer from incorrectly nesting/merging them. Consider typing it as dict[str, dict] | None or a small TypedDict rather than bare dict.

athul-rs requested review from chandrasekharan-zipstack, hari-kuriakose and jaseemjaskp June 11, 2026 08:14

chandrasekharan-zipstack reviewed Jun 11, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 11, 2026

View reviewed changes

jaseemjaskp reviewed Jun 18, 2026

View reviewed changes

Conversation

athul-rs commented Jun 11, 2026 • edited by atlassian Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

How

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

Database Migrations

Env Config

Relevant Docs

Related Issues or PRs

Notes on Testing

Screenshots

Checklist

Uh oh!

coderabbitai Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

greptile-apps Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

chandrasekharan-zipstack left a comment

Choose a reason for hiding this comment

Uh oh!

athul-rs commented Jun 11, 2026

Uh oh!

sonarqubecloud Bot commented Jun 11, 2026

Quality Gate passed

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jun 11, 2026

Unstract test results

Per-group results

Critical paths

⚠️ Critical paths not yet covered

Uh oh!

jaseemjaskp left a comment

Choose a reason for hiding this comment

Uh oh!

jaseemjaskp Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

jaseemjaskp Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

jaseemjaskp Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

athul-rs commented Jun 11, 2026 •

edited by atlassian Bot

Loading

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading

greptile-apps Bot commented Jun 11, 2026 •

edited

Loading