Commit f69ff22

Karim Baidar authored and committed
make scanner UI compact and evidence-focused
1 parent ec8a6db commit f69ff22

7 files changed

Lines changed: 723 additions & 44 deletions

README.md

Lines changed: 12 additions & 12 deletions
@@ -9,7 +9,7 @@
 [Static Pages mirror](https://karimbaidar.github.io/false-success-lab/) |
 [Core package: agent-consistency](https://github.com/karimbaidar/agent-consistency)
 
-Scan your AI workflow repo for unverified completion risks.
+Stop false "done" before it ships.
 
 False Success Lab is the interactive developer lab for `agent-consistency`.
 It helps you explore false-success risks in AI workflows and see how
@@ -43,7 +43,8 @@ the completion claim.
 
 ## What you can do in the lab
 
-- Scan public repos for false-success risk.
+- Scan public repos for false-success risk, with repo-fit confidence instead of
+fake certainty.
 - Import local scan reports without giving the browser filesystem access.
 - Run built-in false-success scenarios.
 - Compare naive vs protected behavior.
@@ -69,9 +70,10 @@ https://github.com/org/repo
 
 The FastAPI backend calls the scanner exposed by the installed
 `agent-consistency` package, downloads the public repo to a temporary directory,
-and returns a false-success report card plus Markdown output. If the backend is
-running with an older `agent-consistency` package that does not expose the
-scanner yet, it returns a clear `503` instead of pretending a scan happened.
+and returns a false-success report card plus Markdown output. The scanner accepts
+any public GitHub repo, but reports whether the repo looks like an
+agentic-workflow repo, workflow-adjacent repo, or general code. Weak matches are
+shown as possible risks that need review.
 
 ### Local Report Import

@@ -135,8 +137,8 @@ report cards, proof trails, and copyable fixes.
 the demo lightweight and deterministic.
 - **Lab backend:** validates public scan requests, calls the scanner, and runs
 the refund scenario through the real workflow path where available.
-- **Scanner:** reads source code and returns report-card metrics, findings,
-severity, confidence, missing evidence, and suggested fixes.
+- **Scanner:** reads source code and returns repo applicability, grouped
+findings, severity, confidence, missing evidence, and suggested fixes.
 - **Verified action / outcome gate:** blocks or reviews unverified completions
 before customer-visible claims continue.
 - **Verifier packs:** scenario-specific checks for the expected result, such as
@@ -227,11 +229,9 @@ still have duration and resource limits, so very large repository scans may need
 the local CLI path. The UI stays honest: if the backend is unavailable, it shows
 static demo mode and still supports local report import and built-in scenarios.
 
-The hosted backend currently installs `agent-consistency` from the pinned public
-GitHub commit in `requirements.txt`, aligned with the current scanner-enabled
-`agent-consistency` 0.3.2 source. After PyPI is confirmed to have the same
-scanner APIs, switch the dependency back to a PyPI range such as
-`agent-consistency>=0.3.2,<0.4.0`.
+The hosted backend installs `agent-consistency>=0.3.5,<0.4.0` from PyPI. That
+version includes repo applicability, grouped findings, raw exposure, and
+conservative low-confidence wording for weak matches.
 
 To deploy the backend:

docs/STATE.md

Lines changed: 7 additions & 3 deletions
@@ -25,6 +25,9 @@ false-success workflow classes.
 - UI now renders report-card metrics, confidence, top findings, missing
 evidence, suggested fixes, proof trail, receipt JSON, and copyable Python,
 LangGraph, and tool-wrapper fixes.
+- Scanner UI now uses the stronger `agent-consistency` 0.3.5 scan shape:
+repo applicability, grouped risk findings, raw exposure, and conservative
+low-confidence wording for weak matches.
 - README, contribution, security, governance, trademarks, DCO, issue templates,
 and scenario contribution docs were added or updated.
 - README now embeds the repo-local architecture image at
@@ -37,9 +40,7 @@ false-success workflow classes.
 `https://karimbaidar.github.io/false-success-lab/`.
 - Backend deployment target moved to Vercel. The free Vercel API URL is
 `https://false-success-lab-api.vercel.app`.
-- The hosted backend currently uses a pinned public GitHub dependency for
-`agent-consistency` because PyPI Trusted Publishing has not yet authorized
-the scanner-enabled package upload.
+- The hosted backend uses `agent-consistency>=0.3.5,<0.4.0` from PyPI.
 
 ## Decisions

@@ -64,6 +65,9 @@ false-success workflow classes.
 must run through the CLI and paste/upload flow.
 - Do not claim static scans prove safety. They find configured patterns and
 should feed review or runtime gate work.
+- The scanner can accept any public repo, but it should frame general-code
+results as low-applicability review prompts. Its strongest signal is for
+agentic and workflow-heavy repos.
 - Keep refund as the flagship scenario, not the whole product identity.
 - The GitHub repository has moved from `agent-consistency-refund-demo` to
 `false-success-lab`. Keep old-name references only when describing migration
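
The decision added in this hunk (accept any public repo, but frame general-code results as low-applicability review prompts) could be sketched roughly as below. The three tiers mirror the categories named in the README change. The `Applicability` enum, the `frame_finding` function, and the exact wording are hypothetical illustrations, not the `agent-consistency` API or the lab's actual UI strings.

```python
from enum import Enum


class Applicability(Enum):
    """Hypothetical repo-fit tiers, named after the README's three categories."""

    AGENTIC_WORKFLOW = "agentic-workflow"
    WORKFLOW_ADJACENT = "workflow-adjacent"
    GENERAL_CODE = "general-code"


def frame_finding(title: str, applicability: Applicability) -> str:
    """Return display text for a finding, hedged by how well the repo fits.

    Assumption: the weaker the repo fit, the more conservative the wording.
    """
    if applicability is Applicability.AGENTIC_WORKFLOW:
        return f"Risk: {title}"
    if applicability is Applicability.WORKFLOW_ADJACENT:
        return f"Possible risk (needs review): {title}"
    # General code: never present a match as a confirmed risk.
    return f"Low-applicability review prompt: {title}"


if __name__ == "__main__":
    for tier in Applicability:
        print(frame_finding("completion claimed without verification", tier))
```

Keeping the wording decision in one function would mean a weak match cannot be shown with strong-match phrasing by accident, which is the point of the "do not claim static scans prove safety" decision above it.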

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ version = "0.3.0"
 description = "Interactive app for false-success scenarios and scanner report cards."
 requires-python = ">=3.9"
 dependencies = [
-"agent-consistency @ git+https://github.com/karimbaidar/agent-consistency.git@10d1616b2a6e8a178b8ee2f8d8212d3cf552498d",
+"agent-consistency>=0.3.5,<0.4.0",
 "fastapi>=0.115,<0.129",
 "uvicorn>=0.30,<0.40",
 ]
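
To make the new range concrete, here is a minimal sketch of which plain release numbers satisfy `>=0.3.5,<0.4.0`. It is a simplification for illustration only: the helper names are hypothetical, it handles plain `X.Y.Z` strings and nothing else, and real installers resolve specifiers under the full PEP 440 rules (pre-releases, post-releases, and so on), which this does not attempt.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Parse a plain dotted release string such as '0.3.5' into ints."""
    return tuple(int(part) for part in version.split("."))


def in_pinned_range(version: str) -> bool:
    """True if a plain release satisfies >=0.3.5,<0.4.0 (simplified check)."""
    release = parse_release(version)
    return (0, 3, 5) <= release < (0, 4, 0)


if __name__ == "__main__":
    for candidate in ["0.3.2", "0.3.5", "0.3.10", "0.4.0"]:
        print(candidate, in_pinned_range(candidate))
```

Under this check, 0.3.2 (the version named in the removed README text) falls outside the range, while any 0.3.x release from 0.3.5 onward is accepted and 0.4.0 is excluded.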
