Official implementation of the OrgSense framework and StratDrift-10K corpus from:
Beyond Sentiment: Large Language Models as Organizational Sensors for Detecting Strategic Drift in Corporate Communications
Justin Yan — Baylor University
Organizations use language to manage institutional expectations and negotiate legitimacy with external stakeholders. OrgSense repurposes instruction-tuned LLMs as theory-grounded organizational sensors capable of detecting strategic drift — the gradual misalignment between an organization's stated strategy and its enacted behaviors.
| Model | Zero-Shot F1 | Few-Shot F1 | vs. Expert |
|---|---|---|---|
| GPT-4o + OrgSense | 0.78 | 0.81 | +4.1 pts |
| Claude 3.5 Sonnet + OrgSense | 0.76 | 0.79 | +2.1 pts |
| Gemini 1.5 Pro + OrgSense | 0.73 | 0.77 | +0.1 pts |
| Llama-3-70B + OrgSense | 0.69 | 0.74 | −2.9 pts |
| Fine-tuned RoBERTa-large | — | 0.72 | −4.9 pts |
| Domain Expert Baseline | — | 0.77 | 0.0 pts |
LLM-extracted drift signals predict financial restatements (AUC = 0.74) and unplanned CEO turnover (AUC = 0.69) in prospective holdout data.
orgsense/
├── orgsense/
│   ├── __init__.py               # Package exports
│   ├── framework.py              # OrgSense prompting framework (Section 4)
│   ├── corpus.py                 # StratDrift-10K corpus construction (Section 3)
│   ├── annotation.py             # Annotation schema & IRR computation (Section 3.3)
│   ├── benchmark.py              # Benchmarking & ablation harness (Section 5)
│   └── predictive_validity.py    # Downstream validity analyses (Section 6)
├── scripts/
│   ├── run_benchmark.py          # Full multi-model benchmark runner
│   ├── collect_corpus.py         # SEC EDGAR corpus collection
│   └── run_predictive_validity.py
├── tests/
│   └── test_orgsense.py          # Unit & integration tests
├── configs/                      # Model and task configuration files
├── notebooks/                    # Analysis notebooks (see below)
├── requirements.txt
└── pyproject.toml
git clone https://github.com/justinyan-baylor/orgsense.git
cd orgsense
pip install -e ".[dev]"

Set environment variables for your LLM providers:
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

from openai import OpenAI
from orgsense import OrgSenseAnalyzer
client = OpenAI()
analyzer = OrgSenseAnalyzer(client=client, model="gpt-4o")
# Q4 earnings call transcript text
transcript = """
Our cloud infrastructure investment remains our top strategic priority.
We are doubling down on this initiative despite near-term revenue headwinds.
European segment revenue declined 8% year-over-year due to macroeconomic factors.
"""
# Optional: prior-year transcript for Consistency scoring
prior_transcript = """
We are strategically pivoting toward on-premise enterprise solutions
as the core of our long-term value proposition.
"""
# Optional: 10-K MD&A for Audience Divergence scoring
mda_text = """
Management is evaluating strategic priorities and may adjust resource
allocation in the coming fiscal year given market conditions.
"""
result = analyzer.analyze(
    target_text=transcript,
    prior_text=prior_transcript,
    cross_document_text=mda_text,
)
print(f"Composite drift score: {result.composite_drift_score:.2f} / 5.0")
print(f"Consistency: {result.consistency.score} / 5")
print(f"Commitment escalation: {result.commitment_escalation.score} / 5")
print(f"Hedging asymmetry: {result.hedging_asymmetry.score} / 5")
print(f"Audience divergence: {result.audience_divergence.score} / 5")
print(f"\nJustification: {result.composite_justification}")
if result.hallucination_flags:
    print(f"\nWarning: {len(result.hallucination_flags)} unverified evidence quote(s)")

python scripts/run_benchmark.py \
    --test-annotations data/annotations/test_set.jsonl \
    --bundles-dir data/corpus \
    --output results/benchmark_results.json \
    --models gpt-4o claude-3-5-sonnet-20241022 \
    --shot-modes zero_shot few_shot \
    --max-docs 100

python scripts/collect_corpus.py \
    --sp500-tickers data/sp500_tickers.csv \
    --years 2003 2023 \
    --output-dir data/corpus \
    --user-agent "Your Name your@email.edu" \
    --earnings-calls-dir data/earnings_calls

python scripts/run_predictive_validity.py \
    --drift-scores results/drift_scores.csv \
    --compustat data/compustat_annual.csv \
    --restatements data/audit_analytics_restatements.csv \
    --ceo-turnover data/execucomp_departures.csv \
    --output results/predictive_validity.json

The framework uses a four-component prompt structure:
| Component | Purpose |
|---|---|
| 1. Theoretical Context | Communicates the construct's theoretical meaning to the model |
| 2. Dimensional Scaffolding | Decomposes strategic drift into four sub-questions answered sequentially |
| 3. Evidence Requirements | Requires verbatim quote anchoring for each judgment |
| 4. Output Schema | Structured JSON with Likert scores, quotes, confidence, and justification |
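The four components can be pictured as sections of a single prompt string. The sketch below is illustrative only: `build_prompt`, the section wording, and the JSON field names are hypothetical and are not taken from `orgsense/framework.py`.

```python
import json

# The four drift dimensions named in this README.
DIMENSIONS = [
    "Consistency",
    "Commitment Escalation",
    "Hedging Asymmetry",
    "Audience Divergence",
]


def build_prompt(target_text: str) -> str:
    """Assemble a four-part prompt (hypothetical wording, not the released prompt)."""
    # 1. Theoretical context: tell the model what the construct means.
    context = (
        "Strategic drift is the gradual misalignment between an "
        "organization's stated strategy and its enacted behaviors."
    )
    # 2. Dimensional scaffolding: one sub-question per dimension, in order.
    scaffold = "\n".join(
        f"{i}. Rate {name} on a 5-point Likert scale."
        for i, name in enumerate(DIMENSIONS, 1)
    )
    # 3. Evidence requirements: every judgment must cite a verbatim quote.
    evidence = "Support each rating with a verbatim quote from the document."
    # 4. Output schema: these field names are assumptions for illustration.
    schema = json.dumps(
        {
            "dimension": "str",
            "score": "int",
            "quote": "str",
            "confidence": "float",
            "justification": "str",
        }
    )
    return "\n\n".join(
        [
            context,
            scaffold,
            evidence,
            f"Return a JSON list of objects shaped like: {schema}",
            f"DOCUMENT:\n{target_text}",
        ]
    )


prompt = build_prompt("Our cloud investment remains our top strategic priority.")
print(prompt)
```

Keeping each component as a separate section makes it straightforward to drop one at a time, which is the kind of ablation the benchmark harness is described as supporting.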
A calibration block (three annotated examples) is appended by default, reducing hallucination rates from 8.9% to 3.2% (Section 5.4).
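Verbatim quote anchoring makes unsupported evidence mechanically checkable: a quote that does not occur in the source text can be flagged. Below is a minimal sketch of such a check, assuming case-insensitive matching with whitespace collapsed. `find_unverified_quotes` is a hypothetical helper and may differ from how the package populates `hallucination_flags`.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and collapse runs of whitespace so line wraps do not break matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_unverified_quotes(quotes, source_text):
    """Return quotes that are empty or do not occur in source_text after normalization."""
    haystack = _normalize(source_text)
    unverified = []
    for quote in quotes:
        needle = _normalize(quote)
        if not needle or needle not in haystack:
            unverified.append(quote)
    return unverified


source = """
Our cloud infrastructure investment remains our top strategic priority.
European segment revenue declined 8% year-over-year.
"""
quotes = [
    "cloud infrastructure investment remains our top\nstrategic priority",
    "we are exiting the European market",
]
flags = find_unverified_quotes(quotes, source)
print(flags)  # ['we are exiting the European market']
```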
| Dimension | Theoretical basis | F1 (best model) |
|---|---|---|
| Consistency | Burgelman (2002); Zajac & Shortell (1989) | 0.86 |
| Hedging Asymmetry | Staw et al. (1983) | 0.83 |
| Commitment Escalation | Staw (1981) | 0.77 |
| Audience Divergence | Gioia & Chittipeddi (1991) | 0.72 |
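The analyzer reports one score per dimension plus a composite on the same 5-point scale. How the package aggregates them is not stated in this README. The sketch below assumes an unweighted mean over whichever dimensions could be scored (Consistency needs a prior-year text, Audience Divergence needs a cross-document text); `composite_score` is a hypothetical helper.

```python
from statistics import mean
from typing import Dict, Optional


def composite_score(dimension_scores: Dict[str, Optional[int]]) -> float:
    """Unweighted mean of the dimension scores that are present (assumed aggregation)."""
    available = [s for s in dimension_scores.values() if s is not None]
    if not available:
        raise ValueError("no dimension scores available")
    return float(mean(available))


scores = {
    "consistency": 4,
    "commitment_escalation": 3,
    "hedging_asymmetry": 2,
    "audience_divergence": None,  # no cross-document text was supplied
}
print(f"Composite drift score: {composite_score(scores):.2f} / 5.0")
```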
The corpus comprises 10,847 document-year observations across 487 S&P 500 firms (2003–2023). Each observation includes:
- Q4 earnings call transcript
- 10-K MD&A section (Item 7)
- 10-K Risk Factors (Item 1A)

Across the full corpus there are 127,412 paragraph-level annotations covering the four dimensions.
Access: Due to data licensing restrictions (Refinitiv Eikon; SEC EDGAR terms), the raw corpus cannot be distributed directly. The annotation JSONL files, codebook, and all prompts are released in this repository. SEC EDGAR 10-K filings can be collected freely using scripts/collect_corpus.py. Earnings call transcripts require a Refinitiv Eikon or equivalent subscription.
data/
├── sp500_tickers.csv             # S&P 500 constituent tickers (2003–2023)
├── annotations/
│   ├── train_set.jsonl           # Training annotations (70%)
│   ├── val_set.jsonl             # Validation annotations (15%)
│   └── test_set.jsonl            # Test annotations (15%, stratified)
├── few_shot_examples.jsonl       # 8 curated (prompt, response) pairs
├── corpus/                       # Built by collect_corpus.py
│   └── {TICKER}/{YEAR}/
│       ├── earnings_call.txt
│       ├── 10k_mda.txt
│       ├── 10k_risks.txt
│       └── bundle.json
└── codebook/
    └── annotation_codebook.pdf   # Full annotation codebook with examples
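Given the layout above, complete document bundles can be enumerated with the standard library. This is a minimal sketch: `iter_complete_bundles` is a hypothetical helper, and it relies only on the file names shown in the tree. The contents of bundle.json are not specified here, so the file is not parsed.

```python
from pathlib import Path
from typing import Iterator, Tuple

# File names taken from the directory layout above.
REQUIRED_FILES = ("earnings_call.txt", "10k_mda.txt", "10k_risks.txt")


def iter_complete_bundles(corpus_dir: str) -> Iterator[Tuple[str, int, Path]]:
    """Yield (ticker, year, path) for each {TICKER}/{YEAR}/ folder holding all three documents."""
    root = Path(corpus_dir)
    for year_dir in sorted(root.glob("*/*")):
        # Skip stray files and folders whose name is not a year.
        if not year_dir.is_dir() or not year_dir.name.isdigit():
            continue
        if all((year_dir / name).is_file() for name in REQUIRED_FILES):
            yield year_dir.parent.name, int(year_dir.name), year_dir


# Prints nothing if data/corpus has not been built yet.
for ticker, year, path in iter_complete_bundles("data/corpus"):
    print(ticker, year, path)
```

Filtering on all three files up front avoids scoring firm-years where the earnings call transcript is missing, which is likely for users without a transcript subscription.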
pytest tests/ -v --cov=orgsense --cov-report=term-missing

@article{yan2024orgsense,
  title       = {Beyond Sentiment: Large Language Models as Organizational Sensors
                 for Detecting Strategic Drift in Corporate Communications},
  author      = {Yan, Justin},
  year        = {2024},
  journal     = {Working Paper},
  institution = {Baylor University}
}

MIT License. See LICENSE for details.
The StratDrift-10K annotation files are released under CC BY 4.0. Raw transcript and filing data are subject to the terms of Refinitiv Eikon and SEC EDGAR, respectively.