OrgSense: LLMs as Organizational Sensors for Strategic Drift

Official implementation of the OrgSense framework and StratDrift-10K corpus from:

Beyond Sentiment: Large Language Models as Organizational Sensors for Detecting Strategic Drift in Corporate Communications
Justin Yan — Baylor University

Overview

Organizations use language to manage institutional expectations and negotiate legitimacy with external stakeholders. OrgSense repurposes instruction-tuned LLMs as theory-grounded organizational sensors capable of detecting strategic drift — the gradual misalignment between an organization's stated strategy and its enacted behaviors.

Key results

Model	Zero-Shot F1	Few-Shot F1	Δ F1 vs. Expert
GPT-4o + OrgSense	0.78	0.81	+4.1 pts
Claude 3.5 Sonnet + OrgSense	0.76	0.79	+2.1 pts
Gemini 1.5 Pro + OrgSense	0.73	0.77	+0.1 pts
Llama-3-70B + OrgSense	0.69	0.74	−2.9 pts
Fine-tuned RoBERTa-large	—	0.72	−4.9 pts
Domain Expert Baseline	—	0.77	0.0 pts

LLM-extracted drift signals predict financial restatements (AUC = 0.74) and unplanned CEO turnover (AUC = 0.69) in prospective holdout data.

Repository structure

orgsense/
├── orgsense/
│   ├── __init__.py               # Package exports
│   ├── framework.py              # OrgSense prompting framework (Section 4)
│   ├── corpus.py                 # StratDrift-10K corpus construction (Section 3)
│   ├── annotation.py             # Annotation schema & IRR computation (Section 3.3)
│   ├── benchmark.py              # Benchmarking & ablation harness (Section 5)
│   └── predictive_validity.py    # Downstream validity analyses (Section 6)
├── scripts/
│   ├── run_benchmark.py          # Full multi-model benchmark runner
│   ├── collect_corpus.py         # SEC EDGAR corpus collection
│   └── run_predictive_validity.py  # Predictive validity analyses (Section 6)
├── tests/
│   └── test_orgsense.py          # Unit & integration tests
├── configs/                      # Model and task configuration files
├── notebooks/                    # Analysis notebooks (see below)
├── requirements.txt
└── pyproject.toml

Installation

git clone https://github.com/justinyan-baylor/orgsense.git
cd orgsense
pip install -e ".[dev]"

Set environment variables for your LLM providers:

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
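
Before running the quickstart, it can help to confirm that the keys are visible to Python. The following is a minimal sketch; the helper name `missing_api_keys` is hypothetical and not part of the orgsense package.

```python
import os

# Environment variables named in the installation step above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_api_keys(env=None, required=REQUIRED_KEYS):
    """Return the names of required keys that are unset or empty.

    Hypothetical helper for illustration; not part of the orgsense package.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing API keys: " + ", ".join(missing))
    else:
        print("All provider API keys are set.")
```

If you only plan to benchmark one provider, pass a shorter `required` tuple rather than setting a placeholder key.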

Quickstart

Analyze a single document

from openai import OpenAI
from orgsense import OrgSenseAnalyzer

client = OpenAI()
analyzer = OrgSenseAnalyzer(client=client, model="gpt-4o")

# Q4 earnings call transcript text
transcript = """
Our cloud infrastructure investment remains our top strategic priority.
We are doubling down on this initiative despite near-term revenue headwinds.
European segment revenue declined 8% year-over-year due to macroeconomic factors.
"""

# Optional: prior-year transcript for Consistency scoring
prior_transcript = """
We are strategically pivoting toward on-premise enterprise solutions
as the core of our long-term value proposition.
"""

# Optional: 10-K MD&A for Audience Divergence scoring
mda_text = """
Management is evaluating strategic priorities and may adjust resource
allocation in the coming fiscal year given market conditions.
"""

result = analyzer.analyze(
    target_text=transcript,
    prior_text=prior_transcript,
    cross_document_text=mda_text,
)

print(f"Composite drift score: {result.composite_drift_score:.2f} / 5.0")
print(f"Consistency:           {result.consistency.score} / 5")
print(f"Commitment escalation: {result.commitment_escalation.score} / 5")
print(f"Hedging asymmetry:     {result.hedging_asymmetry.score} / 5")
print(f"Audience divergence:   {result.audience_divergence.score} / 5")
print(f"\nJustification: {result.composite_justification}")

if result.hallucination_flags:
    print(f"\nWarning: {len(result.hallucination_flags)} unverified evidence quote(s)")

Run the full benchmark

python scripts/run_benchmark.py \
    --test-annotations data/annotations/test_set.jsonl \
    --bundles-dir      data/corpus \
    --output           results/benchmark_results.json \
    --models gpt-4o claude-3-5-sonnet-20241022 \
    --shot-modes zero_shot few_shot \
    --max-docs 100

Build the corpus from SEC EDGAR

python scripts/collect_corpus.py \
    --sp500-tickers    data/sp500_tickers.csv \
    --years 2003 2023 \
    --output-dir       data/corpus \
    --user-agent       "Your Name your@email.edu" \
    --earnings-calls-dir data/earnings_calls

Run predictive validity analyses (Section 6)

python scripts/run_predictive_validity.py \
    --drift-scores results/drift_scores.csv \
    --compustat    data/compustat_annual.csv \
    --restatements data/audit_analytics_restatements.csv \
    --ceo-turnover data/execucomp_departures.csv \
    --output       results/predictive_validity.json
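
The AUC values reported for these analyses measure how well a continuous drift score ranks firms that later experienced an event above firms that did not. The sketch below computes ROC AUC with the rank-based (Mann-Whitney) formulation on toy data. The function name `roc_auc` and the example values are illustrative assumptions, not the repository's implementation or real results.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: composite drift scores and whether a restatement
# followed (1) or not (0). Values are made up for illustration.
drift_scores = [4.2, 3.8, 1.5, 2.9, 3.1, 1.2]
restated = [1, 0, 0, 1, 1, 0]
print(f"AUC = {roc_auc(drift_scores, restated):.3f}")
```

The pairwise loop is quadratic in the number of observations, which is fine for a sanity check; for corpus-scale data a sorted-rank implementation such as scikit-learn's `roc_auc_score` is the more practical choice.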

OrgSense prompt architecture (Section 4.2)

The framework uses a four-component prompt structure:

Component	Purpose
1. Theoretical Context	Communicates the construct's theoretical meaning to the model
2. Dimensional Scaffolding	Decomposes strategic drift into four sub-questions answered sequentially
3. Evidence Requirements	Requires verbatim quote anchoring for each judgment
4. Output Schema	Structured JSON with Likert scores, quotes, confidence, and justification

A calibration block (three annotated examples) is appended by default, reducing hallucination rates from 8.9% to 3.2% (Section 5.4).
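
The evidence requirement makes a simple post-hoc check possible: any quote the model returns should appear in the source document. The sketch below parses a model response and flags quotes that cannot be found after whitespace and case normalization. The JSON field names (`dimensions`, `score`, `quotes`), the 1-5 score range check, and the function `find_unverified_quotes` are assumptions made for illustration; the actual output schema and the logic behind `hallucination_flags` live in orgsense/framework.py and may differ.

```python
import json
import re


def normalize(text):
    """Collapse runs of whitespace and lowercase so line breaks do not block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_unverified_quotes(response_json, source_text):
    """Return (dimension, quote) pairs whose quote is absent from source_text.

    Assumes a hypothetical response shape:
    {"dimensions": {"<name>": {"score": int, "quotes": [str, ...]}}}
    """
    response = json.loads(response_json)
    haystack = normalize(source_text)
    flags = []
    for name, judgment in response["dimensions"].items():
        score = judgment["score"]
        if not 1 <= score <= 5:
            raise ValueError(f"{name}: Likert score {score} outside 1-5")
        for quote in judgment.get("quotes", []):
            if normalize(quote) not in haystack:
                flags.append((name, quote))
    return flags


source = """Our cloud infrastructure investment remains our top strategic priority.
We are doubling down on this initiative despite near-term revenue headwinds."""

# Hand-written response for illustration; the second quote is not in the source.
response_json = json.dumps({
    "dimensions": {
        "commitment_escalation": {
            "score": 4,
            "quotes": ["We are doubling down on this initiative"],
        },
        "hedging_asymmetry": {
            "score": 2,
            "quotes": ["we may revisit this initiative next year"],
        },
    }
})

for dimension, quote in find_unverified_quotes(response_json, source):
    print(f"Unverified quote in {dimension}: {quote!r}")
```

Substring matching is deliberately strict: a paraphrased quote is flagged even if it is faithful in meaning, which matches the stated goal of verbatim anchoring.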

Four dimensions of strategic drift

Dimension	Theoretical basis	F1 (best model)
Consistency	Burgelman (2002); Zajac & Shortell (1989)	0.86
Hedging Asymmetry	Staw et al. (1983)	0.83
Commitment Escalation	Staw (1981)	0.77
Audience Divergence	Gioia & Chittipeddi (1991)	0.72

Data

StratDrift-10K corpus

The corpus comprises 10,847 document-year observations across 487 S&P 500 firms (2003–2023). Each observation includes:

Q4 earnings call transcript
10-K MD&A section (Item 7)
10-K Risk Factors (Item 1A)
Paragraph-level annotations across four dimensions (127,412 in total across the corpus)

Access: Due to data licensing restrictions (Refinitiv Eikon; SEC EDGAR terms), the raw corpus cannot be distributed directly. The annotation JSONL files, codebook, and all prompts are released in this repository. SEC EDGAR 10-K filings can be collected freely using scripts/collect_corpus.py. Earnings call transcripts require a Refinitiv Eikon or equivalent subscription.

Data directory layout

data/
├── sp500_tickers.csv           # S&P 500 constituent tickers (2003–2023)
├── annotations/
│   ├── train_set.jsonl         # Training annotations (70%)
│   ├── val_set.jsonl           # Validation annotations (15%)
│   └── test_set.jsonl          # Test annotations (15%, stratified)
├── few_shot_examples.jsonl     # 8 curated (prompt, response) pairs
├── corpus/                     # Built by collect_corpus.py
│   └── {TICKER}/{YEAR}/
│       ├── earnings_call.txt
│       ├── 10k_mda.txt
│       ├── 10k_risks.txt
│       └── bundle.json
└── codebook/
    └── annotation_codebook.pdf  # Full annotation codebook with examples
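
Given this layout, a document bundle can be located from the ticker and year alone. The sketch below walks data/corpus and reads the three text files for each firm-year. The `iter_bundles` function and the `FirmYear` container are hypothetical helpers, not part of the orgsense package, and nothing is assumed about the contents of bundle.json.

```python
from dataclasses import dataclass
from pathlib import Path

# File names taken from the directory layout above.
TEXT_FILES = {
    "earnings_call": "earnings_call.txt",
    "mda": "10k_mda.txt",
    "risks": "10k_risks.txt",
}


@dataclass
class FirmYear:
    ticker: str
    year: int
    texts: dict  # keys from TEXT_FILES; value is None if the file is missing


def iter_bundles(corpus_dir):
    """Yield one FirmYear per {TICKER}/{YEAR}/ directory under corpus_dir."""
    root = Path(corpus_dir)
    for year_dir in sorted(root.glob("*/*")):
        # Skip stray files and any directory whose name is not a year.
        if not (year_dir.is_dir() and year_dir.name.isdigit()):
            continue
        texts = {}
        for key, filename in TEXT_FILES.items():
            path = year_dir / filename
            texts[key] = path.read_text(encoding="utf-8") if path.exists() else None
        yield FirmYear(ticker=year_dir.parent.name, year=int(year_dir.name), texts=texts)


if __name__ == "__main__":
    for bundle in iter_bundles("data/corpus"):
        available = [key for key, text in bundle.texts.items() if text is not None]
        print(bundle.ticker, bundle.year, available)
```

Missing files are reported as `None` rather than raising, since earnings call transcripts require a separate subscription and may be absent for some firm-years.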

Testing

pytest tests/ -v --cov=orgsense --cov-report=term-missing

Citation

@article{yan2024orgsense,
  title   = {Beyond Sentiment: Large Language Models as Organizational Sensors
             for Detecting Strategic Drift in Corporate Communications},
  author  = {Yan, Justin},
  year    = {2024},
  journal = {Working Paper},
  institution = {Baylor University}
}

License

MIT License. See LICENSE for details.

The StratDrift-10K annotation files are released under CC BY 4.0. Raw transcripts and filings are subject to the terms of Refinitiv Eikon and SEC EDGAR, respectively.
