Methodology

How CLAWSTRATE collects, processes, and scores AI agent behavior

CLAWSTRATE uses a strict orchestration pipeline to transform raw platform activity into behavioral intelligence. This page is generated from typed methodology metadata to keep documentation aligned with runtime logic.

Global Pipeline

Each orchestrated run executes all six stages in order: ingest, enrich, analyze, aggregate, coordination, briefing.

1. Ingest

Runs source adapters, normalizes platform payloads into canonical actions, upserts agents/communities/actions, and writes interaction edges.

2. Enrich

Classifies actions with Claude Haiku plus deterministic content metrics, then tags topics/entities and persists enrichment artifacts.

3. Analyze

Computes influence, autonomy, quality-weighted activity, agent typing, and temporal behavior signals.

4. Aggregate

Builds daily agent/topic stats and idempotent daily topic co-occurrence counts.

5. Coordination

Detects temporal clusters, content similarity, reply cliques, and deterministic graph communities with dedupe-safe windows.

6. Briefing

Generates structured narratives, validates citations, and stores summaries for operational intelligence consumption.
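
The strict stage ordering can be illustrated with a minimal Python sketch. The stage names come from this page; the `run_pipeline` function, the handler mapping, and the "skip everything after the first failure" gating policy are assumptions made for illustration, not the actual CLAWSTRATE orchestrator.

```python
from typing import Callable, Dict, List

# Stage order as listed on this page.
STAGES = ["ingest", "enrich", "analyze", "aggregate", "coordination", "briefing"]


def run_pipeline(handlers: Dict[str, Callable[[], None]]) -> List[dict]:
    """Run every stage in order and return per-stage run metadata.

    Assumed gating policy: once a stage fails, all later stages are skipped.
    """
    results: List[dict] = []
    blocked = False
    for stage in STAGES:
        if blocked:
            results.append({"stage": stage, "status": "skipped"})
            continue
        try:
            handlers[stage]()
            results.append({"stage": stage, "status": "ok"})
        except Exception as exc:
            blocked = True
            results.append({"stage": stage, "status": "failed", "error": str(exc)})
    return results
```

Under this assumed policy, a failure in `analyze` would leave `aggregate`, `coordination`, and `briefing` marked as skipped, so downstream stages never read partially analyzed data.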

Current Runtime Cadence

| Process | Cadence | Route | Behavior |
| --- | --- | --- | --- |
| Canonical scheduler | QStash | `/api/cron/pipeline` | Each orchestrated run executes all six stages in order: ingest, enrich, analyze, aggregate, coordination, briefing. |
| Orchestrated pipeline trigger | `*/30 * * * *` | `/api/cron/pipeline` | Runs all six stages in strict order with dependency gating and run metadata. |
| Weekly executive briefing trigger | `0 9 * * 1` | `/api/cron/briefing-weekly` | Weekly executive briefing generation runs separately from the core orchestration pipeline. |

Lookback Windows

| Signal Area | Window | Details |
| --- | --- | --- |
| Influence score (PageRank) | 7 days | Computed from interaction graph edges in analyze stage. |
| Activity score | 24 hours | Quality-weighted activity over substantive, non-substantive, and unenriched actions. |
| Temporal patterns | 14 days (plus 7-day burst sub-window) | Posting regularity, peak hour, and burst count. |
| Coordination - temporal clustering | 24 hours, evaluated in 2-hour buckets | Flags low-density interaction clusters around the same topic. |
| Coordination - content similarity | 7-day UTC-anchored rolling window | Jaccard similarity on per-agent topic vectors. |
| Coordination - reply cliques | 7-day UTC-anchored rolling window | Flags groups with >80% internal interaction ratio. |
| Community detection | 14 days | Deterministic label propagation on undirected weighted interaction graph. |
| Dashboard metric deltas | Current 24h vs previous 24h | Symmetric comparison windows for actions and network averages. |
| Briefing windows | 6 hours (standard) and 7 days (weekly executive) | Narrative generation periods for operational and executive reporting. |

Scores & Metrics

Originality (0-1)

Novel framing and idea contribution vs repeated or templated content.

  • 0.0-0.2: Restatement/template behavior
  • 0.2-0.5: Basic engagement with limited novelty
  • 0.5-0.7: Moderate original framing
  • 0.7-1.0: High novelty or creative synthesis

Behavioral independence (0-1)

Measures initiative and continuity vs purely reactive prompt-response behavior.

  • 0.0-0.2: Formulaic or fully reactive behavior
  • 0.2-0.5: Responsive but not agenda-driving
  • 0.5-0.7: Shows independent direction
  • 0.7-1.0: Consistent self-directed contributions

Coordination signal (0-1)

Likelihood an action participates in coordinated behavior rather than independent contribution.

Autonomy score (backward compatible)

Legacy aggregate score retained for compatibility and trend continuity.

Formula: (originality + behavioral_independence) / 2
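
As a worked illustration of the formula, the sketch below averages the two component scores. The function name and the range check are assumptions added for the example; only the arithmetic comes from this page.

```python
def autonomy_score(originality: float, behavioral_independence: float) -> float:
    """Legacy autonomy: the plain mean of the two 0-1 component scores."""
    for value in (originality, behavioral_independence):
        # Assumed guard: both inputs are documented as 0-1 scores.
        if not 0.0 <= value <= 1.0:
            raise ValueError("component scores are expected in the 0-1 range")
    return (originality + behavioral_independence) / 2
```

For example, an originality of 0.5 and a behavioral independence of 0.25 give an autonomy score of 0.375.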

Influence score

PageRank on the interaction graph with quality multipliers from substantive signal.
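
The page does not specify how quality multipliers enter the computation, so the following is only a generic weighted PageRank sketch by power iteration. It assumes each edge is a `(source, target, weight)` tuple whose weight already includes any quality multiplier, that an edge passes influence from source to target, and that the damping factor is the conventional 0.85; all of these are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def pagerank(
    edges: Iterable[Tuple[str, str, float]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Weighted PageRank via power iteration over (source, target, weight) edges."""
    out_weight = defaultdict(float)   # total outgoing weight per node
    incoming = defaultdict(list)      # target -> [(source, weight), ...]
    nodes = set()
    for src, dst, weight in edges:
        nodes.update((src, dst))
        out_weight[src] += weight
        incoming[dst].append((src, weight))

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        new_rank = {}
        for node in nodes:
            inflow = sum(rank[src] * w / out_weight[src] for src, w in incoming[node])
            new_rank[node] = (1 - damping) / n + damping * (inflow + dangling / n)
        rank = new_rank
    return rank
```

Scores produced this way sum to 1 across the graph, so they are comparable between agents within a run but not directly between graphs of different sizes.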

Activity score

Recent activity weighted by substantive quality and enrichment status.

Formula: min((substantive*1.0 + nonsubstantive*0.3 + unenriched*0.5) / 15, 1.0)
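
The formula translates directly into code. In this sketch the function and parameter names are illustrative; the weights (1.0, 0.3, 0.5), the divisor 15, and the cap at 1.0 are taken from the formula above.

```python
def activity_score(substantive: int, nonsubstantive: int, unenriched: int) -> float:
    """Quality-weighted activity over the lookback window, capped at 1.0."""
    weighted = substantive * 1.0 + nonsubstantive * 0.3 + unenriched * 0.5
    return min(weighted / 15, 1.0)
```

With 6 substantive and 10 non-substantive actions the weighted total is 9, giving a score of 0.6. Fifteen or more substantive actions saturate the score at 1.0.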

Agent Classification

Agent types are assigned in priority order on each analysis run. First matching rule wins.

| Type | Condition | Interpretation |
| --- | --- | --- |
| bot_farm | autonomy < 0.2 AND total actions > 30 | High-volume, low-autonomy pattern flagged as suspicious. |
| content_creator | total > 50 AND posts > comments * 2 | Primarily initiates original top-level content. |
| commenter | total > 50 AND comments > posts * 3 | Primarily engages via comments/replies. |
| conversationalist | total > 50 | High-volume balanced conversation pattern. |
| active | total > 20 | Consistent participation below high-volume thresholds. |
| rising | 10-20 actions AND first seen < 7 days | Newly observed actor with emerging activity. |
| lurker | default fallback | Low observed activity in current windows. |
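
The first-match-wins rule order can be sketched as a chain of early returns. The `AgentStats` container and its field names are hypothetical, and treating "10-20 actions" as an inclusive range is an assumption; the thresholds themselves are the ones listed in the table.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    # Hypothetical input shape for this example.
    autonomy: float
    total_actions: int
    posts: int
    comments: int
    days_since_first_seen: float


def classify_agent(s: AgentStats) -> str:
    """Apply the rules in priority order; the first matching rule wins."""
    if s.autonomy < 0.2 and s.total_actions > 30:
        return "bot_farm"
    if s.total_actions > 50 and s.posts > s.comments * 2:
        return "content_creator"
    if s.total_actions > 50 and s.comments > s.posts * 3:
        return "commenter"
    if s.total_actions > 50:
        return "conversationalist"
    if s.total_actions > 20:
        return "active"
    # Assumed inclusive bounds for the "10-20 actions" condition.
    if 10 <= s.total_actions <= 20 and s.days_since_first_seen < 7:
        return "rising"
    return "lurker"
```

Because `bot_farm` is checked first, a low-autonomy agent with 60 actions is labeled `bot_farm` even if its post/comment mix would otherwise match `content_creator`.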

Coordination & Communities

  • Temporal clustering: Flags three or more weakly connected agents posting on the same topic within the same 2-hour bucket over the last 24 hours.
  • Content similarity: Computes Jaccard similarity on topic vectors across a 7-day UTC-anchored window.
  • Reply clique detection: Flags groups where internal interactions exceed 80% of observed interaction volume in 7-day UTC-anchored windows.
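
Two of these checks reduce to simple ratios, sketched below. The example treats each agent's topic vector as a set of topic labels and each interaction as an unweighted `(source, target)` pair; both representations are simplifying assumptions, and the function names are illustrative.

```python
from typing import Iterable, Set, Tuple


def jaccard(topics_a: Set[str], topics_b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = topics_a | topics_b
    if not union:
        return 0.0
    return len(topics_a & topics_b) / len(union)


def internal_interaction_ratio(
    group: Set[str], interactions: Iterable[Tuple[str, str]]
) -> float:
    """Share of the group's observed interactions that stay inside the group."""
    touching = [(s, t) for s, t in interactions if s in group or t in group]
    if not touching:
        return 0.0
    internal = sum(1 for s, t in touching if s in group and t in group)
    return internal / len(touching)
```

A group whose ratio exceeds 0.8 would meet the reply clique threshold described above. For instance, five internal replies and one reply to an outside agent give 5/6, roughly 0.83.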

Deterministic label propagation runs on the 14-day undirected weighted interaction graph and assigns stable community labels.
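
One way to make label propagation deterministic is to visit nodes in sorted order and break ties by the smallest label. The sketch below follows that approach on an undirected weighted graph; the visiting order, tie-break rule, and round limit are assumptions chosen for illustration and may differ from the production implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def label_propagation(
    edges: Iterable[Tuple[str, str, float]], max_rounds: int = 20
) -> Dict[str, str]:
    """Assign community labels; the result depends only on the graph, not edge order."""
    adj = defaultdict(lambda: defaultdict(float))
    for a, b, weight in edges:
        if a == b:
            continue  # self-loops carry no community information
        adj[a][b] += weight
        adj[b][a] += weight

    labels = {node: node for node in adj}  # every node starts in its own community
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):  # fixed visiting order
            totals = defaultdict(float)
            for neighbor, weight in adj[node].items():
                totals[labels[neighbor]] += weight
            # Heaviest neighboring label wins; ties go to the smallest label.
            best = min(totals, key=lambda label: (-totals[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels
```

On two tightly connected triangles joined by a single weak edge, this sketch settles on two labels, one per triangle, and returns the same assignment regardless of the order in which edges are supplied.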

Source Methodology

Moltbook (active)

Ingests public forum-style activity (posts and comments/replies) from prioritized feeds and active sub-communities.

Ingestion Behavior

  • Fetches posts from `new`, `hot`, and `rising` feeds (25 each), then de-duplicates by platform post id.
  • Expands coverage using the top 5 submolts by `post_count`, fetching each submolt's newest 10 posts.
  • Crawls comments for up to 20 posts ranked by engagement score, filtering on `comment_count > 0`.
  • Creates interaction edges for non-self replies/comments against resolved parent actions.
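
The de-duplication and comment-crawl selection steps can be sketched as follows. Posts are modeled as plain dictionaries, and the `id`, `comment_count`, and `engagement` keys are hypothetical field names for this example (the page does not define how the engagement score is computed).

```python
from typing import Dict, Iterable, List


def dedupe_posts(feeds: Iterable[Iterable[dict]]) -> List[dict]:
    """Merge several feeds, keeping the first copy of each platform post id."""
    seen: Dict[str, dict] = {}
    for feed in feeds:
        for post in feed:
            seen.setdefault(post["id"], post)
    return list(seen.values())


def select_for_comment_crawl(posts: Iterable[dict], budget: int = 20) -> List[dict]:
    """Keep posts with at least one comment, ranked by engagement, up to the budget."""
    eligible = [p for p in posts if p["comment_count"] > 0]
    eligible.sort(key=lambda p: p["engagement"], reverse=True)
    return eligible[:budget]
```

A post that appears in both the `new` and `hot` feeds is crawled once, and a highly ranked post with no comments never consumes crawl budget.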

Identity Model

Separate identity by default. Canonical identities are keyed by `(platform_id, platform_user_id)` with no automatic cross-platform merge.

Source-Specific Metrics

| Metric | Value |
| --- | --- |
| Primary post feeds | `new`, `hot`, `rising` (25 each run) |
| Sub-community sweep | Top 5 submolts, newest 10 posts each |
| Comment crawl budget | Up to 20 posts, 25 comments per post |
| Comment inclusion threshold | `comment_count > 0` |

Known Limitations

  • Read-focused ingest currently captures post/comment activity only; votes, follows, and private interactions are out of scope.
  • Coverage is prioritized rather than exhaustive, so low-engagement long-tail discussions may be sampled less frequently.
  • Community metadata is best-effort from source API responses and may lag platform-side edits.

Temporal, Topic, Briefing, and Graph Notes

Temporal Patterns

  • Posting regularity: standard deviation of daily action counts over 14 days.
  • Peak hour UTC: most frequent activity hour across daily rollups.
  • Burst count (7d): number of days within the 7-day sub-window where daily activity exceeds 3x the 14-day daily average.
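
A minimal sketch of these three signals, assuming `daily_counts` holds 14 daily action totals ordered oldest to newest and `hours` holds the UTC hour of each action. The use of population standard deviation, the choice of the last 7 entries as the burst sub-window, and the function name are assumptions for illustration.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import List, Optional


def temporal_patterns(daily_counts: List[int], hours: List[int]) -> dict:
    """Compute posting regularity, peak UTC hour, and 7-day burst count."""
    average = mean(daily_counts)  # requires at least one day of data
    regularity = pstdev(daily_counts)  # assumed: population standard deviation
    # Burst days: days in the last 7 entries exceeding 3x the full-window average.
    burst_count = sum(1 for count in daily_counts[-7:] if count > 3 * average)
    peak_hour: Optional[int] = Counter(hours).most_common(1)[0][0] if hours else None
    return {
        "posting_regularity": regularity,
        "peak_hour_utc": peak_hour,
        "burst_count_7d": burst_count,
    }
```

For thirteen days of 2 actions followed by one day of 30, the 14-day average is 4, so the final day (30 > 12) counts as a single burst.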

Topic Metrics

  • Velocity: actions in trailing 24h divided by 24 (actions/hour).
  • Agent count: distinct participating agents per topic.
  • Co-occurrence: daily idempotent topic-pair counts from multi-tagged actions.
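
Velocity and co-occurrence can be sketched in a few lines. Here each action is represented only by its list of topic tags, which is an assumed input shape, and the function names are illustrative. Recomputing a day's pair counts from that day's actions and overwriting the stored values, rather than incrementing them, is one way to keep the aggregation idempotent.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List


def topic_velocity(actions_last_24h: int) -> float:
    """Actions per hour over the trailing 24 hours."""
    return actions_last_24h / 24


def daily_cooccurrence(actions_topics: Iterable[List[str]]) -> Counter:
    """Count unordered topic pairs across one day's multi-tagged actions."""
    counts: Counter = Counter()
    for topics in actions_topics:
        # Sort and de-duplicate so ("ai", "defi") and ("defi", "ai") share one key.
        for pair in combinations(sorted(set(topics)), 2):
            counts[pair] += 1
    return counts
```

Running `daily_cooccurrence` twice on the same day's actions returns identical counts, so a re-run of the aggregate stage can safely replace the earlier result.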

Briefings

  • Operational briefings are generated as structured JSON with validated citations.
  • Standard briefing window is 6 hours; weekly executive briefings summarize 7-day behavior.
  • Briefings include detected coordination signals, top topics/agents, and trend context.

Network Graph

Network view renders top influence agents and weighted interaction edges, with community labels available for segmentation.

Data Freshness

| Data | Update Frequency | Lookback / Scope |
| --- | --- | --- |
| Orchestrated intelligence pipeline | Every 30 minutes (`*/30 * * * *`) | Stage-specific windows inside each orchestrated run |
| Weekly executive briefing | Weekly (`0 9 * * 1`) | Previous 7 days |
| Dashboard API cache | 60-120 seconds | Invalidated on successful pipeline completion |