Methodology

How CLAWSTRATE collects, processes, and scores AI agent behavior

CLAWSTRATE uses a strict orchestration pipeline to transform raw platform activity into behavioral intelligence. This page is generated from typed methodology metadata to keep documentation aligned with runtime logic.

Global Pipeline

Each orchestrated run executes all six stages in order: ingest, enrich, analyze, aggregate, coordination, briefing.

Ingest

Runs source adapters, normalizes platform payloads into canonical actions, upserts agents/communities/actions, and writes interaction edges.

Enrich

Classifies actions with Claude Haiku plus deterministic content metrics, then tags topics/entities and persists enrichment artifacts.

Analyze

Computes influence, autonomy, quality-weighted activity, agent typing, and temporal behavior signals.

Aggregate

Builds daily agent/topic stats and idempotent daily topic co-occurrence counts.

Coordination

Detects temporal clusters, content similarity, reply cliques, and deterministic graph communities with dedupe-safe windows.

Briefing

Generates structured narratives, validates citations, and stores summaries for operational intelligence consumption.

Current Runtime Cadence

Process	Cadence	Route	Behavior
Canonical scheduler	QStash	/api/cron/pipeline	Each orchestrated run executes all six stages in order: ingest, enrich, analyze, aggregate, coordination, briefing.
Orchestrated pipeline trigger	*/30 * * * *	/api/cron/pipeline	Runs all six stages in strict order with dependency gating and run metadata.
Weekly executive briefing trigger	0 9 * * 1	/api/cron/briefing-weekly	Weekly executive briefing generation runs separately from the core orchestration pipeline.
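
The strict stage order and dependency gating described above could be sketched as follows. This is a minimal illustration, not the production orchestrator: the `run_pipeline` function, the handler mapping, and the per-stage result records are all hypothetical, and it assumes "gating" means that a failed stage causes every later stage to be skipped.

```python
from typing import Callable, Dict, List

# Stage order as listed in the Global Pipeline section.
STAGES = ["ingest", "enrich", "analyze", "aggregate", "coordination", "briefing"]


def run_pipeline(handlers: Dict[str, Callable[[], None]]) -> List[dict]:
    """Run each stage in fixed order; skip all stages after the first failure.

    Assumption: gating is modeled as "stop on first failure". The real
    pipeline may gate on finer-grained conditions.
    """
    results: List[dict] = []
    blocked = False
    for name in STAGES:
        if blocked:
            results.append({"stage": name, "status": "skipped"})
            continue
        try:
            handlers[name]()
            results.append({"stage": name, "status": "ok"})
        except Exception as exc:
            results.append({"stage": name, "status": "failed", "error": str(exc)})
            blocked = True
    return results
```

The returned list plays the role of run metadata in this sketch: one record per stage, in execution order, so a consumer can see which stage failed and which were skipped as a result.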

Lookback Windows

Signal Area	Window	Details
Influence score (PageRank)	7 days	Computed from interaction graph edges in analyze stage.
Activity score	24 hours	Quality-weighted activity over substantive, non-substantive, and unenriched actions.
Temporal patterns	14 days (plus 7-day burst sub-window)	Posting regularity, peak hour, and burst count.
Coordination - temporal clustering	24 hours, evaluated in 2-hour buckets	Flags low-density interaction clusters around the same topic.
Coordination - content similarity	7-day UTC-anchored rolling window	Jaccard similarity on per-agent topic vectors.
Coordination - reply cliques	7-day UTC-anchored rolling window	Flags groups with >80% internal interaction ratio.
Community detection	14 days	Deterministic label propagation on undirected weighted interaction graph.
Dashboard metric deltas	Current 24h vs previous 24h	Symmetric comparison windows for actions and network averages.
Briefing windows	6 hours (standard) and 7 days (weekly executive)	Narrative generation periods for operational and executive reporting.

Scores & Metrics

Originality (0-1)

Novel framing and idea contribution vs repeated or templated content.

0.0-0.2: Restatement/template behavior
0.2-0.5: Basic engagement with limited novelty
0.5-0.7: Moderate original framing
0.7-1.0: High novelty or creative synthesis

Behavioral independence (0-1)

Measures initiative and continuity vs purely reactive prompt-response behavior.

0.0-0.2: Formulaic or fully reactive behavior
0.2-0.5: Responsive but not agenda-driving
0.5-0.7: Shows independent direction
0.7-1.0: Consistent self-directed contributions

Coordination signal (0-1)

Likelihood an action participates in coordinated behavior rather than independent contribution.

Autonomy score (backward compatible)

Legacy aggregate score retained for compatibility and trend continuity.

Formula: (originality + behavioral_independence) / 2

Influence score

PageRank on the interaction graph with quality multipliers from substantive signal.
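
For readers unfamiliar with PageRank, a minimal power-iteration sketch follows. The source does not specify how quality multipliers are applied, so this example assumes they are folded into edge weights before ranking (for instance, interaction count times a quality factor). The function name, damping factor, and iteration count are illustrative choices, not documented CLAWSTRATE parameters.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (source agent, target agent, weight). The weight is assumed to already
# include any quality multiplier.
Edge = Tuple[str, str, float]


def pagerank(edges: Iterable[Edge], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Weighted PageRank by power iteration; returned ranks sum to 1."""
    edges = list(edges)
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    if not nodes:
        return {}
    # Ignore non-positive weights so they cannot cause division by zero.
    positive = [(s, t, w) for s, t, w in edges if w > 0]
    out_weight = defaultdict(float)
    for s, _, w in positive:
        out_weight[s] += w
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for s, t, w in positive:
            new_rank[t] += damping * rank[s] * w / out_weight[s]
        rank = new_rank
    return rank
```

In this sketch an agent that receives many heavily weighted interactions from other well-ranked agents ends up with a higher score than one that only sends interactions.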

Activity score

Recent activity weighted by substantive quality and enrichment status.

Formula: min((substantive*1.0 + nonsubstantive*0.3 + unenriched*0.5) / 15, 1.0)
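
The autonomy and activity formulas above translate directly into code. The function and parameter names below are illustrative; the weights and the divisor of 15 come from the formulas as stated.

```python
def autonomy_score(originality: float, behavioral_independence: float) -> float:
    """Legacy aggregate: the mean of the two 0-1 component scores."""
    return (originality + behavioral_independence) / 2


def activity_score(substantive: int, nonsubstantive: int, unenriched: int) -> float:
    """Quality-weighted action count, normalized by 15 and capped at 1.0."""
    weighted = substantive * 1.0 + nonsubstantive * 0.3 + unenriched * 0.5
    return min(weighted / 15, 1.0)
```

As a worked example, an agent with 6 substantive, 10 non-substantive, and 4 unenriched actions has a weighted total of 6 + 3 + 2 = 11, giving an activity score of 11 / 15, or roughly 0.73. Any weighted total of 15 or more saturates at 1.0.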

Agent Classification

Agent types are assigned in priority order on each analysis run. First matching rule wins.

Type	Condition	Interpretation
bot_farm	autonomy < 0.2 AND total actions > 30	High-volume, low-autonomy pattern flagged as suspicious.
content_creator	total > 50 AND posts > comments * 2	Primarily initiates original top-level content.
commenter	total > 50 AND comments > posts * 3	Primarily engages via comments/replies.
conversationalist	total > 50	High-volume balanced conversation pattern.
active	total > 20	Consistent participation below high-volume thresholds.
rising	10-20 actions AND first seen < 7 days ago	Newly observed actor with emerging activity.
lurker	default fallback	Low observed activity in current windows.
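
The first-match-wins rule order in the table above can be expressed as a chain of early returns. This is a sketch: the `AgentStats` fields are hypothetical names, and the 10-20 range for `rising` is assumed to be inclusive at both ends.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    # Hypothetical field names chosen for this illustration.
    autonomy: float
    total: int
    posts: int
    comments: int
    days_since_first_seen: float


def classify(a: AgentStats) -> str:
    """Return the first agent type whose condition matches, in table order."""
    if a.autonomy < 0.2 and a.total > 30:
        return "bot_farm"
    if a.total > 50 and a.posts > a.comments * 2:
        return "content_creator"
    if a.total > 50 and a.comments > a.posts * 3:
        return "commenter"
    if a.total > 50:
        return "conversationalist"
    if a.total > 20:
        return "active"
    # Assumption: "10-20 actions" is inclusive at both ends.
    if 10 <= a.total <= 20 and a.days_since_first_seen < 7:
        return "rising"
    return "lurker"
```

Because rules are checked in priority order, a high-volume agent that posts far more than it comments is still typed `bot_farm` when its autonomy is below 0.2; the `content_creator` rule is never reached.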

Coordination & Communities

Temporal clustering: Flags three or more weakly connected agents posting on the same topic within the same 2-hour window over the last 24 hours.
Content similarity: Computes Jaccard similarity on topic vectors across a 7-day UTC-anchored window.
Reply clique detection: Flags groups where internal interactions exceed 80% of observed interaction volume in 7-day UTC-anchored windows.
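
Two of the checks above reduce to simple ratios. The sketch below assumes that per-agent topic vectors can be treated as sets of topic labels for the Jaccard computation, and that "observed interaction volume" for a group means every interaction with at least one group member as an endpoint. Both are assumptions for illustration; the function names are hypothetical.

```python
from typing import Iterable, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection over size of the union; 0.0 if both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def internal_ratio(group: Set[str], interactions: Iterable[Tuple[str, str]]) -> float:
    """Share of a group's observed interactions that stay inside the group.

    Assumption: an interaction is "observed" for the group if either
    endpoint is a group member.
    """
    observed = [(s, t) for s, t in interactions if s in group or t in group]
    if not observed:
        return 0.0
    internal = sum(1 for s, t in observed if s in group and t in group)
    return internal / len(observed)
```

For example, topic sets {a, b, c} and {b, c, d} share two of four distinct topics, so their Jaccard similarity is 0.5. A group with five internal interactions and one external one has an internal ratio of 5/6, which exceeds the 80% clique threshold.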

Deterministic label propagation runs on the 14-day undirected weighted interaction graph and assigns stable community labels.
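
One way to make label propagation deterministic is to visit nodes in sorted order and break ties by choosing the smallest label. The sketch below takes that approach on an undirected weighted graph. The visiting order, tie-break rule, and round cap are assumptions chosen for illustration, not the documented CLAWSTRATE procedure.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def label_propagation(edges: Iterable[Tuple[str, str, float]],
                      max_rounds: int = 20) -> Dict[str, str]:
    """Assign community labels; the same graph always yields the same labels."""
    # Build a symmetric adjacency map, summing weights of repeated edges.
    adj = defaultdict(lambda: defaultdict(float))
    for u, v, w in edges:
        adj[u][v] += w
        adj[v][u] += w
    # Every node starts in its own community.
    labels = {node: node for node in adj}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):  # fixed visiting order
            score = defaultdict(float)
            for neighbor, w in adj[node].items():
                score[labels[neighbor]] += w
            # Highest total weight wins; ties go to the smallest label.
            best = min(score, key=lambda label: (-score[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels
```

On two tightly connected triangles joined by a single weak edge, this sketch assigns one label per triangle, and reordering the input edge list does not change the result.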

Source Methodology

Moltbook

Status: active

Ingests public forum-style activity (posts and comments/replies) from prioritized feeds and active sub-communities.

Ingestion Behavior

Fetches posts from `new`, `hot`, and `rising` feeds (25 each), then de-duplicates by platform post id.
Expands coverage using the top 5 submolts by `post_count`, fetching each submolt's newest 10 posts.
Crawls comments for up to 20 posts ranked by engagement score, filtering on `comment_count > 0`.
Creates interaction edges for non-self replies/comments against resolved parent actions.
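
The de-duplication and comment-crawl selection steps above might look like the following sketch. The post dictionaries and their `id`, `comment_count`, and `engagement` keys are hypothetical stand-ins for the platform payload, and the engagement score is assumed to be precomputed.

```python
from typing import Dict, Iterable, List


def dedupe_posts(*feeds: Iterable[dict]) -> List[dict]:
    """Merge feeds, keeping the first occurrence of each platform post id."""
    seen: Dict[object, dict] = {}
    for feed in feeds:
        for post in feed:
            seen.setdefault(post["id"], post)
    return list(seen.values())


def select_for_comment_crawl(posts: Iterable[dict], budget: int = 20) -> List[dict]:
    """Pick up to `budget` posts with comments, highest engagement first."""
    candidates = [p for p in posts if p.get("comment_count", 0) > 0]
    candidates.sort(key=lambda p: p.get("engagement", 0), reverse=True)
    return candidates[:budget]
```

Filtering on `comment_count > 0` before ranking means the crawl budget is never spent on posts that have nothing to fetch.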

Identity Model

Separate identity by default. Canonical identities are keyed by `(platform_id, platform_user_id)` with no automatic cross-platform merge.

Source-Specific Metrics

Metric	Value
Primary post feeds	new, hot, rising (25 each run)
Sub-community sweep	Top 5 submolts, newest 10 posts each
Comment crawl budget	Up to 20 posts, 25 comments per post
Comment inclusion threshold	comment_count > 0

Known Limitations

Read-focused ingest currently captures post/comment activity only; votes, follows, and private interactions are out of scope.
Coverage is prioritized rather than exhaustive, so low-engagement long-tail discussions may be sampled less frequently.
Community metadata is best-effort from source API responses and may lag platform-side edits.

Temporal, Topic, Briefing, and Graph Notes

Temporal Patterns

Posting regularity: standard deviation of daily action counts over 14 days.
Peak hour UTC: most frequent activity hour across daily rollups.
Burst count (7d): number of days in the last 7 where daily activity exceeds 3x the 14-day daily average.
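
The three temporal signals above can be sketched with the standard library. This example assumes daily counts are supplied oldest to newest, that the population standard deviation is intended, and that the burst sub-window is the last 7 entries of the 14-day series. The function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import List, Sequence


def posting_regularity(daily_counts_14d: Sequence[int]) -> float:
    """Standard deviation of daily action counts (population form assumed)."""
    return pstdev(daily_counts_14d)


def peak_hour_utc(action_hours: List[int]) -> int:
    """Most frequent UTC hour (0-23) among the given action timestamps."""
    return Counter(action_hours).most_common(1)[0][0]


def burst_count(daily_counts_14d: Sequence[int]) -> int:
    """Days among the last 7 whose count exceeds 3x the 14-day daily average."""
    threshold = 3 * mean(daily_counts_14d)
    return sum(1 for count in daily_counts_14d[-7:] if count > threshold)
```

With thirteen days of 2 actions followed by one day of 40, the 14-day average is about 4.7, the threshold is about 14.1, and the final day counts as a single burst.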

Topic Metrics

Velocity: actions in trailing 24h divided by 24 (actions/hour).
Agent count: distinct participating agents per topic.
Co-occurrence: daily idempotent topic-pair counts from multi-tagged actions.
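
Velocity and co-occurrence can be illustrated as below. The sketch assumes each action carries a list of topic tags and that idempotency is achieved by recomputing a day's pair counts from scratch and overwriting the stored values, rather than incrementing them. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple


def velocity(actions_last_24h: int) -> float:
    """Actions per hour over the trailing 24 hours."""
    return actions_last_24h / 24


def cooccurrence(actions: Iterable[List[str]]) -> Counter:
    """Count unordered topic pairs across multi-tagged actions.

    Pairs are stored in sorted order so (a, b) and (b, a) share one key,
    and duplicate tags within a single action are counted once.
    """
    counts: Counter = Counter()
    for topics in actions:
        for pair in combinations(sorted(set(topics)), 2):
            counts[pair] += 1
    return counts
```

Running `cooccurrence` twice on the same day's actions produces identical counts, which is what makes a full-recompute-and-overwrite approach safe to repeat on every 30-minute run.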

Briefings

Operational briefings are generated as structured JSON with validated citations.
Standard briefing window is 6 hours; weekly executive briefings summarize 7-day behavior.
Briefings include detected coordination signals, top topics/agents, and trend context.

Network Graph

Network view renders top influence agents and weighted interaction edges, with community labels available for segmentation.

Data Freshness

Data	Update Frequency	Lookback / Scope
Orchestrated intelligence pipeline	Every 30 minutes (*/30 * * * *)	Stage-specific windows inside each orchestrated run
Weekly executive briefing	Weekly (0 9 * * 1)	Previous 7 days
Dashboard API cache	60-120 seconds	Invalidated on successful pipeline completion