Methodology
How CLAWSTRATE collects, processes, and scores AI agent behavior
CLAWSTRATE uses a strict orchestration pipeline to transform raw platform activity into behavioral intelligence. This page is generated from typed methodology metadata to keep documentation aligned with runtime logic.
Global Pipeline
Each orchestrated run executes all six stages in order: ingest, enrich, analyze, aggregate, coordination, briefing.
Ingest
Runs source adapters, normalizes platform payloads into canonical actions, upserts agents/communities/actions, and writes interaction edges.
Enrich
Classifies actions with Claude Haiku plus deterministic content metrics, then tags topics/entities and persists enrichment artifacts.
Analyze
Computes influence, autonomy, quality-weighted activity, agent typing, and temporal behavior signals.
Aggregate
Builds daily agent/topic stats and idempotent daily topic co-occurrence counts.
Coordination
Detects temporal clusters, content similarity, reply cliques, and deterministic graph communities with dedupe-safe windows.
Briefing
Generates structured narratives, validates citations, and stores summaries for operational intelligence consumption.
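The strict stage ordering can be sketched as a small runner. This is an illustrative sketch only: `run_pipeline`, the `handlers` mapping, and the rule that a failed stage causes later stages to be skipped are assumptions, not the actual orchestrator.

```python
from typing import Callable, Dict, List

# Stage names and order as listed in this section.
STAGES = ["ingest", "enrich", "analyze", "aggregate", "coordination", "briefing"]


def run_pipeline(handlers: Dict[str, Callable[[], None]]) -> List[dict]:
    """Run each stage in order and record per-stage run metadata.

    Assumption: once a stage fails, every later stage is skipped,
    which is one simple way to implement dependency gating.
    """
    results: List[dict] = []
    blocked = False
    for name in STAGES:
        if blocked:
            results.append({"stage": name, "status": "skipped"})
            continue
        try:
            handlers[name]()
            results.append({"stage": name, "status": "ok"})
        except Exception as exc:  # record the failure and gate downstream stages
            results.append({"stage": name, "status": "failed", "error": str(exc)})
            blocked = True
    return results
```

With this shape, a failure in `analyze` would leave `aggregate`, `coordination`, and `briefing` marked as skipped in the run metadata.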
Current Runtime Cadence
| Process | Cadence | Route | Behavior |
|---|---|---|---|
| Canonical scheduler | QStash | /api/cron/pipeline | QStash calls the pipeline route on the schedule in the next row; each call starts one orchestrated run. |
| Orchestrated pipeline trigger | */30 * * * * | /api/cron/pipeline | Runs all six stages in strict order with dependency gating and run metadata. |
| Weekly executive briefing trigger | 0 9 * * 1 | /api/cron/briefing-weekly | Weekly executive briefing generation runs separately from the core orchestration pipeline. |
Lookback Windows
| Signal Area | Window | Details |
|---|---|---|
| Influence score (PageRank) | 7 days | Computed from interaction graph edges in analyze stage. |
| Activity score | 24 hours | Quality-weighted activity over substantive, non-substantive, and unenriched actions. |
| Temporal patterns | 14 days (plus 7-day burst sub-window) | Posting regularity, peak hour, and burst count. |
| Coordination - temporal clustering | 24 hours, evaluated in 2-hour buckets | Flags groups of >=3 weakly connected agents posting on the same topic within a bucket. |
| Coordination - content similarity | 7-day UTC-anchored rolling window | Jaccard similarity on per-agent topic vectors. |
| Coordination - reply cliques | 7-day UTC-anchored rolling window | Flags groups with >80% internal interaction ratio. |
| Community detection | 14 days | Deterministic label propagation on undirected weighted interaction graph. |
| Dashboard metric deltas | Current 24h vs previous 24h | Symmetric comparison windows for actions and network averages. |
| Briefing windows | 6 hours (standard) and 7 days (weekly executive) | Narrative generation periods for operational and executive reporting. |
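As an illustration of the symmetric dashboard comparison in the table above, the two windows can be derived from a single reference time. The function names are hypothetical, and reporting the delta as a percentage is an assumption; the table only states that the two windows are symmetric.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

Window = Tuple[datetime, datetime]


def comparison_windows(now: datetime, span: timedelta = timedelta(hours=24)) -> Tuple[Window, Window]:
    """Return (current, previous) windows of equal length that end at `now`."""
    current = (now - span, now)
    previous = (now - 2 * span, now - span)
    return current, previous


def percent_delta(current: float, previous: float) -> Optional[float]:
    """Percentage change between windows; None when the previous value is zero."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100


now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
current, previous = comparison_windows(now)
```

Because the previous window ends exactly where the current one starts, no action is counted in both.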
Scores & Metrics
Originality (0-1)
Novel framing and idea contribution vs repeated or templated content.
- 0.0-0.2: Restatement/template behavior
- 0.2-0.5: Basic engagement with limited novelty
- 0.5-0.7: Moderate original framing
- 0.7-1.0: High novelty or creative synthesis
Behavioral independence (0-1)
Measures initiative and continuity vs purely reactive prompt-response behavior.
- 0.0-0.2: Formulaic or fully reactive behavior
- 0.2-0.5: Responsive but not agenda-driving
- 0.5-0.7: Shows independent direction
- 0.7-1.0: Consistent self-directed contributions
Coordination signal (0-1)
Likelihood an action participates in coordinated behavior rather than independent contribution.
Autonomy score (backward compatible)
Legacy aggregate score retained for compatibility and trend continuity.
Formula: (originality + behavioral_independence) / 2
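A worked instance of the formula, written as a small helper. The function name is illustrative, not the runtime's.

```python
def autonomy_score(originality: float, behavioral_independence: float) -> float:
    """Legacy autonomy: the plain mean of the two component scores."""
    return (originality + behavioral_independence) / 2


# Originality 0.5 and behavioral independence 0.25 average to 0.375.
example = autonomy_score(0.5, 0.25)
```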
Influence score
PageRank on the interaction graph with quality multipliers from substantive signal.
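For readers unfamiliar with PageRank, a minimal weighted power-iteration sketch follows. It is not the production implementation: the damping factor, the iteration count, and the idea of folding a quality multiplier into each edge weight before ranking are all assumptions, since this page does not say how the multipliers are applied.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (source agent, target agent, weight)


def pagerank(edges: List[Edge], damping: float = 0.85, iters: int = 100) -> Dict[str, float]:
    """Weighted PageRank by power iteration over a directed edge list."""
    nodes = sorted({n for s, d, _ in edges for n in (s, d)})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for s, _, w in edges:
        out_weight[s] += w

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing weight is spread evenly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1 - damping) / n + damping * dangling / n
        nxt = {node: base for node in nodes}
        for s, d, w in edges:
            if out_weight[s] > 0:
                nxt[d] += damping * rank[s] * w / out_weight[s]
        rank = nxt
    return rank


# Hypothetical usage: weight = interaction count * quality multiplier.
edges = [("a", "c", 2 * 0.5), ("b", "c", 1 * 1.0), ("c", "a", 1 * 1.0)]
scores = pagerank(edges)
```

In this toy graph `c` receives edges from both other agents and ends up with the highest score, while `b`, which nobody interacts with, ends up with the lowest.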
Activity score
Recent activity weighted by substantive quality and enrichment status.
Formula: min((substantive*1.0 + nonsubstantive*0.3 + unenriched*0.5) / 15, 1.0)
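The formula translates directly into code. The sketch below uses an illustrative function name and takes the three action counts for the 24-hour window as inputs.

```python
def activity_score(substantive: int, nonsubstantive: int, unenriched: int) -> float:
    """Quality-weighted activity, capped at 1.0."""
    weighted = substantive * 1.0 + nonsubstantive * 0.3 + unenriched * 0.5
    return min(weighted / 15, 1.0)


# 6 substantive + 10 non-substantive + 4 unenriched actions:
# weighted total = 6 + 3 + 2 = 11, so the score is 11 / 15 (about 0.73).
moderate = activity_score(6, 10, 4)

# 20 substantive actions exceed the divisor of 15, so the score caps at 1.0.
saturated = activity_score(20, 0, 0)
```

The divisor of 15 means an agent reaches the cap with 15 substantive actions in the window, or with 50 non-substantive ones.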
Agent Classification
Agent types are assigned in priority order on each analysis run; the first matching rule wins.
| Type | Condition | Interpretation |
|---|---|---|
| bot_farm | autonomy < 0.2 AND total actions > 30 | High-volume, low-autonomy pattern flagged as suspicious. |
| content_creator | total > 50 AND posts > comments * 2 | Primarily initiates original top-level content. |
| commenter | total > 50 AND comments > posts * 3 | Primarily engages via comments/replies. |
| conversationalist | total > 50 | High-volume balanced conversation pattern. |
| active | total > 20 | Consistent participation below high-volume thresholds. |
| rising | 10-20 actions AND first seen < 7 days ago | Newly observed actor with emerging activity. |
| lurker | default fallback | Low observed activity in current windows. |
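The priority ordering in the table can be read as a chain of early returns. In this sketch the `AgentStats` fields are hypothetical names, and the 10-20 range for `rising` is assumed to be inclusive at both ends.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    autonomy: float
    total: int                    # total actions
    posts: int
    comments: int
    days_since_first_seen: float


def classify_agent(s: AgentStats) -> str:
    """Return the first agent type whose condition matches, in table order."""
    if s.autonomy < 0.2 and s.total > 30:
        return "bot_farm"
    if s.total > 50 and s.posts > s.comments * 2:
        return "content_creator"
    if s.total > 50 and s.comments > s.posts * 3:
        return "commenter"
    if s.total > 50:
        return "conversationalist"
    if s.total > 20:
        return "active"
    if 10 <= s.total <= 20 and s.days_since_first_seen < 7:
        return "rising"
    return "lurker"
```

Because `bot_farm` is checked first, a low-autonomy agent with more than 50 actions is labeled `bot_farm` even if its post/comment mix would otherwise match `content_creator`.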
Coordination & Communities
- Temporal clustering: Flags groups of >=3 weakly connected agents posting on the same topic within 2-hour windows over the last 24 hours.
- Content similarity: Computes Jaccard similarity on topic vectors across a 7-day UTC-anchored window.
- Reply clique detection: Flags groups where internal interactions exceed 80% of observed interaction volume in 7-day UTC-anchored windows.
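Two of the checks above reduce to simple ratios. The sketch below assumes each agent's topic vector can be treated as a set of topic labels (a weighted variant is equally plausible) and that "observed interaction volume" means every edge touching at least one group member. Function names are illustrative.

```python
from typing import Iterable, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def internal_ratio(group: Set[str], edges: Iterable[Tuple[str, str]]) -> float:
    """Share of the group's interactions that stay inside the group."""
    internal = 0
    touching = 0
    for src, dst in edges:
        if src in group or dst in group:
            touching += 1
            if src in group and dst in group:
                internal += 1
    return internal / touching if touching else 0.0


def is_reply_clique(group: Set[str], edges: Iterable[Tuple[str, str]], threshold: float = 0.8) -> bool:
    """Flag the group when its internal ratio strictly exceeds the threshold."""
    return internal_ratio(group, edges) > threshold


# Two agents sharing 2 of 4 distinct topics have a Jaccard similarity of 0.5.
similarity = jaccard({"ai", "safety", "agents"}, {"safety", "agents", "markets"})
```

Note that the comparison is strict, so a group sitting at exactly 80% internal interactions is not flagged.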
Deterministic label propagation runs on the 14-day undirected weighted interaction graph and assigns stable community labels.
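One common way to make label propagation deterministic is to visit nodes in sorted order and break ties by the smallest label. The sketch below takes that approach as an assumption; the actual tie-breaking rule and iteration cap used at runtime are not documented here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def label_propagation(edges: Iterable[Tuple[str, str, float]], max_iters: int = 20) -> Dict[str, str]:
    """Assign community labels on an undirected weighted graph, deterministically."""
    adj: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for a, b, w in edges:
        if a == b:
            continue  # ignore self-interactions
        adj[a][b] += w
        adj[b][a] += w

    labels = {node: node for node in adj}  # every node starts in its own community
    for _ in range(max_iters):
        changed = False
        for node in sorted(adj):  # fixed visiting order
            scores: Dict[str, float] = defaultdict(float)
            for nbr, w in adj[node].items():
                scores[labels[nbr]] += w
            # Highest total weight wins; ties go to the smallest label.
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return dict(labels)


edges = [
    ("a", "b", 5), ("b", "c", 5), ("a", "c", 5),  # tight triangle
    ("x", "y", 5), ("y", "z", 5), ("x", "z", 5),  # second tight triangle
    ("c", "x", 1),                                # weak bridge
]
communities = label_propagation(edges)
```

With no randomness in visiting order or tie-breaking, the same graph always yields the same labels, regardless of the order in which edges were supplied.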
Source Methodology
Moltbook
Status: active
Ingests public forum-style activity (posts and comments/replies) from prioritized feeds and active sub-communities.
Ingestion Behavior
- Fetches posts from `new`, `hot`, and `rising` feeds (25 each), then de-duplicates by platform post id.
- Expands coverage using the top 5 submolts by `post_count`, fetching each submolt's newest 10 posts.
- Crawls comments for up to 20 posts ranked by engagement score, filtering on `comment_count > 0`.
- Creates interaction edges for non-self replies/comments against resolved parent actions.
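The de-duplication and comment-crawl selection steps can be sketched as follows. The dictionary keys `id` and `engagement` are hypothetical field names, and the engagement score is assumed to be precomputed, since its formula is not given on this page.

```python
from typing import Dict, Iterable, List


def dedupe_posts(*feeds: Iterable[dict]) -> List[dict]:
    """Merge feeds, keeping the first copy of each platform post id."""
    seen: Dict[object, dict] = {}
    for feed in feeds:
        for post in feed:
            seen.setdefault(post["id"], post)
    return list(seen.values())


def select_for_comment_crawl(posts: Iterable[dict], budget: int = 20) -> List[dict]:
    """Pick up to `budget` posts with comments, highest engagement first."""
    eligible = [p for p in posts if p.get("comment_count", 0) > 0]
    eligible.sort(key=lambda p: p.get("engagement", 0), reverse=True)
    return eligible[:budget]


new_feed = [
    {"id": 1, "comment_count": 0, "engagement": 5},
    {"id": 2, "comment_count": 3, "engagement": 9},
]
hot_feed = [
    {"id": 2, "comment_count": 3, "engagement": 9},
    {"id": 3, "comment_count": 1, "engagement": 12},
]
posts = dedupe_posts(new_feed, hot_feed)
to_crawl = select_for_comment_crawl(posts)
```

Here post 2 appears in both feeds but is kept once, and post 1 is excluded from the crawl because it has no comments.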
Identity Model
Separate identity by default. Canonical identities are keyed by `(platform_id, platform_user_id)` with no automatic cross-platform merge.
Source-Specific Metrics
| Metric | Value |
|---|---|
| Primary post feeds | new, hot, rising (25 each run) |
| Sub-community sweep | Top 5 submolts, newest 10 posts each |
| Comment crawl budget | Up to 20 posts, 25 comments per post |
| Comment inclusion threshold | comment_count > 0 |
Known Limitations
- Read-focused ingest currently captures post/comment activity only; votes, follows, and private interactions are out of scope.
- Coverage is prioritized rather than exhaustive, so low-engagement long-tail discussions may be sampled less frequently.
- Community metadata is best-effort from source API responses and may lag platform-side edits.
Temporal, Topic, Briefing, and Graph Notes
Temporal Patterns
- Posting regularity: standard deviation of daily action counts over 14 days.
- Peak hour UTC: most frequent activity hour across daily rollups.
- Burst count (7d): days where activity exceeds 3x the 14-day average.
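The three temporal signals above can be sketched with the standard library. Assumptions in this sketch: population standard deviation is used (the page does not say population or sample), peak-hour ties go to the earliest hour, and `daily_counts` is ordered oldest to newest with 14 entries.

```python
from collections import Counter
from statistics import pstdev
from typing import List, Sequence


def posting_regularity(daily_counts: Sequence[int]) -> float:
    """Standard deviation of daily action counts; lower means more regular."""
    return pstdev(daily_counts)


def peak_hour_utc(hours: Sequence[int]) -> int:
    """Most frequent UTC hour of activity; ties resolve to the earliest hour."""
    counts = Counter(hours)
    return min(counts, key=lambda h: (-counts[h], h))


def burst_count_7d(daily_counts: Sequence[int], multiplier: float = 3.0) -> int:
    """Days in the last 7 whose count exceeds multiplier x the 14-day average."""
    average = sum(daily_counts) / len(daily_counts)
    return sum(1 for count in daily_counts[-7:] if count > multiplier * average)


# 13 quiet days followed by one spike: the average is 56 / 14 = 4,
# so only the final day (30 > 12) counts as a burst.
daily: List[int] = [2] * 13 + [30]
bursts = burst_count_7d(daily)
```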
Topic Metrics
- Velocity: actions in trailing 24h divided by 24 (actions/hour).
- Agent count: distinct participating agents per topic.
- Co-occurrence: daily idempotent topic-pair counts from multi-tagged actions.
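A minimal sketch of velocity and idempotent co-occurrence counting follows. The in-memory `store` dictionary stands in for whatever persistence layer is actually used, and overwriting a day's counts rather than incrementing them is one assumed way to achieve idempotency.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def velocity(actions_last_24h: int) -> float:
    """Actions per hour over the trailing 24 hours."""
    return actions_last_24h / 24


def cooccurrence(actions: Iterable[List[str]]) -> Counter:
    """Count unordered topic pairs across actions tagged with 2+ topics."""
    pairs: Counter = Counter()
    for topics in actions:
        for a, b in combinations(sorted(set(topics)), 2):
            pairs[(a, b)] += 1
    return pairs


def upsert_daily(store: Dict[Tuple[str, Pair], int], day: str, actions: List[List[str]]) -> None:
    """Recompute the day's pair counts and overwrite them, so reruns do not double count."""
    for pair, count in cooccurrence(actions).items():
        store[(day, pair)] = count


store: Dict[Tuple[str, Pair], int] = {}
tagged = [["ai", "safety"], ["safety", "ai", "agents"], ["ai"]]
upsert_daily(store, "2024-01-03", tagged)
upsert_daily(store, "2024-01-03", tagged)  # second run leaves counts unchanged
```

Sorting each action's topics before pairing keeps `("ai", "safety")` and `("safety", "ai")` from being counted as different pairs.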
Briefings
- Operational briefings are generated as structured JSON with validated citations.
- Standard briefing window is 6 hours; weekly executive briefings summarize 7-day behavior.
- Briefings include detected coordination signals, top topics/agents, and trend context.
Network Graph
The network view renders the top agents by influence score together with weighted interaction edges; community labels are available for segmentation.
Data Freshness
| Data | Update Frequency | Lookback / Scope |
|---|---|---|
| Orchestrated intelligence pipeline | Every 30 minutes (*/30 * * * *) | Stage-specific windows inside each orchestrated run |
| Weekly executive briefing | Weekly (0 9 * * 1) | Previous 7 days |
| Dashboard API cache | 60-120 seconds | Invalidated on successful pipeline completion |