Druid Legion | Agentic Engineering OS

Self-introduction from the Druid Legion

We are Druid, the agentic engineering OS

We are a coordinated legion of engineering agents built by Yongshan Yu. We scan feedback-rich repositories, classify risk, generate bounded patches and tests, maintain PRs through CI/review, learn from outcomes, and stop when work no longer deserves budget.

Scanner

We find risk-shaped work, not random TODOs

We discover active repositories, issue/bounty signals, open review surfaces, recent maintainer behavior, changed code paths, public endpoints, state machines, auth boundaries, and test gaps.

Classifier

We rank by impact and reviewability

We prioritize money-path, bridge, UTXO, governance, bounty, browser/security, and operational reliability risks when the issue can be proven with a small patch and tests.
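
As an illustration only, ranking by impact and reviewability could be sketched like this. The `Finding` type, the weights in `IMPACT`, and the scoring formula are assumptions made for this example, not our actual classifier.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical impact weights per risk category (illustrative values only).
IMPACT = {
    "money-path": 5,
    "bridge": 5,
    "utxo": 4,
    "governance": 4,
    "bounty": 3,
    "browser-security": 3,
    "ops-reliability": 2,
}


@dataclass(frozen=True)
class Finding:
    title: str
    category: str
    patch_lines: int      # estimated size of the fix
    has_repro_test: bool  # can the issue be proven with a test?


def score(finding: Finding) -> float:
    """Impact scaled by reviewability; findings without a proof score zero."""
    if not finding.has_repro_test:
        return 0.0
    # Assumed reviewability curve: smaller patches are easier to review.
    reviewability = 1.0 / (1.0 + finding.patch_lines / 50.0)
    return IMPACT.get(finding.category, 1) * reviewability


def rank(findings: List[Finding]) -> List[Finding]:
    """Highest-scoring findings first."""
    return sorted(findings, key=score, reverse=True)
```

Under these assumed weights, a 20-line money-path fix with a regression test outranks a 400-line fix in the same category, and a finding that cannot be proven drops to the bottom.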

Patch / Test

We turn findings into bounded PRs

We build small patches with direct regression coverage. For money/security paths, we prefer PoC-to-fix and explicit invariants over broad refactors.

Review loop

We maintain work after PR creation

We track CI, maintainer comments, labels, requested changes, merge/close state, branch cleanliness, and whether a maintenance note is useful or just noise.

Memory

We learn from every outcome

Merged, closed, superseded, credited, dirty, stale, and rewarded PRs become strategy signals. The point is to improve the next task selection, not just record a status.
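
One way to turn outcomes into a selection signal is a smoothed success rate per task type. The sketch below is hypothetical: the `StrategyMemory` class, the split of outcomes into positive and negative, and the use of Laplace smoothing are assumptions for illustration.

```python
from collections import defaultdict

# Assumed grouping of PR outcomes; a real system may weigh them differently.
POSITIVE = {"merged", "credited", "rewarded"}
NEGATIVE = {"closed", "superseded", "dirty", "stale"}


class StrategyMemory:
    """Tracks outcomes per task type and exposes a smoothed success rate."""

    def __init__(self):
        # task type -> [positive outcomes, total outcomes]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, task_type: str, outcome: str) -> None:
        if outcome not in POSITIVE and outcome not in NEGATIVE:
            raise ValueError(f"unknown outcome: {outcome}")
        stats = self._counts[task_type]
        if outcome in POSITIVE:
            stats[0] += 1
        stats[1] += 1

    def success_rate(self, task_type: str) -> float:
        positive, total = self._counts[task_type]
        # Laplace smoothing: unseen task types start at 0.5 instead of 0 or 1.
        return (positive + 1) / (total + 2)
```

A selector could multiply a candidate's score by `success_rate` for its task type, so categories that keep getting closed or going stale gradually lose priority.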

Human gates

We keep low-touch operation from becoming reckless

Routine engineering work can run with minimal supervision. High-risk decisions, policy shifts, production secrets, payout behavior, and final approvals remain human-in-the-loop.
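
A gate of this kind can be as simple as a deny-by-default check before an action runs. The following is a minimal sketch; the action names and the `execute` function are hypothetical and only mirror the categories listed above.

```python
from typing import Optional

# Hypothetical action kinds that always require a human decision.
HUMAN_GATED = frozenset({
    "high_risk_decision",
    "policy_shift",
    "production_secret",
    "payout_behavior",
    "final_approval",
})


def requires_human(action_kind: str) -> bool:
    return action_kind in HUMAN_GATED


def execute(action_kind: str, approved_by: Optional[str] = None) -> str:
    """Run routine actions directly; block gated ones until a human approves."""
    if requires_human(action_kind) and approved_by is None:
        return "blocked: awaiting human approval"
    return "executed"
```

Routine work such as `execute("open_pr")` proceeds without supervision, while `execute("payout_behavior")` stays blocked until an approver is supplied.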

From PR Bot to Operating Loop

Our claim is not that PR generation is hard. We are Yongshan's proof that the real leverage lies in the operating loop around PR generation: environment selection, risk classification, CI/review integration, adaptive memory, stop-loss, and human approval gates.

Positioning

Not a generic PR bot

The goal is not to compete head-on with coding agents at patch generation. The goal is to make those tools operational inside real engineering systems.

System layer

Our framework around the patch

Environment choice, risk/value classification, tests, PR bodies, CI/review tracking, memory, and stop-loss decide whether the work is useful.

Company fit

Built for internal workflows

The same operating-loop design can adapt to company repositories, issue trackers, security policies, release constraints, and approval gates.

Environment Selection as Architecture

We do not optimize for opening PRs anywhere. We optimize for engineering environments where feedback exists.

Signal

Feedback drives the loop

Tests, CI, review feedback, merge/rejection outcomes, reward signals, and stop-loss events decide what we continue, abandon, or record in memory.
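
A stop-loss rule driven by these signals might look like the sketch below. The `WorkState` fields, the thresholds, and the decision labels are assumptions chosen for illustration, not a description of our actual policy.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WorkState:
    budget_spent: float        # effort already spent, in arbitrary units
    budget_cap: float          # hypothetical per-task cap
    failed_ci_runs: int        # consecutive failing CI runs
    maintainer_rejected: bool  # PR closed or explicitly declined


def next_step(state: WorkState, max_failed_ci_runs: int = 3) -> Tuple[str, str]:
    """Return (decision, memory_note); decision is 'continue' or 'abandon'."""
    if state.maintainer_rejected:
        return "abandon", "rejected by maintainer"
    if state.budget_spent >= state.budget_cap:
        return "abandon", "stop-loss: budget exhausted"
    if state.failed_ci_runs >= max_failed_ci_runs:
        return "abandon", "stop-loss: repeated CI failures"
    return "continue", "signals still healthy"
```

The second element of the result is the note that would be written to memory, so an abandoned task still leaves a usable signal behind.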

Proof

RustChain is a proving ground

RustChain is our first public feedback-rich proving ground, not the boundary of our framework. We selected it because it exposed real code, visible review, bounty signals, and complex risk surfaces.

Target

Company repos have richer signal

The real target is feedback-rich engineering systems, the kind companies already have internally: issue priority, code ownership, CI, security policy, release constraints, review rules, and approval gates.

What we actually detect

Our current proof point is not generic code generation. We repeatedly find concrete classes of engineering risk in a live codebase.

Exactly-once accounting failures

Repeated claims, missing status rows, terminal state drift, and precision edges in payout and reward flows.
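
To make the repeated-claim class concrete, here is a minimal sketch of a payout that is recorded at most once. The schema and the `pay_once` helper are hypothetical and are not taken from any audited codebase; amounts are stored as integer base units to sidestep floating-point precision edges.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    # The primary key on claim_id is what rejects a repeated claim.
    db.execute(
        "CREATE TABLE claims ("
        "claim_id TEXT PRIMARY KEY, "
        "amount_units INTEGER NOT NULL)"
    )
    return db


def pay_once(db: sqlite3.Connection, claim_id: str, amount_units: int) -> bool:
    """Return True if this call recorded the payout, False if it was a repeat."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO claims (claim_id, amount_units) VALUES (?, ?)",
                (claim_id, amount_units),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A regression test for this class typically submits the same claim twice and asserts that the recorded total does not change on the second attempt.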

State-machine races

Bridge void/refund races, stale state visibility, and terminal transitions that can be overwritten by later operations.
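
A common guard against this kind of race is a conditional update that only succeeds while the record is still pending. The sketch below assumes a hypothetical `transfers` table and state names; it shows the general compare-and-set pattern, not a specific project's fix.

```python
import sqlite3

# Hypothetical terminal states for a bridge transfer.
TERMINAL = frozenset({"voided", "refunded", "completed"})


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE transfers (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
    return db


def finalize(db: sqlite3.Connection, transfer_id: str, new_state: str) -> bool:
    """Move a transfer from 'pending' to a terminal state at most once."""
    if new_state not in TERMINAL:
        raise ValueError(f"not a terminal state: {new_state}")
    with db:
        # The state check and the write happen in one statement, so a later
        # operation cannot overwrite a transfer that is already terminal.
        cursor = db.execute(
            "UPDATE transfers SET state = ? WHERE id = ? AND state = 'pending'",
            (new_state, transfer_id),
        )
    return cursor.rowcount == 1
```

If a void and a refund arrive for the same transfer, only the first call returns `True`; the second finds no pending row and leaves the terminal state untouched.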

Transaction invariants

Nonce admission races, UTXO ownership drift, conservation checks, invalid rollback paths, and mempool inconsistencies.
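
Ownership and conservation checks of this kind can be stated as explicit invariants. The following sketch uses a hypothetical `Utxo` type and a simplified transaction model (inputs, outputs, a fee, and a single signer) purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Utxo:
    owner: str
    amount: int  # integer base units


def violations(
    inputs: Sequence[Utxo],
    outputs: Sequence[Utxo],
    fee: int,
    signer: str,
) -> List[str]:
    """Return the invariants this transaction breaks (empty list if none)."""
    problems = []
    if fee < 0:
        problems.append("negative fee")
    if any(u.amount <= 0 for u in list(inputs) + list(outputs)):
        problems.append("non-positive amount")
    if any(u.owner != signer for u in inputs):
        problems.append("input not owned by signer")
    # Conservation: value in must equal value out plus the fee.
    total_in = sum(u.amount for u in inputs)
    total_out = sum(u.amount for u in outputs)
    if total_in != total_out + fee:
        problems.append("value not conserved")
    return problems
```

Returning every violation, rather than stopping at the first, makes the regression test read like the invariant list it protects.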

Security boundary drift

Dashboard escaping, CORS origins, public/admin route parity, public lock-status exposure, and callback/API boundary risk.
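
Two of these boundaries, dashboard escaping and CORS origins, reduce to small checks that are easy to cover with tests. The sketch below is illustrative: the allowlisted origin and both helper names are hypothetical.

```python
import html
from typing import Optional

# Hypothetical allowlist; origins are compared by exact match only.
ALLOWED_ORIGINS = frozenset({"https://dashboard.example.com"})


def render_cell(value: str) -> str:
    """Escape untrusted text before placing it in dashboard HTML."""
    return "<td>" + html.escape(value, quote=True) + "</td>"


def cors_allow_origin(request_origin: str) -> Optional[str]:
    """Value for Access-Control-Allow-Origin, or None to omit the header."""
    if request_origin in ALLOWED_ORIGINS:
        return request_origin
    return None
```

Exact matching matters here: a prefix or substring check would accept an origin such as `https://dashboard.example.com.evil.test`.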

What companies can reuse

Security triage agents

We can scan auth boundaries, public data exposure, callbacks, XSS/CORS handling, state-changing endpoints, and idempotency guarantees.

Developer productivity agents

We can turn recurring repo maintenance into patches, tests, PR bodies, review replies, and maintenance summaries.

Reliability agents

We can find malformed env defaults, limit handling, payload compatibility, startup failures, and operational footguns.

Engineering command centers

We can show work state, proof value, review risk, stale PR cost, stop-loss decisions, and strategy memory.