System-Aware Governed Agents

Repository-level context is enough for small changes inside a well-bounded codebase. It is not enough for distributed systems.

Most production systems are not one repository. They are service boundaries, infrastructure modules, queues, databases, dashboards, runbooks, IAM policies, deployment pipelines, ownership rules, data contracts, and operational habits. An agent that only sees one repository can be locally correct and systemically wrong.

Level 4 maturity begins when the unit of reasoning changes from repository to system.

The System Graph

Agents need a system graph, not just a file tree.

At minimum, the graph should connect:

repositories;
services and components;
owners;
runtime environments;
deployment pipelines;
Terraform scopes;
AWS resources or equivalent cloud assets;
dashboards, alerts, and runbooks;
upstream and downstream dependencies;
data classifications and criticality.

This does not require a perfect enterprise architecture model. It requires a useful join key.

For example:

system_id: messaging-platform
component_id: delivery-api
owner_team: platform-messaging
criticality: tier1
data_classification: confidential
terraform_scope: service_stack
agent_context_level: system

The names are illustrative. The important point is that repositories, infrastructure, dashboards, and ownership metadata can be joined.
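A minimal sketch of that join, assuming each inventory source can be exported as records that carry the same system_id and component_id. The record shapes, the asset names, and the build_system_graph helper are hypothetical and not tied to any particular tool.

```python
from collections import defaultdict

# Hypothetical inventory records. In practice these would be exported from
# repository metadata, cloud tags, and dashboard labels.
REPOSITORIES = [
    {"name": "delivery-api", "system_id": "messaging-platform",
     "component_id": "delivery-api", "owner_team": "platform-messaging"},
]
CLOUD_RESOURCES = [
    {"name": "delivery-queue", "system_id": "messaging-platform",
     "component_id": "delivery-api"},
    {"name": "delivery-db", "system_id": "messaging-platform",
     "component_id": "delivery-api"},
]
DASHBOARDS = [
    {"name": "delivery-latency", "system_id": "messaging-platform",
     "component_id": "delivery-api"},
]


def build_system_graph(**sources):
    """Group asset names by (system_id, component_id), one bucket per source."""
    graph = defaultdict(lambda: defaultdict(list))
    for kind, records in sources.items():
        for record in records:
            key = (record["system_id"], record["component_id"])
            graph[key][kind].append(record["name"])
    # Convert to plain dicts so missing keys fail loudly instead of silently.
    return {key: dict(kinds) for key, kinds in graph.items()}


graph = build_system_graph(
    repositories=REPOSITORIES,
    resources=CLOUD_RESOURCES,
    dashboards=DASHBOARDS,
)
component = graph[("messaging-platform", "delivery-api")]
print(component["resources"])  # ['delivery-queue', 'delivery-db']
```

The graph is only as good as the shared key. Any record missing system_id or component_id raises a KeyError here, which is arguably the right behavior for an inventory that agents will trust.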

AWS Tags and GitHub Metadata

In cloud systems, tags are often the closest thing to an operational ontology. They are not glamorous, but they can connect runtime assets to systems, teams, cost centers, environments, data classification, and criticality.

GitHub custom properties or equivalent repository metadata can play the same role on the code side. If repository metadata and cloud tags share system IDs and component IDs, an agent can reason about blast radius with better evidence.

Without that metadata, cross-repo work becomes guesswork.
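As a rough illustration, assume repository properties and resource tags have already been fetched into plain dictionaries. The blast_radius function, the resource names, and the extra component and system names below are all hypothetical. The sketch separates resources that share the repository's system ID from resources that cannot be joined at all, which is where the guesswork starts.

```python
def blast_radius(repo_name, repo_properties, resource_tags):
    """Return (resources in the same system, resources that cannot be joined)."""
    system_id = repo_properties.get(repo_name, {}).get("system_id")
    if system_id is None:
        # Without a system ID on the repository, nothing can be inferred.
        raise LookupError(f"{repo_name} has no system_id; blast radius is unknown")

    in_system, unjoinable = [], []
    for resource, tags in sorted(resource_tags.items()):
        if "system_id" not in tags:
            unjoinable.append(resource)   # untagged: could belong to anything
        elif tags["system_id"] == system_id:
            in_system.append(resource)
    return in_system, unjoinable


# Hypothetical metadata snapshots.
REPO_PROPERTIES = {
    "delivery-api": {"system_id": "messaging-platform",
                     "component_id": "delivery-api"},
}
RESOURCE_TAGS = {
    "queue/delivery-events": {"system_id": "messaging-platform",
                              "component_id": "delivery-api"},
    "table/delivery-receipts": {"system_id": "messaging-platform",
                                "component_id": "receipt-store"},
    "bucket/legacy-exports": {"environment": "prod"},
    "queue/billing-events": {"system_id": "billing",
                             "component_id": "invoicer"},
}

in_system, unjoinable = blast_radius("delivery-api", REPO_PROPERTIES, RESOURCE_TAGS)
print(in_system)   # ['queue/delivery-events', 'table/delivery-receipts']
print(unjoinable)  # ['bucket/legacy-exports']
```

Reporting the unjoinable list separately is a deliberate choice in this sketch: an untagged resource is not evidence of safety, only evidence of missing metadata.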

Policy Before Authority

System-aware agents require stronger governance because the blast radius is larger. It is not enough to say “the agent may open PRs.” The platform must know which actions are possible under which risk class.

Useful capability roles might include:

read-only context;
branch writer;
pull request author;
test runner;
Terraform planner;
non-production deployer;
non-production rollback proposer;
production read-only inspector.

These roles should be capability-based rather than broad hierarchical permissions. Whether a role can be assumed should depend on repository, branch, workflow, environment, risk class, approved spec, commit SHA, and approval status. OIDC-based temporary credentials are preferable to long-lived secrets because they expire on their own and because the claims in the token give the platform a place to attach those conditions.
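One way to picture a conditional role assumption is a deny-by-default check that returns the reasons a request fails. This is a sketch under stated assumptions: the rule table, the risk classes, the workflow file names, and the denial_reasons function are hypothetical, and only a subset of the conditions is modeled. Repository and branch conditions would follow the same pattern.

```python
from dataclasses import dataclass
from typing import Optional

RISK_ORDER = {"low": 0, "medium": 1, "high": 2}  # assumed risk classes


@dataclass(frozen=True)
class RoleRule:
    environments: frozenset   # environments where the role may be assumed
    workflows: frozenset      # workflows allowed to request the role
    max_risk: str             # highest risk class the role may act on
    needs_approved_spec: bool
    needs_approval: bool      # approval must cover the exact commit SHA


@dataclass(frozen=True)
class Request:
    environment: str
    workflow: str
    risk_class: str
    spec_approved: bool
    commit_sha: str
    approved_sha: Optional[str]  # SHA that reviewers approved, if any


# Hypothetical rules for three of the capability roles.
RULES = {
    "terraform_planner": RoleRule(
        frozenset({"dev", "staging"}), frozenset({"plan.yml"}), "medium", True, False),
    "non_production_deployer": RoleRule(
        frozenset({"dev", "staging"}), frozenset({"deploy.yml"}), "low", True, True),
    "production_read_only_inspector": RoleRule(
        frozenset({"prod"}), frozenset({"inspect.yml"}), "high", False, False),
}


def denial_reasons(role, request):
    """Return every reason the request fails; an empty list means allowed."""
    rule = RULES.get(role)
    if rule is None:
        return [f"unknown role: {role}"]  # deny by default
    reasons = []
    if request.environment not in rule.environments:
        reasons.append(f"environment {request.environment} not permitted")
    if request.workflow not in rule.workflows:
        reasons.append(f"workflow {request.workflow} not permitted")
    if RISK_ORDER[request.risk_class] > RISK_ORDER[rule.max_risk]:
        reasons.append(f"risk class {request.risk_class} exceeds {rule.max_risk}")
    if rule.needs_approved_spec and not request.spec_approved:
        reasons.append("no approved spec")
    if rule.needs_approval and request.approved_sha != request.commit_sha:
        reasons.append("approval does not cover this commit")
    return reasons


request = Request(environment="staging", workflow="deploy.yml", risk_class="low",
                  spec_approved=True, commit_sha="abc123", approved_sha="abc123")
print(denial_reasons("non_production_deployer", request))  # []
```

Returning reasons rather than a bare boolean keeps denials reviewable, which matters once agents are the ones asking.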

Architecture Becomes Runtime Context

Architectural documentation often decays because it lives outside delivery. For agents, that decay is dangerous. Stale architecture context is worse than no context because it gives false confidence.

The architecture function therefore becomes more operational:

define system and component taxonomy;
maintain service dependency maps;
decide which invariants are machine-checkable;
align repository metadata with cloud tags;
define cross-repo change boundaries;
identify which reviews are required by change class.

This is not architecture as ceremony. It is architecture as agent context.
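Aligning repository metadata with cloud tags is one of the easier invariants to make machine-checkable. The sketch below assumes both sides are available as plain records; the shared key list, the record shapes, and the alignment_violations function are hypothetical.

```python
# Assumed set of keys that must agree between repository metadata and tags.
SHARED_KEYS = ("system_id", "owner_team", "data_classification")


def alignment_violations(repo_metadata, resources):
    """Compare each resource's tags with the repository metadata for its component."""
    by_component = {meta["component_id"]: meta for meta in repo_metadata}
    violations = []
    for resource in resources:
        tags = resource["tags"]
        meta = by_component.get(tags.get("component_id"))
        if meta is None:
            violations.append(f"{resource['name']}: no repository declares its component")
            continue
        for key in SHARED_KEYS:
            if tags.get(key) != meta.get(key):
                violations.append(
                    f"{resource['name']}: {key} is {tags.get(key)!r}, "
                    f"repository says {meta.get(key)!r}"
                )
    return violations


# Hypothetical snapshots of both sides.
REPO_METADATA = [
    {"component_id": "delivery-api", "system_id": "messaging-platform",
     "owner_team": "platform-messaging", "data_classification": "confidential"},
]
RESOURCES = [
    {"name": "delivery-queue",
     "tags": {"component_id": "delivery-api", "system_id": "messaging-platform",
              "owner_team": "platform-messaging",
              "data_classification": "confidential"}},
    {"name": "delivery-db",
     "tags": {"component_id": "delivery-api", "system_id": "messaging-platform",
              "owner_team": "platform-messaging",
              "data_classification": "internal"}},
    {"name": "scratch-bucket", "tags": {}},
]

for line in alignment_violations(REPO_METADATA, RESOURCES):
    print(line)
```

A check like this could run on a schedule or in a pipeline, so that drift between code-side and cloud-side metadata is found before an agent relies on it.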

What Agents May Do at Level 4

At Level 4, agents may be allowed to:

propose multi-repo plans;
identify likely blast radius;
coordinate related pull requests;
analyze infrastructure impact;
prepare non-production changes under policy;
generate review evidence across service boundaries.

They should still not freely apply production Terraform, expand IAM permissions, alter authentication or authorization, run data migrations, or merge architecture-wide refactors. Those actions require exceptional controls.
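The two lists above can be encoded as an explicit gate. In this sketch the action identifiers and the gate_for function are hypothetical, and denying anything unlisted is an assumption added here rather than something the lists themselves state.

```python
from enum import Enum


class Gate(Enum):
    UNDER_POLICY = "allowed under policy"
    EXCEPTIONAL = "requires exceptional controls"
    DENIED = "not listed, denied by default"


# Hypothetical identifiers for the actions agents may take at Level 4.
UNDER_POLICY = frozenset({
    "propose_multi_repo_plan",
    "identify_blast_radius",
    "coordinate_pull_requests",
    "analyze_infrastructure_impact",
    "prepare_non_production_change",
    "generate_review_evidence",
})

# Hypothetical identifiers for the actions that need exceptional controls.
EXCEPTIONAL = frozenset({
    "apply_production_terraform",
    "expand_iam",
    "alter_auth",
    "run_data_migration",
    "merge_architecture_wide_refactor",
})


def gate_for(action):
    """Classify an action; the restrictive list is checked first."""
    if action in EXCEPTIONAL:
        return Gate.EXCEPTIONAL
    if action in UNDER_POLICY:
        return Gate.UNDER_POLICY
    return Gate.DENIED


print(gate_for("identify_blast_radius").value)       # allowed under policy
print(gate_for("apply_production_terraform").value)  # requires exceptional controls
print(gate_for("delete_backups").value)              # not listed, denied by default
```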

The Failure Mode

The failure mode of system-aware adoption is local optimization with better tooling. An agent fixes a service, but violates a contract. It simplifies a module, but breaks a deployment assumption. It changes a queue producer, but not the consumer. It updates Terraform, but misses an operational tag. It improves test coverage, but hides an observability gap.

These are not model failures alone. They are platform context failures.

System-aware governance exists to make those failures less likely and more visible when they still happen.
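The producer-without-consumer case can be made visible with a simple coverage check against the dependency map. This is a sketch: the dependency map, the component names other than delivery-api, and the uncovered_consumers function are hypothetical, and only direct downstream dependencies are considered.

```python
# Hypothetical dependency map: component -> direct downstream consumers.
DOWNSTREAM = {
    "delivery-api": ["delivery-worker", "receipt-store"],
    "delivery-worker": ["notification-gateway"],
}


def uncovered_consumers(changed, downstream):
    """Direct downstream components of the changed set that the plan does not touch."""
    changed = set(changed)
    consumers = set()
    for component in changed:
        consumers.update(downstream.get(component, []))
    return sorted(consumers - changed)


print(uncovered_consumers(["delivery-api"], DOWNSTREAM))
# ['delivery-worker', 'receipt-store']
```

Not every downstream component needs to change with its producer, so a result like this is better treated as review evidence than as a blocking rule.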