
The Plumbing: A Composable Reference Implementation for FedRAMP 20x and Rev 5

Two frameworks, one truth, and the layer below them.

Author's Note

This paper documents an implementation that runs on the infrastructure of this website. Every assertion the paper makes about the artifacts it describes is verifiable: anyone can curl the published documents from /.well-known/, run cosign verify-blob against the Sigstore bundle, browse the artifacts in the live trust dashboard, and check that what the paper claims matches what is actually being served. The implementation is open-source at github.com/sam-aydlette/samaydlette.com.

The system is small. One CSP, one operator, no end users, no federal customer data. The patterns it demonstrates are not.

I wrote earlier in Pearls, PURLs and the FedRAMP 20x Inventory Problem about why a standardized component-naming layer is what FedRAMP 20x needs to deliver on its premise. This paper is the implementation of that argument, plus the part the article didn't cover: how the same primitive serves Rev 5 SSPs, and what the broader business and societal benefits are when compliance reports start composing.

Abstract

Compliance evidence in federal cloud is structurally fragmented. CSPs produce per-system reports (KSI signals under FedRAMP 20x, SSPs under Rev 5, vulnerability scans, SBOMs, audit trails), each in its own format, with its own identifiers, requiring its own reconciliation when consumers want a portfolio view. The fragmentation is the cost: assessment time, drift between SSPs and live systems, the impossibility of cross-CSP risk aggregation, the assessment-to-authorization gap that produces "compliance as a snapshot" instead of "security as a continuous property."

This paper presents a reference implementation that addresses the fragmentation by treating the naming of components as a primitive layer below all the reports. Every report references components by canonical identifiers — PURLs for software, content hashes for static artifacts, normalized cloud types plus native IDs for cloud resources, HBOM references for hardware — so two systems naming the same thing produce bit-identical identifiers and reports compose across systems without reconciliation.

The implementation runs on a static personal website. It includes:

  • a JSON Schema for KSI signals, extended with normalized component identifiers, FIPS 199 security categories, and information-flow summaries per the FedRAMP 20x Minimum Assessment Scope rule;
  • a deploy-time emitter that generates a signal from Terraform state, package locks, content hashes, GitHub Actions provenance, and OPA policy results;
  • a runtime emitter (AWS Lambda) that re-validates the live configuration daily;
  • Sigstore keyless signing of the deploy-time signal, with verification anchored in the public Rekor transparency log;
  • an OSCAL Rev 5 System Security Plan generator and an OSCAL Plan of Action and Milestones generator (both validated against the official NIST OSCAL schemas), each deriving from the same canonical inventory;
  • a folder of family policy documents (one per -1 control) that the SSP references by path;
  • a Vulnerability Detection and Response aggregator that runs at build time, classifies findings on the FedRAMP 20x PAIN scale, and blocks the deploy if any finding exceeds the Class C remediation tolerance;
  • a Significant Change Notification gate that requires every pull request to declare its change class;
  • a FedRAMP Integrated Inventory Workbook (SSP Appendix M) projector that emits the same canonical inventory in IIW column shape.

Eight artifacts are published at /.well-known/ on every deploy.

The technical pattern is implementable today. The hard work that remains is governance, not engineering.

Background: The Inventory Problem

Two FedRAMP 20x KSI signals from different CSPs nominally describe similar things — components and validations — but they do not compose into a portfolio view, because the components are named differently. CSP A identifies an EC2 instance by ARN; CSP B identifies an Azure VM by resource ID. Both are "compute instances" in any reasonable model. No automatic join exists across them.

The same fragmentation appears within a single CSP across frameworks. A 20x KSI signal and a Rev 5 SSP from the same provider describe the same system, but they are produced by different teams from different source data and rarely reconcile without manual reviewer effort. The result is what assessors call SSP drift: the security plan documents one configuration, the live system has evolved into another. Most audit findings reduce to some form of this gap.

Cross-portfolio risk aggregation is the most visible casualty. A query like "which of my authorized systems contain a particular library that just had a critical CVE" cannot be answered in any reasonable time, because the systems were inventoried by different schemas and the join across them does not exist. Each system has a reasonable inventory; the inventories do not compose.

The Architectural Move: Layer Before Report

The reframe at the heart of this implementation is to treat the canonical inventory as a layer rather than a deliverable: the architectural primitive that makes everything above it composable.

The distinction matters because most compliance discussions blur it. The phrase "we need better inventories" is usually heard as "we need a better deliverable called Inventory." That framing concedes the structure of the problem to the existing fragmented model: another report, another reconciliation, another deliverable. The right framing is closer to "we need every report to reference components the same way." The inventory is what every report uses, not a separate document anyone reads.

The closest analog comes from networking. TCP/IP became the universal substrate of the internet by solving the addressing problem one level below where applications care, not by being the most expressive protocol. HTTP, SMTP, DNS, SSH, and every other application protocol that rides on top of TCP/IP never had to negotiate addresses with each other. They could each evolve independently, each express their own semantics, each handle their own concerns: the layer below resolved example.com to 93.184.215.14 the same way for all of them. The composability of the modern internet is a property of the addressing layer, not of any single protocol that runs on it.

The canonical inventory plays the same architectural role for compliance reports: a small set of canonical identifiers that compose without negotiation. The full identifier vocabulary is unpacked in the next section; the architectural point is that two systems naming the same thing produce bit-identical identifiers, and reports built on those identifiers join automatically.

Architecturally, this is significant: the difference between an ecosystem of reports that fragments at every framework boundary and one that composes across them. And the move requires no single party (not FedRAMP, not NIST, not CISA, not any individual CSP) to control the namespace. PURL is an open standard, content hashes are physics, and cloud-resource ARNs are CSP-internal but globally unique by construction. The composability is a property of the identifiers themselves, not of any registry that issues them.

When this layer is in place, a KSI signal is no longer a "framework-specific report a CSP produces." It is "a structured document that references inventory components and reports validation results against them, in a 20x-shaped wrapper." An OSCAL Rev 5 System Security Plan is "a structured document that references inventory components and asserts control implementations against them, in a 53-shaped wrapper." An SBOM is the same again, in a CycloneDX or SPDX wrapper. A vulnerability report is the same. None of these reports needs to know about the others. They all reference the same components, and consumers join on that.

Implementation

The system runs on a static website hosted on AWS: S3 for origin, CloudFront for CDN, Lambda for runtime validation, Route 53 for DNS, ACM for TLS. The compliance pipeline is a layer on top, expressed as Terraform configuration and Open Policy Agent policies. GitHub Actions drives every deploy.

The deploy chain is short. OPA evaluates the Terraform plan and every HTML artifact against the policies in infrastructure/policies.rego; violations block. Terraform applies. The deploy-time emitter (scripts/build-ksi-signal.py) joins five sources into one canonical document: Terraform state, the Lambda's package-lock, the hashed website tree, GitHub Actions provenance, and the OPA results. cosign signs that document via Sigstore keyless using the GitHub Actions OIDC identity; the certificate is issued by Fulcio and the signature is recorded in the public Rekor transparency log. Four report generators then read the canonical inventory and emit derived artifacts: the FedRAMP 20x KSI signal, the NIST OSCAL Rev 5 SSP, the NIST OSCAL POA&M, and the FedRAMP IIW (Appendix M) projection. The VDR aggregator classifies any scanner findings on the FedRAMP 20x PAIN scale and blocks the deploy if anything exceeds the Class C remediation tolerance. All artifacts plus the schema are published at /.well-known/, each Sigstore-signed.
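The emitter's join is small enough to sketch. The file paths, environment variables, type-mapping table, and output field names below are assumptions for illustration; the real interface is scripts/build-ksi-signal.py in the repository.

```python
"""Sketch of the deploy-time emitter: join five evidence sources into one
canonical document. Paths and field names here are illustrative."""
import hashlib
import json
import os
import pathlib

# Hypothetical projection from Terraform resource types to the schema's
# normalized type enum (the real mapping lives in the emitter).
NORMALIZE = {
    "aws_s3_bucket": "object_store",
    "aws_cloudfront_distribution": "cdn_distribution",
    "aws_lambda_function": "function",
}

def components() -> list[dict]:
    comps = []
    # Source 1: cloud resources from Terraform state (normalized type + native ID).
    state = json.loads(pathlib.Path("terraform.tfstate").read_text())
    for res in state.get("resources", []):
        if res["type"] in NORMALIZE:
            for inst in res.get("instances", []):
                comps.append({"type": NORMALIZE[res["type"]],
                              "native_id": inst["attributes"].get("arn")})
    # Source 2: software from the Lambda's package-lock (PURLs).
    lock = json.loads(pathlib.Path("package-lock.json").read_text())
    for path, meta in lock.get("packages", {}).items():
        if path:  # "" is the root project entry
            name = path.split("node_modules/")[-1]
            comps.append({"type": "software_library",
                          "global_id": {"purl": f"pkg:npm/{name}@{meta['version']}"}})
    # Source 3: static artifacts from the hashed website tree (content hashes).
    for p in sorted(pathlib.Path("public").rglob("*.html")):
        comps.append({"type": "static_artifact",
                      "global_id": {"sha256": hashlib.sha256(p.read_bytes()).hexdigest()}})
    return comps

signal = {
    "components": components(),
    # Source 4: OPA policy results ride alongside as validations.
    "validations": json.loads(pathlib.Path("opa-results.json").read_text()),
    # Source 5: GitHub Actions provenance from the workflow environment.
    "provenance": {"repository": os.environ.get("GITHUB_REPOSITORY"),
                   "commit": os.environ.get("GITHUB_SHA"),
                   "run_id": os.environ.get("GITHUB_RUN_ID")},
}
pathlib.Path("ksi-signal.json").write_text(json.dumps(signal, indent=2))
```

The specific mapping table is not the point; the point is that every downstream generator reads this one document rather than re-deriving its own inventory.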

Two CR26 rules sit alongside the deploy chain rather than inside it. Significant Change Notifications are gated at commit time: every push to main (and every PR description) is checked for an SCN-Type: tag with one of three recognized values; missing tags pass and default to routine-recurring per SCN-RTR-NNR. The git history is the SCN audit record (SCN-CSO-MAR). The Minimum Assessment Scope rule lives in the schema: every component carries a FIPS 199 security category and an information-flow summary, populated by the deploy-time emitter (MAS-CSO-FLO). Families inherited from AWS — PE, MA, MP, parts of CP, parts of SC — cite the AWS authorization package AGENCYAMAZONEW and stop there.
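The SCN gate is small enough to sketch in full. The tag names follow the CR26 rule text; the regex and the blocking behavior here are an illustration, not the repository's actual workflow step.

```python
"""Sketch of the SCN commit gate: classify every change by its declared
SCN-Type: tag; unrecognized tags block, missing tags default to routine."""
import re
import sys

RECOGNIZED = {"adaptive", "routine-recurring", "transformative"}

def scn_class(message: str) -> str:
    m = re.search(r"^SCN-Type:\s*(\S+)", message, re.MULTILINE | re.IGNORECASE)
    if m is None:
        return "routine-recurring"  # SCN-RTR-NNR: untagged changes pass as routine
    tag = m.group(1).lower()
    if tag not in RECOGNIZED:
        sys.exit(f"unrecognized SCN-Type: {tag!r}")  # block the push
    return tag

if __name__ == "__main__":
    # e.g. python scn_gate.py "$(git log -1 --pretty=%B)"
    print(scn_class(sys.argv[1]))
```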

A daily AWS Lambda (infrastructure/lambda/index.js) re-evaluates the live AWS configuration against the same compiled Rego the deploy gate uses, loaded as a Wasm module compiled at build time via opa build -t wasm. The Lambda is a thin host: input transformation from AWS API responses, then policy evaluation, then signal emission. There is no JavaScript port of the rules. The deploy-time gate and the runtime emitter run identical compiled bytes; drift between them is structurally impossible. The Lambda writes a fresh runtime KSI signal at /.well-known/ksi-signal-runtime.json; the divergence between deploy-time and runtime is the externally-visible drift surface.
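The single-compilation step is the parity mechanism, and it is one command. A sketch follows, assuming a hypothetical entrypoint name and output path; only opa build -t wasm itself is load-bearing.

```python
"""Sketch of the one-time policy compilation that both the deploy gate and
the runtime Lambda consume. Entrypoint and output names are assumptions."""
import hashlib
import pathlib
import subprocess

subprocess.run(
    ["opa", "build", "-t", "wasm",
     "-e", "policies/deny",              # hypothetical entrypoint rule
     "infrastructure/policies.rego",
     "-o", "policy-bundle.tar.gz"],
    check=True,
)
# Both evaluation points load the bytes produced here; recording the digest
# makes "identical compiled bytes" an auditable claim rather than a promise.
print(hashlib.sha256(pathlib.Path("policy-bundle.tar.gz").read_bytes()).hexdigest())
```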

The Sigstore chain replaces every key-custody question with three public-infrastructure questions. A consumer wanting to verify any signed artifact runs cosign verify-blob with the published bundle and a regex matching the GitHub Actions workflow identity. The verifier checks that the signature is valid for the bytes, that the signing certificate was issued by Fulcio to the named GitHub Actions identity, and that the signature appears in the public Rekor log with a valid inclusion proof. There is no per-CSP key to manage, no per-CSP trust relationship to establish, no rotation schedule to track, no key compromise to recover from. Trust reduces to: do you trust GitHub's OIDC issuer, Fulcio's CA, and Rekor's log? Three public services, thousands of independent observers, and any compromise is detectable by anyone running independent verification.
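Concretely, a verifier's whole toolchain is the cosign CLI plus an HTTP client. In the sketch below, the bundle filename and the identity regex are assumptions about this site's published artifacts; the cosign flags are the standard ones.

```python
"""Sketch of independent verification: fetch a published artifact and its
Sigstore bundle, then let cosign check all three links in the chain."""
import subprocess
import urllib.request

BASE = "https://samaydlette.com/.well-known"
for name in ("ksi-signal.json", "ksi-signal.json.sigstore.bundle"):
    urllib.request.urlretrieve(f"{BASE}/{name}", name)

subprocess.run(
    ["cosign", "verify-blob",
     "--bundle", "ksi-signal.json.sigstore.bundle",
     # the signature must chain to the GitHub Actions workflow identity via Fulcio
     "--certificate-identity-regexp",
     r"^https://github\.com/sam-aydlette/samaydlette\.com/",
     "--certificate-oidc-issuer", "https://token.actions.githubusercontent.com",
     "ksi-signal.json"],
    check=True,  # non-zero exit: signature, identity, or Rekor inclusion failed
)
```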

The total annual operating cost is approximately $125. That number is not a typo. Lambda executions, EventBridge rules, and CloudWatch logs together come to about $125 per year for a daily cadence. KMS, Sigstore, OPA, and AWS API costs are zero.

The Canonical Inventory in Practice

The canonical inventory is the components[] field of the KSI signal. Each entry has a normalized type drawn from a small enum, an optional native_id (the CSP's own identifier where applicable), and a global_id block carrying one or more cross-CSP identifiers. The schema is published at /.well-known/ksi-signal.schema.json.
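Concretely, a components[] entry looks roughly like the following. The top-level field names follow the schema as described in this section; the values and the exact sub-shapes of the FIPS 199 and information-flow fields are illustrative.

```python
# One representative components[] entry (illustrative values).
component = {
    "type": "cdn_distribution",        # normalized enum: the cross-CSP join key
    "native_id": "arn:aws:cloudfront::123456789012:distribution/EXAMPLE",
    "global_id": {},                   # cloud resources have no global ID today
    "fips199": {"confidentiality": "low",
                "integrity": "moderate",
                "availability": "low"},   # MAS: per-component security category
    "flows": ["serves public HTTPS responses from the S3 origin"],  # MAS-CSO-FLO
    "attributes": {"region": "us-east-1"},  # deliberately non-normalized, free-form
}
```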

Five normalization decisions make the inventory composable:

Software components are identified by Package URL. PURL is global by construction. A consumer querying pkg:npm/left-pad@1.3.0 across all published KSI signals gets exactly the systems that include it, regardless of which CSP produced each signal (a join sketch closes this section). This works because PURL is a self-describing global identifier; it would not work if each CSP minted its own software identifier.

Container images are identified by OCI digest. The sha256:<hex> form serves the same role as PURL for software libraries: a global, content-addressable identifier that does not depend on which registry hosts the image.

Cloud resources are identified by normalized type plus native ID. Cloud resources have no global identifier today. The schema does not invent a synthetic cross-CSP ID, because doing so would require either a registry (operationally expensive and politically fraught) or a hashing scheme (loses the round-trip to the CSP's own tooling). Both options are worse than the alternative. Each CSP's native shape projects into a small normalized type vocabulary, with the native_id alongside. object_store covers an S3 bucket, an Azure Blob container, a GCS bucket; cdn_distribution covers CloudFront, Azure Front Door, Cloud CDN; function covers Lambda, Azure Functions, Cloud Functions; compute_instance covers EC2, Azure VM, Compute Engine. The type is the join key for portfolio reasoning. The native ID stays specific because anyone querying the CSP's API needs it.

Static artifacts are identified by SHA-256. No PURL ecosystem exists for arbitrary static content. The hash of the bytes is a perfectly good global identifier and lets consumers check bit-for-bit equality across systems.

Hardware components are identified by HBOM reference. Per CycloneDX HBOM and CISA's 2023 Hardware Bill of Materials Framework, hardware components reference a CycloneDX HBOM document via <bom-ref>@<bom-uri>. The schema slot is reserved; this static-site implementation does not exercise it because there is no hardware to inventory.

Equally important is what the schema deliberately does not normalize: regions, account or subscription identifiers, tag schemas, IAM principal naming, and policy semantics. These vary across CSPs in ways that have no clean projection. They live in attributes, free-form, opaque to the schema. The discipline of naming what is canonical and what is not is itself a contribution. Past attempts at compliance schemas have over-reached, tried to standardize too much, and failed because the standardization cost exceeded the composability benefit. A small, defensible set of canonical names that handles the join cases is more useful than a comprehensive vocabulary that no one adopts.
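The payoff of those decisions is the join itself. A sketch of the cross-CSP query promised above, with hypothetical signal endpoints standing in for multiple producers and the components[] shape as described in this section:

```python
"""Sketch of the cross-CSP join that canonical identifiers enable: find
every system whose published KSI signal includes a given package."""
import json
import urllib.request

SIGNALS = [  # hypothetical endpoints published by different CSPs
    "https://csp-a.example/.well-known/ksi-signal.json",
    "https://csp-b.example/.well-known/ksi-signal.json",
]

def systems_with(purl: str) -> list[str]:
    hits = []
    for url in SIGNALS:
        signal = json.load(urllib.request.urlopen(url))
        purls = {c.get("global_id", {}).get("purl") for c in signal["components"]}
        if purl in purls:
            hits.append(url)
    return hits

# "Which of my authorized systems contain the library that just had a CVE?"
print(systems_with("pkg:npm/left-pad@1.3.0"))
```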

Differentiation: Per-Control Status and Origination

Each implemented-requirement in the generated SSP carries differentiated values for two OSCAL properties: implementation status and control origination.

Implementation status uses the OSCAL-standard enum:

  • implemented: fully implemented by this system's code and configuration
  • partial: partially implemented, with the rest noted
  • planned: not implemented but planned (tracked in the POA&M)
  • alternative: addressed via compensating control
  • not-applicable: does not apply to this system, with rationale

Control origination uses the FedRAMP-standard enum:

  • sp-system: implemented in this system's code or configuration
  • sp-corporate: implemented at the operator/organization level (training, IR planning, review cadences)
  • shared: partially inherited from the underlying CSP, partially this system's responsibility
  • inherited: fully inherited from the underlying CSP
  • customer-configured: would be configured by an end customer (N/A here; no end customer)
  • customer-provided: would be provided by an end customer (N/A here)

The generator resolves status and origination per control through three layers:

  1. Per-control overrides for ~30 of the most load-bearing controls (CM-2/3/3.2/4/5/6/7/8, AC-2/3/4/6/17, AU-2/3/12, IA-2/3, IR-4/6/8, RA-5, SC-7/8/13, SI-2/3/4/7, SR-3/11) carry hand-written implementation statements that name specific evidence: configuration files, IAM policies, log paths, signed artifacts.
  2. Family-level fall-throughs (covering AC, AT, AU, CA, CM, CP, IA, IR, MA, MP, PE, PL, PM, PS, RA, SA, SC, SI, SR) handle the remaining controls in each family with statements that name how the family is addressed at the operator or inherited level.
  3. The *-1 (policy and procedures) controls in every family route to a templated statement pointing at the relevant documentation.

Live-signal validation results downgrade implemented to partial automatically. If a contributing KSI shows failing validations in the runtime signal, the SSP control downgrades and a remark points at the failure.
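A sketch of the resolution and the downgrade together: the control IDs are real, while the tables, statements, and paths are illustrative stand-ins for the generator's actual data.

```python
"""Sketch of the three-layer status/origination resolution plus the
live-signal downgrade. Data structures here are illustrative stubs."""

OVERRIDES = {  # layer 1: hand-written statements for ~30 load-bearing controls
    "cm-6": {"status": "implemented", "origination": "sp-system",
             "statement": "OPA policies in infrastructure/policies.rego gate every deploy."},
}

FAMILY_DEFAULTS = {  # layer 2: family-level fall-throughs
    "pe": {"status": "not-applicable", "origination": "inherited",
           "statement": "Physical protection inherited from AWS (AGENCYAMAZONEW)."},
    "at": {"status": "implemented", "origination": "sp-corporate",
           "statement": "Training handled at the operator level."},
}

def resolve(control_id: str, failing_ksis: set, contributing: dict) -> dict:
    family, _, number = control_id.partition("-")
    if control_id in OVERRIDES:                       # layer 1: per-control override
        entry = dict(OVERRIDES[control_id])
    elif number == "1":                               # layer 3: -1 policy controls
        entry = {"status": "implemented", "origination": "sp-corporate",
                 "statement": f"See the {family.upper()} family policy document."}
    else:                                             # layer 2: family fall-through
        entry = dict(FAMILY_DEFAULTS[family])
    # Failing validations in the runtime signal downgrade implemented -> partial.
    failing_here = failing_ksis & set(contributing.get(control_id, []))
    if entry["status"] == "implemented" and failing_here:
        entry["status"] = "partial"
        entry["remark"] = f"downgraded: failing KSI validations {sorted(failing_here)}"
    return entry

print(resolve("cm-6", failing_ksis={"KSI-CMT-01"}, contributing={"cm-6": ["KSI-CMT-01"]}))
print(resolve("pe-3", failing_ksis=set(), contributing={}))
```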

The current generator emits 331 implemented-requirements (full FedRAMP Rev 5 Moderate baseline of 323 controls plus eight KSI-extension controls beyond the baseline) with this distribution: 74% implemented, 26% not-applicable, 0% partial. By origination: 52% sp-system, 19% sp-corporate, 16% shared with AWS, 13% inherited from the AWS authorization (package AGENCYAMAZONEW).

An assessor reviewing this SSP has specific things to verify per control rather than blanket statements to reconcile. The work of assessment shifts from verifying that the SSP is accurate against the live system, which is mechanical, toward verifying that the implementation statements are sufficient, which is judgment.

What This Proves

Each claim below is observable in the running system, not asserted in the abstract.

  • The schema is implementable: the live KSI signal validates against it on every deploy.
  • Sigstore keyless signing works end-to-end: cosign verify-blob succeeds against all five published bundles. No private key in the system.
  • The four-reports-from-one-inventory composition runs on every deploy, not as an architectural argument but as the deploy chain.
  • Differentiated-status SSP generation: 30 hand-written overrides, ~300 family-default-generated, 331 total — observable in the published artifact.
  • Runtime/deploy policy parity: identical compiled Wasm bytes, drift between policy versions structurally impossible.
  • Public publication is viable at this scale: threat-modeled, bounded blast radius. Access control of compliance state is an editorial choice, not a security requirement.
  • Build-as-report for VDR: per-deploy emission, Class C SLA enforced, blocking conditions wired.
  • Annual operating cost: ~$125, observed.

Some claims are demonstrated only by schema construction and remain gated on multi-party adoption. The implementation is a single producer. Cross-CSP join, portfolio-level aggregation, and programmatic third-party-risk consumption all rely on other parties publishing conformant signals. The schema makes those joins correct in principle; a working federation requires actual conformant signals from multiple producers. That is the chicken-and-egg pattern every coordination protocol begins with.

Alignment with the Consolidated Rules for 2026

FedRAMP’s Consolidated Rules for 2026 (CR26) reorganize the framework around machine-readable authorization data, text-based equivalents in place of Word and Excel deliverables, and five rule sets that previously lived in separate processes and now fold into the default requirements. This implementation integrates all five, citing specific rule IDs the system actually mechanizes:

  • MAS — Minimum Assessment Scope. Every component in the canonical inventory declares a FIPS 199 security category and information flows per MAS-CSO-FLO. Third-party AWS resources are documented at the inheritance level per MAS-CSO-TPR. The schema enforces that the fields exist and are populated; the values are an editorial decision per system.
  • SCN — Significant Change Notifications. The CI workflow validates every push to main and every PR description for an SCN-Type: tag (one of adaptive, routine-recurring, transformative), with routine-recurring as the default per SCN-RTR-NNR. The git history is the audit record per SCN-CSO-MAR. Adaptive and Transformative changes carry the corresponding tier-rule notification timeframes from SCN-ADP-NTF (10 business days) and SCN-TRF-NIP / SCN-TRF-NFP / SCN-TRF-NAF (30 + 10 + 5 business days).
  • CCM — Collaborative Continuous Monitoring. Every monitoring artifact is published publicly at /.well-known/, signed via Sigstore, verifiable from anywhere. The runtime KSI emitter, deploy-time OPA gate, and build-time VDR aggregator together satisfy the persistent-reporting model in CCM-* (rules CCM-OCR-AVL, CCM-OCR-RPS, CCM-OCR-AFS, CCM-QTR-MTG). The consolidated strategy lives in docs/continuous-monitoring-plan.md.
  • VDR — Vulnerability Detection and Response. The build is the report. The aggregator classifies findings per VDR-EVA-EPA (PAIN N1-N5), VDR-EVA-EIR (internet-reachable), VDR-EVA-ELX (likely-exploitable), and the CISA KEV catalog per VDR-TFR-KEV; it blocks the deploy when any finding exceeds the Class C tolerance from VDR-TFR-PVR. Per-deploy emission satisfies VDR-RPT-PER and overshoots the monthly minimum in VDR-TFR-MHR. The VDR-RPT-VDT and VDR-RPT-AVI fields are populated for each finding and each accepted vulnerability respectively.
  • CDS — Certification Data Sharing. All certification data is published at /.well-known/ with no access gating — KSI signal, Sigstore bundle, schema, OSCAL SSP plus bundle, OSCAL POA&M plus bundle, VDR report plus bundle, IIW CSV plus bundle, runtime KSI signal — consistent with CDS-CSO-PUB (publicly available), CDS-CSO-CBF (consumer-friendly format), and CDS-CSO-RPS (continuously refreshed). A consumer with curl, jq, and cosign verifies the system’s compliance state independently from anywhere on the public internet.
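The CDS bullet is directly exercisable. A sketch of a consumer checking the claim that the live signal validates against its published schema; the signal filename is an assumption consistent with the runtime signal's naming, and jsonschema is a third-party dependency (pip install jsonschema).

```python
"""Sketch of a CDS consumer check: pull the live signal and its schema,
and validate one against the other."""
import json
import urllib.request

import jsonschema

BASE = "https://samaydlette.com/.well-known"
signal = json.load(urllib.request.urlopen(f"{BASE}/ksi-signal.json"))
schema = json.load(urllib.request.urlopen(f"{BASE}/ksi-signal.schema.json"))

jsonschema.validate(instance=signal, schema=schema)  # raises on any drift
print(f"schema-valid signal with {len(signal['components'])} components")
```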

CR26 also retires DOCX and XLSX as acceptable formats for Class A–C certifications in favor of simple text-based equivalents (per notice NTC-0009). Every artifact in this implementation is JSON, markdown, CSV, or YAML — text-based and diff-able. The IIW CSV is the text-based equivalent of the Excel workbook the FedRAMP processes still ask for; the OSCAL SSP carries the structured inventory the IIW projects from; the family policies, POA&M, CMP, PTA, and Rules of Behavior are markdown. CR26 also removes the diagram-and-illustration requirement after the MAS transition; the boundary diagram in this codebase is supplementary human-readable context, not a load-bearing compliance deliverable.

The argument the paper makes is therefore not “this is what FedRAMP could accept” but “this is what CR26 will require, mechanized.”

Business Problems This Addresses

The technical pattern, generalized, addresses several recurring business problems in cloud compliance.

Assessment cost. Most of a 3PAO assessment's cost goes to reconciling the SSP's claims against the live system's actual configuration and writing the gap into the assessment report, not to evaluating control implementations. When SSPs are auto-generated from canonical inventories, the gap is closed by construction: the SSP describes what is. An assessor's time shifts from "verifying that the SSP is accurate" to "verifying that the inventory reflects reality and the implementation statements are sufficient." The first is mechanical; the second is judgment, which is the higher-value activity. The cost of full assessment drops, and the cost of continuous assessment, which is what cATO promises and what 20x is structured around, becomes tractable instead of theoretical.

SSP drift. The most common audit finding across federal cloud authorizations is some form of "the SSP says one thing, the system does another." This finding exists because SSPs are written once, by humans, and the system changes daily. Auto-derivation eliminates this class of finding. Drift cannot exist between the SSP and the system because the SSP is generated from the system on every deploy. A consumer reading the SSP at any moment is reading the current system, not a snapshot from the last formal review.

Framework transition. CSPs in 2026 face a transition between Rev 5 and 20x, and many face simultaneous compliance with state-level programs (StateRAMP, TX-RAMP), industry-specific frameworks (HIPAA, PCI), and customer-driven assertions (SOC 2). Each framework asks for largely the same evidence in a different shape. The dual-emit pattern shows that the cost of supporting multiple frameworks is not multiplicative in the system's complexity; it is additive in the number of report generators above one shared inventory. The work of writing each report generator is bounded and one-time. The work of maintaining each report's content is zero, because the content derives from the live inventory.
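A sketch of that additivity: each framework is one projection function over the same inventory, and adding a framework means adding a function, not touching the inventory. The inventory entries and the abbreviated IIW columns below are illustrative.

```python
"""Sketch of the dual-emit pattern: several report generators above one
shared canonical inventory. Entries and column names are illustrative."""
import csv
import io
import json

INVENTORY = [
    {"type": "object_store", "native_id": "arn:aws:s3:::example-bucket"},
    {"type": "software_library", "global_id": {"purl": "pkg:npm/left-pad@1.3.0"}},
]

def ksi_signal(inv):
    """FedRAMP 20x-shaped wrapper: components plus validations."""
    return json.dumps({"schema": "ksi-signal", "components": inv})

def oscal_inventory(inv):
    """OSCAL Rev 5-shaped wrapper: the SSP's inventory items."""
    items = [{"uuid": f"component-{i}", "description": c["type"]}
             for i, c in enumerate(inv)]
    return json.dumps({"inventory-items": items})

def iiw_rows(inv):
    """FedRAMP IIW (SSP Appendix M) projection, abbreviated column set."""
    out = io.StringIO()
    w = csv.DictWriter(out, fieldnames=["UNIQUE ASSET IDENTIFIER", "ASSET TYPE"])
    w.writeheader()
    for c in inv:
        w.writerow({"UNIQUE ASSET IDENTIFIER": c.get("native_id")
                    or c.get("global_id", {}).get("purl"),
                    "ASSET TYPE": c["type"]})
    return out.getvalue()

# Adding a framework is adding a function; the inventory never changes shape.
for report in (ksi_signal, oscal_inventory, iiw_rows):
    print(report(INVENTORY))
```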

Cyber insurance underwriting. The cyber insurance market has struggled to price risk because insurers cannot get reliable, current data on the systems they are insuring. They rely on questionnaires (subject to misrepresentation), point-in-time audits (immediately stale), or self-reported attestations (varying in quality). A canonical inventory with a Sigstore-signed verifiable provenance chain gives insurers something they currently cannot get: a continuously updated, cryptographically attested, machine-queryable description of the insured system's components and their security posture. Whether insurers actually adopt this is a market question, not a technical one. The technical barrier to adoption, "we cannot get reliable data," is structurally addressable.

Vendor risk management. Most third-party-risk-management functions spend the majority of their time chasing answers from suppliers. The supplier sends a SOC 2 report twelve months out of date; the TPRM team translates it into their internal scoring system; the score informs a procurement decision; and by the time the decision is made, the supplier's actual posture has changed. A supplier publishing a continuously updated KSI signal at a stable URL (even if the URL is gated behind authorized customer access) collapses the cycle. The TPRM consumer queries the signal, joins it against their portfolio, and gets an answer in seconds rather than weeks.

What Remains

The patterns above are domain-specific. The structural problem they each address is the same, and shows up across society wherever risk is aggregated across heterogeneous systems run by independent parties: insurance pricing, banking stability, defense capability assessment, public-health surveillance. Each depends on composing information across organizations that have no other reason to share data. When every reporting entity uses different identifiers, different schemas, different attestation chains, the aggregator’s job becomes bounded by reconciliation cost rather than analytic insight. The canonical-inventory pattern is one instance of a more general structural answer: do the standardization at the layer below where the differentiation actually matters.

The components for this layer already exist. PURL is well-established for software supply chain. Sigstore is operational. CycloneDX HBOM exists. SLSA provenance is gaining adoption. The piece that does not exist is the will to assemble these into a coherent compliance substrate, plus the political agreement that this is where the standardization should live.

Engineering at scale. A real IaaS CSP has tens of thousands of components in its canonical inventory. The schema scales because each component is a flat record and joins are by identifier rather than by traversal. The bottleneck is the inventory generator: a real CSP would need incremental rebuilds rather than full regeneration on every deploy. Multi-tenancy needs explicit handling: a SaaS publishing one canonical inventory for the platform plus per-tenant inventories for each customer's configured surface is the obvious shape; the operational discipline of producing them at scale is harder. Per-customer overlays (Agency A wants FIPS 140-3, Agency B wants StateRAMP-equivalent) are bounded engineering: each is another report generator above the same inventory layer. Publication policy is a knob; this PoC is fully public, and a real regulated CSP would gate per-system signals behind agency-authenticated access without changing the pipeline.

Coordination is the hard part. Cloud service providers compete partly on inventory opacity; uniform comparison across providers is not something incumbent CSPs are eager to enable. Compliance gatekeepers earn their living from the gap that canonical inventories would close. Regulators have sunk-cost reasons to maintain current frameworks rather than coalesce around a shared substrate. Each is a rational local incentive that pulls against the global optimum.

Coordination problems are solved slowly, when they are solved at all. TCP/IP took roughly a decade from initial deployment to clear dominance. Container shipping took thirty years. Coordination protocols win on patient work plus lucky political openings plus early adopters who tolerate the cost of going first; they almost never win on being technically obviously correct from the beginning. This implementation is one early adopter, going first on a static personal website, demonstrating that the cost of compliance composability is roughly $125 per year and a few weekends of engineering. Whether enough other actors follow is a question about coordination, not about the schema. The technical answer exists.

Background reading on the proposal this paper implements: