Test Automation

Alloy Analyzer for Software Design Verification

HelpMeTest

20 May 2026 — 6 min read

Most software bugs are born not in code but in the design — the mental model of how a system should behave. By the time that model becomes code, tests, and documentation, the original flaw has been replicated across dozens of files and thousands of lines. Alloy is a lightweight formal modeling language that lets you express design ideas precisely and then automatically find counterexamples to your assumptions. It has a low learning curve compared to other formal methods tools, and it excels at the kinds of structural, relational problems that appear in access control, data modeling, and system architecture.

Alloy vs TLA+

Before diving in, it is worth clarifying how Alloy differs from TLA+, which was covered in our previous post on distributed systems verification.

TLA+ is primarily temporal — it describes how systems change over time. It is ideal for protocols, concurrent systems, and anything where the sequence of events matters. TLA+ specs describe state transitions and the TLC model checker explores all possible execution sequences.

Alloy is primarily structural — it describes relationships between entities at a point in time. It is ideal for data models, access control policies, type systems, and architectural constraints. The Alloy Analyzer uses a SAT solver to find instances (models that satisfy your constraints) or counterexamples (models that violate a property you claim should hold).

In practice, Alloy is often the better starting point because the learning curve is gentler and the tool provides immediate visual feedback. Many teams use Alloy for design validation and TLA+ for protocol verification.

The Alloy Language

Alloy is built on first-order relational logic. Everything is a relation — even scalar values and sets are degenerate cases of relations. The basic building blocks are:

Signatures define types (sets of atoms). Fields define relations between signatures. Facts are constraints that always hold. Predicates are named formulas. Assertions are properties you claim hold for all instances. Functions compute values from relations.

Here is a simple model of a file system:

sig File {}
sig Dir {
  contents: set File + Dir
}
one sig Root extends Dir {}

fact NoSelfContainment {
  no d: Dir | d in d.^contents
}

pred show {}
run show for 4

The run show command asks Alloy to find any instance with up to 4 atoms of each type. The Alloy Analyzer opens a visualization showing a file system tree with no cycles.

The ^ operator means transitive closure — d.^contents is all things reachable from d through the contents relation. no d: Dir | d in d.^contents means no directory is reachable from itself, preventing infinite loops.

Finding Design Flaws Early: Access Control

Access control is one of the most error-prone parts of software design. Policy rules interact in subtle ways that human reasoning misses. Alloy is particularly good here.

Consider a simple role-based access control model:

sig User {}
sig Role {}
sig Resource {}
sig Permission {}

sig System {
  userRoles: User -> set Role,
  rolePerms: Role -> set Permission,
  resPerms: Resource -> set Permission
}

pred canAccess(s: System, u: User, r: Resource) {
  some p: Permission |
    p in u.(s.userRoles).(s.rolePerms) and
    p in r.(s.resPerms)
}

-- Claim: if a user has no roles, they cannot access any resource
assert NoRoleNoAccess {
  all s: System, u: User, r: Resource |
    no u.(s.userRoles) => not canAccess[s, u, r]
}

check NoRoleNoAccess for 5

The check command asks Alloy to find a counterexample — an instance where NoRoleNoAccess is violated. If the model is correct, Alloy reports "No counterexample found." If it finds one, it displays exactly which users, roles, and resources are involved.

Now add a complexity: guest access. Resources can have permissions that anyone can access without a role.

sig System {
  userRoles: User -> set Role,
  rolePerms: Role -> set Permission,
  resPerms: Resource -> set Permission,
  publicPerms: set Permission  -- permissions anyone has
}

pred canAccess(s: System, u: User, r: Resource) {
  some p: Permission |
    p in r.(s.resPerms) and
    (p in u.(s.userRoles).(s.rolePerms) or p in s.publicPerms)
}

Now NoRoleNoAccess gets a counterexample immediately: a user with no roles can access a resource if that resource only requires public permissions. This is correct behavior, but you need to update your assertion to reflect it. The process of updating assertions forces you to make your policy explicit and precise.

Modeling Database Constraints

Relational database schemas have constraints that are easy to state informally but subtle to get right. Alloy can verify that your schema enforces what you think it enforces.

Consider an order management system:

abstract sig Status {}
one sig Pending, Processing, Shipped, Cancelled extends Status {}

sig Order {
  status: Status,
  items: some Item
}

sig Item {
  order: Order
}

fact ItemsBelongToOneOrder {
  all i: Item | one i.order
  all o: Order | o.items = order.o
}

-- Orders can only be cancelled if they are pending or processing
pred validCancellation(o: Order) {
  o.status in Pending + Processing
}

-- Can a shipped order ever become pending?
assert ShippedNotRevertable {
  all o, o': Order |
    o.status = Shipped =>
    o'.status != Pending
}

This assertion models a constraint over time — Alloy's default is static, but you can model transitions explicitly:

pred transition(o, o': Order) {
  (o.status = Pending and o'.status in Processing + Cancelled)
  or (o.status = Processing and o'.status in Shipped + Cancelled)
  -- Shipped is terminal
}

assert ShippedTerminal {
  all o, o': Order |
    o.status = Shipped and transition[o, o'] =>
    o'.status = Shipped
}

check ShippedTerminal for 5

If your transition predicate accidentally allows Shipped -> Processing, Alloy finds the counterexample instantly.

Case Study: Verifying an Authorization Policy

A real-world case study comes from a team building a multi-tenant SaaS application. They had a complex authorization policy:

Users belong to organizations
Organizations have projects
Projects have roles (viewer, editor, admin)
Admins can invite users to their organization
Users can only see projects in their organization
Super-admins can see everything

Written in natural language, this sounds clear. Written in Alloy:

sig User {}
sig Org {
  members: set User,
  admins: set User,
  projects: set Project
}
sig Project {
  org: Org,
  viewers: set User,
  editors: set User
}
one sig SuperAdmin extends User {}

fact AdminsAreMembers {
  all o: Org | o.admins in o.members
}

fact ProjectBelongsToOrg {
  all p: Project | p in p.org.projects
}

pred canSeeProject(u: User, p: Project) {
  u = SuperAdmin
  or (u in p.org.members and u in p.viewers + p.editors + p.org.admins)
}

assert CrossOrgVisibility {
  all u: User, p: Project |
    canSeeProject[u, p] and u != SuperAdmin =>
    u in p.org.members
}

check CrossOrgVisibility for 6

When the team first wrote this model, Alloy immediately found a counterexample: an editor of a project whose organization had been transferred was still in p.editors but no longer in p.org.members. The policy as coded would let them see the project. The fix was to add a fact that project roles are always a subset of the owning org's members.

This bug would have been a serious security vulnerability in production. Alloy found it in seconds from a 30-line model, before any code was written.

The Alloy Analyzer Workflow

The workflow with Alloy is iterative:

Start with run: Ask Alloy to find any instance. Look at the visualization. Does it match your mental model? If not, your facts are wrong or incomplete.
Add assertions and check: State properties you believe hold. Run check. If Alloy finds a counterexample, your design has a bug. If not, increase the scope and check again.
Increase scope gradually: Start with for 3 (3 atoms per type). Increase to for 5, for 8. Alloy uses a SAT solver, so larger scopes take exponentially longer but catch more edge cases.
Refine until confident: When all your assertions pass at a scope you trust, the design is verified within that scope.

The key limitation — common to all bounded model checking — is that you can only verify properties within a finite scope. Alloy cannot prove properties hold for all possible instances of any size. For many practical purposes, scope 5-8 is sufficient because real design bugs manifest with 3-4 entities.

Integrating Alloy into Your Development Process

Alloy works best as a pre-code design tool. The workflow is:

Before writing any code, write an Alloy model of your data model or access control policy. Spend an hour verifying it. Find and fix the design bugs at the model level, where they are cheap to fix.

Keep models alongside code. Store .als files in a specs/ directory. When the design changes, update the model and re-verify.

Use models as documentation. Alloy models are precise, executable specifications. They are better documentation than prose because they can be verified.

For CI integration, the Alloy command-line runner can be invoked from scripts:

java -cp alloy4.2.jar edu.mit.csail.sdg.alloy4whole.ExampleUsingTheCompiler MyModel.als

The exit code is non-zero if any assertion fails, making it CI-friendly.

Learning Resources

The official Alloy tutorial at alloytools.org walks through the language in detail. Daniel Jackson's book "Software Abstractions" is the definitive reference and is written accessibly for practicing engineers rather than academics.

The Alloy community has published models for OAuth 2.0, TLS handshake, file systems, type systems, and many other common patterns. These are worth studying before building your own models.

Alloy does not replace testing. It verifies design properties, not implementation correctness. The two tools work at different levels and complement each other. A verified Alloy model tells you the design is sound; tests tell you the implementation matches the design.

For teams dealing with complex access control, intricate data models, or architectural constraints that are hard to reason about, Alloy is one of the highest-leverage tools available. An hour with Alloy can save a week of debugging security vulnerabilities or data corruption bugs in production.

Alloy Analyzer for Software Design Verification

HelpMeTest

Alloy vs TLA+

The Alloy Language

Finding Design Flaws Early: Access Control

Modeling Database Constraints

Case Study: Verifying an Authorization Policy

The Alloy Analyzer Workflow

Integrating Alloy into Your Development Process

Learning Resources

Read more

Testing React Router v7 with Vite + Vitest: Setup and Best Practices

E2E Testing React Router v7 Apps with Playwright

Migrating from Remix to React Router v7: Testing Your Migration

Testing React Router v7 Loaders and Actions with Vitest