Aggregates & Repositories

Clustering entities behind an aggregate root that guards invariants, and the repository interface that loads/saves whole aggregates — with persistence ignorance, demonstrated in runnable Go.

📦 Analogy

An aggregate is a sealed parcel. You don’t reach inside someone’s order and shuffle the line items by hand — you hand the request to the order (“add this item”), and it decides whether that’s allowed and keeps its own total correct. The aggregate root is the only address on the parcel; everything inside is reached through it. A repository is the warehouse: you ask it for parcel #42 or hand it one to store, and you never see the shelving system.

Aggregates: a consistency boundary with one door

In DDD, an aggregate is a cluster of entities and value objects treated as a single unit for changes. One entity is the aggregate root — the only object outside code may hold a reference to — and it guards the invariants of everything inside. The aggregate is also the transaction/consistency boundary: one transaction modifies one aggregate, and its invariants always hold afterwards.

graph TD
subgraph Order["Order aggregate"]
  R["Order (root)<br/>total, status"] --> L1["OrderLine"]
  R --> L2["OrderLine"]
end
Svc["application service"] -->|"only via root"| R
Repo["OrderRepository"] -->|"load / save whole aggregate"| Order
L1 -.->|"NO direct outside access"| X["✗"]

The rules in practice: reference only the root from outside; mutate inner objects only through root methods; keep one aggregate small (it’s loaded and saved as a whole); and let one transaction touch one aggregate, with cross-aggregate consistency handled eventually (e.g. domain events / a message queue).

See it: an aggregate root enforcing an invariant

The Order root keeps its total correct and refuses illegal transitions (you can’t add lines to a shipped order). Outside code can’t break the invariant because the lines aren’t exported for mutation — every change goes through a method:

▶ order.go — editable & runnable

package main

import (
	"errors"
	"fmt"
)

type OrderLine struct {
	SKU      string
	Quantity int
	Cents    int64
}

// Order is the aggregate root. Lines are unexported: the only way to change
// them is through methods, so the root can keep its invariants.
type Order struct {
	id     string
	status string // "open" | "shipped"
	lines  []OrderLine
	totalC int64
}

func NewOrder(id string) *Order { return &Order{id: id, status: "open"} }

// AddLine enforces invariants: only on open orders, and keep total correct.
func (o *Order) AddLine(l OrderLine) error {
	if o.status != "open" {
		return errors.New("cannot modify a shipped order")
	}
	if l.Quantity <= 0 {
		return errors.New("quantity must be positive")
	}
	o.lines = append(o.lines, l)
	o.totalC += int64(l.Quantity) * l.Cents // invariant: total == sum(lines)
	return nil
}

func (o *Order) Ship() error {
	if len(o.lines) == 0 {
		return errors.New("cannot ship an empty order")
	}
	o.status = "shipped"
	return nil
}

func (o *Order) Total() int64 { return o.totalC }

func main() {
	o := NewOrder("ord-42")
	_ = o.AddLine(OrderLine{SKU: "BOOK", Quantity: 2, Cents: 1500})
	_ = o.AddLine(OrderLine{SKU: "PEN", Quantity: 3, Cents: 200})
	fmt.Printf("total: %.2f\n", float64(o.Total())/100) // 36.00

	if err := o.Ship(); err != nil {
		fmt.Println(err)
	}
	// Invariant protected: no edits after shipping.
	if err := o.AddLine(OrderLine{SKU: "LATE", Quantity: 1, Cents: 999}); err != nil {
		fmt.Println("blocked:", err)
	}
	fmt.Printf("final total still: %.2f\n", float64(o.Total())/100) // 36.00
}

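Because lines and totalC are unexported and only methods touch them, code outside Order’s package cannot append a line without updating the total, so the total == sum(lines) invariant cannot drift. Go enforces this at the package boundary: in this single-file demo everything shares package main, so in a real project Order would live in its own domain package. That’s the whole point of an aggregate root.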
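Reading the lines needs the same care as writing them. The sketch below assumes a hypothetical Lines() accessor that is not part of the example above; it returns a copy so callers can inspect the lines without getting a handle on the slice the root owns. Validation is left out of AddLine here only to keep the sketch short.

```go
package main

import "fmt"

type OrderLine struct {
	SKU      string
	Quantity int
	Cents    int64
}

// Order is a trimmed copy of the aggregate root, kept only so this
// sketch compiles on its own.
type Order struct {
	lines  []OrderLine
	totalC int64
}

// AddLine appends a line and keeps the total in step with it.
func (o *Order) AddLine(l OrderLine) {
	o.lines = append(o.lines, l)
	o.totalC += int64(l.Quantity) * l.Cents
}

func (o *Order) Total() int64 { return o.totalC }

// Lines is a hypothetical read accessor. It returns a copy, so callers can
// inspect the lines but cannot change the slice the root owns.
func (o *Order) Lines() []OrderLine {
	out := make([]OrderLine, len(o.lines))
	copy(out, o.lines)
	return out
}

func main() {
	o := &Order{}
	o.AddLine(OrderLine{SKU: "BOOK", Quantity: 2, Cents: 1500})

	view := o.Lines()
	view[0].Quantity = 100 // edits the caller's copy only

	fmt.Println(o.Lines()[0].Quantity, o.Total()) // 2 3000
}
```

Returning o.lines directly would hand out the same backing array, and a caller could then change a quantity behind the root's back and leave the total stale.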

Repositories: a collection illusion over storage

A repository gives the domain a collection-like interface — Save, FindByID — for an aggregate root, while hiding how it’s stored. Crucially, the interface lives in the domain (it expresses what the domain needs) and the implementation lives in infrastructure (Postgres, in-memory, a remote API). This dependency inversion keeps the domain persistence-ignorant: it never imports the database driver.

▶ repository.go — editable & runnable

package main

import (
	"errors"
	"fmt"
)

type User struct {
	ID    string
	Email string
}

// UserRepository is the PORT — defined where the domain/app needs it.
// In Go, implementations satisfy it implicitly; they don't import it.
type UserRepository interface {
	Save(u User) error
	FindByID(id string) (User, error)
}

// inMemoryUsers is an ADAPTER — an infrastructure detail. Swap it for
// Postgres without changing a line of domain code.
type inMemoryUsers struct{ data map[string]User }

func NewInMemoryUsers() *inMemoryUsers { return &inMemoryUsers{data: map[string]User{}} }

func (r *inMemoryUsers) Save(u User) error { r.data[u.ID] = u; return nil }
func (r *inMemoryUsers) FindByID(id string) (User, error) {
	u, ok := r.data[id]
	if !ok {
		return User{}, errors.New("not found")
	}
	return u, nil
}

// Application code depends on the INTERFACE, never the concrete type.
func register(repo UserRepository, id, email string) (User, error) {
	u := User{ID: id, Email: email}
	if err := repo.Save(u); err != nil {
		return User{}, err
	}
	return repo.FindByID(id)
}

func main() {
	var repo UserRepository = NewInMemoryUsers() // <- only line that picks the adapter
	u, err := register(repo, "u1", "ada@example.com")
	fmt.Println(u, err)

	_, err = repo.FindByID("missing")
	fmt.Println("lookup missing:", err)
}
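The same shape applies to an aggregate with internal state. Here is a minimal sketch of an in-memory OrderRepository that stores and returns the whole Order. The reduced Order type, the clone helper, and the ErrOrderNotFound sentinel are assumptions made for this illustration, not part of the examples above.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrOrderNotFound is a hypothetical sentinel error for this sketch.
var ErrOrderNotFound = errors.New("order not found")

type OrderLine struct {
	SKU      string
	Quantity int
	Cents    int64
}

// Order is a reduced aggregate root: just enough state to show that the
// repository stores and returns it as one unit.
type Order struct {
	id     string
	lines  []OrderLine
	totalC int64
}

func NewOrder(id string) *Order { return &Order{id: id} }
func (o *Order) ID() string     { return o.id }
func (o *Order) Total() int64   { return o.totalC }

func (o *Order) AddLine(l OrderLine) {
	o.lines = append(o.lines, l)
	o.totalC += int64(l.Quantity) * l.Cents
}

// clone deep-copies the aggregate so stored state and caller state
// never share a slice.
func (o *Order) clone() *Order {
	c := *o
	c.lines = append([]OrderLine(nil), o.lines...)
	return &c
}

// OrderRepository loads and saves whole Order aggregates, never single lines.
type OrderRepository interface {
	Save(o *Order) error
	FindByID(id string) (*Order, error)
}

type inMemoryOrders struct{ data map[string]*Order }

func NewInMemoryOrders() *inMemoryOrders {
	return &inMemoryOrders{data: map[string]*Order{}}
}

func (r *inMemoryOrders) Save(o *Order) error {
	r.data[o.ID()] = o.clone()
	return nil
}

func (r *inMemoryOrders) FindByID(id string) (*Order, error) {
	o, ok := r.data[id]
	if !ok {
		return nil, ErrOrderNotFound
	}
	return o.clone(), nil
}

func main() {
	var repo OrderRepository = NewInMemoryOrders()

	o := NewOrder("ord-42")
	o.AddLine(OrderLine{SKU: "BOOK", Quantity: 2, Cents: 1500})
	_ = repo.Save(o)

	loaded, _ := repo.FindByID("ord-42")
	loaded.AddLine(OrderLine{SKU: "PEN", Quantity: 3, Cents: 200})

	stored, _ := repo.FindByID("ord-42")
	fmt.Println(stored.Total(), loaded.Total()) // 3000 3600: the unsaved change is not stored

	_ = repo.Save(loaded)
	stored, _ = repo.FindByID("ord-42")
	fmt.Println(stored.Total()) // 3600

	_, err := repo.FindByID("missing")
	fmt.Println(errors.Is(err, ErrOrderNotFound)) // true
}
```

Copying on both Save and FindByID makes the map behave more like a real store: a change to a loaded aggregate only becomes visible after it is saved again as a whole.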

🐹 Go makes the repository pattern almost free

Because Go interfaces are satisfied implicitly, the Postgres adapter never has to declare that it “implements” the domain interface; it only needs Save and FindByID methods with matching signatures. (It will usually still import the domain package for the aggregate type those methods take and return.) The domain declares the interface it needs; infrastructure provides a type with matching methods; you wire them together in main. That means your business logic and aggregates are unit-testable against the in-memory adapter with no database, no mocking framework, and no build tags, which is exactly the seam hexagonal architecture is built on. Keep one repository per aggregate root.
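To make the testing claim concrete, here is a small sketch that exercises register against two hand-written adapters. The failingUsers stub and the errStorageDown error are hypothetical names introduced for this illustration. In a real project these checks would sit in a _test.go file; main stands in for the test runner here.

```go
package main

import (
	"errors"
	"fmt"
)

type User struct {
	ID    string
	Email string
}

type UserRepository interface {
	Save(u User) error
	FindByID(id string) (User, error)
}

type inMemoryUsers struct{ data map[string]User }

func (r *inMemoryUsers) Save(u User) error { r.data[u.ID] = u; return nil }
func (r *inMemoryUsers) FindByID(id string) (User, error) {
	u, ok := r.data[id]
	if !ok {
		return User{}, errors.New("not found")
	}
	return u, nil
}

// errStorageDown is a hypothetical error used by the stub below.
var errStorageDown = errors.New("storage down")

// failingUsers is a hand-written stub: every call fails, which lets a test
// exercise the error path without a database or a mocking library.
type failingUsers struct{}

func (failingUsers) Save(User) error               { return errStorageDown }
func (failingUsers) FindByID(string) (User, error) { return User{}, errStorageDown }

// Compile-time checks: the build breaks if either type stops satisfying
// the interface.
var (
	_ UserRepository = (*inMemoryUsers)(nil)
	_ UserRepository = failingUsers{}
)

// register is the application code under test; it sees only the interface.
func register(repo UserRepository, id, email string) (User, error) {
	u := User{ID: id, Email: email}
	if err := repo.Save(u); err != nil {
		return User{}, err
	}
	return repo.FindByID(id)
}

func main() {
	u, err := register(&inMemoryUsers{data: map[string]User{}}, "u1", "ada@example.com")
	fmt.Println(u.Email, err) // ada@example.com <nil>

	_, err = register(failingUsers{}, "u2", "bob@example.com")
	fmt.Println(errors.Is(err, errStorageDown)) // true
}
```

The blank-identifier assignments cost nothing at run time and turn a missing or misspelled method into a compile error instead of a surprise at the wiring site.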

⚠️ Don't make aggregates huge, and don't span them in one transaction

Two classic mistakes. First, a giant aggregate (a Customer that holds every order, address, and invoice) becomes slow to load and a contention hotspot — keep aggregates small and reference other aggregates by ID, not by embedding them. Second, modifying multiple aggregates in one transaction couples them and blocks scaling: change one aggregate per transaction and propagate to others eventually via domain events / an outbox. If you find yourself needing two aggregates strongly consistent together, your aggregate boundaries are probably wrong.
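Both rules can be sketched together. In the code below, Order refers to its customer by ID only, and shipping records an event that a later, separate step applies to the Customer aggregate. All names (CustomerID, OrderShipped, PullEvents, OnOrderShipped) are assumptions for this illustration, and the outbox is reduced to a plain slice rather than a table written in the same transaction as the order.

```go
package main

import "fmt"

// CustomerID and OrderID are hypothetical identifier types for this sketch.
type CustomerID string
type OrderID string

// OrderShipped is an assumed domain event.
type OrderShipped struct {
	Order    OrderID
	Customer CustomerID
}

// Order points at its customer by ID. It does not embed a Customer, so
// loading an order never drags the customer aggregate along.
type Order struct {
	id       OrderID
	customer CustomerID
	shipped  bool
	events   []OrderShipped
}

func NewOrder(id OrderID, customer CustomerID) *Order {
	return &Order{id: id, customer: customer}
}

// Ship changes this aggregate only and records an event for the others.
func (o *Order) Ship() {
	if o.shipped {
		return // already shipped: nothing new to record
	}
	o.shipped = true
	o.events = append(o.events, OrderShipped{Order: o.id, Customer: o.customer})
}

// PullEvents returns the recorded events and clears them, so each event
// is handed over once.
func (o *Order) PullEvents() []OrderShipped {
	evs := o.events
	o.events = nil
	return evs
}

// Customer is a separate aggregate, updated in its own transaction.
type Customer struct {
	id            CustomerID
	shippedOrders int
}

func (c *Customer) OnOrderShipped(OrderShipped) { c.shippedOrders++ }

func main() {
	customer := &Customer{id: "cust-7"}
	order := NewOrder("ord-42", customer.id)

	// Transaction 1: change and save the Order, storing its events
	// alongside it (the outbox idea).
	order.Ship()
	outbox := order.PullEvents()

	// Transaction 2, some time later: a consumer applies each event to
	// the Customer aggregate.
	for _, e := range outbox {
		customer.OnOrderShipped(e)
	}
	fmt.Println(len(outbox), customer.shippedOrders) // 1 1
}
```

Between the two steps the customer's counter is briefly stale. That window is the price of eventual consistency, and it is acceptable only when no invariant spans both aggregates.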

Check your understanding

1. What is an aggregate root?

An aggregate is a cluster of objects treated as one unit for changes; the aggregate root is its single entry point. External code holds a reference to the root only (never to inner entities directly), and all modifications go through the root so it can enforce the cluster's invariants. Example: an Order is the root; OrderLines live inside it, and you add a line via order.AddLine(...), never by mutating the slice from outside.

2. What is the core responsibility of a Repository?

A repository gives the domain a collection-like illusion — Save(order), FindByID(id) — for an aggregate, while hiding whether that's Postgres, an in-memory map, or a remote API. It's defined as an *interface in the domain layer* and implemented in the infrastructure layer, so the domain stays ignorant of the database. One repository per aggregate root is the rule of thumb.

3. What does 'persistence ignorance' mean for the domain model?

Persistence ignorance means the domain model is pure business logic — it doesn't import the database driver, doesn't shape itself around table columns, and doesn't know if it's persisted to SQL, a file, or memory. The repository interface (in the domain) and its concrete implementation (in infrastructure) form the seam. This keeps business rules testable without a database and lets you swap storage without touching the domain.

4. Why should an aggregate be the consistency boundary for a transaction?

An aggregate defines what must be consistent *together, right now* — e.g. an Order's total must always equal the sum of its lines. That invariant is enforced atomically within one transaction on one aggregate. Trying to keep multiple aggregates strongly consistent in one transaction couples them and kills scalability; instead, update one aggregate per transaction and propagate to others *eventually* (domain events, outbox). This rule is what makes aggregates the unit of scaling and sharding too.

5. Where is a repository INTERFACE defined vs its implementation?

The interface is defined in the domain layer; the implementation lives in infrastructure. This is dependency inversion: the domain declares the interface it needs (OrderRepository with Save/FindByID), and infrastructure provides a concrete type that satisfies it. The domain depends on the abstraction, not on Postgres. In Go this is especially clean because interfaces are satisfied implicitly: the Postgres type just has the right methods and never has to declare that it implements the domain interface. See hexagonal architecture for the full picture.
