Pixel art: a town square with a community bulletin board. One character pins 'Evening scene activated'. Others react independently — updating a phone, writing on a clipboard, adjusting a device. Nobody talks to each other. Tagline: Post Once. React Everywhere.

CS 3100: Program Design and Implementation II

Lecture 33: Event-Driven Architecture

©2026 Jonathan Bell, CC-BY-SA

Learning Objectives

After this lecture, you will be able to:

  1. Describe the Observer pattern and how it reduces coupling
  2. Define event-driven architecture and the role of event brokers
  3. Evaluate delivery guarantees and explain why idempotency matters
  4. Understand consistency models (strong vs eventual)
  5. Describe common broker patterns (work queue, pub-sub, fan-in, DLQ)

Async Solved Waiting — Inside One Process

Yesterday's key insight: fire all the requests, don't pay for a thread to wait for each response.

  • sendCommandAsync(light) — returns immediately, callback fires when the light ACKs
  • CompletableFuture.allOf(...) — 15 commands in flight, zero idle threads
  • Platform.runLater() — safely push results back to the GUI thread

This works beautifully when everything is in one JVM — one hub dispatching commands to devices.
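As a reminder of what that looks like in code, here is a minimal sketch. The body of `sendCommandAsync` is a hypothetical stand-in (the real hub would talk to a device), and the class name `AsyncRecap` and the `"ACK:"` strings are assumptions made only for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.IntStream;

class AsyncRecap {
    // Hypothetical stand-in for the hub's device call: completes with an ACK string.
    static CompletableFuture<String> sendCommandAsync(String light) {
        return CompletableFuture.supplyAsync(() -> "ACK:" + light);
    }

    // Fire every command first, then combine the results.
    static List<String> activateAll(List<String> lights) {
        List<CompletableFuture<String>> inFlight =
                lights.stream().map(AsyncRecap::sendCommandAsync).toList();
        CompletableFuture<Void> all =
                CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0]));
        // join() blocks only this demo caller; a GUI would attach a callback instead.
        all.join();
        return inFlight.stream().map(CompletableFuture::join).toList();
    }

    public static void main(String[] args) {
        List<String> lights =
                IntStream.rangeClosed(1, 15).mapToObj(i -> "light-" + i).toList();
        System.out.println(activateAll(lights).size() + " lights acknowledged");
    }
}
```

Everything here lives in one JVM and shares one thread pool, which is exactly the assumption the next slide removes.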

But SceneItAll isn't one JVM. The hub, the mobile app, the cloud service, and device firmware run on different machines, different networks, different timelines. They can't share a thread pool. They can't call each other's methods. When the hub activates a scene, how does the mobile app find out?

You Already Know This Pattern — Now We Name It

The Observer pattern: a subject notifies its observers when state changes. The subject doesn't know who they are.

The pattern in 8 lines:

public class Subject<T> {
    private final List<Consumer<T>> observers = new ArrayList<>();

    public void addObserver(Consumer<T> o) {
        observers.add(o);
    }

    public void setValue(T val) {
        for (var o : observers)
            o.accept(val); // notify all
    }
}

You've used this since L29:

Where | Subject | Observer
L29 | Button | onAction handler
L30 | IntegerProperty | Bound Slider
L30 | ObservableList | Bound ListView
L32 | CompletableFuture | thenAccept callback

In every case, the subject doesn't depend on any specific observer. Adding or removing an observer = zero changes to the subject. That's data coupling at most (L7).
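A small usage sketch makes the "zero changes to the subject" claim concrete. It repeats the Subject class so it compiles on its own; `ObserverDemo`, the log list, and the message strings are illustrative assumptions, not course code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ObserverDemo {
    // Same shape as the Subject shown earlier, repeated so this sketch is self-contained.
    static class Subject<T> {
        private final List<Consumer<T>> observers = new ArrayList<>();
        void addObserver(Consumer<T> o) { observers.add(o); }
        void setValue(T val) { for (var o : observers) o.accept(val); }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        Subject<Integer> brightness = new Subject<>();
        brightness.addObserver(v -> log.add("slider shows " + v));
        brightness.setValue(30);
        // A second observer is registered later; Subject's code is untouched.
        brightness.addObserver(v -> log.add("label shows " + v + "%"));
        brightness.setValue(75);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first `setValue` reaches one observer and the second reaches two, yet `Subject` never names either of them.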

Observer works beautifully inside one program. But what happens when the components are on different machines?

SceneItAll Is Four Programs on Four Machines

Four SceneItAll components — cloud service, mobile app, hub, and device firmware — shown as separate boxes with question-mark arrows between them, highlighting different hardware, networks, and timelines.

Direct Calls Create a Chain of Fragility

Hub reboots during a firmware update → Cloud blocks → App hangs. One slow or failed component makes everything slow or failed.

This is exactly what the Fallacies of Distributed Computing (L20) warned us about.

Events: Publish Facts, Don't Send Commands

// An event is a fact in the past tense
public record BrightnessChanged(
    String eventId,
    Instant timestamp,
    String source, // "hub-01"
    String deviceId,
    int previousBrightness,
    int newBrightness
) {}

BrightnessChanged, not ChangeBrightness. Events are immutable facts about the past — pinning a note to the bulletin board. The receiver decides how to react.

SceneItAll events:

Event | What happened
DeviceDiscovered | New device on Zigbee network
SceneActivated | User activated a scene
DeviceOffline | Device stopped responding
FirmwareUpdateAvailable | New firmware ready

Each event is immutable and timestamped. No one can change the fact that the brightness changed at 10:32:05 AM.

Without EDA: Adding a Consumer Means Changing the Hub

The hub is coupled to every downstream system. Adding a new consumer means modifying the hub's code.

With EDA: Zero Changes to the Hub When You Add a Consumer

Observer (L29) → Async (L32) → MVVM (L30) → EDA (today) = the same decoupling pattern at increasing scale.

L7: high coupling = a change in one module forces changes in others. Each step reduces what the caller needs to know — from thread lifecycles, to execution timing, to which consumers exist. EDA is information hiding applied to an entire system.
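The difference can be sketched in a few lines. The `EventBus` below is a hypothetical in-memory stand-in for a broker connection, and `Hub`, `SceneActivated(String sceneName)`, and the reaction strings are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class HubDecoupling {
    record SceneActivated(String sceneName) {}

    // Hypothetical in-memory stand-in for a broker connection.
    static class EventBus {
        private final List<Consumer<SceneActivated>> subscribers = new ArrayList<>();
        void subscribe(Consumer<SceneActivated> s) { subscribers.add(s); }
        void publish(SceneActivated e) { subscribers.forEach(s -> s.accept(e)); }
    }

    // The hub knows only the bus, never the app, analytics, or automation.
    static class Hub {
        private final EventBus bus;
        Hub(EventBus bus) { this.bus = bus; }
        void activateScene(String name) { bus.publish(new SceneActivated(name)); }
    }

    static List<String> run() {
        List<String> reactions = new ArrayList<>();
        EventBus bus = new EventBus();
        Hub hub = new Hub(bus);
        bus.subscribe(e -> reactions.add("app: show " + e.sceneName()));
        bus.subscribe(e -> reactions.add("analytics: log " + e.sceneName()));
        // A new consumer is added here; the Hub class is not edited.
        bus.subscribe(e -> reactions.add("automation: check rules for " + e.sceneName()));
        hub.activateScene("Evening");
        return reactions;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In the "without EDA" version, `activateScene` would instead call each downstream system by name, so the third consumer would have required a change inside `Hub`.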

Brokers Store Events So Offline Consumers Don't Miss Them

Event broker receives events from producers, stores them durably, and delivers to online consumers immediately while queuing events for offline consumers until they reconnect.

Events flow through a broker — it stores events durably and delivers them to subscribers. Offline consumer? Events queue until it reconnects. The broker is the bulletin board from our cover image — one place where facts are posted, many people who read and react independently.

Recall L21: message queues decouple submission from processing (Bottlenose grading, Pawtograder repo creation). A broker generalizes that — many producers, many consumers, managed subscriptions. Kafka handles millions of events/sec at Netflix; Zigbee mesh routes events between your hub and a light bulb. L20's resilience patterns — retry with backoff, circuit breakers, rate limiting — compose naturally with brokers and idempotent consumers.
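The "queue until it reconnects" behavior can be sketched with one pending queue per subscriber. This is an in-memory toy under stated assumptions: `QueueingBroker`, `subscribe`, `publish`, and `drain` are hypothetical names, events are plain strings, and a real broker would persist the queues to disk.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class QueueingBroker {
    // One pending-event queue per subscriber name (in memory only in this sketch).
    private final Map<String, Queue<String>> pending = new LinkedHashMap<>();

    void subscribe(String subscriber) {
        pending.putIfAbsent(subscriber, new ArrayDeque<>());
    }

    // Publishing never checks who is online; it appends to every subscription.
    void publish(String event) {
        pending.values().forEach(q -> q.add(event));
    }

    // Called while a subscriber is connected: hands over its backlog in order.
    List<String> drain(String subscriber) {
        Queue<String> q = pending.get(subscriber);
        List<String> delivered = new ArrayList<>(q);
        q.clear();
        return delivered;
    }

    public static void main(String[] args) {
        QueueingBroker broker = new QueueingBroker();
        broker.subscribe("mobile-app");
        broker.subscribe("wall-panel");          // offline for now
        broker.publish("SceneActivated:Evening");
        System.out.println("app gets   " + broker.drain("mobile-app"));
        broker.publish("BrightnessChanged:30");
        System.out.println("app gets   " + broker.drain("mobile-app"));
        System.out.println("panel gets " + broker.drain("wall-panel")); // full backlog
    }
}
```

The producer's `publish` call is identical whether the wall panel is online or not; only the timing of `drain` differs per consumer.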

Exactly-Once Is Nearly Impossible — At-Least-Once Is Practical

Guarantee | How it works | Risk | Use case
At-most-once | Fire and forget | Lost events | Analytics pings
At-least-once | Retry until ACK | Duplicate events | Most operations
Exactly-once | Process + ACK atomic | Extremely hard to achieve | Everyone wants this

The practical answer: at-least-once delivery + idempotent consumers.

Accept that duplicates will happen. Design your handlers so processing the same event twice produces the same result as processing it once.
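To see where duplicates come from, consider a consumer that does the work but whose ACK is lost on the way back. The sketch below simulates that deterministically; `FlakyConsumer`, `sendWithRetry`, and the bounded attempt count are assumptions for illustration, not a real broker API.

```java
import java.util.ArrayList;
import java.util.List;

class AtLeastOnceDemo {
    // Hypothetical consumer: always processes the event, but its ACK
    // is lost the first `acksToLose` times.
    static class FlakyConsumer {
        final List<String> processed = new ArrayList<>();
        private int acksToLose;

        FlakyConsumer(int acksToLose) { this.acksToLose = acksToLose; }

        boolean deliver(String eventId) {
            processed.add(eventId);      // the work happens every time
            if (acksToLose > 0) {
                acksToLose--;
                return false;            // ACK lost in transit
            }
            return true;                 // ACK received by the sender
        }
    }

    // At-least-once: resend until an ACK arrives (bounded so the sketch terminates).
    static int sendWithRetry(FlakyConsumer consumer, String eventId, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (consumer.deliver(eventId)) return attempt;
        }
        throw new IllegalStateException("no ACK after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        FlakyConsumer consumer = new FlakyConsumer(1);
        int attempts = sendWithRetry(consumer, "evt-1", 5);
        System.out.println("attempts: " + attempts + ", processed: " + consumer.processed);
    }
}
```

The sender cannot tell "never processed" from "processed, ACK lost", so retrying is the only safe choice, and the consumer ends up handling `evt-1` twice.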

Design Handlers So Processing the Same Event Twice Is Safe

An operation is idempotent if applying it N times has the same effect as applying it once:

Operation | Idempotent? | Why?
light.setBrightness(30) | Yes | Setting to 30 twice = setting to 30 once
light.togglePower() | No | Toggling twice reverts to original state
counter.increment() | No | Incrementing twice adds 2 instead of 1
database.upsert(id, record) | Yes | Upserting same record twice = one record

Design rule: prefer "set to X" over "change by Y." If someone pins the same note to the bulletin board twice, readers who check the note's ID can ignore the duplicate.

// Idempotent handler — safe to process the same event twice
void handle(BrightnessChanged event) {
    light.setBrightness(event.newBrightness()); // set to X, not change by Y
}
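When the operation itself cannot be rewritten as "set to X" (a counter, for example), checking the event ID is the fallback. A minimal sketch, assuming events carry a unique string ID; `DedupingCounter` is a hypothetical name, and a production version would need to bound or persist the set of seen IDs.

```java
import java.util.HashSet;
import java.util.Set;

class DedupingCounter {
    private final Set<String> seenEventIds = new HashSet<>();
    private int activations = 0;

    // Incrementing is not idempotent, so duplicates are filtered by event ID first.
    // Set.add returns false when the ID was already present.
    void handle(String eventId) {
        if (!seenEventIds.add(eventId)) {
            return; // duplicate delivery: ignore
        }
        activations++;
    }

    int activations() {
        return activations;
    }

    public static void main(String[] args) {
        DedupingCounter counter = new DedupingCounter();
        counter.handle("evt-1");
        counter.handle("evt-1"); // redelivered by an at-least-once broker
        counter.handle("evt-2");
        System.out.println("activations counted: " + counter.activations());
    }
}
```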

Three Screens Show Three Different Answers — Is That a Bug?

You set 30%. Your phone sees it instantly. Roommate's phone sees 100% for 2 seconds. Wall panel sees 100% for 5 seconds. Is the system broken?

Sequential Consistency: One Truth, Everywhere, Always

Sequential consistency (the "strong" model in this lecture) means all observers see the same operations in the same order, as if there were one CPU processing everything.

Mental model: a single-threaded program. When you write brightness = 30, every subsequent read returns 30. There's only one copy of the truth.

Like a group text where nobody can send a new message until everyone has read the last one. Simple. Safe. But how do you achieve this across multiple machines?

Sequential Consistency Is Expensive to Enforce

To make every observer see the same state at the same time, you need coordination:

  • Wait for the slowest. You set brightness to 30%. The hub can't confirm until your phone, your roommate's phone, AND the wall panel all acknowledge. The wall panel is on a slow Zigbee link — everyone waits.
  • One failure blocks everyone. Your roommate's phone is in airplane mode. Now nobody can change the brightness until their phone reconnects. The whole system is held hostage by the least reliable participant.
  • It doesn't scale. 3 consumers = manageable. 50 consumers = every operation waits for 50 acknowledgments. 1000 consumers = unusable.

Sequential consistency is the right choice for safety-critical operations (door locks, alarms) where the cost of disagreement is someone getting hurt. For everything else, there's a cheaper model.

Eventual Consistency: Everyone Agrees — Eventually

Eventual consistency means: if you stop making changes and wait long enough, all observers will converge to the same state. But at any given moment, they may disagree.

Mental model: a durable message queue. Every event gets delivered to every subscriber — eventually. Some subscribers are faster than others. But no event is lost, and given time, everyone catches up.

  • Fast: The operation completes as soon as the hub applies it — no waiting for acknowledgments
  • Resilient: If a consumer is offline, events queue in the broker until it reconnects
  • Scalable: Adding consumers doesn't slow down the producer

Like posting on social media — you see it, your friend sees it 10 seconds later, everyone converges. The bulletin board model: people wander by and read notes at their own pace.
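The three-screens scenario can be simulated in a few lines. This is a single-threaded sketch under stated assumptions: `Display`, its `inbox` queue, and the starting value of 100 are illustrative, and "network lag" is modeled simply by when each display drains its inbox.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class EventualConsistencyDemo {
    // Each display keeps its own copy of the brightness plus updates not yet applied.
    static class Display {
        int brightness = 100;
        final Queue<Integer> inbox = new ArrayDeque<>();

        void applyAll() {
            Integer next;
            while ((next = inbox.poll()) != null) {
                brightness = next;
            }
        }
    }

    static List<String> run() {
        Display phone = new Display();
        Display roommate = new Display();
        Display wallPanel = new Display();
        List<Display> all = List.of(phone, roommate, wallPanel);
        List<String> snapshots = new ArrayList<>();

        all.forEach(d -> d.inbox.add(30)); // the hub publishes "brightness is now 30"
        phone.applyAll();                  // the fast consumer catches up at once
        snapshots.add(phone.brightness + " " + roommate.brightness + " " + wallPanel.brightness);

        all.forEach(Display::applyAll);    // later, every inbox has been drained
        snapshots.add(phone.brightness + " " + roommate.brightness + " " + wallPanel.brightness);
        return snapshots;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first snapshot shows the displays disagreeing; the second shows them converged once no new changes arrive and every queue is empty.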

Use Strong When Someone Could Get Hurt; Eventual for Everything Else

The question: what is the cost of a user seeing stale data for N seconds?

Scenario | Cost of staleness | Model
Door lock state | Someone enters who shouldn't | Sequential
Security alarm | Alarm doesn't trigger | Sequential
Brightness display | Roommate sees old value for 5 sec | Eventual
Scene history log | Last scene shown is 10 sec behind | Eventual
Energy dashboard | Power numbers lag by 30 sec | Eventual

L21 callback: CDN caches and browser caches are eventually consistent — they have been all along. Eventual consistency is the default model of the internet. Sequential consistency is the expensive special case.

You Use This Architecture Every Day: Pawtograder

Pawtograder uses both consistency models — chosen per use case:

Sequential (via the database)

Grades, submissions, enrollment records. When a TA enters a grade, the database guarantees every subsequent read sees that grade. One source of truth.

Eventual (via event queues)

Outbound: GitHub repo creation, Discord notifications, autograder job dispatch. Inbound: GitHub webhooks (pushes, issues, PR events), Discord interactions (slash commands, reactions). Queues define the boundary in both directions.

This is the architecture from L18-L21 in action: the database is the sequentially consistent core (hexagonal architecture, L16). Event queues are the ports to external systems — outbound to GitHub, Discord, and the autograder; inbound from GitHub webhooks and Discord interactions. Each queue defines an in/out interface (L7: low coupling).

Work Queues: Each Event Goes to Exactly One Worker

Work queue: producer sends numbered events into a FIFO queue. Events are pulled from the front and distributed to three workers — each event goes to exactly one worker.

Each event goes to exactly one consumer. The broker distributes events among workers.

SceneItAll: 50 device status updates/sec → pool of 5 workers pull from a shared queue. Use for parallelizing work.
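A minimal in-process sketch of the same shape, using a `BlockingQueue` as the shared queue and a fixed thread pool as the workers. `WorkQueueDemo` and `process` are hypothetical names, events are plain integers, and a real broker would sit between separate processes rather than threads.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class WorkQueueDemo {
    // Returns event number -> how many times it was handled (expected: always 1).
    static Map<Integer, Integer> process(int eventCount, int workerCount) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 1; i <= eventCount; i++) queue.add(i);

        Map<Integer, Integer> timesHandled = new ConcurrentHashMap<>();
        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        for (int w = 0; w < workerCount; w++) {
            workers.submit(() -> {
                Integer event;
                // poll() removes the head atomically, so no two workers get the same event.
                while ((event = queue.poll()) != null) {
                    timesHandled.merge(event, 1, Integer::sum);
                }
            });
        }
        workers.shutdown();
        try {
            if (!workers.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return timesHandled;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> result = process(50, 5);
        System.out.println("distinct events handled: " + result.size());
    }
}
```

Which worker handles which event varies from run to run; what stays fixed is that each event is taken off the queue by exactly one of them.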

Pub-Sub: Every Subscriber Gets Every Event

Pub-sub: producer sends one event to a topic. The topic fans out a copy to every subscriber — all three receive the same event independently.

Each event goes to every subscriber. The broker copies the event to each consumer's subscription.

SceneItAll: SceneActivated → mobile app updates UI, analytics logs it, automation checks rules. Use for broadcasting events.

Compare to work queue: there, each event goes to ONE consumer (like a ticket counter). Here, each event goes to EVERY subscriber (like a radio broadcast).
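The contrast shows up directly in delivery counts. In this sketch each inbox is just a list; the round-robin choice in `dispatchToWorkers` is an assumption for illustration (real brokers pick workers in various ways), and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class FanOutVsWorkQueue {
    static List<List<String>> inboxes(int n) {
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < n; i++) result.add(new ArrayList<>());
        return result;
    }

    // Pub-sub: every subscriber receives its own copy of each event.
    static void publishToTopic(List<List<String>> subscribers, List<String> events) {
        for (String event : events) {
            for (List<String> subscriber : subscribers) subscriber.add(event);
        }
    }

    // Work queue: each event goes to one worker (round-robin in this sketch).
    static void dispatchToWorkers(List<List<String>> workers, List<String> events) {
        for (int i = 0; i < events.size(); i++) {
            workers.get(i % workers.size()).add(events.get(i));
        }
    }

    static int totalDeliveries(List<List<String>> inboxes) {
        return inboxes.stream().mapToInt(List::size).sum();
    }

    public static void main(String[] args) {
        List<String> events = List.of("e1", "e2", "e3");
        List<List<String>> subscribers = inboxes(3);
        List<List<String>> workers = inboxes(3);
        publishToTopic(subscribers, events);
        dispatchToWorkers(workers, events);
        System.out.println("pub-sub deliveries:    " + totalDeliveries(subscribers));
        System.out.println("work-queue deliveries: " + totalDeliveries(workers));
    }
}
```

With 3 events and 3 receivers, pub-sub makes 9 deliveries (3 × 3) while the work queue makes 3 (one per event).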

Fan-In: Many Producers, One Consumer

Many producers, one consumer.

SceneItAll: cloud service aggregates events from all sources into a single activity log.

Use for centralized logging, monitoring, analytics.
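A minimal sketch of fan-in inside one process: several producer threads write to one shared queue, and a single consumer drains it into an activity log. `FanInDemo`, `collect`, and the `source#n` event strings are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class FanInDemo {
    static List<String> collect(List<String> sources, int eventsPerSource) {
        BlockingQueue<String> activity = new LinkedBlockingQueue<>();
        List<Thread> producers = new ArrayList<>();
        for (String source : sources) {
            Thread producer = new Thread(() -> {
                for (int i = 1; i <= eventsPerSource; i++) {
                    activity.add(source + "#" + i);
                }
            });
            producers.add(producer);
            producer.start();
        }
        for (Thread producer : producers) {
            try {
                producer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // The single consumer: drains everything into one activity log.
        List<String> log = new ArrayList<>();
        activity.drainTo(log);
        return log;
    }

    public static void main(String[] args) {
        List<String> log = collect(List.of("hub", "mobile-app", "firmware"), 2);
        System.out.println(log.size() + " events in one log: " + log);
    }
}
```

Events from different sources interleave unpredictably, but each source's own events keep their order because a single thread adds them to a FIFO queue.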

Dead Letter Queues Catch What Your Code Can't Handle Yet

Failed events go to a holding queue — not silently dropped.

A firmware update event for a discontinued device fails 5 times → lands in the DLQ. An engineer adds support and replays the event.

Use for catching failures that need human review.
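The retry-then-park flow can be sketched as follows. `DeadLetterDemo`, `deliver`, and `replay` are hypothetical names, events are plain strings, and the sketch retries immediately where a real consumer would log the failure and back off.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class DeadLetterDemo {
    final List<String> deadLetters = new ArrayList<>();

    // Try the handler up to maxAttempts times; park the event if every attempt throws.
    boolean deliver(String event, Consumer<String> handler, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(event);
                return true;
            } catch (RuntimeException e) {
                // a real consumer would log the failure and back off before retrying
            }
        }
        deadLetters.add(event);
        return false;
    }

    // After a fix ships, replay the parked events through the new handler.
    void replay(Consumer<String> fixedHandler, int maxAttempts) {
        List<String> parked = new ArrayList<>(deadLetters);
        deadLetters.clear();
        for (String event : parked) {
            deliver(event, fixedHandler, maxAttempts);
        }
    }

    public static void main(String[] args) {
        DeadLetterDemo dlq = new DeadLetterDemo();
        dlq.deliver("FirmwareUpdateAvailable:discontinued-bulb",
                event -> { throw new IllegalStateException("unsupported device"); }, 5);
        System.out.println("parked: " + dlq.deadLetters);
        dlq.replay(event -> System.out.println("handled " + event), 5);
        System.out.println("parked after replay: " + dlq.deadLetters);
    }
}
```

An event that still fails during replay lands back in the DLQ, so nothing is silently dropped in either direction.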

Pattern | When to use | SceneItAll example
Work queue | Parallelize processing | Status updates across worker pool
Pub-sub | Multiple services react to same event | Scene activation notifies app, analytics, automation
Fan-in | Aggregate from many sources | Cloud collects activity from all components
DLQ | Don't lose unprocessable events | Failed firmware updates queued for review

Comprehension Check

Open Poll Everywhere and answer the questions.

Key Takeaways

  1. Observer reduces coupling in-process; EDA reduces it across networks — same principle at different scales

  2. Events are facts (past tense, immutable) — publishers don't know or care who's listening

  3. Brokers store and deliver events durably — offline consumers don't miss events

  4. At-least-once + idempotent consumers = practical exactly-once. Prefer "set to X" over "change by Y"

  5. Eventual consistency is the default — strong consistency is the expensive special case, reserved for safety-critical operations

  6. Broker patterns: work queues parallelize, pub-sub broadcasts, fan-in aggregates, DLQs catch failures

Same Challenge, Increasing Scale: Threads → Async → Events

L31: Threads (in-process) → L32: Async (I/O within process) → L33: EDA (across networks)

Lecture | L31: Threads | L32: Async | L33: EDA
Problem | Shared mutable state | Threads waste resources waiting | Services coupled synchronously
Solution | Locks, concurrent collections | CompletableFuture, allOf | Events, brokers, eventual consistency
Bug category | Race conditions, deadlock | Ordering bugs, swallowed errors | Stale state, cascading failures

Same challenge — managing concurrent operations safely — at increasing scale. This is how every large system works: Netflix, Uber, Slack — and yes, GitHub.

EDA vs Monolith: Quality Attributes Revisited (L18)

Quality Attribute | Monolith | Event-Driven Architecture
Scalability | Scale the whole app, even if only one part is overloaded | Scale individual consumers independently; add workers to the bottleneck
Deployability | Deploy everything at once; one bad change takes down the whole system | Deploy services independently; the broker isolates them
Testability | Must test the whole system together; hard to isolate | Each consumer is independently testable: feed it events, assert on results
Modifiability | Adding a feature may touch many modules | Add a new consumer with zero changes to the publisher (L7)
Availability | One component crash = entire system down | One consumer crash = that consumer's events queue up in the broker; everything else continues
Debuggability | Stack traces and logs in one place; straightforward | Events scattered across services; need correlation IDs and distributed tracing

EDA wins on most attributes. The trade-off: debuggability. "The lights changed — which service did that?" requires observability tooling that monoliths don't need.

EDA Is the Glue Between Unreliable Services

In a microservice or serverless architecture (L21), each service is small and independently deployable. But services fail, restart, and scale independently. Events + brokers provide the reliable interface between unreliable services:

  • Deployability: Deploy a new version of the analytics service. The broker queues events while it restarts. Zero downtime for the hub or the mobile app.
  • Fault isolation: The Discord bot crashes. Regrade notifications queue in the broker. When the bot restarts, it processes the backlog. No regrade requests are lost.
  • Independent scaling: Autograder is overloaded at deadline time. Spin up more workers pulling from the grading queue. The submission service doesn't change.
  • Debugging: Every event has an ID, a timestamp, and a source. The broker is the audit trail — you can replay events, inspect the DLQ, trace the full lifecycle of a submission.

Events are the contract between services. The broker is the postal service. Each service is free to crash, restart, and evolve — as long as it speaks the same event language.

Looking Ahead

GA1 — Due April 9

Your CookYourBooks app uses BackgroundTaskRunner — a utility wrapping javafx.concurrent.Task, daemon threads, and FX-thread callbacks into run(callable, onSuccess, onFailure). You don't write threading boilerplate, but you must understand what it does (your TA will ask).

The error handling patterns from today (idempotency, retry, graceful failure) apply to every async operation in your app.

Want to go deeper?

  • CS 3700: Networks and Distributed Systems — protocols, distributed coordination
  • CS 4730: Distributed Systems — formal consistency models, CAP theorem, Paxos/Raft consensus
  • CS 6620: Fundamentals of Cloud Computing — cloud-native event systems at scale (Kafka, SQS, EventBridge)