
CS 3100: Program Design and Implementation II
Lecture 37: Performance, Safety, and Sustainability
©2026 Jonathan Bell & Ellen Spertus, CC-BY-SA
Looking Ahead
- Today
- Lecture: Highlights of skipped lectures: Performance, Safety, and Sustainability
- Practice final distributed
- TRACE
- Tuesday
- Lab 14: The Future of Programming
- Wednesday
- Lecture: MapReduce and the Future of Programming
- Thursday
- Lecture: Review
- Course Survey
- Due date: Feature Buffet
- Next Monday, 4/20
- Due date: Final Project Report
- Next Tuesday, 4/21
- Final exam, 10:30-12:30
L34: Performance

Learning Objectives
After this lecture, you will be able to:
- Reason about algorithmic growth using Big-O notation
- Identify* performance bottlenecks: measure, don't guess
- Analyze the performance impact of architectural decisions
- Apply common patterns to improve performance
*You will know that tools exist to find performance bottlenecks.
Big-O Describes How Code Scales, Not How Fast It Is
![Colored graph showing O(log n) [dark green, excellent];
O(1) [light green, good];
O(n) [yellow, fair];
O(n log n) [orange, bad];
O(n^2), O(2^n), and O(n!) [red, horrible]](/CS3100-Spring-2026/img/lectures/web/l37-big-o-complexity.png)
CC-BY-2.0 Dunk
Poll: Rank operations by increasing time complexity
A. Comparing every pair of items in a list
B. Performing binary search
C. Retrieving a value from a key in a HashMap
D. Searching for an item by iterating over a list
E. Sorting a list

Text espertus to 22333 if the URL isn't working for you.
Assume all data structures have n items.
Big-O in SceneItAll
| Notation | Name | SceneItAll example |
|---|---|---|
| O(1) | Constant | HashMap lookup by device id |
| O(log n) | Logarithmic | Binary search sorted devices |
| O(n) | Linear | Iterate all devices by name |
| O(n log n) | Linearithmic | Sort 1,000 devices |
| O(n²) | Quadratic | Compare every device pair |
Linear search can be faster than binary search — when n is small.
Big-O is about what happens for large inputs.
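To make the growth rates in the table concrete, here is a minimal Java sketch that counts worst-case steps instead of measuring time. The class and method names (`GrowthDemo`, `linearSteps`, `binarySteps`, `pairSteps`) are illustrative assumptions and are not part of SceneItAll.

```java
// Hypothetical helper: counts worst-case steps, not wall-clock time.
final class GrowthDemo {
    // O(n): a linear search may inspect every item.
    static long linearSteps(long n) {
        return n;
    }

    // O(log n): binary search halves the remaining range on each step.
    static long binarySteps(long n) {
        long steps = 0;
        while (n > 0) {
            n /= 2;
            steps++;
        }
        return steps;
    }

    // O(n^2): comparing every unordered pair of items.
    static long pairSteps(long n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        for (long n : new long[] {8, 1_000, 1_000_000}) {
            System.out.printf("n=%d linear=%d binary=%d pairs=%d%n",
                    n, linearSteps(n), binarySteps(n), pairSteps(n));
        }
    }
}
```

At n = 8 the gap is 8 steps versus 4, so constant factors such as a simple loop over adjacent memory can decide which search is faster in practice. At n = 1,000,000 the gap is 1,000,000 steps versus 20, and the growth rate dominates.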
Flame Graphs Show Where Time Goes

Tools (know they exist, not details): JFR (Java Flight Recorder, built into JDK), flame graphs, heap dumps
Performance Has Several Dimensions That Trade Off
| Metric | What it measures | Example |
|---|---|---|
| Latency | Time for a single operation | "How long until the user sees their grade?" |
| Throughput | Operations per unit time | "How many submissions per minute?" |
| Memory | Heap/stack consumption | "How much RAM for 1,000 devices?" |
| CPU | Processor time consumed | "CPU-bound or I/O-bound?" (recall L32) |
Optimizing one can worsen another: caching reduces latency but increases memory.
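The caching trade-off can be sketched in a few lines of Java. This is a minimal illustration under assumptions: `CachingLookup` and its methods are hypothetical names, and the "slow" lookup is whatever function the caller supplies. Every cached entry avoids a repeated slow call (lower latency) but stays in the map (higher memory).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache wrapper: trades memory (the map) for latency (fewer slow calls).
final class CachingLookup<K, V> {
    private final Function<K, V> slowLookup;
    private final Map<K, V> cache = new HashMap<>();
    private int slowCalls = 0;

    CachingLookup(Function<K, V> slowLookup) {
        this.slowLookup = slowLookup;
    }

    // Runs the slow lookup only the first time a key is requested.
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            slowCalls++;
            return slowLookup.apply(k);
        });
    }

    // How many times the slow path actually ran.
    int slowCalls() {
        return slowCalls;
    }

    // How many entries the cache is holding in memory.
    int entriesHeld() {
        return cache.size();
    }

    public static void main(String[] args) {
        CachingLookup<String, Integer> lookup = new CachingLookup<>(String::length);
        lookup.get("lamp-1");
        lookup.get("lamp-1");   // served from the cache
        lookup.get("lamp-22");
        System.out.println("slow calls: " + lookup.slowCalls()
                + ", entries held: " + lookup.entriesHeld());
    }
}
```

This sketch never evicts entries, so memory grows with the number of distinct keys. A real cache would usually bound its size, which is itself a latency versus memory decision.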
Bonus Slide
![[A flow chart is shown with three boxes connected with two arrows. The first box is rectangular:]
Are you prematurely optimizing or just taking time to do things right?
[From the first box there is a short arrow straight down to a diamond shaped box:]
Are you consulting a flowchart to answer this question?
[A labeled arrow continues down.]
Yes
[The arrow connects to the final rectangular box.]
You are prematurely optimizing](/CS3100-Spring-2026/img/lectures/web/l37-xkcd-premature-optimization.png)
xkcd #1691 "Optimization" by Randall Munroe, CC BY-NC 2.5
!["I spend a lot of time on this task. I should write a program automating it!"
[Two graphs are shown, plotting workload against time.]
Theory:
[The line for "work on original task" is steady but then drops down to a much lower level.]
[The line for the automating job increases heavily while "writing code" and then drops down when "automation takes over".]
[Both lines end up with a big amount of "free time".]
Reality:
[The line for "work on original task" is steady with no drop to a lower level.]
[The line for the automating job increases heavily while "writing code", then it increases again while "debugging", it drops down slightly while "rethinking", and grows up again with an infinite end while the task is still an "ongoing development".]
[The line for "work on original task" ends up with "no time for original task anymore".]](/CS3100-Spring-2026/img/lectures/web/l37-xkcd-automation.png)
xkcd #1319 "Automation" by Randall Munroe, CC BY-NC 2.5

Learning Objectives
After this lecture, you will be able to:
- Distinguish safety from reliability
- Apply the Swiss cheese model to analyze layered defenses
- Analyze blast radius and fail-safe design
- Recognize prior course concepts as safety mechanisms
- Evaluate safety trade-offs against cost, complexity, and performance, and explain why professional judgment is currently the primary safety mechanism in most software
Reliable, Available, and Safe Are Three Different Things
Reliable
Does what it's supposed to do, consistently.
Measure: error rates, MTBF
SceneItAll: activates scenes correctly 99.99% of the time
Available
Accessible when users need it.
Measure: "nines" (99.9% ≈ 8.76 hrs downtime/yr)
GitHub: multiple major outages Feb–Mar 2026 (auth DB overload, Actions failover failures)
Safe
Avoids harm, even when it fails.
Measure: incident severity — did anyone get hurt?
Key: a property of the failure mode, not the happy path
- Reliable and unsafe: A medical device delivers correct doses 99.99% of the time — but its failure mode is lethal. Reliable? Extremely. Safe? Not if one failure kills a patient.
- Safe but unreliable: Hub crashes frequently but preserves device state.
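The "nines" arithmetic for availability is easy to check. The sketch below assumes a 365-day year (8,760 hours); `AvailabilityDemo` and `downtimeHoursPerYear` are illustrative names.

```java
// Hypothetical helper for the availability "nines" arithmetic.
final class AvailabilityDemo {
    // Assumption: a 365-day year, ignoring leap years.
    static final double HOURS_PER_YEAR = 365 * 24;

    // Converts an availability percentage into allowed downtime per year.
    static double downtimeHoursPerYear(double availabilityPercent) {
        return (1.0 - availabilityPercent / 100.0) * HOURS_PER_YEAR;
    }

    public static void main(String[] args) {
        for (double percent : new double[] {99.0, 99.9, 99.99}) {
            System.out.printf("%.2f%% available -> %.2f hours down per year%n",
                    percent, downtimeHoursPerYear(percent));
        }
    }
}
```

Under that assumption, 99.9% works out to roughly 8.76 hours of downtime per year, and each additional nine divides the allowance by ten. Note that this number says nothing about safety: it measures how often the system is reachable, not what happens when it is not.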
Poll: How does the cost of fixing safety issues grow?
As a product moves from design through implementation, launch, and widespread use, how does the cost of addressing safety issues grow?
A. It stays about the same
B. It increases linearly
C. It increases exponentially

A firmware update on day one is moderate effort. After a bricking incident, it's a migration plus legal costs, customer replacements, and reputational damage.
The Swiss Cheese Model: Harm Requires Aligned Holes

A single layer with holes is not necessarily dangerous on its own. The problem is when someone removes a layer entirely, or when holes grow larger without anyone noticing.
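A rough way to see why removing a layer matters is to multiply the chance of slipping through each layer. The sketch below assumes the layers fail independently, which real systems often violate, and the 10% figures are made-up illustrations, not data. `SwissCheeseDemo` and `passThroughProbability` are hypothetical names.

```java
// Hypothetical model of layered defenses.
final class SwissCheeseDemo {
    // Assumption: layers fail independently, so the chance that a hazard
    // passes every layer is the product of the per-layer hole probabilities.
    static double passThroughProbability(double... holeProbabilities) {
        double probability = 1.0;
        for (double hole : holeProbabilities) {
            probability *= hole;
        }
        return probability;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: each layer misses 10% of hazards.
        double threeLayers = passThroughProbability(0.1, 0.1, 0.1);
        double twoLayers = passThroughProbability(0.1, 0.1);
        System.out.println("three layers: " + threeLayers);
        System.out.println("two layers:   " + twoLayers);
    }
}
```

With these illustrative numbers, dropping one of three layers makes harm about ten times more likely, even though no remaining layer got worse. Growing a hole (raising one probability) has the same multiplicative effect.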
Poll: What would it take to convince you that software is bug-free?
A. 100% coverage (branch and line) by tests
B. expert human code review
C. AI review
D. all of the above
E. none of the above

Three Disasters, One Pattern: Removing Layers Is Removing Safety
| Aspect | Therac-25 | Boeing 737 MAX | CrowdStrike Falcon |
|---|---|---|---|
| What was replaced? | Hardware interlocks | Airframe redesign + pilot training | Manual security review |
| Replaced with? | Software safety checks | MCAS software automation | Automated content update pipeline |
| Why? | Cheaper, lighter | Cheaper, faster certification | Speed — security threats need rapid response |
| Layer removed? | Hardware interlock layer | Sensor redundancy + training | Staged rollout for content updates |
| Critical flaw? | Race conditions | Single point of failure | No rollback path when kernel crashes |
| Consequences? | People received lethal radiation | Planes crashed | 8.5M machines soft-bricked |
Three questions to ask when replacing hardware/human judgment with software:
- What failure modes does software introduce that the original didn't have?
- Is there redundancy? What happens when the single sensor/input fails?
- Can humans override the automation when it's wrong?
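The second and third questions can be sketched as a guard that decides whether automation is allowed to act. This is a generic illustration, not a description of how any of the three systems above was built or fixed. `AutomationGuard`, `Decision`, and the parameters are hypothetical names.

```java
// Hypothetical guard: automation acts only when redundant inputs agree
// and no human has taken over.
final class AutomationGuard {
    enum Decision { AUTOMATE, DEFER_TO_HUMAN }

    static Decision decide(double sensorA, double sensorB,
                           double maxDisagreement, boolean humanOverride) {
        // A human override always wins.
        if (humanOverride) {
            return Decision.DEFER_TO_HUMAN;
        }
        // A missing reading (NaN) means we have lost redundancy.
        if (Double.isNaN(sensorA) || Double.isNaN(sensorB)) {
            return Decision.DEFER_TO_HUMAN;
        }
        // Disagreeing sensors mean at least one input is wrong.
        if (Math.abs(sensorA - sensorB) > maxDisagreement) {
            return Decision.DEFER_TO_HUMAN;
        }
        return Decision.AUTOMATE;
    }

    public static void main(String[] args) {
        System.out.println(decide(5.0, 5.2, 1.0, false));   // readings agree
        System.out.println(decide(5.0, 25.0, 1.0, false));  // readings disagree
        System.out.println(decide(5.0, 5.0, 1.0, true));    // human override
    }
}
```

The design choice is that every uncertain case falls back to the human. That only helps if the human is trained and informed enough to take over, which is the layer the table shows being removed.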
Bonus Slide


Sustainability

Learning Objectives
After this lecture, you will be able to:
- Define software sustainability as a meta-quality attribute
- Apply the four dimensions of sustainability to evaluate design trade-offs
- Recognize how efficiency gains can increase total resource consumption
- Evaluate who benefits and who bears risk in design trade-offs
From Safety to Sustainability: Generalizing "Who Profits, Who Bears Risk?"
In L35, we saw Boeing sell sensor redundancy as an optional upgrade. Budget airlines saved money. Passengers bore the risk — without knowing it.
That distributional question — who benefits, who pays, over what time horizon — is the core question of sustainability.
L1 callback: "Software engineering is the integral of programming over time." Every lecture since has been about what that integral measures. Today we name it.
SceneItAll's Success Disaster
SceneItAll launches. 50 beta homes. Everything works. Fast, reliable, safe. Great reviews.
| What went right | What happened next |
|---|---|
| Fast firmware updates | Team pushes 10x more often; total traffic doubles |
| Reliable occupancy sensing | Insurance companies want the data; users never consented |
| Accessible on modern phones | 100,000 homes; users with screen readers can't configure scenes |
| Free cloud tier covers costs | Growth past the free tier; locked into vendor pricing |
| Small team ships fast | Original devs leave; no one understands the Zigbee adapter code |
Nothing broke. The system succeeded — and the success created problems the original design never anticipated.
Sustainability: What Happens When This Succeeds?
Definition (Patricia Lago et al.): "Preservation of long-term beneficial use of software, and its appropriate evolution, in a context that continuously changes."
The key word is "beneficial." SceneItAll's occupancy data is useful — for the homeowner. It's harmful — for the homeowner whose data is sold. Same feature, different stakeholders, different time horizon.
Sustainability is not another quality attribute to add to the list. It is the meta-quality attribute — it asks whether all the other quality attributes (performance, safety, accessibility, changeability) hold up over time, and for whom.
Lago distinguishes two directions: sustainable software (inward — is the artifact itself maintainable, efficient, evolvable?) and software for sustainability (outward — does the software support sustainable processes in the world?). Both matter.
Safety vs. Sustainability: Two Different Questions
Safety (L35)
"What happens when this fails?"
- Therac-25 race condition
- Boeing single sensor
- CrowdStrike boot loop
Focus: failure modes. Who gets hurt when things go wrong?
Sustainability (today)
"What happens when this succeeds?"
- SceneItAll occupancy data sold
- Pawtograder narrows curriculum
- LLM subsidy reshapes labor market
Focus: success modes. Who bears the cost when things go right?
Technical Sustainability: Can the System Be Maintained and Evolved?

The dimension you know best. Low coupling, testability, readable code, clear contracts.
SceneItAll: Hexagonal architecture (L16) lets the team swap the Zigbee adapter for a Matter adapter without rewriting scene activation logic.
The test: Can a new developer join and make changes? Can you replace a dependency without a rewrite?
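A minimal sketch of the adapter swap, assuming a single port interface between scene logic and the device network. All names here (`DeviceGateway`, `ZigbeeAdapter`, `MatterAdapter`, `SceneActivator`) and the string-based "protocol" are hypothetical and do not reflect the actual SceneItAll code.

```java
import java.util.ArrayList;
import java.util.List;

// Port: the only thing scene logic knows about the device network.
interface DeviceGateway {
    String send(String deviceId, String command);
}

// Adapter for one protocol (stand-in behavior for illustration).
final class ZigbeeAdapter implements DeviceGateway {
    public String send(String deviceId, String command) {
        return "zigbee:" + deviceId + ":" + command;
    }
}

// A second adapter; nothing in SceneActivator changes to use it.
final class MatterAdapter implements DeviceGateway {
    public String send(String deviceId, String command) {
        return "matter:" + deviceId + ":" + command;
    }
}

// Core logic depends on the port, never on a concrete adapter.
final class SceneActivator {
    private final DeviceGateway gateway;

    SceneActivator(DeviceGateway gateway) {
        this.gateway = gateway;
    }

    List<String> activate(List<String> deviceIds) {
        List<String> sent = new ArrayList<>();
        for (String id : deviceIds) {
            sent.add(gateway.send(id, "ON"));
        }
        return sent;
    }

    public static void main(String[] args) {
        List<String> devices = List.of("lamp", "fan");
        System.out.println(new SceneActivator(new ZigbeeAdapter()).activate(devices));
        System.out.println(new SceneActivator(new MatterAdapter()).activate(devices));
    }
}
```

Swapping the dependency is a one-line change at construction time, and a test can pass in a fake `DeviceGateway` without any hardware.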
Economic Sustainability: Is the Total Cost of Ownership Viable?
Beyond hosting costs: developer time, dependency cost, lock-in risk, support burden, opportunity cost.
Pawtograder: GitHub Actions free tier covers current grading volume — but growth past the free tier means GitHub's pricing, not yours. And if GitHub changes their API? Every autograder integration breaks.
License changes are an economic hazard: MongoDB (AGPL to SSPL), HashiCorp (MPL to BSL) — your dependency's license can change under you.
L23 Recall: OpenSSL secured most of the internet — maintained by a handful of volunteers until Heartbleed exposed how underfunded critical infrastructure can be. Economically unsustainable open source is a supply chain risk for everyone who depends on it.
Environmental Sustainability: What Resources Does the System Consume?
Direct compute costs (energy, hardware, cooling) plus indirect effects (does the system enable behaviors that consume more resources?).
L20 callback: "Every network request requires CPU cycles, network interface power, router power, server CPU, data center cooling." Batching saves energy, not just latency.
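As a sketch of the batching idea, the helper below groups commands so that each group could travel in one request. It assumes every request carries a roughly fixed overhead, so fewer requests means less total overhead. `BatchingDemo` and `batch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: groups items so each group can be sent in one request.
final class BatchingDemo {
    static <T> List<List<T>> batch(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so each batch is independent of the input list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> commands = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println("unbatched requests: " + commands.size());
        System.out.println("batched requests:   " + batch(commands, 4).size());
    }
}
```

Ten commands become three requests at a batch size of four. The cost is that the first command in a batch waits for the batch to fill, which is the latency side of the trade-off.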
Social Sustainability: Who Does the System Serve?
Accessibility (L28), inclusivity, fairness, privacy. Indirect stakeholders emerge over time.
SceneItAll usage analytics:
- At 50 beta homes — occupancy data is a debugging tool
- At 100,000 homes — the same data is a burglary-risk or insurance-discrimination vector
The system didn't change. The stakeholder population did.
The Dimensions Interact — and Conflict
| Decision | Technical | Economic | Environmental | Social |
|---|---|---|---|---|
| Monolith to microservices | Better: independent deployment | Worse: operational complexity | Worse: network overhead, container sprawl (L20) | Neutral |
| Add WCAG accessibility | Moderate effort | Higher dev cost | Neutral | Better: inclusive (L28) |
| Switch to serverless | Moderate: vendor-specific APIs | Better: pay-per-use (L21) | Mixed: no idle waste but cold start overhead | Worse: vendor lock-in limits self-hosting |
| Keep all telemetry forever | Simpler: no retention policy | Worse: storage costs grow linearly | Worse: ~98% of data center data is "dark data" — never used (Lago) | Worse: privacy risk grows with data volume |
No decision optimizes all four. Sustainability analysis makes trade-offs visible — not resolved.
Discussion
In the 1860s, improvements in coal engines led to greater efficiency (less coal required for the same amount of work).
Do you think this led to less or more coal usage?
Jevons' Paradox: Efficiency Is Not Sustainability
1865: More efficient coal engines led to more total coal consumption. Efficiency made it cheaper, expanding use faster than per-unit savings.
| Technology | Per-unit gain | Total consumption |
|---|---|---|
| Cloud computing | Cheaper per hour | Total energy skyrocketed |
| Web + CDNs | Faster per byte | Pages: 100KB → 4MB |
| CI/CD | Cheaper per build | Vastly more builds |
| LLM inference | Cheaper per token | AI compute exploding |

Making software faster/cheaper does not automatically make it more sustainable.
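The arithmetic behind the paradox is short: total consumption rises whenever usage grows by a larger factor than efficiency improves. The numbers below are invented for illustration and are not measurements. `JevonsDemo` and `totalAfter` are hypothetical names.

```java
// Hypothetical worked example of the Jevons arithmetic.
final class JevonsDemo {
    // efficiencyGain = 2.0 means each unit of work now costs half as much.
    // usageGrowth = 3.0 means three times as much work is being done.
    static double totalAfter(double totalBefore, double efficiencyGain, double usageGrowth) {
        return totalBefore * usageGrowth / efficiencyGain;
    }

    public static void main(String[] args) {
        double before = 100.0; // arbitrary resource units
        // Illustrative numbers only: 2x efficiency, 3x usage.
        System.out.println("total after: " + totalAfter(before, 2.0, 3.0));
        // Usage held constant: the efficiency gain is a real saving.
        System.out.println("total after: " + totalAfter(before, 2.0, 1.0));
    }
}
```

In the first case the system is twice as efficient and still consumes 1.5 times as much. Efficiency only reduces the total when usage grows more slowly than the per-unit cost falls.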
The Jevons Cycle: Why Efficiency Feeds Itself
The loop is self-reinforcing. Each efficiency gain makes the next round of expansion cheaper.
You're Living Inside Jevons' Paradox: Pawtograder
Efficient automated grading enables unlimited submissions. Students submit 3,000-12,000 times per day across the course.
Before: submit once, human grades. The system is more efficient; the total resource consumption is higher.
LLMs: Jevons' Paradox as a Business Strategy
Per-token API prices have dropped across model generations even as power and usage have surged.
| Cost layer | Who pays | Who benefits |
|---|---|---|
| GPU hardware + energy | Cloud providers (passed to AI companies) | Developers using the tools |
| Training data creation | Original authors (often unconsented) | AI companies + users |
| Subsidy gap (~$200 vs ~$5,000 estimate) | AI company investors (for now) | Individual developers |
| Environmental externality | Everyone (carbon emissions) | Direct users of the service |
| Labor displacement risk | Workers in affected roles | Companies reducing headcount |
"Who profits, who bears risk?" applied to the tools you use every day.
Digital Sufficiency: Should We Build This at All?
Jevons asks whether efficiency reduces total consumption. Sufficiency asks a more radical question: is this technology needed in the first place?
| Efficiency question | Sufficiency question |
|---|---|
| How do we make this drone software more energy-efficient? | What happens when efficient medical drones get cheap enough to become toys, negating the efficiency gains at scale? |
| How do we optimize data center storage? | Should we be storing 98% "dark data" that no one will ever read? |
| How do we make LLM inference cheaper per token? | Should you be using an LLM for this task, or would grep do? |
| How do we make SceneItAll updates faster? | Does every light bulb need a WiFi chip and cloud connection? |
First, Second, and Third-Order Effects
| System | 1st-order (direct) | 2nd-order (behavioral) | 3rd-order (systemic) |
|---|---|---|---|
| SceneItAll | Hub uses power to run | Convenience increases total energy use; usage data reveals when you're home | Insurance pricing and surveillance reshape around smart-home data |
| Pawtograder | Each submission uses compute | Unlimited submissions change study habits — autograder becomes the debugger | If every course auto-grades, assignments gravitate toward what's auto-gradeable, narrowing what students learn |
| LLM Agents | GPU inference per prompt | Developers write more code, explore more approaches, iterate faster | Labor market restructures; codebases grow faster than teams can understand them |
"If this system is wildly successful, what behaviors does it enable, and who is affected?"
Real Decision: The Azure Outage
October 2025. Azure goes down. GitHub Actions stops running. Pawtograder can't grade submissions. Two options:
| Dimension | Option A: Self-hosted fallback | Option B: Stay GitHub-dependent |
|---|---|---|
| Technical | Complex failover logic; two systems to maintain | Simpler architecture; single system |
| Economic | Duplicate infrastructure costs | Leverage free tier; lower total cost |
| Environmental | Idle fallback resources most of the time | Shared infrastructure, higher utilization |
| Social | Resilient — students don't lose access during outages | Equal access for all institutions (no self-hosting expertise needed) |
No right answer. The four-dimensional analysis makes the trade-offs visible.
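To give a sense of what Option A's "failover logic" means at its simplest, here is a sketch that tries a primary grading path and falls back when it throws. This is not how Pawtograder works; `FallbackGrader`, `grade`, and the suppliers are hypothetical, and a real failover would also need health checks, retries, and result reconciliation.

```java
import java.util.function.Supplier;

// Hypothetical failover wrapper: try the primary runner, then the fallback.
final class FallbackGrader {
    static String grade(Supplier<String> primary, Supplier<String> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException outage) {
            // Assumption: any runtime failure of the primary counts as an outage.
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        Supplier<String> hostedRunner = () -> {
            throw new IllegalStateException("hosted runner unavailable");
        };
        Supplier<String> selfHostedRunner = () -> "graded on self-hosted runner";
        System.out.println(grade(hostedRunner, selfHostedRunner));
    }
}
```

Even this tiny version shows the cost in the Technical row: the fallback path is a second system that has to be kept working, and it runs rarely enough that its bugs are easy to miss.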
System Design Is Never Value-Neutral
The Karlskrona Manifesto on Sustainability Design (2015): foundational consensus document from ~30 software engineering researchers.
Core principle: every architecture, API, and default setting reflects assumptions about who matters and what matters.
Sustainability is the practice of making those assumptions explicit and revisiting them as the system and its context evolve.
Every Design Decision Encodes a Value Judgment
Pawtograder's choices encode values — whether we thought about them or not:
- Unlimited submissions values learning-by-iteration over compute efficiency
- Requiring GitHub values standardization over universal access
- Auto-grading values scale over human nuance
The question is not whether your design encodes values. It's whether you chose them deliberately.
The Integral of Programming Over Time
L1: "Software engineering is the integral of programming over time."
Sustainability is what that integral computes.
Go build software that provides value over time, to the people who need it, without imposing unacceptable costs on the people who don't.
The State of CS 3100
This class has been challenging. We want to improve it.
Plans:
- Administration will carefully review TRACE feedback.
- We will offer another survey (with credit).
In the remaining time, please:
- Complete TRACE
- Suggest/upvote/downvote questions for the survey