
CS 3100: Program Design and Implementation II
Lecture 36: Sustainability
©2026 Jonathan Bell, CC-BY-SA
Learning Objectives
After this lecture, you will be able to:
- Define software sustainability as a meta-quality attribute
- Apply the four dimensions of sustainability to evaluate design trade-offs
- Recognize how efficiency gains can increase total resource consumption
- Evaluate who benefits and who bears risk in design trade-offs
From Safety to Sustainability: Generalizing "Who Profits, Who Bears Risk?"
In L35, we saw Boeing sell sensor redundancy as an optional upgrade. Budget airlines saved money. Passengers bore the risk — without knowing it.
That distributional question — who benefits, who pays, over what time horizon — is the core question of sustainability.
L1 callback: "Software engineering is the integral of programming over time." Every lecture since has been about what that integral measures. Today we name it.
SceneItAll's Success Disaster
SceneItAll launches. 50 beta homes. Everything works. Fast, reliable, safe. Great reviews.
| What went right | What happened next |
|---|---|
| Fast firmware updates | Team pushes 10x more often; total traffic doubles |
| Reliable occupancy sensing | Insurance companies want the data; users never consented |
| Accessible on modern phones | 100,000 homes; users with screen readers can't configure scenes |
| Free cloud tier covers costs | Growth past the free tier; locked into vendor pricing |
| Small team ships fast | Original devs leave; no one understands the Zigbee adapter code |
Nothing broke. The system succeeded — and the success created problems the original design never anticipated.
Sustainability: What Happens When This Succeeds?
Definition (Lago et al.): "Preservation of long-term beneficial use of software, and its appropriate evolution, in a context that continuously changes."
The key word is "beneficial." SceneItAll's occupancy data is useful — for the homeowner. It's harmful — for the homeowner whose data is sold. Same feature, different stakeholders, different time horizon.
Sustainability is not another quality attribute to add to the list. It is the meta-quality attribute — it asks whether all the other quality attributes (performance, safety, accessibility, changeability) hold up over time, and for whom.
Lago distinguishes two directions: sustainable software (inward — is the artifact itself maintainable, efficient, evolvable?) and software for sustainability (outward — does the software support sustainable processes in the world?). Both matter.
Safety vs. Sustainability: Two Different Questions
Safety (L35)
"What happens when this fails?"
- Therac-25 race condition
- Boeing single sensor
- CrowdStrike boot loop
Focus: failure modes. Who gets hurt when things go wrong?
Sustainability (today)
"What happens when this succeeds — at scale, over years, across stakeholders you haven't met?"
- SceneItAll occupancy data sold
- Pawtograder narrows curriculum
- LLM subsidy reshapes labor market
Focus: success modes. Who bears the cost when things go right?
You've Been Building Sustainability Mechanisms All Semester
| What you learned | Where | What it sustains |
|---|---|---|
| Information hiding | L6 | Changeability — hidden internals can evolve without breaking clients |
| Low coupling | L7 | Independence — modules can be maintained, replaced, scaled independently |
| SOLID principles | L8 | Evolvability — code resists "software rot" as requirements change |
| Hexagonal architecture | L16 | Vendor independence — swap infrastructure without rewriting domain logic |
| Open source evaluation | L23 | Supply chain health — dependencies that won't be abandoned or relicensed |
| Accessibility | L28 | Inclusivity — system serves diverse and growing user populations |
| Staged rollout | L35 | Blast radius control — failures don't cascade to every user simultaneously |
Decisions that seem like "good engineering practice" in the short term are sustainability investments in the long term.
Technical Sustainability: Can the System Be Maintained and Evolved?

The dimension you know best. Low coupling, testability, readable code, clear contracts.
SceneItAll: Hexagonal architecture (L16) lets the team swap the Zigbee adapter for a Matter adapter without rewriting scene activation logic.
The test: Can a new developer join and make changes? Can you replace a dependency without a rewrite?
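A minimal sketch of the adapter swap, assuming a single port with one operation. The names `DevicePort`, `ZigbeeAdapter`, `MatterAdapter`, and `activate_scene` are hypothetical and are not taken from SceneItAll's code; the adapters return strings instead of talking to real radios.

```python
from typing import Protocol


class DevicePort(Protocol):
    """Port: the only thing the domain logic knows about device hardware."""

    def set_power(self, device_id: str, on: bool) -> str: ...


class ZigbeeAdapter:
    """Hypothetical adapter; a real one would talk to a Zigbee radio."""

    def set_power(self, device_id: str, on: bool) -> str:
        return f"zigbee:{device_id}:{'on' if on else 'off'}"


class MatterAdapter:
    """Hypothetical replacement adapter for Matter devices."""

    def set_power(self, device_id: str, on: bool) -> str:
        return f"matter:{device_id}:{'on' if on else 'off'}"


def activate_scene(port: DevicePort, device_ids: list[str]) -> list[str]:
    """Domain logic: turn on every device in a scene through the injected port."""
    return [port.set_power(device_id, True) for device_id in device_ids]


if __name__ == "__main__":
    scene = ["lamp-1", "lamp-2"]
    print(activate_scene(ZigbeeAdapter(), scene))
    print(activate_scene(MatterAdapter(), scene))
```

`activate_scene` never names a protocol, so replacing Zigbee with Matter changes which adapter is constructed, not the scene logic.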
Economic Sustainability: Is the Total Cost of Ownership Viable?
Beyond hosting costs: developer time, dependency cost, lock-in risk, support burden, opportunity cost.
Pawtograder: GitHub Actions free tier covers current grading volume — but growth past the free tier means GitHub's pricing, not yours. And if GitHub changes their API? Every autograder integration breaks.
License changes are an economic hazard: MongoDB (AGPL to SSPL), HashiCorp (MPL to BSL) — your dependency's license can change under you.
L23 Recall: OpenSSL secured most of the internet — maintained by a handful of volunteers until Heartbleed exposed how underfunded critical infrastructure can be. Economically unsustainable open source is a supply chain risk for everyone who depends on it.
Environmental Sustainability: What Resources Does the System Consume?
Direct compute costs (energy, hardware, cooling) plus indirect effects (does the system enable behaviors that consume more resources?).
L20 callback: "Every network request requires CPU cycles, network interface power, router power, server CPU, data center cooling." Batching saves energy, not just latency.
Tease: efficiency gains don't always reduce total consumption. We'll see why next.
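A rough sketch of why batching saves energy, assuming every request pays a fixed overhead (connection setup, headers, radio wake-up) regardless of payload size. The function name and the event and batch counts are hypothetical, chosen only for illustration.

```python
def requests_needed(num_events: int, batch_size: int) -> int:
    """Number of network requests needed to send num_events, batch_size per request."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return -(-num_events // batch_size)  # ceiling division


unbatched = requests_needed(1000, 1)
batched = requests_needed(1000, 50)
print(unbatched, batched)  # 1000 20
```

The payload bytes are the same in both cases; what shrinks is the number of times the fixed per-request cost is paid.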
Social Sustainability: Who Does the System Serve?
Accessibility (L28), inclusivity, fairness, privacy. Indirect stakeholders emerge over time.
SceneItAll usage analytics:
- At 50 beta homes — occupancy data is a debugging tool
- At 100,000 homes — the same data is a burglary-risk or insurance-discrimination vector
The system didn't change. The stakeholder population did.
The Dimensions Interact — and Conflict
| Decision | Technical | Economic | Environmental | Social |
|---|---|---|---|---|
| Monolith to microservices | Better: independent deployment | Worse: operational complexity | Worse: network overhead, container sprawl (L20) | Neutral |
| Add WCAG accessibility | Moderate effort | Higher dev cost | Neutral | Better: inclusive (L28) |
| Switch to serverless | Moderate: vendor-specific APIs | Better: pay-per-use (L21) | Mixed: no idle waste but cold start overhead | Worse: vendor lock-in limits self-hosting |
| Keep all telemetry forever | Simpler: no retention policy | Worse: storage costs grow linearly | Worse: ~98% of data center data is "dark data" — never used (Lago) | Worse: privacy risk grows with data volume |
No decision optimizes all four. Sustainability analysis makes trade-offs visible — not resolved.
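One way to keep the trade-offs visible is to record them as data. The sketch below encodes two rows of the table, collapsing each cell to a coarse better/neutral/worse rating. That collapsing is an assumption made for illustration (for example, "Simpler" is recorded as better), and the names `Effect`, `DECISIONS`, and `worsened_dimensions` are hypothetical.

```python
from enum import Enum


class Effect(Enum):
    BETTER = "better"
    NEUTRAL = "neutral"
    WORSE = "worse"


# Two rows from the table, reduced to one coarse rating per dimension.
DECISIONS = {
    "monolith to microservices": {
        "technical": Effect.BETTER,
        "economic": Effect.WORSE,
        "environmental": Effect.WORSE,
        "social": Effect.NEUTRAL,
    },
    "keep all telemetry forever": {
        "technical": Effect.BETTER,
        "economic": Effect.WORSE,
        "environmental": Effect.WORSE,
        "social": Effect.WORSE,
    },
}


def worsened_dimensions(decision: str) -> list[str]:
    """List the dimensions a decision makes worse, so the cost stays visible."""
    return [dim for dim, effect in DECISIONS[decision].items() if effect is Effect.WORSE]


print(worsened_dimensions("monolith to microservices"))  # ['economic', 'environmental']
```

The function reports which dimensions pay; it deliberately does not compute a single score, because the analysis is meant to expose trade-offs, not resolve them.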
Jevons' Paradox: Efficiency Is Not Sustainability
1865: William Stanley Jevons observed that more efficient steam engines led to more total coal consumption. Efficiency made coal power cheaper, so its use expanded faster than the per-unit savings.
| Technology | Per-unit gain | Total consumption |
|---|---|---|
| Cloud computing | Cheaper per hour | Total energy skyrocketed |
| Web + CDNs | Faster per byte | Pages: 100KB → 4MB |
| CI/CD | Cheaper per build | Vastly more builds |
| LLM inference | Cheaper per token | AI compute exploding |

Making software faster/cheaper does not automatically make it more sustainable.
The Jevons Cycle: Why Efficiency Feeds Itself
The loop is self-reinforcing. Each efficiency gain makes the next round of expansion cheaper.
SceneItAll + Pawtograder: You're Living Inside Jevons' Paradox
SceneItAll:
Efficient firmware updates (faster Zigbee, smaller deltas) did not reduce total traffic — they meant the team pushes updates more frequently.
Per-update cost dropped 5x. Update frequency increased 10x. Total update traffic doubled.
Pawtograder:
Efficient automated grading enables unlimited submissions. Students submit 3,000-12,000 times per day across the course.
Before automation: submit once, a human grades. Each submission is now far cheaper to grade, but the total resource consumption is higher.
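The SceneItAll numbers can be checked with a line of arithmetic. This is a sketch that assumes total traffic is simply per-update cost times update frequency; the function name is hypothetical.

```python
def total_consumption_ratio(per_unit_cost_reduction: float, usage_growth: float) -> float:
    """Ratio of new total consumption to old total consumption.

    per_unit_cost_reduction: factor by which per-unit cost dropped (5.0 means 5x cheaper).
    usage_growth: factor by which usage grew (10.0 means 10x more often).
    """
    return usage_growth / per_unit_cost_reduction


ratio = total_consumption_ratio(per_unit_cost_reduction=5.0, usage_growth=10.0)
print(ratio)        # 2.0: total update traffic doubled
print(ratio > 1.0)  # True: usage outgrew the efficiency gain
```

A ratio above 1.0 is the rebound effect: the efficiency gain was real, and total consumption still went up.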
LLMs: Jevons' Paradox as a Business Strategy
Per-token API prices have dropped across model generations even as total inference volume has surged. Snapshot (Anthropic API pricing, retrieved 2026-03-31): Claude Opus 4.1 at $15/$75 per million input/output tokens; Claude Opus 4.6 at $5/$25 — about 3x lower per million tokens.
Illustrative estimate: Claude Code Max plans ($200/mo) may correspond to ~$5,000 in API compute at published list rates — a significant gap that illustrates the Jevons pattern, not a precise accounting. Real API spend depends on models, tokens, caching, batching, and contract discounts.
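The two ratios quoted above can be reproduced directly. The sketch copies the lecture's snapshot figures and its illustrative $5,000 estimate; it is not live pricing and not an accounting of real spend. The function and constant names are hypothetical.

```python
# Figures copied from the lecture's snapshot; USD per million tokens.
OPUS_4_1 = {"input": 15.0, "output": 75.0}
OPUS_4_6 = {"input": 5.0, "output": 25.0}


def price_drop(old: dict[str, float], new: dict[str, float]) -> dict[str, float]:
    """Factor by which each per-million-token price fell between generations."""
    return {kind: old[kind] / new[kind] for kind in old}


def subsidy_multiple(list_price_compute: float, subscription_price: float) -> float:
    """How many dollars of list-price compute each subscription dollar may buy."""
    return list_price_compute / subscription_price


print(price_drop(OPUS_4_1, OPUS_4_6))   # {'input': 3.0, 'output': 3.0}
print(subsidy_multiple(5000.0, 200.0))  # 25.0
```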
| Cost layer | Who pays | Who benefits |
|---|---|---|
| GPU hardware + energy | Cloud providers (passed to AI companies) | Developers using the tools |
| Training data creation | Original authors (often unconsented) | AI companies + users |
| Subsidy gap (~$200 vs ~$5,000 estimate) | AI company investors (for now) | Individual developers |
| Environmental externality | Everyone (carbon emissions) | Direct users of the service |
| Labor displacement risk | Workers in affected roles | Companies reducing headcount |
"Who profits, who bears risk?" applied to the tools you use every day.
Jevons in the Wild: Spot the Rebound Effect
Which of these exhibit Jevons' paradox? For each, identify: what got more efficient, and what increased.
| Scenario | Jevons? | What increased? |
|---|---|---|
| A. Adding database indexes speeds queries 10x. DevOps adds more monitoring queries. | ❓ | |
| B. Switching to incremental compilation (20x faster). Developers recompile constantly during debugging. | ❓ | |
| C. Raising API rate limit from 100 to 1000 req/sec. Clients send 5x more requests. | ❓ | |
| D. Adding WCAG accessibility to SceneItAll. More users can use the app. | ❓ | |
Discuss with a neighbor. Not all of these are Jevons — which one isn't?
Digital Sufficiency: Should We Build This at All?
Jevons asks whether efficiency reduces total consumption. Sufficiency asks a more radical question: is this technology needed in the first place?
| Efficiency question | Sufficiency question |
|---|---|
| How do we make this drone software more energy-efficient? | Should efficient medical drones become cheap enough to be sold as toys, negating the efficiency gains at scale? |
| How do we optimize data center storage? | Should we be storing 98% "dark data" that no one will ever read? |
| How do we make LLM inference cheaper per token? | Should you be using an LLM for this task, or would grep do? |
| How do we make SceneItAll updates faster? | Does every light bulb need a WiFi chip and cloud connection? |
The EU's Right to Repair: extending hardware life = less hardware produced. Sufficiency, not efficiency.
First, Second, and Third-Order Effects
| | 1st-order (direct) | 2nd-order (behavioral) | 3rd-order (systemic) |
|---|---|---|---|
| SceneItAll | Hub uses power to run | Convenience increases total energy use; usage data reveals when you're home | Insurance pricing and surveillance reshape around smart-home data |
| Pawtograder | Each submission uses compute | Unlimited submissions change study habits — autograder becomes the debugger | If every course auto-grades, assignments gravitate toward what's auto-gradeable, narrowing what students learn |
| LLM Agents | GPU inference per prompt | Developers write more code, explore more approaches, iterate faster | Labor market restructures; codebases grow faster than teams can understand them |
"If this system is wildly successful, what behaviors does it enable, and who is affected?"
The Veil of Ignorance: Design As If You Don't Know Which Stakeholder You'll Be
Rawls' thought experiment: design the rules of a society as if you don't know which position you'll occupy in it. Applied to software:
SceneItAll: Would you accept this design if you might be...
- The developer maintaining code in 3 years
- A user with a visual impairment
- A homeowner with intermittent internet
- The person whose occupancy data is sold
- A homeowner locked out during a firmware update
Pawtograder: Would you accept this design if you might be...
- A Northeastern student with fast internet
- A community college student self-hosting with limited IT
- A student with a disability needing accessible feedback
- A TA grading 200 submissions during finals
- A student who got a zero from an autograder crash (L35)
The veil doesn't tell you what to build. It tells you which trade-offs deserve extra scrutiny.
The Veil Decides: Who Bears the Cost of "Unlimited" Submissions?
Scenario: Pawtograder offers unlimited autograder submissions. The compute cost is real — but invisible to students. Behind the veil, you might be:
| Stakeholder | Impact of "unlimited" |
|---|---|
| Student at Northeastern | Submit freely, fast feedback, iterate quickly |
| Student at community college self-hosting | Their IT budget pays per container-minute — unlimited = unaffordable |
| Student with slow internet | Each submission is a 30-second upload + 2-minute wait — "unlimited" isn't free |
| Student who uses autograder as debugger | Learns less; relies on output instead of reasoning (L13) |
| The planet | 12,000 submissions/day × 2 min compute = real energy (Jevons) |
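To put a number on the last row, here is a sketch that converts submissions into compute-hours, assuming every submission uses the full 2 minutes of compute. Turning hours into energy would need a power figure the lecture does not give, so the sketch stops at hours. The function name is hypothetical.

```python
def daily_compute_hours(submissions_per_day: int, minutes_per_submission: float) -> float:
    """Total compute-hours per day for a given submission volume."""
    return submissions_per_day * minutes_per_submission / 60


low = daily_compute_hours(3_000, 2)    # quiet day in the 3,000-12,000 range
high = daily_compute_hours(12_000, 2)  # peak day
print(low, high)  # 100.0 400.0
```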
"Unlimited" is a design choice that encodes a value: iteration over efficiency. Does the veil change your assessment?
Pawtograder Through Four Dimensions
| Dimension | Assessment | Key Question |
|---|---|---|
| Technical | Open-source, modular (L16 hex arch). GitHub Actions dependency. | If GitHub changes their Actions pricing or API, how much breaks? |
| Economic | Serverless pay-per-use (L21), no licensing. But self-hosting requires expertise. | Can an under-resourced institution actually adopt this? |
| Environmental | 3-12k daily submissions, scale-to-zero. But Jevons applies: unlimited submissions generate more total compute. | Should there be a cooling-off period between submissions? |
| Social | GPL license, but requires GitHub. WCAG not yet validated. | Is it truly accessible to all students and institutions? |
Open-source + modular + pay-per-use looks great on paper. The four-dimensional analysis reveals what's hidden.
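The cooling-off question in the Environmental row could be prototyped as a per-student gate. This is a hypothetical sketch, not Pawtograder's actual behavior; the class name, the 300-second window, and the student ids are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CooldownGate:
    """Hypothetical per-student cooling-off check; not Pawtograder's actual behavior."""

    cooldown_seconds: float
    _last_accepted: dict[str, float] = field(default_factory=dict)

    def try_submit(self, student_id: str, now: float) -> bool:
        """Accept the submission if the student's cooldown has elapsed at time `now`."""
        last = self._last_accepted.get(student_id)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still cooling off; rejected attempts do not reset the timer
        self._last_accepted[student_id] = now
        return True


gate = CooldownGate(cooldown_seconds=300)
print(gate.try_submit("alice", now=0))    # True
print(gate.try_submit("alice", now=120))  # False
print(gate.try_submit("alice", now=300))  # True
print(gate.try_submit("bob", now=120))    # True
```

The mechanism is a few lines; the hard part is the value judgment it encodes, since any cooldown trades iteration speed for lower total compute.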
Real Decision: The Azure Outage
October 2025. Azure goes down. GitHub Actions stops running. Pawtograder can't grade submissions. Two options:
| | Option A: Self-hosted fallback | Option B: Stay GitHub-dependent |
|---|---|---|
| Technical | Complex failover logic; two systems to maintain | Simpler architecture; single system |
| Economic | Duplicate infrastructure costs | Leverage free tier; lower total cost |
| Environmental | Idle fallback resources most of the time | Shared infrastructure, higher utilization |
| Social | Resilient — students don't lose access during outages | Equal access for all institutions (no self-hosting expertise needed) |
No right answer. The four-dimensional analysis makes the trade-offs visible.
Evaluate This Trade-off: Real-Time TA Notifications
Feature request: Add real-time email/SMS notifications to TAs whenever a student submits to Pawtograder.
Analyze across all four dimensions:
| Dimension | Better or worse? | Why? |
|---|---|---|
| Technical | New dependencies? Latency requirements? | |
| Economic | Does this increase hosting/SaaS costs? | |
| Environmental | Real-time push vs batch — compute difference? | |
| Social | TAs get faster feedback, but notifications during off-hours? | |
Is this sustainable across all four dimensions? For which stakeholder does it worsen?
The Values-Requirements Gap: Operationalizing Values Is Genuinely Hard
You can state values clearly. Translating them into testable requirements is an open research problem.
| Value | Attempted Requirement | Problem |
|---|---|---|
| Fairness | "Grade all submissions identically" | Identical can be inequitable (students with disabilities, slow connections hit timeouts) |
| Privacy | "Don't collect unnecessary data" | "Necessary" depends on who's asking — debugging needs telemetry, but telemetry is surveillance |
| Environmental | "Minimize compute" | Conflicts with unlimited submissions, thorough test suites, and fast feedback |
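Turning a value into a metric invites optimizing the metric instead of the value. C.T. Nguyen calls this value capture. Economists describe a closely related effect as Goodhart's Law: "when a measure becomes a target, it ceases to be a good measure."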
Comprehension Check
Open Poll Everywhere and answer the three questions.
System Design Is Never Value-Neutral
The Karlskrona Manifesto on Sustainability Design (2015): foundational consensus document from ~30 software engineering researchers.
Core principle: every architecture, API, and default setting reflects assumptions about who matters and what matters.
Sustainability is the practice of making those assumptions explicit and revisiting them as the system and its context evolve.
Every Design Decision Encodes a Value Judgment
Pawtograder's choices encode values — whether we thought about them or not:
- Unlimited submissions values learning-by-iteration over compute efficiency
- Requiring GitHub values standardization over universal access
- Auto-grading values scale over human nuance
The question is not whether your design encodes values. It's whether you chose them deliberately.
Everything This Semester Sustains Something
| Lecture Arc | What It Sustains | Time Horizon |
|---|---|---|
| L5-L8: Readability, coupling, SOLID | Code future developers can understand and change | Years of maintenance |
| L9, L12: Requirements, domain modeling | Systems that solve the right problem | Project lifetime |
| L15-L16: Testing, testability | Confidence that changes don't break existing behavior | Every commit |
| L18-L21: Architecture, networks, serverless | Systems that scale, evolve, survive infrastructure changes | Organizational lifetime |
| L23: Open source | Supply chains that don't depend on abandoned projects | Industry-wide |
| L28: Accessibility | Systems that serve all users, not just ones who look like the developers | Societal |
| L34-L35: Performance, safety | Systems that don't harm users through slowness or failure | Immediate to catastrophic |
Sustainability is not a new topic. It is the name for what all of these have in common.
Parnas: The Question Is Not "AI" but "Critical Software"
"What we should be doing is trying to regulate critical software rather than trying to make regulations that apply to AI."
The sustainability framework agrees: what is the blast radius, who are the stakeholders, and are the trade-offs visible?
The same questions apply whether a for-loop or a neural network is processing the loan applications.
Professional responsibility transcends the technology.
The Integral of Programming Over Time
L1: "Software engineering is the integral of programming over time."
Sustainability is what that integral computes.
Go build software that provides value over time, to the people who need it, without imposing unacceptable costs on the people who don't.
Looking Ahead
L37 (Monday): Map-Reduce — a case study in sustainable architecture at planetary scale. How Google designed a programming model that let thousands of engineers process the world's data without understanding distributed systems.
L38 (Wednesday): The Future of Programming — where does software engineering go from here?
L39 (Thursday): Review
GA2: Feature Buffet due Thursday April 16. Process over product — a well-documented partial feature scores higher than a complete feature with no documentation.