A pixel art split-scene: LEFT shows a stressed student with their car in a snowy lot — $2,200 parking permit, insurance bills, gas costs, car sitting unused. RIGHT shows the same student relaxed, tapping CharlieCard on the MBTA Green Line, sharing the train with others. BUT: MBTA signs show 'No trains past Government Center' and 'Service ends 12:30 AM' — you play by their rules. Tagline: Shared Infrastructure. Shared Rules. Shared Savings.

CS 3100: Program Design and Implementation II

Lecture 21: Serverless Architecture

©2026 Jonathan Bell & Ellen Spertus, CC-BY-SA

Announcements

Late tokens CAN be used on group assignments.

  • See details in the link from the syllabus.

Due Friday

  • Midterm survey via Qualtrics
  • Team formation survey
  • Khoury mid-semester evaluation

Learning Objectives

After this lecture, you will be able to:

  1. Recognize common infrastructure building blocks (databases, queues, caches, object storage, observability) and their architectural roles
  2. Define "serverless" architecture and Functions as a Service (FaaS) concepts
  3. Compare serverless to traditional and container-based architectures, identifying tradeoffs
  4. Identify requirements that are well-suited or poorly-suited for serverless
  5. Apply a decision framework for choosing between architectural styles based on team size, scaling needs, and operational capacity

Important framing: You will encounter serverless systems in internships and jobs. The goal is to understand why teams choose serverless and reason about whether it fits a given problem — not to become a serverless architect overnight.

Poll: What's the fun part?

Imagine memes haven't been invented, and you decide to create the first meme generator and server. What will you want to spend your time on?

A. the GUI, including the meme editor

B. keeping search and retrieval fast with 10M memes

C. handling traffic when a meme goes viral

D. authenticating users securely

E. scrambling to fix production problems at 2 AM

F. diagnosing why memes randomly fail in Asia

Poll Everywhere QR Code or Logo

Text espertus to 22333 if the URL isn't working for you.

https://pollev.com/espertus

DIY or Cloud

A split illustration: LEFT shows an exhausted developer at 2:47 AM surrounded by smoking servers and error screens, overwhelmed by infrastructure worries and angry user complaints. RIGHT shows the same developer sleeping peacefully while a cloud diagram displays healthy managed infrastructure and happy users reacting with hearts and laughter.

The Infrastructure Iceberg

An iceberg diagram: Above water (small) is 'Your App' — the code you write. Below water (massive) are the infrastructure layers: Runtime Environment, Operating System, Hardware, Network, Power & Cooling, Physical Space. Tagline: Your app is just the tip of the iceberg.

Storing Memes: We Need a Database and Object Storage

Every meme has metadata and an image that need to outlive your server process. A database gives you persistent storage for metadata.

PostgreSQL (relational)

memes
─────────────────────────────────────
id | 4821
template_id | 12
top_text | 'Me'
bottom_text | 'A new programming language'
author_id | 99
rating | 4.7
image_url | 'memes/4821.jpg'

Rows, columns, strict schema. Supports joins, aggregations, and transactions, making it a good fit for our meme server.

Firestore (NoSQL)

{
"id": "meme_4821",
"template": "distracted-boyfriend",
"text": {
"top": "Me",
"bottom": "A new programming language"
},
"author": 99,
"rating": 4.7,
"image_url": "memes/4821.jpg"
}

Flexible JSON documents. Easy to add fields. No joins.

Images are stored separately in cheaper object storage.
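The "joins and aggregations" advantage can be made concrete with a toy in-memory model. This is a hypothetical sketch in plain Java — the records, tables, and sample data are invented stand-ins for database rows — showing the two operations a relational database performs for you declaratively in SQL:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy stand-in for two relational tables, to show what JOIN and GROUP BY do.
public class MemeTables {
    record Template(int id, String name) {}
    record Meme(int id, int templateId, String topText, double rating) {}

    static final List<Template> TEMPLATES = List.of(
        new Template(12, "distracted-boyfriend"),
        new Template(13, "drake"));

    static final List<Meme> MEMES = List.of(
        new Meme(4821, 12, "Me", 4.7),
        new Meme(4822, 12, "You", 3.1),
        new Meme(4823, 13, "Hotline", 4.9));

    // JOIN: pair each meme with its template's name via the foreign key.
    static Map<Integer, String> memeToTemplateName() {
        Map<Integer, String> names = TEMPLATES.stream()
            .collect(Collectors.toMap(Template::id, Template::name));
        return MEMES.stream()
            .collect(Collectors.toMap(Meme::id, m -> names.get(m.templateId())));
    }

    // AGGREGATION: average rating per template (GROUP BY template_id).
    static Map<Integer, Double> avgRatingByTemplate() {
        return MEMES.stream().collect(Collectors.groupingBy(
            Meme::templateId, Collectors.averagingDouble(Meme::rating)));
    }
}
```

In SQL this would be one `JOIN ... GROUP BY` query; the point is that a document store like Firestore makes you write this application-side code yourself.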

Indexing: Finding Memes Without Reading Everything

With 10 million memes, you can't scan every row on every search. A database index is a pre-computed lookup structure — like a book's index, but rebuilt automatically as data changes.

Without index: full table scan

SELECT * FROM memes
WHERE top_text LIKE '%programming%'
OR bottom_text LIKE '%programming%';

Reads all 10M rows on every search. Gets slower as the library grows. Response time: seconds.

With index: direct lookup

-- Create a full-text (GIN) index over both text fields
CREATE INDEX memes_text_search
  ON memes
  USING gin (to_tsvector('english', top_text || ' ' || bottom_text));

-- Query using the index
SELECT * FROM memes
WHERE to_tsvector('english', top_text || ' ' || bottom_text)
@@ to_tsquery('english', 'programming');

Index narrows to matching rows instantly. Response time: milliseconds at any scale.

The database builds and maintains the index automatically. You declare what to index; it handles the rest.
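The scan-vs-lookup difference can be seen in miniature with a hand-built inverted index. This is an illustrative sketch, not how Postgres implements GIN: the `MemeIndex` class and its word-splitting rule are invented for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy inverted index: word -> set of meme ids. A real database maintains
// this structure automatically as rows change; here we do it by hand.
public class MemeIndex {
    private final Map<Integer, String> memes = new HashMap<>();       // id -> text
    private final Map<String, Set<Integer>> index = new HashMap<>();  // word -> ids

    void add(int id, String text) {
        memes.put(id, text);
        for (String word : text.toLowerCase().split("\\s+"))
            index.computeIfAbsent(word, w -> new HashSet<>()).add(id);
    }

    // Full table scan: examines every row on every query — O(n).
    Set<Integer> scan(String word) {
        Set<Integer> hits = new HashSet<>();
        for (var e : memes.entrySet())
            if (e.getValue().toLowerCase().contains(word)) hits.add(e.getKey());
        return hits;
    }

    // Index lookup: one hash probe, independent of table size — O(1).
    Set<Integer> lookup(String word) {
        return index.getOrDefault(word, Set.of());
    }
}
```

Both methods return the same answer; the difference is that `scan` touches every meme while `lookup` touches only the matching ids — which is why the indexed query stays at milliseconds as the table grows.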

Caching

When the same data is requested repeatedly, caching stores the result so you don't recompute it. Trade a little memory for a lot of speed.

Without cache

User → App → Database → App → User
↑ repeated 10,000×
for "drake meme" results

Each request pays the full query cost. Popular searches hammer the database.

With cache (Redis)

search:"drake meme" → [4821, 4799, 4755, ...]
template:12 → "templates/drake.jpg"
meme:4821 → { top_text: "Me", ... }

Results stored by key. First request hits the database; the next 9,999 are served from memory.

Select images (such as the most popular templates) can also be stored in memory.
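The "first request hits the database; the rest are served from memory" pattern is a read-through cache. Here is a minimal sketch with a `HashMap` standing in for Redis and a counter standing in for the database; a production cache also needs eviction and expiry (TTLs), which this deliberately omits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal read-through cache: on a miss, fall through to the expensive
// lookup and remember the result; on a hit, answer from memory.
public class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> database;  // stand-in for the real query
    int databaseHits = 0;                   // how often we paid full cost

    ReadThroughCache(Function<K, V> database) { this.database = database; }

    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            databaseHits++;                 // only runs on a cache miss
            return database.apply(k);
        });
    }
}
```

With this in place, 10,000 requests for "drake meme" cost one database query: every request after the first is a hash lookup.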

Handling Traffic When a Meme Goes Viral

Your meme generator handles 100 requests/second on a normal day. Then your distracted boyfriend meme gets posted on Reddit. Suddenly: 100,000 requests/second. How do you not crash?

Without a queue: direct processing

100,000 users

Your server
(built for 100)

💥 crash

Requests pile up, server runs out of memory, everyone gets an error.

With a queue: buffered processing

100,000 users

Message queue
(holds requests)

Workers process
at sustainable pace

Requests are accepted immediately. Workers drain the queue as fast as they can. No crashes, no lost requests.

Common services: AWS SQS, Google Pub/Sub, RabbitMQ. Queues usually provide at-least-once delivery with retries/visibility timeouts; handlers should be idempotent, and poison messages should go to a dead-letter queue.
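The accept-now, process-later shape can be sketched in-process with a `BlockingQueue` standing in for SQS or Pub/Sub. This is only an illustration of the buffering idea — real message queues add durability, at-least-once redelivery, and dead-letter queues, none of which appear here. The `STOP` poison pill is an invented convention for ending the demo cleanly.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Buffered processing: producers enqueue instantly; a worker drains the
// queue at its own sustainable pace.
public class BufferedProcessing {
    static final String STOP = "__stop__";  // poison pill ends the worker

    static int run(int requests) {
        try {
            BlockingQueue<String> queue = new ArrayBlockingQueue<>(requests + 1);
            AtomicInteger processed = new AtomicInteger();

            Thread worker = new Thread(() -> {
                try {
                    String msg;
                    while (!(msg = queue.take()).equals(STOP))
                        processed.incrementAndGet();  // "generate the meme"
                } catch (InterruptedException ignored) {}
            });
            worker.start();

            // The viral burst: every request is accepted immediately.
            for (int i = 0; i < requests; i++) queue.put("request-" + i);
            queue.put(STOP);
            worker.join();
            return processed.get();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Nothing is dropped and nothing crashes: the burst lands in the queue, and the worker catches up when it can.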

Authenticating Users Securely

Your meme generator needs to know who's who. Authentication is deceptively hard to get right — and the consequences of getting it wrong are severe.

What could go wrong?

  • Passwords stored in plain text — one breach exposes everyone
  • Weak session tokens — attackers can impersonate users
  • No rate limiting — brute-force login attacks
  • Forgotten password reset flaws — account takeover
  • No MFA — stolen password = full access

What you actually need

  • Secure password hashing (bcrypt, Argon2)
  • Short-lived, signed session tokens (JWT)
  • Rate limiting on login attempts
  • Secure password reset flows
  • Multi-factor authentication support
  • OAuth ("Sign in with Google")

This is why teams use managed auth services. The attack surface is large, the stakes are high, and the requirements are well-understood. This is exactly the kind of problem best outsourced to specialists.
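To see why even the "easy" item on the list — password hashing — rewards outsourcing, here is a sketch using the JDK's built-in PBKDF2. It is a minimal illustration, not production guidance: the iteration count is an assumption you must tune, and real systems typically use bcrypt or Argon2 via a vetted library, or skip the problem entirely with a managed auth service.

```java
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Salted, slow password hashing with constant-time verification — the
// sharp edges (salt, work factor, timing) that managed services handle.
public class Passwords {
    private static final int ITERATIONS = 210_000;  // assumed; tune to hardware

    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);  // unique per user
        return salt;
    }

    static String hash(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 256);
            byte[] derived = SecretKeyFactory
                .getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
            return Base64.getEncoder().encodeToString(derived);
        } catch (java.security.GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean verify(char[] attempt, byte[] salt, String stored) {
        // MessageDigest.isEqual compares in constant time, resisting
        // timing attacks that a plain equals() would allow.
        return MessageDigest.isEqual(
            Base64.getDecoder().decode(hash(attempt, salt)),
            Base64.getDecoder().decode(stored));
    }
}
```

Even this small sketch juggles three security decisions (salt, work factor, comparison timing) — and it covers only one bullet from the list above.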

Monitoring and Alerting

Your meme generator is running in production. How do you catch problems before they blow up?

Without alerting

  • Users notice before you do
  • You find out via angry tweets
  • You have no idea when it started
  • You don't know how many users are affected
  • You're guessing at the cause

With alerting

  • Monitor key metrics: error rate, latency, queue depth
  • Set thresholds: "page me if error rate exceeds 1%"
  • Get notified before users do
  • Know exactly when the problem started
  • On-call rotation shares the burden

Common services: PagerDuty, Datadog, AWS CloudWatch. The goal is to be proactive — your monitoring catches the problem, not your users.

What you don't get paged about: disk failures, server reboots, OS upgrades, network outages, hardware replacements. The cloud handles these — you only get woken up for problems in your code.
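The "page me if error rate exceeds 1%" rule reduces to computing a metric over a window and comparing it to a threshold. A toy sketch (the class and its batch-of-booleans interface are invented for illustration; CloudWatch and Datadog evaluate this continuously over streamed metrics):

```java
// Threshold alerting in miniature: compute the error rate over a window
// of recent requests and decide whether to page.
public class ErrorRateAlert {
    private final double threshold;  // e.g. 0.01 means "page above 1%"

    ErrorRateAlert(double threshold) { this.threshold = threshold; }

    // outcomes[i] is true if request i failed
    boolean shouldPage(boolean[] outcomes) {
        int errors = 0;
        for (boolean failed : outcomes) if (failed) errors++;
        return outcomes.length > 0
            && (double) errors / outcomes.length > threshold;
    }
}
```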

Observability: Diagnosing Why Memes Fail in Asia

Alerting tells you something is wrong. Observability tells you why — even when the problem only affects some users, in some regions, some of the time.

Logs

A record of what happened

[ERROR] meme:4821 failed to load
region: ap-northeast-1
latency: 8200ms
cause: storage timeout

Metrics

Aggregated measurements over time

error_rate{region="asia"} 12%
error_rate{region="us"} 0.1%
p99_latency{region="asia"} 8s
p99_latency{region="us"} 120ms

Traces

The path of a single request

User request (8200ms total)
├─ Auth check 12ms
├─ Database query 45ms
└─ Image fetch 8100ms ⚠
└─ CDN miss → origin
(wrong region)

Without the trace, you might spend hours guessing. With it, the problem is obvious: images are being fetched from the wrong region. Common services: Datadog, Sentry, AWS CloudWatch.

Who configures this? The platform automatically captures infrastructure logs (server restarts, network errors, scaling events). You add application logs — the lines in your code that record what your meme generator is actually doing.

Offering an API: You Need a Gateway

Your meme generator is a hit. Developers want to build mobile apps, bots, and integrations on top of it. You decide to offer a public API. Every request needs auth, routing, and rate limiting — but you don't have to write any of it yourself.

Client request

API Gateway
├─ Authenticate: is this a valid API key?
├─ Rate limit: has this client exceeded 100 requests/min?
├─ Route: /memes/* → meme service
│ /users/* → user service
│ /templates/* → template service
└─ Forward to your code

Without a gateway

Every service reimplements auth, rate limiting, and routing. Changes must be made everywhere. One misconfiguration exposes your entire backend.

With a gateway

One place to enforce policies. Backend services only see authenticated, routed requests. Add a new service without changing client code.

Common services: AWS API Gateway, Kong, Supabase. Configure policies in one place — your backend code never sees an unauthenticated or over-limit request.
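The gateway's "100 requests/min" rule is commonly implemented as a token bucket per API key. Below is a hedged sketch of that algorithm — time is passed in explicitly so the behavior is deterministic, and the class is an invented illustration, not any gateway's actual implementation (real gateways run this at the edge, before your code is ever invoked).

```java
import java.util.HashMap;
import java.util.Map;

// Token-bucket rate limiting: each key's bucket refills at a steady rate
// and holds at most `capacity` tokens; each request spends one token.
public class RateLimiter {
    private final int capacity;          // max burst, e.g. 100
    private final double refillPerSec;   // e.g. 100.0 / 60 for 100 req/min
    private final Map<String, double[]> buckets = new HashMap<>(); // key -> {tokens, lastSec}

    RateLimiter(int capacity, double refillPerSec) {
        this.capacity = capacity;
        this.refillPerSec = refillPerSec;
    }

    // Returns true if the request from apiKey is allowed at time nowSec.
    boolean allow(String apiKey, double nowSec) {
        double[] b = buckets.computeIfAbsent(apiKey,
            k -> new double[]{capacity, nowSec});
        // Credit tokens for the time elapsed since the last request.
        b[0] = Math.min(capacity, b[0] + (nowSec - b[1]) * refillPerSec);
        b[1] = nowSec;
        if (b[0] >= 1) { b[0] -= 1; return true; }
        return false;  // over limit: gateway returns 429 without calling you
    }
}
```

A bucket with capacity 2 and a 1-token/sec refill allows two immediate requests, rejects a third in the same instant, and allows another one second later — bursts are tolerated up to the capacity, while the sustained rate is capped.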

Summary: Infrastructure Building Blocks

Cloud platforms provide standardized components that solve these recurring problems. Just as we have design patterns in code, these "building blocks" appear across architectural styles.

Databases

Structured data persistence

PostgreSQL, MongoDB, DynamoDB

Object Storage

Files and binary data at scale

S3, Cloud Storage, Supabase Storage

Message Queues

Async communication, buffering

SQS, Pub/Sub, RabbitMQ, pgmq

Caches

Fast access to hot data

Redis, Memcached, Upstash

API Gateways

Unified entry point, auth, routing

AWS API Gateway, Kong

Observability

Logs, metrics, traces

Sentry, Datadog, CloudWatch

Serverless architecture is fundamentally about composing these managed services: you write functions containing business logic; the cloud provider operates the infrastructure.

Your Meme Generator: Composed from Managed Services

You wrote the meme generation logic. Everything else? Managed services you composed together.

The purple box is what you write. Everything else is managed infrastructure — operated by specialists, scaled automatically, billed by usage.

The Cloud Deployment Spectrum

A horizontal spectrum of seven cloud deployment models. Each station shows an 8-layer stack (Facility, Power, Network, Hardware, Virtualization, OS, Runtime, App) with orange layers (you manage) and teal layers (provider manages). LEFT: Own Data Center has all orange layers, sweating developer. Moving RIGHT: progressively more teal layers. RIGHT: FaaS has almost all teal, with developer relaxing on a cloud-shaped hammock holding only a tiny fn() function. Legend shows orange=you, teal=provider.

Not shown: SaaS (Software as a Service) — even further right.

What Is Serverless?

"Serverless" doesn't mean no servers. It means someone else manages them. You write code and compose managed services; the cloud provider handles everything beneath.

Traditional

  • You provision servers
  • You configure the OS and runtime
  • You handle scaling and restarts
  • You pay for idle time
  • You get paged when hardware fails

Serverless

  • Vendor provisions servers
  • Vendor configures OS and runtime
  • Vendor handles scaling and restarts
  • You pay only for execution time
  • Vendor gets paged when hardware fails

Your meme generator is already largely serverless — managed database, object storage, cache, queue, auth, and API gateway. The last piece: your own code. That's where Functions as a Service comes in.

Functions as a Service (FaaS)

Instead of a server running 24/7, you deploy functions that execute in response to events. No main(), no server setup — just your logic.

Meme generation function

public class GenerateMemeHandler
    implements RequestHandler<APIGatewayProxyRequestEvent,
                              APIGatewayProxyResponseEvent> {

  public APIGatewayProxyResponseEvent handleRequest(
      APIGatewayProxyRequestEvent request, Context context) {
    // Parse the JSON body into a request object
    MemeRequest body = parseJson(request.getBody());
    // Fetch the template image from object storage
    byte[] template = storage.get(body.templateId());
    // Render the captions onto the template
    byte[] meme = MemeUtils.addText(
        template, body.topText(), body.bottomText());
    // Persist the image and its metadata, then respond with the URL
    String url = storage.put("memes/", meme);
    db.insert(new Meme(body, url));
    return ok(new MemeResponse(url));
  }
}

Four key properties

① Event-driven Platform calls your function when an event arrives. You don't listen for requests.

② Stateless No state persists between calls. All state lives in external services.

③ Clear contract Request in, response out. Keep side effects explicit (e.g., DB/storage writes) and bounded.

④ Pay per invocation No requests? No cost. 1M requests? Billed for exactly that.

Event-Driven Execution

Serverless functions are triggered by events — not just HTTP requests. This enables reactive architectures where functions respond to changes in the system.

Four event sources (HTTP Request like POST /submissions, File Upload like test files, Database Change like new submission triggers, Schedule like nightly exports) shown as distinct icons at top, with arrows converging on a central serverless function. Multiple copies of the function appear (illustrating auto-scaling). Functions connect to downstream services.

Energy Efficiency Considerations

Serverless architecture has interesting sustainability implications that cut both ways.

Potential Energy Savings

  • No idle power: Monolith runs 24/7 even at 3 AM. Serverless consumes energy only when executing.

  • Shared infrastructure: Cloud providers achieve high utilization across thousands of customers; running hardware at 80% utilization beats running it at 10%.

  • Right-sized execution: Functions get exactly the resources needed (modulo startup overhead).

Potential Energy Costs

  • Cold start overhead: Spinning up new containers has energy costs that warm monoliths avoid.

  • Per-request overhead: Each invocation goes through routing, logging, billing infrastructure.

  • Distributed chattiness: Many small functions calling each other = network energy costs.

The architectural lesson: batch operations when possible. Pawtograder's submitFeedback() sends all test results in one call, not 100 separate calls. This saves latency, cost, AND energy.

When Does Serverless Fit?

Split illustration showing serverless fit. LEFT (green): Image resize (S3 trigger), Welcome emails (database trigger), Webhook handlers (GitHub events), Submissions at deadline (bursty traffic). RIGHT (red): Video encoding (timeout), Multiplayer games (cold starts), In-memory cache (stateless problem), High-frequency trading (sustained load). Center shows the three key questions: scaling, latency, ops.

Bringing It Together: L19 → L20 → L21

Lecture | Question | Key Insight
L19 | How do we organize code? | Architectural styles emerge from quality attribute requirements. Monolith-first is usually right.
L20 | What changes over networks? | The eight fallacies. Every network call can fail, be slow, or be intercepted.
L21 | What if someone else manages infra? | Serverless = technical partitioning with a vendor. Same principles, different operational model.

The thread connecting all three:

Same design principles at every scale:

  • Information hiding (L6)
  • Coupling and cohesion (L7)
  • Hexagonal architecture (L16)
  • Quality attribute tradeoffs (L19)

The practical takeaway:

No single architecture is right for everything. Pawtograder's hybrid approach demonstrates this — serverless API, managed compute for grading, PostgreSQL for domain logic.

Same Questions, Every Scale

At every level — class, module, service, system — you ask the same four questions:

Question | What It Determines
What changes independently? | Where to draw boundaries
Who needs to know? | What the interface should hide
What can fail? | How explicit your error handling must be
What are you trading? | Whether the tradeoff is worth it

Lectures | Level | Focus
L6-L7 | Classes & methods | Private fields, cohesive modules
L16-L18 | Services & boundaries | Ports, adapters, APIs
L20 | Network boundaries | Fallacies, failures, security
L21 | Vendor boundaries | Managed infra, tradeoffs

The Architect's Toolkit

You now have a framework for approaching any system:

When you see a boundary, ask:

  • What's hiding behind it?
  • Who owns each side?
  • What happens when communication fails?

When you're drawing a boundary, ask:

  • What changes independently?
  • Who needs to know about what?
  • Is this a one-way door or two-way door?

When evaluating an architecture, ask:

  • What quality attributes drove these choices?
  • What tradeoffs were accepted?
  • What would break if requirements changed?

When choosing complexity, ask:

  • Do I have a specific problem this solves?
  • Can I start simpler and evolve?
  • What's the cost of being wrong?

The principles scale. The details change. The questions stay the same.

What's Next: Teams and Collaboration

We've been implicitly assuming a single developer making all decisions. Real software is built by teams — and team structure has a big impact on how software gets built.

L22: Teams and Collaboration

  • How teams organize, communicate, coordinate
  • Why org structure shapes system structure
  • Architectural boundaries often become team boundaries
  • Strategies for effective collaboration

The connection:

Today we saw serverless as outsourcing infrastructure to a specialist vendor — your team focuses on domain logic, they focus on infra.

That's an organizational decision as much as a technical one.

Bonus Slide

Two-part meme with Fred from Scooby Doo unmasking a ghost.
Part 1: 'Let's see who you really are...', reaching for a sheet-wearing person labeled 'Serverless Computing'.
Part 2: The removed mask reveals 'Servers'.