Chapter 13: Security
A planetary-scale computer is a vast attack surface. Our security service provides token-based authentication for the system's dashboard, demonstrating the principles of authentication, authorization, and token management that protect real distributed systems.
Interface
security/src/lib.rs
The security service exposes four procedures. CREATE_TOKEN generates a new
token with a name and permission set. VALIDATE_TOKEN checks whether a
token is valid and returns the associated identity. REVOKE_TOKEN invalidates
a compromised token. LIST_TOKENS enumerates all active tokens for the
dashboard.
pub const CREATE_TOKEN_PROCEDURE: ProcedureId = 601;
pub const VALIDATE_TOKEN_PROCEDURE: ProcedureId = 602;
pub const REVOKE_TOKEN_PROCEDURE: ProcedureId = 603;
pub const LIST_TOKENS_PROCEDURE: ProcedureId = 604;
#[derive(Debug, Serializable, Deserializable)]
pub struct CreateTokenArgs {
    pub name: String,
    pub permissions: String,
}

#[derive(Debug, Serializable, Deserializable)]
pub struct ValidateTokenResult {
    pub valid: i32,
    pub name: String,
    pub permissions: String,
}
Token-Based Authentication
security/src/main.rs
The security service manages authentication tokens — opaque strings that
grant access to protected operations. Each token has a name (identifying
who it belongs to), a set of permissions, and a creation timestamp. The
frontend checks for an auth_token cookie on every sensitive dashboard
operation (POST requests that modify state) by calling the security
service's VALIDATE_TOKEN procedure.
struct TokenEntry {
    name: String,
    token: String,
    permissions: String,
    created_at: u64,
}

struct SecurityState {
    tokens: HashMap<String, TokenEntry>,
    rng_state: u64, // xorshift64 seed
}
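As a rough illustration of how the validate, revoke, and list procedures could map onto a token map like this, here is a minimal sketch. The `TokenStore` type and its method names are hypothetical, the RPC and serialization layers are omitted, and the result struct is redeclared locally so the example stands alone; the real service may be organized differently.

```rust
use std::collections::HashMap;

// Local copies of the structs described in the text, so the sketch compiles alone.
struct TokenEntry {
    name: String,
    token: String,
    permissions: String,
    created_at: u64,
}

struct ValidateTokenResult {
    valid: i32,
    name: String,
    permissions: String,
}

// Hypothetical wrapper around the token map (not the service's actual type).
struct TokenStore {
    tokens: HashMap<String, TokenEntry>,
}

impl TokenStore {
    fn new() -> Self {
        TokenStore { tokens: HashMap::new() }
    }

    // Store an entry keyed by its token string.
    fn insert(&mut self, entry: TokenEntry) {
        self.tokens.insert(entry.token.clone(), entry);
    }

    // Look the token up; an unknown token yields valid == 0 and empty fields.
    fn validate(&self, token: &str) -> ValidateTokenResult {
        match self.tokens.get(token) {
            Some(e) => ValidateTokenResult {
                valid: 1,
                name: e.name.clone(),
                permissions: e.permissions.clone(),
            },
            None => ValidateTokenResult {
                valid: 0,
                name: String::new(),
                permissions: String::new(),
            },
        }
    }

    // Remove the token; returns true only if it existed.
    fn revoke(&mut self, token: &str) -> bool {
        self.tokens.remove(token).is_some()
    }

    // Names of all active tokens, sorted for stable display.
    fn list(&self) -> Vec<&str> {
        let mut names: Vec<&str> = self.tokens.values().map(|e| e.name.as_str()).collect();
        names.sort();
        names
    }
}
```

Keying the map by the token string makes validation a single hash lookup, which matters because the frontend validates on every protected request.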
Token Generation: PRNGs in Distributed Systems
security/src/main.rs
Token generation raises an interesting question: where does randomness come
from in a distributed system? Hardware random number generators are slow.
Cryptographically secure PRNGs provided by the operating system (for example,
via /dev/urandom) are better but still involve system calls. For our
educational implementation, we use xorshift64, a fast, self-contained
pseudorandom number generator:
fn xorshift64(&mut self) -> u64 {
    // The state must be nonzero: zero is a fixed point of the xorshift step.
    let mut x = self.rng_state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    self.rng_state = x;
    x
}
The xorshift64 generator produces uniformly distributed nonzero 64-bit values with a period of 2^64 − 1, provided the seed is nonzero. We concatenate two outputs to form a 128-bit hex token. In a production system, you would use a cryptographically secure PRNG: xorshift64 returns its entire internal state as each output, so anyone who sees one token can compute every later one. Still, xorshift64 demonstrates the core concept: deterministic functions that produce seemingly random output from a seed value. The seed is derived from the system clock at startup, making each instance's token stream unique.
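The concatenation step can be sketched as follows. This is an illustrative sketch, not the service's actual code: the `Xorshift64` wrapper, the `generate_token` name, and the fallback constant for a zero seed are all assumptions made for the example.

```rust
// Hypothetical standalone wrapper around the xorshift64 step shown above.
struct Xorshift64 {
    state: u64,
}

impl Xorshift64 {
    fn new(seed: u64) -> Self {
        // Zero is a fixed point of xorshift, so swap in an arbitrary
        // nonzero constant (an assumption of this sketch) if the seed is 0.
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        Xorshift64 { state }
    }

    // One xorshift64 step with the 13/7/17 shift triple.
    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }

    // Two 64-bit outputs, each zero-padded to 16 hex digits: 32 characters total.
    fn generate_token(&mut self) -> String {
        let hi = self.next_u64();
        let lo = self.next_u64();
        format!("{:016x}{:016x}", hi, lo)
    }
}
```

The zero-padded format matters: without it, an output with leading zero bits would yield a shorter token, and tokens would no longer have a fixed length.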
The Bootstrap Problem
security/src/main.rs — admin token seeding at startup
Token-based auth creates a chicken-and-egg problem: how do you create the
first token if all mutations require a valid token? An early approach was a
bootstrap exception — leaving one route unauthenticated — but that
creates an attack surface on any public deployment. Instead, we seed the
admin token from an environment variable at startup:
let mut initial_state = SecurityState::new();
if let Ok(token) = std::env::var("ADMIN_TOKEN") {
    if !token.is_empty() {
        initial_state.tokens.insert(token.clone(), TokenEntry {
            name: "admin".to_string(),
            token,
            permissions: "admin".to_string(),
            created_at,
        });
    }
}
The startup script (start.sh) generates a random token with
openssl rand -hex 16, exports it as ADMIN_TOKEN, and
prints it for the operator. The operator sets the cookie via browser console
(document.cookie = "auth_token=...;path=/") and can then use the
dashboard to create additional tokens. Every POST route — including
token creation — requires a valid auth_token cookie.
This pattern is standard in real systems. Kubernetes creates a bootstrap token during cluster initialization. Cloud providers use IAM root credentials seeded at provisioning time. The key insight is that the bootstrap secret lives in the deployment environment, not in an unauthenticated HTTP endpoint.
Authorization Middleware
frontend/src/main.rs
The frontend implements authorization as middleware — a function that runs
before each protected route handler:
async fn require_admin(headers: &str) -> bool {
    if let Some(token) = parse_cookie(headers, "auth_token") {
        let result = security::validate_token(SECURITY_ADDR, token).await;
        // Only token validity is checked here; result.permissions is not inspected.
        return result.valid == 1;
    }
    false
}
This pattern — extracting credentials from the request, validating them against a central authority, and gating access based on the result — is the same pattern used by API gateways, service meshes, and web frameworks at planetary scale. The security dashboard lets you create tokens, view active tokens, and revoke compromised ones.
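The credential-extraction step relies on a `parse_cookie` helper that the excerpt does not show. One possible shape for it is sketched below, assuming `headers` is the raw header block of the request with one `Name: value` header per line; the signature and behavior are guesses, not the frontend's actual implementation.

```rust
// Hypothetical cookie lookup: returns the value of cookie `name`, if present.
fn parse_cookie<'a>(headers: &'a str, name: &str) -> Option<&'a str> {
    for line in headers.lines() {
        // Split "Header-Name: value"; lines without a colon are skipped.
        if let Some((key, value)) = line.split_once(':') {
            // Header names are case-insensitive.
            if !key.trim().eq_ignore_ascii_case("cookie") {
                continue;
            }
            // A Cookie header holds "k1=v1; k2=v2" pairs.
            for pair in value.split(';') {
                if let Some((k, v)) = pair.trim().split_once('=') {
                    if k == name {
                        return Some(v);
                    }
                }
            }
        }
    }
    None
}
```

Returning `None` for a missing header or missing cookie lets the caller fall through to its default of denying access.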
Integrity
Authentication controls who can act, but it does not protect against volume. A valid user can still overwhelm a system with requests, and an attacker does not need credentials to consume resources. Integrity is the practice of ensuring a system remains functional under hostile conditions — rate limiting, IP blackholing, and defense in depth.
Rate Limiting
loadbalancer/src/main.rs — TokenBucket struct
The load balancer implements per-IP rate limiting using a
token bucket algorithm.
Each IP address gets a bucket that holds up to 30 tokens (the burst capacity)
and refills at 2 tokens per second (sustaining ~120 requests per minute).
Every request consumes one token. When the bucket is empty, the request is
rejected with 429 Too Many Requests:
fn try_consume(&mut self) -> bool {
    let now = Instant::now();
    let elapsed = now.duration_since(self.last_refill).as_secs_f64();
    self.tokens = (self.tokens + elapsed * RATE_REFILL).min(RATE_CAPACITY);
    self.last_refill = now;
    if self.tokens >= 1.0 {
        self.tokens -= 1.0;
        true
    } else {
        false
    }
}
Token buckets are elegant because they allow bursts (a user loading a page fetches several resources at once) while enforcing a sustained rate. The alternative — fixed-window counters — creates boundary problems where a client can send twice the limit by timing requests across the window edge.
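To see the burst-then-sustain behavior concretely, here is a self-contained sketch of the bucket. It takes the current time as a parameter (`try_consume_at`, a hypothetical variant of `try_consume`) so the behavior can be exercised deterministically. The constants follow the values quoted above; the struct layout and the choice to start each bucket full are assumptions.

```rust
use std::time::Instant;

const RATE_CAPACITY: f64 = 30.0; // burst capacity, in tokens
const RATE_REFILL: f64 = 2.0; // tokens added per second

struct TokenBucket {
    tokens: f64,
    last_refill: Instant,
}

impl TokenBucket {
    // Assumption: a new client starts with a full bucket.
    fn new(now: Instant) -> Self {
        TokenBucket { tokens: RATE_CAPACITY, last_refill: now }
    }

    // Same logic as try_consume, but with the clock injected by the caller.
    fn try_consume_at(&mut self, now: Instant) -> bool {
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        // Refill in proportion to elapsed time, capped at the burst capacity.
        self.tokens = (self.tokens + elapsed * RATE_REFILL).min(RATE_CAPACITY);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With these constants, a fresh bucket admits 30 back-to-back requests and rejects the 31st; one second later, exactly two more requests are admitted.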
IP Blackholing
loadbalancer/src/main.rs — record_violation()
Rate limiting alone is not sufficient. An attacker who is repeatedly rejected
still consumes CPU cycles for each rejection. The load balancer escalates
automatically: if an IP accumulates 10 rejections within 60 seconds, it is
blackholed (banned for 5 minutes). Blackholed IPs immediately receive a 429
response with a Retry-After: 300 header, before the request body is even read:
fn record_violation(&mut self, ip: IpAddr) {
    let now = Instant::now();
    let (count, window_start) = self.violations
        .entry(ip).or_insert((0, now));
    if now.duration_since(*window_start) > BLACKHOLE_WINDOW {
        *count = 0;
        *window_start = now;
    }
    *count += 1;
    if *count >= BLACKHOLE_THRESHOLD {
        self.blacklist.insert(ip, now + BLACKHOLE_DURATION);
        self.violations.remove(&ip);
    }
}
A background task sweeps all three maps (rate limit buckets, blacklist entries, violation counters) every 60 seconds to remove expired state. This prevents unbounded memory growth in long-running deployments.
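A sweep of this kind might look like the sketch below. The `Limiter` struct, the `sweep` and `is_blackholed` names, and the 60-second idle cutoff for buckets are assumptions for illustration. The bucket map is simplified to store only a last-seen timestamp, and the current time is passed in so the logic can be tested without waiting.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

const BLACKHOLE_WINDOW: Duration = Duration::from_secs(60);
// Hypothetical: drop a bucket once its owner has been idle this long.
const BUCKET_IDLE: Duration = Duration::from_secs(60);

// Hypothetical container for the three maps described in the text.
struct Limiter {
    buckets: HashMap<IpAddr, Instant>,           // ip -> last request time (simplified)
    violations: HashMap<IpAddr, (u32, Instant)>, // ip -> (count, window start)
    blacklist: HashMap<IpAddr, Instant>,         // ip -> ban expiry
}

impl Limiter {
    // An IP is blackholed while the current time is before its ban expiry.
    fn is_blackholed(&self, ip: IpAddr, now: Instant) -> bool {
        self.blacklist.get(&ip).map_or(false, |&until| now < until)
    }

    // Drop every entry that can no longer affect a decision.
    fn sweep(&mut self, now: Instant) {
        self.blacklist.retain(|_, until| now < *until);
        self.violations
            .retain(|_, (_, start)| now.duration_since(*start) <= BLACKHOLE_WINDOW);
        self.buckets
            .retain(|_, last| now.duration_since(*last) <= BUCKET_IDLE);
    }
}
```

Because `is_blackholed` compares against the expiry time itself, a ban ends on schedule even if the sweep has not yet run; the sweep only reclaims memory.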
Defense in Depth
These protections layer together. The
load balancer is the first
line of defense: rate limiting and blackholing happen before requests reach any
backend service. The frontend is the second line: every dashboard mutation
requires a valid auth_token cookie validated against the security service.
The monitoring service provides
visibility — the load balancer reports rate_limited and
blackholed metrics so operators can detect attacks in real time.
Introspection endpoints (/__lb_status and /__lb_strategy)
are restricted to loopback addresses, preventing external users from reading
backend topology or changing the load balancing strategy. All backend services
bind to 127.0.0.1 and are unreachable from the internet — only the
load balancer listens on 0.0.0.0. This is the same architecture used
by production reverse proxies: a single hardened entry point funneling traffic
to internal services.