Chapter 30: Servers

A server is the basic unit of compute in a planetary-scale computer. Modern servers pack enormous capability into a compact form factor: dozens of CPU cores, hundreds of gigabytes of memory, terabytes of storage, and network links capable of hundreds of gigabits per second. Understanding server architecture is important for sizing workloads and for reasoning about their performance characteristics.

Servers in a data center are typically mounted in standard 19-inch racks. A single rack might hold 40 or more servers, along with network switches, power distribution, and cable management. The density of a rack, in both compute power and power consumption, is a key constraint on data center design: a rack can run out of power or cooling capacity before it runs out of physical space.
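The interaction between physical space and power can be made concrete with a little arithmetic. The following is a minimal sketch, not a sizing tool: the function name and every figure in it (a 42U rack, 2U reserved for switches and power distribution, 1U servers drawing 400 W, a 12 kW rack power budget) are illustrative assumptions rather than numbers taken from any particular data center.

```python
def servers_per_rack(rack_units, reserved_units, server_units,
                     rack_power_watts, server_power_watts):
    """Return how many servers fit in a rack under space and power limits.

    All parameters are hypothetical inputs chosen by the caller.
    """
    # Space limit: rack units left after switches, PDUs, and cabling.
    by_space = (rack_units - reserved_units) // server_units
    # Power limit: how many servers the rack's power budget can feed.
    by_power = rack_power_watts // server_power_watts
    # The tighter of the two limits determines the usable density.
    return min(by_space, by_power)


# Assumed example: 42U rack, 2U reserved, 1U servers at 400 W, 12 kW budget.
count = servers_per_rack(42, 2, 1, 12_000, 400)
print(count)  # 30: power, not space, is the binding constraint here
```

Under these assumed numbers the rack has room for 40 servers but power for only 30, which is one way rack density ends up constraining data center design.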

Server selection involves balancing many dimensions: CPU cores versus clock speed, memory capacity versus bandwidth, storage capacity versus IOPS, and network bandwidth versus latency. Different workloads have different profiles: a caching service needs a lot of memory and a fast network but little storage; a storage service needs large disks and durable writes but fewer CPU cores; a compute-intensive service needs many fast cores.
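One way to see how a workload profile drives server selection is to compare a workload's aggregate demand against a single server's capacity, dimension by dimension, and find the dimension that forces the largest server count. The sketch below does only that. The `Resources` type, the `bottleneck` function, and all of the numbers are hypothetical, and it ignores clock speed, memory bandwidth, IOPS, and latency, which a real sizing exercise would also need to consider.

```python
import math
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Resources:
    """Capacity of one server, or total demand of one workload (illustrative)."""
    cores: int
    memory_gb: int
    storage_tb: int
    network_gbps: int


def bottleneck(demand, server):
    """Return (dimension, servers_needed) for the most constrained dimension.

    Assumes every capacity field on `server` is greater than zero.
    """
    # Ratio of total demand to per-server capacity for each dimension.
    ratios = {
        f.name: getattr(demand, f.name) / getattr(server, f.name)
        for f in fields(Resources)
    }
    # The largest ratio decides how many servers the workload needs.
    name = max(ratios, key=ratios.get)
    return name, math.ceil(ratios[name])


# Assumed server shape and workload demands, for illustration only.
general_server = Resources(cores=64, memory_gb=512, storage_tb=8, network_gbps=100)
cache_demand = Resources(cores=64, memory_gb=8192, storage_tb=4, network_gbps=800)
storage_demand = Resources(cores=32, memory_gb=256, storage_tb=400, network_gbps=100)

print(bottleneck(cache_demand, general_server))    # ('memory_gb', 16)
print(bottleneck(storage_demand, general_server))  # ('storage_tb', 50)
```

With these assumed figures the caching workload is bound by memory and the storage workload by disk capacity, so a single general-purpose server shape leaves most of its other resources idle in both cases. That wasted capacity is the usual argument for offering several server shapes matched to workload profiles.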

The trend toward heterogeneous compute, in which GPUs, FPGAs, and custom accelerators sit alongside CPUs, adds another dimension to server selection. Workloads like machine learning training benefit enormously from GPU acceleration, while cryptographic operations can be offloaded to dedicated hardware.