
High Performance Computing

A 20-Year White-Box Research Archive

z → z² + c

From Fractal Clusters to AI Workstations

Archive source: tiggerfox.livejournal.com (2004–2026)
19 system designs • 20 years • 5 tiers • $291 lowest build • $10.5M highest build • 4.5 PFLOPS (2026 rack)

2026 Gold Standard Builds

Five tier builds priced from new retail hardware as of April 2026. The complete analysis covers GPU/CPU landscapes, the DRAM crisis, TOP500 comparisons, and historical trend data across every tier.

T1 Budget: $1,439 (6C | 32GB | 17 TFLOPS)
T2 Gaming: $4,712 (16C | 64GB | 56.3 TFLOPS)
T3 Scientific: $24K–$114K (128C | 768GB–2TB | 875 TFLOPS)
T4 Rack (4U): $105K (192C | 1.5TB | 500 TFLOPS)
T5 40U Rack: $971K (1,728C | 13.5TB | 4.5 PFLOPS)
Market context: DRAM crisis (DDR5 prices 5x their 2024 level), GPU supply constrained; all prices are April 2026 new retail.
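The tier pricing above implies a perf-per-dollar curve. A quick back-of-envelope derivation from the quoted figures (T3 is omitted because it is priced as a range; rounding is mine):

```python
# $/TFLOP for the fixed-price tiers, derived from the quoted price and
# throughput figures above. These are illustrative divisions, not
# numbers stated in the archive itself.
tiers = {
    "T1 Budget":  (1_439, 17.0),
    "T2 Gaming":  (4_712, 56.3),
    "T4 Rack 4U": (105_000, 500.0),
    "T5 40U":     (971_000, 4_500.0),  # 4.5 PFLOPS = 4,500 TFLOPS
}
for name, (usd, tflops) in tiers.items():
    print(f"{name}: ${usd / tflops:,.0f}/TFLOP")
```

The consumer tiers land near $85/TFLOP while the rack tiers pay roughly 2.5x that for density, which is the perf/$/U trade the T4 brief names explicitly.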
[Chart: RAM as % of build cost, 2024 vs 2026 (DRAM crisis)]

The Five Tiers

Each system design answers a different question. The archive organizes around five tiers, each with its own economics and its own definition of "best."

T1 Budget Build (~$1K)

"Lowest possible cost, highest possible performance — what can you make on ~$1K?"

T2 Gaming / Workstation

"High-end gaming desktop — best prosumer build?"

T3 Scientific Workstation

"Very high performance for real scientific workloads — no compromise on compute"

T4 Rack-Optimized (4U)

"Lowest cost, highest performance in a standard 4U rack space — perf/$/U"

T5 Cutting Edge (40U)

"Best performance per 40U rack — price no object"

$/Core by Tier (2006–2026)

The fundamental metric, color-coded by tier. From $73 (T1 Budget King, 2009) to $5,087 (T3 Storage Monster, 2011) — the spread reveals purpose more than progress.
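The metric itself is a single division. As a sketch, the archive's low end checks out if we assume the X4 9600's four cores (the quad-core count is my assumption, not restated in the chart):

```python
# $/core is system price over core count.
def dollars_per_core(price_usd: float, cores: int) -> float:
    return price_usd / cores

# The $291 Budget King on a quad-core AMD X4 9600 lands at the
# chart's ~$73/core low end.
print(f"${dollars_per_core(291, 4):.2f}/core")
```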

Total System Cost (Log Scale)

From $291 to $10.5 million — spanning five orders of magnitude. Each build answers a different question about what compute can do.

Core Count Over Time (Log Scale)

The multi-core trajectory: 6 threads in 2006, 224 cores by 2018. The dream of 32 cores from off-the-shelf parts was realized in March 2008.

RAM Over Time (Log Scale, GB)

From 4GB to 60TB. The render farm's 60,000GB of RAM in 2011 remains the peak — purpose-built for a specific class of problem.

TFLOPS and $/TFLOP

The GPU compute era: measurable floating point throughput changes the economics entirely. The 2026 AI workstation achieves 637.7 TFLOPS at $196/TFLOP.

[Charts: TFLOPS (log scale) and $/TFLOP]

Fractal Origins (2004–2006)

The journal title is the Mandelbrot set formula: z → z² + c. This is where the interest in high performance computing begins — not with server racks or data centers, but with a need to render fractals at impossible resolutions.
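The journal-title formula is the standard escape-time iteration. A minimal sketch (resolution and iteration cap here are illustrative, not the archive's 896-bit render settings):

```python
# Escape-time iteration for z -> z^2 + c: count how many steps it
# takes z to escape |z| > 2, starting from z = 0.
def mandelbrot_iterations(c: complex, max_iter: int = 100) -> int:
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return n
    return max_iter  # never escaped: treated as inside the set

print(mandelbrot_iterations(0j))      # interior point, never escapes
print(mandelbrot_iterations(1 + 1j))  # exterior point, escapes fast
```

At poster resolutions every pixel is one such loop at arbitrary precision, which is why a single image could cost 132 billion iterations and 63 hours.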

"132 billion iterations, 63 hours compute, 896-bit precision" LJ #4027, March 2006

By November 2006, the operation scaled to 7.1 gigapixel composites tiled at 2400 DPI for 30"x40" poster prints. Six render threads across four Windows boxes running Ultra Fractal network rendering. The target: 10 nodes at $250 each.

"Math is beautiful... lost in a sea of numbers" LJ #4554, March 2006

The fractal posts reference still earlier systems: a 486 with a 387 math coprocessor, Amiga machines. The need for compute did not arrive with modern hardware; it was always there, waiting for the silicon to catch up to the mathematics.

The Multi-Core Sprint (2008–2009)

March 2008 produced five system designs in two days — a systematic exploration of the price/performance landscape from $1,121 to $43,361. The central obsession: $/core as the unit of measure.

"I can finally build a 32 core system from off the shelf components... been wanting to do this since 1997" LJ #46901, March 19, 2008

The January 2009 benchmark analysis formalized the methodology: 31 CPUs compared on Whetstone per dollar. The winner — AMD X4 9600 at 183.86 W/T$ — went straight into the $291 Budget King build five months later.
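The methodology reduces to a perf-per-dollar sort. A sketch of that ranking (the CPU names and figures below are placeholders, not the archive's 31-CPU dataset; only the winning 183.86 Whetstone-per-dollar ratio is sourced):

```python
# Rank CPUs by Whetstone score per dollar, best first, as in the
# January 2009 benchmark methodology. Sample data is hypothetical.
def rank_by_perf_per_dollar(cpus):
    """cpus: iterable of (name, whetstone_score, price_usd)."""
    return sorted(
        ((name, score / price) for name, score, price in cpus),
        key=lambda pair: pair[1],
        reverse=True,
    )

sample = [
    ("CPU A (hypothetical)", 9_000, 60),
    ("CPU B (hypothetical)", 20_000, 250),
]
for name, ratio in rank_by_perf_per_dollar(sample):
    print(f"{name}: {ratio:.2f} Whetstone/$")
```

Note how the cheaper, slower part wins: the metric deliberately rewards value over absolute throughput, which is how a budget quad-core tops a 31-CPU field.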

"only looking at floating point performance as it is what I am most concerned with" LJ #67441, January 16, 2009

On the same day as the Budget King, the $3,630 3-Way SLI appeared — the first GPU-forward build. The comment thread reveals it was aspirational:

"I won't actually purchase this... I just play Starcraft" LJ #76703, May 29, 2009

GPU Compute & Scale (2011)

2011 marks the transition from CPU-centric to GPU-centric thinking. The $244,155 Storage Monster put 82% of its budget into a Violin 3200 10TB all-flash array ($200K). The remaining 18% bought 48 cores and 512GB RAM with a Quadro Plex 2200 S4.

The $10.5M Render Farm is the apex — 42 racks of Mercury GPU400-TX01 units delivering 9,139 TFLOPS with 4.8 million stream processors. At 1,260kW power draw, this is compute at datacenter scale, designed from commodity GPU accelerators.

VM Economics & Moore's Law (2018)

The 2018 build brought the analysis full circle: 224 cores, 12TB RAM, 8 Tesla V100s. But the real contribution was the VM economics analysis — $2,776 per VM at 224 VMs, or $1,388 per VM at 448 VMs running on threads.
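The VM figures are a straight amortization of system cost over VM count; the implied ~$621,824 total below is my back-calculation from the two quoted per-VM numbers, not a price stated in the post:

```python
# Per-VM cost halves when one VM runs per thread instead of per core,
# because the same system cost is spread over twice as many VMs.
total_cost = 224 * 2_776   # implied total; equals 448 * 1_388
for vms in (224, 448):
    print(f"{vms} VMs -> ${total_cost // vms:,}/VM")
```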

"Intel 4004 (2,250 transistors, 10µm) to modern (100B/mm²) = 533,333x in 47 years" LJ #94260, April 2, 2018

The post explores 3D chip stacking as the next frontier, using a SimCity to Cities:Skylines analogy for the leap from planar to volumetric silicon. And the first mention of quantum: "50 qubits could outperform even the world's most powerful supercomputers."
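The quoted 533,333x growth over 47 years also implies a doubling cadence; the cadence figure below is my derivation, not a number from the post:

```python
# Exponential growth of 533,333x over 47 years implies one doubling
# roughly every 2.5 years, close to the classic Moore's-law cadence.
import math

growth, years = 533_333, 47
doublings = math.log2(growth)  # ~19 doublings
print(f"{years / doublings:.2f} years per doubling")
```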

The AI Era (2026)

The most recent build — $125,087 — is the first designed explicitly for AI workloads. 96 cores, 2TB RAM, 672GB VRAM across 7 NVIDIA RTX PRO 6000 cards. The metric shifts from $/core to $/TFLOP: $196.15.

The post benchmarks this against the NVIDIA GB200 NVL72: $3M for 1,000 PFLOPS at $3/TFLOP. The white-box approach still works — it just operates at a different point on the curve, trading peak throughput for accessibility.
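Both $/TFLOP figures follow from simple division over the quoted prices and throughputs:

```python
# $/TFLOP for the 2026 white-box AI workstation versus the quoted
# GB200 NVL72 reference point ($3M for 1,000 PFLOPS).
whitebox = 125_087 / 637.7          # ~$196.15/TFLOP
gb200    = 3_000_000 / 1_000_000    # $3.00/TFLOP
print(f"white-box: ${whitebox:.2f}/TFLOP")
print(f"GB200:     ${gb200:.2f}/TFLOP")
```

A ~65x gap in $/TFLOP, but at a ~24x lower entry price than a single NVL72: the "different point on the curve" in concrete terms.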

Complete System Archive

13 system designs from 2006 to 2026. Each linked page contains the full component list, pricing, metrics, and the author's own words from the original LiveJournal post.

Methodology

These are the author's own words about the purpose and approach of these system designs, drawn from the original LiveJournal posts spanning 2006 to 2026.

"These represent buildable systems rather than aspirational machines" — design philosophy, consistent across all builds
"I use these posts as price-tracking references for future comparisons" LJ #46901
"I've been interested in multi-processor systems since 1997" LJ #46901
"only looking at floating point performance as it is what I am most concerned with" LJ #67441
"These numbers really don't mean that much but is meant as a guideline to determine a starting point when building a system" LJ #67441

Core Principles

  • White-box — off-the-shelf commodity parts, no vendor markup
  • $/core as the primary metric for CPU-era builds
  • $/TFLOP as the primary metric once GPU compute arrives
  • Floating point performance is the benchmark that matters
  • Buildable systems — every part is a real SKU at a real price
  • Price tracking — posts serve as snapshots for longitudinal comparison

Primary Sources

All system designs, quotes, and pricing data are sourced from the LiveJournal archive. The journal title — "theory of general chaos z→z²+c" — is the Mandelbrot set formula that started the HPC interest through fractal rendering.

tiggerfox.livejournal.com • August 2004 – March 2026 • 19 system designs (13 historical + 6 gold standard) • 20-year span