# Artifact Repository for “A Quantitative Cache Evaluation of Select PolyBench Kernels”
This repository contains the full submission for the course “Arquitetura de Computadores III” (Instituto de Ciências Exatas e Informática, Pontifícia Universidade Católica de Minas Gerais), 2026/1, Prof. Matheus Alcântara Souza.
It packages a complete, reproducible pipeline to study cache hierarchy sensitivities on a selected set of PolyBench kernels using the gem5 simulator. The pipeline is expressed in a single Zig build graph that: checks host prerequisites; pins and bootstraps Python via uv; initializes and builds the vendored gem5 submodule; compiles statically linked workloads; runs an exhaustive, parameterized simulation sweep; generates figures; and builds the final LaTeX report.
Target platform: Linux x86_64 only. While gem5 itself is portable, parts of the automation (uv/Python setup and LaTeX/report build) are written for Linux.
## Full Demonstration
## What’s Here
- `build.zig`: End-to-end build graph (Zig 0.15.2)
- `cache_config.py`: gem5 SE-mode system and cache configuration (CLI-tunable)
- `run_all_simulations.py`: Orchestrates the full parameter sweep, with parallel workers and idempotent resumption
- `visualize_results.py`: Turns `results/` into publication figures under `figures/`
- `analyze.zig`: Small helper to inspect/aggregate gem5 stats (optional)
- `workloads/`: C sources for PolyBench kernels and small microbenchmarks
  - `atax.c`, `floyd-warshall.c`, `gemm.c`, `jacobi-2d.c`, `seidel-2d.c` (PolyBench v4.2.1 kernels)
  - `array_stride.c`, `matrix_multiply.c`, `random_access.c` (handwritten)
  - `polybench.c` (shared runtime)
- `include/polybench.h`: PolyBench configuration header
- `report/`: IEEEtran paper sources; `report/main.tex` is the manuscript
- `gem5/`: gem5 submodule (initialized by the build)
## Who Made This
See `AUTHORS` for the full list and contact emails. If you use these artifacts, please cite the work described in `report/main.tex`. A machine-readable `CITATION.cff` is provided.
## Summary Of The Experiment
- Workloads (PolyBench v4.2.1): atax, floyd-warshall, gemm, jacobi-2d, seidel-2d
- Core model: X86TimingSimpleCPU (in-order), 4 GHz; memory mode: timing; 8 GiB address space
- Cache hierarchy: private L1I/L1D, shared L2, shared L3 (see cache_config.py)
- Parameter sweep per workload (31 configs):
  - 11 realistic multi-level size presets (e.g., baseline i7-6700K; Ryzen; Apple; server)
  - cache line size ∈ {32, 64, 128, 256} bytes
  - associativity sweep for L1, L2, and L3 ∈ {1, 2, 4, 8, 16} (one level varied at a time)
- Total runs: 5 workloads × 31 configurations = 155 simulations
- Dataset sizes per kernel were chosen to balance fidelity vs. run time (see table below)
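The overall shape of the sweep above can be sketched in a few lines of Python. The preset names below are hypothetical placeholders; the authoritative lists (and the exact bookkeeping that yields 31 configurations, versus the 30 named variants this simplification produces) live in `run_all_simulations.py`:

```python
from itertools import product

# Hypothetical variant names -- only the shape of the sweep is shown here.
SIZE_PRESETS = [f"size_preset_{i}" for i in range(11)]   # 11 realistic presets
LINE_SIZES = [32, 64, 128, 256]                          # bytes
ASSOC_WAYS = [1, 2, 4, 8, 16]                            # one level varied at a time
CACHE_LEVELS = ["l1", "l2", "l3"]

configs = []
configs += [("size_preset", p) for p in SIZE_PRESETS]
configs += [("cache_line_size", s) for s in LINE_SIZES]
configs += [(f"{lvl}_assoc", w) for lvl, w in product(CACHE_LEVELS, ASSOC_WAYS)]

# 11 + 4 + 15 = 30 variants in this sketch; the real runner arrives at 31
# configurations per workload (its handling of baseline/default values
# differs slightly from this simplification).
print(f"{len(configs)} variants per workload, "
      f"{5 * len(configs)} runs over 5 workloads in this sketch")
```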
Dataset choices compiled into the real-run workloads:
| Kernel | Dataset | Notes/Dimensions (from report) |
|---|---|---|
| atax | LARGE | M=1900, N=2100 |
| gemm | MEDIUM | NI=200, NJ=220, NK=240 |
| floyd-warshall | SMALL | N=180 |
| jacobi-2d | MEDIUM | N=250, T=100 |
| seidel-2d | MEDIUM | N=400, T=100 |
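PolyBench fixes problem sizes at compile time through `-D<SIZE>_DATASET` preprocessor macros. A minimal sketch of how the table above maps to compiler flags (the `dataset_flag` helper is illustrative, not part of the repository):

```python
# Per-kernel dataset choices, copied from the table above.
DATASETS = {
    "atax": "LARGE",
    "gemm": "MEDIUM",
    "floyd-warshall": "SMALL",
    "jacobi-2d": "MEDIUM",
    "seidel-2d": "MEDIUM",
}

def dataset_flag(kernel: str) -> str:
    """Return the PolyBench preprocessor flag for a kernel, e.g. -DLARGE_DATASET."""
    return f"-D{DATASETS[kernel]}_DATASET"

for kernel in DATASETS:
    print(kernel, dataset_flag(kernel))
```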
## Prerequisites (Linux)

Required (checked by `zig build check-deps`):
- Zig 0.15.2 (ZVM/mise recommended)
- uv (the build pins Python to 3.14.3 inside a local `.venv`)
- A system-wide GCC or Clang toolchain, since gem5 will not build with Zig's bundled Clang
- git, just, and m4 to fetch and build gem5, as well as the report
- A TeX distribution to generate plots and the report (a full TeX Live is the easiest option, but TinyTeX may also work)
- Graphviz (`dot` on PATH) to render gem5's `config.dot` to PDF (the Python binding `pydot` is installed via `requirements.txt`)
- Optionally, gperftools for the `tcmalloc` implementation, which speeds up gem5 considerably
## Quick Start

Clone the release tag with the gem5 submodule (a shallow clone is recommended to save disk space):
```sh
git clone https://github.com/lucca-pellegrini/AC3-TP1.git --branch=v0.1.1 --depth=1 --recursive --shallow-submodules
cd AC3-TP1
```
Sanity-check your host tools:
```sh
zig build check-deps
```
Reproduce everything end-to-end (very long: build gem5, run 155 sims, make figures, build the paper):
```sh
zig build report
```
The gem5 build usually takes around 50 minutes with decent parallelism,
depending on the hardware. The full simulation sweep takes anywhere from hours
to a few days, since the build caps parallel workers at 9 to reduce the risk of
out-of-memory failures. When everything finishes, the figures and the PDF
report appear under `figures/` and `report/`.
## Using mise-en-place

### Set up requirements
On Debian Trixie:
```sh
sudo apt install curl gcc g++ m4 git zlib1g-dev libgoogle-perftools-dev graphviz
curl https://mise.run | sh && export PATH="$HOME/.local/bin:$PATH"
```
On Fedora 43/RHEL 10/CentOS Stream 10/Rocky 10/Alma 10 (run as superuser):
```sh
dnf install 'dnf-command(copr)'
dnf copr enable jdxcode/mise
dnf install gcc gcc-c++ glibc-devel glibc-static libstdc++ libstdc++-devel libstdc++-static m4 git zlib-devel graphviz mise
```
On Arch Linux (run as superuser):
```sh
pacman -S --needed gcc m4 git graphviz gperftools zlib mise
```
### Clone repo, trust config, and run
```sh
git clone https://github.com/lucca-pellegrini/AC3-TP1.git --depth=1 --shallow-submodules --recursive --branch=v0.1.1
cd AC3-TP1
mise trust
mise run  # Or `mise report` to immediately run the entire build/simulation pipeline
```
## Manual Workflow
Each major step is addressable. You can run them individually and resume safely.
```sh
# 1) Prepare Python (uv + venv + requirements.txt)
zig build setup-python

# 2) Initialize gem5 submodule
zig build init-gem5

# 3) Build gem5 simulator (gem5/build/X86/gem5.fast)
zig build gem5

# 4) Build the m5 control library (libm5.a)
zig build m5

# 5) Build workloads (default step)
zig build

# 6) Run the full sweep for all 5 PolyBench kernels
zig build simulations

# 7) Generate publication figures from results/
zig build visualize

# 8) Build the LaTeX report
zig build report
```
All steps are idempotent. You can interrupt long runs and rerun the same step later; remaining items will continue.
## Running One-Off Simulations
You can run gem5 directly with a specific parameter and workload. Examples:
```sh
# Baseline config for jacobi-2d
./gem5/build/X86/gem5.fast \
    -d results/jacobi-2d_baseline -- \
    cache_config.py ./zig-out/bin/jacobi-2d

# Try a different cache line size for atax
./gem5/build/X86/gem5.fast \
    -d results/atax_cache_line_128 -- \
    cache_config.py --cache-line-size=128 ./zig-out/bin/atax

# Use a tiny debug binary (MINI dataset) to smoke-test the flow
./gem5/build/X86/gem5.fast \
    -d results/seidel-2d_testline_64 -- \
    cache_config.py --cache-line-size=64 ./zig-out/bin/seidel-2d-test
```
Or invoke the orchestrator for just one workload:
```sh
# Run all 31 configs for gemm with 4 workers and CPU pinning (requires psutil)
./.venv/bin/python run_all_simulations.py \
    --results-dir=results ./gem5/build/X86/gem5.fast ./zig-out/bin/gemm \
    -j 4 --pin-workers

# Dry-run to see what would execute (no simulations are launched)
./.venv/bin/python run_all_simulations.py \
    --dry-run --results-dir=results ./gem5/build/X86/gem5.fast ./zig-out/bin/jacobi-2d
```
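The `--pin-workers` option relies on psutil for CPU affinity. On Linux the same idea can be sketched with the standard library alone; the round-robin core assignment below is a simplified assumption, not the runner's actual policy:

```python
import os

def pin_worker(worker_index: int, cores_per_worker: int = 1) -> set[int]:
    """Pin the calling process to a slice of the available CPU cores (Linux only)."""
    available = sorted(os.sched_getaffinity(0))          # cores we may use
    start = (worker_index * cores_per_worker) % len(available)
    chosen = set(available[start:start + cores_per_worker])
    os.sched_setaffinity(0, chosen)                      # restrict this process
    return chosen

# Example: pin "worker 0" to the first available core.
print(pin_worker(0))
```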
## Results Layout

Each simulation stores its outputs under `results/<workload>_<variant>/`. The
directory always contains `stats.txt` (the counters used by the analysis) and a
complete snapshot of the simulated system in `config.ini`, `config.json`, and
`config.dot`, along with a rendered `config.dot.pdf`. When a run finishes, the
orchestrator drops a `.completed` marker so subsequent invocations can resume
cleanly without redoing work. The plotting stage reads all runs from
`results/`, writes publication figures to `figures/`, and the paper
(`report/main.tex`) imports those figures directly.
## Reproducibility Choices

To minimize drift, the build pins Python 3.14.3 via uv and installs all Python
tooling (SCons, plotting libraries, and friends) into `.venv`. The gem5 source
is vendored as a Git submodule at a fixed commit and is always built through
that virtual environment’s SCons. Workloads target x86_64-linux-musl and are
statically linked against musl to reduce host-dependency variance. The
simulation runner is deterministic and resumable; it caps parallelism at
min(9, nproc) to avoid oversubscription and out-of-memory failures.
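The parallelism cap reduces to a one-liner; the constant 9 comes from the text, while the `or 1` fallback for hosts where the CPU count is unknown is a defensive assumption:

```python
import os

# Cap parallel gem5 workers at min(9, nproc) to limit memory pressure.
MAX_WORKERS = min(9, os.cpu_count() or 1)
print(MAX_WORKERS)
```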
## Licensing

Unless a file states otherwise, source code in this repository is licensed under the ISC license (see the SPDX headers). The report (`report/main.tex` and the figures it includes) is distributed under CC BY-SA 4.0. The gem5 submodule remains under its own upstream license, and the PolyBench workloads in `workloads/` remain under the Ohio State University Software Distribution License.
## Citing

If you use the code, figures, or methodology, please cite the accompanying paper in `report/main.tex`: *A Quantitative Cache Evaluation of Select PolyBench Kernels*, Amanda Canizela Guimarães, Ariel Inácio Jordão, Lucca Pellegrini, Paulo Dimas Junior, Pedro Vitor Andrade, ICEI/PUC Minas, 2026. Machine-readable citation metadata is available in `CITATION.cff`.

