@deaneeth
Owner

This pull request introduces the v2.0.0 release of TinyGPU, featuring significant enhancements to the instruction set, visualization, continuous integration, and documentation. The release adds new shared memory instructions, improves synchronization semantics, expands the visualizer, and updates the project structure and examples to better support educational use and extensibility.

Major new features and improvements:

Instruction Set and Core Functionality:

  • Added new shared memory instructions SHLD and SHST for robust per-block shared memory operations, and improved SYNC/SYNCB semantics for better thread and block coordination. [1] [2] [3]
  • Refactored core execution logic for simpler step semantics and optimized performance with larger thread counts. [1] [2]

Visualizer and Example Scripts:

  • Enhanced the visualizer with improved GIF export and direct saving from simulator runs. Updated all example scripts (run_odd_even_sort.py, run_reduce_sum.py, run_sync_test.py, and the new run_block_shared_sum.py) to write GIFs to src/outputs/<script_name>/, and added new example programs, including a block shared memory sum and a REPL debugger. [1] [2] [3] [4] [5] [6] [7] [8] [9]
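
The output convention above can be sketched roughly as follows. This is a minimal illustration, not the code from the example scripts: the helper names `gif_output_path` and `save_gif_safely`, the timestamp format, and the `save_fn` callback are all assumptions made for the sketch.

```python
from datetime import datetime
from pathlib import Path


def gif_output_path(script_file, root="src/outputs"):
    """Build <root>/<script_name>/<script_name>_<timestamp>.gif.

    The file-name pattern is an assumption; only the directory layout
    (src/outputs/<script_name>/) comes from the PR description.
    """
    script_name = Path(script_file).stem
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return Path(root) / script_name / f"{script_name}_{stamp}.gif"


def save_gif_safely(save_fn, script_file, root="src/outputs"):
    """Call save_fn(path); return the path on success, None on failure.

    save_fn is a hypothetical callback standing in for whatever the
    visualizer uses to write a GIF.
    """
    path = gif_output_path(script_file, root)
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        save_fn(path)
    except Exception as exc:
        # Headless or CI environments may lack a display or an image
        # backend, so a failed export should not fail the example run.
        print(f"GIF export skipped: {exc}")
        return None
    return path
```

Wrapping the export in a broad `try/except` trades precise error reporting for robustness, which suits example scripts that also run in CI.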

Continuous Integration and Code Quality:

  • Updated GitHub Actions CI workflow to test on the new dev branch and Python 3.13, and added badges for linting, code style, and tests. Integrated ruff and black for linting and formatting. [1] [2] [3]

Documentation and Project Structure:

  • Overhauled README.md and added docs/index.md for v2.0.0, with detailed changelogs, updated examples, project layout, and instruction set reference. Updated all documentation and image paths to reflect the new src/outputs/ organization. [1] [2] [3] [4] [5]

New and Improved Examples:

  • Added new example programs such as block shared memory sum (block_shared_sum.tgpu and runner), and an interactive REPL debugger (debug_repl.py). [1] [2]

These changes collectively make TinyGPU more powerful, extensible, and user-friendly for both educational and development purposes.

  • Instruction set and core:
    • Added SHLD and SHST instructions for shared memory, improved SYNC semantics, and refactored core execution for extensibility and performance. [1] [2] [3]
  • Visualizer and examples:
    • Improved GIF export, updated all example scripts to save outputs in src/outputs/, and added new examples for block shared memory and a REPL debugger. [1] [2] [3] [4] [5] [6] [7] [8] [9]
  • CI/CD and code quality:
    • Updated CI workflow to include Python 3.13 and dev branch, and added badges for linting and tests. Integrated ruff and black for code quality. [1] [2] [3]
  • Documentation:
    • Major update to README.md and new docs/index.md with v2.0.0 changelog, usage instructions, new project layout, and instruction set reference. [1] [2] [3] [4] [5]
  • New example programs:
    • Added block_shared_sum.tgpu, run_block_shared_sum.py, and debug_repl.py for demonstrating block-level operations and interactive debugging. [1] [2]

- Added TinyGPU.load_kernel(program, labels, grid=(blocks,tpb), args=None, shared_size=...)
  which configures grid, copies kernel args into R0..Rk, allocates shared memory, and loads program
- Added TinyGPU.run_kernel() convenience wrapper
- Added example runner demonstrating kernel launch (examples/run_vector_add_kernel.py)
- Added step_single(), snapshot(), and rewind() to TinyGPU
- Added examples/debug_repl.py: a tiny command-line REPL to step through kernels
- Snapshot returns PC, flags, registers, memory slice, and shared memory for quick debugging
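
The kernel-launch and debugging behavior described in these commit notes can be illustrated with a toy model. The sketch below is not the TinyGPU implementation: `ToyGPU` is a hypothetical stand-in, the per-thread register layout and the history stack are assumptions, and `step_single()` only advances the program counter instead of executing an instruction. Only the method names and the `load_kernel` parameters are taken from the notes above.

```python
import copy


class ToyGPU:
    """Hypothetical stand-in for illustration; not the real TinyGPU class."""

    def __init__(self, num_registers=8):
        self.num_registers = num_registers
        self.program = []
        self.labels = {}
        self.pc = 0
        self.registers = []  # one register file per thread (assumed layout)
        self.shared = []     # one shared-memory array per block
        self._history = []   # snapshots taken before each step

    def load_kernel(self, program, labels, grid=(1, 1), args=None, shared_size=0):
        """Configure the grid, copy args into R0..Rk, allocate shared memory."""
        blocks, tpb = grid
        args = list(args or [])
        if len(args) > self.num_registers:
            raise ValueError("more kernel args than registers")
        self.program = list(program)
        self.labels = dict(labels)
        self.pc = 0
        # Every thread starts with the kernel args in R0..Rk, zeros elsewhere.
        self.registers = [
            args + [0] * (self.num_registers - len(args))
            for _ in range(blocks * tpb)
        ]
        self.shared = [[0] * shared_size for _ in range(blocks)]
        self._history.clear()

    def snapshot(self):
        """Return a deep copy of the state that a debugger would inspect."""
        return {
            "pc": self.pc,
            "registers": copy.deepcopy(self.registers),
            "shared": copy.deepcopy(self.shared),
        }

    def step_single(self):
        """Record the current state, then advance by one instruction."""
        self._history.append(self.snapshot())
        self.pc += 1  # placeholder for real instruction execution

    def rewind(self):
        """Restore the state saved before the last step; False if none."""
        if not self._history:
            return False
        state = self._history.pop()
        self.pc = state["pc"]
        self.registers = state["registers"]
        self.shared = state["shared"]
        return True
```

Storing a full deep copy per step is the simplest way to support rewinding; it costs memory proportional to the number of steps, which is acceptable for small teaching kernels.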
…r examples

- Make SHLD/SHST robust: compute block_id from tid // threads_per_block (fall back to register if needed) and add defensive bounds checks to prevent IndexError when kernels overwrite registers.
- Refactor TinyGPU step internals and restore semantics so consecutive non-control instructions execute in the same cycle when expected.
- Normalize example imports by adding local `src` to sys.path when running scripts and switch to the `tinygpu` package imports.
- Add GIF saving for examples: each example now attempts to write a timestamped GIF to `src/outputs/<script_name>/` (wrapped in try/except to be CI/headless-friendly).
- Fix various lint/format issues (ruff/black): wrap long docstrings/lines, add noqa where appropriate, and remove tab indentation in tests.
- Update tests & verify: ruff/black/pytest passed; examples generated GIFs under `src/outputs/`.
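
The SHLD/SHST hardening described in the first item of this list can be sketched as below. This is an assumption-laden illustration, not the code in the PR: the function names are hypothetical, shared memory is modeled as one Python list per block, and the choice to return 0 for an out-of-range load and to ignore an out-of-range store is a guess at what "defensive bounds checks" means here.

```python
def block_id_for(tid, threads_per_block):
    """Derive the block index from the thread id rather than from a register."""
    if threads_per_block <= 0:
        raise ValueError("threads_per_block must be positive")
    return tid // threads_per_block


def shared_load(shared, tid, threads_per_block, addr):
    """SHLD-like read; returns 0 when the block or address is out of range."""
    block = block_id_for(tid, threads_per_block)
    if 0 <= block < len(shared) and 0 <= addr < len(shared[block]):
        return shared[block][addr]
    return 0  # assumed fallback instead of raising IndexError


def shared_store(shared, tid, threads_per_block, addr, value):
    """SHST-like write; returns False (and does nothing) when out of range."""
    block = block_id_for(tid, threads_per_block)
    if 0 <= block < len(shared) and 0 <= addr < len(shared[block]):
        shared[block][addr] = value
        return True
    return False
```

Deriving the block index from `tid` means a kernel that overwrites the register holding its block id cannot redirect the access to another block's memory.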
@deaneeth self-assigned this on Oct 26, 2025
@deaneeth added the documentation and enhancement labels on Oct 26, 2025
@deaneeth merged commit 9622620 into main on Oct 26, 2025
6 checks passed
deaneeth added a commit that referenced this pull request Dec 16, 2025
Introduces the v2.0.0 release of TinyGPU