Reshape Memory Freeing and Generic Float GEMM Fixes #91

diaconuccalin · 2025-06-04T15:13:27Z

This PR tries to solve 2 issues:

1. Reshape memory freeing:

When a reshape follows a skip connection whose tensor is stored for later use, the alias created by the reshape may be freed before that later use. In the figure below, C (and with it B and A, since all three point to the same data) may be deallocated after it is used on the right branch but before A is used in the MatMul. To solve this, each variable now stores a list of its aliases, and a buffer is deallocated only once all of its aliases are marked as not live.
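The alias rule described above can be sketched roughly as follows. This is a minimal illustration and not Deeploy's actual implementation: the `Buffer` class and the `add_alias` and `can_deallocate` helpers are hypothetical names introduced here, and the sketch assumes alias lists are kept symmetric (every buffer in a group knows about all the others).

```python
from dataclasses import dataclass, field


@dataclass
class Buffer:
    """Hypothetical stand-in for a buffer object; not Deeploy's actual class."""
    name: str
    live: bool = False
    aliases: set = field(default_factory=set)  # names of buffers sharing this data


def add_alias(buffers, a, b):
    """Record that a and b point to the same data, keeping alias sets symmetric."""
    group = {a, b} | buffers[a].aliases | buffers[b].aliases
    for name in group:
        buffers[name].aliases = group - {name}


def can_deallocate(buffers, name):
    """The data may be freed only if the buffer and all of its aliases are not live."""
    buf = buffers[name]
    return not buf.live and all(not buffers[alias].live for alias in buf.aliases)


# The situation from the description: A, B and C all point to the same data.
buffers = {n: Buffer(n) for n in "ABC"}
add_alias(buffers, "A", "B")  # B is a reshape of A
add_alias(buffers, "B", "C")  # C is a reshape of B

buffers["A"].live = True      # A is still needed by the later MatMul
blocked = can_deallocate(buffers, "C")   # False: an alias of C is still live

buffers["A"].live = False     # the MatMul has consumed A
allowed = can_deallocate(buffers, "C")   # True: no alias is live anymore
```

Checking liveness across the whole alias group, rather than on the single buffer being released, is what keeps the shared data alive until the last consumer has run.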

2. GEMM Generic float:

Previously, the generic GEMM template only supported the case where the A and B terms contain the same number of matrices. This caused problems when the GEMM was actually a fully connected layer, where the weight/bias array usually holds a single matrix that should be multiplied with/added to each matrix in the other term. The operation should now support partial broadcasting, which applies when at least one of the inputs has all dimensions equal to 1 except for the last two.
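As an illustration of the partial broadcasting described above, here is a minimal pure-Python sketch over flat row-major lists. It is not the actual template (which generates C code): the function name, its parameters, and the omission of transposition and scaling factors are all assumptions made for this example.

```python
def gemm_partial_broadcast(A, B, C, batch_A, batch_B, batch_C, M, N, O):
    """Compute Y[b] = A[b] @ B[b] + C[b] over flat row-major lists.

    A holds batch_A matrices of shape M x N, B holds batch_B of shape N x O,
    and C holds batch_C of shape M x O. A term holding a single matrix is
    reused for every b (hypothetical helper, for illustration only).
    """
    batch = max(batch_A, batch_B, batch_C)
    for count in (batch_A, batch_B, batch_C):
        if count not in (1, batch):
            raise ValueError("each term must hold either 1 or `batch` matrices")

    Y = [0.0] * (batch * M * O)
    for b in range(batch):
        # Advance the offset only for terms that actually contain multiple matrices.
        off_A = b * M * N if batch_A > 1 else 0
        off_B = b * N * O if batch_B > 1 else 0
        off_C = b * M * O if batch_C > 1 else 0
        for m in range(M):
            for o in range(O):
                acc = C[off_C + m * O + o]
                for n in range(N):
                    acc += A[off_A + m * N + n] * B[off_B + n * O + o]
                Y[b * M * O + m * O + o] = acc
    return Y


# Two 1x2 input matrices, one shared 2x1 weight matrix, one shared 1x1 bias.
Y = gemm_partial_broadcast(
    A=[1.0, 2.0, 3.0, 4.0], B=[10.0, 100.0], C=[0.5],
    batch_A=2, batch_B=1, batch_C=1, M=1, N=2, O=1,
)
# Y == [210.5, 430.5]
```

Keeping the offset of a single-matrix term at zero is what lets one weight or bias matrix serve every matrix of the other term, as in a fully connected layer.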

Added

- New alias list parameter for buffer objects
- New test, also included in the CI pipeline, for the reshape and skip connection situation
- 'shape' parameter handling similar to the 'indices' parameter in the generic reshape template

Changed

Fixed

- Buffer deallocation now happens only when none of the buffer's aliases is live anymore (i.e. the data stored there is no longer needed, not even by other nodes)
- GEMM Generic float template now iterates through terms only when they actually contain multiple matrices

PR Merge Checklist

The PR is rebased on the latest devel commit and pointing to devel.
Your PR is reviewed and approved.
All checks are passing.
The CHANGELOG.md file has been updated.
If the docker was modified, change back its link after review.

Xeratec

Looks good, just one comment about adding a new alias to a buffer. I can live with both options, but would be curious to get your opinion.

Deeploy/Targets/Generic/Parsers.py

Xeratec · 2025-06-16T16:59:48Z

Looks good. Thanks for the work!

This release contains major architectural changes, new platform support, enhanced simulation workflows, floating-point kernel support, training infrastructure for CCT models, memory allocation strategies, and documentation improvements.

After merging this into `main`, the release process will proceed with:
- Pushing a Git tag for the release after merging this PR
- Creating a GitHub release with the prepared tag.

Note: Since the release tag references the Docker container tagged with the release tag (`ghcr.io/pulp-platform/deeploy:v0.2.0`), the CI will initially fail. The Deeploy Docker image must be built after the release PR is merged and the CI restarted.

### List of Pull Requests
- Prepare v0.2.0 release [#102](#102)
- Add Luka as Code Owner [#101](#101)
- Fix CI, Docker Files, and Documentation Workflow [#100](#100)
- Chimera Platform Integration [#96](#96)
- Add Tutorial and Refactor README [#97](#97)
- Reduce Mean Float Template [#92](#92)
- Reshape Memory Freeing and Generic Float GEMM Fixes [#91](#91)
- Prepare for Release and Separate Dependencies [#90](#90)
- Fix input offsets calculation [#89](#89)
- Move PULP SDK to main branch/fork [#88](#88)
- Finite Lifetime for IO Tensors [#51](#51)
- Improved Memory Visualization and Multi-Layer Tiling Profiling [#56](#56)
- Fix Linting in CI and Reformat C Files [#86](#86)
- Fix Broken CMake Flow For pulp-sdk [#87](#87)
- Refactor Changelog For Release [#85](#85)
- ARM Docker Container and Minor Bug Fix [#84](#84)
- Added Kernel for Generic Float DW Conv2D [#63](#63)
- Autoselect Self-Hosted Runners if the Action is on Upstream [#81](#81)
- TEST_RECENT linking on MacOS [#78](#78)
- Add RV32IMF Picolibc support for Siracusa platform [#66](#66)
- Improve Documentation and VSCode Support [#76](#76)
- Debug Print Topology Pass and Code Transformation [#75](#75)
- Find all subdirectories of Deeploy when installing with pip install [#70](#70)
- Add milestone issue template [#71](#71)
- Bunch of fixes and changes [#58](#58)
- Add SoftHier platform [#65](#65)
- rv32imf_xpulpv2 ISA support for Siracusa platform [#64](#64)
- One LLVM To Compile Them All [#60](#60)
- One GVSoC to Simulate Them All [#59](#59)
- Add Support for CCT Last Layer Training with Embedding Dim 8-128 [#55](#55)
- Add CCT Classifier Training Support [#53](#53)
- L3 Bugs: DMA Struct Datatype and Maxpool Margin Error [#45](#45)
- DeepQuant Quantized Linear Support [#54](#54)
- Implemented Dequant Layer for Generic and Siracusa [#52](#52)
- Infinite Lifetime Buffers Considered in Tiling & Memory Allocation (+ Visualization) [#44](#44)
- Implemented Quant Layer for Generic and Siracusa [#49](#49)
- Increase maximal Mchan DMA transfer sizes from 64KiB to 128KiB [#47](#47)
- Add MiniMalloc and Decouple Memory Allocation and Tiling [#40](#40)
- Float CCT Bugs on L3 [#37](#37)
- Memory Allocation Strategies and Visualization [#36](#36)
- Add CODEOWNERS [#42](#42)
- Add Tiling Support to All CCT Kernels and Fix CCT Operators on Siracusa Platform for L2 [#35](#35)
- Add Fp gemm and Softmax for Snitch platform [#31](#31)
- Add Float Kernels for CCT [#29](#29)
- documentation deployment [#34](#34)
- main.c Float Cast Bugs [#28](#28)
- Add Float GEMM on PULP with Tiling [#26](#26)
- Add Float Support & Float GEMM for Generic [#25](#25)
- GVSOC support for the Snitch Cluster platform [#23](#23)
- Snitch Cluster Tiling Support [#22](#22)
- Snitch support integration [#14](#14)
- Update bibtex citation [#20](#20)
- the PR template location, bump min python to 3.10, change install command [#17](#17)
- Add pre-commit for python formatting [#15](#15)
- FP integration (v2) [#12](#12)
- shell for sequential tests of Generic, Cortex, and Mempool platforms [#11](#11)
- Add issue templates [#10](#10)
- Minor CI and Readme Improvements [#8](#8)
- Fix GHCR Link for Docker Build [#7](#7)
- neureka's ccache id [#6](#6)
- GitHub-based CI/CD Flow [#4](#4)
- Generic Softmax Kernel [#2](#2)
- Port GitLab CI [#1](#1)

diaconuccalin added 5 commits June 2, 2025 10:41

Added fix for the reshape deallocation issue in skip connection situations.

45a8a20

Merge branch 'pulp-platform:devel' into reshape_free_fix

23d1916

Added tests for reshape skip situations. Introduced reflexivity for alias_of parameter. Fixed GEMM template

4762f51

Extended Generic GEMM fix to support one-way broadcasting

fade25c

Added the new float reshape test to the CI pipeline. Reformatted code

8274f25

diaconuccalin self-assigned this Jun 4, 2025

diaconuccalin requested review from Victor-Jung and Xeratec as code owners June 4, 2025 15:13

diaconuccalin added the Bug Something isn't working label Jun 4, 2025

diaconuccalin changed the title ~~Reshape memory freeing and Generic float GEMM fixes~~ Reshape Memory Freeing and Generic Float GEMM Fixes Jun 4, 2025

diaconuccalin added 4 commits June 4, 2025 15:17

Updated changelog

531f91a

Add check for platforms that don't support the new alias parameter

97bd65d

Reformat

62ea0e9

New check to fix some Neureka CI passes

4db43e9

Xeratec added this to the Release xxx milestone Jun 5, 2025

Xeratec added this to Deeploy Jun 5, 2025

Xeratec moved this to Need Reviewer in Deeploy Jun 5, 2025

Xeratec approved these changes Jun 5, 2025

Deeploy/Targets/Generic/Parsers.py (outdated, resolved)

Deeploy/Targets/Generic/Parsers.py (outdated, resolved)

Xeratec moved this from Need Reviewer to In review in Deeploy Jun 5, 2025

Xeratec modified the milestones: Release xxx, Release 0.2.0 Jun 5, 2025

diaconuccalin added 3 commits June 10, 2025 12:59

Improved implementation for accessing the alias_of buffer parameter

52d89ee

Fixed attribute check

8c1cd42

Merge branch 'pulp-platform:devel' into reshape_free_fix

aed0d97

diaconuccalin marked this pull request as draft June 11, 2025 14:50

Test update CHANGELOG.md

cdb408c

Signed-off-by: Călin Diaconu <[email protected]>

diaconuccalin added a commit to diaconuccalin/Deeploy that referenced this pull request Jun 13, 2025

temporarly remove changelog entry for PR pulp-platform#91

7687906

diaconuccalin added 3 commits June 13, 2025 17:45

Revert changelog test change

bf65699

Merge branch 'pulp-platform:devel' into reshape_free_fix

eba204f

Moved alias functions to class

662db30

diaconuccalin marked this pull request as ready for review June 13, 2025 17:49

Xeratec merged commit c7bcec9 into pulp-platform:devel Jun 16, 2025
117 checks passed

github-project-automation bot moved this from In review to Done in Deeploy Jun 16, 2025

diaconuccalin mentioned this pull request Jun 17, 2025

[BUG] _alias fail #30

Closed

Xeratec mentioned this pull request Jul 8, 2025

Release v0.2.0 #103

Merged

diaconuccalin deleted the reshape_free_fix branch November 19, 2025 10:36
