Conversation

@runwangdl
Collaborator

@runwangdl runwangdl commented Mar 20, 2025

This PR adds support for fine-tuning the last GEMM layer of CCT with embedding dimensions ranging from 8 to 128. The key additions and changes include:

Added

  • Support for SoftmaxCrossEntropyLoss and SoftmaxCrossEntropyLossGrad with tiling.
  • Implementation of SGD updates for CCT training.
  • Test for one iteration of CCT last-layer training with dimensions from 8 to 128.
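The math behind the new loss and gradient kernels can be sketched as follows. This is a minimal NumPy illustration of softmax cross-entropy and its gradient, not Deeploy's actual tiled C implementation; function names and shapes are illustrative.

```python
import numpy as np

def softmax_cross_entropy_loss(logits, labels):
    """logits: (N, C) float array; labels: (N,) int class indices."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Mean negative log-likelihood of the true classes
    return -log_probs[np.arange(len(labels)), labels].mean()

def softmax_cross_entropy_grad(logits, labels):
    """Gradient of the mean loss w.r.t. logits: (softmax - one_hot) / N."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    probs[np.arange(len(labels)), labels] -= 1.0
    return probs / len(labels)
```

Because the gradient is elementwise over the batch dimension, it tiles naturally along N, which is what makes the tiled kernel variants straightforward.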

Changed

  • Modified the outputs of LayerNorm and SoftmaxCrossEntropyLoss nodes to a single output for better tiling compatibility.
  • Added SGD parameter updates to the CCT training graph.
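The SGD update wired into the training graph follows the plain update rule. A minimal sketch, assuming vanilla SGD without momentum or weight decay (hyperparameter names here are illustrative, not Deeploy's API):

```python
import numpy as np

def sgd_step(param, grad, lr=0.01):
    """In-place vanilla SGD update: param <- param - lr * grad."""
    param -= lr * grad
    return param
```

Applied once per iteration to the last GEMM layer's weights and bias after the backward pass produces their gradients.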

PR Merge Checklist

  1. The PR is rebased on the latest devel commit and pointing to devel.
  2. Your PR is reviewed and approved.
  3. All checks are passing.
  4. The CHANGELOG.md file has been updated.
  5. If the Docker image was modified, change its link back after review.

@runwangdl runwangdl force-pushed the GEMM_training_tiled branch 2 times, most recently from c60594f to 4971564 on March 23, 2025 17:02
@runwangdl runwangdl marked this pull request as ready for review March 23, 2025 17:26
@runwangdl runwangdl changed the title from "[Draft] Add Support for CCT Last Layer Training with Dim 8-128" to "Add Support for CCT Last Layer Training with Dim 8-128" on Mar 23, 2025
@runwangdl runwangdl force-pushed the GEMM_training_tiled branch from 4971564 to 99035f0 on March 23, 2025 23:25
@runwangdl runwangdl force-pushed the GEMM_training_tiled branch from 9c7e31c to 15ea3ec on March 23, 2025 23:41
Member

@Victor-Jung Victor-Jung left a comment


Great job Run! I only have a few minor comments.

@runwangdl runwangdl force-pushed the GEMM_training_tiled branch from e046cb0 to 501775d on March 28, 2025 13:25
@runwangdl
Collaborator Author

My fork's CI has passed.

@Victor-Jung Victor-Jung merged commit e97f6c8 into pulp-platform:devel Apr 7, 2025
121 checks passed
@runwangdl runwangdl deleted the GEMM_training_tiled branch April 11, 2025 15:55
FrancescoConti pushed a commit to FrancescoConti/Deeploy that referenced this pull request May 9, 2025
…p-platform#55)

* Add classifier training support

* Update node with multioutput to single output

* Add softmax cross-entropy grad tiling

* Add softmax cross-entropy loss grad tiling

* Add and pass test for CCT GEMM training 1_16_16_8 to 128

* Update CI with 8-128 dim CCT last GEMM training test

* Add SGD support for PULP Open

* Update Changelog

* Address Review Comments
runwangdl added a commit to Victor-Jung/Deeploy that referenced this pull request May 11, 2025
…p-platform#55)

@Xeratec Xeratec mentioned this pull request Jul 8, 2025
Xeratec added a commit that referenced this pull request Jul 8, 2025
This release contains major architectural changes, new platform support,
enhanced simulation workflows, floating-point kernel support, training
infrastructure for CCT models, memory allocation strategies, and
documentation improvements.

After this PR is merged into `main`, the release process will proceed with:
- Pushing a Git tag for the release
- Creating a GitHub release with the prepared tag

Note: Since the release tag references the Docker container tagged with
the release tag (`ghcr.io/pulp-platform/deeploy:v0.2.0`), the CI will
initially fail. The Deeploy Docker image must be built after the release
PR is merged and the CI restarted.

### List of Pull Requests
- Prepare v0.2.0 release
[#102](#102)
- Add Luka as Code Owner
[#101](#101)
- Fix CI, Docker Files, and Documentation Workflow
[#100](#100)
- Chimera Platform Integration
[#96](#96)
- Add Tutorial and Refactor README
[#97](#97)
- Reduce Mean Float Template
[#92](#92)
- Reshape Memory Freeing and Generic Float GEMM Fixes
[#91](#91)
- Prepare for Release and Separate Dependencies
[#90](#90)
- Fix input offsets calculation
[#89](#89)
- Move PULP SDK to main branch/fork
[#88](#88)
- Finite Lifetime for IO Tensors
[#51](#51)
- Improved Memory Visualization and Multi-Layer Tiling Profiling
[#56](#56)
- Fix Linting in CI and Reformat C Files
[#86](#86)
- Fix Broken CMake Flow For pulp-sdk
[#87](#87)
- Refactor Changelog For Release
[#85](#85)
- ARM Docker Container and Minor Bug Fix
[#84](#84)
- Added Kernel for Generic Float DW Conv2D
[#63](#63)
- Autoselect Self-Hosted Runners if the Action is on Upstream
[#81](#81)
- TEST_RECENT linking on MacOS
[#78](#78)
- Add RV32IMF Picolibc support for Siracusa platform
[#66](#66)
- Improve Documentation and VSCode Support
[#76](#76)
- Debug Print Topology Pass and Code Transformation
[#75](#75)
- Find all subdirectories of Deeploy when installing with pip install
[#70](#70)
- Add milestone issue template
[#71](#71)
- Bunch of fixes and changes
[#58](#58)
- Add SoftHier platform
[#65](#65)
- rv32imf_xpulpv2 ISA support for Siracusa platform
[#64](#64)
- One LLVM To Compile Them All
[#60](#60)
- One GVSoC to Simulate Them All
[#59](#59)
- Add Support for CCT Last Layer Training with Embedding Dim 8-128
[#55](#55)
- Add CCT Classifier Training Support
[#53](#53)
- L3 Bugs: DMA Struct Datatype and Maxpool Margin Error
[#45](#45)
- DeepQuant Quantized Linear Support
[#54](#54)
- Implemented Dequant Layer for Generic and Siracusa
[#52](#52)
- Infinite Lifetime Buffers Considered in Tiling & Memory Allocation (+
Visualization) [#44](#44)
- Implemented Quant Layer for Generic and Siracusa
[#49](#49)
- Increase maximal Mchan DMA transfer sizes from 64KiB to 128KiB
[#47](#47)
- Add MiniMalloc and Decouple Memory Allocation and Tiling
[#40](#40)
- Float CCT Bugs on L3
[#37](#37)
- Memory Allocation Strategies and Visualization
[#36](#36)
- Add CODEOWNERS [#42](#42)
- Add Tiling Support to All CCT Kernels and Fix CCT Operators on
Siracusa Platform for L2
[#35](#35)
- Add Fp gemm and Softmax for Snitch platform
[#31](#31)
- Add Float Kernels for CCT
[#29](#29)
- documentation deployment
[#34](#34)
- main.c Float Cast Bugs
[#28](#28)
- Add Float GEMM on PULP with Tiling
[#26](#26)
- Add Float Support & Float GEMM for Generic
[#25](#25)
- GVSOC support for the Snitch Cluster platform
[#23](#23)
- Snitch Cluster Tiling Support
[#22](#22)
- Snitch support integration
[#14](#14)
- Update bibtex citation
[#20](#20)
- the PR template location, bump min python to 3.10, change install
command [#17](#17)
- Add pre-commit for python formatting
[#15](#15)
- FP integration (v2)
[#12](#12)
- shell for sequential tests of Generic, Cortex, and Mempool platforms
[#11](#11)
- Add issue templates
[#10](#10)
- Minor CI and Readme Improvements
[#8](#8)
- Fix GHCR Link for Docker Build
[#7](#7)
- neureka's ccache id
[#6](#6)
- GitHub-based CI/CD Flow
[#4](#4)
- Generic Softmax Kernel
[#2](#2)
- Port GitLab CI [#1](#1)
