[NVIDIA] Build CUDA 13 #11299
Open
johnnynunez wants to merge 38 commits into sgl-project:main from johnnynunez:main
Commits
162aaca BUMP FA aligned with SGL-KERNEL (johnnynunez)
4f281c7 BUILD CUDA 13 (johnnynunez)
0b34d78 BUILD CUDA 13 (johnnynunez)
28bbd2a decord2 (johnnynunez)
3890637 fix typo (johnnynunez)
ba0294e Merge branch 'main' into main (johnnynunez)
771a614 fix typo (johnnynunez)
4793f03 Merge remote-tracking branch 'origin/main' (johnnynunez)
6054344 Merge branch 'main' into main (johnnynunez)
b7a4f36 Merge branch 'main' into main (johnnynunez)
9a2c0b2 Merge branch 'main' into main (johnnynunez)
1d5cb44 build cu129 ok, now test pr test cu130 (johnnynunez)
1be9fa5 Merge branch 'main' into main (johnnynunez)
95d1a3e Merge branch 'main' into main (johnnynunez)
c4f9a20 test bump fa (johnnynunez)
20c5a82 Update CMakeLists.txt (johnnynunez)
5419b85 Merge branch 'main' into main (johnnynunez)
800c6e3 fix (johnnynunez)
126acc6 Merge remote-tracking branch 'origin/main' (johnnynunez)
74d1416 fix (johnnynunez)
4048eab fix tests (johnnynunez)
9ee6bac bump docker (johnnynunez)
ed65fb1 bump torch (johnnynunez)
7b6d88f Merge branch 'main' into main (johnnynunez)
6a53e65 bump gdrcopy compatible with Blackwell (johnnynunez)
1860c3f bump gdrcopy compatible with Blackwell (johnnynunez)
11ca260 bump gdrcopy compatible with Blackwell (johnnynunez)
072add5 add support gdrcopy for GB200,GB300,Thor,Spark (johnnynunez)
0f94fc2 Merge branch 'main' into main (johnnynunez)
49acd56 upgrade cuda 13 (johnnynunez)
abfeb61 Update ci_install_deepep.sh (johnnynunez)
98682db Merge branch 'main' into main (johnnynunez)
1982c4f revert (johnnynunez)
41b0c19 Merge remote-tracking branch 'origin/main' (johnnynunez)
fb2b87c Merge branch 'main' into main (johnnynunez)
f065618 remove thor in cu12 (johnnynunez)
de4d952 revert (johnnynunez)
e67a841 revert (johnnynunez)
@@ -184,3 +184,171 @@ jobs:
          git add -A
          git commit -m "update whl index"
          git push

  build-cu130:
    if: github.repository == 'sgl-project/sglang'
    runs-on: x64-kernel-build-node
    strategy:
      matrix:
        python-version: [ "3.10" ]
        cuda-version: [ "13.0" ]
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: "recursive"

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Build wheels
        run: |
          cd sgl-kernel
          chmod +x ./build.sh
          ./build.sh "${{ matrix.python-version }}" "${{ matrix.cuda-version }}"

      - name: Upload to PyPI
        working-directory: sgl-kernel
        run: |
          pip install twine
          python3 -m twine upload --skip-existing dist/* -u __token__ -p ${{ secrets.PYPI_TOKEN }}

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: wheel-python${{ matrix.python-version }}-cuda${{ matrix.cuda-version }}
          path: sgl-kernel/dist/*

  release-cu130:
    needs: build-cu130
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Download artifacts
        uses: actions/download-artifact@v4
        with:
          path: sgl-kernel/dist/
          merge-multiple: true
          pattern: wheel-*

      - name: Set tag name
        id: set_tag_name
        run: |
          if [ -z "${{ inputs.tag_name }}" ]; then
            TAG_NAME="v$(cat sgl-kernel/python/sgl_kernel/version.py | cut -d'"' -f2)"
            echo "tag_name=$TAG_NAME" >> $GITHUB_OUTPUT
          else
            echo "tag_name=${{ inputs.tag_name }}" >> $GITHUB_OUTPUT
          fi

      - name: Release
        uses: softprops/action-gh-release@v2
        with:
          tag_name: ${{ steps.set_tag_name.outputs.tag_name }}
          repository: sgl-project/whl
          token: ${{ secrets.WHL_TOKEN }}
          files: |
            sgl-kernel/dist/*

      - name: Clone wheel index
        run: git clone https://oauth2:${WHL_TOKEN}@github.com/sgl-project/whl.git sgl-whl
        env:
          WHL_TOKEN: ${{ secrets.WHL_TOKEN }}

      - name: Update wheel index
        run: python3 scripts/update_kernel_whl_index.py --cuda 130

      - name: Push wheel index
        run: |
          cd sgl-whl
          git config --local user.name "sglang-bot"
          git config --local user.email "[email protected]"
          git add -A
          git commit -m "update whl index"
          git push

  build-cu130-aarch64:
    if: github.repository == 'sgl-project/sglang'
    runs-on: arm-kernel-build-node
    strategy:
      matrix:
        python-version: [ "3.10" ]
        cuda-version: [ "13.0" ]
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: "recursive"

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Build wheels
        run: |
          cd sgl-kernel
          chmod +x ./build.sh
          ./build.sh "${{ matrix.python-version }}" "${{ matrix.cuda-version }}" aarch64

      - name: Upload to PyPI
        working-directory: sgl-kernel
        run: |
          pip install twine
          python3 -m twine upload --skip-existing dist/* -u __token__ -p ${{ secrets.PYPI_TOKEN }}

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: wheel-python${{ matrix.python-version }}-cuda${{ matrix.cuda-version }}-aarch64
          path: sgl-kernel/dist/*

  release-cu130-aarch64:
    needs: build-cu130-aarch64
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Download artifacts
        uses: actions/download-artifact@v4
        with:
          path: sgl-kernel/dist/
          merge-multiple: true
          pattern: wheel-*

      - name: Set tag name
        id: set_tag_name
        run: |
          if [ -z "${{ inputs.tag_name }}" ]; then
            TAG_NAME="v$(cat sgl-kernel/python/sgl_kernel/version.py | cut -d'"' -f2)"
            echo "tag_name=$TAG_NAME" >> $GITHUB_OUTPUT
          else
            echo "tag_name=${{ inputs.tag_name }}" >> $GITHUB_OUTPUT
          fi

      - name: Release
        uses: softprops/action-gh-release@v2
        with:
          tag_name: ${{ steps.set_tag_name.outputs.tag_name }}
          repository: sgl-project/whl
          token: ${{ secrets.WHL_TOKEN }}
          files: |
            sgl-kernel/dist/*

      - name: Clone wheel index
        run: git clone https://oauth2:${WHL_TOKEN}@github.com/sgl-project/whl.git sgl-whl
        env:
          WHL_TOKEN: ${{ secrets.WHL_TOKEN }}

      - name: Update wheel index
        run: python3 scripts/update_kernel_whl_index.py --cuda 130

      - name: Push wheel index
        run: |
          cd sgl-whl
          git config --local user.name "sglang-bot"
          git config --local user.email "[email protected]"
          git add -A
          git commit -m "update whl index"
          git push
Can you make it a matrix to unify cu130 and cu130-aarch64?
Yes, I can. I'm testing sgl-flash-attention at the moment.
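
For reference, a minimal sketch of how the two build jobs in this diff might be folded into one matrix job, as the review comment asks. This is not the change that was merged. The matrix keys `arch`, `runner`, `build-arg`, and `artifact-suffix` are hypothetical names introduced here, and the sketch assumes `build.sh` treats a missing third argument the same way the current x86_64 job does.

```yaml
  build-cu130:
    if: github.repository == 'sgl-project/sglang'
    # The runner label comes from the matching `include` entry below.
    runs-on: ${{ matrix.runner }}
    strategy:
      matrix:
        python-version: [ "3.10" ]
        cuda-version: [ "13.0" ]
        arch: [ x86_64, aarch64 ]   # hypothetical key, one value per former job
        include:
          # Each entry is merged only into the combination with the same `arch`.
          - arch: x86_64
            runner: x64-kernel-build-node
            build-arg: ""
            artifact-suffix: ""
          - arch: aarch64
            runner: arm-kernel-build-node
            build-arg: "aarch64"
            artifact-suffix: "-aarch64"
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: "recursive"

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Build wheels
        run: |
          cd sgl-kernel
          chmod +x ./build.sh
          # build-arg is left unquoted so that an empty value passes no third argument.
          ./build.sh "${{ matrix.python-version }}" "${{ matrix.cuda-version }}" ${{ matrix.build-arg }}

      - name: Upload to PyPI
        working-directory: sgl-kernel
        run: |
          pip install twine
          python3 -m twine upload --skip-existing dist/* -u __token__ -p ${{ secrets.PYPI_TOKEN }}

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: wheel-python${{ matrix.python-version }}-cuda${{ matrix.cuda-version }}${{ matrix.artifact-suffix }}
          path: sgl-kernel/dist/*
```

Because both artifacts would still match the `wheel-*` download pattern, a single release job with `needs: build-cu130` could probably replace the two release jobs as well, which would also avoid two jobs pushing to the wheel index at the same time.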