Commit 1a6659d

Merge branch 'main' into refactor_cutlass_3x_fp8_blockwise_gemm_sm90
2 parents 51c2dcc + 2449a0a commit 1a6659d

297 files changed: +7182 -3115 lines changed

.github/CODEOWNERS

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
 /python/sglang/srt/eplb @fzyzcjy
 /python/sglang/srt/function_call @CatherineSue
 /python/sglang/srt/layers @merrymercy @Ying1123 @zhyncs @ispobock @HaiShaw @ch-wan @BBuf @kushanam @Edwardf0t1
-/python/sglang/srt/lora @Ying1123 @Fridge003
+/python/sglang/srt/lora @Ying1123 @Fridge003 @lifuhuang
 /python/sglang/srt/managers @merrymercy @Ying1123 @hnyls2002 @xiezhq-hermann
 /python/sglang/srt/mem_cache @merrymercy @Ying1123 @hnyls2002 @xiezhq-hermann
 /python/sglang/srt/model_executor @merrymercy @Ying1123 @hnyls2002 @zhyncs @ispobock

.github/REVIEWERS.md

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
+# Area Reviewer
+
+Here are some reviewers for common areas. You can ping them to review your code if you touch related parts.
+
+## Hardware platforms
+- general @Alcanderian
+- AMD GPU @HaiShaw
+- Blackwell GPU @kushanam @trevor-m @zhyncs
+- CPU @mingfeima
+
+## Kernel
+- general @zhyncs @ispobock @HandH1998 @BBuf @yizhang2077 @HaiShaw
+- triton attention backend @ispobock
+- flash attention @hebiao064
+
+## Scheduler and memory pool
+- general @merrymercy @Ying1123 @hnyls2002 @xiezhq-hermann
+- constrained decoding @hnyls2002
+- hierarchical cache @xiezhq-hermann @DarkSharpness
+- lora @Fridge003 @Ying1123 @lifuhuang
+- speculative decoding @merrymercy @Ying1123 @kssteven418
+- sliding window attention @hanming-lu
+
+## Parallelism
+- expert parallelism @fzyzcjy @ch-wan
+- data parallelism attention @ch-wan
+- pipeline parallelism @Ying1123
+- tensor parallelism @merrymercy
+
+## PD disaggregation
+- general @ByronHsu @ShangmingCai @hnyls2002
+- Mooncake backend @ShangmingCai
+
+## Build and release
+- general @zhyncs @merrymercy
+
+## API Server
+- general @CatherineSue @slin1237 @ispobock
+- function calling and reasoning parsing @CatherineSue
+- OpenAI API @CatherineSue @slin1237
+
+## SGL-Router
+- general @slin1237 @ByronHsu
+
+## Model
+- multimodal models @mickqian @JustinTong0323
+- other new models @zhaochenyang20
+
+## Reinforcement learning
+- general @zhaochenyang20 @hebiao064 @fzyzcjy @zhuzilin

.github/pull_request_template.md

Lines changed: 11 additions & 13 deletions
@@ -1,26 +1,24 @@
-<!-- Thank you for your contribution! We appreciate it. The following guidelines will help improve your pull request and facilitate feedback. If anything is unclear, don't hesitate to submit your pull request and ask the maintainers for assistance. -->
+<!-- Thank you for your contribution! Please follow these guidelines to enhance your pull request. If anything is unclear, submit your PR and reach out to maintainers for assistance. Join our Slack community at https://slack.sglang.ai to discuss further. -->
 
 ## Motivation
 
-<!-- Explain the purpose of this PR and the goals it aims to achieve. -->
+<!-- Describe the purpose and goals of this pull request. -->
 
 ## Modifications
 
-<!-- Describe the changes made in this PR. -->
+<!-- Detail the changes made in this pull request. -->
 
-## Accuracy Test
+## Accuracy Tests
 
-<!-- If this PR affects model-side code (e.g., kernels, model architecture), please provide accuracy test results. Ref: https://docs.sglang.ai/references/accuracy_evaluation.html -->
+<!-- If this pull request affects model outputs (e.g., changes to the kernel or model forward code), provide accuracy test results. -->
 
-## Benchmark & Profiling
+## Benchmarking and Profiling
 
-<!-- If this PR is expected to impact performance, please provide benchmark and profiling results. Ref: https://docs.sglang.ai/references/benchmark_and_profiling.html -->
+<!-- If this pull request impacts inference speed, provide benchmarking and profiling results. -->
 
 ## Checklist
 
-- [ ] Format your code according to the [Code Formatting with Pre-Commit](https://docs.sglang.ai/references/contribution_guide.html#code-formatting-with-pre-commit).
-- [ ] Add unit tests as outlined in the [Running Unit Tests](https://docs.sglang.ai/references/contribution_guide.html#running-unit-tests-adding-to-ci).
-- [ ] Update documentation / docstrings / example tutorials as needed, according to [Writing Documentation](https://docs.sglang.ai/references/contribution_guide.html#writing-documentation-running-docs-ci).
-- [ ] Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to [Benchmark and Profiling](https://docs.sglang.ai/references/benchmark_and_profiling.html) and [Accuracy Results](https://docs.sglang.ai/references/accuracy_evaluation.html).
-- [ ] For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
-- [ ] Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.
+- [ ] Format your code according to the [Code formatting with pre-commit](https://docs.sglang.ai/references/contribution_guide.html#code-formatting-with-pre-commit).
+- [ ] Add unit tests according to the [Running and adding unit tests](https://docs.sglang.ai/references/contribution_guide.html#running-unit-tests-adding-to-ci).
+- [ ] Update documentation according to [Writing documentations](https://docs.sglang.ai/references/contribution_guide.html#writing-documentation-running-docs-ci).
+- [ ] Provide accuracy and speed benchmark results according to [Testing the accuracy](https://docs.sglang.ai/references/contribution_guide.html#testing-the-accuracy) and [Benchmark and profiling]()
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+name: Cancel All Pending PR Test Runs
+
+on:
+  workflow_dispatch:
+    inputs:
+      workflows:
+        description: 'Space-separated list of workflow filenames to cancel'
+        required: true
+        type: string
+        default: 'pr-test.yml pr-test-xeon.yml'
+
+permissions:
+  actions: write # Needed to cancel runs
+  contents: read # Needed to read repo info
+
+jobs:
+  cancel-pending:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Install GitHub CLI
+        run: sudo apt-get install -y gh jq
+
+      - name: Cancel all pending/waiting runs for specified workflows
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          REPO: ${{ github.repository }}
+        run: |
+          # Read the space-separated string from the input into a bash array
+          WORKFLOW_FILES=(${{ github.event.inputs.workflows }})
+
+          echo "Targeting ${#WORKFLOW_FILES[@]} workflow(s): ${{ github.event.inputs.workflows }}"
+
+          for workflow_file in "${WORKFLOW_FILES[@]}"; do
+            echo "--- Checking workflow: $workflow_file ---"
+            gh run list \
+              --repo "$REPO" \
+              --workflow "$workflow_file" \
+              --json databaseId,status \
+              --limit 1000 \
+            | jq -r '.[] | select(.status=="queued" or .status=="in_progress") | .databaseId' \
+            | while read run_id; do
+              echo "Cancelling run ID: $run_id for workflow: $workflow_file"
+              gh run cancel "$run_id" --repo "$REPO"
+            done
+          done
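
The selection logic in the workflow above can be checked offline with a small sketch. This is not part of the commit: the function names and the sample JSON are hypothetical, and the sample only assumes that `gh run list --json databaseId,status` emits a JSON array of objects with those two fields. The sketch mirrors the two steps of the script: splitting the space-separated `workflows` input, and keeping only runs whose status is `queued` or `in_progress`.

```python
import json

# Statuses selected by the jq filter in the workflow above.
CANCELLABLE = {"queued", "in_progress"}


def split_workflows(raw: str) -> list[str]:
    """Split the space-separated input on whitespace.

    Approximates the bash array assignment; unlike bash, this does
    no glob expansion on the resulting words.
    """
    return raw.split()


def select_run_ids(runs_json: str) -> list[int]:
    """Return databaseId for every run that is queued or in progress."""
    runs = json.loads(runs_json)
    return [run["databaseId"] for run in runs if run["status"] in CANCELLABLE]


# Hypothetical sample shaped like the JSON the script pipes into jq.
sample = json.dumps([
    {"databaseId": 101, "status": "queued"},
    {"databaseId": 102, "status": "completed"},
    {"databaseId": 103, "status": "in_progress"},
])

workflows = split_workflows("pr-test.yml pr-test-xeon.yml")
run_ids = select_run_ids(sample)
print(workflows, run_ids)
```

Note that the filter matches `queued` and `in_progress` only, so runs in any other state are left untouched even though the step name mentions pending/waiting runs.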
File renamed without changes.

.github/workflows/execute-notebook.yml

Lines changed: 3 additions & 6 deletions
@@ -24,12 +24,9 @@ jobs:
 
       - name: Install dependencies
         run: |
-          bash scripts/ci_install_dependency.sh
+          bash scripts/ci/ci_install_dependency.sh
           pip install -r docs/requirements.txt
-          apt-get update
-          apt-get install -y pandoc
-          apt-get update && apt-get install -y parallel retry
-
+          apt-get update && apt-get install -y pandoc parallel retry
           ln -sf "$(which python3)" /usr/bin/python
 
       - name: Setup Jupyter Kernel
@@ -44,7 +41,7 @@ jobs:
           make compile
 
 
-  finish:
+  notebook-finish:
     needs: [
       run-all-notebooks
     ]

.github/workflows/experiment-runner.yml

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ jobs:
 
       - name: Install dependencies
         run: |
-          bash scripts/ci_install_dependency.sh
+          bash scripts/ci/ci_install_dependency.sh
 
       - name: Test experiment runner
         timeout-minutes: 120

.github/workflows/lint.yml

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 name: Lint
 
-on: [ pull_request ]
+on: [pull_request]
 
 jobs:
   lint:
@@ -11,7 +11,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: '3.9'
+          python-version: "3.10"
 
       - name: Install pre-commit hook
         run: |

.github/workflows/nightly-test-amd.yml

Lines changed: 3 additions & 3 deletions
@@ -28,14 +28,14 @@ jobs:
       - name: Setup docker
         run: |
           touch github_summary.md
-          bash scripts/amd_ci_start_container.sh
+          bash scripts/ci/amd_ci_start_container.sh
         env:
           GITHUB_WORKSPACE: ${{ github.workspace }}
 
       - name: Install dependencies
-        run: bash scripts/amd_ci_install_dependency.sh
+        run: bash scripts/ci/amd_ci_install_dependency.sh
 
       - name: Nightly Test
         run: |
-          bash scripts/amd_ci_exec.sh -e GITHUB_STEP_SUMMARY="/sglang-checkout/github_summary.md" python3 run_suite.py --suite nightly-amd --timeout-per-file 7200
+          bash scripts/ci/amd_ci_exec.sh -e GITHUB_STEP_SUMMARY="/sglang-checkout/github_summary.md" python3 run_suite.py --suite nightly-amd --timeout-per-file 7200
           echo "$(<github_summary.md )" >> $GITHUB_STEP_SUMMARY

.github/workflows/nightly-test.yml

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ jobs:
 
       - name: Install dependencies
         run: |
-          bash scripts/ci_install_dependency.sh
+          bash scripts/ci/ci_install_dependency.sh
 
       - name: Run test
         timeout-minutes: 120
