
Conversation

@mmathew23 (Collaborator)

The Llama MLP kernels produce NaNs at extremely long context lengths. This happens when num_elements exceeds 2**31, at which point int32 offset arithmetic overflows; in those cases offsets should be calculated with tl.int64 instead of tl.int32. This PR routes to int64 kernels when num_elements is large enough.
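The failure mode is easiest to see on a toy elementwise kernel. The sketch below uses hypothetical kernels, not the MLP kernels changed in this PR: in the int32 variant, pid * BLOCK_SIZE is computed in int32 and wraps once the flat index passes 2**31 - 1, while the int64 variant promotes the program id to tl.int64 before any offset arithmetic, so the pointer math stays in range.

```python
# Minimal sketch of the int32-overflow problem and the int64 fix.
# These toy kernels are illustrative only; they are not the PR's MLP kernels.
import triton
import triton.language as tl

@triton.jit
def _copy_kernel_int32(x_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                              # int32 program id
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)    # int32 math: wraps past 2**31 - 1
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x, mask=mask)

@triton.jit
def _copy_kernel_int64(x_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0).to(tl.int64)                 # promote before multiplying
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)    # computed in int64, no overflow
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x, mask=mask)
```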

```python
device = gate.device
out = torch.empty((batch, seq_len, hd), dtype = gate.dtype, device = device)
grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
if n_elements <= (2**31) - 1024:
```
Contributor

Why -1024? Is it maybe hd?

Collaborator (author)

Yes, I forgot to account for hd. The idea was to add a buffer just to be safe.

```python
batch_seq_len, hd = e.shape
n_elements = e.numel()
grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
if n_elements <= (2**31) - 1024:
```
Contributor

Maybe move (2**31) to a global var
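Both suggestions could be folded into a small host-side helper. This is only a sketch with hypothetical names (MAX_INT32_ELEMENTS, _use_int64_offsets), not the PR's final code: the limit lives in one module-level constant, and the margin covering hd plus some slack is explicit instead of the bare 1024.

```python
# Hypothetical host-side dispatch helper; names and margins are illustrative,
# not taken from the PR.
MAX_INT32_ELEMENTS = 2**31 - 1  # largest flat index representable in int32

def _use_int64_offsets(n_elements: int, hd: int, slack: int = 1024) -> bool:
    # Route to the int64 kernel once the flattened element count gets close to
    # the int32 limit; the margin leaves room for hd plus a safety buffer.
    return n_elements > MAX_INT32_ELEMENTS - hd - slack
```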

```python
e,
g,
n_elements,
BLOCK_SIZE: tl.constexpr,
```
Contributor

There is actually a way to use one kernel only and dispatch, but for now this is fine; we can refactor later.
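One possible reading of the single-kernel suggestion (a sketch, not necessarily what the reviewer had in mind) is to keep one Triton kernel and select the offset dtype with a constexpr flag, so the int32 and int64 paths become compile-time specializations of the same source rather than two copied kernels.

```python
# Sketch of a single kernel with a constexpr dtype switch (illustrative only).
import triton
import triton.language as tl

@triton.jit
def _copy_kernel(x_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr, USE_INT64: tl.constexpr):
    pid = tl.program_id(axis=0)
    if USE_INT64:
        pid = pid.to(tl.int64)  # constexpr branch: resolved at compile time
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x, mask=mask)

# Host side: Triton compiles a separate specialization per USE_INT64 value, e.g.
# _copy_kernel[grid](x, out, n_elements, BLOCK_SIZE=1024, USE_INT64=n_elements > 2**31 - 1024)
```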

mmathew23 force-pushed the tiled/contextlen branch 2 times, most recently from c008eca to 262ada3 on November 19, 2025 at 17:24.