Enable thinking models #1627

A v0 would let users instantiate the `ThinkingLogitsProcessor` logits processor with the end-of-thinking token and pass it to the generator.

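```python
model = ...
# Proposed usage: the processor takes the end-of-thinking token
# and the output type the final answer must follow.
logits_processor = ThinkingLogitsProcessor("</think>", output_type)
generator = Generator(model, processor=logits_processor)
```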

The implementation is simple:

1. While `</think>` has not been observed, `__call__` is a pass-through.
2. After `</think>` has been generated, mask the logits so that only tokens valid for the output type passed by the user can be generated.
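
The two steps above can be sketched roughly as follows. This is only an illustration, not the proposed implementation: it assumes `</think>` maps to a single token id, works on one sequence with plain Python lists instead of batched tensors, and uses a hypothetical `allowed_after_thinking` callback as a stand-in for whatever mask the output type would produce. The class name `ThinkingLogitsProcessorSketch` and its constructor arguments are made up for this example.

```python
import math
from typing import Callable, List, Sequence, Set


class ThinkingLogitsProcessorSketch:
    """Hypothetical sketch: pass-through while thinking, masked afterwards."""

    def __init__(
        self,
        end_thinking_token_id: int,
        allowed_after_thinking: Callable[[Sequence[int]], Set[int]],
    ) -> None:
        # Assumption: the end-of-thinking marker is a single token.
        self.end_thinking_token_id = end_thinking_token_id
        # Assumption: given the answer tokens generated so far, this callback
        # returns the token ids that keep the output valid for the user's type.
        self.allowed_after_thinking = allowed_after_thinking

    def __call__(self, input_ids: List[int], logits: List[float]) -> List[float]:
        # Step 1: the end-of-thinking token has not been seen, so do nothing.
        if self.end_thinking_token_id not in input_ids:
            return list(logits)

        # Step 2: only the tokens after the first end-of-thinking token
        # count as part of the structured answer.
        end = input_ids.index(self.end_thinking_token_id)
        answer_so_far = input_ids[end + 1:]
        allowed = self.allowed_after_thinking(answer_so_far)

        # Disallowed tokens get -inf so they can never be sampled.
        return [
            logit if token_id in allowed else -math.inf
            for token_id, logit in enumerate(logits)
        ]


# Toy vocabulary: 0 = "yes", 1 = "no", 2 = "hmm", 3 = "</think>"
END_THINK = 3
processor = ThinkingLogitsProcessorSketch(END_THINK, lambda answer: {0, 1})

logits = [0.1, 0.2, 0.3, 0.4]
still_thinking = processor([2, 2], logits)   # no "</think>" yet
after_thinking = processor([2, 3], logits)   # "</think>" was just generated
```

In the toy usage, the logits come back unchanged while the model is still thinking, and only the "yes"/"no" tokens keep finite logits once `</think>` has appeared. A real version would also need to handle batches and an end-of-thinking marker that spans several tokens.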
