fix(deps): update dependency huggingface-hub to ~=0.32.4 #1483
This PR contains the following updates:
huggingface-hub: ~=0.28.0 -> ~=0.32.4
Release Notes
huggingface/huggingface_hub (huggingface-hub)
v0.32.4: Bug fixes in tiny-agents, and fix input handling for question-answering task
Compare Source
Full Changelog: huggingface/huggingface_hub@v0.32.3...v0.32.4
This release introduces bug fixes to tiny-agents and InferenceClient.question_answering:
asyncio.wait() does not accept bare coroutines #3135 by @hanouticelina
v0.32.3: Handle env variables in tiny-agents, better CLI exit and handling of MCP tool calls arguments
Compare Source
Full Changelog: huggingface/huggingface_hub@v0.32.2...v0.32.3
This release introduces some improvements and bug fixes to tiny-agents:
tiny-agents CLI exit issues #3125
v0.32.2: Add endpoint support in Tiny-Agent + fix snapshot_download on large repos
Compare Source
Full Changelog: huggingface/huggingface_hub@v0.32.1...v0.32.2
v0.32.1: hot-fix: Fix tiny agents on Windows
Compare Source
Patch release to fix #3116
Full Changelog: huggingface/huggingface_hub@v0.32.0...v0.32.1
v0.32.0: MCP Client, Tiny Agents CLI and more!
Compare Source
🤖 Powering LLMs with Tools: MCP Client & Tiny Agents CLI
✨ The huggingface_hub library now includes an MCP Client, designed to empower Large Language Models (LLMs) with the ability to interact with external Tools via the Model Context Protocol (MCP). This client extends the InferenceClient and provides a seamless way to connect LLMs to both local and remote tool servers!
In the following example, we use the Qwen/Qwen2.5-72B-Instruct model via the Nebius inference provider. We then add a remote MCP server, in this case an SSE server which makes the Flux image generation tool available to the LLM:
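A rough sketch of the flow described above (the SSE server URL is a placeholder and the exact method names should be checked against the MCP Client documentation):

```python
import asyncio
import os

from huggingface_hub import MCPClient  # MCP client introduced in v0.32.0


async def main() -> None:
    # Chat model served through the Nebius provider, as described above.
    client = MCPClient(
        model="Qwen/Qwen2.5-72B-Instruct",
        provider="nebius",
        api_key=os.environ["HF_TOKEN"],
    )
    # Register a remote SSE MCP server exposing an image-generation tool
    # (placeholder URL; point it at the Flux MCP server you want to use).
    await client.add_mcp_server(type="sse", url="https://example.com/gradio_api/mcp/sse")

    messages = [{"role": "user", "content": "Generate a picture of a cat on the moon"}]
    # Stream the assistant's answer; tool calls are resolved through the MCP server.
    async for chunk in client.process_single_turn_with_tools(messages):
        print(chunk)


asyncio.run(main())
```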
For even simpler development, we now also offer a higher-level Agent class. These 'Tiny Agents' simplify creating conversational Agents by managing the chat loop and state, essentially acting as a user-friendly wrapper around MCPClient. It's designed to be a simple while loop built right on top of an MCPClient. You can run these Agents directly from the command line, using your own local configs or loading them directly from the Hugging Face dataset tiny-agents.
This is an early version of the MCPClient, and community contributions are welcome 🤗
InferenceClient is also a MCPClient by @julien-c in #2986
⚡ Inference Providers
Thanks to @diadorer, feature extraction (embeddings) inference is now supported with Nebius provider!
We’re thrilled to introduce Nscale as an official inference provider! This expansion strengthens the Hub as the go-to entry point for running inference on open-weight models 🔥
We also fixed compatibility issues with structured outputs across providers by ensuring the InferenceClient follows the OpenAI API specs for structured output.
💾 Serialization
We've introduced a new @strict decorator for dataclasses, providing robust validation capabilities to ensure data integrity both at initialization and during assignment. A basic example is sketched below. This feature also includes support for custom validators, class-wise validation logic, handling of additional keyword arguments, and automatic validation based on type hints. Documentation can be found here.
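A minimal sketch, assuming the decorator is importable from huggingface_hub.dataclasses (field names and values are illustrative):

```python
from dataclasses import dataclass

from huggingface_hub.dataclasses import strict  # assumed import path


@strict
@dataclass
class TrainingConfig:
    model_type: str
    hidden_size: int = 1024


config = TrainingConfig(model_type="bert", hidden_size=512)  # passes validation
# config.hidden_size = "huge"  # would raise a validation error (types are checked on assignment too)
```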
@strict decorator for dataclass validation by @Wauplin in #2895
This release also brings support for DTensor in _get_unique_id / get_torch_storage_size helpers, allowing transformers to seamlessly use save_pretrained with DTensor.
✨ HF API
When creating an Endpoint, the default for scale_to_zero_timeout is now None, meaning endpoints will no longer scale to zero by default unless explicitly configured (see the sketch below).
We've also introduced experimental helpers to manage OAuth within FastAPI applications, bringing functionality previously used in Gradio to a wider range of frameworks for easier integration.
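A sketch of the scale_to_zero_timeout change (the endpoint name, repo and hardware values below are illustrative, not recommendations); to keep the old scale-to-zero behaviour you now have to opt in explicitly:

```python
from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    "my-endpoint-name",         # placeholder endpoint name
    repository="gpt2",          # placeholder model repo
    framework="pytorch",
    task="text-generation",
    accelerator="cpu",
    vendor="aws",
    region="us-east-1",
    type="protected",
    instance_size="x2",
    instance_type="intel-icl",
    scale_to_zero_timeout=15,   # opt in explicitly; the default is now None (never scale to zero)
)
```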
📚 Documentation
We now have much more detailed documentation for Inference! This includes explanations and examples to clarify that the InferenceClient can also be effectively used with local endpoints (llama.cpp, vLLM, MLX, etc.).
🛠️ Small fixes and maintenance
😌 QoL improvements
api.endpoint to arguments for _get_upload_mode by @matthewgrossman in #3077
🐛 Bug and typo fixes
read() by @lhoestq in #3080
🏗️ internal
hf-xet optional by @hanouticelina in #3079
Community contributions
huggingface-cli repo create command by @Wauplin in #3094
Significant community contributions
The following contributors have made significant changes to the library over the last release:
v0.31.4: strict dataclasses, support DTensor saving & some bug fixes
Compare Source
This release includes some new features and bug fixes:
strict decorators for runtime dataclass validation with custom and type-based checks, by @Wauplin in https://github.com/huggingface/huggingface_hub/pull/2895.
DTensor support in _get_unique_id / get_torch_storage_size helpers, enabling transformers to use save_pretrained with DTensor, by @S1ro1 in https://github.com/huggingface/huggingface_hub/pull/3042.
Full Changelog: huggingface/huggingface_hub@v0.31.2...v0.31.4
v0.31.3
Compare Source
v0.31.2: Hot-fix: make hf-xet optional again and bump the min version of the package
Compare Source
Patch release to make hf-xet optional. More context in #3079 and #3078.
Full Changelog: huggingface/huggingface_hub@v0.31.1...v0.31.2
v0.31.1
Compare Source
v0.31.0: LoRAs with Inference Providers, auto mode for provider selection, embeddings models and more
Compare Source
🧑🎨 Introducing LoRAs with fal.ai and Replicate providers
We're introducing blazingly fast LoRA inference powered by fal.ai and Replicate through Hugging Face Inference Providers! You can use any compatible LoRA available on the Hugging Face Hub and get generations at lightning fast speed ⚡
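A minimal sketch of what LoRA inference looks like through a provider (the LoRA repo id below is a placeholder; any fal.ai- or Replicate-compatible LoRA on the Hub works the same way):

```python
from huggingface_hub import InferenceClient

client = InferenceClient(provider="fal-ai")
image = client.text_to_image(
    "a watercolor portrait of a corgi",
    model="username/my-flux-lora",  # placeholder LoRA repo id on the Hub
)
image.save("corgi.png")
```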
⚙️ auto mode for provider selection
You can now automatically select a provider for a model using auto mode — it will pick the first available provider based on your preferred order set in https://hf.co/settings/inference-providers. auto is also now the default value for the provider argument. Previously, the default was hf-inference, so this change may be a breaking one if you're not specifying the provider name when initializing InferenceClient or AsyncInferenceClient.
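A minimal sketch of the new behaviour (the model id is a placeholder):

```python
from huggingface_hub import InferenceClient

# provider="auto" (now the default) picks the first available provider according to
# your preference order at https://hf.co/settings/inference-providers.
client = InferenceClient(provider="auto")  # equivalent to InferenceClient()
completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model id
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(completion.choices[0].message.content)
```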
provider="auto" by @julien-c in #3011
🧠 Embeddings support with Sambanova (feature-extraction)
We added support for feature extraction (embeddings) inference with the Sambanova provider.
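A short sketch (the model id is a placeholder for an embedding model served by Sambanova):

```python
from huggingface_hub import InferenceClient

client = InferenceClient(provider="sambanova")
embeddings = client.feature_extraction(
    "Hugging Face is awesome!",
    model="intfloat/e5-mistral-7b-instruct",  # placeholder embedding model id
)
print(embeddings.shape)  # numpy array of floats
```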
⚡ Other Inference features
The HF Inference API provider is now fully integrated as an Inference Provider, which means it only supports a predefined list of deployed models, selected based on popularity.
Cold-starting arbitrary models from the Hub is no longer supported — if a model isn't already deployed, it won’t be available via HF Inference API.
Miscellaneous improvements and some bug fixes:
✅ Of course, all of those inference changes are available in the AsyncInferenceClient async equivalent 🤗
🚀 Xet
Thanks to @bpronan's PR, Xet now supports uploading byte arrays:
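For instance, in-memory content can now be pushed directly (a sketch; the repo id is a placeholder):

```python
from huggingface_hub import HfApi

api = HfApi()
api.upload_file(
    path_or_fileobj=b"some in-memory bytes",  # raw bytes, no temp file needed
    path_in_repo="data/blob.bin",
    repo_id="username/my-dataset",            # placeholder Xet-enabled repo
    repo_type="dataset",
)
```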
Additionally, we’ve added documentation for environment variables used by hf-xet to optimize file download/upload performance — including options for caching (HF_XET_CHUNK_CACHE_SIZE_BYTES), concurrency (HF_XET_NUM_CONCURRENT_RANGE_GETS), high-performance mode (HF_XET_HIGH_PERFORMANCE), and sequential writes (HF_XET_RECONSTRUCT_WRITE_SEQUENTIALLY).
Miscellaneous improvements:
✨ HF API
We added HTTP download support for files larger than 50GB — enabling more reliable handling of large file downloads.
We also added dynamic batching to upload_large_folder, replacing the fixed 50-files-per-commit rule with an adaptive strategy that adjusts based on commit success and duration — improving performance and reducing the risk of hitting the commits rate limit on large repositories.
We added support for new arguments when creating or updating Hugging Face Inference Endpoints.
💔 Breaking changes
The provider argument in InferenceClient and AsyncInferenceClient is now "auto" instead of "hf-inference" (HF Inference API) by default. This means provider selection will now follow your preferred order set in your inference provider settings.
If your code relied on the previous default ("hf-inference"), you may need to update it explicitly to avoid unexpected behavior.
The URL for feature-extraction and sentence-similarity tasks has changed from https://router.huggingface.co/hf-inference/pipeline/{task}/{model} to https://router.huggingface.co/hf-inference/models/{model}/pipeline/{task}.
🛠️ Small fixes and maintenance
😌 QoL improvements
🐛 Bug and typo fixes
🏗️ internal
hf_xet min version to 1.0.0 + make it required dep on 64 bits by @hanouticelina in #2971
Community contributions
The following contributors have made significant changes to the library over the last release:
v0.30.2: Fix text-generation task in InferenceClient
Compare Source
Fixing some InferenceClient-related bugs:
Full Changelog: huggingface/huggingface_hub@v0.30.1...v0.30.2
v0.30.1: fix 'sentence-transformers/all-MiniLM-L6-v2' doesn't support task 'feature-extraction'
Compare Source
Patch release to fix https://github.com/huggingface/huggingface_hub/issues/2967.
Full Changelog: huggingface/huggingface_hub@v0.30.0...v0.30.1
v0.30.0: Xet is here! (+ many cool Inference-related things!)
Compare Source
🚀 Ready. Xet. Go!
This might just be our biggest update in the past two years! Xet is a groundbreaking new protocol for storing large objects in Git repositories, designed to replace Git LFS. Unlike LFS, which deduplicates files, Xet operates at the chunk level—making it a game-changer for AI builders collaborating on massive models and datasets. Our Python integration is powered by xet-core, a Rust-based package that handles all the low-level details.
You can start using Xet today by installing the optional dependency:
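A sketch of getting started (the pip extra name is the one documented for Xet support; the repo and file ids are placeholders):

```python
# Install the optional Xet backend first, e.g.:
#   pip install -U "huggingface_hub[hf_xet]"
# Downloads from Xet-enabled repositories then go through the Xet protocol transparently:
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="gpt2", filename="config.json")  # example repo/file
print(path)
```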
With that, you can seamlessly download files from Xet-enabled repositories! And don’t worry—everything remains fully backward-compatible if you’re not ready to upgrade yet.
Blog post: Xet on the Hub
Docs: Storage backends → Xet
This is the result of collaborative work by @bpronan, @hanouticelina, @rajatarya, @jsulz, @assafvayner, @Wauplin, + many others on the infra/Hub side!
xetEnabled as an expand property by @hanouticelina in #2907
⚡ Enhanced InferenceClient
The InferenceClient has received significant updates and improvements in this release, making it more robust and easy to work with.
We’re thrilled to introduce Cerebras and Cohere as official inference providers! This expansion strengthens the Hub as the go-to entry point for running inference on open-weight models.
Novita is now our 3rd provider to support text-to-video task after Fal.ai and Replicate:
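A minimal sketch (the model id is a placeholder for a Novita-served text-to-video model):

```python
from huggingface_hub import InferenceClient

client = InferenceClient(provider="novita")
video = client.text_to_video(
    "A young man walking on the street",
    model="Wan-AI/Wan2.1-T2V-14B",  # placeholder model id
)
with open("video.mp4", "wb") as f:
    f.write(video)  # raw video bytes
```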
It is now possible to centralize billing on your organization rather than individual accounts! This helps companies manage their budget and set limits at a team level. The organization must be subscribed to Enterprise Hub.
Handling long-running inference tasks just got easier! To prevent request timeouts, we’ve introduced asynchronous calls for text-to-video inference. We are expecting more providers to leverage the same structure soon, ensuring better robustness and developer experience.
Miscellaneous improvements:
InferenceClient docstring to reflect that token=False is no longer accepted by @abidlabs in #2853
provider parameter by @hanouticelina in #2949
✨ New Features and Improvements
This release also includes several other notable features and improvements.
It's now possible to pass a path with wildcard to the upload command instead of passing the --include=... option.
Deploying an Inference Endpoint from the Model Catalog just got 100x easier! Simply select which model to deploy and we handle the rest to guarantee the best hardware and settings for your dedicated endpoints.
The ModelHubMixin got two small updates; among them, init args other than config are now supported (config was the only one until now).
You can now sort by name, size, last updated and last used when using the delete-cache command:
--sort arg to delete-cache to sort by size by @AlpinDale in #2815
Since end of 2024, it has been possible to manage the LFS files stored in a repo from the UI (see docs). This release makes it possible to do the same programmatically. The goal is to enable users to free up some storage space in their private repositories.
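A hedged sketch of the programmatic flow (helper names as documented for LFS management; the repo id and filter are placeholders, and deletion is permanent, so double-check before running):

```python
from huggingface_hub import HfApi

api = HfApi()
repo_id = "username/my-private-model"  # placeholder repo

# List LFS files, then permanently delete the ones we no longer need to free up storage.
lfs_files = api.list_lfs_files(repo_id)
files_to_delete = (f for f in lfs_files if f.filename.endswith(".bin"))  # placeholder filter
api.permanently_delete_lfs_files(repo_id, files_to_delete)
```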
💔 Breaking Changes
labels has been removed from InferenceClient.zero_shot_classification and InferenceClient.zero_shot_image_classification tasks in favor of candidate_labels. There has been a proper deprecation warning for that.
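For example, a zero-shot call now looks like this (the sample text, labels and model id are illustrative):

```python
from huggingface_hub import InferenceClient

client = InferenceClient()
result = client.zero_shot_classification(
    "I really enjoyed this movie!",
    candidate_labels=["positive", "negative"],  # formerly passed as `labels`
    model="facebook/bart-large-mnli",           # placeholder zero-shot model
)
print(result)
```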
🛠️ Small Fixes and Maintenance
🐛 Bug and Typo Fixes
🏗️ Internal
Thanks to the work previously introduced by the diffusers team, we've published a GitHub Action that runs code style tooling on demand on Pull Requests, making the life of contributors and reviewers easier.
Other minor updates:
Significant community contributions
The following contributors have made significant changes to the library over the last release:
InferenceClient docstring to reflect that token=False is no longer accepted (#2853)
--sort arg to delete-cache to sort by size (#2815)
v0.29.3: Adding 2 new Inference Providers: Cerebras and Cohere 🔥
Compare Source
Added client-side support for Cerebras and Cohere providers for upcoming official launch on the Hub.
Cerebras: https://github.com/huggingface/huggingface_hub/pull/2901.
Cohere: https://github.com/huggingface/huggingface_hub/pull/2888.
Full Changelog: huggingface/huggingface_hub@v0.29.2...v0.29.3
v0.29.2: Fix payload model name when model id is a URL & Restore sys.stdout in notebook_login() after error
Compare Source
This patch release includes two fixes:
Full Changelog: huggingface/huggingface_hub@v0.29.1...v0.29.2
v0.29.1: Fix revision URL encoding in upload_large_folder & Fix endpoint update state handling in InferenceEndpoint.wait()
Compare Source
This patch release includes two fixes:
Full Changelog: huggingface/huggingface_hub@v0.29.0...v0.29.1
v0.29.0: Introducing 4 new Inference Providers: Fireworks AI, Hyperbolic, Nebius AI Studio, and Novita 🔥
Compare Source
We’re thrilled to announce the addition of four more outstanding serverless Inference Providers to the Hugging Face Hub: Fireworks AI, Hyperbolic, Nebius AI Studio, and Novita. These providers join our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub’s model pages. This release adds official support for these 4 providers, making it super easy to use a wide variety of models with your preferred providers.
See our announcement blog for more details: https://huggingface.co/blog/new-inference-providers.
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Never, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.
Summary by Sourcery
Build: update dependency huggingface-hub from ~=0.28.0 to ~=0.32.4