@papa99do papa99do commented Apr 22, 2025

Change Summary

Our perf tests identified Pydantic validation logic as one of the major performance bottlenecks in the search response parsing process. The core logic of Pydantic v2 is rewritten in Rust and is much more performant than Pydantic v1, so upgrading to Pydantic v2 helps address this bottleneck. In addition, once Pydantic is upgraded, we can also upgrade Starlette and FastAPI to address some security issues.

Instead of a big-bang change that touches every Pydantic model class in the Marqo repo, we will take baby steps and gradually migrate the model classes to v2, which is less risky. Please see this guide for more details.

In this PR

  • Upgrade the Pydantic dependency to the latest v2 version
  • Upgrade the FastAPI, Starlette, and Uvicorn dependencies to their latest versions
  • Replace all imports from the pydantic package with pydantic.v1, so we still use the v1 implementation embedded in the Pydantic v2 package.
  • Replace API method parameters typed as Pydantic models with dictionaries, and handle the conversion from dictionary to Pydantic v1 model manually. This is needed because the latest versions of FastAPI/Starlette use Pydantic v2 and do not accept v1 models as API method parameters.
    • Please check the api.py and test_api.py files specifically for this change.
    • Please check
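
For reviewers, a minimal sketch of the pattern described in the last two items above: the model is imported from pydantic.v1, and the raw request dictionary is converted to the v1 model by hand. `SearchRequest`, `InvalidRequestError`, and `parse_search_request` are hypothetical names used for illustration only; they are not classes or functions from this PR. In a FastAPI route, the body parameter would be declared as a plain dictionary and passed to a helper like this one.

```python
from typing import Any, Dict

# pydantic.v1 is the v1 implementation bundled inside the Pydantic v2 package.
from pydantic.v1 import BaseModel, Field, ValidationError, validator


class InvalidRequestError(Exception):
    """Hypothetical API-level error; stands in for whatever the API layer raises."""


class SearchRequest(BaseModel):
    """Hypothetical v1-style request model, not a model from the Marqo repo."""

    q: str
    limit: int = Field(default=10, ge=1)

    @validator("q")
    def q_must_not_be_blank(cls, value: str) -> str:
        if not value.strip():
            raise ValueError("q must not be blank")
        return value


def parse_search_request(body: Dict[str, Any]) -> SearchRequest:
    """Convert a raw request dictionary into the v1 model manually.

    The web framework only sees a dict, so validation that it would
    otherwise run on a model-typed parameter has to happen here.
    """
    try:
        return SearchRequest.parse_obj(body)
    except ValidationError as exc:
        # Re-raise as an API-level error so callers can map it to an HTTP response.
        raise InvalidRequestError(str(exc)) from exc


request = parse_search_request({"q": "red shoes", "limit": 5})
```

One consequence of this approach is that validation errors no longer come from the framework, so the handler has to translate them into the response format the API already uses.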

In a following PR, we will

  • Upgrade QueryResult to use Pydantic v2
  • Upgrade SemiStructuredVespaDocument to use Pydantic v2
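
As a rough illustration of what migrating a single model class involves, the sketch below defines the same model once against pydantic.v1 and once against the Pydantic v2 API. `HitV1` and `HitV2` are hypothetical stand-ins; the real fields of QueryResult and SemiStructuredVespaDocument are not shown in this PR, so nothing here should be read as their actual definition.

```python
from pydantic import BaseModel as V2BaseModel, field_validator
from pydantic.v1 import BaseModel as V1BaseModel, validator


class HitV1(V1BaseModel):
    """Hypothetical model written against the embedded v1 API."""

    id: str
    score: float

    @validator("score")
    def score_non_negative(cls, value: float) -> float:
        if value < 0:
            raise ValueError("score must be non-negative")
        return value


class HitV2(V2BaseModel):
    """The same hypothetical model written against the v2 API."""

    id: str
    score: float

    # v2 replaces @validator with @field_validator on a classmethod.
    @field_validator("score")
    @classmethod
    def score_non_negative(cls, value: float) -> float:
        if value < 0:
            raise ValueError("score must be non-negative")
        return value


raw = {"id": "doc1", "score": 0.5}

# v1 uses parse_obj()/dict(); v2 uses model_validate()/model_dump().
v1_dump = HitV1.parse_obj(raw).dict()
v2_dump = HitV2.model_validate(raw).model_dump()
```

Because both APIs are available from the same installed package, models can be moved one at a time, which is the gradual migration this PR sets up.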

Related Jira Ticket

MOSD-365

Checklist

  • Tests have been added for changes
  • Documentation has been updated
  • Breaking changes are clearly identified
  • Python client changes linked or N/A

For new field types:

  • Tests cover score modifier usage of this new type
  • Test indexes updated to cover the new type for all APIs (add docs, search, partial update, etc.)

@papa99do papa99do marked this pull request as ready for review April 22, 2025 05:53
@papa99do papa99do changed the title from "Upgrade pydantic to v2" to "Upgrade pydantic to v2 (PR1)" Apr 23, 2025
@papa99do papa99do requested review from Copilot and farshidz April 23, 2025 04:59

@Copilot Copilot AI left a comment

Pull Request Overview

This PR upgrades the Pydantic dependency to v2 while preserving the v1 functionality via the pydantic.v1 subpackage, and concurrently updates FastAPI, Starlette, and Uvicorn to their latest versions to address performance and security issues.

  • Upgrade Pydantic to v2 and switch all direct imports to use pydantic.v1 for legacy behavior
  • Update related dependencies (FastAPI, Starlette, Uvicorn) to their respective latest versions
  • Adjust workflow scripts to improve change detection in CI

Reviewed Changes

Copilot reviewed 72 out of 72 changed files in this pull request and generated no comments.

File | Description
src/marqo/core/semi_structured_vespa_index/semi_structured_add_document_handler.py | Replaces pydantic.Field with pydantic.v1.Field
src/marqo/core/models/marqo_update_documents_response.py | Replaces pydantic imports with pydantic.v1 equivalents
src/marqo/core/models/marqo_query.py | Replaces pydantic validator and root_validator imports
src/marqo/core/models/marqo_index_request.py | Updates pydantic import and field alias usage
src/marqo/core/models/marqo_index.py | Updates multiple pydantic import statements and Field calls
src/marqo/core/models/marqo_get_documents_by_id_response.py | Replaces pydantic imports with pydantic.v1 equivalents
src/marqo/core/models/marqo_add_documents_response.py | Replaces pydantic imports with pydantic.v1 equivalents
src/marqo/core/models/hybrid_parameters.py | Replaces pydantic imports with pydantic.v1 equivalents
src/marqo/core/models/facets_parameters.py | Replaces pydantic imports and adds spacing in code
src/marqo/core/models/add_docs_params.py | Replaces pydantic imports with pydantic.v1 equivalents
src/marqo/core/inference/tensor_fields_container.py | Replaces pydantic.BaseModel import with pydantic.v1
src/marqo/core/inference/api/preprocessing_config.py | Updates pydantic Field and validator calls to pydantic.v1
src/marqo/core/inference/api/inference.py | Updates pydantic import and field calls to pydantic.v1
src/marqo/base_model.py | Replaces pydantic.BaseModel import with pydantic.v1
src/marqo/api/models/update_documents.py | Replaces pydantic.validator import with pydantic.v1
src/marqo/api/models/recommend_query.py | Replaces pydantic.root_validator import with pydantic.v1
src/marqo/api/models/get_batch_documents_request.py | Replaces pydantic Field and conlist imports with pydantic.v1 equivalents
src/marqo/api/models/embed_request.py | Replaces pydantic.Field and validator imports with pydantic.v1 equivalents
src/marqo/api/models/add_docs_objects.py | Replaces pydantic imports with pydantic.v1 equivalents and adds an explicit default for mediaDownloadThreadCount
.github/workflows/run_required_checks.yml | Updates shell script logic to correctly detect non-markdown changes

@papa99do papa99do merged commit 9c3acfa into mainline Apr 24, 2025
52 of 58 checks passed
@papa99do papa99do deleted the yihan/pydantic-v2-upgrade branch April 24, 2025 00:50