-
Notifications
You must be signed in to change notification settings - Fork 0
perf(http): Shift from requests to asyncio/aiohttp #12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We have better performance switching to aiohttp.
Reviewer's GuideThe PR replaces blocking requests-based PGN fetching with an asynchronous aiohttp-driven workflow, introducing a fetch_archive helper with rate-limit handling, concurrency limits via semaphores and a TCPConnector, adapts the Celery task to invoke the async function with asyncio.run, and centralizes Redis broker/backend configuration in a single URI. Sequence diagram for asynchronous PGN fetching with aiohttpsequenceDiagram
participant CeleryTask as Celery Task
participant Asyncio as asyncio
participant ClientSession as aiohttp.ClientSession
participant ChessComAPI as chess.com API
CeleryTask->>Asyncio: asyncio.run(get_chess_dot_com_games(username))
Asyncio->>ClientSession: Create session with TCPConnector(limit=15)
loop For each archive
Asyncio->>ClientSession: session.get(archive_url/pgn)
alt 429 Rate Limited
ClientSession-->>Asyncio: status 429, Retry-After header
Asyncio->>Asyncio: await asyncio.sleep(Retry-After)
Asyncio->>ClientSession: Retry session.get(archive_url/pgn)
else 200 OK
ClientSession-->>Asyncio: status 200, PGN text
else Other error
ClientSession-->>Asyncio: status != 200/429
end
end
Asyncio-->>CeleryTask: Return concatenated PGNs
Class diagram for updated PGN fetching utilitiesclassDiagram
class fetch_archive {
+async fetch_archive(archive_url, session, semaphore)
}
class get_chess_dot_com_games {
+async get_chess_dot_com_games(username)
}
fetch_archive <.. get_chess_dot_com_games : uses
Class diagram for Celery task updateclassDiagram
class pgn_get_chess_com_games_by_user {
+pgn_get_chess_com_games_by_user(session_id, username)
}
class get_chess_dot_com_games {
+async get_chess_dot_com_games(username)
}
pgn_get_chess_com_games_by_user ..> get_chess_dot_com_games : calls via asyncio.run
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #12 +/- ##
==========================================
- Coverage 36.88% 36.19% -0.70%
==========================================
Files 42 42
Lines 1041 1061 +20
Branches 99 100 +1
==========================================
Hits 384 384
- Misses 641 661 +20
Partials 16 16 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @CalvoM - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `style_predictor/apis/pgn/utils.py:88` </location>
<code_context>
-def get_chess_dot_com_games(username: str) -> str:
+async def fetch_archive(
+ archive_url: str, session: aiohttp.ClientSession, semaphore: asyncio.Semaphore
+):
+ async with semaphore:
+ while True:
+ async with session.get(f"{archive_url}/pgn") as resp:
+ if resp.status == 429:
+ retry_after = resp.headers.get("Retry-After")
+ if retry_after:
+ LOG.info(f"Rate limited. Retrying after {retry_after} seconds.")
+ await asyncio.sleep(int(retry_after))
+ else:
+ # Implement exponential backoff here if no Retry-After
+ await asyncio.sleep(45) # Example fixed delay
+ elif resp.status == 200:
+ return await resp.text()
+ else:
+ LOG.error(f"Failed to fetch {archive_url}: {resp.status}")
</code_context>
<issue_to_address>
No timeout or error handling for network failures in fetch_archive.
Currently, network errors like timeouts or aiohttp.ClientError are not handled, which may cause hangs or crashes. Please add appropriate exception handling and consider implementing retries or returning an empty string on failure.
</issue_to_address>
### Comment 2
<location> `style_predictor/apis/pgn/utils.py:130` </location>
<code_context>
+ for archive in archives.get("archives", [])
+ ]
+ all_pgns = await asyncio.gather(*tasks)
+ return "\n\n".join(all_pgns)
</code_context>
<issue_to_address>
Empty strings from failed fetches will be included in the result.
Filter out empty strings from all_pgns before joining to avoid extra blank lines in the result.
</issue_to_address>
### Comment 3
<location> `style_predictor/tasks.py:314` </location>
<code_context>
@shared_task(name=constants.GET_CHESS_COM_TASK)
def pgn_get_chess_com_games_by_user(session_id: UUID, username: str):
"""Celery task to get chess games for user from chess.com."""
- pgn_data: str = get_chess_dot_com_games(username)
+ pgn_data: str = asyncio.run(get_chess_dot_com_games(username))
return save_file_and_queue_task(
session_id, username, pgn_data, FileSource.CHESSDOTCOM
</code_context>
<issue_to_address>
Using asyncio.run in a Celery task may cause issues if an event loop is already running.
In environments where an event loop is already running, asyncio.run will fail. Consider alternatives like nest_asyncio or running the async code in a separate thread to ensure compatibility.
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
5c173f4
to
615f456
Compare
We have better performance switching to aiohttp.
Summary by Sourcery
Migrate chess.com PGN fetching to asynchronous aiohttp calls with concurrency control and retry logic, update Celery task to run the async function, and DRY up Redis URI configuration for Celery.
New Features:
Enhancements: