Merged

Changes from 14 commits
1 change: 1 addition & 0 deletions docs/release_notes/release_notes-0.15.0.rst
@@ -25,6 +25,7 @@ Key Features and Updates
* FEAT-#4320: Add connectorx as an alternative engine for read_sql (#4346)
* Benchmarking enhancements
* FEAT-#4371: Add logging to Modin (#4372)
* FEAT-#4501: Add RSS Memory Profiling to Modin Logging (#4502)
* Refactor Codebase
* REFACTOR-#4284: use variable length unpacking when getting results from `deploy` function (#4285)
* REFACTOR-#3642: Move PyArrow storage format usage from main feature to experimental ones (#4374)
11 changes: 9 additions & 2 deletions modin/logging/config.py
@@ -18,6 +18,7 @@
"""

import logging
from logging.handlers import RotatingFileHandler
import datetime as dt
import os
import uuid
@@ -96,7 +97,12 @@ def configure_logging():

os.makedirs(os.path.dirname(log_filename), exist_ok=True)

logfile = logging.FileHandler(log_filename, "a")
logfile = RotatingFileHandler(
filename=log_filename,
mode="a",
maxBytes=100000000,
backupCount=100,
)

[inline review thread on `maxBytes=100000000`]

Collaborator: Why do we have maxBytes set so high?

Author (@naren-ponder): I set it to 10MB for each file (since it is being stored locally, I think this is reasonable), and a logging maximum of 10GB.

Collaborator: Where is the 10GB max?

Author (@naren-ponder, Jun 1, 2022): https://docs.python.org/3/library/logging.handlers.html#rotatingfilehandler -- because backupCount is set to 100, you get at most 100 files of 10MB each. If people are running logs at that level and need support, they would likely reach out to us anyway.

Collaborator: Sounds good to me!

Collaborator: This looks like 100 MB (10^8), not 10 MB. Is that what you meant to put? I think something like int(1e8) would be more readable.

Collaborator: I think it would be better to make this configurable instead of hard-coded. A default max of 10GB of logs is way too high -- that is 10GB per Modin job, not total. I would prefer to cap it at 100MB per job by default; if more is needed, users can set a configuration option.

Author (@naren-ponder, Jun 2, 2022): So 10MB per file, and a max of 10 files, for a total of 100MB per job? From what I could find, the recommended maximum size per log file on a local filesystem is somewhere in the 10-50MB range.
formatter = ModinFormatter(
fmt="%(process)d, %(thread)d, %(asctime)s, %(message)s",
datefmt="%Y-%m-%d,%H:%M:%S.%f",
@@ -108,7 +114,6 @@ def configure_logging():
logger.setLevel(logging.INFO)
logger.setLevel(logging.DEBUG)

logger = logging.getLogger("modin.logger")
logger.info(f"OS Version: {platform.platform()}")
logger.info(f"Python Version: {platform.python_version()}")
modin_version = pkg_resources.get_distribution("modin").version
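The handler change above can be sketched as a standalone snippet. The constant names and the 10 MB / 10-file split below are illustrative assumptions drawn from the review discussion (a roughly 100 MB cap per job), not Modin's actual hard-coded values:

```python
# Sketch of a size-capped rotating log handler, as in this diff.
# LOG_MAX_BYTES and LOG_BACKUP_COUNT are hypothetical names; the PR
# hard-codes maxBytes=100000000 (100 MB) and backupCount=100.
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

LOG_MAX_BYTES = int(1e7)  # 10 MB per file; int(1e8) would match the PR's value
LOG_BACKUP_COUNT = 10     # 10 rotated files -> roughly a 100 MB total cap

log_filename = os.path.join(tempfile.mkdtemp(), "modin.log")
os.makedirs(os.path.dirname(log_filename), exist_ok=True)

logfile = RotatingFileHandler(
    filename=log_filename,
    mode="a",
    maxBytes=LOG_MAX_BYTES,
    backupCount=LOG_BACKUP_COUNT,
)
logfile.setFormatter(
    logging.Formatter("%(process)d, %(thread)d, %(asctime)s, %(message)s")
)
logger = logging.getLogger("modin.logger.demo")
logger.addHandler(logfile)
logger.setLevel(logging.INFO)
logger.info("rotating handler ready")
```

Once the cap is reached, `RotatingFileHandler` renames the current file to `modin.log.1`, shifts older backups up, and starts a fresh file, so disk usage stays bounded at roughly `maxBytes * (backupCount + 1)`.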
@@ -144,8 +149,10 @@ def memory_thread(logger, sleep_time):
The interval at which to profile system memory.
"""
while True:
rss_mem = bytes_int_to_str(psutil.Process().memory_info().rss)
svmem = psutil.virtual_memory()
logger.info(f"Memory Percentage: {svmem.percent}%")
logger.info(f"RSS Memory: {rss_mem}")
time.sleep(sleep_time)


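The polling loop added in `memory_thread` above can be sketched in a self-contained form. The PR itself reads `psutil.Process().memory_info().rss` and `psutil.virtual_memory()`; here the stdlib `resource` module stands in so the snippet runs without extra installs, and the `stop_event` parameter plus the helper body are illustrative assumptions, not Modin's actual implementation:

```python
# Hedged sketch of a background RSS-profiling thread, loosely following
# the memory_thread() loop in this diff.
import logging
import resource
import sys
import threading

def bytes_int_to_str(num_bytes):
    # Simplified stand-in for Modin's helper of the same name.
    for unit in ("B", "KiB", "MiB", "GiB"):
        if num_bytes < 1024:
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.1f} TiB"

def memory_thread(logger, sleep_time, stop_event):
    """Log peak RSS every `sleep_time` seconds until `stop_event` is set."""
    while not stop_event.is_set():
        rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
        rss *= 1 if sys.platform == "darwin" else 1024
        logger.info(f"RSS Memory: {bytes_int_to_str(rss)}")
        stop_event.wait(sleep_time)

logger = logging.getLogger("modin.logger.memory_demo")
logger.setLevel(logging.INFO)
stop = threading.Event()
worker = threading.Thread(
    target=memory_thread, args=(logger, 0.05, stop), daemon=True
)
worker.start()
stop.set()  # signal the loop to exit; a real job would set this at shutdown
worker.join(timeout=2)
```

Running the profiler on a daemon thread, as the PR does, means it never blocks interpreter shutdown while still emitting a memory sample at each interval.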