Warnings shown when using default configuration in BasicCrawler #1468

@vdusek

Description

  • When a BasicCrawler (or any other crawler) runs with the default configuration, that is, with no explicit Configuration, StorageClient, or EventManager provided, three warning messages are printed to the log, one for each implicitly created default component.

Minimal reproducible example

import asyncio

from crawlee.crawlers import BasicCrawler, BasicCrawlingContext


async def main() -> None:
    crawler = BasicCrawler()

    @crawler.router.default_handler
    async def request_handler(context: BasicCrawlingContext) -> None:
        context.log.info(f'Processing URL: {context.request.url}...')

    await crawler.run(['https://crawlee.dev/'])


if __name__ == '__main__':
    asyncio.run(main())

$ uv run python run_crawler.py
No configuration set, implicitly creating and using default Configuration.
No storage client set, implicitly creating and using default FileSystemStorageClient.
No event manager set, implicitly creating and using default LocalEventManager.
[crawlee.crawlers._basic._basic_crawler] INFO  Crawled 0/813 pages, 0 failed requests, desired concurrency 10.
[crawlee.crawlers._basic._basic_crawler] INFO  Current request statistics:
┌───────────────────────────────┬────────┐
│ requests_finished             │ 0      │
│ requests_failed               │ 0      │
│ retry_histogram               │ [0]    │
│ request_avg_failed_duration   │ None   │
│ request_avg_finished_duration │ None   │
│ requests_finished_per_minute  │ 0      │
│ requests_failed_per_minute    │ 0      │
│ request_total_duration        │ 0s     │
│ requests_total                │ 0      │
│ crawler_runtime               │ 23.3ms │
└───────────────────────────────┴────────┘
[crawlee._autoscaling.autoscaled_pool] INFO  current_concurrency = 0; desired_concurrency = 10; cpu = 0; mem = 0; event_loop = 0.0; client_info = 0.0
[crawlee.crawlers._basic._basic_crawler] INFO  Processing URL: https://crawlee.dev/...
[crawlee._autoscaling.autoscaled_pool] INFO  Waiting for remaining tasks to finish
[crawlee.crawlers._basic._basic_crawler] INFO  Final request statistics:
┌───────────────────────────────┬────────┐
│ requests_finished             │ 1      │
│ requests_failed               │ 0      │
│ retry_histogram               │ [1]    │
│ request_avg_failed_duration   │ None   │
│ request_avg_finished_duration │ 6.7ms  │
│ requests_finished_per_minute  │ 1663   │
│ requests_failed_per_minute    │ 0      │
│ request_total_duration        │ 6.7ms  │
│ requests_total                │ 1      │
│ crawler_runtime               │ 36.1ms │
└───────────────────────────────┴────────┘
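
Until the defaults are created silently, one possible stopgap is to filter the three messages out on the application side. The sketch below assumes the messages are emitted through Python's standard logging module, which the output above suggests but does not confirm. ImplicitDefaultFilter and install_filter are hypothetical names, not part of crawlee, and the marker string is copied from the log lines quoted in this issue.

```python
import logging


class ImplicitDefaultFilter(logging.Filter):
    """Drop the 'implicitly creating and using default ...' messages."""

    # Substring shared by the three messages quoted in this issue.
    MARKER = 'implicitly creating and using default'

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False discards the record; all other records pass through.
        return self.MARKER not in record.getMessage()


def install_filter() -> logging.Handler:
    """Attach a stream handler carrying the filter to the root logger.

    Assumption: the messages propagate to the root logger. A handler-level
    filter is used because logger-level filters do not apply to records
    propagated from child loggers.
    """
    handler = logging.StreamHandler()
    handler.addFilter(ImplicitDefaultFilter())
    logging.getLogger().addHandler(handler)
    return handler
```

Calling install_filter() before constructing the crawler should hide the messages only if the assumptions above hold. It is a workaround and does not replace the fix requested here.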

Expected behavior

  • Default components (Configuration, StorageClient, EventManager) should be initialized silently without producing warning log messages.
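
A minimal sketch of the expected behavior, using only the standard library. This is not crawlee's implementation: Configuration, ServiceLocator, the purge_on_start field, and the logger name are hypothetical stand-ins. The point is that falling back to a default is the normal path, so it is logged at DEBUG level at most, never as a warning.

```python
import logging
from dataclasses import dataclass
from typing import Optional

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger('sketch.service_locator')


@dataclass
class Configuration:
    """Hypothetical stand-in for the real Configuration class."""

    purge_on_start: bool = True


class ServiceLocator:
    """Hypothetical locator that lazily creates a default Configuration."""

    def __init__(self) -> None:
        self._configuration: Optional[Configuration] = None

    def get_configuration(self) -> Configuration:
        if self._configuration is None:
            # Using the default is expected, so log at DEBUG, not WARNING.
            logger.debug('No configuration set, using default Configuration.')
            self._configuration = Configuration()
        # Later calls reuse the same instance and log nothing.
        return self._configuration
```

With Python's default logging setup, DEBUG records are not shown, so a caller that never configures anything sees no extra output. The same pattern would apply to the storage client and the event manager.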

Metadata

Assignees

No one assigned

Labels

  • bug: Something isn't working.
  • t-tooling: Issues with this label are in the ownership of the tooling team.
