Integrating PubMed Central As a DataSource to Improve Quantifying Creative Commons #233

@Goziee-git

Problem

Improve Quantifying Creative Commons with medical and life science records from PubMed.

Description

This issue proposes integrating PubMed as a new data source for the Quantifying Creative Commons project. PubMed provides access to biomedical literature with Creative Commons licensing information, which would add insight into open access trends in scientific publishing.

Additional context

API Documentation

API Limitations and Constraints

Rate Limiting

  • Maximum Rate: 3 requests per second without an API key (0.34-second intervals)
  • Enforcement: Implemented via time.sleep(0.34) between requests
  • Retry Strategy: 3 retries with exponential backoff for failed requests
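
The pacing and retry behaviour listed above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the helper name `fetch_with_retry`, the 1-second base backoff, and the choice to retry only on `OSError` (which covers `urllib` network errors) are assumptions made for the example.

```python
import time

MIN_INTERVAL = 0.34  # seconds between requests (about 3 requests per second)
MAX_RETRIES = 3      # retries allowed after the first failed attempt


def fetch_with_retry(request_fn, retries=MAX_RETRIES, backoff=1.0,
                     interval=MIN_INTERVAL, retry_on=(OSError,),
                     sleep=time.sleep):
    """Call request_fn(), pacing successful calls and retrying failures.

    The name, the backoff base, and the retry_on default are hypothetical.
    """
    for attempt in range(retries + 1):
        try:
            result = request_fn()
        except retry_on:
            if attempt == retries:
                raise  # retries exhausted, surface the error
            # Exponential backoff: backoff, 2*backoff, 4*backoff, ...
            sleep(backoff * 2 ** attempt)
        else:
            # Pause so the next request stays under the rate limit.
            sleep(interval)
            return result
```

A caller would wrap each esearch or efetch request in a zero-argument function and pass it in. The `sleep` parameter exists only so the timing can be checked without real delays.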

Data Retrieval Limits

  • Per Request: Maximum 9,999 records per esearch call
  • Batch Size: 200 papers per efetch request for optimal performance
  • Total Limit: Configurable via --fetch-limit parameter (default: 5,000)
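
As a rough sketch of how these limits might fit together, the example below parses `--fetch-limit`, caps the esearch request size at the figure stated above, and splits the resulting IDs into efetch-sized batches. The helper names (`parse_args`, `esearch_retmax`, `chunked`) and constant names are hypothetical; only the numbers and the `--fetch-limit` flag come from this issue.

```python
import argparse

ESEARCH_MAX = 9_999          # per-esearch cap stated in this issue
EFETCH_BATCH = 200           # papers per efetch request
DEFAULT_FETCH_LIMIT = 5_000  # default for --fetch-limit


def parse_args(argv=None):
    """Parse the command line; only --fetch-limit is sketched here."""
    parser = argparse.ArgumentParser(description="Fetch PubMed records")
    parser.add_argument(
        "--fetch-limit",
        type=int,
        default=DEFAULT_FETCH_LIMIT,
        help="total number of records to retrieve",
    )
    return parser.parse_args(argv)


def esearch_retmax(fetch_limit):
    """Number of IDs to ask a single esearch call for."""
    return min(fetch_limit, ESEARCH_MAX)


def chunked(ids, size=EFETCH_BATCH):
    """Yield consecutive slices of ids, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

With the default limit of 5,000 and 200 papers per batch, this works out to 25 efetch requests, so pacing at 0.34 seconds per request adds at least 8.5 seconds in total.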

PubMed Data Source Information

Implementation

  • I would be interested in implementing this feature.

Metadata

Assignees

No one assigned

    Labels

    • help wanted (Open to participation from the community)
    • ✨ goal: improvement (Improvement to an existing feature)
    • 💻 aspect: code (Concerns the software code in the repository)
    • 🟩 priority: low (Low priority and doesn't need to be rushed)
    • 🧹 status: ticket work required (Needs more details before it can be worked on)

    Projects

    Status

    Backlog
