Add Wikipedia processing and reporting #232

@oree-xx

Problem

The Wikipedia data is fetched by wikipedia_fetch.py. Wikipedia mainly uses the CC BY-SA 4.0 license, and the script currently fetches data from all language editions of Wikipedia through its API.
The processing and reporting of this language data still need to be completed.

Description

We need to identify meaningful analyses, such as:

  • Top 10 languages by usage
  • Classification of represented and underrepresented languages
  • Average article count per language
  • % of all Wikipedia articles that belong to the top 10 languages
  • % of languages that are underrepresented
  • Article counts classified by region
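
As a starting point for discussion, here is a minimal sketch of how a few of these metrics could be computed. It assumes the fetched data can be reduced to a mapping of language name to article count; that shape, the function names `summarize_languages` and `count_by_region`, the `under_ratio` cutoff for "underrepresented", and the language-to-region mapping are all hypothetical and not taken from wikipedia_fetch.py.

```python
from collections import defaultdict
from statistics import mean


def summarize_languages(article_counts, top_n=10, under_ratio=0.5):
    """Summarize article counts per language.

    article_counts: dict of language name -> article count (assumed shape).
    under_ratio: assumed cutoff; a language counts as underrepresented
    when its article count is below under_ratio * average.
    """
    if not article_counts:
        raise ValueError("article_counts is empty")
    total = sum(article_counts.values())
    if total == 0:
        raise ValueError("total article count is zero")

    # Rank languages from most to fewest articles.
    ranked = sorted(article_counts.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:top_n]
    average = mean(article_counts.values())
    under = [lang for lang, n in ranked if n < under_ratio * average]

    return {
        "top": top,
        "average": average,
        "top_share_pct": 100 * sum(n for _, n in top) / total,
        "underrepresented": under,
        "underrepresented_pct": 100 * len(under) / len(article_counts),
    }


def count_by_region(article_counts, region_of):
    """Sum article counts per region.

    region_of: dict of language name -> region (hypothetical mapping that
    would have to be maintained by hand or taken from another dataset).
    Languages missing from the mapping are grouped under "Unknown".
    """
    totals = defaultdict(int)
    for lang, n in article_counts.items():
        totals[region_of.get(lang, "Unknown")] += n
    return dict(totals)
```

The definition of "underrepresented" is the main open design choice here; a fixed article-count threshold or a percentile cutoff would work just as well and only changes one line.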

Alternatives

Can we use other visualizations for reporting?

Implementation

  • I would be interested in implementing this feature.

Metadata

    Assignees: No one assigned
    Status: Backlog
    Milestone: No milestone
    Development: No branches or pull requests