Description
Many tags refer to the same concept but differ in wording, casing, or styling.
It might be a good idea to add a normalization pipeline for each company's tags.
Here is a mapping from original to normalized tags in the form of a Python dict (easily convertible to any other format) that might be useful as a starting point: https://github.com/FrancescoManfredi/AIRV-analysis/blob/main/tags_repl.py
I'm the author of that mapping, and this is an invitation to use it in any way you prefer.
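As a rough sketch of how such a pipeline could look, the snippet below lowercases each tag, collapses whitespace, applies a replacement dict, and drops duplicates. The names `TAG_REPLACEMENTS`, `normalize_tag`, and `normalize_tags` and the sample entries are placeholders I made up for illustration. They are not taken from `tags_repl.py`, and the sketch assumes the mapping keys are lowercase, which may not match the real file.

```python
import re

# Hypothetical sample entries for illustration only; the real mapping
# lives in tags_repl.py and may use different keys and casing.
TAG_REPLACEMENTS = {
    "machine-learning": "machine learning",
    "ml": "machine learning",
    "nlp": "natural language processing",
}


def normalize_tag(tag, replacements=TAG_REPLACEMENTS):
    """Lowercase, trim, and collapse whitespace, then apply the mapping."""
    cleaned = re.sub(r"\s+", " ", tag.strip().lower())
    # Tags without an entry in the mapping are kept in their cleaned form.
    return replacements.get(cleaned, cleaned)


def normalize_tags(tags, replacements=TAG_REPLACEMENTS):
    """Normalize a list of tags, dropping empties and duplicates in order."""
    seen = set()
    result = []
    for tag in tags:
        normalized = normalize_tag(tag, replacements)
        if normalized and normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result
```

For example, `normalize_tags(["ML", "Machine-Learning", " machine  learning ", "NLP"])` would collapse the first three tags into a single `"machine learning"` entry. Keeping the cleanup step separate from the dict lookup means the mapping only has to list genuine synonyms, not every casing variant.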