Releases: dccuchile/wefe
1.0.0
Version 1.0.0
Major Release - Breaking Changes
- Python 3.10+ Required: Dropped support for Python 3.6-3.9
- Modern Packaging: Migrated from setup.py to pyproject.toml
- Updated Dependencies: All packages updated for modern Python ecosystem
New Features:
- Robust dataset fetching with retry mechanism and exponential backoff
- HTTP 429 (rate limiting) and timeout error handling
- Optional dependencies: pip install "wefe[dev]" and "wefe[pytorch]"
- Dynamic version loading from wefe.__version__
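As an illustration of the retry behaviour listed above, here is a minimal sketch of retrying with exponential backoff. It is not WEFE's actual implementation: the names `fetch_with_retry`, `RateLimitedError`, `max_retries`, `base_delay` and `sleep` are hypothetical and chosen only for this example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical error standing in for an HTTP 429 response."""


def fetch_with_retry(
    fetch: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying on rate-limit or timeout errors."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except (RateLimitedError, TimeoutError):
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2**attempt)
```

Passing `sleep` as a parameter is a design choice of this sketch: it lets tests record the delays instead of actually waiting.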
Core Improvements:
- WordEmbeddingModel: Enhanced type safety, better gensim compatibility, improved error handling
- BaseMetric: Refactored input validation, standardized run_query methods across all metrics
- Testing: Converted to pytest patterns with monkeypatch, comprehensive test coverage
- Code Quality: Migration from flake8 to Ruff, enhanced documentation with detailed docstrings
Development Workflow:
- GitHub Actions upgraded with Python 3.10-3.13 matrix testing
- Pre-commit hooks enhanced with JSON/TOML validation and security checks
- Modernized Sphinx documentation configuration
- Updated benchmark documentation and metrics comparison tables
What's Changed
- Benchmark and Changes by @pbadillatorrealba in #42
- Update changelog in readme by @pbadillatorrealba in #48
- Bump ipython from 7.34.0 to 8.10.0 by @dependabot[bot] in #47
- Bump torch from 1.12.1 to 1.13.1 by @dependabot[bot] in #46
- Fix/benchmark doc by @pbadillatorrealba in #49
- Bump urllib3 from 1.26.15 to 1.26.18 by @dependabot[bot] in #51
- chore: Update requirements and documentation by @pbadillatorrealba in #59
- chore: Update CI, pre-commit, ruff rules and fix several lint problems by @pbadillatorrealba in #61
- feature: Update WordEmbeddingModel class by @pbadillatorrealba in #62
- feature: Update BaseMetric class by @pbadillatorrealba in #63
- Fix WEAT words by @kato8966 in #53
- feature: Add dataset fetch retry and update tests with new word sets references by @pbadillatorrealba in #64
- feat: Update project build and configuration by @pbadillatorrealba in #65
New Contributors
- @dependabot[bot] made their first contribution in #47
- @kato8966 made their first contribution in #53
Full Changelog: 0.4.1...1.0.0
0.4.1
What's Changed
- Fixed a bug in the RIPA metric:
Previously it considered only n-1 words for each of the 2 target sets, so it omitted the last word of each set, which did not match the original definition in the paper.
Now it correctly uses all the target words.
- Added project logos to documentation.
- Added pre-commit to GitHub Actions, which improves automatic QA.
- Several code style and typing corrections suggested by flake8, black, ruff and isort.
New Contributors
Associated PR
- Add Logos by @pbadillatorrealba in #40
- fix: include last pair in RIPA score by @h4c5 in #43
- Add precommit job to github actions and correct linter issues by @pbadillatorrealba in #44
- Release 0.4.1 by @pbadillatorrealba in #45
Full Changelog: 0.4.0...0.4.1
0.4.0
Version 0.4.0 Changelog
- 3 new bias mitigation methods (debias) implemented: Double Hard Debias, Half Sibling Regression and Repulsion Attraction Neutralization.
- The library documentation has been restructured. Now, the documentation is divided into user guide and theoretical framework. The user guide does not contain theoretical information. Instead, theoretical documentation can be found in the conceptual guides.
- Improved API documentation and examples. Added multilingual examples contributed by the community.
- The user guides are fully executable because they are now on notebooks.
- There was also an important improvement in the API documentation and in metrics and debias examples.
- Improved library testing mechanisms for metrics and debias methods.
- Fixed wrong repr of query. Now the sets are in the correct order.
- Implemented repr for WordEmbeddingModel.
- Testing CI moved from CircleCI to GitHub Actions.
- License changed to MIT.
0.3.2
This version mainly fixes a bug in RNSB and updates case study scores in examples. The changes are as follows:
- Fixed RNSB bug where the classification labels were interchanged and could produce erroneous results when the attributes are of different sizes.
- Fixed RNSB replication notebook
- Update of WEFE case study scores.
- Improved documentation examples for WEAT, RNSB, RIPA.
- Holdout parameter added to RNSB, which indicates whether or not a holdout is performed when training the classifier.
- Improved printing of the RNSB evaluation.
0.3.0
This new version includes a new debias module as well as a complete refactoring of the preprocessing of word embeddings.
Changelog:
- Implemented Bolukbasi et al. 2016 Hard Debias.
- Implemented Thomas Manzini et al. 2019 Multiclass Hard Debias.
- Implemented a fetch function to retrieve gn-glove female-male word sets.
- Moved the transformation logic of words, sets and queries to embeddings to its own module: preprocessing
- Replaced the preprocessor_args and secondary_preprocessor_args metric preprocessing parameters with a list of preprocessors, preprocessors, together with the parameter strategy, which indicates whether to consider all the transformed words ('all') or only the first one encountered ('first').
- Renamed WordEmbeddingModel attributes model and model_name to wv and name respectively.
- Renamed every run_query word_embedding argument to model in every metric.
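To make the difference between the 'first' and 'all' strategies concrete, the following is a toy sketch that works on a plain dictionary instead of a real embedding model. The `vocab`, `preprocessors` and `lookup` names are hypothetical, the preprocessors are plain functions chosen only for this example, and the exact semantics in WEFE may differ.

```python
from typing import Callable

# Toy vocabulary standing in for the keys of a word embedding model.
vocab = {"she": [0.1, 0.2], "he": [0.3, 0.4], "He": [0.5, 0.6]}

# Hypothetical preprocessors, applied in order to each word.
preprocessors: list[Callable[[str], str]] = [
    lambda w: w,          # try the word unchanged
    lambda w: w.lower(),  # then try its lowercase form
]


def lookup(word: str, strategy: str = "first") -> list[str]:
    """Return the transformed forms of `word` that exist in `vocab`."""
    if strategy not in ("first", "all"):
        raise ValueError("strategy must be 'first' or 'all'")
    found: list[str] = []
    for preprocess in preprocessors:
        candidate = preprocess(word)
        if candidate in vocab and candidate not in found:
            found.append(candidate)
            if strategy == "first":
                break  # stop at the first transformed word encountered
    return found


print(lookup("He", strategy="first"))  # ['He']
print(lookup("He", strategy="all"))    # ['He', 'he']
```

With 'first', the search stops as soon as one transformed form is found; with 'all', every form present in the vocabulary is kept.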
0.2.2
Some bug fixes and RIPA integration prior to the release of the new version.
v0.2.0
The main change in this version is an improvement in how WEFE transforms word sets into embeddings.
Now all this work is contained in the WordEmbeddingModel class. It also contains several improvements, both to this process and to the library in general.
See the changelog for more information.
Note: Contains changes that may not be compatible with previous versions.
- Renamed optional run_query parameter warn_filtered_words to warn_not_found_words.
- Added word_preprocessor_args parameter to run_query that allows specifying transformations prior to searching for words in word embeddings.
- Added secondary_preprocessor_args parameter to run_query, which allows specifying a second preprocessor transformation applied to words before searching for them in word embeddings. It is not necessary to specify the first preprocessor to use this one.
- Implemented __getitem__ function in WordEmbeddingModel. This method allows obtaining the embedding of a word from the model stored in the instance using indexers.
- Removed underscore from class and instance variable names.
- Improved type and verification exception messages when creating objects and executing methods.
- Fixed an error that appeared when calculating rankings with two columns of aggregations with the same name.
- Ranking correlations are now calculated using pandas corr method.
- Changed metric template, name and short_names to class variables.
- Implemented random_state in RNSB to allow replication of the experiments.
- run_query now returns the default metric requested in the parameters as the result, with all other calculated values that may be useful stored in the other entries of the dictionary.
- Fixed problem with API documentation: now it shows methods of the classes.
- Implemented p-value for WEAT.
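For readers unfamiliar with indexer access through __getitem__, here is a minimal sketch on a toy class. `ToyEmbeddingModel` is a hypothetical stand-in, not WEFE's WordEmbeddingModel, and returning None for unknown words is an assumption of this example.

```python
from typing import Optional


class ToyEmbeddingModel:
    """Hypothetical stand-in for a class that wraps word vectors."""

    def __init__(self, vectors: dict[str, list[float]], name: str = "toy") -> None:
        self.vectors = vectors
        self.name = name

    def __getitem__(self, word: str) -> Optional[list[float]]:
        # Assumption of this sketch: unknown words yield None instead of raising.
        return self.vectors.get(word)


model = ToyEmbeddingModel({"cat": [0.1, 0.9]})
print(model["cat"])  # [0.1, 0.9]
print(model["dog"])  # None
```

Defining __getitem__ is what makes the `model[word]` syntax work on an instance.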