Commit 6082200

Minor doc and CI updates + pin pyvinecopulib version (#27)
1 parent e83846b commit 6082200

9 files changed, with 43 additions and 24 deletions

.github/workflows/ci.yml

Lines changed: 6 additions & 3 deletions
@@ -11,7 +11,7 @@ jobs:
 matrix:
 os: ['ubuntu-latest', 'macos-latest', 'windows-latest']
 # when changing version, also change setup.py
-python-version: ['3.8']
+python-version: ['3.8', '3.9']
 steps:
 - uses: actions/checkout@v2
 - uses: conda-incubator/setup-miniconda@v2
@@ -31,11 +31,11 @@ jobs:
 shell: bash -l {0}
 run: pytest tests/

-- name: Run tests with pyvinecopulib
+- name: Run tests with pyvinecopulib==0.5.5
 shell: bash -l {0}
 run: |
 set -ex
-pip install pyvinecopulib
+pip install pyvinecopulib==0.5.5
 pytest tests/

 gh-pages:
@@ -50,6 +50,9 @@ jobs:
 activate-environment: synthia
 environment-file: environment.yml

+- name: Install pandoc
+run: sudo apt-get install pandoc
+
 - name: Install Synthia (dev env)
 # 'shell' required to activate environment.
 # See https://github.com/conda-incubator/setup-miniconda#IMPORTANT.
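
Note on the matrix change: GitHub Actions expands a matrix into one job per combination of its values, so adding `'3.9'` doubles the test jobs. A minimal sketch of that expansion, assuming the matrix has no `include` or `exclude` entries (none are visible in this hunk); the variable names are illustrative only:

```python
from itertools import product

# Values copied from the matrix in this diff.
os_list = ['ubuntu-latest', 'macos-latest', 'windows-latest']
python_versions = ['3.8', '3.9']

# One job per (os, python-version) pair, i.e. the Cartesian product.
jobs = list(product(os_list, python_versions))

for os_name, py in jobs:
    print(f"{os_name} / Python {py}")

print(len(jobs))  # 6
```

Before this commit the same matrix produced three jobs, one per operating system.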

CHANGELOG.txt

Lines changed: 10 additions & 0 deletions
@@ -1,3 +1,13 @@
+1.1.0
+- Pin pyvinecopulib version to avoid issues between versions.
+- Add CI tests for Python 3.9 (#17).
+- Minor doc improvements.
+
+1.0.0
+- Add JOSS summary paper (#26).
+- Improve docs and tutorials (#14, #13, #18, ...).
+- Enable CI on multiple OS and Python versions (#16).
+
 0.3.0
 - Add support for handling categorical quantities (#10, #13).

DEVELOP.md

Lines changed: 2 additions & 1 deletion
@@ -16,7 +16,8 @@ Then activate with `conda activate synthia`.
 During development:

 ```
-pip install -e .
+pip install -e .[full]
+pip install pytest
 ```

README.md

Lines changed: 19 additions & 14 deletions
@@ -8,7 +8,7 @@

 ## Overview

-Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences ([Meyer et al. 2021](https://doi.org/10.5194/gmd-2020-427)). [Copula](https://dmey.github.io/synthia/copula.html) and [functional Principle Component Analysis (fPCA)](https://dmey.github.io/synthia/fpca.html) are statistical models that allow these properties to be simulated ([Joe 2014](https://doi.org/10.1201/b17116)). As such, copula generated data have shown potential to improve the generalization of machine learning (ML) emulators ([Meyer et al. 2021](https://doi.org/10.5194/gmd-2020-427)) or anonymize real-data datasets ([Patki et al. 2016](https://doi.org/10.1109/DSAA.2016.49)).
+Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences. [Copula](https://dmey.github.io/synthia/copula.html) and [functional Principle Component Analysis (fPCA)](https://dmey.github.io/synthia/fpca.html) are statistical models that allow these properties to be simulated ([Joe 2014](https://doi.org/10.1201/b17116)). As such, copula generated data have shown potential to improve the generalization of machine learning (ML) emulators ([Meyer et al. 2021](https://doi.org/10.5194/gmd-14-5205-2021)) or anonymize real-data datasets ([Patki et al. 2016](https://doi.org/10.1109/DSAA.2016.49)).

 Synthia is an open source Python package to model univariate and multivariate data, parameterize data using empirical and parametric methods, and manipulate marginal distributions. It is designed to enable scientists and practitioners to handle labelled multivariate data typical of computational sciences. For example, given some vertical profiles of atmospheric temperature, we can use Synthia to generate new but statistically similar profiles in just three lines of code (Table 1).

@@ -33,14 +33,14 @@ For installation instructions, getting started guides and tutorials, background

 ## How to cite

-If you are using Synthia, please cite the following two papers using their respective Digital Object Identifiers (DOIs). Citations may be generated automatically using Crosscite's [DOI Citation Formatter](https://citation.crosscite.org/) or from the BibTeX entries below. If needed, you may also cite the specific software version with [its corresponding Zendo DOI](https://doi.org/10.5281/zenodo.4701278).
+If you are using Synthia, please cite the following two papers using their respective Digital Object Identifiers (DOIs). Citations may be generated automatically using Crosscite's [DOI Citation Formatter](https://citation.crosscite.org/) or from the BibTeX entries below.

-| Synthia Software | Software Application |
-| ---------------------------------- | ----------------------------------------------------------------- |
-| DOI: 10.21105/joss.02863 | DOI: [10.5194/gmd-2020-427](https://doi.org/10.5194/gmd-2020-427) |
+| Synthia Software | Software Application |
+| ---------------------------------- | ------------------------------------------------------------------------- |
+| DOI: 10.21105/joss.02863 | DOI: [10.5194/gmd-14-5205-2021](https://doi.org/10.5194/gmd-14-5205-2021) |

 ```bibtex
-@article{Meyer_Nagler_2021,
+@article{Meyer_and_Nagler_2021,
 title = {Synthia: multidimensional synthetic data generation in Python},
 author = {David Meyer and Thomas Nagler},
 year = {2021},
@@ -49,17 +49,22 @@ If you are using Synthia, please cite the following two papers using their respe
 note = {Under review}
 }

-@article{Meyer_Nagler_Hogan_2021,
-title = {Copula-Based Synthetic Data Generation for Machine Learning Emulators in Weather and Climate: Application to a Simple Radiation Model},
-author = {David Meyer and Thomas Nagler and Robin J. Hogan},
-year = {2021},
-volume = {2021},
-doi = {10.5194/gmd-2020-427},
-journal = {Geoscientific Model Development Discussions},
-note = {Under review}
+@article{Meyer_and_Nagler_and_Hogan_2021,
+doi = {10.5194/gmd-14-5205-2021},
+url = {https://doi.org/10.5194/gmd-14-5205-2021},
+year = {2021},
+month = aug,
+publisher = {Copernicus {GmbH}},
+volume = {14},
+number = {8},
+pages = {5205--5215},
+author = {David Meyer and Thomas Nagler and Robin J. Hogan},
+title = {Copula-based synthetic data augmentation for machine-learning emulators},
+journal = {Geoscientific Model Development}
 }
 ```

+If needed, you may also cite the specific software version with [its corresponding Zendo DOI](https://doi.org/10.5281/zenodo.4701278).

 ## Contributing

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ def copy_overview(f_read: Path, f_write: Path, rebuild=False) -> None:
 project = 'synthia'
 copyright = '2020 D. Meyer and T. Nagler'
 author = 'D. Meyer and T. Nagler'
-release = '0.3.0'
+release = '1.1.0'

 html_context = {
 'display_github': True,

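
Note on the release bump: the docs `release` was still `0.3.0` while `setup.py` was already at `1.0.0`, so the two had drifted apart. A minimal sketch of a consistency check that could catch this, assuming both files keep the plain `version='...'` and `release = '...'` assignments shown in this commit; `extract_version` and `versions_match` are hypothetical helpers, not part of Synthia:

```python
import re
from typing import Optional


def extract_version(text: str, key: str) -> Optional[str]:
    """Return the quoted value of the first `<key> = '<value>'` assignment, if any."""
    match = re.search(rf"\b{re.escape(key)}\s*=\s*['\"]([^'\"]+)['\"]", text)
    return match.group(1) if match else None


def versions_match(setup_text: str, conf_text: str) -> bool:
    """True only if both files declare a version and the two strings are equal."""
    setup_version = extract_version(setup_text, "version")
    docs_release = extract_version(conf_text, "release")
    return setup_version is not None and setup_version == docs_release


# Inline stand-ins for setup.py and docs/conf.py after this commit.
setup_text = "setup(\n    name='synthia',\n    version='1.1.0',\n)"
conf_text = "project = 'synthia'\nrelease = '1.1.0'\n"

print(versions_match(setup_text, conf_text))
```

In practice the two strings would be read from the files on disk; the inline text keeps the sketch self-contained.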
docs/installation.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
 pip install synthia
 ```

-or with optional dependencies
+or with optional [pyvinecopulib](https://github.com/vinecopulib/pyvinecopulib):

 ```
 pip install synthia[full]

environment.yml

Lines changed: 1 addition & 1 deletion
@@ -22,4 +22,4 @@ dependencies:
 - sphinxcontrib-bibtex=1
 - sphinx-copybutton
 - pip:
-- pyvinecopulib
+- pyvinecopulib==0.5.5

setup.py

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@

 setup(
 name='synthia',
-version='1.0.0',
+version='1.1.0',
 description='Multidimensional synthetic data generation in Python',
 long_description=long_description,
 long_description_content_type="text/markdown",
@@ -29,6 +29,6 @@
 "bottleneck", # required by xarray.DataArray.rank
 ],
 extras_require = {
-"full": ["pyvinecopulib"]
+"full": ["pyvinecopulib==0.5.5"]
 }
 )
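
Note on the pin: with `pyvinecopulib==0.5.5` in `extras_require`, an environment that already carries a different version may be worth checking. A minimal sketch using the standard-library `importlib.metadata` (Python 3.8+); `installed_version` and `pin_status` are hypothetical helpers for illustration, and the pinned string is copied from this diff:

```python
from importlib import metadata
from typing import Optional

PINNED = "0.5.5"  # copied from extras_require in this commit


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def pin_status(found: Optional[str], pinned: str = PINNED) -> str:
    """Classify an installed version against the pin."""
    if found is None:
        return "missing"
    return "ok" if found == pinned else "mismatch"


print(pin_status(installed_version("pyvinecopulib")))
```

The result depends on the local environment: `missing` when the optional dependency is not installed, `ok` or `mismatch` otherwise.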

tests/test_generators.py

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ def test_independent_feature_generation_with_distribution():

 dist_names = set(syn.DistributionParameterizer.get_dist_names())
 # Remove all very slow distributions
-dist_names -= set(['genexpon', 'levy_stable', 'recipinvgauss', 'vonmises', 'kstwo'])
+dist_names -= set(['genexpon', 'levy_stable', 'recipinvgauss', 'vonmises', 'kstwo', 'studentized_range'])

 generator.fit(input_data, copula=syn.IndependenceCopula(),
 parameterize_by=syn.DistributionParameterizer(dist_names, verbose=True))
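
Note on the exclusion list: set difference ignores names that are not present, so adding `'studentized_range'` is safe whether or not the installed environment exposes that distribution. A minimal sketch of the same filtering, where `available` is a made-up stand-in for the result of `syn.DistributionParameterizer.get_dist_names()` and `drop_slow` is a hypothetical helper:

```python
# Stand-in for the names returned by the library; the real list will differ.
available = ['norm', 'gamma', 'kstwo', 'studentized_range', 'vonmises']

# Exclusion list copied from the test in this diff.
SLOW = {'genexpon', 'levy_stable', 'recipinvgauss', 'vonmises', 'kstwo', 'studentized_range'}


def drop_slow(names, slow=SLOW):
    """Return the given names minus the slow ones; absent names are simply ignored."""
    return set(names) - set(slow)


dist_names = drop_slow(available)
print(sorted(dist_names))  # ['gamma', 'norm']
```

Names such as `'genexpon'` that do not appear in `available` cause no error, which is why one list can serve several dependency versions.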
