prose is a natural language processing library (English only, at the moment) in pure Go. It supports tokenization, segmentation, part-of-speech tagging, and named-entity extraction.
You can find a more detailed summary of the library's performance in the post "Introducing prose v2.0.0: Bringing NLP to Go".
$ go get github.com/jdkato/prose/v2

package main
import (
    "fmt"
    "log"
    "github.com/jdkato/prose/v2"
)
func main() {
    // Create a new document with the default configuration:
    doc, err := prose.NewDocument("Go is an open-source programming language created at Google.")
    if err != nil {
        log.Fatal(err)
    }
    // Iterate over the doc's tokens:
    for _, tok := range doc.Tokens() {
        fmt.Println(tok.Text, tok.Tag, tok.Label)
        // Go NNP B-GPE
        // is VBZ O
        // an DT O
        // ...
    }
    // Iterate over the doc's named-entities:
    for _, ent := range doc.Entities() {
        fmt.Println(ent.Text, ent.Label)
        // Go GPE
        // Google GPE
    }
    // Iterate over the doc's sentences:
    for _, sent := range doc.Sentences() {
        fmt.Println(sent.Text)
        // Go is an open-source programming language created at Google.
    }
}

The document-creation process adheres to the following sequence of steps:
tokenization -> POS tagging -> NE extraction
            \
             segmentation
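The diagram can be read as a small dependency graph: POS tagging needs tokens, named-entity extraction needs POS tags, and segmentation sits on its own branch. As a rough sketch of that constraint (the `steps` type and `validate` function are hypothetical illustrations, not part of prose's API, and this is not how the library itself enforces ordering):

```go
package main

import (
	"errors"
	"fmt"
)

// steps is a hypothetical stand-in for a pipeline configuration.
// It is NOT a type exported by prose.
type steps struct {
	Tokenize bool // tokenization
	Segment  bool // sentence segmentation (independent branch)
	Tag      bool // POS tagging; needs tokens
	Extract  bool // named-entity extraction; needs POS tags
}

// validate returns an error when an enabled step depends on a disabled one.
func validate(s steps) error {
	if s.Tag && !s.Tokenize {
		return errors.New("tagging requires tokenization")
	}
	if s.Extract && !s.Tag {
		return errors.New("extraction requires tagging")
	}
	return nil
}

func main() {
	// Dropping the last step in the chain leaves a consistent pipeline.
	fmt.Println(validate(steps{Tokenize: true, Segment: true, Tag: true}))
	// Dropping tagging while keeping extraction does not.
	fmt.Println(validate(steps{Tokenize: true, Segment: true, Extract: true}))
}
```

Segmentation never appears in a check because nothing else in the diagram depends on it.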
Each step may be disabled (assuming later steps aren't required) by passing the appropriate functional option. To disable named-entity extraction, for example, you'd do the following:
doc, err := prose.NewDocument(
    "Go is an open-source programming language created at Google.",
    prose.WithExtraction(false))

prose includes a tokenizer capable of processing modern text, including the non-word character spans shown below.
| Type | Example | 
|---|---|
| Email addresses | [email protected] | 
| Hashtags | #trending | 
| Mentions | @jdkato | 
| URLs | https://github.com/jdkato/prose | 
| Emoticons | :-), >:(, o_0, etc. | 
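To give a feel for what recognizing these span types involves, here is a minimal sketch that labels a single whitespace-delimited span using regular expressions. It is not prose's tokenizer: the patterns are deliberately simplified assumptions, the `classify` helper is hypothetical, and the emoticon set contains only the three examples from the table above.

```go
package main

import (
	"fmt"
	"regexp"
)

// Simplified, illustrative patterns (assumptions, not prose's own rules).
var (
	urlRe     = regexp.MustCompile(`^https?://\S+$`)
	emailRe   = regexp.MustCompile(`^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$`)
	hashtagRe = regexp.MustCompile(`^#\w+$`)
	mentionRe = regexp.MustCompile(`^@\w+$`)
)

// emoticons lists only the examples given in the table.
var emoticons = map[string]bool{":-)": true, ">:(": true, "o_0": true}

// classify is a hypothetical helper that names the kind of span s looks like.
func classify(s string) string {
	switch {
	case urlRe.MatchString(s):
		return "url"
	case emailRe.MatchString(s):
		return "email"
	case hashtagRe.MatchString(s):
		return "hashtag"
	case mentionRe.MatchString(s):
		return "mention"
	case emoticons[s]:
		return "emoticon"
	default:
		return "word"
	}
}

func main() {
	spans := []string{"#trending", "@jdkato", "https://github.com/jdkato/prose", ":-)", "Go"}
	for _, s := range spans {
		fmt.Printf("%s -> %s\n", s, classify(s))
	}
}
```

A real tokenizer also has to find these spans inside running text and separate trailing punctuation from them, which this sketch does not attempt.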
package main
import (
    "fmt"
    "log"
    "github.com/jdkato/prose/v2"
)
func main() {
    // Create a new document with the default configuration:
    doc, err := prose.NewDocument("@jdkato, go to http://example.com thanks :).")
    if err != nil {
        log.Fatal(err)
    }
    // Iterate over the doc's tokens:
    for _, tok := range doc.Tokens() {
        fmt.Println(tok.Text, tok.Tag)
        // @jdkato NN
        // , ,
        // go VB
        // to TO
        // http://example.com NN
        // thanks NNS
        // :) SYM
        // . .
    }
}

prose includes one of the most accurate sentence segmenters available, according to the Golden Rules created by the developers of the pragmatic_segmenter.
| Name | Language | License | GRS (English) | GRS (Other) | Speed† | 
|---|---|---|---|---|---|
| Pragmatic Segmenter | Ruby | MIT | 98.08% (51/52) | 100.00% | 3.84 s | 
| prose | Go | MIT | 75.00% (39/52) | N/A | 0.96 s | 
| TactfulTokenizer | Ruby | GNU GPLv3 | 65.38% (34/52) | 48.57% | 46.32 s | 
| OpenNLP | Java | APLv2 | 59.62% (31/52) | 45.71% | 1.27 s | 
| Stanford CoreNLP | Java | GNU GPLv3 | 59.62% (31/52) | 31.43% | 0.92 s | 
| Splitta | Python | APLv2 | 55.77% (29/52) | 37.14% | N/A | 
| Punkt | Python | APLv2 | 46.15% (24/52) | 48.57% | 1.79 s | 
| SRX English | Ruby | GNU GPLv3 | 30.77% (16/52) | 28.57% | 6.19 s | 
| Scalpel | Ruby | GNU GPLv3 | 28.85% (15/52) | 20.00% | 0.13 s | 
† The original tests were performed using a MacBook Pro 3.7 GHz Quad-Core Intel Xeon E5 running 10.9.5, while prose was timed using a MacBook Pro 2.9 GHz Intel Core i7 running 10.13.3.
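Each English GRS percentage in the table follows from the pass count shown in parentheses next to it. A minimal sketch of that arithmetic (the `grs` helper is hypothetical and exists only for illustration):

```go
package main

import "fmt"

// grs formats passed/total as a percentage with two decimal places,
// matching the layout of the table's GRS (English) column.
func grs(passed, total int) string {
	return fmt.Sprintf("%.2f%%", 100*float64(passed)/float64(total))
}

func main() {
	fmt.Println(grs(39, 52)) // 75.00%
	fmt.Println(grs(51, 52)) // 98.08%
}
```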
package main
import (
    "fmt"
    "strings"
    "github.com/jdkato/prose/v2"
)
func main() {
    // Create a new document with the default configuration:
    doc, _ := prose.NewDocument(strings.Join([]string{
        "I can see Mt. Fuji from here.",
        "St. Michael's Church is on 5th st. near the light."}, " "))
    // Iterate over the doc's sentences:
    sents := doc.Sentences()
    fmt.Println(len(sents)) // 2
    for _, sent := range sents {
        fmt.Println(sent.Text)
        // I can see Mt. Fuji from here.
        // St. Michael's Church is on 5th st. near the light.
    }
}

prose includes a tagger based on Textblob's "fast and accurate" POS tagger. Below is a comparison of its performance against NLTK's implementation of the same tagger on the Treebank corpus:
| Library | Accuracy | 5-Run Average (sec) | 
|---|---|---|
| NLTK | 0.893 | 7.224 | 
| prose | 0.961 | 2.538 | 
(See scripts/test_model.py for more information.)
The full list of supported POS tags is given below.
| TAG | DESCRIPTION | 
|---|---|
| ( | left round bracket | 
| ) | right round bracket | 
| , | comma | 
| : | colon | 
| . | period | 
| '' | closing quotation mark | 
| `` | opening quotation mark | 
| # | number sign | 
| $ | currency | 
| CC | conjunction, coordinating | 
| CD | cardinal number | 
| DT | determiner | 
| EX | existential there | 
| FW | foreign word | 
| IN | conjunction, subordinating or preposition | 
| JJ | adjective | 
| JJR | adjective, comparative | 
| JJS | adjective, superlative | 
| LS | list item marker | 
| MD | verb, modal auxiliary | 
| NN | noun, singular or mass | 
| NNP | noun, proper singular | 
| NNPS | noun, proper plural | 
| NNS | noun, plural | 
| PDT | predeterminer | 
| POS | possessive ending | 
| PRP | pronoun, personal | 
| PRP$ | pronoun, possessive | 
| RB | adverb | 
| RBR | adverb, comparative | 
| RBS | adverb, superlative | 
| RP | adverb, particle | 
| SYM | symbol | 
| TO | infinitival to | 
| UH | interjection | 
| VB | verb, base form | 
| VBD | verb, past tense | 
| VBG | verb, gerund or present participle | 
| VBN | verb, past participle | 
| VBP | verb, non-3rd person singular present | 
| VBZ | verb, 3rd person singular present | 
| WDT | wh-determiner | 
| WP | wh-pronoun, personal | 
| WP$ | wh-pronoun, possessive | 
| WRB | wh-adverb | 
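Related tags in the table share a prefix (NN, NNP, NNPS, and NNS are all nouns; VB through VBZ are all verbs), so a prefix check is one simple way to filter tagged tokens by word class. The sketch below uses a local `token` struct with `Text` and `Tag` fields as a stand-in for the library's tokens, and the `withPrefix` helper is hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// token is a local stand-in that mirrors the Text and Tag fields
// printed in the examples above.
type token struct {
	Text string
	Tag  string
}

// withPrefix returns the text of every token whose tag starts with prefix.
func withPrefix(toks []token, prefix string) []string {
	var out []string
	for _, t := range toks {
		if strings.HasPrefix(t.Tag, prefix) {
			out = append(out, t.Text)
		}
	}
	return out
}

func main() {
	// Text/tag pairs copied from the tokenizer example's output.
	toks := []token{
		{"@jdkato", "NN"}, {"go", "VB"}, {"to", "TO"},
		{"http://example.com", "NN"}, {"thanks", "NNS"},
	}
	fmt.Println(withPrefix(toks, "NN")) // [@jdkato http://example.com thanks]
	fmt.Println(withPrefix(toks, "VB")) // [go]
}
```

Note that a prefix such as "PRP" matches both PRP and PRP$, so use an exact comparison when only one tag is wanted.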
prose v2.0.0 includes a much improved version of v1.0.0's chunk package, which can identify people (PERSON) and geographical/political entities (GPE) by default.
package main
import (
    "fmt"
    "github.com/jdkato/prose/v2"
)
func main() {
    doc, _ := prose.NewDocument("Lebron James plays basketball in Los Angeles.")
    for _, ent := range doc.Entities() {
        fmt.Println(ent.Text, ent.Label)
        // Lebron James PERSON
        // Los Angeles GPE
    }
}

However, in an attempt to make this feature more useful, we've made it straightforward to train your own models for specific use cases. See "Prodigy + prose: Radically efficient machine teaching in Go" for a tutorial.