We need a convenient API for extracting tokens from many documents in bulk. This API will be used primarily for testing.
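To make the request concrete, here is a minimal sketch of what such a bulk call could look like. Everything in it is an assumption for illustration: the function name `extract_tokens_bulk`, the `(doc_id, text)` input shape, and the stand-in `simple_tokenizer` are hypothetical and do not correspond to any existing API in this project.

```python
import re
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical tokenizer type: takes raw text, returns a list of tokens.
Tokenizer = Callable[[str], List[str]]


def simple_tokenizer(text: str) -> List[str]:
    """Stand-in tokenizer (assumption): lowercased runs of word characters."""
    return re.findall(r"\w+", text.lower())


def extract_tokens_bulk(
    documents: Iterable[Tuple[str, str]],
    tokenizer: Tokenizer = simple_tokenizer,
) -> Dict[str, List[str]]:
    """Tokenize many (doc_id, text) pairs in one call.

    Returns a mapping from document id to that document's tokens,
    which is convenient to compare against expected values in tests.
    """
    return {doc_id: tokenizer(text) for doc_id, text in documents}


if __name__ == "__main__":
    docs = [("a", "Hello, world!"), ("b", "Bulk API test")]
    print(extract_tokens_bulk(docs))
    # prints {'a': ['hello', 'world'], 'b': ['bulk', 'api', 'test']}
```

Returning a plain mapping keyed by document id keeps test assertions short: a test can compare the whole result against a literal dictionary instead of inspecting documents one by one.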