Command-line tool for generating language data sets for Tani dialects.
Requires a Mac or Linux machine with Node.js (4+) and a Neo4j installation.
The whole project is built around a directory named concepts, where directories/sub-directories of concept definition files are stored.
A concept definition file is just a .txt file with the following (YAML) structure.
Concept: Concise definition
[Note]: Any special note
[Example]: A list of examples
Note: All values under the "Example" section should be double-quoted.
For example:
Concept: "Good to experience"
Note: "Used for describing quality of being"
Examples:
- apatani/hari: "kaa[pyo] = beautiful"
- galo/aalo: "kaa[ken] = beautiful"
- adi/pasighat: "kam[po] = beautiful"
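The quoting rule for example values can be checked mechanically. The following is a minimal sketch, assuming each example is a single `- dialect: "value"` line as in the sample above. `findUnquotedExamples` is a hypothetical helper and not part of tani.

```javascript
// Hypothetical helper (not part of tani): report the line numbers of example
// items whose value is not wrapped in double quotes.
// Assumes one "- key: value" item per line under an "Example(s):" key.
function findUnquotedExamples(text) {
  const problems = [];
  let inExamples = false;
  text.split('\n').forEach((line, i) => {
    if (/^Examples?:\s*$/.test(line)) { inExamples = true; return; }
    if (!inExamples) return;
    const item = line.match(/^\s*-\s*[^:]+:\s*(.*)$/);
    if (!item) { inExamples = false; return; }
    if (!/^".*"$/.test(item[1].trim())) problems.push(i + 1);
  });
  return problems;
}

const sample = [
  'Concept: "Good to experience"',
  'Note: "Used for describing quality of being"',
  'Examples:',
  '- apatani/hari: "kaa[pyo] = beautiful"',
  '- galo/aalo: kaa[ken] = beautiful', // value deliberately left unquoted
].join('\n');

console.log(findUnquotedExamples(sample)); // [ 5 ]
```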
Concept files can be created using the add command or manually. When adding a file manually, its name should be the next largest number in the directory, and the sync --all command must be run afterwards.
Dialect files, and hence queries, are created from these concept files.
- concepts - directory of concept text files
- dialects - text files under dialects/tribe/locality, which contain root word information for a particular dialect
- queries - directory of acf files, ready to be pushed to the database
- Do not start editing the dialect files until the concepts are clearly identified and organized into a logical taxonomy. Any change in the structure of the concepts directory will destructively affect the dialect files.
- Make sure to quote any string that contains YAML special characters.
DO NOT ADD CONCEPT FILES MANUALLY!
Concepts can be added manually IN FUTURE. If any concept is manually added, the sync command must be run to update the dialect files.
The following command will add a new noun concept for "Animal". It creates the corresponding file in the concept directory n and assigns the next largest number in that directory as its id (file name).
tani add n "Animal"
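The id assignment can be pictured as follows. This is a minimal sketch, assuming ids start at 1 and that only files named `<number>.txt` count. `nextConceptId` is a hypothetical name and not tani's actual code.

```javascript
// Hypothetical helper: given the entries of a concept directory, return the
// id that the next concept file would receive. Sub-directories and any file
// that is not "<number>.txt" are ignored.
function nextConceptId(fileNames) {
  const ids = fileNames
    .map((name) => /^(\d+)\.txt$/.exec(name))
    .filter(Boolean)
    .map((match) => Number(match[1]));
  return ids.length === 0 ? 1 : Math.max(...ids) + 1;
}

console.log(nextConceptId(['1.txt', '2.txt', '10.txt', 'animals'])); // 11
console.log(nextConceptId([])); // 1 (assumption: ids start at 1)
```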
The following command will create a new noun concept file under the n/ornaments directory. If the directory does not exist, it will be created. Since the concept string is not specified, the file will be empty.
tani add n/ornaments
A new concept can be added and made to take over the id of an existing concept using the add @ command. In such cases, the existing file and all other files after it will be shifted by a value of 1.
tani add v@1 "Do"
tani add n/animals/insects@5 "Grasshopper"
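The shifting described above can be sketched as a rename plan. The code below is illustrative only: `shiftUpPlan` is a hypothetical function, and the assumption is that files are renamed from the highest id downwards so that no rename overwrites an existing file.

```javascript
// Hypothetical sketch: the renames needed before a new concept can take over
// `index`. Every file with an id >= index moves up by one, highest id first.
function shiftUpPlan(existingIds, index) {
  return existingIds
    .filter((id) => id >= index)
    .sort((a, b) => b - a)
    .map((id) => ({ from: `${id}.txt`, to: `${id + 1}.txt` }));
}

// Directory holds 1.txt to 6.txt and a new concept is added at @5:
console.log(shiftUpPlan([1, 2, 3, 4, 5, 6], 5));
// 6.txt -> 7.txt, then 5.txt -> 6.txt; 5.txt is now free for the new concept
```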
Whenever a new concept is added using the add command, the corresponding dialects directory will be updated to reflect the change.
tani compile --all and tani publish must be run after every add command to keep the source files updated.
Read the entry for a concept:
tani read n/1
Read the entry in a dialect for a concept:
tani read n/1 apatani/hari
This will move n/10.txt to n/animals and assign it a new name based on the largest file id in the n/animals directory.
tani move n/10 n/animals
This will move n/10.txt to n/animals as 2.txt, while incrementing the existing file names starting from index 2.
tani move n/10 n/animals@2
The following will rename n/5.txt to n/1.txt. The pre-existing 1.txt will be renamed 2.txt, 2.txt to 3.txt, and so on.
tani move n/5 n@1
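A move inside one directory can be modelled as a removal followed by an insertion. The sketch below is a simplified model, not tani's implementation. It assumes the directory behaves like a gapless list in which position i holds the concept stored in (i + 1).txt; `moveWithin` is a hypothetical name.

```javascript
// Hypothetical model: take the concept out at fromId, then insert it at toId.
// Everything between the two positions shifts by one to close the gap.
function moveWithin(concepts, fromId, toId) {
  const result = concepts.slice();
  const [moved] = result.splice(fromId - 1, 1);
  result.splice(toId - 1, 0, moved);
  return result;
}

const before = ['a', 'b', 'c', 'd', 'e', 'f']; // contents of 1.txt to 6.txt
console.log(moveWithin(before, 5, 1)); // [ 'e', 'a', 'b', 'c', 'd', 'f' ]
```

Under this model the concepts that were in 1.txt to 4.txt end up in 2.txt to 5.txt, while 6.txt keeps its name.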
DO NOT DELETE CONCEPT FILES MANUALLY!
Concepts can be deleted manually IN FUTURE. If any concept is manually deleted, the sync command must be run to update the dialect files.
NOTE: When a concept file is deleted, the corresponding dialect file is also deleted.
The following command will delete an existing concept at the specified index.
tani delete n/10
All files starting from 11 will be decremented by 1. Do not delete concept files if this behavior is not wanted.
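The renumbering after a delete can be sketched in the same way. `shiftDownPlan` is a hypothetical function. It assumes the renames run from the lowest id upwards, so the target name is always free.

```javascript
// Hypothetical sketch: once <deletedId>.txt is removed, every file with a
// higher id moves down by one, lowest id first.
function shiftDownPlan(existingIds, deletedId) {
  return existingIds
    .filter((id) => id > deletedId)
    .sort((a, b) => a - b)
    .map((id) => ({ from: `${id}.txt`, to: `${id - 1}.txt` }));
}

// Directory holds 9.txt to 12.txt and n/10 is deleted:
console.log(shiftDownPlan([9, 10, 11, 12], 10));
// 11.txt -> 10.txt, then 12.txt -> 11.txt
```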
The following command will delete all the existing concepts in a directory.
tani delete n/animals
To delete all the concepts:
tani delete --all
tani compile --all and tani publish must be run after every delete command to keep the query files updated.
Use this command to update dialect files when new concepts are manually added to the concepts directory, or an existing one is edited.
Note: If there are no dialect files, there is no need for syncing.
Scan all the concepts and apply the changes to the dialect files:
tani sync --all
Scan v and apply the changes to the dialect files:
tani sync v
Scan n/animals and apply the changes to the dialect files:
tani sync n/animals
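Conceptually, a sync pass compares the concept files with the dialect files that already exist. The sketch below assumes that each dialect directory mirrors the relative paths of the concepts directory. `syncPlan` is a hypothetical function, and it only covers added and removed files, not edited ones.

```javascript
// Hypothetical sketch of a sync pass for a single dialect.
// Both arguments are relative paths such as "n/1.txt".
function syncPlan(conceptPaths, dialectPaths) {
  const concepts = new Set(conceptPaths);
  const dialect = new Set(dialectPaths);
  return {
    // concept files that have no dialect file yet
    toCreate: conceptPaths.filter((p) => !dialect.has(p)),
    // dialect files whose concept file no longer exists
    toDelete: dialectPaths.filter((p) => !concepts.has(p)),
  };
}

const examplePlan = syncPlan(['n/1.txt', 'n/2.txt', 'v/1.txt'], ['n/1.txt', 'n/3.txt']);
console.log(examplePlan);
// toCreate: n/2.txt and v/1.txt; toDelete: n/3.txt
```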
A dialect is identified using the tribe and locality (<tribe>/<locality>).
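As an illustration, a dialect identifier can be split and mapped to its directory like this. `parseDialectId` is a hypothetical helper and not part of tani. The directory layout follows the dialects/tribe/locality convention used in this document.

```javascript
// Hypothetical helper: split "<tribe>/<locality>" and derive the dialect
// directory relative to the project root.
function parseDialectId(id) {
  const parts = id.split('/');
  if (parts.length !== 2 || parts.some((part) => part.length === 0)) {
    throw new Error(`Expected <tribe>/<locality>, got "${id}"`);
  }
  const [tribe, locality] = parts;
  return { tribe, locality, dir: `dialects/${tribe}/${locality}` };
}

console.log(parseDialectId('apatani/hari'));
// { tribe: 'apatani', locality: 'hari', dir: 'dialects/apatani/hari' }
```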
Add a new dialect:
tani init apatani/hari
This will create the apatani/hari dialect directory under the dialects dir at the root of the project. The corresponding dialect files for the dialect will be generated from the concepts directory.
This command merely generates the dialect files. Experts in the dialect then need to edit the files manually to make the entries.
A dialect can be initialized from an existing dialect. This saves time in making entries when the dialects have a lot of similarities.
tani init apatani/hija from apatani/hari
The dialect files for apatani/hija will be created by copying the contents of apatani/hari.
By default, the script used for making entries for a tribe is the one recommended by its apex body - Adi: ABK, Apatani: ALDC, Galo: GWS, Nyishi: NES. To use a custom script, create a script file in the scripts directory and modify the index.js file in the scripts directory.
Here is an example of specifying the script for making the entries for a dialect.
tani init apatani/gyati --script pss
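The choice between the default and a custom script can be sketched as a simple lookup. The defaults are the ones listed above. The function itself is hypothetical and does not show the actual contents of scripts/index.js.

```javascript
// Default script per tribe, as recommended by each tribe's apex body.
const DEFAULT_SCRIPTS = { adi: 'ABK', apatani: 'ALDC', galo: 'GWS', nyishi: 'NES' };

// Hypothetical sketch: a script passed via --script wins, otherwise the
// tribe's default is used.
function resolveScript(dialectId, customScript) {
  if (customScript) return customScript;
  const tribe = dialectId.split('/')[0].toLowerCase();
  const script = DEFAULT_SCRIPTS[tribe];
  if (!script) throw new Error(`No default script known for tribe "${tribe}"`);
  return script;
}

console.log(resolveScript('apatani/hari'));         // ALDC
console.log(resolveScript('apatani/gyati', 'pss')); // pss
```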
Use the uninit command to delete the dialect directories generated by the init command.
Delete all the dialects:
tani uninit --all
Delete all the dialects of a tribe:
tani uninit apatani
Delete a dialect:
tani uninit apatani/hari
Search for entries in the concept and dialect files.
tani search <string>
Search for entries in concept files.
tani search <string> --concept
Search for entries in dialect files.
tani search <string> --dialect
NOTE: make sure to run the following command after add, move, or sync commands to update search results.
tani index
Concept and dialect files have to be compiled to generate the query files. Only query files can be published.
Compile everything:
tani compile --all
Compile concepts:
tani compile concepts
Compile verbs:
tani compile v
Compile verb modifiers:
tani compile vm
Similarly for n, nm, adj, conj, etc.
Compile all the dialects of a tribe:
tani compile apatani
Compile a dialect:
tani compile apatani/hari
The generated queries directory can be regenerated using the compile command, so the directory should be put in .gitignore.
The publish command must be run at the root of the queries directory. It erases the existing entries in the database and populates it with the new queries. The compile command must be run before publish can be run.
Publish everything:
tani publish --all
Publish a specific tribe:
tani publish apatani
Publish a specific dialect:
tani publish apatani/hari
Publish a specific concept directory:
tani publish apatani/hari/n
Publish a specific concept:
tani publish apatani/hari/n/animals@10
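The publish targets above share one shape: tribe, then locality, then an optional concept directory, then an optional @id. A minimal sketch of how such a target could be split apart follows. `parseTarget` is a hypothetical helper and not tani's actual parser.

```javascript
// Hypothetical helper: break a target such as "apatani/hari/n/animals@10"
// into its parts. Parts that are not present are returned as null.
function parseTarget(target) {
  const [path, idPart] = target.split('@');
  const [tribe, locality, ...conceptDir] = path.split('/');
  return {
    tribe,
    locality: locality || null,
    conceptDir: conceptDir.length ? conceptDir.join('/') : null,
    id: idPart === undefined ? null : Number(idPart),
  };
}

console.log(parseTarget('apatani/hari/n/animals@10'));
// { tribe: 'apatani', locality: 'hari', conceptDir: 'n/animals', id: 10 }
console.log(parseTarget('apatani'));
// { tribe: 'apatani', locality: null, conceptDir: null, id: null }
```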
Use the unpublish command to remove entries from the database. This will not affect the local files, only the entries in the database will be removed.
Unpublish everything:
tani unpublish --all
Unpublish a specific tribe:
tani unpublish apatani
Unpublish a specific dialect:
tani unpublish apatani/hari
Unpublish a specific concept directory:
tani unpublish apatani/hari/n
Unpublish a specific concept:
tani unpublish apatani/hari/n/animals@10
tani status