Speach - Documenting natural languages
Welcome to Speach’s documentation! Speach, formerly texttaglib, is a Python 3 library for managing, annotating, and converting natural language corpora using popular formats (CoNLL, ELAN, Praat, CSV, JSON, SQLite, VTT, Audacity, TTL, TIG, ISF, etc.).
Main functions:
Text corpus management
Manipulating ELAN transcription files directly in ELAN Annotation Format (eaf)
TIG - A human-friendly interlinear gloss format for linguistic documentation
Multiple storage formats (text files, JSON files, SQLite databases)
Contributors are welcome!
If you want to help develop Speach, please visit the Contributing page.
Installation
Speach is available on PyPI.
pip install speach
ELAN support
Speach can be used to extract annotations as well as metadata from ELAN transcripts, for example:
from speach import elan

# read an ELAN transcript with the speach ELAN reader
eaf = elan.read_eaf('./test/data/test.eaf')

# accessing tiers & annotations
for tier in eaf:
    print(f"{tier.ID} | Participant: {tier.participant} | Type: {tier.type_ref}")
    for ann in tier:
        print(f"{ann.ID.rjust(4, ' ')}. [{ann.from_ts} :: {ann.to_ts}] {ann.text}")
Speach also provides command line tools for processing EAF files.
# this command converts an eaf file into csv
python -m speach eaf2csv input_elan_file.eaf -o output_file_name.csv
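The same kind of export can be sketched in Python using only the tier and annotation attributes shown in the reader example above (`tier.ID`, `ann.from_ts`, `ann.to_ts`, `ann.text`). This is a minimal illustration, not the implementation behind `eaf2csv`: the function name `annotations_to_csv`, the column layout, and the `StandInTier` helper are all hypothetical, and the output of the real command may differ.

```python
import csv
import io
from types import SimpleNamespace


def annotations_to_csv(tiers, out):
    """Write one CSV row per annotation and return the number of rows written.

    `tiers` is any iterable of tier-like objects: each tier has an `ID`
    attribute and yields annotations with `from_ts`, `to_ts` and `text`.
    The column layout here is an assumption made for illustration.
    """
    writer = csv.writer(out)
    writer.writerow(["tier", "from_ts", "to_ts", "text"])
    count = 0
    for tier in tiers:
        for ann in tier:
            # str() keeps the sketch independent of the timestamp type
            writer.writerow([tier.ID, str(ann.from_ts), str(ann.to_ts), ann.text])
            count += 1
    return count


class StandInTier(list):
    """Hypothetical stand-in for a tier, so the sketch runs without an EAF file."""

    def __init__(self, ID, annotations):
        super().__init__(annotations)
        self.ID = ID


demo_tiers = [
    StandInTier("utterance", [
        SimpleNamespace(from_ts="00:00:01.000", to_ts="00:00:02.500", text="hello"),
    ]),
]
buffer = io.StringIO()
rows_written = annotations_to_csv(demo_tiers, buffer)
print(buffer.getvalue())
```

Since iterating over the object returned by `elan.read_eaf(...)` yields tiers in the example above, that object could presumably be passed as `tiers` directly, with `out` being a file opened with `newline=''`.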
Useful Links
Source code: https://github.com/neocl/speach/
Speach on PyPI: https://pypi.org/project/speach/
Speach documentation: https://speach.readthedocs.io/
Release Notes
Release notes are available here.
Contributors
Le Tuan Anh (Maintainer)
Graphic materials
The Speach logo was created using the snake emoji (created by Selina Bauder) and the peach emoji (created by Marius Schnabel) from the OpenMoji project. License: CC BY-SA 4.0