ELAN module

speach supports reading and manipulating multi-tier transcriptions from ELAN directly.

Note

For better security, speach will use the package defusedxml automatically if available to parse XML streams (instead of Python’s default parser). When defusedxml is available, the flag speach.elan.SAFE_MODE will be set to True.
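The optional-dependency behaviour described in this note can be sketched roughly as follows. This is an illustration only, not speach's actual source: the names `SAFE_MODE` and `parse_xml` are local to the sketch, and it assumes the ElementTree-compatible module `defusedxml.ElementTree` is the one being used when the package is installed.

```python
import xml.etree.ElementTree as StdET

try:
    # defusedxml provides hardened drop-in replacements for the parsing functions
    import defusedxml.ElementTree as SafeET
    SAFE_MODE = True
except ImportError:
    # fall back to Python's default parser when defusedxml is not installed
    SafeET = None
    SAFE_MODE = False


def parse_xml(text):
    """Parse an XML string with defusedxml if available, else the stdlib parser."""
    parser_module = SafeET if SAFE_MODE else StdET
    return parser_module.fromstring(text)


root = parse_xml("<ANNOTATION_DOCUMENT><TIER TIER_ID='Tier 1'/></ANNOTATION_DOCUMENT>")
print(SAFE_MODE, root.tag)
```

Checking the flag at runtime (here `SAFE_MODE`, in speach `speach.elan.SAFE_MODE`) is a simple way to confirm which parser is in effect before processing untrusted files.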

For common code samples for processing ELAN files, see the ELAN Recipes page.

ELAN module functions

ELAN module - manipulating ELAN transcript files (*.eaf, *.pfsx)

speach.elan.read_eaf(eaf_path, encoding='utf-8', *args, **kwargs)

Read an EAF file and return an elan.Doc object

>>> from speach import elan
>>> eaf = elan.read_eaf("myfile.eaf")
Parameters
  • eaf_path (str or Path-like object) – Path to existing EAF file

  • encoding (str) – Encoding of the EAF stream, defaults to UTF-8

Return type

speach.elan.Doc

speach.elan.parse_eaf_stream(eaf_stream, *args, **kwargs)

Parse an EAF input stream and return an elan.Doc object

>>> from speach import elan
>>> with open('test/data/test.eaf', encoding='utf-8') as eaf_stream:
...     eaf = elan.parse_eaf_stream(eaf_stream)
Parameters

eaf_stream – EAF text input stream

Return type

speach.elan.Doc

speach.elan.parse_string(eaf_string, *args, **kwargs)

Parse EAF content in a string and return an elan.Doc object

>>> from speach import elan
>>> with open('test/data/test.eaf', encoding='utf-8') as eaf_stream:
...     eaf_content = eaf_stream.read()
...     eaf = elan.parse_string(eaf_content)
Parameters

eaf_string (str) – EAF content stored in a string

Return type

speach.elan.Doc
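For orientation, an EAF file is an XML document, so the stream and string variants above ultimately receive XML text. The sketch below uses only Python's standard library (not speach) on a tiny hand-written sample; the sample is invented for illustration and omits most of what a real ELAN file contains, and the helper name `summarize` is hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# A heavily trimmed, hand-written EAF-like sample (illustrative only)
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="1000"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2000"/>
  </TIME_ORDER>
  <TIER TIER_ID="Person1 (Utterance)" LINGUISTIC_TYPE_REF="default-lt">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>Xin chào</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def summarize(stream):
    """Return (tier ID, start ms, end ms, text) for each time-aligned annotation."""
    root = ET.parse(stream).getroot()
    # time slots map an ID to a value in milliseconds
    slots = {ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
             for ts in root.iter("TIME_SLOT")}
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            rows.append((tier.get("TIER_ID"),
                         slots[ann.get("TIME_SLOT_REF1")],
                         slots[ann.get("TIME_SLOT_REF2")],
                         ann.findtext("ANNOTATION_VALUE")))
    return rows


rows = summarize(io.StringIO(SAMPLE_EAF))
print(rows)
```

The functions documented above spare you this bookkeeping by returning an elan.Doc object instead of raw XML elements.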

ELAN Document model

class speach.elan.Doc(**kwargs)

This class represents an ELAN file (*.eaf)

annotation(ID)

Get annotation by ID

clone(*args, **kwargs)

Clone this ELAN object by using the save() action

classmethod create(media_file='audio.wav', media_url=None, relative_media_url=None, author='', *args, **kwargs)

Create a new blank ELAN doc

>>> from speach import elan
>>> eaf = elan.create()
Parameters

encoding (str) – Encoding of the EAF stream, defaults to UTF-8

Return type

speach.elan.Doc

cut(section, outfile, media_file=None, use_concat=False, *args, **kwargs)

Cut the source media with timestamps defined in section object

For example, the following code cuts all annotations in tier “Tier 1” into separate audio files:

>>> for idx, ann in enumerate(eaf["Tier 1"], start=1):
...     eaf.cut(ann, f"tier1_ann{idx}.wav")
Parameters
  • section – Any object with from_ts and to_ts attributes which return TimeSlot objects

  • outfile – Path to output media file, must not exist or a FileExistsError will be raised

  • media_file – Used to specify the source media file. This overrides the value specified in the source EAF file

Raises

FileExistsError, ValueError

get_linguistic_type(type_id)

Get linguistic type by ID. Returns None if it cannot be found

get_participant_map()

Map participants to tiers. Returns a map from participant name to a list of the corresponding tiers
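The shape of such a map can be sketched with plain Python data. The (tier ID, participant) pairs below are hypothetical stand-ins for real Tier objects, and `participant_map` is a name invented for this sketch rather than part of speach.

```python
from collections import defaultdict

# Hypothetical (tier ID, participant) pairs standing in for real Tier objects
tiers = [
    ("Person1 (Utterance)", "Person1"),
    ("Person1 (Translate)", "Person1"),
    ("Person2 (Utterance)", "Person2"),
]


def participant_map(tier_pairs):
    """Group tier IDs by participant name, preserving tier order."""
    mapping = defaultdict(list)
    for tier_id, participant in tier_pairs:
        mapping[participant].append(tier_id)
    return dict(mapping)


pmap = participant_map(tiers)
print(pmap)
```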

get_vocab(vocab_id)

Get controlled vocab list by ID

media_path()

Try to determine the best path to source media file

new_timeslot(value)

Create a new timeslot object

Parameters

value (int or str) – Timeslot value (in milliseconds)

classmethod parse_string(eaf_string, *args, **kwargs)

Parse EAF content in a string and return an elan.Doc object

>>> from speach import elan
>>> with open('test/data/test.eaf', encoding='utf-8') as eaf_stream:
...     eaf_content = eaf_stream.read()
...     eaf = elan.Doc.parse_string(eaf_content)
Parameters

eaf_string (str) – EAF content stored in a string

Return type

speach.elan.Doc

save(path, encoding='utf-8', xml_declaration=None, default_namespace=None, short_empty_elements=True, *args, **kwargs)

Write ELAN Doc to an EAF file

tiers() → Tuple[speach.elan.Tier]

Collect all existing tiers in this ELAN file

to_csv_rows() → List[List[str]]

Convert this ELAN Doc into a CSV-friendly structure (i.e. a list of lists of strings)

Returns

A list of lists of strings

Return type

List[List[str]]
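A list of lists of strings can be handed straight to Python's csv module. The sketch below uses hypothetical rows, since the actual column layout produced by to_csv_rows() is not described here; `rows_to_csv_text` is a helper invented for the example.

```python
import csv
import io

# Hypothetical rows; the real column layout of to_csv_rows() is an assumption here
rows = [
    ["Person1 (Utterance)", "1.0", "2.0", "Xin chào"],
    ["Person1 (Translate)", "", "", "Hello"],
]


def rows_to_csv_text(rows):
    """Serialize a list of lists of strings to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv_text(rows)
print(csv_text)
```

To write to disk instead, the same `csv.writer` call works on a file opened with `newline=''` and an explicit encoding such as UTF-8.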

to_xml_bin(encoding='utf-8', default_namespace=None, short_empty_elements=True, *args, **kwargs)

Generate EAF content (bytes) in XML format

Returns

EAF content

Return type

bytes

to_xml_str(encoding='utf-8', *args, **kwargs)

Generate EAF content string in XML format

property constraints: Tuple[speach.elan.Constraint]

A tuple of all existing constraints in this ELAN file

property external_refs: Tuple[speach.elan.ExternalRef]

Get all external references

property languages: Tuple[speach.elan.Language]

Get all languages

property licenses: Tuple[speach.elan.License]

Get all licenses

property linguistic_types: Tuple[speach.elan.LinguisticType]

A tuple of all existing linguistic types in this ELAN file

property roots: Tuple[speach.elan.Tier]

All root-level tiers in this ELAN doc

property vocabs: Tuple[speach.elan.ControlledVocab]

A tuple of all existing controlled vocabulary objects in this ELAN file

ELAN Tier model

class speach.elan.Tier(doc=None, xml_node=None, **kwargs)

Represents an ELAN annotation tier

filter(from_ts=None, to_ts=None)

Filter utterances by from_ts, to_ts, or both. If this tier is not a time-based tier, everything will be returned
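One plausible reading of this filter, sketched on plain tuples rather than real annotations, keeps an item only if it starts no earlier than from_ts and ends no later than to_ts. Both that interpretation and the helper name `filter_by_time` are assumptions made for illustration; values are in milliseconds.

```python
def filter_by_time(annotations, from_ts=None, to_ts=None):
    """Keep (start_ms, end_ms, text) tuples that lie inside the given bounds."""
    kept = []
    for start, end, text in annotations:
        if from_ts is not None and start < from_ts:
            continue  # starts before the lower bound
        if to_ts is not None and end > to_ts:
            continue  # ends after the upper bound
        kept.append((start, end, text))
    return kept


annotations = [(0, 900, "a"), (1000, 2000, "b"), (2500, 4000, "c")]
after_1s = filter_by_time(annotations, from_ts=1000)
within = filter_by_time(annotations, from_ts=1000, to_ts=3000)
print(after_1s, within)
```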

get_child(ID)

Get a child tier by ID, return None if nothing is found

new_annotation(value, from_ts=None, to_ts=None, ann_ref_id=None, values=None, timeslots=None, check_cv=True)

Create new annotation(s) in the current tier. ELAN provides 5 different tier stereotypes.

To create a new standard annotation (in a tier with no constraints), a text value and a pair of from-to timestamps must be provided.

>>> from speach import elan
>>> eaf = elan.create()  # create a new ELAN transcript
>>> # create a new utterance tier
>>> tier = eaf.new_tier('Person1 (Utterance)')
>>> # create a new annotation between 00:00:01.000 and 00:00:02.000
>>> a1 = tier.new_annotation('Xin chào', 1000, 2000)

Annotations in Included-In tiers:

>>> eaf.new_linguistic_type('Phoneme', 'Included_In')
>>> tp = eaf.new_tier('Person1 (Phoneme)', 'Phoneme', 'Person1 (Utterance)')
>>> # string-based timestamps can also be used with the helper function elan.ts2msec()
>>> tp.new_annotation('ch', elan.ts2msec("00:00:01.500"),
...                   elan.ts2msec("00:00:01.600"),
...                   ann_ref_id=a1.ID)

Annotations in Symbolic-Association tiers:

>>> eaf.new_linguistic_type('Translate', 'Symbolic_Association')
>>> tt = eaf.new_tier('Person1 (Translate)', 'Translate', 'Person1 (Utterance)')
>>> tt.new_annotation('Hello', ann_ref_id=a1.ID)

Symbolic-Subdivision tiers:

>>> eaf.new_linguistic_type('Tokens', 'Symbolic_Subdivision')
>>> tto = eaf.new_tier('Person1 (Tokens)', 'Tokens', 'Person1 (Utterance)')
>>> # extra annotations can be provided with the argument values
>>> tto.new_annotation('Xin', values=['chào'], ann_ref_id=a1.ID)
>>> # alternative method (set value to None and provide everything with values)
>>> tto.new_annotation(None, values=['Xin', 'chào'], ann_ref_id=a1.ID)
property linguistic_type: speach.elan.LinguisticType

Linguistic type object of this Tier (alias of type_ref)

property name

An alias of the tier’s ID

property parent_ref

ID of the parent tier. Return None if this is a root tier

property time_alignable

Check if this tier contains time alignable annotations

property type_ref: speach.elan.LinguisticType

Tier type object

property type_ref_id

ID of the tier type ref

ELAN Annotation model

There are two different annotation types in ELAN: TimeAnnotation and RefAnnotation. TimeAnnotation objects are time-alignable annotations and contain timestamp pairs from_ts, to_ts to refer back to specific chunks in the source media. On the other hand, RefAnnotation objects are annotations that link to something else, such as another annotation or an annotation sequence in the case of symbolic subdivision tiers.
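The distinction can be sketched with two small dataclasses. These are hypothetical stand-ins (`TimeAnn`, `RefAnn`, `resolve_span`), not the speach classes documented below; the sketch only shows how a reference annotation can be traced back to the time span of the annotation it points to.

```python
from dataclasses import dataclass


@dataclass
class TimeAnn:
    """Hypothetical stand-in for a time-alignable annotation."""
    ID: str
    from_ms: int
    to_ms: int
    value: str


@dataclass
class RefAnn:
    """Hypothetical stand-in for an annotation that links to another one."""
    ID: str
    ref_id: str
    value: str


def resolve_span(ann, by_id):
    """Follow ref_id links until a time-aligned annotation is reached."""
    while isinstance(ann, RefAnn):
        ann = by_id[ann.ref_id]
    return ann.from_ms, ann.to_ms


utterance = TimeAnn("a1", 1000, 2000, "Xin chào")
translation = RefAnn("a2", "a1", "Hello")
by_id = {a.ID: a for a in (utterance, translation)}
span = resolve_span(translation, by_id)
print(span)
```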

class speach.elan.TimeAnnotation(ID, from_ts, to_ts, value, xml_node=None, **kwargs)

An ELAN time-alignable annotation

overlap(other)

Calculate the overlap score between two time annotations. Score = 0 means adjacent, score > 0 means overlapped, score < 0 means no overlap (the distance between the two)
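A score with exactly these properties can be computed as the earlier end time minus the later start time. The function below is a sketch consistent with the description, not necessarily the formula speach uses; `overlap_score` and its arguments (times in seconds) are hypothetical.

```python
def overlap_score(a_from, a_to, b_from, b_to):
    """Earlier end minus later start: >0 overlap, 0 adjacent, <0 gap size (negated)."""
    return min(a_to, b_to) - max(a_from, b_from)


overlapping = overlap_score(1.0, 2.0, 1.5, 3.0)   # the spans share 0.5 s
adjacent = overlap_score(1.0, 2.0, 2.0, 3.0)      # one ends where the other starts
separate = overlap_score(1.0, 2.0, 3.0, 4.0)      # 1 s gap between the spans
print(overlapping, adjacent, separate)
```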

property duration: float

Duration of this annotation (in seconds)

property from_ts: speach.elan.TimeSlot

Start timestamp of this annotation

property to_ts: speach.elan.TimeSlot

End timestamp of this annotation

class speach.elan.RefAnnotation(ID, ref_id, previous, value, xml_node=None, **kwargs)

An ELAN ref annotation (not time alignable)

property ref_id

ID of the referenced annotation

class speach.elan.Annotation(ID, value, cve_ref=None, xml_node=None, **kwargs)

An ELAN abstract annotation (for both alignable and non-alignable annotations)

property text

An alias of Annotation.value

property value: str

Annotated text value.

It is possible to change the value of an annotation:

>>> ann.value
'Old value'
>>> ann.value = "New value"
>>> ann.value
'New value'
class speach.elan.TimeSlot(xml_node=None, ID=None, value=None, *args, **kwargs)
property sec

Get TimeSlot value in seconds

property ts: str

Return timestamp of this annotation in vtt format (00:01:02.345)

Returns

An empty string will be returned if TimeSlot value is None

property value

TimeSlot value (in milliseconds)
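The relationship between the three representations (milliseconds, seconds, and a vtt-style timestamp) can be sketched with two small helpers. `msec_to_vtt` and `vtt_to_msec` are hypothetical names written for this example; they are not speach functions and only mirror the behaviour described above, including the empty string for a missing value.

```python
def msec_to_vtt(msec):
    """Format milliseconds as HH:MM:SS.mmm; return '' when the value is None."""
    if msec is None:
        return ""
    hours, rest = divmod(int(msec), 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def vtt_to_msec(ts):
    """Parse an HH:MM:SS.mmm timestamp into milliseconds."""
    hours, minutes, seconds = ts.split(":")
    sec, _, frac = seconds.partition(".")
    # pad or trim the fractional part to exactly three digits
    millis = int(frac.ljust(3, "0")[:3]) if frac else 0
    return (int(hours) * 3600 + int(minutes) * 60 + int(sec)) * 1000 + millis


value_ms = 62345
value_sec = value_ms / 1000
stamp = msec_to_vtt(value_ms)
print(value_sec, stamp, vtt_to_msec("00:00:01.500"))
```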