Paper title

Seshat: A tool for managing and verifying annotation campaigns of audio data

Authors

Hadrien Titeux, Rachid Riad, Xuan-Nga Cao, Nicolas Hamilakis, Kris Madden, Alejandrina Cristia, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux

Abstract

We introduce Seshat, a new, simple, open-source software tool for efficiently managing annotations of speech corpora. Seshat lets users easily customise and manage annotations of large audio corpora while ensuring that the annotated output files comply with formatting and naming conventions. It also includes procedures for checking the content of annotations against specific rules, which can be implemented in personalised parsers. Finally, we propose a double-annotation mode, for which Seshat automatically computes inter-annotator agreement using the $γ$ measure, which takes both categorisation and segmentation discrepancies into account.
