standoff-mode

https://github.com/lueck/standoff-mode.git

git clone 'https://github.com/lueck/standoff-mode.git'

standoff-mode is a major mode for GNU Emacs that lets you create annotations on texts in a stand-off manner. It is written for use in the field of digital humanities and the manual annotation of training data for named-entity recognition.

There are several tools for creating stand-off markup. Most of them must be deployed on a server in a network environment, which can be a barrier. In contrast, standoff-mode needs no network environment, so you can start annotating texts right away.

standoff-mode can store markup in several formats: as dumped Lisp expressions (implemented), in a remote or local SQL database or as RDF triples in a SPARQL endpoint following the emerging standard defined in the OpenAnnotation ontology (roadmap), or as local files following BRAT's plain-text format (planned).
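To give an idea of what the planned BRAT plain-text format looks like, here is a minimal Python sketch that writes text-bound annotations as BRAT-style lines (an ID, a tab, the type with start and end offsets, a tab, and the covered text). The function name `to_brat` and the sample sentence are hypothetical, and this is not standoff-mode's own output, since that backend is only planned.

```python
def to_brat(source, spans):
    """Format (label, start, end) spans as BRAT text-bound annotation lines.

    Offsets are 0-based and the end offset is exclusive, as in BRAT's
    standoff format. This helper is illustrative, not part of standoff-mode.
    """
    lines = []
    for number, (label, start, end) in enumerate(spans, start=1):
        covered = source[start:end]  # the annotated stretch of text
        lines.append(f"T{number}\t{label} {start} {end}\t{covered}")
    return "\n".join(lines)


# Hypothetical sample data; the source text itself is never modified.
source = "Kafka was born in Prague."
spans = [("Person", 0, 5), ("Location", 18, 24)]

brat = to_brat(source, spans)
print(brat)
```

Each annotation ends up on its own line in a separate file next to the source text, which is what makes the format easy to inspect and to process with other tools.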

standoff-mode does not try to do everything under one hood. It is just a tool for the manual annotation of texts; statistics must be done with another tool.

Because it was written for the digital humanities, and literary studies in particular, standoff-mode works not only with plain-text source files but also with XML. Semantic stand-off markup produced with it can therefore reference structural markup encoded in TEI/P5, which may be an advantage for further processing.

Stand-off Markup

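Stand-off markup, also known as external markup, is markup that is stored separately from the text it annotates and refers to that text by position instead of being inserted into it.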

Cf. the TEI/P5 guidelines on stand-off markup and the OpenAnnotation ontology.
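The idea can be sketched in a few lines of Python: the source text stays untouched, and each annotation is a separate record that points into it by character offsets. The `Annotation` class, the helper `annotated_text`, the labels and the 0-based, end-exclusive offsets are assumptions made for this illustration; they do not describe standoff-mode's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One stand-off markup element: a label attached to a character range."""
    label: str
    start: int  # offset of the first annotated character (0-based, assumed)
    end: int    # offset one past the last annotated character


def annotated_text(source: str, ann: Annotation) -> str:
    """Return the stretch of the source that the annotation refers to."""
    return source[ann.start:ann.end]


# The source is read-only; the markup lives in a separate list.
source = "Goethe met Schiller in Jena."
annotations = [
    Annotation("person", 0, 6),
    Annotation("person", 11, 19),
    Annotation("place", 23, 27),
]

for ann in annotations:
    print(ann.label, annotated_text(source, ann))
```

Because the annotations only hold offsets, the same source can carry several independent or overlapping layers of markup without ever being edited.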

Features

Roadmap

standoff-mode is under active development. Here's the roadmap:

Requirements

Only GNU Emacs is required. Once the editor is installed, the standoff-mode package has to be installed. It has been tested on Windows, Linux and Mac with Emacs versions 24.3 and 24.5.

If you want to store your markup in SQL tables or as RDF triples, an RDBMS or a SPARQL endpoint is also required.