
LNE-Matics: a unified response to large-scale evaluation needs

LNE-Matics is a free and open-source software suite designed for data exploration and the evaluation of systems. Its initial purpose is the evaluation of Natural Language Processing (NLP) systems; the next step is to open it up to a wider range of data derived from artificial intelligence systems.

Exploration of annotated data and evaluation results

What is LNE-Matics?

LNE-Matics is a free and open-source software suite for exploring annotated data and evaluation results. It offers a dataframe data model that allows intuitive exploration of data characteristics and evaluation results, and it supports graphing the values and running appropriate statistical tests. The tools already run on several Natural Language Processing tasks and standard annotation formats, and are under ongoing development.

LNE-Matics is developed in C++ with the Qt API. It currently runs on Linux. It requires the installation of MongoDB.

What can I find in the software suite?

LNE-Matics comprises two interconnected programs:

  • DATOMATIC: Designed for importing corpora and files and indexing them in a database. The data can consist of reference data (e.g. labeled by an expert) and hypothesis data (the automatically labeled output of an NLP system). Source data (i.e. unlabeled and/or unstructured), such as plain text or audio, can also be included. The data can be browsed using search features and visualized according to their type (text, video, audio and the related annotations). The software offers several descriptive statistics (signal duration, number of words, speakers, entities, file or language distribution, etc.). Multi-criteria sub-selections of the corpora can be performed, and the resulting corpora can be exported locally for processing in Evalomatic.
  • EVALOMATIC: Works exclusively on Datomatic-formatted databases. It runs evaluations, for example comparisons between reference and hypothesis data for speech transcription tasks. The reference and hypothesis data (as well as the evaluation results) are structured as dataframes, so the data can be manipulated to evaluate at different levels of granularity. The software offers several standard evaluation metrics (e.g. F-measure, Slot Error Rate or SER), some of which are specifically designed for NLP (e.g. Word Error Rate or WER). Statistical tests are provided (e.g. t-tests or the Wilcoxon test). Data and results can be plotted as graphs (e.g. DET plot, bar chart).
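To illustrate what comparing reference and hypothesis data at different levels of granularity can look like, here is a minimal Python sketch of a Word Error Rate computation. It is not the LNE-Matics implementation (the suite itself is written in C++), and the row layout, the sample sentences, and the function names `edit_errors` and `wer_by` are assumptions made for this example only. WER is taken here as the word-level edit distance divided by the number of reference words.

```python
from collections import defaultdict


def edit_errors(reference, hypothesis):
    """Return (errors, reference_length), where errors is the minimum number
    of word substitutions, deletions and insertions (Levenshtein distance)."""
    ref, hyp = reference.split(), hypothesis.split()
    # previous[j]: distance between the reference prefix seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1], len(ref)


def wer_by(rows, key):
    """Aggregate WER over groups of rows; `key` maps a row to its group."""
    errors, words = defaultdict(int), defaultdict(int)
    for row in rows:
        row_errors, row_words = edit_errors(row[1], row[2])
        errors[key(row)] += row_errors
        words[key(row)] += row_words
    return {group: errors[group] / words[group] for group in errors}


# Hypothetical rows: (speaker, reference transcript, hypothesis transcript).
rows = [
    ("spk1", "the cat sat on the mat", "the cat sat on mat"),
    ("spk1", "hello world", "hello word"),
    ("spk2", "good morning everyone", "good morning everyone"),
]

corpus_wer = wer_by(rows, lambda row: "all")     # whole-corpus granularity
speaker_wer = wer_by(rows, lambda row: row[0])   # per-speaker granularity
```

Changing the `key` function is what changes the granularity: the same rows yield one score for the whole corpus or one score per speaker. Errors and reference words are summed within each group before dividing, so longer utterances weigh more than shorter ones. The sketch assumes every group contains at least one reference word.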

The origins of LNE-Matics

The LNE (Laboratoire national de métrologie et d’essais, the French national metrology and testing laboratory) has conducted many evaluations of data-processing systems. These evaluations covered various NLP tasks and systems (speech recognition, speaker diarization, speaker identification, named entity recognition, optical character recognition, etc.), which meant dealing with different system output formats, annotation guides, and comparison metrics. Over time, a number of commonalities emerged across these evaluations, in the pre-processing and exploration of the data and in the computation and viewing of statistical scores, hence the need for a reusable, general framework for carrying out evaluations.

To provide a unified response to these evaluation needs, we first tested data handling and UI prototypes in a pre-project called LNE-Visu, presented as a demonstration at the French JEP-TALN-Recital joint conference in 2016 (Bernard et al., 2016).
Then, taking the results into account, we started an internal project to build the LNE-Matics software suite and implement our vision of such an exploration and evaluation interface.



Scientific communications

  • Galibert, O., Bernard, G., Delaborde, A., Lecadre, S., Kahn, J. (2018). Matics Software Suite: New Tools for Evaluation and Data Exploration. In proc. 11th edition of the Language Resources and Evaluation Conference, 7-12 May 2018, Miyazaki (Japan).
  • Bernard, G., Galibert, O., Rémi, R., Demeyer, S., Kahn, J. (2016). LNE-Visu : une plateforme d’exploration et de visualisation de données d’évaluation (LNE-Visu: a platform for the exploration and display of evaluation data). In proc. 2016 JEP-TALN-Recital joint conference.