Evaluation of artificial intelligence systems

LNE develops standards and evaluates intelligent systems to give its customers reliable benchmarks and results with which to qualify their systems and make pragmatic, well-reasoned decisions.

Services provided

Evaluations to the benefit of developers

LNE performs functional testing and performance measurement of AI systems so that developers can optimize their development process until they reach a viable product.

Determining the performance level of a technology requires metrics. In addition to metrics for the overall performance of the system, specific metrics associated with its individual components help identify areas for improvement. They can also be used to assess whether the technological choices and directions taken actually improve the effectiveness of the solution, in particular by relating measured progress to the investments made, so as to estimate the impact of those investments.
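As an illustration of component-level metrics, the following minimal sketch computes precision, recall and F1 score per component and flags the weakest one. The component names and counts are hypothetical, and F1 is only one possible choice of metric, not necessarily the one used in LNE evaluations.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Outcome counts for one component on a labeled test set."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives


def precision(c: Counts) -> float:
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c: Counts) -> float:
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def f1(c: Counts) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def weakest_component(components: dict[str, Counts]) -> str:
    """Return the name of the component with the lowest F1 score."""
    return min(components, key=lambda name: f1(components[name]))


# Hypothetical counts for a two-stage system (illustrative values only).
components = {
    "detector": Counts(tp=90, fp=10, fn=10),
    "classifier": Counts(tp=60, fp=20, fn=40),
}

for name, counts in components.items():
    print(f"{name}: P={precision(counts):.2f} R={recall(counts):.2f} F1={f1(counts):.2f}")
print("Candidate area for improvement:", weakest_component(components))
```

Tracking such per-component scores from one version of the system to the next is one way to relate measured progress to the effort invested in each component.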

Qualification of intelligent systems is therefore imperative for development and certification purposes. It makes it possible to:

  • identify the origin of underperformance and guide future developments,
  • estimate the amount and nature of effort required prior to commercial launch of the product,
  • assess the impact of investments made to advance technology,
  • characterize the system's scope of use,
  • guarantee the conformity of the product to fixed quality and performance requirements,
  • position oneself in relation to the competition (by participating in evaluation campaigns).

Indeed, the evaluation allows developers to identify the characteristics that differentiate their technology from that of their competitors. A developer who has performed well in an evaluation campaign can not only assure customers that the system complies with a set of quality requirements, but also demonstrate, for marketing purposes, that it stood out from the competition in effectiveness.

Evaluations for integrators and end users

LNE provides its customers with reliable benchmarks and results so that they can pragmatically choose, among existing technologies, the AI solution their company should adopt.

The evaluation problem is largely new and has its own metrological specificity: the aptitude of intelligent systems must be measured mainly at the functional level, and lies above all in their adaptability, which is inherent in the notion of intelligence. It is therefore a question not only of quantifying functions and performance, but also of validating and characterising operating environments (areas of use).

Given the wide variety of environments to which the system must be exposed, customers do not have the means to perform all the tests required to meet their needs. Nor can they rely solely on the developer, who will be tempted to restrict the evaluation to the cases that seem most convincing for the product. Clients wishing to rely on a third-party arbitrator may find it advantageous to turn to LNE, which has several distinctive advantages: it is a public agency, independent of any particular interest, so its opinions are impartial and the intellectual property of the elements entrusted to it (processes and data to be tested) is protected; this neutrality is reinforced by its strict specialisation in evaluation.

LNE provides objective quantitative criteria to help its customers make an informed choice among the artificial intelligence technologies on offer. It thus brings its expertise to:

  • formalize the client's needs and use cases,
  • map the market's potential technological solutions,
  • define an experimental framework (creation of labeled testing databases, development of testing environment, etc.),
  • carry out evaluation campaigns (benchmarking) and audits of potential solutions,
  • write an evaluation report allowing a pragmatic and reasoned decision concerning the AI solution to be acquired.
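The benchmarking step can be pictured as scoring every candidate solution against the same labeled test database and ranking the results. The sketch below assumes a simple classification task scored by accuracy; the system names, labels and predictions are hypothetical, and a real campaign would use metrics suited to the client's use case.

```python
def accuracy(reference: list[str], predictions: list[str]) -> float:
    """Fraction of test items for which the prediction matches the reference label."""
    if not reference:
        raise ValueError("the test set is empty")
    if len(reference) != len(predictions):
        raise ValueError("reference and predictions must have the same length")
    return sum(r == p for r, p in zip(reference, predictions)) / len(reference)


def rank_systems(reference: list[str], outputs: dict[str, list[str]]) -> list[tuple[str, float]]:
    """Score each candidate on the same test set and sort from best to worst."""
    scores = {name: accuracy(reference, preds) for name, preds in outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical labeled test set and candidate outputs (illustrative only).
reference = ["cat", "dog", "cat", "bird"]
outputs = {
    "system_a": ["cat", "dog", "dog", "bird"],
    "system_b": ["cat", "cat", "dog", "bird"],
}

for name, score in rank_systems(reference, outputs):
    print(f"{name}: accuracy = {score:.2f}")
```

Using one shared reference set for all candidates is what makes the scores comparable; the ranking alone says nothing about whether the differences are statistically significant.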

After the technological solution has been acquired, LNE supports its clients by:

  • carrying out validation tests that attest to the functionalities of the purchased system,
  • providing post-acquisition technical support to identify the technological building blocks of an off-the-shelf solution that must be adapted to the specific needs of the end user.

Examples of applications concerned by LNE evaluations

Evaluations for public funding bodies

LNE, through rigorous measurement of technological progress, enables funding bodies to estimate the impact of investments made.

LNE, as a trusted third party evaluator, assists public bodies with project management by:

  • organizing evaluation campaigns (or challenges) that enable public bodies:
    • to determine the maturity of technologies and validate new R&D paths,
    • to judge the impact of the public funding granted,
    • to encourage innovation by fostering emulation ("coopetition"),
  • developing evaluation methods and metrics to ensure:
    • repeatability of performance measurements,
    • reproducibility of experiments.

 

The evaluation campaigns organised by LNE are multi-year projects that provide a common framework in which teams developing competing approaches can be compared. These campaigns are an essential means of organization and motivation: they maintain exchanges between the various participants, generate a significant ripple effect, and help to overcome scientific or technological barriers, improve performance and support the rise in TRL (Technology Readiness Level) of the systems concerned.

Resources available

Databases

Data is the key to AI evaluation and development. LNE has experience in building large, high-quality, structured and labeled datasets. These can be based on customer data or provided by LNE's partners, who are business experts in the various fields covered by its evaluations. LNE ensures that the confidentiality and ownership of the data are respected.

LNE organises evaluations of AI systems that use different types of data:

  • Text: machine translation, document classification, structuring and summarization, named entity recognition, question answering, etc.
  • Log files: cybersecurity.
  • Speech: automatic speech recognition, language and speaker identification, spoken word detection, translation, etc.
  • Video and image: object recognition, head detection, person tracking, optical character recognition.
  • Sensor measurements used in robotics or for autonomous vehicles.

Testing environments

Depending on the customer's needs, LNE can perform physical tests in real but controlled environments, virtual tests in fully simulated environments and mixed tests combining real and simulated stimuli.

Evaluation of the LAAS HRP-2 robot in a climatic chamber

Tests in real environments are carried out in anechoic and reverberant rooms and in climatic chambers (temperature, humidity, pressure), as well as under salt spray or sunlight exposure, in order to analyse the influence of environmental conditions on the performance of intelligent systems. LNE can also perform vibration, shock and constant acceleration tests to evaluate the behaviour of systems under extreme conditions and so determine their operating limits precisely.
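One simple way to picture the determination of operating limits is to sweep an environmental parameter and keep the range over which the measured performance stays above a required level. The sketch below assumes a temperature sweep with one score per set point; the temperatures, scores and the 0.85 requirement are hypothetical and do not come from an actual LNE test.

```python
def operating_range(measurements: list[tuple[float, float]], threshold: float):
    """Return (low, high) bounds of the widest contiguous run of tested
    conditions whose score meets the threshold, or None if none does."""
    best = None
    start = None
    for condition, score in sorted(measurements):
        if score >= threshold:
            if start is None:
                start = condition  # a new passing run begins here
            if best is None or (condition - start) > (best[1] - best[0]):
                best = (start, condition)
        else:
            start = None  # the passing run is interrupted
    return best


# Hypothetical (temperature in °C, performance score) pairs from a chamber sweep.
sweep = [(-20, 0.62), (-10, 0.88), (0, 0.93), (20, 0.95), (40, 0.91), (60, 0.70)]

print(operating_range(sweep, threshold=0.85))
```

The result only covers the set points that were actually tested: with these values the true lower limit lies somewhere between -20 °C and -10 °C, so narrowing it down would require additional measurements in that interval.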

 

Evaluation of the autonomous vehicle (SVA project at IRT SystemX)

For the evaluation of autonomous systems moving in open and changing environments, given the almost infinite number of configurations the system could encounter, LNE participates in the development of virtual test environments that allow systems to be validated by simulation. Characterizing intelligent systems virtually in this way avoids the prohibitive costs that conducting all tests in real environments would incur.

Research activities conducted prior to evaluations

In order to develop its evaluation resources and maintain its own skills, LNE also carries out well-targeted research projects, alone or in the framework of public and private partnerships, and ensures the transfer of its results where appropriate. LNE research topics generally focus on:

  • evaluation protocols,
  • tools (data labeling/qualification software, etc.) and metrics,
  • real and virtual testing environments.

Standardization activities

LNE also contributes to the major cross-cutting challenges of AI by developing reference frameworks to explain, guarantee and certify intelligent systems and to enable the development of standards and regulations. In particular, LNE participates in the AFNOR commission on artificial intelligence, the AFNOR strategic information and digital communication committee and UNM section 81 on industrial robotics.

These standards enable manufacturers to know exactly what the regulatory expectations are before an intelligent system is put on the market. They reassure consumers about the product, particularly through an ethical and responsible approach to artificial intelligence.

Projects and publications