Artificial intelligence: towards new contributions for regulators

As part of a study on the use of data science for supervisory purposes, the Autorité des marchés financiers (AMF) has explored the potential of natural language processing technologies for analysing documents prepared by listed companies. The regulator focused its first experiment on how companies communicate the risks they face.

As part of its ICData program, the AMF has set several major objectives, including the exploration of automatic data processing, such as the automatic reading of documents. A first experiment, the results of which will be published, explores the potential of one branch of artificial intelligence: natural language processing (NLP).

Deep learning for risk factor analysis

To test these possibilities, the study was carried out on the risk factor sections of more than a hundred universal registration documents of listed companies, over a period from 2012 to 2020. The difficulty of reading this type of content automatically lies in the variety of risk factors and of the ways they are presented, sometimes interlinked, but recent advances in deep learning are likely to handle some of this complexity.

Automatic data extraction and visualisation

This first study shows that it is possible to automatically extract the distribution of risks by sector or by issuer and to follow how it evolves over time. It is also possible to detect the most significant year-on-year variations in how prominently each risk is mentioned. As an illustration, the tool visually highlighted the emergence of pandemic risk in documents published in 2021 (for the 2020 accounting year), as well as the continuing growth of IT security risk.
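The kind of aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AMF's tool: it assumes that an upstream model has already assigned each passage to a risk category, and the `Mention` record, the function names `risk_shares` and `largest_variations`, the issuers, sectors and counts are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Mention(NamedTuple):
    """Hypothetical record produced by an upstream risk classifier."""
    issuer: str
    sector: str
    year: int   # accounting year covered by the document
    risk: str   # risk category assigned to the passages
    count: int  # number of mentions in the risk factor section


# Illustrative values only; these are not figures from the study.
MENTIONS = [
    Mention("IssuerA", "Industry", 2019, "pandemic", 0),
    Mention("IssuerA", "Industry", 2019, "it_security", 4),
    Mention("IssuerA", "Industry", 2019, "market", 6),
    Mention("IssuerA", "Industry", 2020, "pandemic", 5),
    Mention("IssuerA", "Industry", 2020, "it_security", 5),
    Mention("IssuerA", "Industry", 2020, "market", 10),
    Mention("IssuerB", "Retail", 2019, "pandemic", 0),
    Mention("IssuerB", "Retail", 2019, "it_security", 2),
    Mention("IssuerB", "Retail", 2019, "market", 8),
    Mention("IssuerB", "Retail", 2020, "pandemic", 6),
    Mention("IssuerB", "Retail", 2020, "it_security", 3),
    Mention("IssuerB", "Retail", 2020, "market", 3),
]


def risk_shares(mentions, group_by):
    """Return {(group, year): {risk: share of that group's mentions}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for m in mentions:
        counts[(getattr(m, group_by), m.year)][m.risk] += m.count
    shares = {}
    for key, by_risk in counts.items():
        total = sum(by_risk.values())
        if total:  # skip groups with no mentions to avoid dividing by zero
            shares[key] = {risk: n / total for risk, n in by_risk.items()}
    return shares


def largest_variations(shares, group, year):
    """Rank risks by absolute change in share between year - 1 and year."""
    before = shares.get((group, year - 1), {})
    after = shares.get((group, year), {})
    deltas = {
        risk: after.get(risk, 0.0) - before.get(risk, 0.0)
        for risk in set(before) | set(after)
    }
    return sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    by_sector = risk_shares(MENTIONS, "sector")
    for risk, delta in largest_variations(by_sector, "Industry", 2020):
        print(f"{risk}: {delta:+.2f}")
```

Passing "issuer" instead of "sector" to `risk_shares` gives the per-issuer view. Working with shares rather than raw counts keeps a longer document from dominating the comparison, which is one possible design choice among others.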

The importance of machine-readable formats

This first experiment points to a more extensive use of automated processing to support the supervisory actions of regulators. However, the automatic processing of regulatory documents requires regulated players to use more machine-readable formats. It also requires them to follow good practices that improve document quality, for example by using appropriate tags to structure both the text and the tables in their documents and thus enable efficient automatic reading.
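To illustrate why tagging matters, here is a minimal sketch using Python's standard `html.parser` module. It assumes, purely for illustration, that a document is published as HTML-like markup with heading and table tags; the sample fragment and the `StructureExtractor` class are hypothetical and do not describe any format mandated by the AMF.

```python
from html.parser import HTMLParser

# Hypothetical, well-tagged fragment of a risk factor section.
SAMPLE = """
<h2>Risk factors</h2>
<p>The group is exposed to the risks below.</p>
<table>
  <tr><th>Risk</th><th>Impact</th></tr>
  <tr><td>IT security</td><td>High</td></tr>
</table>
"""


class StructureExtractor(HTMLParser):
    """Collect section headings and table rows from tagged markup."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.rows = []
        self._tag = None  # heading or cell tag currently open
        self._row = None  # cells of the table row being read

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "td", "th"):
            self._tag = tag
        elif tag == "tr":
            self._row = []

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        if tag == self._tag:
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._tag is None:
            return  # ignore whitespace and untagged body text
        if self._tag in ("td", "th"):
            if self._row is not None:
                self._row.append(text)
        else:
            self.headings.append(text)


if __name__ == "__main__":
    parser = StructureExtractor()
    parser.feed(SAMPLE)
    print(parser.headings)
    print(parser.rows)
```

With explicit tags, headings and table cells can be recovered directly. Without them, the same information would have to be inferred from layout, which is far less reliable.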