Part II. The control of voice communications: a considerable task finally solved?

The control of voice communications is a considerable task for compliance teams, and until recently they were poorly equipped for it because of the particular nature of audio data. Recent successful deployments are now leading the way, not only for the compliance departments of investment services providers (PSI), but also for many of a company's other business lines.

Existing approaches

“Given the mass of calls generated in trading rooms, the capacity of human listening is largely insufficient and the probability of discovery on a random sample is extremely limited.”

Compliance officers often have few tools at their disposal to address telephone communications: access to and replay of targeted recordings; search by call metadata, such as caller, called party, date and time of the call; and, finally, listening to samples.
Given the mass of calls generated in trading rooms, human listening capacity is largely insufficient, and the probability of discovering an issue in a random sample is extremely low. Only a near-exhaustive review of calls can provide an effective discovery capability. This is a huge task for the compliance department, and one that is impossible to complete without automation.
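To make the sampling argument concrete, the short Python sketch below estimates the chance that a random sample of calls contains at least one problematic call. It assumes a simple model that is not in the original article: calls are drawn uniformly at random without replacement (a hypergeometric model). The function name and the figures used in the example are purely hypothetical illustrations.

```python
from math import comb


def detection_probability(total_calls: int, problematic_calls: int, sample_size: int) -> float:
    """Probability that a uniform random sample, drawn without replacement,
    contains at least one problematic call (hypergeometric model).

    All inputs are illustrative assumptions, not figures from the article.
    """
    if not 0 <= problematic_calls <= total_calls:
        raise ValueError("problematic_calls must be between 0 and total_calls")
    if not 0 <= sample_size <= total_calls:
        raise ValueError("sample_size must be between 0 and total_calls")

    clean_calls = total_calls - problematic_calls
    # P(sample contains no problematic call) = C(clean, n) / C(total, n)
    p_miss = comb(clean_calls, sample_size) / comb(total_calls, sample_size)
    return 1.0 - p_miss


if __name__ == "__main__":
    # Hypothetical figures: 10,000 calls, 5 of them problematic, 100 listened to.
    p = detection_probability(10_000, 5, 100)
    print(f"Chance of hearing at least one problematic call: {p:.1%}")
```

Under these hypothetical figures, listening to 1% of the calls gives only about a one-in-twenty chance of hearing even one of the problematic calls, which is why sampling alone offers little discovery capability.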

As for automatic voice solutions, the main offerings available to date have mostly come out of the North American market. They include keyword detection and phonetic indexing, which search for phoneme sequences and key phrases in a previously extracted audio database. Since the audio is not transformed into text, this type of approach does not converge naturally with other digital channels and does not benefit from rapid advances in textual data intelligence. These solutions sometimes include some speech-to-text capabilities but, until recently, they have not been able to move beyond limited-scope “ad-hoc” investigations to reach daily surveillance.
The technological revolution brought by deep learning, discussed in a previous article, now allows much more accurate and robust transcription solutions to emerge. Speech recognition today is based on Deep Neural Networks (DNN), a core part of what is now commonly called Artificial Intelligence (AI).
However, trading room audio data has many particularities, each of which is a challenge for automatic solutions, even neural ones.

The nature of the data

“The audio data from trading room recordings combines the toughest challenges of voice recognition.”

The audio data that falls within the scope of regulatory control consists essentially of recordings from the telephones of the trading room environment: trading turrets, IP telephones, and possibly mobile telephones.
Some market telephony providers are preparing offers to analyze their telephone flows at the source, which opens up interesting prospects. For now, these new offers are still limited and not necessarily compatible with the legal retention requirements of MiFID II (MIF 2).
Conventional recordings of trading room conversations remain today the essential source for compliance work. These recordings combine many challenges: the volume of data, the multiplicity of telephony types and recorders, ambient noise, overlapping speech (especially when lines are mixed together), poor audio quality due to heavy compression of the recordings, the spontaneity of speech, the jargon used, and the variety of languages and accents.
In summary, the audio data from the trading room recordings combines the most difficult challenges of voice recognition.
Interestingly, the biggest challenges for the technology are those specific to Europe and, more generally, to territories outside the Americas: the diversity of languages spoken by traders, the many borrowings from English in non-English conversations, and especially the prevalence of non-native accents in English-language conversations.
It has therefore been necessary to push neural technologies even further to meet these specific challenges (see Box 2) and to respond to the concrete needs of the trading rooms.

About the Author

Ariane Nabeth-Halber

Ariane Nabeth-Halber, Director, Strategic Line “Speech”, Bertin IT; Member of the Board of LT-Innovate, Language Technology Industry Association;
Expert and Reviewer at the European Commission; Doctor in Computer Science and Signal Processing.