Find and Fight Unknown Money Laundering Risk with Machine Learning

Advanced analytics has traditionally played an important role in Anti-Money Laundering (AML) compliance. Machine learning (ML) represents the next evolution in advanced analytics and can be leveraged as part of AML compliance technology. While many institutions are concerned about regulatory acceptance of ML in AML compliance, greater familiarity with the technologies and a better understanding of the benefits are easing some of those initial concerns. In fact, our survey, The Evolving Role of ML in Fighting Financial Crime, found that 77% of European financial institutions¹ have begun to consider using ML as part of their compliance technology.

While data science and ML can improve AML compliance markedly, they will not replace existing systems overnight. The current regulatory environment demands transparency, agility, and efficacy to mitigate risk. New technology will be subjected to the same scrutiny and oversight as existing systems. The gradual introduction of ML models that cooperate with existing systems to enhance risk detection and ease transition to broader application of ML for compliance purposes is the preferred approach.

Where transaction monitoring (TM) systems are somewhat effective in identifying an organisation’s “known risks,” they offer little capability in helping proactively mitigate “unknown risks”—ML can help bridge this critical gap. Here we discuss the benefits and use cases of ML to combat financial crime.

Rules-Based Systems Help Identify Known Risks, But Ineffectively

The primary goal of any TM system is to identify transaction patterns that are indicative of risk typologies relevant to the institution. Typically, these risks are identified through the business-wide AML risk assessment and coverage assessment that identify behaviours that indicate risk based on the institution’s customers, products, services, channels, and geographies. Existing monitoring systems then utilise rules to detect the various behaviours identified in the coverage assessment. For years, transaction monitoring vendors have developed rule-based systems tailored to these typologies to manage the entire process end-to-end, from importing the transaction data to the creation of alerts using rule logic to facilitating case management for the investigation teams. The benefit that these companies attempt to bring with their solutions is to provide a standardised set of rules to cover risks immediately. Regardless, institutions must still demonstrate that the rules cover the risks.

Rules-based TM systems have been in place for more than 15 years and have had varying degrees of success in the identification of potentially suspicious activity. This success, however, comes with inefficiency in the form of false positives. Some rules-based TM systems have rarely alerted on transactional activity that resulted in a suspicious activity report (SAR). In fact, in our experience, false positive rates of 95% to 99% are not unusual. Additionally, rules-based systems can only react to the typologies they are designed to detect, which makes them ineffective in identifying unknown risks. This limitation is key, given how criminal methodologies evolve over time.

Existing TM systems use transaction and customer data to detect potentially suspicious activity (alerts). Investigators evaluate these alerts to determine whether a SAR filing is required or if the alert can be dismissed as a false positive. The data created by these systems during alert generation and investigation provides a rich source of information to train ML models.

Using a combination of supervised, semi-supervised, and unsupervised ML models can enhance efficiency and effectiveness in the identification of suspicious activity.

Supervised Learning Improves Efficiency

Supervised models can produce efficiency gains by leveraging prior information to predict future outcomes. They are particularly effective in predicting which cases generated by existing rules have a higher probability of escalation, based on the unique characteristics of the case. The ML prioritisation, or “scoring,” model will filter the alerts generated by the existing TM rules-based system. Prioritisation typically involves a process in which some cases are no longer created and/or cases are sent directly to specific levels of investigator review, bypassing some early levels of the standard workflow. Rules are reactive and identify known typologies and risks. Since supervised models rely on the output from rules, they too are reliant on known typologies and risks.
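The prioritisation step described above can be sketched in a few lines. This is a minimal illustration only: it assumes a supervised model has already produced an escalation probability for each alert, and the `Alert` class, the queue names, and the two thresholds are hypothetical values chosen for the example, not recommended settings.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    alert_id: str
    score: float  # assumed: model-estimated probability of escalation (0 to 1)


def route_alert(alert: Alert, low: float = 0.05, high: float = 0.60) -> str:
    """Route a scored alert to a queue. Thresholds are illustrative assumptions."""
    if alert.score < low:
        return "no_case"          # low-scoring alerts no longer create a case
    if alert.score >= high:
        return "level_2_review"   # high-scoring alerts bypass the first review level
    return "level_1_review"       # everything else follows the standard workflow


alerts = [Alert("A1", 0.02), Alert("A2", 0.35), Alert("A3", 0.81)]
routing = {a.alert_id: route_alert(a) for a in alerts}
print(routing)
```

In practice, the thresholds would be set from the institution's own disposition history and risk appetite, and low-scoring alerts are often sampled for review rather than discarded outright.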

While case prioritisation models can be very useful at increasing efficiency, there are common challenges that many implementations face. Sufficient prior disposition data is often the first hurdle. Even with the best available features and data, this model requires sufficient and accurate historical results to make predictions on future cases. In Guidehouse’s implementation experience, six months to two years of historical cases (depending on volumes) that have been dispositioned (and quality assured) is ideal. When historical data is available, it is crucial to understand both the process the reviewers used and the data reviewed when assigning a conclusion. For example, if the reviewers used negative news searches as a key factor in many dispositions, but that data is not available for a model to digest, that risk will not be captured in the model and the resulting output will not be reflective of historical dispositions.

Another consideration that institutions should be aware of when selecting historical data to use is whether any significant changes to the business have occurred during that timeframe. Changes may include entering a new line of business: If the institution just started correspondent banking two months ago, two years of historical data to train the model would contain relatively few correspondent banking dispositions, resulting in inaccurate scoring.

A final example of potential concerns with historical data is when the ratio of escalations to closures is extremely imbalanced, which is not an uncommon occurrence in existing TM regimes that generate a significant number of false positives. In this circumstance, the ML developers must carefully evaluate model algorithms that handle high class imbalance, such as tree-based models. Oversampling and undersampling techniques may also be valuable in this circumstance.
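As a rough illustration of oversampling, the sketch below duplicates randomly chosen minority-class cases until both classes are the same size. The `escalated` field name and the 3-in-100 escalation ratio are assumptions made for the example, and random duplication is only the simplest of several resampling approaches.

```python
import random


def oversample_minority(rows, label_key="escalated", seed=0):
    """Randomly duplicate minority-class rows until the two classes are balanced."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    minority = [r for r in rows if r[label_key]]
    majority = [r for r in rows if not r[label_key]]
    if not minority or not majority:
        return list(rows)  # only one class present: nothing to balance
    if len(minority) > len(majority):
        minority, majority = majority, minority
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


# Hypothetical history: 3 escalated cases out of 100 dispositioned cases.
rows = [{"case_id": i, "escalated": i < 3} for i in range(100)]
balanced = oversample_minority(rows)
print(len(balanced), sum(r["escalated"] for r in balanced))
```

Resampling of this kind is normally applied to the training split only, so that the model is still evaluated against the true, imbalanced distribution of cases.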

Typically, the bulk of an ML model’s features will come from the transaction-level data itself, which forms the basis of the model. Other potential sources of model features are CDD data and third-party data, which draws on any information external to the business. This ranges from simple datasets like High Risk Country lists, to other structured data such as LexisNexis or Thomson Reuters entity data, including link analysis data, to unstructured data such as negative news. However, there is no limit to what can be included as a potential feature in an ML model, so long as the suggested feature has a sound basis and can be directly or indirectly tied to disposition results.

Semi-Supervised & Unsupervised Models—Start Knowing Unknown Risk

Semi-supervised and unsupervised models allow an institution’s monitoring environment to be proactive and identify potentially suspicious activity the existing rules missed and identify unknown risks. There is significant attention paid to false positives in AML as they are distracting and expensive to compliance functions. In the current AML landscape, however, there does not seem to be a similar amount of attention paid to the identification of false negatives, which can be equally or more dangerous to an institution, sometimes resulting in hefty fines and lookback or monitorship requirements.

Semi-supervised and unsupervised models can assist in addressing the identification of false negatives. Semi-supervised models leverage historical-dispositioned case information, as well as historical transactional information. The objective of a semi-supervised model to address risk detection is to identify transactions and/or entities that exhibit similar characteristics to historically productive alerts or SARs that are not included in existing rules-based alerts. This type of model leverages additional information outside of the parameters used to generate alerts in a rules-based system, which enables detection of potentially suspicious underlying factors that mimic prior escalations.
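The objective described above, finding unalerted activity that resembles historically productive cases, can be illustrated with a simple nearest-neighbour similarity check. This is a deliberately simplified stand-in for a real semi-supervised model: the three-value feature vectors, the assumption that features are already scaled to the range 0 to 1, and the `max_distance` cut-off are all hypothetical.

```python
import math


def flag_similar_to_sars(unalerted, sar_profiles, max_distance=0.2):
    """Return unalerted entities whose features lie close to any historical SAR profile."""
    flagged = []
    for entity_id, features in unalerted.items():
        # Euclidean distance to the closest historical SAR profile
        nearest = min(math.dist(features, sar) for sar in sar_profiles)
        if nearest <= max_distance:
            flagged.append((entity_id, round(nearest, 3)))
    return sorted(flagged, key=lambda item: item[1])  # most similar first


# Hypothetical scaled features, e.g. (cash intensity, cross-border share, velocity).
sar_profiles = [(0.9, 0.8, 0.7)]
unalerted = {"C1": (0.1, 0.2, 0.1), "C2": (0.85, 0.75, 0.8)}
flagged = flag_similar_to_sars(unalerted, sar_profiles)
print(flagged)
```

Here only the second entity is surfaced, because its profile sits close to a prior SAR even though no rule alerted on it.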

Unsupervised models built for risk detection leverage historical transactional data to identify potential false negatives. These models typically consist of features that attempt to identify significant changes or anomalies in behaviour or activity that may indicate suspicious activity.
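One simple way to express a "significant change in behaviour" feature is to compare a customer's current activity with that customer's own history. The sketch below uses a z-score for this purpose; the monthly volumes and the review threshold of 3 are illustrative assumptions, and production models typically combine many such features.

```python
from statistics import mean, stdev


def behaviour_change_score(history, current):
    """Number of standard deviations the current value sits from the customer's own history."""
    if len(history) < 2:
        return 0.0  # not enough history to judge a change
    sigma = stdev(history)
    if sigma == 0:
        return 0.0 if current == history[0] else float("inf")
    return abs(current - mean(history)) / sigma


# Hypothetical monthly transaction volumes for one customer.
history = [10_000, 12_000, 11_000, 9_000, 10_500, 11_500]
spike_score = behaviour_change_score(history, 48_000)
normal_score = behaviour_change_score(history, 11_200)
print(spike_score > 3, normal_score > 3)
```

No labels are needed for this calculation, which is what allows it to surface activity that no existing rule or prior disposition anticipated.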

Semi-supervised and unsupervised models for risk detection are an excellent complement to supervised models that identify and reduce false positives. The false positives that are eliminated from the investigative workload by the supervised model can be replaced with investigations resulting from the semi-supervised and unsupervised models, allowing investigative resources to focus effort on activity that is more likely to be risky—regulators applaud this renewed emphasis on risk detection.

Furthermore, the output from semi-supervised and unsupervised models will be presented differently from traditional rules, where an investigator can understand the activity that triggered the alert. This emphasises the need for documented investigation policies and procedures that investigators can follow when reviewing the results of ML models.

Case Study: Driving TM Effectiveness for a Correspondent Bank

Guidehouse partnered with an ML vendor² and analysed three years of anonymised correspondent banking transactions using a combination of ML models to detect unusual activity worthy of review by an experienced investigator. The results clearly demonstrated a significant reduction in alert volumes and identification of cases that would not have been identified using a rules-based system.

The rules-based system generated 57,000 alerts compared to only 16,000 alerts using a combination of all three ML models, a reduction of more than 70%. Based on a review of a sample of alerts from the rules-based system, 58 cases were sent to Level 3 review. A similar sample from the ML models identified 25 new cases that were worthy of a Level 3 review. These are cases that were not identified by the traditional rules-based system: the unknown risks.

Getting Started with Machine Learning

Prior to any model being promoted to production, there are some final considerations to be made, which will vary based on the model use case and the configuration of the current environment. If the model performed well enough in testing, the temptation is to immediately move the model into production. Even with proper testing, however, a model that fails to remain stable in production may require additional resources or full redevelopment. Therefore, it is suggested to run any model in parallel with production to confirm model performance. Running in parallel allows the model to avoid the “production” label until the parallel validation period is complete and the results are acceptable. This approach is typically taken both to gain more confidence in the model and to accumulate evidence that the model has met expectations. Once implemented, ML models require strong governance and regular maintenance to account for changes to the data.
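The evidence gathered during a parallel run can be summarised with a few simple measures. The sketch below assumes each case from the parallel period records whether the model would have kept it and whether production investigators escalated it; the function name, the input shape, and the example counts are hypothetical.

```python
def parallel_run_summary(cases):
    """cases: list of (model_would_keep, escalated_in_production) boolean pairs."""
    total = len(cases)
    kept = sum(1 for keep, _ in cases if keep)
    escalated = sum(1 for _, esc in cases if esc)
    caught = sum(1 for keep, esc in cases if keep and esc)
    return {
        # share of cases the model would have removed from the workload
        "volume_reduction": 1 - kept / total if total else 0.0,
        # share of production escalations the model would still have surfaced
        "escalation_recall": caught / escalated if escalated else None,
        "missed_escalations": escalated - caught,
    }


# Hypothetical parallel period of 100 cases.
cases = (
    [(True, True)] * 4       # kept by the model and escalated in production
    + [(False, True)] * 1    # suppressed by the model but escalated in production
    + [(True, False)] * 25   # kept by the model, closed in production
    + [(False, False)] * 70  # suppressed by the model, closed in production
)
summary = parallel_run_summary(cases)
print(summary)
```

Any missed escalation would warrant individual review before promotion, since a single suppressed SAR-worthy case can outweigh a large reduction in volume.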

Full transition to ML models from rules-based systems poses several challenges for an institution. Rules provide a traceable path from TM to typology/risk that allows regulators to understand how an institution is mapping its risk and coverage assessments to rule logic to monitor for known risk. Furthermore, the logic within the rules can be easily examined to ensure it both addresses known risks and is functioning correctly. Conversely, mapping ML models directly to typologies and examining the model’s components and effectiveness requires more expertise and analysis.

To reap the benefits and minimise the costs of the TM ecosystem as a whole, introducing ML models into the current technology environment and allowing them to co-exist with rules-based systems is a common approach that can be leveraged by most institutions. As described above, there are use-cases related to rules-based systems where ML models can enhance the existing environments without fully replacing existing systems.

Conclusion

Financial institutions that understand the shortcomings of rules-based TM solutions are considering ML. These models, when developed and introduced correctly, can work in conjunction with existing systems to address issues of efficiency and effectiveness. It is critical for financial institutions to consider all use cases and challenges before setting out on the ML journey; too often they focus solely on efficiency. ML models can be introduced in a variety of ways to enhance existing AML technology environments in a more comprehensive manner. With more exposure, regulators will have an increased appetite for ML, paving the way for other artificial intelligence solutions throughout AML and other compliance functions. Most importantly, financial institutions will shift from a reactive to a proactive approach, solving for both the risks they are aware of and those they are not.

¹ https://guidehouse.com/insights/financial-crimes/2022/machine-learning-adoption-europe

² Guidehouse has also built supervised models that have eliminated approximately 60% of false positives.