Unleashing the power of machine learning models in banking through explainable artificial intelligence (XAI)

The “black-box” conundrum is one of the biggest roadblocks preventing banks from executing their artificial intelligence (AI) strategies. It’s easy to see why: Picture a large bank known for its technology prowess designing a new neural network model that predicts creditworthiness among the underserved community more accurately than any other algorithm in the marketplace. This model processes dozens of variables as inputs, including never-before-used alternative data. The developers are thrilled, senior management is happy that they can expand their services to the underserved market, and business executives believe they now have a competitive differentiator.
But there is one pesky problem: The developers who built the model cannot explain how it arrives at the credit outcomes, let alone identify which factors had the biggest influence on them.
This anecdote may be hypothetical, but it’s indicative of the types of challenges confronting many banks’ deployment of AI models. Machine learning models tasked with identifying patterns in data, making predictions, and solving complex problems are often opaque, obscuring their under-the-hood mechanisms. Deploying such models without explainability poses risks. In addition, a lack of explainability can preclude many banks from taking advantage of cutting-edge AI applications, including underwriting models that use alternative data, 2 facial-recognition software for ATMs, 3 or bots that can track compliance with new regulations. 4
“Black-box” algorithms also raise a number of thorny questions. First and foremost, should every machine learning model be self-explainable by design? Or should banks forgo explainability in favor of model accuracy? Or should the level of explainability depend on the context, purpose, and regulatory compliance expectations?
Model risk managers at several large banks 5 —mirroring the AI research community at large 6 —are reportedly divided on this matter, 7 and there appears to be no clear consensus yet.
Explainable AI to the rescue
The emerging field of explainable AI (or XAI) can help banks navigate issues of transparency and trust, and provide greater clarity on their AI governance. XAI aims to make AI models more explainable, intuitive, and understandable to human users without sacrificing performance or prediction accuracy. 8 Explainability is also becoming a more pressing concern for banking regulators 9 who want to be assured that AI processes and outcomes are “reasonably understood” by bank employees. 10
This heightened interest is also evident among many consumer advocacy groups, counterparties, and even internal stakeholders at financial institutions. In fact, the expanding repertoire of XAI techniques, methodologies, and tools has become a top priority for many banks. These organizations are not only advancing XAI research in partnership with academic and scientific communities, but they are also spearheading innovative applications of explainability techniques within their respective firms. For example, banks have worked with leading specialists at Carnegie Mellon University and the University of Hong Kong to propose novel uses of XAI, 11 and co-founded innovation labs that aim to produce explainable machine learning models that advance their business goals. 12
XAI may also help banks see more of their pilot projects through to production, since a lack of explainability can be a major hurdle to deploying AI models.
A robust XAI program can offer a number of other benefits to organizations as well. Explainability tools can unlock different types of information about a model, depending on the types of answers being sought and the modeling approaches used. 13 For example, XAI techniques that shed light on a model’s functioning can be valuable for understanding relationships among variables, diagnosing poor performance, or identifying potential information leakages. 14 Collectively, these efforts are important for upholding customer protection and fair lending: they help identify model parameters that lead to disparate impact, clarify trade-offs in model performance, build more convincing business cases for model adoption and management buy-in, foster greater trust and credibility in the models, and prevent potential regulatory and compliance issues.
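As a simple illustration of the kind of insight such techniques can surface, the sketch below computes a hand-rolled partial dependence check, showing how a model's average prediction shifts as one input varies. The dataset, the stand-in "feature 0," and the model choice are hypothetical placeholders rather than a recommended setup.

```python
# Illustrative sketch: a hand-rolled partial dependence check that shows how
# a model's average prediction moves as one input varies. The dataset,
# "feature 0," and the model choice are hypothetical placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=2000, n_features=4, n_informative=3,
                           random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Sweep feature 0 across its observed range while holding all other inputs
# fixed, then average the model's predicted probabilities at each grid point.
grid = np.linspace(X[:, 0].min(), X[:, 0].max(), num=10)
for value in grid:
    X_mod = X.copy()
    X_mod[:, 0] = value
    avg_prob = model.predict_proba(X_mod)[:, 1].mean()
    print(f"feature_0 = {value:6.2f} -> avg predicted probability {avg_prob:.3f}")
```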
In this article, we shed light on the practical considerations banks should evaluate in implementing and scaling XAI across the enterprise. Specifically, we consider the following questions: 1) How should banks weigh the benefits of explainability against potential reductions in accuracy and performance? 2) What’s the most effective way for AI teams to prioritize efforts that enhance transparency across the model development pipeline? 3) Which models should be the biggest priority/focus of explainability? And 4) How should banks deploy their limited resources to ensure explainability across their model inventory?
We will also share recommendations on XAI governance, including how it connects to broader AI governance and to functions such as model risk management. This road map can serve as a guide for senior managers, risk and compliance executives, heads of AI research, and business unit leaders who are ready to execute XAI initiatives within their respective domains.
Explainability for ethical and responsible AI
Explainability is an integral part of the Trustworthy AI™ framework (figure 1). This framework also emphasizes factors including fairness, robustness, privacy, security, and accountability of AI models. Embedding Trustworthy AI™ into the processes that bring AI to life is paramount for upholding ethical and responsible AI. 15 It provides a common language and lens through which organizations can identify and manage risk, enabling them to facilitate faster and more consistent adoption of AI technology. In doing so, it can spark more fruitful human and machine collaboration throughout the organization. (For more details, see Trustworthy AI: Bridging the ethics gap surrounding AI.)
The explainability vs. model performance trade-off
Explainability has taken on more urgency within banks as AI algorithms grow more complex at a breakneck pace, thanks to greater computing power, the explosion of big data, and advances in modeling techniques such as neural networks and deep learning.
Several banks are establishing special task forces to spearhead explainability initiatives in coordination with their AI teams and business units. The use of automated machine learning (AutoML) solutions from vendors has also increased considerably.
Concurrently, there is a growing focus on the trade-off between model accuracy and interpretability (figure 2). 16 XAI can help model developers weigh these trade-offs more tangibly and advise on how they should begin bridging the gap between complexity and intuitiveness.
Today, there is a whole spectrum of models ranging from decision trees to deep neural networks (DNNs). Simpler models may be more interpretable, but they often have less predictive power and accuracy than more complex models. Yet many of these complex algorithms have become critical to the deployment of advanced AI applications in banking, such as facial or voice recognition, securities trading, and cybersecurity.
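To make the trade-off concrete, the rough sketch below compares a shallow decision tree, whose rules can be read and documented directly, with a gradient-boosted ensemble on the same synthetic data. The models, tree depth, and dataset are illustrative assumptions only, not a benchmark.

```python
# Illustrative sketch: interpretability vs. accuracy on synthetic data.
# The models, tree depth, and dataset are hypothetical choices, not a benchmark.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=5000, n_features=10, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A shallow tree: every decision path can be inspected and documented.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
# A boosted ensemble: typically more accurate, but far harder to inspect.
boosted = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

print(f"shallow tree accuracy:  {tree.score(X_test, y_test):.3f}")
print(f"boosted model accuracy: {boosted.score(X_test, y_test):.3f}")
```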
Other trends are prompting banks to prioritize XAI as well. For example, many banks are looking to adopt more off-the-shelf AutoML solutions offered by vendors, cloud providers, and software companies. These prepackaged solutions have varying degrees of explainability, and it may take months to analyze and document how they work. The push for agile development is also raising new concerns over explainability, since faster processes can make it more difficult to install appropriate guardrails and controls in early stages of model design.
Regulators’ perspectives on XAI
Most financial regulators do not mandate where and how banks can use black boxes, but some, such as Germany’s Federal Financial Supervisory Authority, have advised institutions to weigh the benefits of choosing a more complex model, and document why they decided against more interpretable options. 18 In addition, financial watchdogs have recommended that banks run traditional models alongside sophisticated machine learning models, and assign analysts, or a “human in the loop,” to address major discrepancies that arise between the two. 19
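A minimal sketch of that "human in the loop" pattern might look like the following: a traditional logistic regression is scored alongside a more complex model, and cases where the two disagree sharply are routed to an analyst. The review threshold, models, and data are hypothetical choices for illustration, not a regulatory standard.

```python
# Illustrative sketch: run a traditional model alongside a complex one and
# flag large score discrepancies for human review. The data, models, and the
# 0.3 threshold are hypothetical illustration choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=8, n_informative=5,
                           random_state=0)
X_train, X_new, y_train, _ = train_test_split(X, y, random_state=0)

traditional = LogisticRegression(max_iter=1000).fit(X_train, y_train)
complex_model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Score incoming cases with both models and measure how far they diverge.
p_traditional = traditional.predict_proba(X_new)[:, 1]
p_complex = complex_model.predict_proba(X_new)[:, 1]
discrepancy = np.abs(p_traditional - p_complex)

REVIEW_THRESHOLD = 0.3  # hypothetical tolerance set by model risk management
flagged = np.where(discrepancy > REVIEW_THRESHOLD)[0]
print(f"{len(flagged)} of {len(X_new)} cases routed to an analyst for review")
```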
Engaging with regulators will be important for banks to continue developing advanced AI applications, since oversight groups are increasing their scrutiny of machine learning in every corner of the globe. Bank regulators in North America, 20 for example, have solicited feedback on banks’ explainability practices, and the degree to which their limitations can impact model risk management. Recently, five US agencies formally requested information on how banks manage AI risks, including when a “lack of explainability” raises uncertainty about the soundness and reliability of their machine learning models. 21 In addition, the watchdog for consumer protection in financial services expanded its policing of discriminatory practices to include the ways in which banks use models and algorithms to advertise and sell products and services. 22 Meanwhile, some policymakers in the European Union 23 and Asia 24 have passed regulations that allow customers to request an explanation for any decision generated by AI and learn how their personal data was used to determine it.
Many regulators are taking a pragmatic approach, relaying that there is no “one size fits all” formula to assess these trade-offs. Instead, they suggest that banks should weigh the purpose of the model, the environment in which it will be deployed, and the goal of explainability. Some have indicated that it may be acceptable for banks to use opaque models to test theories, anticipate liquidity needs, or identify trading opportunities, so long as they use more interpretable models when acting on predictions. 25
Other instances where explainability may not be a priority include the application of optical character recognition (OCR) systems that extract information from scanned documents, or natural language processing technologies that wade through contracts and legal agreements. 26 Similarly, banks may not need to seek a high degree of explainability for algorithms that yield accurate outcomes when identifying fraudulent transactions. 27
A playbook for implementing XAI
Implementing XAI more broadly across the enterprise is a multifaceted and multistep process, requiring potential changes to data sources, model development, interface with various stakeholders, governance processes, and engagement with third-party vendors. However, this may be easier said than done, since there are no commonly accepted practices to delineate how much explainability is needed for different machine learning applications, and which techniques should be pursued in light of those considerations. Nevertheless, there are several goals that should be central to banks’ implementation of XAI:
XAI should facilitate an understanding of which variables or feature interactions impacted model predictions, and the steps a model has taken to reach a decision (an illustrative sketch of variable-level attribution follows this list).
Explanations should provide information on a model’s strengths and weaknesses, as well as how it might behave in the future. 28
Users should be able to understand explanations—they should be intuitive and presented according to the simplicity, technical knowledge, and vocabulary of the target audience. 29
In addition to insights on model behavior, XAI processes should shed light on the ways in which outcomes will be used by an organization. 30
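As a minimal sketch of the first goal, the example below uses scikit-learn's permutation importance to rank which inputs a credit model relies on most. The feature names, data, and model are hypothetical stand-ins introduced for illustration.

```python
# Illustrative sketch: global feature attribution via permutation importance.
# Feature names, data, and the model are hypothetical stand-ins.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

feature_names = ["income", "debt_to_income", "utility_payment_history",
                 "months_at_address", "num_open_accounts"]

X, y = make_classification(n_samples=2000, n_features=5, n_informative=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: how much held-out accuracy drops when one feature's
# values are shuffled. Larger drops indicate heavier reliance on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
ranked = sorted(zip(feature_names, result.importances_mean),
                key=lambda pair: -pair[1])
for name, drop in ranked:
    print(f"{name:>24}: mean accuracy drop {drop:.3f}")
```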
Establishing XAI as a formal discipline can put banks on the fast track to achieving these objectives. This will likely mean introducing new policies and methods, from the premodeling stages to postdeployment monitoring and evaluation. It will also require every stakeholder who contributes to AI model development to act purposefully and intentionally with each decision they make. For example, developers should apply explainability principles to their choice of training and input data for model prototypes. Instead of focusing solely on datasets that will maximize performance, they should also consider whether the input or training data may perpetuate hidden bias (e.g., historical lending data may favor certain demographics that had easier access to credit), whether the data contains customers’ personal information, and whether it spans a long enough timeframe to capture rare or unusual events.
Model development teams should also conduct a preliminary assessment of model performance and interpretability, to get a sense of how accurate the model will be compared to simpler and more traditional analysis methods. This deliberation should begin in the premodeling stage, so designers can tailor machine learning architecture to the target explanation. In some cases, banks may want the model to be transparent to all users, and will prioritize an interpretable design (“glass box,” or “ante-hoc explainability” 31 ). In others, they may build a complex model, and either apply XAI techniques to the trained model (post-hoc explainability) or create a surrogate model that emulates its behavior with easier-to-follow reasoning.
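For the post-hoc route, one common device is a global surrogate: a simple model trained to mimic the complex model's predictions. The sketch below is a rough illustration on synthetic data; the model choices and the fidelity check are assumptions, not a prescribed validation standard.

```python
# Illustrative sketch: a global surrogate model for post-hoc explainability.
# The data and model choices are hypothetical; a real credit model would use
# the bank's own features and validation standards.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=5000, n_features=8, n_informative=5,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# The "black-box" model whose behavior we want to approximate.
black_box = RandomForestClassifier(n_estimators=300, random_state=1)
black_box.fit(X_train, y_train)

# Surrogate: a shallow tree trained to mimic the black box's predictions,
# not the original labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=1)
surrogate.fit(X_train, black_box.predict(X_train))

# Fidelity: how often the surrogate agrees with the black box on unseen data.
fidelity = accuracy_score(black_box.predict(X_test), surrogate.predict(X_test))
print(f"Surrogate fidelity to the black box: {fidelity:.1%}")
print(export_text(surrogate))  # the surrogate's human-readable decision rules
```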
Either way, banks should assess which techniques and tools are most helpful in advancing explainability. There are several factors that should drive these decisions, leading with the target stakeholder: regulator, risk manager or auditor, business practitioner, or customer (figure 3). For example, underwriting officers can be served well by counterfactual explanations, which show the degree to which different aspects of a customer’s application should be tweaked to change the outcome (e.g., increase income by a certain amount to gain loan approval). 32 Other bank employees may need “an explanation of an explanation,” 33 or visualizations that map out patterns and flag anomalies in the data, such as groups of individuals that may be inappropriately segmented for marketing campaigns. There are also varying levels of explainability that should be taken into account (see sidebar, “Varying levels of explainability”).
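To make the counterfactual idea concrete, the sketch below searches for the smallest income increase that would flip a denied application to an approval under a toy model. The model, synthetic approval rule, and applicant values are hypothetical; real counterfactual tools search across many features under feasibility constraints.

```python
# Illustrative sketch: a one-feature counterfactual for a loan decision.
# The model, synthetic approval rule, and applicant values are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical training data: columns are [annual_income_k, debt_to_income].
X = np.column_stack([rng.normal(70, 20, 1000), rng.uniform(0.1, 0.6, 1000)])
y = ((X[:, 0] > 60) & (X[:, 1] < 0.45)).astype(int)  # synthetic approvals
model = LogisticRegression().fit(X, y)

applicant = np.array([[48.0, 0.40]])  # a hypothetical applicant
decision = model.predict(applicant)[0]
print("Current decision:", "approve" if decision == 1 else "deny")

# Counterfactual search: the smallest income increase that flips a denial,
# holding debt-to-income fixed.
if decision == 0:
    for extra_income in np.arange(1.0, 80.0, 1.0):
        candidate = applicant + np.array([[extra_income, 0.0]])
        if model.predict(candidate)[0] == 1:
            print(f"Counterfactual: raise annual income by ~{extra_income:.0f}k "
                  "(all else equal) to gain approval.")
            break
```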
Varying levels of explainability
The level of explainability needed for each model will depend on several factors. Banks should weigh all of them when determining how complex to make their models. The top considerations include:
Target user/stakeholder: Who will be served by the explanation? How? To what extent do they need to understand the model’s functioning?
Purpose of explanation: What is the goal of XAI? What function will the explanation serve?
Complexity: To what extent does the model’s complexity make it hard to understand its underlying behavior? To what extent does it matter?
Scope: How much information should explanations convey? Is it sufficient to trace the steps taken to arrive at one decision, or does the entire model need to be understood?
Business context and implications: How will the model be used within the bank, and who will be impacted? Could the model trigger high-stake decisions that impact customers, profit margins, or the business’s reputation?
Regulatory constraints: What rules or guidelines may dictate how the model should be used? For example, there may be restrictions on using AI to increase pricing, but not on using it to identify customers at risk of switching banks.
Impact on stakeholder trust: Would external parties approve of banks using the models, given how well model risk teams understand their behavior and actions?
Results of testing and verification: Is it clear within which boundaries the model will perform well? Has the model been sufficiently probed for accuracy and robustness? Are there differences in the level of explainability gained by XAI technique, and an understanding of what each method achieves? Can those results stand up over time?
Confidence in XAI techniques/opportunity for “Human in the Loop”: Are the methodologies that will be used to apply XAI still nascent? Can they be trusted? Is there a possibility for humans to act as an intermediary, and detect when XAI techniques may no longer be sufficient?
User testing and feedback: Can users understand and absorb the explanation? Is it being presented on the most effective interface, and does it meet user expectations? Should there be an opportunity for customers to provide feedback on the quality of explanations provided to them?
Then, to fully embed explainability into the AI production life cycle, banks should establish a cross-functional task force comprising specialists from the executive committee, businesses, IT, risk and compliance, legal, ethics, and AI research. The US military’s Defense Advanced Research Projects Agency (DARPA) Explainable AI program discovered that some of the most effective teams have cross-disciplinary skill sets and expertise, such as computer scientists who work with experimental psychologists and user experience and design specialists. 34
The XAI committee should have routine, informed discussions about which models and AI solutions should be prioritized for the application of explainability techniques, using criteria that are most important to the bank. They should also decide in which scenarios the implications are too serious for the algorithm to be used as is. 35 For example, a model that targets ads on media platforms and shows an ad to unintended audience members 10% of the time may be more acceptable than a recommendation engine that steers clients toward the wrong products 10% of the time.
This task force can begin with smaller projects within business units, gauging how effectively the lines of communication among the three lines of defense identify and address explainability issues. It can then scale processes that work across other businesses. It will be important for banks to infuse explainability into cultural norms and practices as well, from annual employee training to evaluating the ROI of new prototypes, and projecting multiyear returns on those projects. Over time, explainability should become a key part of the “ethics fluency” that personnel from the boardroom to interns must develop, until everyone in the business understands XAI principles, can ask the right questions, and knows how to elevate issues to appropriate supervisors. 36
For an added layer of oversight, many banks are also collaborating with academics who can serve as an independent third party that reviews explainability concerns and approves model use within specific functions. In addition, more companies in financial services and other industries are appointing a chief AI ethics officer to serve as the lead on AI explainability, transparency, and trustworthiness. 37 In the absence of this role, banks should name a senior executive—ideally the chief financial officer or chief risk officer—who will be held accountable for decisions made on behalf of AI projects.
Auditors and risk managers may not need to track each model’s individual path to making a prediction as much as they need to see aggregate levels of model risk. These supervisors can benefit from scoring systems that provide a visual indication of XAI metrics related to functionality, performance, and usefulness. 38 Deloitte’s Lucid[ML], for example, can display three dimensions of explainability alongside results from a simpler AI technique and a surrogate “white box” model. 39 These dashboards can provide a high-level perspective of the explainability risks, as well as a deeper analysis of interpretability constraints (see sidebar, “Assessing risks with explainability dashboards”).
Assessing risks with explainability dashboards
Software tools can provide an overall explainability score for black-box models based on several different characteristics, including the following (a rough sketch of two such checks follows this list):
The complexity of the dataset on which the model was trained
How easy it is to pinpoint the features that have the biggest impact on individual decisions or overall model functioning
The results of shifting values for a particular feature, and the extent to which it leads to model errors and/or loss
The degree to which a rise or fall in input values produces a corresponding change in output values
Estimates of how well a model’s predicted values match up with expected or actual output values (which can indicate whether the model may be overtrained or undertrained on a particular dataset)
Source: Deloitte AI Studio. 40
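The sketch below is a rough illustration of two of the checks listed above. It is not Lucid[ML] or any vendor tool; the data, model, and perturbation scheme are hypothetical choices made for illustration.

```python
# Rough sketch of two checks from the list above: predicted-vs-actual fit and
# sensitivity to shifted feature values. Not a vendor tool; data and model
# choices are hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=6, n_informative=4,
                           random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)
model = RandomForestClassifier(random_state=2).fit(X_train, y_train)

# 1) Predicted vs. actual fit: a large train/test gap can hint that the model
#    is overtrained on its training dataset.
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy {train_acc:.3f} vs. test accuracy {test_acc:.3f}")

# 2) Sensitivity to shifting a feature's values: the share of predictions that
#    change when one feature is perturbed with noise.
baseline = model.predict(X_test)
for j in range(X_test.shape[1]):
    perturbed = X_test.copy()
    noise = np.random.default_rng(j).normal(0, X_test[:, j].std(), len(X_test))
    perturbed[:, j] += noise
    flipped = np.mean(model.predict(perturbed) != baseline)
    print(f"feature {j}: {flipped:.1%} of predictions change under perturbation")
```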
The XAI ecosystem is quickly expanding, and vendors that tailor services to the banking industry range from large software companies to more nascent tech startups. These providers run the gamut from “plug and play” solutions to tools and frameworks that can be fully integrated into commonly used cloud computing platforms. Banks can apply these XAI processes to functions including fraud identification, loss and churn prediction, and debt collection.
XAI techniques: Making choices
There are several approaches and techniques banks can take to activate explainability (figure 5). Some are already commonplace in XAI practice, while others are still being refined. Even so, banks can quantify the risk levels associated with model transparency, assess how much explainability is needed, and determine whether alternative modeling options may be available to them.
The limits of explainability
Despite recent advancements in XAI research, banks still face many technical challenges implementing explainability into the AI pipeline. For example, many XAI techniques that are best suited for credit risk models tend to accommodate binary decisions made on behalf of consumers, such as “lend” or “don’t lend.” These explanations may not consider the full range of modifications that can be made to interest rates, repayment schedules, and credit limits, and overlook consumers’ preferences for different terms of a loan. 41
In addition, banks may want to provide counterfactual explanations that show customers why their loan applications were denied, and coach them on how to improve their financial standing. However, because such advice may or may not be appropriate for a specific borrower’s context, banks should exercise caution in considering how explanations may be interpreted by the public. Banks should also consider that explanations may not retain their validity over time, due to changes in external conditions or instances when the model must be retrained to improve performance. As a result, organizations must be mindful of the actions they recommend to customers and communicate the time periods in which the results will be most applicable, along with other caveats.
XAI teams can also be limited by resource constraints since it frequently takes longer to compute explainability techniques as models grow in complexity and precision. In addition, some institutions have expressed concern that explainability may allow competitors to reverse engineer their machine learning models, thereby revealing the “secret sauce” behind their proprietary algorithms. They have also highlighted the risk that XAI could make it easier for outsiders to “trick” their models or launch adversarial attacks that cause them to malfunction.
One of the biggest challenges also lies with talent acquisition. XAI specialists will likely remain scarce, given the dearth of candidates with a background in XAI, a field that’s even more niche than machine learning and data science. Banks can overcome this challenge by hiring experienced talent from outside their organization, or by recruiting engineering and computer science graduates straight out of school and training them on explainable AI internally.
What more should banks do to keep pace with emerging developments in XAI?
Many banks are exploring algorithms that are complex and inherently interpretable, such as explainable gradient boosting and neural networks. They should continue researching new techniques for developing deep learning applications that are transparent by design, and do not require post-hoc explainability.
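One widely cited example of such a "glass box" approach is the explainable boosting machine. The sketch below assumes the open-source InterpretML package (pip install interpret) is available; the data are synthetic placeholders, and the final call simply opens the package's built-in global explanation view.

```python
# Minimal sketch of a "glass box" model, assuming the open-source InterpretML
# package (pip install interpret). Data here are synthetic placeholders.
from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=6, n_informative=4,
                           random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

# Explainable boosting machine: an additive model built from boosted
# per-feature shape functions, designed to be interpretable without
# post-hoc techniques.
ebm = ExplainableBoostingClassifier(random_state=3)
ebm.fit(X_train, y_train)
print("held-out accuracy:", ebm.score(X_test, y_test))

# Global explanation: per-feature contribution curves and importances,
# rendered in InterpretML's interactive viewer.
show(ebm.explain_global())
```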
They should also consider partnerships with think tanks, universities and research institutions, and groups such as Singapore’s Veritas Consortium, which brings together large financial institutions from around the world to develop common guidelines for the industry to follow. 42 In addition, it’s important for banks to be active participants in conferences and workshops that cover emerging XAI topics, and collaborate on research that can drive the field and its practical applications forward. They can also push vendors to continue making prepackaged models more explainable, so they can more easily adopt third-party solutions.
Recently, the idea of advisability has also gained momentum; advisability incorporates feedback from users to improve the functioning of AI models that consume and produce explanations in collaboration with humans. 43 In addition, explainability is expected to be fundamental to producing “third-wave AI systems” that allow machines to contextualize their outside environment, and bring that knowledge into XAI models. 44 Eventually, it may be possible for AI developers to create complex systems such as deep learning networks that are inherently interpretable by design.
It will be paramount for banks to share their experiences with regulators, and to work with government agencies and other oversight bodies to produce guidelines that enable AI research and development while also protecting customers’ interests. By taking extra steps to be ethical and transparent—while continuing to spearhead research and leading practices in XAI—banks can earn regulators’ trust and strike the right balance between tech innovation and safety.