Less Risk, More Reward: Improving Bank-Default Predictions with Data Mining
November 2012
Topics: Economics, Data Management, Risk Management
If government regulators want to determine the risk of default for a given bank, they have a variety of tools at their disposal. In recent years, however, some traditional methods of financial risk modeling have proven inadequate. With the recent banking crises in the U.S. and Europe, it's become clear we need better methods to gauge the health of U.S. banks.
MITRE researchers have now experimented with a method, based on cutting-edge data-mining processes known as "link-based classification," that significantly improves the accuracy of bank-default risk predictions. The key to the new method is a closer examination of the relationships among banking institutions, such as inter-bank exposures (loans or obligations between banks) and shared counter-parties (loans or obligations to the same borrowers). These relationships can provide important details about the bank's financial health. The results of this research could provide powerful new analytic tools to regulatory agencies, as well as a way to assess the validity of various analytic models.
"Traditionally, when regulators wanted to consider the risk of a bank defaulting, they looked at internal attributes such as the riskiness of the bank's outstanding loans and the size of the assets it has dedicated to cover losses," explains Charles Worrell, a MITRE systems engineer and lead researcher on the project. "We wanted to consider how you could predict default risk more accurately. Our technique incorporates the risks inherent to other banks associated with the original institution."
Lessons Learned from 2008 Banking Crises
According to Shaun Brady, a MITRE systems engineer and subject matter expert who worked on the project, the banking crises of 2008 illustrated the problem with the traditional methods of gauging default risk. "When Lehman Brothers and AIG went bankrupt, nobody realized the linkages among these types of institutions and the ripple effect they could have on the economy." This was true even though the government collects a wealth of information on financial institutions both large and small.
"Every quarter you have 8,000 U.S. banks reporting to the FDIC [Federal Deposit Insurance Corporation] on their financial health. The FDIC examines something like 1,000 standard data elements that are all related to the individual banks. However, this data doesn't give regulators any information about the relationships between institutions. Without this information they can't explore how the failure of one institution could impact another institution."
The link-based classification method of data mining allows researchers to represent groups of banks as a network and to identify common risk factors. "We looked at merger history, board members, asset and liability profiles, and the geographic proximity of banks' headquarters. Because data on direct relationships were not available, these proxies for similar sets of risks were used to classify banks in each network," Worrell explains.
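The network idea above can be illustrated with a minimal sketch. The bank names, proxy links, and risk labels below are hypothetical, and the scoring rule is a simple "relational neighbor" classifier rather than the specific algorithms the MITRE team used; it shows only the core move of inferring an unlabeled bank's risk from the banks it is linked to.

```python
# Sketch of link-based classification over a bank network.
# All bank names, links, and labels are hypothetical.

from collections import defaultdict

# Proxy links: pairs of banks judged "similar" by merger history,
# shared board members, or headquarters proximity.
proxy_links = [
    ("Bank A", "Bank B"),
    ("Bank B", "Bank C"),
    ("Bank C", "Bank D"),
]

# Known risk labels for some banks (1 = at risk, 0 = healthy).
known_risk = {"Bank A": 1, "Bank D": 0}

# Build an undirected adjacency list from the proxy links.
neighbors = defaultdict(set)
for a, b in proxy_links:
    neighbors[a].add(b)
    neighbors[b].add(a)

def relational_risk(bank):
    """Score an unlabeled bank as the mean label of its labeled
    neighbors -- a basic 'relational neighbor' classifier."""
    labels = [known_risk[n] for n in neighbors[bank] if n in known_risk]
    return sum(labels) / len(labels) if labels else None

print(relational_risk("Bank B"))  # linked to at-risk Bank A -> 1.0
```

In practice the links would be weighted (e.g., by geographic distance or overlap in loan portfolios) and the scores iterated until they stabilize, but the dependence of each bank's score on its neighbors is the essential feature.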
Worrell's team found that certain factors are critical indicators. "There are large banks, little hometown banks, and mega-banks. The size of a bank really matters in how you model and identify risks, as does geography. Geography seems to matter more for smaller local and regional banks making loans to, for example, farmers in a particular region. For larger banks, however, geography isn't as important in this analysis."
"Our assumption was that geographic proximity led to shared exposure to risk of default for business loans and mortgages outstanding," Brady adds. "If you have two similar-sized banks in the same city, operating under the same local economic conditions, and one fails, it can have an impact on the other bank's risk of default. For example, an increasing unemployment rate usually has a negative impact on local consumers and businesses, which in turn increases the risk of default."
For MITRE's government sponsors, early indicators of bank failure would allow for earlier, lower-cost interventions to potentially head off disaster.
Putting Link-Based Classification to the Test
To test their hypothesis that link-based classification provides better results than traditional analytical tools, Worrell and his team examined historical data on some of the banks tracked in the FDIC's Statistics on Depository Institutions database. "We isolated data on a group of banks determined to be at risk for the 2007 to 2009 period and applied our method to that data," he explains.
The researchers applied a variety of link-based classification algorithms to the 1,000 data elements traditionally tracked by the FDIC. To show a common "link," their analysis took into consideration the unemployment rate change over the two-year period and the relationship of the banks to one another in terms of loans and other factors. They then applied a commonly used measure of default risk known as the "Texas ratio," which examines a bank's level of bad assets compared to its supply of capital and loan loss reserves. This analysis provided them with indicators of failure risk for the banks in question.
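The Texas ratio mentioned above is a simple quotient, and a short sketch makes the mechanics concrete. The figures below are hypothetical; by convention, a ratio approaching 1.0 (that is, 100%) is read as a warning sign that bad assets could exhaust the bank's cushion.

```python
def texas_ratio(non_performing_assets, tangible_equity, loan_loss_reserves):
    """Texas ratio: bad assets relative to the capital and reserves
    available to absorb losses on them."""
    return non_performing_assets / (tangible_equity + loan_loss_reserves)

# Hypothetical bank: $90M in bad assets vs. $80M equity + $20M reserves.
ratio = texas_ratio(90e6, 80e6, 20e6)
print(f"{ratio:.2f}")  # 0.90 -- close to the conventional danger threshold
```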
"When we compared our results to what actually happened to these banks, we found that in seven out of 10 instances, the link-based method was more accurate than other methods," Worrell says.
The team also determined that the improved risk scores that resulted from the link-based analysis could be used to generate lists of banks with "secondary exposure" to the highest risk banks. "With this information, we developed a cascading graph to show all the banks influenced by the five most risky banks." This information could allow regulators to proactively address the consequences of these risks. Given that U.S. banks are regulated by a number of agencies, including state banking regulators, the FDIC, the OCC, and the Federal Reserve, MITRE's research has a large potential audience.
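The secondary-exposure idea amounts to walking outward through the link network from the riskiest banks. The sketch below, with a hypothetical network and a breadth-first traversal, shows one way such a cascading view could be generated; it is an illustration of the concept, not the team's implementation.

```python
# Sketch of "secondary exposure": starting from the highest-risk
# banks, walk the link network to find banks within a few hops.
# The bank names and links are hypothetical.

from collections import deque

links = {
    "RiskyBank": {"Bank X", "Bank Y"},
    "Bank X": {"RiskyBank", "Bank Z"},
    "Bank Y": {"RiskyBank"},
    "Bank Z": {"Bank X", "Bank W"},
    "Bank W": {"Bank Z"},
}

def exposed_banks(network, seeds, max_hops=2):
    """Breadth-first walk from the seed (highest-risk) banks,
    returning each reachable bank with its hop distance."""
    seen = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        bank = queue.popleft()
        if seen[bank] == max_hops:
            continue  # don't expand beyond the hop limit
        for nbr in network.get(bank, ()):
            if nbr not in seen:
                seen[nbr] = seen[bank] + 1
                queue.append(nbr)
    # Exclude the seeds themselves; report only exposed banks.
    return {b: h for b, h in seen.items() if h > 0}

print(exposed_banks(links, ["RiskyBank"]))
```

Seeding the walk with the five riskiest banks and plotting the result by hop distance would yield exactly the kind of cascading graph the article describes.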
Helping the Government Assess Risk Prediction Models
According to Brady, these agencies also have a great need for more accurate tools to assess the accuracy of their risk-prediction models, including emerging link-based models such as those evaluated by the MITRE team.
"The regulatory agencies have to get a better handle on their risk management approaches," Brady says. "They need an analytical modeling environment that allows them to understand how these risk-prediction models are being used across the government, and to assess their usefulness." Brady joined MITRE in 2009 after working for almost 30 years in the credit and lending industry; he supports a number of efforts to develop tools to help the government identify and monitor systemic risks.
"The health of the banks is a huge public issue. The cost to the taxpayer is huge. MITRE is focusing on building our expertise in this area." To encourage further discussion on these issues, the team wrote a technical paper on its work published by the IEEE earlier this year.
"We've built an infrastructure here that puts us ahead of where the regulators are. There are very few places for the regulatory agencies to bring their data, and MITRE has a secure environment for this," Brady says. "We purposely have picked technologies that are somewhat agnostic; we don't promote any one answer or platform. The real purpose of our work is to show the regulatory agencies which issues they need to better understand. We want to help them understand the questions to ask to help them evaluate the approaches they could take."
—by Maria S. Lee