A game changer – is synthetic data set to revolutionise financial services?

November 2023 | FEATURE | BANKING & FINANCE

Financier Worldwide Magazine

November 2023 Issue

The financial services industry tends to be a slow adopter of innovation. Increasingly, however, industry leaders are turning to synthetic data to facilitate data exchange and collaboration across teams, improve fraud detection, enhance marketing to customers and open new revenue streams.

The synthetic data space has grown rapidly in recent years. According to Valuates, the global synthetic data generation market was valued at $168.9m in 2021 and is projected to reach $3.5bn by 2031 – a compound annual growth rate (CAGR) of 35.8 percent from 2022 to 2031. This growth has been spurred by a growing trend of digital transformation across organisations and an increase in the adoption of cutting-edge technologies like artificial intelligence (AI) and machine learning (ML).
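As a rough sanity check on those figures, the short Python sketch below computes the growth rate implied by the two quoted valuations and the value a 35.8 percent rate would produce. It assumes ten compounding years from the 2021 base, which is an illustrative simplification: the quoted CAGR covers 2022 to 2031, and the 2022 base value is not given here. The function names are hypothetical.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding start_value at rate for years."""
    return start_value * (1 + rate) ** years


# Figures quoted in the article (US dollars).
start_2021 = 168.9e6
end_2031 = 3.5e9

# Assumption: ten compounding years, measured from the 2021 valuation.
cagr = implied_cagr(start_2021, end_2031, years=10)
projected = project(start_2021, 0.358, years=10)

print(f"Implied CAGR from the 2021 base: {cagr:.1%}")
print(f"2031 value at 35.8% from the 2021 base: ${projected / 1e9:.2f}bn")
```

Under that assumption, the implied rate comes out slightly below the quoted 35.8 percent and the projected value slightly above $3.5bn, so the published figures are broadly consistent with one another.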

Synthetic data will play a central role in financial services going forward. All told, it helps bring about three distinct technological changes from which the sector can benefit: enhancing data privacy, facilitating data innovation and allowing firms to monetise their existing data.

Protecting customer privacy

Synthetic data generators (SDGs) use algorithms to generate data that preserves the original data’s statistical features while producing entirely new data points. This offers a way to generate high-quality data that does not contain sensitive private information.
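To make the idea concrete, here is a minimal Python sketch of one very simple generator: it estimates the means, spreads and correlation of two numeric columns and then samples new rows from a fitted Gaussian. This is an illustration only, not how any particular vendor's product works. The column meanings (income and account balance), the made-up "original" data and the function names are all assumptions, and a plain Gaussian fit offers no formal privacy guarantee.

```python
import math
import random
import statistics


def fit_gaussian(rows):
    """Estimate means, standard deviations and correlation of two columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return mx, my, sx, sy, cov / (sx * sy)


def sample_synthetic(params, n, rng):
    """Draw n new rows from the fitted two-column Gaussian."""
    mx, my, sx, sy, rho = params
    tail = math.sqrt(max(0.0, 1 - rho ** 2))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Mixing z1 into the second column reproduces the correlation.
        rows.append((mx + sx * z1, my + sy * (rho * z1 + tail * z2)))
    return rows


rng = random.Random(0)

# Hypothetical "real" customer data: (income, balance), made up for the demo.
original = []
for _ in range(500):
    income = rng.gauss(50_000, 12_000)
    balance = 0.2 * income + rng.gauss(0, 2_000)
    original.append((income, balance))

params = fit_gaussian(original)
synthetic = sample_synthetic(params, 500, rng)
synthetic_params = fit_gaussian(synthetic)

print(f"Original correlation:  {params[4]:.2f}")
print(f"Synthetic correlation: {synthetic_params[4]:.2f}")
```

The synthetic rows share the broad statistical shape of the originals, yet none of them is a copy of an original record. Production generators use far richer models, but the principle is the same.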

The various applications of synthetic data are particularly useful due to the burgeoning number of data protection regulations being introduced globally. The European Union’s (EU’s) General Data Protection Regulation (GDPR) and the US Health Insurance Portability and Accountability Act (HIPAA), for example, were created in response to the growing demand for data privacy and security.

Many jurisdictions now restrict the sale of original customer data to third parties, for example. Even when anonymised, such data still carries a risk of re-identification. Using synthetic data allows firms to generate revenue without putting user privacy at risk.
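The re-identification risk can be illustrated with a small Python sketch. Names have been stripped from the records below, but any record whose combination of remaining attributes is unique in the dataset could still be matched to a person by someone holding that information from another source. The records, field names and helper functions are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical "anonymised" records: names removed, other attributes kept.
records = [
    {"postcode": "AB1", "birth_year": 1980, "gender": "F"},
    {"postcode": "AB1", "birth_year": 1980, "gender": "F"},
    {"postcode": "AB1", "birth_year": 1975, "gender": "M"},
    {"postcode": "CD2", "birth_year": 1990, "gender": "F"},
    {"postcode": "CD2", "birth_year": 1962, "gender": "M"},
]


def key_of(record, fields):
    """Combination of attribute values used to single out a record."""
    return tuple(record[f] for f in fields)


def records_at_risk(records, fields):
    """Records whose attribute combination appears only once in the data."""
    sizes = Counter(key_of(r, fields) for r in records)
    return [r for r in records if sizes[key_of(r, fields)] == 1]


at_risk = records_at_risk(records, ["postcode", "birth_year", "gender"])
print(f"{len(at_risk)} of {len(records)} records are unique on these fields")
```

In this toy dataset, three of the five records are unique on postcode, birth year and gender, so removing names alone would not prevent them from being singled out.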

According to MOSTLY AI, it is not surprising that financial businesses guard their customers’ data carefully: “Customers demand and reward this scrutiny. With the advance of data-driven services, personal relationships are no longer the main driver of trust. Instead, the biggest driver of loyalty for banking customers is the ability to trust their bank in protecting their personal data, with 43 percent citing this reason.”

When viewed in a data protection context, the attractiveness of synthetic data, which generates plausible but artificial data while preserving the statistical characteristics of genuine data and protecting sensitive information, becomes clear. This is crucial for the financial services industry, which processes significant amounts of valuable, high-risk, sensitive personal information which, if mishandled, could result in considerable fines.

Synthetic data allows financial services firms to build models that vastly improve know your customer (KYC) and customer onboarding systems while minimising lending risk. It also addresses the challenge of navigating high-volume, complex, potentially unstructured data stored in sprawling, siloed databases, and ensures that customer data cannot be accidentally exposed or stolen as a result of a breach.

In terms of innovation, synthetic data is not bound by compliance and regulatory restrictions in the same way as normal customer data. As a result, firms can use it to ‘unlock’ new, flexible ways of using data architecture and cloud-based infrastructure.

Potential applications of synthetic data include training new models without using real customer information, increasing the data size for a training model, validating models and ML systems by generating adversarial scenarios, fixing structural deficiencies in data, and using data in an unsafe environment. “Synthetic data allow us to find and fix problems in AI models to make them more fair, robust, and transferrable to other tasks,” notes Inkit Padhi at IBM.

Generating revenue

Synthetic data also offers opportunities to create new revenue streams and maximise existing ones. For firms operating across borders in particular, a diminished ability to examine customer data as a source of insights could lead to a loss of competitive edge. Synthetic data may offer a way to overcome such barriers.

Despite the advantages, it is important to note that synthetic data is a relatively new area of data science within financial services. As such, in-house teams will need to develop their knowledge, perhaps in partnership with external experts.

The potential privacy implications of AI and ML are significant. As such, the use of synthetic data in financial services may be integral to these technologies reaching their full potential.

As financial institutions (FIs) develop synthetic data capabilities, they will be able to process sensitive information without compromising privacy, and thus use it to build better products and open up new revenue streams.

Richard Summerfield