Challenges in applying machine learning to finance

Monday 10 February 2020

AI and Machine Learning

Author: Chris Vryonides

In the second of this three-part series, Chris Vryonides, Director at AI consultancy Cognosian, takes a look at the difficulties in applying machine learning to financial markets.

Financial data are non-stationary; their statistical characteristics constantly change in the way that those determining, say, the appearance of a cat don’t. For the most part, algorithms can only predict things consistent with what they have seen before.
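As a toy illustration of the point, the sketch below fits a one-parameter linear model in one regime and then applies it after the underlying relationship has flipped sign. Everything here is an assumption made for illustration (the simulated data, the slopes of +0.5 and -0.5, the helper names); it is not drawn from any real market series.

```python
import random

def fit_slope(xs, ys):
    """Least-squares slope for a line through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def mse(slope, xs, ys):
    """Mean squared error of the prediction slope * x."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def simulate(true_slope, n, rng, noise=1.0):
    """Hypothetical regime: y = true_slope * x + Gaussian noise."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

rng = random.Random(0)
train_x, train_y = simulate(0.5, 2000, rng)    # regime A: what the model has seen
test_x, test_y = simulate(-0.5, 2000, rng)     # regime B: the relationship has flipped

slope = fit_slope(train_x, train_y)
in_sample = mse(slope, train_x, train_y)
out_of_sample = mse(slope, test_x, test_y)
naive = mse(0.0, test_x, test_y)               # baseline: always predict zero

print(f"fitted slope:      {slope:.2f}")
print(f"in-sample MSE:     {in_sample:.2f}")
print(f"out-of-sample MSE: {out_of_sample:.2f}")
print(f"predict-zero MSE:  {naive:.2f}")
```

In this setup the fitted model does worse in the new regime than making no prediction at all, which is the practical cost of assuming that tomorrow's statistics match yesterday's.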

So, while state-of-the-art image recognition algorithms exceed human performance, this is not quite the case (yet) with machine learning (ML) in finance. We are a long way from seeing algorithms undertake investment decisions that drastically, and spectacularly, challenge the status quo (e.g. Paulson & Co shorting US housing from 2007 to 2009).

Additionally, more powerful models, required to capture complex relationships, necessarily have more parameters. They therefore need more data to train them (we need more data than there are unknowns).
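A minimal sketch of why the data must outnumber the unknowns: a polynomial with eight free coefficients can pass exactly through any eight observations, even when those observations are pure noise. The Lagrange interpolation helper and the simulated data below are illustrative assumptions, not a method taken from the article.

```python
import random

def lagrange_predict(xs, ys, x_new):
    """Evaluate the unique degree-(n-1) polynomial through n points at x_new."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x_new - xj) / (xi - xj)
        total += yi * weight
    return total

rng = random.Random(1)
xs = [float(i) for i in range(8)]
ys = [rng.gauss(0.0, 1.0) for _ in xs]      # pure noise: there is nothing to learn

# As many unknowns as data points, so the in-sample fit is perfect.
in_sample_errors = [abs(lagrange_predict(xs, ys, x) - y) for x, y in zip(xs, ys)]
print("largest in-sample error:", max(in_sample_errors))

# One step beyond the data, the same model extrapolates the noise it memorised.
forecast = lagrange_predict(xs, ys, 8.0)
print("forecast at x = 8:", forecast)
```

A perfect in-sample fit here says nothing about predictive power; the model has simply used all of its freedom to memorise the noise.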

So, what’s the big deal – with the rise of high frequency trading (HFT) etc., don’t we have billions of ticks to train them with? Alas, not quite, which highlights the other big problem: financial series are also extremely noisy, so the data points are less informative, and effectively fewer, than our overtaxed local area network (LAN) infrastructure would suggest.

It is this combination of non-stationarity and low signal-to-noise that makes this domain so challenging.
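A rough back-of-the-envelope sketch of what low signal-to-noise means for sample size. It assumes independent, identically distributed returns (generous, given the non-stationarity already noted), under which the t-statistic of a strategy's mean return is approximately its annualised Sharpe ratio multiplied by the square root of the number of years observed. The function names are hypothetical.

```python
import math

def t_stat_from_years(annual_sharpe, years):
    """Approximate t-statistic of the mean return, assuming i.i.d. returns."""
    return annual_sharpe * math.sqrt(years)

def t_stat_from_periods(annual_sharpe, years, periods_per_year):
    """Same quantity computed from finer-grained observations.

    Under the i.i.d. assumption the per-period Sharpe ratio shrinks with the
    square root of the sampling frequency, cancelling the larger sample.
    """
    per_period_sharpe = annual_sharpe / math.sqrt(periods_per_year)
    n_obs = years * periods_per_year
    return per_period_sharpe * math.sqrt(n_obs)

def years_needed(annual_sharpe, target_t=2.0):
    """Years of data needed for the mean return to reach target_t."""
    return (target_t / annual_sharpe) ** 2

for sharpe in (0.5, 1.0, 2.0):
    print(f"Sharpe {sharpe}: about {years_needed(sharpe):.0f} years for t = 2")
```

Under these assumptions a signal with a Sharpe ratio of 0.5 needs roughly 16 years of data to reach a t-statistic of 2, and slicing the same history into more observations does not, by itself, make the mean return any easier to detect.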

Another key aspect, as noted in part 1, is the adaptive and (particularly with HFT) adversarial nature of markets 1.

ML workflow

The actual practice of machine learning has changed markedly in recent years; in the past, the focus was more on algorithm development and implementation. The field was arguably more academic, or strictly speaking, anchored in academia. Now, most cutting-edge research is done in the vastly better funded labs of big industry players. A proliferation of high quality open-source algorithms means that for most practitioners, the bulk of effort is spent on technology rather than maths; on data munging, plumbing, housekeeping and the like.

A typical, if abbreviated, ML workflow involves:

a) Defining a credible application area and objectives

b) Sourcing, cleaning and exploring data

c) Testing different models

d) Promoting a solution to production

e) Monitoring and maintenance

With these in mind, we can examine where problems often arise.

Motivation

As with any kind of potentially game-changing innovation, the pursuit of ML can be driven by board level FOMO; panic when competitors (claim they) are doing something with AI, so we had better (claim to) do it too. That said, at least our goals in the markets domain are self-evident, so we avoid the mistake of investing heavily in AI as a solution without an actual problem in mind.

A recent ML in finance conference 2 suggested that 85% of attendees believe in ML, with only 25% actually having found value in it. Let’s proceed assuming we aim to be in both contingents.

Half-hearted attempts

Many financial institutions are finding their traditional business model under stress from a combination of prolonged low interest rates, increased competition and fee compression. They are thus reluctant to divert sufficient resources to ML projects, which are inherently risky for the reasons noted, to give them a chance of success. This virtually guarantees failure. On the other hand, there is a good chance that a failure to innovate will wound the company in the longer term.

Under such pressures, success requires identifying areas where ML can help, setting realistic goals and perhaps identifying scope for reuse of expertise, infrastructure and software so that costs and rewards are spread.

Computing resources

Additional frustrations abound, particularly in slower moving firms. The most cutting-edge projects, e.g. market prediction from high frequency, proprietary and/or alternative data, can require enormous computing capacity for rapid model exploration.

Companies lacking in-house cloud facilities have been known to eschew external vendors due to over-zealous concerns around data protection. They might usefully reconsider given that the likes of the Pentagon/Department of Defense have confronted such issues and adopted third-party infrastructure.

Controls

Assuming a promising model has been found, the process of testing, coding up for production, deploying and monitoring can involve months of work if associated frameworks need to be built. Such housekeeping, while critical, is not the best use of a quant researcher’s time, so dedicated support should be provided.
 
Now imagine that, additionally, a researcher has to spend weeks explaining every last detail of a model to an internal model review group. It would be unfair to expect such a group to be well versed in the subtler aspects of ML; they will thus tend to focus on (familiar) minutiae of a project and may fail to appreciate where the risk really lies (overfitting, model robustness to spurious inputs etc.).

As far as automated trading goes, there is something of a double standard here; in general, human traders aren’t asked to document and justify every step of their decision-making process and, the occasional blue screen aside, machines don’t trade hungover. 

Policies should therefore concentrate on setting high-level controls e.g. maximum allowable deviation from primary market price for a market making model, or portfolio limits for a systematic strategy.

Finally, we should ensure that suitable monitoring and fail safes are properly implemented, as mandated for an automated system in any domain.
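The high-level controls described above can be sketched as simple checks that sit outside the model and know nothing of its inner workings. This is a minimal illustration only: the `Limits` fields, function names and threshold values are hypothetical, and a real system would need far more (kill switches, logging, alerting and so on).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Limits:
    """Hypothetical high-level limits, set by policy rather than by the model."""
    max_quote_deviation: float   # max fractional distance from primary market price
    max_position: float          # max absolute position per instrument

def quote_allowed(quote_price, primary_price, limits):
    """Reject quotes that stray too far from the primary market price."""
    if primary_price <= 0:
        return False             # no sane reference price: fail safe
    deviation = abs(quote_price - primary_price) / primary_price
    return deviation <= limits.max_quote_deviation

def order_allowed(current_position, order_quantity, limits):
    """Reject orders that would breach the position limit."""
    return abs(current_position + order_quantity) <= limits.max_position

limits = Limits(max_quote_deviation=0.005, max_position=1000)

print(quote_allowed(100.4, 100.0, limits))   # within 0.5% of the primary price
print(quote_allowed(101.0, 100.0, limits))   # 1% away from the primary price
print(order_allowed(900, 200, limits))       # would take the position to 1100
```

The design choice is that these checks can be reviewed, tested and signed off without any understanding of the model that generates the quotes and orders.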

Recommendations to companies pursuing ML in financial markets

  1. Set realistic targets

  2. Be prepared to resource properly

  3. More ambitious projects should be viewed as pure R&D, with regard to risk/reward and duration

  4. Provide tech support, minimize the obstacles ML quant researchers have to deal with, and automate as far as possible

  5. Policies and controls on models should focus on setting acceptable bounds on model utilization, high level monitoring and “circuit breakers” rather than detailed inner workings
 

------------------------------------------------------------------------------------

1 This points to the application of reinforcement learning techniques which can “learn” and adapt automatically to changes in the environment

2 JPMorgan Machine Learning in Financial Markets Conference, Paris 2019

-----------------------------------------------------------------------------------

 


 

--------------------------------------------------------------------------------

 

Chris Vryonides is Director at Cognosian, a consultancy providing bespoke AI solutions.
He has over 20 years' prior experience of quantitative finance in both bulge bracket banking and investment management and has been applying machine learning and statistical methods since 2008.
 

 

 

 



 

This article was produced on behalf of the CFA UK Fintech AI & Machine Learning working group. 

 
