Five Questions: Machine Learning in Investing with Kevin Zatloukal

Artificial intelligence and machine learning are going to change investing in many ways. The ability of computers to think, analyze data, and react more and more like human beings has the potential to change everything from how advisors interact with their clients to how investment strategies are created to the fees that are charged for investment advisory services.

While very few people dispute the potential of these technologies in investing, there is significant disagreement about whether, and to what degree, they can enhance factor-based strategies and other strategies that invest based on company fundamentals.

Despite spending a lot of time studying AI, I am still very new to it, so I am certainly not the person to help provide these answers. We are fortunate to be joined for our latest interview, however, by someone who can. Kevin Zatloukal teaches computer science at the University of Washington. He has a PhD in computer science from MIT and worked at both Microsoft and Google. He is also the second person selected for O’Shaughnessy Asset Management’s new research partner program. His first paper written in collaboration with OSAM, “ML & Investing Part 1: From Linear Regression to Ensembles of Decision Stumps”, offers an excellent look at how machine learning can be applied in the world of investing, and I highly recommend you read it if you haven’t already.


Jack: Thank you for taking the time to talk to us.

You have an interesting perspective as someone outside the world of investing with an extensive knowledge of machine learning. One of the things I often wonder about is how machine learning will change the investing world, and which areas it will have the most impact in. Those of us inside the world of investing are prone to have our own biased thoughts on what machine learning may or may not mean, but in many ways we may be too close to the issue to see it clearly. When you look at the impacts that machine learning may have in investing five or ten years out, what areas do you think it will impact the most? What problems in investing do you think it is most suited to solve?

Kevin:  Let me first say that I’m not sure that my predictions about this should be trusted any more than anyone else’s. As Yogi Berra said, predictions are hard, especially about the future.

My best argument for the use of machine learning becoming widespread among asset managers is that it is already happening at just about every other company in America. We all want to use the data now available to make better decisions, and machine learning is a key enabler for that, especially when there is too much data for humans to examine. I can’t see any reason to expect asset management companies to be different.

My best argument in the other direction would be to point out that there seems to be a minimum level of accuracy required before human beings are willing to trust machine learning. Once the accuracy is sufficiently high, when the machine makes a prediction that doesn’t make sense to us, I think we instead assume that we are the ones who have made a mistake (that becomes the most likely explanation) rather than the machine being wrong. Users report that it feels like “magic” when that level of accuracy is reached, as if the machine has suddenly become intelligent. (Here, machine learning is teaching us new things about human psychology as well!)

While we can achieve high accuracy in predicting the word being spoken or the move most likely to win in a game of Go, it could be that the stock market contains enough inherent randomness that the level of accuracy required for people to trust machine stock picks simply isn’t achievable.

I suppose, if I had to make a prediction about the future ten years from now, I would guess that both things are true: use of machine learning is widespread among asset managers, but the humans picking asset managers still tend to pile money into whoever had the best returns last year.

As for which area of investing machine learning will have the biggest impact in, it’s really hard to say. We don’t usually know how well ML is going to work on any problem until we try. To quote Elements of Statistical Learning: “it is seldom known in advance which procedure will perform best or even well for any given problem” (emphasis added).

Jack: One of the interesting ways you have utilized machine learning is in fantasy football. You showed a simple example of predicting future NFL performance for wide receivers using college performance data and physical attributes in your paper with OSAM, but I am assuming the models you use in the real world are more complex. Fantasy football is an interesting proving ground for this technology because it puts it up against the ability of human beings who have the same data. Without giving away any of your secrets, can you talk about how you have applied machine learning to fantasy football and how has it performed relative to your human competitors?

Kevin: The models I use in the “real world” are not much more complicated, if at all. In fact, for wide receivers, I’ve found that a simple model using only two features — draft pick and career market share of team yards from scrimmage (one measure of college production) — works well. I probably lean on that model more than any other.
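A model of that shape is easy to sketch. The code below is a hypothetical illustration, not the actual model: the data are randomly generated, and the coefficients are invented purely so there is something to fit.

```python
import numpy as np

# Hypothetical sketch of a two-feature model like the one described:
# projecting a receiver's pro production from draft pick and college
# market share of team yards. All numbers below are made up.
rng = np.random.default_rng(0)
n = 200
draft_pick = rng.uniform(1, 250, n)        # overall draft position
market_share = rng.uniform(0.1, 0.5, n)    # share of college team yards

# Pretend the world is linear plus noise, just to have data to fit.
points = 120 - 0.3 * draft_pick + 150 * market_share + rng.normal(0, 5, n)

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), draft_pick, market_share])
coef, *_ = np.linalg.lstsq(X, points, rcond=None)
intercept, b_pick, b_share = coef
```

With only two features and a few hundred historical examples, a fit like this is about as simple as modeling gets, which is part of the point Kevin makes next.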

The main reason why simple models work so well in fantasy football is that a significant fraction of the edge to be found there is behavioral (rather than informational). Sticking to the models and ignoring the noise gives you a big advantage, just as with quantitative approaches to investing. When the draft comes around, your competitors are often over-reacting to all sorts of information, while the simple model (and its strong history of accuracy) tells you that none of that really matters.

Your competitors will pass on those players and let you acquire them for less than the model says they are worth (their intrinsic value). It is basic value investing.

Overall, I would say that results have been surprisingly good. I’ve found myself relying more on models over time rather than less.

However, there is clearly a “paradox of skill” problem in fantasy football just as much as in investing. As I’ve done well, I’ve gravitated toward leagues with more skillful players, many of whom rely heavily on their own models. As the level of competition rises, the results depend more on luck since the competitors are more evenly matched.

Most of my competitors have similar models to the one I described above, so I’ve had to continue to find edges in new areas. Last year, I improved my models for how to price players in auction leagues, and that worked well. We’ll see what I can come up with this year.

Jack: A recent Research Affiliates paper argued that the effectiveness of using machine learning to predict future stock returns will be limited because there isn’t enough data. The paper put it this way: “Today, we have about 55 years of high-quality equity data (or less than 700 monthly observations) for many of the metrics in each of the stocks we may wish to consider. This tiny sample is far too small for most machine learning applications, and impossibly small for advanced approaches such as deep learning.” Do you agree with this assessment?

Kevin: No, I strongly disagree with that assessment.

Let’s go back to the fantasy football example. Instead of 55 years of data, I only have 20 years of data for fantasy football (because the game has changed quite a bit from earlier years). And instead of thousands of companies and 12 months of data from each year, there are only 10–20 receivers drafted each year. As a result, my wide receiver data set has only a few hundred examples in it, whereas we have hundreds of thousands of examples in investing!

Yet, machine learning approaches work extremely well in fantasy football. In fact, the lack of data was the primary reason that I had to turn to more sophisticated machine learning algorithms in trying to analyze fantasy football in the first place. Classic techniques like linear regression couldn’t cope with the lack of data, but other machine learning approaches could.

Stepping back a bit, I think that trying to separate machine learning methods from the classical approaches used in the finance literature, like linear regression, is really the wrong way to look at this….

For starters, linear regression is a machine learning algorithm. When I took my first ML class at MIT, we spent a week studying it. So, from my point of view, linear regression is one of many different machine learning techniques, each of which has its own strengths and weaknesses.

The amount of data needed is one dimension along which ML techniques differ. Linear regression typically needs about 10 times as many examples as features in the model, so it often works well with low to moderate amounts of data. Deep learning seems to really excel when you have enormous amounts of data. And at the other extreme, techniques like support vector machines and lasso regression work well when you have very small amounts of data. So, while the best technique may differ by problem, machine learning methods are useful no matter how much data you have.
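The data-size tradeoff can be made concrete with a toy experiment on synthetic data (ridge regression stands in here as a simple regularized cousin of the lasso mentioned above). With 15 examples and 10 features, far below the roughly 10-examples-per-feature rule of thumb, plain least squares overfits, while the regularized fit holds up better out of sample:

```python
import numpy as np

# Synthetic data: 15 training examples, 10 features, only 2 of which
# actually matter. Far too little data for plain least squares.
rng = np.random.default_rng(1)
n_train, n_test, p = 15, 500, 10
true_coef = np.zeros(p)
true_coef[:2] = [2.0, -1.0]          # only two features actually matter

X_train = rng.normal(size=(n_train, p))
y_train = X_train @ true_coef + rng.normal(0, 1.0, n_train)
X_test = rng.normal(size=(n_test, p))
y_test = X_test @ true_coef + rng.normal(0, 1.0, n_test)

def fit_ridge(X, y, lam):
    # Closed-form ridge: (X'X + lam*I)^-1 X'y; lam=0 is plain OLS.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(coef):
    return np.mean((X_test @ coef - y_test) ** 2)

ols_err = mse(fit_ridge(X_train, y_train, 0.0))    # unregularized
ridge_err = mse(fit_ridge(X_train, y_train, 5.0))  # regularized
```

The regularization penalty shrinks the coefficients toward zero, trading a little bias for a large reduction in variance, which is exactly what helps when examples are scarce.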

Like any other machine learning technique, there will be problems where linear regression works well and others where it does not. If you want to be able to tackle a wide range of problems, you’ll want to include linear regression in your tool belt and not be predisposed toward or against it.

Jack: Two of the common criticisms I hear of machine learning from people who use more traditional methods is that it is effectively data mining and that it has no way of identifying whether there is an economic reason for the relationship between two variables. On the first point, there is a saying that if you torture the data enough, it will eventually confess, and some argue that machine learning is essentially just running as many tests as it takes to produce the desired result. On the second, if you fed a machine learning system something like the results of every NFL game in a season as a feature and stock returns as an output, it could potentially find a relationship, but that relationship would be spurious. What do you think of these criticisms?

Kevin: I disagree with the first point.

Applying machine learning is a constant battle against data mining (or, as we call it, “over-fitting”), so its users are quite aware of the danger, and the basic techniques of the field focus on how to prevent it.

In fact, users of traditional methods may be more susceptible to data mining, due to a false sense of security. (Results on the non-replicability of past research on hundreds of different alleged “factors” seem to indicate this.) Users of machine learning, in contrast, are always on guard against data mining.

On the second point, I must admit that I’m suspicious of the importance of humans finding a “true economic reason” for a statistical relationship.

Human beings, even smart ones, are perfectly capable of finding explanations for relationships that don’t exist. People thinking vaccines are the cause of autism is a recent example. And on the other side, the fact that we can’t think of an explanation for something doesn’t mean it isn’t true. For example, over a decade before Pasteur uncovered the link between germs and disease, a Hungarian doctor had already noticed that doctors washing their hands before delivering babies led to fewer deaths in childbirth, even if he didn’t know the true reason why.

I’d love to see a blind study that tested how good human analysts are at separating true economic relationships from false ones. Perhaps we’d see some alpha from that. Or perhaps we’d see the same thing that Joel Greenblatt saw when he let his clients choose from the stocks that his quantitative model liked: that filtering through their reasoning reduced performance. (Of course, in that case, you really should be using information from the human analysts — by taking the ones they don’t pick!)

Finally, it’s worth pointing out that finding a true economic relationship is not a prerequisite for making money. Jim Simons, from Renaissance Technologies, has said that they use many signals that do not make sense economically, and it appears they are making a lot of money.

Stepping back, I think the reason people prefer to stick to statistical relationships that they believe are based on true economic relationships is that they believe those statistical relationships are less likely to change in the future. However, I haven’t seen evidence that this is true. (Again, someone should do a study!) Alternatively, you can just let the data tell you that the relationship has changed. In fact, you must do that anyway because even “true economic relationships” can disappear in the future.

Jack: One of the biggest issues faced by those of us who follow value models is the issue of value traps. Whenever you buy cheap stocks, some of them are going to be cheap for a reason. Adding filters for quality can help with this, but also can have side effects of eliminating potentially successful investments along with the bad ones. I am wondering if machine learning may be able to help with this problem by analyzing historical winners and losers to find characteristics that are common among value traps, but are not present with successful investments. Do you think this is a promising use of the technology and how would you go about using it for that purpose?

Kevin: Absolutely, machine learning is well suited to that problem.

I would start by deciding how you want to define value traps.

Let’s say you are only worried about companies that end up going bankrupt. Probably only a small fraction of the companies in your data set fall into that category, so you may end up in a situation like the one in fantasy football, where there is only a small set of examples to learn from. In that case, I’d try out support vector machines or regularized logistic regression — methods that work well with limited data. Try those out and see if they are accurate on out-of-sample data. If they are, then you can feel confident applying them to new companies.
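As a minimal sketch of that workflow (not anyone's actual process), the code below fits an L2-regularized logistic regression by gradient descent on synthetic data standing in for "goes bankrupt vs. not", then checks accuracy on a held-out set. The features and the bankruptcy rule are invented for illustration.

```python
import numpy as np

# Synthetic stand-in for a bankruptcy data set: 3 made-up features,
# with an invented rule generating the rare "bankrupt" label.
rng = np.random.default_rng(2)
n = 400
X = rng.normal(size=(n, 3))          # e.g. leverage, margins, momentum
logits = -2.0 + 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logits))).astype(float)

# Hold out the last 100 examples to test out-of-sample accuracy.
X_train, X_test = X[:300], X[300:]
y_train, y_test = y[:300], y[300:]

def fit_logreg(X, y, lam=1.0, steps=2000, lr=0.1):
    # Gradient descent on the L2-regularized log-loss
    # (intercept left unpenalized).
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p_hat = 1 / (1 + np.exp(-Xb @ w))
        grad = Xb.T @ (p_hat - y) / len(y) + lam * np.r_[0, w[1:]] / len(y)
        w -= lr * grad
    return w

w = fit_logreg(X_train, y_train)
Xb_test = np.column_stack([np.ones(len(X_test)), X_test])
p_bankrupt = 1 / (1 + np.exp(-Xb_test @ w))
accuracy = np.mean((p_bankrupt > 0.5) == y_test)
```

If out-of-sample accuracy holds up, the estimated probabilities (`p_bankrupt` here) are what a screening rule would then be built on.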

Next, you need to figure out how to incorporate that model into your process. The model will give you an estimated probability of bankruptcy, so you could do something like exclude any of the companies whose estimate is above some threshold or in the top decile by probability of bankruptcy. However, you may find that those filters give worse returns because, as you suggested, they might throw out too many good investments with the bad ones.

In that case, there are more knobs that you can try. The models I mentioned will let you set weights on the examples. You could add weight to the big winners, essentially telling the model to work harder to not misclassify those ones. That might solve your problem, or it might create some other problem. Perhaps standard deviation of returns becomes too high. Well, you can look for knobs to adjust for that.
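The example-weight knob can be sketched with a weighted least-squares fit on made-up data, where some examples (the pretend "big winners") count ten times as much as the rest:

```python
import numpy as np

# Made-up data: a linear relationship, with the first 10 examples
# shifted up to play the role of the "big winners".
rng = np.random.default_rng(3)
n = 100
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(0, 1.0, n)
y[:10] += 5.0                        # pretend these are the big winners

X = np.column_stack([np.ones(n), x])

def weighted_fit(X, y, w):
    # Weighted least squares: minimize sum_i w_i * (y_i - x_i . b)^2
    XtW = X.T * w                    # scale column i of X.T by w_i
    return np.linalg.solve(XtW @ X, XtW @ y)

uniform = weighted_fit(X, y, np.ones(n))
weights = np.r_[np.full(10, 10.0), np.ones(n - 10)]
upweighted = weighted_fit(X, y, weights)
# Upweighting the winners pulls the fit toward them (higher intercept).
```

Most ML libraries expose this knob directly (e.g. a `sample_weight` argument on the fitting routine), so in practice you turn it rather than reimplement it.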

And on and on it goes… There’s a reason why these ML projects at Google and other places take 10 PhDs three years to complete. It’s hard work!

Jack: Thank you again for taking the time to talk to us today. If investors want to find out more about you and your research, where are the best places to go?

Kevin: I’m on Twitter as @kczat. I’m always happy to chat there.

If the reader is also interested in learning more about ML, I’m planning to have more articles published at OSAM Research in the future that will hopefully be informative.