Digital Data: Gold Mine or Garbage Dump?

November 30th, 2021 3 minutes read

If you’d like to learn more about how to navigate digital data, read our previously published article “Backtesting in the Age of Financial Machine Learning.” 

Did you know that the world’s digital data was projected to reach 40 zettabytes (40 trillion gigabytes) by the end of 2020? An amount of information that large is hard for the human mind to comprehend.

To put this in perspective, by one estimate the average person creates 1.7 MB of data every second, and that figure is rising. Plotted on a graph, the amount of digital data has been doubling every two years for the past 10 years.
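
As a rough sanity check on these figures, the sketch below converts zettabytes to gigabytes and computes the doubling time implied by growing from 40 ZB in 2020 to 175 ZB in 2025, the projection quoted elsewhere in this article. The function name implied_doubling_years is hypothetical, and the calculation assumes smooth exponential growth between the two data points.

```python
import math

# 1 ZB = 10^21 bytes and 1 GB = 10^9 bytes, so 1 ZB = 10^12 GB.
GB_PER_ZB = 10**12

def implied_doubling_years(start_zb, end_zb, years):
    """Doubling time implied by steady exponential growth (an assumption)."""
    growth_factor = end_zb / start_zb
    return years * math.log(2) / math.log(growth_factor)

gigabytes_2020 = 40 * GB_PER_ZB  # 40 ZB expressed in gigabytes
doubling = implied_doubling_years(start_zb=40, end_zb=175, years=5)

print(f"40 ZB = {gigabytes_2020:,} GB")
print(f"implied doubling time: {doubling:.2f} years")
```

Under that assumption the implied doubling time comes out at a little over two years, which is broadly in line with the doubling-every-two-years description.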

In fact, by 2025 we are predicted to have 175 ZB of data. But is all of this data information? While it is not wrong to think that the digital age has brought us a gold mine of information, the reality is that most of this data is noise. Traders looking for signals, or for clean, filtered data that supports their hypotheses, face a critical limitation: time.

Time is critical when trading assets, because traders need to decide immediately. A lot of noise makes it difficult to see things clearly. One cannot expect to find trading signals easily, and naive economic theory goes further, arguing that under such conditions the search may be futile.

On the other hand, rational economic theory says that signals may exist in the presence of noise. However, the collective cost of discovering them is limited to the benefits gained from trading on them. If signals paid more than they cost to find, other traders would enter your space and the opportunity would disappear.

Understanding rational economic theory is essential to understanding the limitations of signals and the significance of proper backtesting.

Where should you look for signals?

For decades, economists have said that observing price behavior could reveal the direction of individual and collective market behavior. Historically, they believed the rise and fall of prices could be read as an indicator of market sentiment, including both objective and subjective information. Such suggestions have never found solid theoretical or consistent statistical backing. Observing individuals' direct interests may yield more plausible results.

What’s great about having zettabytes of the world’s data at our disposal is we can mine signals from online interactions. These seem to offer a new perspective on market participants' behavior in periods of large market movements. But what exactly should we test?

Why Nowcasting Instead of Forecasting

From a signal standpoint, and given all the traps, traders should focus on nowcasting rather than the usual forecasting models. Nowcasting offers two advantages:

  • Direct measurements, which hold true because they do not rely on a statistical lead-lag relationship
  • Short-range predictions, which are statistically more reliable than long-range ones. This also implies that most published discoveries or signals in finance turn out to be false after a while

Unfortunately, a common but shaky practice among some academics and many practitioners is to run tens of thousands of historical backtests to identify a promising investment strategy. The best result is then cherry-picked and reported as if only a single trial had taken place, and it becomes the basis for a publication or for launching a new fund.
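
To illustrate why cherry-picking the best backtest is misleading, here is a minimal sketch that simulates many strategies with no real edge and reports the best one. Everything in it is an assumption made for illustration: daily returns are drawn as independent Gaussian noise with zero mean and 1% volatility, a year has 252 trading days, and the function names annualized_sharpe and best_of_random_backtests are hypothetical.

```python
import random
import statistics

def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    mean = statistics.fmean(daily_returns)
    stdev = statistics.stdev(daily_returns)
    return (mean / stdev) * periods_per_year ** 0.5

def best_of_random_backtests(n_trials, n_days=252, seed=0):
    """Backtest n_trials pure-noise strategies; return (best, average) Sharpe."""
    rng = random.Random(seed)
    sharpes = []
    for _ in range(n_trials):
        # Zero true mean: by construction, none of these strategies has an edge.
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        sharpes.append(annualized_sharpe(returns))
    return max(sharpes), statistics.fmean(sharpes)

best, average = best_of_random_backtests(n_trials=1000)
print(f"best Sharpe: {best:.2f}, average Sharpe: {average:.2f}")
```

The average Sharpe ratio across trials should sit near zero, as expected for noise, while the single best trial looks like a strong strategy. Reporting only that best trial hides the number of attempts it took to find it.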
