Predicting Stock Returns with AI and Online News: A New Frontier
Published Date: 12/14/2023 Author: Thanakorn Rojanasasitornwong
In the ever-evolving landscape of stock trading, the combination of artificial intelligence (AI), data science, and financial news has opened a new era. Moving beyond traditional models, researchers at Cornell have developed an approach that draws on machine learning, natural language processing (NLP), and finance to construct a new predictive framework. In this blog post, we'll explore how these disciplines converge and how that convergence is reshaping the way we predict stock returns.
Addressing a common critique of machine learning - its lack of interpretability - Cornell researchers have introduced a model that not only predicts stock returns but also makes the underlying factors clear. By incorporating text data from financial news, the researchers have created interpretable machine-learning models that let traders see explicitly which features matter. Lead author Liao Zhu emphasizes the significance of utilizing financial news to "cluster the data," bringing order to the sometimes chaotic results generated by algorithms. The approach aims to improve our understanding of the relationship between certain tradable assets, such as exchange-traded funds (ETFs), and specific stocks or industries.
The Cornell research builds on existing methods, introducing a flexible prediction framework that bridges market data and text data without relying on sentiment analysis. Instead, the researchers employ the method of "word embeddings" from natural language processing, creating "asset embeddings" for tradable assets derived from financial news. This novel approach integrates new, interpretable machine-learning algorithms, allowing for a more nuanced understanding of the market. Zhu clarifies that their algorithm doesn't rely on sentiment from the news but instead utilizes the news as guidance for identifying assets or words relevant to specific stocks or industries, revealing more stock- and industry-specific information.
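To make the idea of "asset embeddings" more concrete, here is a minimal sketch in Python. It is not the researchers' implementation: it swaps learned word embeddings for simple count-based context vectors, and the headlines, tickers, stopword list, and function names are all hypothetical. Each asset is represented by the words that appear alongside it in the news, and assets are compared by cosine similarity. No sentiment scoring is involved.

```python
import math
from collections import Counter

# Hypothetical toy headlines, invented purely for illustration.
HEADLINES = [
    "XOM raises output as crude oil prices climb",
    "CVX profit jumps on higher crude oil prices",
    "XLE energy fund gains as oil rallies",
    "AAPL unveils new iphone with faster chip",
    "MSFT expands cloud software for chip designers",
]
ASSETS = {"XOM", "CVX", "XLE", "AAPL", "MSFT"}
STOPWORDS = {"as", "on", "with", "for", "new", "the", "a"}


def asset_embeddings(headlines, assets):
    """Map each asset to a count vector of the words it co-occurs with."""
    vectors = {asset: Counter() for asset in assets}
    for line in headlines:
        tokens = line.split()
        mentioned = [t for t in tokens if t in assets]
        context = [
            t.lower()
            for t in tokens
            if t not in assets and t.lower() not in STOPWORDS
        ]
        for asset in mentioned:
            vectors[asset].update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(asset, vectors):
    """Rank all other assets by similarity to the given asset."""
    scored = [
        (cosine(vectors[asset], vectors[other]), other)
        for other in vectors
        if other != asset
    ]
    return sorted(scored, reverse=True)


vectors = asset_embeddings(HEADLINES, ASSETS)
for score, other in most_similar("XOM", vectors):
    print(f"XOM vs {other}: {score:.3f}")
```

In this toy data the two oil producers land closest together because they share the context words "crude", "oil", and "prices", while the technology names share nothing with them. A real system would train on a large news corpus and use dense embeddings, but the principle is the same: the news tells us which assets belong near each other.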
To demonstrate the efficacy of their approach, the researchers developed two distinct models. The News Embedding UMAP Sparse Selection (NEUSS) model predicts returns for individual stocks, while the News Sparse Encoder with Rationale (INSER) model identifies crucial words for each specific industry, enhancing the accuracy of industry return predictions. According to the researchers, the NEUSS model outperformed the traditional Fama-French 5-factor model by 50%, and the INSER model surpassed the benchmark by 10%, underscoring the importance of industry-specific information.
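The "sparse selection" part of the NEUSS name suggests picking a small number of relevant assets out of many candidates. The sketch below illustrates that general idea with an L1-penalized (lasso) regression solved by coordinate descent. This is an assumption made for illustration: the post does not say which estimator NEUSS uses, the UMAP step is left out entirely, and the return figures (in percent) are made up.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # sum of squares of feature j
            for i in range(n):
                pred_without_j = sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z else 0.0
    return beta


# Hypothetical returns in percent: rows are periods, columns are candidate ETFs.
ETF_RETURNS = [
    [1.0, -0.4, 0.2],
    [-2.0, 0.6, 0.1],
    [1.5, 0.3, -0.3],
    [-0.5, -0.7, 0.4],
    [1.2, 0.2, -0.2],
    [-0.8, 0.1, 0.0],
]
# The made-up stock moves with the first ETF only.
STOCK_RETURNS = [0.9 * row[0] for row in ETF_RETURNS]

beta = lasso_coordinate_descent(ETF_RETURNS, STOCK_RETURNS, lam=0.05)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("coefficients:", [round(b, 3) for b in beta])
print("selected ETF columns:", selected)
```

The penalty drives the coefficients of the two irrelevant ETFs to exactly zero and keeps only the first one, which is what makes the result easy to interpret: a trader can read off which assets the model considers relevant to a given stock.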
The Cornell research represents a significant stride in the ongoing AI revolution in finance. By harnessing the power of advanced machine-learning algorithms and integrating diverse data types, the financial landscape is undergoing a paradigm shift. This research not only moves the revolution forward but also highlights the potential for a more sophisticated and informed approach to stock trading. As AI continues to reshape the finance field, the Cornell study stands as a testament to the transformative possibilities that lie at the intersection of technology, data, and financial insights.