I described a basic alpha research process in the previous post, How to Build Quant Algorithmic Trading Model in Python; this post extends it to cover the backtesting piece.
In this backtesting phase, we perform the following steps on each date in the backtest period:
This is just one basic approach, and the model described in this post…
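The step list itself is elided above, but the overall shape of such a daily loop can be sketched as follows. This is a minimal illustration rather than the post's actual implementation; `compute_alpha`, `build_portfolio`, and the `prices` DataFrame are hypothetical placeholders.

```python
import pandas as pd

def run_backtest(prices: pd.DataFrame, compute_alpha, build_portfolio) -> pd.Series:
    """Minimal daily backtest loop: compute a signal, form weights, book next-day PnL.

    prices: DataFrame indexed by date, one column per asset.
    compute_alpha / build_portfolio: hypothetical user-supplied functions.
    """
    daily_returns = prices.pct_change()
    pnl = []
    dates = prices.index[1:-1]
    for date in dates:
        alpha = compute_alpha(prices.loc[:date])      # signal from data up to this date
        weights = build_portfolio(alpha)              # e.g. long top decile, short bottom
        next_ret = daily_returns.shift(-1).loc[date]  # realised next-day returns
        pnl.append((weights * next_ret).sum())
    return pd.Series(pnl, index=dates)
```

A real backtest also needs transaction costs, position limits, and point-in-time data; the loop above only captures the skeleton.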
I described the basics of alpha research in my previous post, How to Build Quant Algorithmic Trading Model in Python, and the backtesting process in How to Perform Backtesting in Python. Here is how to apply machine learning techniques to generate a better alpha from a number of alpha factors. There are many ways to achieve this; what I applied here is classification by supervised machine learning, taking a quantised one-week forward return as the label and calculating the weighted sum of the probabilities of each label.
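As a rough illustration of that idea (quantise the forward return into bins, train a classifier on the factors, then take a probability-weighted sum over the bins as the combined alpha), here is a minimal scikit-learn sketch. The factor matrix `X` and forward-return series `fwd_ret` are hypothetical placeholders, and the classifier choice is arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def ml_alpha(X: pd.DataFrame, fwd_ret: pd.Series, n_bins: int = 5) -> pd.Series:
    """Combine alpha factors via classification of quantised forward returns."""
    labels = pd.qcut(fwd_ret, n_bins, labels=False)  # quantised 1-week forward return
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X, labels)
    proba = clf.predict_proba(X)       # columns follow clf.classes_ (0..n_bins-1 here)
    bin_values = np.arange(n_bins)     # could also use the mean return of each bin
    return pd.Series(proba @ bin_values, index=X.index)
```

In practice the classifier would be fit on past dates only and applied out of sample; the sketch skips that split for brevity.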
In Section 3 of the previous post, I used the following three factors…
Data in applied machine learning for NLP often contain both text and numerical inputs. For instance, when you build a model to predict future sales of a product from Twitter or news, it would be more effective to consider past sales figures, the number of visitors, market trends, etc. along with the text. You would not predict a stock price movement simply from news sentiment, but rather use it to complement a model based on economic indices and historical prices. This post shows how to combine text inputs and numerical inputs in scikit-learn (for TF-IDF) and PyTorch (for LSTM/BERT).
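On the scikit-learn side, the usual pattern is a ColumnTransformer that applies TfidfVectorizer to the text column and a scaler to the numeric columns, feeding both into a single estimator. A minimal sketch with made-up column names:

```python
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical columns: 'headline' (text), 'past_sales' and 'visitors' (numeric).
preprocess = ColumnTransformer([
    ("text", TfidfVectorizer(max_features=5000), "headline"),   # 1-D text column
    ("num", StandardScaler(), ["past_sales", "visitors"]),      # numeric features
])
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
# model.fit(df[["headline", "past_sales", "visitors"]], df["label"])
```

The same idea carries over to PyTorch: encode the text with an LSTM or BERT, then concatenate the resulting vector with the numeric features before the final layers.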
When…
This post covers the basics of the alpha research process. zipline and alphalens are used to manage the pipeline and measure the performance in a way that makes the quantitative process explicit. You can check the zipline tutorial as well.
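To make "manage the pipeline and measure the performance" concrete, the typical pattern looks roughly like this; the factor here is a stand-in, not the post's actual one:

```python
from zipline.pipeline import Pipeline
from zipline.pipeline.data import USEquityPricing
from zipline.pipeline.factors import Returns

# A simple momentum-style factor exposed as a pipeline column.
pipe = Pipeline(columns={
    "momentum": Returns(window_length=20, inputs=[USEquityPricing.close]),
})

# After running the pipeline over the research window, alphalens scores
# the factor against forward returns, e.g.:
#   import alphalens as al
#   factor_data = al.utils.get_clean_factor_and_forward_returns(factor, prices)
#   al.tears.create_full_tear_sheet(factor_data)
```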
The focus of this post is on the pure alpha research process; I will cover backtesting (How to Perform Backtesting in Python, added on 14 Jan 2021) and the use of AI/machine learning (How to generate AI Alpha Factor in Python, added on 26 Dec 2020) in separate posts. …
This blog describes how I analysed central bank policy by means of NLP techniques in a past project. The source code is available in the GitHub repo.
The FOMC holds eight regular meetings a year to determine monetary policy. For each meeting, it publishes statements, minutes, and press conference transcripts on its website. In addition to these regular meetings, the members' speeches and testimonies are also transcribed on the website.
At each meeting, the policy makers discuss, vote on, and decide the monetary policy, and publish the decision along with their view on the current economic situation and forecast, including Forward Guidance since…
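For a flavour of the data collection step (the real scraper lives in the GitHub repo linked above), pulling candidate links from the Fed's FOMC calendar page can be sketched like this; the 'monetary' path filter is an assumption about the site's URL scheme:

```python
import requests
from bs4 import BeautifulSoup

CALENDAR_URL = "https://www.federalreserve.gov/monetarypolicy/fomccalendars.htm"

resp = requests.get(CALENDAR_URL, timeout=30)
soup = BeautifulSoup(resp.text, "html.parser")

# Assumption: statement/minutes pages contain 'monetary' in their href.
links = {a["href"] for a in soup.find_all("a", href=True) if "monetary" in a["href"]}
print(sorted(links)[:10])
```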
After seeing the competitive results of BERT in sentiment analysis on financial text, I performed another preliminary study on more informal text, as the ultimate goal is to analyse traders' voice over the phone and chat in addition to news sentiment. …
Deep learning in computer vision has been successfully adopted in a variety of applications since AlexNet, a pioneering CNN, won the ImageNet competition in 2012. NLP, on the contrary, has been behind in terms of deep neural network utilisation: many applications that claim to use AI actually rely on rule-based algorithms or traditional machine learning rather than deep neural networks. 2018 saw a state-of-the-art (SOTA) model called BERT outperform human scores on some NLP tasks. Here, I apply several models to a sentiment analysis task to see how useful they are in the financial…
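Running a pre-trained BERT-family model on a sentence is a one-liner with the Hugging Face transformers library; a minimal sketch (the models compared in the post were fine-tuned separately, so this is only the quickest possible baseline):

```python
from transformers import pipeline

# Loads a default general-purpose English sentiment checkpoint; a
# finance-specific model could be substituted via the model= argument.
classifier = pipeline("sentiment-analysis")
print(classifier("The company beat earnings estimates but cut its guidance."))
# [{'label': 'NEGATIVE', 'score': 0.98}]  (illustrative output)
```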
Computer vision is one of the fastest-growing domains, and deep-learning-based approaches are now widely applied to solve real-world problems such as face recognition and cancer detection.
One of the most effective tools is the Tensorflow Object Detection API: take one of its pre-trained models, replace the last layer for the particular problem you are trying to solve, and fine-tune the model.
The API now supports Tensorflow 2.x. There are good reasons to use TF2 instead of TF1, e.g. eager execution, which was introduced in TF1.5 to make coding simpler and debugging easier, and new state…
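With TF2, inference on a pre-trained detector exported as a SavedModel follows the standard Object Detection API tutorial pattern; the model path below is a hypothetical placeholder:

```python
import numpy as np
import tensorflow as tf

# Hypothetical path to a model downloaded from the TF2 detection model zoo.
detect_fn = tf.saved_model.load("exported_model/saved_model")

image = np.zeros((1, 640, 640, 3), dtype=np.uint8)  # stand-in for a real image batch
detections = detect_fn(tf.constant(image))

# Standard output keys of the Object Detection API:
print(detections["detection_boxes"].shape, detections["detection_scores"].shape)
```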
As noted above, the Tensorflow Object Detection API has been adapted to Tensorflow 2.x. However, the runtime environment sometimes does not allow the latest version, and you still need to use Tensorflow 1.x. This blog post demonstrates how to use the…
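In TF1, the model zoo ships detectors as a frozen graph (frozen_inference_graph.pb), loaded roughly as below; the tensor names follow the Object Detection API convention but should be verified against the actual graph:

```python
import numpy as np
import tensorflow as tf  # TF 1.x

graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile("frozen_inference_graph.pb", "rb") as f:  # model zoo file
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

with tf.Session(graph=graph) as sess:
    image = np.zeros((1, 300, 300, 3), dtype=np.uint8)  # stand-in input batch
    boxes, scores = sess.run(
        ["detection_boxes:0", "detection_scores:0"],
        feed_dict={"image_tensor:0": image},
    )
```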
Stack Overflow publishes the results of its annual developer survey. The survey data covers 64,000 responses from 213 countries and territories, and the survey aims to understand multiple aspects of jobs related to software development and data analytics. There were more than 150 questions in the survey.
The survey page itself shows nice graphs of average salary by developer type, language, and experience. However, I wondered how relevant an average over 213 countries is to me. So why not just analyse the raw data?
For any data science project, the data need to be cleaned first. …
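A first cleaning pass on the raw survey file typically looks like this; the column names ('Country', 'Salary') are assumptions about the survey schema, used here for illustration:

```python
import pandas as pd

df = pd.read_csv("survey_results_public.csv")  # the published raw survey data

# Keep rows with a usable salary figure (assumed column names).
clean = df.dropna(subset=["Salary"])
clean = clean[clean["Salary"] > 0]

# Average salary per country instead of one global average.
by_country = clean.groupby("Country")["Salary"].agg(["mean", "count"])
print(by_country.sort_values("mean", ascending=False).head(10))
```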