Stocks

Academic stock data

This post introduces a couple of interesting datasets I recently stumbled upon. They contain historical stock return and fundamental data going back to the 1980s. Below I outline the process by which I have made this data available, and perform an initial exploratory analysis. Background: If you read this post, you will know I am collecting accounting and fundamental data for US stocks via the SEC EDGAR database. Price and other reference-type data are also collected, and you can read about it here.

Multi-level stock valuation

“datasets are often highly structured, containing clusters of non-independent observational units that are hierarchical in nature, and Linear Mixed Models allow us to explicitly model the non-independence in such data” (Harrison et al., 2018) [1]. They allow modeling of data measured on different levels at the same time, for instance students nested within classes and schools, thus taking complex dependency structures into account (Bürkner, 2018) [2].

Analysing Momentum

This post is going to analyse the momentum effect in US stocks using both publicly available aggregate data and privately collected individual stock-level data. The momentum effect is the tendency for stocks that have gone up (down) in the past to continue going up (down) in the immediate future. Going up or down in the past is usually defined as the return over the prior 12 months and is measured on a relative basis, that is, ranked against other stocks over the same period.
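
The definition above can be sketched in a few lines of Python. This is only an illustration, not the analysis from the post: the tickers and prices are made up, the function names `prior_return` and `momentum_ranks` are hypothetical, and it assumes one month-end price per period with a plain 12-month lookback (many studies also skip the most recent month, which this sketch does not do).

```python
def prior_return(prices, lookback=12):
    """Simple return over the last `lookback` periods of a month-end price list."""
    if len(prices) < lookback + 1:
        raise ValueError("need at least lookback + 1 prices")
    return prices[-1] / prices[-1 - lookback] - 1.0


def momentum_ranks(price_history, lookback=12):
    """Rank stocks by prior return, relative to each other (1 = strongest)."""
    returns = {t: prior_return(p, lookback) for t, p in price_history.items()}
    ordered = sorted(returns, key=returns.get, reverse=True)
    return {ticker: rank for rank, ticker in enumerate(ordered, start=1)}


# Made-up data: 13 month-end prices per hypothetical ticker.
prices = {
    "AAA": [100 + 2 * i for i in range(13)],   # 100 -> 124
    "BBB": [100 - i for i in range(13)],       # 100 -> 88
    "CCC": [50 + 0.5 * i for i in range(13)],  # 50 -> 56
}

print(momentum_ranks(prices))
```

The ranking step is what makes the measure relative: a stock counts as a "winner" because it outperformed its peers, not because its return was positive.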

Abalone and Outliers

That title deserves an explanation. This note will look at the Theil-Sen estimator for robust regression. I’m going to use the UCI Machine Learning abalone dataset to compare this technique with Ordinary Least Squares. This one is delivered via a Colab notebook; all is explained here.
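
To give a flavour of the comparison, here is a minimal sketch of both estimators for a single predictor, using only the standard library. It is not the notebook's code and does not use the abalone data: the ten points are a toy example with one deliberately planted outlier, and the function names are my own. Theil-Sen takes the median of the slopes between every pair of points, so a single wild observation cannot drag the fit the way it does with least squares.

```python
from itertools import combinations
from statistics import mean, median


def ols_fit(x, y):
    """Ordinary Least Squares slope and intercept for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def theil_sen_fit(x, y):
    """Theil-Sen: median of pairwise slopes, then median residual as intercept."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # skip pairs with an undefined slope
    ]
    slope = median(slopes)
    intercept = median([yi - slope * xi for xi, yi in zip(x, y)])
    return slope, intercept


# Toy data on the line y = 2x + 1, with the last point replaced by an outlier.
x = list(range(10))
y = [2 * xi + 1 for xi in x]
y[-1] = 100

print("OLS:      ", ols_fit(x, y))
print("Theil-Sen:", theil_sen_fit(x, y))  # (2.0, 1.0)
```

On this toy data Theil-Sen recovers the underlying line exactly, while the OLS slope is pulled well above 2 by the single outlier. For real work, SciPy and scikit-learn both ship Theil-Sen implementations.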

Stock Master

What’s a stock master? It’s a database that contains data on stocks. It is also the master, or authoritative, source of that data, at least for me. What kind of data exactly? Prices and fundamentals (and maybe economic time series). This post is going to document the data sources and tools used in building this database. The repo for the project is here. Motivation: Firstly, why do I need a database containing this type of information?

Intra-portfolio correlation

This is a quick post about intra-portfolio correlation. Intra-portfolio correlation (“IPC”) is defined as a weighted average of all unique pairwise correlations within a portfolio. It has typically been used to measure a portfolio’s diversification. That’s not what I’m interested in, however. I’m looking at IPC as a potential technical trading indicator. The idea is that an increase or decrease in the co-movement of a group of stocks (or the market as a whole, for that matter) may say something about their future returns.
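
As a rough sketch of that definition, the snippet below computes IPC for a small portfolio. Two things are assumptions on my part rather than anything stated above: each pair is weighted by the product of the two stocks' portfolio weights, and the correlation is plain Pearson correlation of returns. The tickers, returns, and function names are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length return series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def intra_portfolio_correlation(returns, weights):
    """Weighted average of all unique pairwise correlations.

    Assumption: the weight of pair (i, j) is w_i * w_j.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in combinations(returns, 2):  # each unique pair once
        pair_weight = weights[a] * weights[b]
        numerator += pair_weight * pearson(returns[a], returns[b])
        denominator += pair_weight
    return numerator / denominator


# Invented returns: BBB moves with AAA, CCC moves against both.
base = [0.01, 0.02, -0.01, 0.03]
returns = {
    "AAA": base,
    "BBB": [2 * r for r in base],
    "CCC": [-r for r in base],
}
weights = {"AAA": 1 / 3, "BBB": 1 / 3, "CCC": 1 / 3}

print(intra_portfolio_correlation(returns, weights))
```

With equal weights this reduces to the simple average of the three pairwise correlations (+1, -1 and -1). Used as an indicator, the same calculation would be repeated over a rolling window of returns so that changes in co-movement can be tracked through time.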