Part IX: Appendices
Appendix A: Basic Statistics
- Appendix Objectives
- The difference between descriptive and inferential statistics
- How to calculate common measures of central tendency and dispersion
- The process of regression
- The basic premises and statistics related to MPT
- Intro
- Returns
- Probability and Statistics
- Independence
- Permutations
- Combinations
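A minimal sketch of these counting ideas in Python (the stock-picking setup is a made-up example):

```python
from math import comb, perm

# Permutations: ways to pick 3 of 10 stocks when order matters,
# P(10, 3) = 10! / (10 - 3)!
print(perm(10, 3))   # 720

# Combinations: ways to pick 3 of 10 stocks when order does not matter,
# C(10, 3) = 10! / (3! * (10 - 3)!)
print(comb(10, 3))   # 120

# Independence: for independent events, P(A and B) = P(A) * P(B),
# e.g., two fair coin flips both landing heads.
p_a, p_b = 0.5, 0.5
print(p_a * p_b)     # 0.25
```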
- Descriptive Statistics
- Intro
- Descriptive statistics merely describes or characterizes data in a shorthand manner
- Inferential statistics tries to infer statements about a population based on observed outcomes or assumptions about outcomes
- Measures of Central Tendency
- Intro
- Mean (arithmetic mean)
- Median
- Mode
- Geometric Mean
- Also called “compound rate of return”
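As a rough illustration, here is how the four measures above can be computed with Python's standard library (the return series is hypothetical):

```python
import statistics
from math import prod

returns = [0.10, -0.05, 0.08, 0.02, 0.08]  # hypothetical periodic returns

print(statistics.mean(returns))    # arithmetic mean
print(statistics.median(returns))  # middle value when sorted
print(statistics.mode(returns))    # most frequent value (0.08)

# Geometric mean ("compound rate of return"): grow $1 through every
# period, then back out the equivalent constant per-period return.
geo_mean = prod(1 + r for r in returns) ** (1 / len(returns)) - 1
print(geo_mean)
```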
- Measures of Dispersion
- Intro
- Variance
- Standard deviation
- sample
- population
- unbiased estimate
- degrees of freedom
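A quick sketch of the sample-versus-population distinction, using Python's statistics module (the data are arbitrary):

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population versions divide by n; sample versions divide by n - 1,
# the degrees-of-freedom correction that gives an unbiased estimate
# of the population variance.
print(statistics.pvariance(data))  # population variance: 4.0
print(statistics.variance(data))   # sample variance: ~4.571
print(statistics.pstdev(data))     # population standard deviation: 2.0
print(statistics.stdev(data))      # sample standard deviation: ~2.138
```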
- Relationship Between Variables
- Intro
- variance – one variable
- covariance – two-variable version of variance
- time series
- time series variable
- time series data
- observations of a variable at consecutive time intervals
- r-squared
- Also called “Coefficient of determination”
- error term
- also called “residual”
- the portion of the Y variable left unexplained in each period (see the regression sketch below)
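The following sketch ties these terms together using the statistics module in Python 3.10+; the paired data are made up for illustration:

```python
import statistics

# Hypothetical observations: x is the independent (explanatory) variable,
# y is the dependent variable being explained.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Covariance: the two-variable analogue of variance.
print(statistics.covariance(x, y))

# Least-squares fit of y = slope * x + intercept.
slope, intercept = statistics.linear_regression(x, y)

# Error terms (residuals): the portion of y left unexplained each period.
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
print(residuals)

# r-squared (coefficient of determination): in simple linear regression
# it equals the squared correlation between x and y.
print(statistics.correlation(x, y) ** 2)
```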
- Autocorrelation
- also called “serial dependence”
- when the error terms themselves are correlated with each other
- Durbin-Watson test
- statistical test that helps detect autocorrelation
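As a rough sketch, the Durbin-Watson statistic can be computed directly from the regression residuals (the residual series below is invented to show strong positive autocorrelation):

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: roughly 2 suggests no autocorrelation;
    values toward 0 suggest positive, toward 4 negative autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Strongly positively autocorrelated residuals give a value near 0:
print(durbin_watson([1.0, 0.9, 0.8, 0.7, 0.6, 0.5]))  # ~0.014
```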
- Dependent variable
- the variable being explained
- Independent variable (“explanatory variable”)
- the variable doing the explaining
- Multiple regression
- We can extend the idea of regression to more than one independent variable
- multiple regression
- It may seem that two explanatory variables are better than one, three better than two, and so on
- Virtually any additional explanatory variable we include in a regression will improve the r-squared
- but using additional variables to improve r-squared is not always good
- The adjusted r-squared penalizes the r-squared as more independent variables are added to the regression equation, helping answer whether it is beneficial to add a particular variable (see the sketch after this list)
- Multicollinearity
- occurs when there is a reasonably strong correlation between two or more of the independent variables
- Multicollinearity clouds the picture concerning which independent variables are statistically significant
- Statistically significant = how likely is it that I would observe this outcome based purely on chance alone?
- A statistical significance threshold of 5% is often used: if we observe something we would expect to see less than 5% of the time based strictly on chance alone, it might be deemed statistically significant.
- Threshold of 1% would be a more stringent test.
- Statistically significant doesn’t always mean economically significant: a trading system might be statistically significant, but because of transaction costs and other factors it might not be profitable.
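A minimal sketch of the adjusted r-squared penalty (the r-squared values, observation count, and variable counts are hypothetical):

```python
def adjusted_r_squared(r_squared, n_obs, n_vars):
    """adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1),
    where n is the number of observations and k the number of
    independent variables."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_vars - 1)

# Adding a fourth variable nudges r-squared from 0.800 to 0.805, but
# the adjusted value falls, suggesting the extra variable isn't worth it.
print(adjusted_r_squared(0.800, 30, 3))  # ~0.777
print(adjusted_r_squared(0.805, 30, 4))  # ~0.774
```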
- Inferential Statistics
- Intro
- Modern Portfolio Theory
- Intro
- Performance Measurement
- Sharpe performance measure (or Sharpe ratio)
- Treynor measure of performance
- Jensen’s alpha
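For concreteness, here is one way to compute the three measures; all inputs are hypothetical annualized figures:

```python
# Hypothetical annualized inputs: portfolio return, risk-free rate,
# portfolio standard deviation, portfolio beta, and market return.
rp, rf, sigma_p, beta_p, rm = 0.12, 0.03, 0.18, 1.1, 0.10

sharpe = (rp - rf) / sigma_p   # excess return per unit of total risk
treynor = (rp - rf) / beta_p   # excess return per unit of market (beta) risk
jensen_alpha = rp - (rf + beta_p * (rm - rf))  # return above CAPM prediction

print(sharpe, treynor, jensen_alpha)  # 0.5, ~0.082, 0.013
```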
- Advanced Statistical Methods
- Time series modeling
- ARCH and GARCH
- Generalized autoregressive conditional heteroskedasticity
- the volatility of a series is generally not constant over time
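A toy GARCH(1,1) simulation illustrates the point: today's variance depends on yesterday's squared return and yesterday's variance, so volatility clusters instead of staying constant. The parameter values below are illustrative only:

```python
import random
import statistics

omega, alpha, beta = 1e-5, 0.10, 0.85  # illustrative GARCH(1,1) parameters

random.seed(42)
var = omega / (1 - alpha - beta)  # start at the long-run variance
returns = []
for _ in range(1000):
    r = random.gauss(0, var ** 0.5)          # return drawn at current volatility
    returns.append(r)
    var = omega + alpha * r**2 + beta * var  # update tomorrow's variance

# Volatility drifts over time: compare standard deviations across windows.
print(statistics.pstdev(returns[:200]), statistics.pstdev(returns[-200:]))
```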
- Maximum likelihood
- work backward from the observed data to make inferences about the probability distribution that produced those outcomes
- try to find the distribution that was most likely to be the source of the outcomes
- can be applied to many different statistical problems
- can even be an alternative to least squares in performing regression
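A small sketch of the idea for a normal model, where the maximum-likelihood estimates happen to have closed forms (the data are arbitrary):

```python
import math
import statistics

data = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3]

def log_likelihood(mu, sigma, xs):
    """Log-likelihood of the data under a Normal(mu, sigma) model."""
    return sum(-0.5 * math.log(2 * math.pi * sigma**2)
               - (x - mu) ** 2 / (2 * sigma**2) for x in xs)

# For a normal model, the MLEs are the sample mean and the
# population (divide-by-n) standard deviation.
mu_hat = statistics.mean(data)
sigma_hat = statistics.pstdev(data)
print(log_likelihood(mu_hat, sigma_hat, data))

# Any other candidate parameters score a lower log-likelihood:
print(log_likelihood(mu_hat + 0.5, sigma_hat, data))
```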
- Artificial Intelligence
- Review Questions
Proceed to Appendix B: Types of Orders and Other Trader Terminology (in Kirkpatrick and Dahlquist)