Advanced Data Analysis from an Elementary Point of View
To the Reader
Concepts You Should Know
I Regression and Its Generalizations
Regression Basics
Statistics, Data Analysis, Regression
Guessing the Value of a Random Variable
Estimating the Expected Value
The Regression Function
Some Disclaimers
Estimating the Regression Function
The Bias-Variance Tradeoff
The Bias-Variance Trade-Off in Action
Ordinary Least Squares Linear Regression as Smoothing
Linear Smoothers
k-Nearest-Neighbor Regression
Kernel Smoothers
The Truth about Linear Regression
Optimal Linear Prediction: Multiple Variables
The Prediction and Its Error
Estimating the Optimal Linear Predictor
Unbiasedness and Variance of Ordinary Least Squares Estimates
Shifting Distributions, Omitted Variables, and Transformations
Changing Slopes
R²: Distraction or Nuisance?
Omitted Variables and Shifting Distributions
Errors in Variables
Adding Probabilistic Assumptions
Examine the Residuals
On Significant Coefficients
Linear Regression Is Not the Philosopher's Stone
Further Reading
Model Evaluation
What Are Statistical Models For?
Errors, In and Out of Sample
Over-Fitting and Model Selection
Data-set Splitting
k-Fold Cross-Validation (CV)
Leave-one-out Cross-Validation
Parameter Interpretation
Further Reading
Smoothing in Regression
How Much Should We Smooth?
Adapting to Unknown Roughness
Bandwidth Selection by Cross-Validation
Convergence of Kernel Smoothing and Bandwidth Scaling
Summary on Kernel Smoothing in 1D
Kernel Regression with Multiple Inputs
Interpreting Smoothers: Plots
Average Predictive Comparisons
Computational Advice: npreg
Further Reading
What Do We Mean by "Simulation"?
How Do We Simulate Stochastic Models?
Chaining Together Random Variables
Random Variable Generation
Built-in Random Number Generators
Quantile Method
Sampling Rows from Data Frames
Multinomials and Multinoullis
Probabilities of Observation
Repeating Simulations
Why Simulate?
Understanding the Model; Monte Carlo
Checking the Model
"Exploratory" Analysis of Simulations
Sensitivity Analysis
Further Reading
The Bootstrap
Stochastic Models, Uncertainty, Sampling Distributions
The Bootstrap Principle
Variances and Standard Errors
Bias Correction
Confidence Intervals
Other Bootstrap Confidence Intervals
Hypothesis Testing
Double Bootstrap Hypothesis Testing
Parametric Bootstrapping Example: Pareto's Law of Wealth Inequality
Parametric vs. Nonparametric Bootstrapping
Bootstrapping Regression Models
Re-sampling Points: Parametric Example
Re-sampling Points: Non-parametric Example
Re-sampling Residuals: Example
Bootstrap with Dependent Data
Things Bootstrapping Does Poorly
Which Bootstrap When?
Further Reading
Weighting and Variance
Weighted Least Squares
Weighted Least Squares as a Solution to Heteroskedasticity
Some Explanations for Weighted Least Squares
Finding the Variance and Weights
Conditional Variance Function Estimation
Iterative Refinement of Mean and Variance: An Example
Real Data Example: Old Heteroskedastic
Re-sampling Residuals with Heteroskedasticity
Local Linear Regression
For and Against Locally Linear Regression
Smoothing by Penalizing Curve Flexibility
The Meaning of the Splines
Computational Example: Splines for Stock Returns
Confidence Bands for Splines
Basis Functions and Degrees of Freedom
Basis Functions
Degrees of Freedom
Splines in Multiple Dimensions
Smoothing Splines versus Kernel Regression
Some of the Math Behind Splines
Further Reading
Additive Models
Additive Models
Partial Residuals and Back-fitting
Back-fitting for Linear Models
Backfitting Additive Models
The Curse of Dimensionality
Example: California House Prices Revisited
Closing Modeling Advice
Further Reading
Testing Regression Specifications
Testing Functional Forms
Examples of Testing a Parametric Model
Why Use Parametric Models At All?
Why We Sometimes Want Mis-Specified Parametric Models
Further Reading
More about Hypothesis Testing
Logistic Regression
Modeling Conditional Probabilities
Logistic Regression
Likelihood Function for Logistic Regression
Numerical Optimization of the Likelihood
Iteratively Re-Weighted Least Squares
Generalized Linear and Additive Models
Generalized Additive Models
Model Checking
Non-parametric Alternatives
A Toy Example
Weather Forecasting in Snoqualmie Falls
Logistic Regression with More Than Two Classes
GLMs and GAMs
Generalized Linear Models and Iterative Least Squares
GLMs in General
Examples of GLMs
Vanilla Linear Models
Binomial Regression
Poisson Regression
Modeling Dispersion
Likelihood and Deviance
Maximum Likelihood and the Choice of Link Function
R: glm
Generalized Additive Models
Further Reading
Prediction Trees
Regression Trees
Example: California Real Estate Again
Regression Tree Fitting
Cross-Validation and Pruning in R
Uncertainty in Regression Trees
Classification Trees
Measuring Information
Making Predictions
Measuring Error
Misclassification Rate
Average Loss
Likelihood and Cross-Entropy
Neyman-Pearson Approach
Further Reading
II Multivariate Data, Distributions, and Latent Structure
Multivariate Distributions
Review of Definitions
Multivariate Gaussians
Linear Algebra and the Covariance Matrix
Conditional Distributions and Least Squares
Projections of Multivariate Gaussians
Computing with Multivariate Gaussians
Inference with Multivariate Distributions
Model Comparison
Density Estimation
Histograms Revisited
"The Fundamental Theorem of Statistics"
Error for Density Estimates
Error Analysis for Histogram Density Estimates
Kernel Density Estimates
Analysis of Kernel Density Estimates
Joint Density Estimates
Categorical and Ordered Variables
Kernel Density Estimation in R: An Economic Example
Conditional Density Estimation
Practicalities and a Second Example
More on the Expected Log-Likelihood Ratio
Simulating from Density Estimates
Simulating from Kernel Density Estimates
Sampling from a Joint Density
Sampling from a Conditional Density
Drawing from Histogram Estimates
Examples of Simulating from Kernel Density Estimates
Relative Distributions and Smooth Tests
Smooth Tests of Goodness of Fit
From Continuous CDFs to Uniform Distributions
Testing Uniformity
Neyman's Smooth Test
Choice of Function Basis
Choice of Number of Basis Functions
Application: Combining p-Values
Density Estimation by Series Expansion
Smooth Tests of Non-Uniform Parametric Families
Estimated Parameters
Implementation in R
Some Examples
Conditional Distributions and Calibration
Relative Distributions
Estimating the Relative Distribution
R Implementation and Examples
Example: Conservative versus Liberal Brains
Example: Economic Growth Rates
Adjusting for Covariates
Example: Adjusting Growth Rates
Further Reading
Principal Components Analysis
Mathematics of Principal Components
Minimizing Projection Residuals
Maximizing Variance
More Geometry; Back to the Residuals
Scree Plots
Statistical Inference, or Not
Example 1: Cars
Example 2: The United States circa 1977
Latent Semantic Analysis
Principal Components of the New York Times
PCA for Visualization
PCA Cautions
Further Reading
Factor Analysis
From PCA to Factor Analysis
Preserving Correlations
The Graphical Model
Observables Are Correlated Through the Factors
Geometry: Approximation by Linear Subspaces
Roots of Factor Analysis in Causal Discovery
Degrees of Freedom
A Clue from Spearman's One-Factor Model
Estimating Factor Loadings and Specific Variances
Maximum Likelihood Estimation
Alternative Approaches
Estimating Factor Scores
The Rotation Problem
Factor Analysis as a Predictive Model
How Many Factors?
R² and Goodness of Fit
Factor Models versus PCA Once More
Examples in R
Example 1: Back to the US circa 1977
Example 2: Stocks
Reification, and Alternatives to Factor Models
The Rotation Problem Again
Factors or Mixtures?
The Thomson Sampling Model
Further Reading
Nonlinear Dimensionality Reduction
Why We Need Nonlinear Dimensionality Reduction
Local Linearity and Manifolds
Locally Linear Embedding (LLE)
Finding Neighborhoods
Finding Weights
k > p
Finding Coordinates
More Fun with Eigenvalues and Eigenvectors
Finding the Weights
k > p
Finding the Coordinates
Finding the Nearest Neighbors
Calculating the Weights
Calculating the Coordinates
Diffusion Maps
Diffusion-Map Coordinates
Fun with Transition Matrices
Multiple Scales
Choosing q
What to Do with the Diffusion Map Once You Have It
Spectral Clustering
The Kernel Trick
Mixture Models
Two Routes to Mixture Models
From Factor Analysis to Mixture Models
From Kernel Density Estimates to Mixture Models
Mixture Models
Probabilistic Clustering
Estimating Parametric Mixture Models
More about the EM Algorithm
Further Reading on and Applications of EM
Topic Models and Probabilistic LSA
Non-parametric Mixture Modeling
Worked Computational Example
Mixture Models in R
Fitting a Mixture of Gaussians to Real Data
Calibration-checking for the Mixture
Selecting the Number of Components by Cross-Validation
Interpreting the Mixture Components, or Not
Hypothesis Testing for Mixture-Model Selection
Graphical Models
Conditional Independence and Factor Models
Directed Acyclic Graph (DAG) Models
Conditional Independence and the Markov Property
Examples of DAG Models and Their Uses
Missing Variables
Non-DAG Graphical Models
Undirected Graphs
Directed but Cyclic Graphs
Further Reading
III Causal Inference
Graphical Causal Models
Causation and Counterfactuals
Causal Graphical Models
Calculating the "effects of causes"
Back to Teeth
Conditional Independence and d-Separation
D-Separation Illustrated
Linear Graphical Models and Path Coefficients
Positive and Negative Associations
Independence and Information
Further Reading
Identifying Causal Effects
Causal Effects, Interventions and Experiments
The Special Role of Experiment
Identification and Confounding
Identification Strategies
The Back-Door Criterion: Identification by Conditioning
The Entner Rules
The Front-Door Criterion: Identification by Mechanisms
The Front-Door Criterion and Mechanistic Explanation
Instrumental Variables
Some Invalid Instruments
Critique of Instrumental Variables
Failures of Identification
Further Reading
Estimating Causal Effects
Estimators in the Back- and Front- Door Criteria
Estimating Average Causal Effects
Avoiding Estimating Marginal Distributions
Propensity Scores
Matching and Propensity Scores
Instrumental-Variables Estimates
Uncertainty and Inference
Further Reading
Discovering Causal Structure
Testing DAGs
Testing Conditional Independence
Faithfulness and Equivalence
Partial Identification of Effects
Causal Discovery with Known Variables
The PC Algorithm
Causal Discovery with Hidden Variables
On Conditional Independence Tests
Software and Examples
Limitations on Consistency of Causal Discovery
Further Reading
Pseudo-code for the SGS and PC Algorithms
The SGS Algorithm
The PC Algorithm
Experimental Causal Inference
Why Experiment?
Basic Ideas Guiding Experimental Design
How Randomization Solves the Causal Identification Problem
Randomization and Linear Models
Randomization and Non-Linear Models
Modes of Randomization
IID Assignment
Planned Assignment
Perspectives: Units vs. Treatments
Choice of Levels
Parameter Estimation or Prediction
Maximizing Yield
Model Discrimination
Multiple Goals
Multiple Manipulated Variables
Factorial Designs
Within-Subject Designs
Summary on the elements of an experimental design
"What the experiment died of"
Further Reading
IV Dependent Data
Time Series
Time Series, What They Are
The Ergodic Theorem
The World's Simplest Ergodic Theorem
Rate of Convergence
Why Ergodicity Matters
Markov Models
Meaning of the Markov Property
Autoregressive Models
Autoregressions with Covariates
Additive Autoregressions
Linear Autoregression
"Unit Roots" and Stationary Solutions
Conditional Variance
Regression with Correlated Noise; Generalized Least Squares
Bootstrapping Time Series
Parametric or Model-Based Bootstrap
Block Bootstraps
Sieve Bootstrap
Trends and De-Trending
Forecasting Trends
Seasonal Components
Detrending by Differencing
Cautions with Detrending
Bootstrapping with Trends
Further Reading
Time Series with Latent Variables
Simulation-Based Inference
The Method of Simulated Moments
The Method of Moments
Adding in the Simulation
An Example: Moving Average Models and the Stock Market
Appendix: Some Design Notes on the Method of Moments Code
Longitudinal, Spatial and Network Data
V Data-Analysis Problem Sets
What's That Got to Do with the Price of Condos in California?
The Advantages of Backwardness
The Size of a Cat's Heart
It's Not the Heat that Gets You
Nice Demo City, but Will It Scale?
Version 1
Tasks and Questions
Version 2
Fair's Affairs
How the North American Paleofauna Got a Crook in Its Regression Line
How the Hyracotherium Got Its Mass
How the Recent Mammals Got Their Size Distribution
Red Brain, Blue Brain
Patterns of Exchange
Is This Assignment Really Necessary?
Mystery Multivariate Analysis
Separated at Birth
Brought to You by the Letters D, A and G
Estimating with DAGs
Use and Abuse of Conditioning
What Makes the Union Strong?
An Insufficiently Random Walk Down Wall Street
Macroeconomic Forecasting
Debt Needs Time for What It Kills to Grow In
Linear Algebra
Eigenvalues and Eigenvectors of Matrices
Singular Value Decomposition
Square Root of a Matrix
Special Kinds of Matrix
Orthonormal Bases
Orthogonal Projections
Function Spaces
Eigenvalues and Eigenfunctions of Operators
Further Reading
Big O and Little o Notation
Taylor Expansions
Propagation of Error
Optimization Theory
Basic Concepts of Optimization
Small-Noise Asymptotics for Optimization
Application to Maximum Likelihood
Aside: The Akaike Information Criterion
Constrained and Penalized Optimization
Constrained Optimization
Lagrange Multipliers
Penalized Optimization
Constrained Linear Regression
Statistical Remark: "Ridge Regression" and "The Lasso"
Optimization Methods
Optimization with First- and Second- Derivatives
Gradient Descent
Newton's Method
Newton's Method in More than One Dimension
Stochastic Approximation
Stochastic Newton's Method
Pros and Cons of Stochastic Gradient Methods
Derivative-Free Optimization Techniques
Nelder-Mead, a.k.a. the Simplex Method
Simulated Annealing
Methods for Constraints
R Notes
χ² and Likelihood Ratios
Proof of the Gauss-Markov Theorem
Rudimentary Graph Theory
Uncorrelated versus Independent
Information Theory
First Example: Pareto Quantiles
Functions Which Call Functions
Sanity-Checking Arguments
Layering Functions and Debugging
More on Debugging
Automating Repetition and Passing Arguments
Avoiding Iteration: Manipulating Objects
ifelse and which
apply and Its Variants
More Complicated Return Values
Re-Writing Your Code: An Extended Example
General Advice on Programming
Comment your code
Use meaningful names
Check whether your program works
Avoid writing the same thing twice
Start from the beginning and break it down
Break your code into many short, meaningful functions
Further Reading
Generating Random Variables
Rejection Method
The Metropolis Algorithm and Markov Chain Monte Carlo
Generating Uniform Random Numbers
Further Reading