Machine Learning
Principal Component Analysis (PCA)
Kernel Principal Component Analysis (Kernel PCA)
Tree-Based Methods
Gradient Boosting Decision Tree (GBDT)
XGBoost
LightGBM
Upcoming Topics...
Logistic Regression
K-Nearest Neighbor (KNN)
Support Vector Machine (SVM)
.
.
.
Introduction to Deep Learning
Description: A simple graph demonstrating how MLPs and RNNs (LSTM, GRU) work, including the mathematical derivations of the feedforward pass, backpropagation, and backpropagation through time.
MLP (Multilayer Perceptron)
RNN (Recurrent Neural Network)
LSTM (Long Short-Term Memory) & GRU (Gated Recurrent Unit)
Attention
Transformer
.
.
.
Statistical Foundation of Data Science (A Guide)
Set 1 Probability Theory and Mathematical Statistics (I)
- Discrete Random Variables
- Continuous Random Variables
- Functions of a Random Variable
- Joint Distributions
- Independent Random Variables
- Conditional Distributions
- Functions of Jointly Distributed Random Variables
- Extrema and Order Statistics
- Expected Values
- Limit Theorems
Set 2 Probability Theory and Mathematical Statistics (II)
- Examples & Reasons for Fitting Distribution
- Parameter Estimation
- The Method of Moments
- The Method of Maximum Likelihood
- Maximum Likelihood Estimates of Multinomial Cell Probabilities
- Large Sample Theory for Maximum Likelihood Estimates
- Confidence Intervals from Maximum Likelihood Estimates
- Efficiency and the Cramér-Rao Lower Bound
- Ancillary Statistics
- Sufficient Statistics
- A Factorization Theorem
- The Rao-Blackwell Theorem
- The Delta Method
Set 3 The Bayesian Approach to Parameter Estimation
- Example of Bayesian Inference
- Bayesian Point Estimation and Interval Estimation
- Large Sample Normal Approximation to the Posterior
- Computational Aspects: Gibbs Sampling, Markov Chain Monte Carlo
- Bayesian Testing Procedures
Set 4 Testing Hypotheses and Assessing Goodness of Fit
- Basics of Hypothesis Testing
- The Neyman-Pearson Paradigm
- Specification of the Significance Level and the Concept of a p-value
- Uniformly Most Powerful Tests
- The Duality of Confidence Intervals and Hypothesis Tests
- Generalized Likelihood Ratio Tests
- Likelihood Ratio Tests for the Multinomial Distribution
- The Poisson Dispersion Test
- Probability Plots
- Tests for Normality
Set 5 Nonparametric Statistics
- Nonparametric hypothesis testing:
  Permutation testing, Rank-based tests: Mann-Whitney Test, Wilcoxon Rank Sum Test
- Empirical distributions and the plug-in principle:
  Empirical CDF, empirical distributions, convergence theorems, Monte Carlo integration
- Density estimation:
  Histogram estimators, Kernel density estimators
- Nonparametric regression:
  Definitions, Linear regression: Regressograms, Kernel regression: Nadaraya-Watson kernel regression, Cross-validation, Curse of dimensionality
Set 6 Bootstrap
- Motivation for bootstrap
- Bootstrap basics
- Bootstrap confidence intervals
- Other uses of bootstrap
- Quantifying uncertainty more generally
Set 7 Monte Carlo Sampling
- Motivation from Bayesian inference
- Monte Carlo Methods:
  - Direct Sampling
  - Rejection Sampling
  - Importance Sampling
  - Markov Chain Monte Carlo
- Cont.