
Machine Learning

Principal Component Analysis (PCA)

Kernel Principal Component Analysis (Kernel PCA)

K-means

Gaussian Mixture Model

Tree-Based Methods

Decision Tree

Gradient Boosting Decision Tree

XGBoost

LightGBM

Upcoming Topics...

Logistic Regression

K-Nearest Neighbors (KNN)

Support Vector Machine (SVM)

.
.
.

Introduction to Deep Learning

Description: Simple diagrams showing how MLPs and RNNs (LSTM, GRU) work, with mathematical derivations of the feedforward pass, backpropagation, and backpropagation through time.

MLP (Multilayer Perceptron)

RNN (Recurrent Neural Network)

LSTM (Long Short-Term Memory) & GRU (Gated Recurrent Unit)

Attention

Transformer

.
.
.

Statistical Foundation of Data Science (A Guide)

Set 1 Probability Theory and Mathematical Statistics (I)

Discrete Random Variables
Continuous Random Variables
Functions of a Random Variable
Joint Distributions
Independent Random Variables
Conditional Distributions
Functions of Jointly Distributed Random Variables
Extrema and Order Statistics
Expected Values
Limit Theorems

Set 2 Probability Theory and Mathematical Statistics (II)

Examples & Reasons for Fitting Distribution
Parameter Estimation
The Method of Moments
The Method of Maximum Likelihood
Maximum Likelihood Estimates of Multinomial Cell Probabilities
Large Sample Theory for Maximum Likelihood Estimates
Confidence Intervals from Maximum Likelihood Estimates
Efficiency and the Cramér-Rao Lower Bound
Ancillary Statistics
Sufficient Statistics
A Factorization Theorem
The Rao-Blackwell Theorem
The Delta Method

Set 3 The Bayesian Approach to Parameter Estimation

Example of Bayesian Inference
Bayesian Point Estimation and Interval Estimation
Large Sample Normal Approximation to the Posterior
Computational Aspects: Gibbs Sampling, Markov Chain Monte Carlo
Bayesian Testing Procedures

Set 4 Testing Hypotheses and Assessing Goodness of Fit

Basics of Hypothesis Testing
The Neyman-Pearson Paradigm
Specification of the Significance Level and the Concept of a p-value
Uniformly Most Powerful Tests
The Duality of Confidence Intervals and Hypothesis Tests
Generalized Likelihood Ratio Tests
Likelihood Ratio Tests for the Multinomial Distribution
The Poisson Dispersion Test
Probability Plots
Tests for Normality

Set 5 Nonparametric Statistics

Nonparametric hypothesis testing: permutation testing; rank-based tests (Mann-Whitney test, Wilcoxon rank-sum test)
Empirical distributions and the plug-in principle: empirical CDF, empirical distributions, convergence theorems, Monte Carlo integration
Density estimation: histogram estimators, kernel density estimators
Nonparametric regression: definitions; linear regression (regressograms); kernel regression (Nadaraya-Watson kernel regression); cross-validation; curse of dimensionality

Set 6 Bootstrap

Motivation for bootstrap
Bootstrap basics
Bootstrap confidence intervals
Other uses of bootstrap
Quantifying uncertainty more generally

Set 7 Monte Carlo Sampling

Motivation from Bayesian inference
Monte Carlo Methods:
Direct Sampling
Rejection Sampling
Importance Sampling
Markov Chain Monte Carlo
Cont.

GitHub Resources for Sets 4-7
.
.
.

Chuizixiaoxing

Statistics