Need: Feature selection, the process of finding and selecting the most useful features in a dataset, is a crucial step of the machine learning pipeline. Unnecessary features slow down training, reduce model interpretability, and, most importantly, hurt generalization performance on the test set.

Approach: Feature selection techniques fall into a few main families: supervised methods (filter-based, wrapper-based, and embedded), feature importance methods, and unsupervised methods based on dimensionality reduction.

 

Filter-based: Filter methods select or discard entire features (columns, essentially) based on whether they pass a threshold on some statistical metric, computed without training any model. For example, if the variance of a column is too low (meaning its values are nearly constant), we have good reason to leave that column out of the model, since it carries almost no information. The same logic applies with other metrics; a sketch follows the list below.

  • t-test / ANOVA
  • Fisher’s Score
  • Correlation Coefficient
  • Chi-Square Test
  • Information Gain, etc.
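
As a minimal sketch of the filter approach, the snippet below uses scikit-learn on the built-in iris dataset (an illustrative choice, not from the original post): a variance threshold drops near-constant columns, and a chi-square score keeps the top-scoring features.

    # Filter-based selection: score each feature independently of any model.
    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SelectKBest, VarianceThreshold, chi2

    X, y = load_iris(return_X_y=True)

    # Drop near-constant columns (variance below 0.2) before scoring.
    X_var = VarianceThreshold(threshold=0.2).fit_transform(X)

    # Score the remaining features with the chi-square statistic; keep the best 2.
    selector = SelectKBest(score_func=chi2, k=2)
    X_selected = selector.fit_transform(X_var, y)

    print(selector.scores_)    # per-feature chi-square scores
    print(X_selected.shape)    # (150, 2)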

Wrapper-based: Wrapper methods cover several techniques that fit a model on candidate subsets of features and use its performance score (for example, R-squared in regression) to decide whether a feature should be kept. They work by iteratively adding or removing features while monitoring the change in score, which is why many of them run recursively (see the sketch after the list):

  • Exhaustive Feature Selection
  • Forward Selection
  • Backward Elimination / Recursive Feature Elimination (RFE)
  • Stepwise Regression
  • Bi-directional Elimination
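
A minimal sketch of a wrapper method, assuming a generic synthetic dataset: Recursive Feature Elimination (RFE) in scikit-learn repeatedly fits a model and discards the weakest feature until the requested number remains.

    # Wrapper-based selection: let a model's fit decide which features survive.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=500, n_features=10,
                               n_informative=4, random_state=0)

    # Iteratively fit the model and eliminate the weakest feature until 4 remain.
    rfe = RFE(estimator=LogisticRegression(max_iter=1000),
              n_features_to_select=4)
    rfe.fit(X, y)

    print(rfe.support_)    # boolean mask of the kept features
    print(rfe.ranking_)    # 1 = selected; higher = eliminated earlier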

Embedded: Embedded methods perform feature selection as part of training the model itself. This is common in linear models and neural networks, where regularization automatically shrinks or zeroes out the weights of uninformative features (see the sketch after the list):

  • L1 Regularization (Lasso)
  • L2 Regularization (Ridge)
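
A minimal sketch of the embedded approach, again on an assumed synthetic dataset: with L1 regularization (Lasso), training itself drives the coefficients of uninformative features exactly to zero, which amounts to deselecting them.

    # Embedded selection: L1 regularization zeroes out weak coefficients during training.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso

    X, y = make_regression(n_samples=300, n_features=10,
                           n_informative=3, noise=5.0, random_state=0)

    lasso = Lasso(alpha=1.0)
    lasso.fit(X, y)

    kept = np.flatnonzero(lasso.coef_)    # indices of features with nonzero weight
    print(kept, lasso.coef_[kept])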

Feature importance methods: Feature importance refers to techniques that assign a score to each input feature based on how useful it is for predicting the target variable. Common examples include (a sketch follows the list):

  • Linear Regression Feature Importance
  • Logistic Regression Feature Importance
  • Decision Tree Feature Importance
  • Random Forest Feature Importance, etc.
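
As one example from the list, the sketch below fits a random forest on a synthetic dataset (an assumed stand-in, not data from the post) and reads off its impurity-based importance scores; linear and logistic regression expose a comparable signal through their coefficient magnitudes.

    # Feature importance: a fitted random forest scores every input feature.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=500, n_features=8,
                               n_informative=3, random_state=0)

    forest = RandomForestClassifier(n_estimators=200, random_state=0)
    forest.fit(X, y)

    # One impurity-based importance score per input feature.
    for i, score in enumerate(forest.feature_importances_):
        print(f"feature {i}: {score:.3f}")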

Dimensionality Reduction (Unsupervised Methods)

One can describe Principal Components Regression (PCR) as an approach for deriving a low-dimensional set of features from a large set of variables. The idea is that the principal components capture most of the variance in the data through linear combinations of the original variables along successive orthogonal directions. In this way, we can also combine the effects of correlated variables and extract more information from the available data. A sketch follows.
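
A minimal sketch of Principal Components Regression under the same assumption of a generic synthetic dataset: standardize the predictors, project them onto a few orthogonal principal components, and regress the target on those components.

    # Principal Components Regression: PCA for features, then an ordinary regression.
    from sklearn.datasets import make_regression
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_regression(n_samples=300, n_features=20,
                           n_informative=5, noise=10.0, random_state=0)

    # Keep 5 orthogonal components, then fit the regression on them.
    pcr = make_pipeline(StandardScaler(), PCA(n_components=5),
                        LinearRegression())
    pcr.fit(X, y)

    print(pcr.named_steps["pca"].explained_variance_ratio_)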

 

For code, see: https://github.com/drstatsvenu/feature-selection

 


Venugopal Manneni


I hold a doctorate in statistics from Osmania University and have been working in the fields of analytics and research for the last 15 years. My expertise is in architecting solutions to data-driven problems using statistical methods, machine learning, and deep learning algorithms, for both structured and unstructured data. I have also published papers in these fields. I love to play cricket and badminton.

