Machine Learning

Using UMAP preprocessing for image classification

Uniform manifold approximation and projection, or UMAP for short, is a dimension reduction technique. In essence, UMAP projects a set of features into a lower-dimensional space.

Explore data using PCA

Principal component analysis (PCA) is a dimension reduction technique. If we have a large number of predictors, instead of using all of them for modelling or other analysis, we can compress the information from those variables into a new, smaller set of variables.
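
As a rough illustration of that compression, the sketch below builds three correlated predictors from a single hidden signal and replaces them with one new variable, the first principal component. The data, the function names (`covariance_matrix`, `first_component`) and the use of power iteration are all assumptions made for this example. It is written in Python with only the standard library and is not taken from the post itself.

```python
import random

def covariance_matrix(X):
    """Return the mean-centered rows and the sample covariance matrix."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in X]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    return centered, cov

def first_component(cov, iterations=100):
    """Approximate the leading eigenvector of cov by power iteration."""
    p = len(cov)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient: the variance captured along v.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return v, eigenvalue

# Hypothetical data: three predictors driven by one latent signal plus noise.
rng = random.Random(1)
X = []
for _ in range(200):
    z = rng.gauss(0, 1)
    X.append([z + rng.gauss(0, 0.1),
              2 * z + rng.gauss(0, 0.1),
              -z + rng.gauss(0, 0.1)])

centered, cov = covariance_matrix(X)
v, eigenvalue = first_component(cov)

# Share of the total variance kept by the single new variable.
total_variance = sum(cov[j][j] for j in range(len(cov)))
explained = eigenvalue / total_variance

# The new variable: each row projected onto the first component.
scores = [sum(r[j] * v[j] for j in range(len(v))) for r in centered]
print(f"variance explained by the first component: {explained:.3f}")
```

Because the three predictors here are almost copies of one signal, a single component keeps nearly all of the variance. Real data usually needs more components, and a library implementation is preferable to hand-rolled power iteration.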

A short note on variable selection

Variable selection, also called feature selection, is an important step in both machine learning and statistical analysis. This post is geared more towards the machine learning side.

Variable selection using genetic algorithm

Background: A genetic algorithm is inspired by natural selection, the process by which the fittest individuals are selected to reproduce. The algorithm has been used in optimization and search problems, and it can also be used for variable selection.
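
The idea can be sketched in a few lines of Python using only the standard library. Everything here is an assumption made for illustration: the synthetic data (only features 0 and 2 drive the outcome), the fitness function (absolute correlation with the outcome minus a fixed penalty per selected feature), truncation selection, one-point crossover and the parameter values. The post itself may use a different fitness measure and different operators.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def fitness(mask, relevance, penalty=0.25):
    """Hypothetical fitness: reward relevant features, penalize each one kept."""
    return sum(r for r, keep in zip(relevance, mask) if keep) - penalty * sum(mask)

def select_variables(relevance, rng, pop_size=30, generations=40,
                     mutation_rate=0.1):
    n = len(relevance)
    # Each individual is a 0/1 mask over the candidate variables.
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(m, relevance),
                        reverse=True)
        parents = ranked[:pop_size // 2]   # the fittest half reproduces
        children = [ranked[0][:]]          # elitism: keep the current best
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)    # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = children
    return max(population, key=lambda m: fitness(m, relevance))

# Synthetic data: six candidate features, only features 0 and 2 matter.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [2.0 * row[0] - 3.0 * row[2] + rng.gauss(0, 0.5) for row in X]

relevance = [abs(pearson([row[j] for row in X], y)) for j in range(6)]
best_mask = select_variables(relevance, rng)
selected = [j for j, keep in enumerate(best_mask) if keep]
print("selected feature indices:", selected)
```

In practice the fitness would more likely be a cross-validated model score than a simple correlation sum. The loop of selection, crossover and mutation stays the same.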

Hyperparameter tuning in tidymodels

This post will not go into much detail on each approach to hyperparameter tuning. It mainly aims to summarize a few things that I have studied over the last couple of days.

Handling imbalanced data

Overview: Imbalanced data occurs when the classes of a categorical outcome variable are unequally distributed. It can arise for several reasons, such as biased sampling methods and measurement errors.
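
A quick way to see whether an outcome is imbalanced is to look at the class proportions. The sketch below is in Python with only the standard library. The labels, the 90/10 split and the helper names (`class_distribution`, `imbalance_ratio`) are made up for illustration.

```python
from collections import Counter

def class_distribution(labels):
    """Proportion of observations in each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: count / total for cls, count in counts.items()}

def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Hypothetical binary outcome: 90 negatives and 10 positives.
labels = ["no"] * 90 + ["yes"] * 10

proportions = class_distribution(labels)
ratio = imbalance_ratio(labels)
print(proportions)
print(ratio)
```

A ratio close to 1 indicates a balanced outcome. The larger the ratio, the more a model can look accurate just by predicting the majority class.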