Machine Learning 👩‍💻

In this repository we present from-scratch implementations of the classical machine learning algorithms, using only numpy and Python's most basic libraries. The algorithms covered are listed in the Models section below.

For each algorithm we create a dedicated folder containing the from-scratch Python implementation along with a Jupyter notebook that trains and tests our code on some open source datasets. To give a better understanding of each algorithm, every folder also contains a readme.md file that walks through the mathematics behind it.

In the datasets/ folder you will find several classic open source datasets that we use to train and test our models. In addition, a Python module called utils.py provides some useful functions for working with the datasets.

Note: this repository includes an implementation of a vanilla neural network in the mlp (multi layer perceptron) folder, but we won't go further into deep learning. More deep learning code will be shared in other upcoming repositories.

Models 🗳️

Decision Tree 🍃

In progress.

K-Means 🥝

K-Means is a very simple-to-understand clustering algorithm. We start by setting the parameter K, which represents the number of clusters we are looking for. Then we initialize K points at random as our cluster centroids. As its name suggests, a centroid is simply the center point of a cluster. Once our centroids are randomly chosen, we compute the Euclidean distance from each point to every centroid. We form the clusters by assigning each point to its closest centroid. This gives us K groups of data, and we compute the center of each group, which becomes its new centroid. Then we form the clusters again, recompute the centroids, and so on, typically until the centroids stop moving.
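The loop described above can be sketched in a few lines of numpy. This is only a minimal illustration, not the code from this repository: the names `assign_clusters` and `k_means`, the `n_iters` and `seed` parameters, the choice of initial centroids among the data points, and the stopping test are all assumptions made for the example.

```python
import numpy as np

def assign_clusters(X, centroids):
    """Return, for each point, the index of its closest centroid."""
    # Euclidean distance from every point to every centroid, shape (n_points, k).
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

def k_means(X, k, n_iters=100, seed=0):
    """Minimal K-Means sketch (hypothetical helper): returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Assumption: initialise the centroids with k distinct data points drawn at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        labels = assign_clusters(X, centroids)
        # New centroid = mean of the points in the cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, assign_clusters(X, centroids)
```

For instance, `k_means(np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]), k=2)` groups the first two points together and the last two together.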

More details and illustrations on the K-Means algorithm will soon be available in the coming k_mean/readme.md file.

Linear Regression 📈

Linear regression is probably the most common machine learning algorithm. Most of us had used it even before starting to learn data science or artificial intelligence. This algorithm deals with regression problems; it can also be applied to classification, but it is less relevant there. The assumption made is that the output $y$ of an input $x$ is a linear combination of that input with a set of parameters $\omega$. You'll find more details on linear regression in the linear_regression/readme.md file.

Our code is written so that the model can also be used for multi-linear regression.
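As an illustration of the linear assumption, here is a minimal sketch that fits $\omega$ by ordinary least squares with several input features. It is not the repository's LinearRegression() class: the function names `fit_linear_regression` and `predict_linear`, and the use of `np.linalg.lstsq` rather than whatever solver the repository uses, are assumptions made for the example.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Least-squares fit (hypothetical helper). Returns omega with the intercept first."""
    # Prepend a column of ones so that omega[0] plays the role of the intercept.
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])
    omega, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    return omega

def predict_linear(X, omega):
    """Prediction is a linear combination of the inputs with the parameters omega."""
    return omega[0] + X @ omega[1:]
```

Because `X` can have any number of columns, the same two functions cover both simple and multi-linear regression.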

Linear Regression Regularized 👮

A classic linear regression model can easily suffer from over-fitting, especially when we deal with polynomial regression, which is not yet presented in this machine_learning repository. To avoid such behaviour we can use regularization to keep the parameters $\omega$ from growing too large. Our LinearRegression() class implemented in linear_regression/ supports regularization. To enable it, pass the regularization_type and regularization_coef arguments to either the __init__() or the fit() method.
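To make the effect of regularization concrete, here is a minimal sketch of one common choice, an L2 (ridge) penalty solved in closed form. It does not reproduce the LinearRegression() class or its regularization_type options: the function name `fit_ridge`, its `coef` parameter, and the decision to leave the intercept unpenalized are assumptions made for the example.

```python
import numpy as np

def fit_ridge(X, y, coef=1.0):
    """L2-regularized least squares (hypothetical helper). Returns omega, intercept first."""
    # Prepend a column of ones so that omega[0] plays the role of the intercept.
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])
    # Penalty matrix coef * I, with the intercept entry set to 0 (assumption:
    # the intercept is not regularized).
    penalty = coef * np.eye(X_b.shape[1])
    penalty[0, 0] = 0.0
    # Solve (X^T X + penalty) omega = X^T y.
    return np.linalg.solve(X_b.T @ X_b + penalty, X_b.T @ y)
```

With `coef=0` this reduces to the unregularized least-squares solution; increasing `coef` shrinks the non-intercept parameters toward zero.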

Logistic Regression 🤹

Logistic regression is probably the most famous classification algorithm. It is quite similar to linear regression, except that we pass its result through a sigmoid function that squashes the output into the range between 0 and 1. We can then interpret this result as a probability and assign to any $x$ the class $C_k$ with the highest probability. Once again, please refer to the logistic_regression/readme.md file for the full mathematical overview.

Note: the current implementation only supports binary classification. This is quite a restriction, so we will soon improve it to be multi-class compatible.
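The binary case can be sketched as a linear model followed by a sigmoid, trained with plain gradient descent on the cross-entropy loss. This is an illustrative sketch rather than the repository's implementation: the function names, the learning rate `lr`, the iteration count `n_iters`, and the 0.5 decision threshold are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    """Map any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, lr=0.1, n_iters=1000):
    """Binary logistic regression by gradient descent (hypothetical helper)."""
    # Prepend a column of ones so that omega[0] plays the role of the intercept.
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])
    omega = np.zeros(X_b.shape[1])
    for _ in range(n_iters):
        p = sigmoid(X_b @ omega)              # predicted probability of class 1
        gradient = X_b.T @ (p - y) / len(y)   # gradient of the mean cross-entropy
        omega -= lr * gradient
    return omega

def predict_proba(X, omega):
    """Probability that each row of X belongs to class 1."""
    return sigmoid(omega[0] + X @ omega[1:])

def predict(X, omega, threshold=0.5):
    """Assign class 1 when its probability reaches the threshold, else class 0."""
    return (predict_proba(X, omega) >= threshold).astype(int)
```

With two classes, picking the class with the highest probability is the same as thresholding the class-1 probability at 0.5, which is what `predict` does by default.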

Linear Discriminant Analysis 🕵

Linear Discriminant Analysis is a fundamental classification model that can also be used for dimension reduction. In this repository we only tackle classification for the moment. It is not a hard algorithm to understand, and it is well worth taking a bit of time to step through how it is built. This is why we recommend that you look at the implementation in LDA.py, work through the case study in the notebook LDANotebook.ipynb, which applies LDA to classification on the iris dataset, and glance at the readme.md for a global overview of how the algorithm works.

Multi Layer Perceptron ⛓️

First draft of code available and description coming very soon

Naïve Bayes ⛵

Code available and description coming very soon

[Quadratic Discriminant Analysis] 🚧

Coming soon

Random Forest 🌳

First draft of code available and description coming very soon
