Showing posts from July, 2019

Ubuntu Cheat Sheet

Note: This blog is still under construction.

This post lists some of the common commands and tasks you are likely to perform every day on an Ubuntu machine. It is mainly for beginners who have just moved from Windows to Ubuntu.

How to list the installed software in Ubuntu?
    dpkg --list

How to install a package?
    sudo apt-get install <package_name>

How to completely remove a package from an Ubuntu system?
    sudo apt-get purge <package_name>

xkill

Machine Learning Basics

In this blog I give an overview of the Machine Learning project flow. Every Machine Learning project involves the steps below:

Understand the client requirement / problem statement

Data Understanding
Data Collection (CSV files, logs, sensor data, data from SQL, etc.)
Data Exploration
Data Quality Analysis: Analyse the data to confirm that sufficient information is available to plan the ML model.

Data Preparation
Cleaning the data: Check for NULL and NA values in the dataset, and take the necessary action.
    a. Remove the affected samples if the dataset is huge and removing them doesn't affect the quality of the data.
    b. Impute the missing values with the mean, the median, or KNN.
Outliers: These might be due to human error. They can be checked using a boxplot or the summary statistics of the data. Remove or replace them accordingly.
Sample Distribution: Check how the features are distributed using a histogram.
Divide the data into training and test sets.

Feature Selection :  Thi
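
The data preparation steps above (imputing missing values, checking for outliers, and splitting into train and test sets) can be sketched in a few lines. This is only a minimal illustration using the Python standard library, not the workflow of any particular project: the `ages` column, the helper names, the 1.5 x IQR threshold (the rule commonly used for boxplot whiskers), and the 80/20 split are all assumptions made for the example.

```python
import random
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical feature column: two missing values and one likely typo (240).
ages = [23, 25, None, 31, 29, 27, 240, 26, None, 30]

clean = impute_median(ages)          # step b: impute with the median
outliers = iqr_outliers(clean)       # boxplot-style outlier check
train, test = train_test_split(clean, test_fraction=0.2, seed=42)

print("imputed:", clean)
print("outliers:", outliers)
print("train size:", len(train), "test size:", len(test))
```

The median is used for imputation here because, unlike the mean, it is not pulled upward by the suspicious value 240. In a real project you would more likely reach for a data-frame library, and you would decide whether to remove or replace each outlier based on what you know about the data.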