So you have a dataset and you are trying to make sense of it? Let's start with the very basics and understand what exactly your data consists of.
A dataset is a set of Entities. For a sales dataset, the entities are the individual sales; for a university dataset, they are the student records.
These entities are represented by Attributes (also known as columns, features, or variables). These represent specific information collected for an entity. For example, if we are looking at a sales dataset, the entity will be a record that represents the…
Working with datasets can be tedious, and the first issue you are likely to encounter is missing data. While a data scientist cannot always eradicate the problem at the root, how to deal with it is within their control. Various methods can be used to impute missing data, where imputing means replacing the missing data points with substitute values.
Here are some methods that can be leveraged -
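As a rough illustration of imputation, the sketch below fills gaps with the mean, median, or mode of the observed values. These three strategies are an assumption chosen for illustration, not necessarily the methods the post goes on to list, and the `impute` helper and the sample columns are hypothetical.

```python
from statistics import mean, median, mode


def impute(values, strategy="mean"):
    """Return a copy of `values` with None entries replaced by a summary statistic."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    elif strategy == "mode":
        fill = mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Keep observed values untouched; only the gaps receive the fill value.
    return [fill if v is None else v for v in values]


ages = [25, None, 31, 40, None]            # hypothetical numeric column
cities = ["Pune", "Delhi", None, "Pune"]   # hypothetical categorical column

ages_mean = impute(ages, "mean")        # gaps filled with (25 + 31 + 40) / 3
ages_median = impute(ages, "median")    # gaps filled with the middle observed value
cities_mode = impute(cities, "mode")    # gap filled with the most frequent city
```

Mean and median only make sense for numeric columns, while the mode also works for categorical ones. Which one is appropriate depends on the data.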
With convolutional neural networks (CNNs), we can create deep learning models that are very good at classifying images and can be scaled to process a wide variety of image data. When processing sequence data, such as text, time-series, or audio data, where the information spans time and forms a sequence, a CNN may not be a good choice.
In the case of sequential data, where the next set of data can depend on the previous set, some sort of history has to be maintained, which can be leveraged…
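One way to picture "maintaining history" is a state value that is updated at every step and fed back into the next one. The sketch below assumes a simple recurrent update with a tanh activation. The excerpt does not name a specific technique, so the function names and the weights `w_x`, `w_h`, and `b` are hypothetical.

```python
import math


def recurrent_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """Combine the current input with the state carried over from earlier steps."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_sequence(xs, h0=0.0):
    """Process a sequence one element at a time, threading the state through."""
    h = h0
    states = []
    for x_t in xs:
        h = recurrent_step(x_t, h)
        states.append(h)
    return states


# Both sequences end with the same inputs (0.0, 0.0), but the first one
# started with a 1.0, and that earlier input still shows up in the final state.
states_a = run_sequence([1.0, 0.0, 0.0])
states_b = run_sequence([0.0, 0.0, 0.0])
```

The final state of `states_a` stays above zero even though its last two inputs are zero, which is the kind of memory a model without state does not have.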
In the earlier post, Artificial Neural Networks, Part 4 — Convolution Neural Networks, we looked into the theory behind CNNs. In this post, we will go through the practical implementation using Keras and TensorFlow. The dataset is taken from Kaggle — Chest X-Ray Images. There are about 5864 images and 2 categories (PNEUMONIA / NORMAL).
Let us get straight into the implementation.
The first step is to import the required libraries.
In this part, we will go over one of the most widely used neural networks for image recognition tasks, the convolutional neural network (CNN). Before we get into the details, let us go over some of the key terms and concepts that come into play when working with CNNs.
In the previous posts of this series, we covered the following topics —
This post will cover the implementation of a basic neural network using TensorFlow and Keras.
The problem is based on the Bike Sharing Dataset available in the UCI Machine Learning Repository and can be found here —
There are two datasets provided in the zip file, daily.csv and hour.csv.
In this problem, I have used the daily.csv file, which is a rollup of the hour.csv file.
The problem involves predicting the count of rental bikes including…
Artificial Neural Networks, Part 2 — Understanding Gradient Descent (without the math)
In post 1 of this series, we went over the basics of artificial neural networks and the related components such as nodes, weights, biases, and activation functions. This post is about another major topic: I will explain the concept of Gradient Descent without using a lot of math.
The main goal of gradient descent is to update the weights and biases until the cost/loss function reaches a minimum. This results in a model whose predictions are close to the correct values. …
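The update loop described above can be sketched for the simplest possible model, a single weight and a single bias fitted to a straight line. This is a minimal illustration under stated assumptions: the loss is mean squared error, and the learning rate, epoch count, and toy data are hypothetical choices, not values from the post.

```python
def gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by repeatedly stepping against the gradient of the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors under the current weight and bias.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Take a small step in the direction that lowers the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line y = w * x + b on the data."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # toy data generated from y = 2x + 1

loss_before = mse(xs, ys, 0.0, 0.0)
w, b = gradient_descent(xs, ys)
loss_after = mse(xs, ys, w, b)
```

On this toy data the loop drives `w` towards 2 and `b` towards 1. A learning rate that is too large would make the updates overshoot and diverge instead.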
This post is the first in a series in which I will try to explain the concept behind ANNs, or Artificial Neural Networks.
We will start with a simple perceptron model and try to understand the intuition behind it without getting into a lot of math.
Whenever we hear or read about ANNs or Artificial Neural Networks, the most appropriate analogy that comes to mind is that of the neurons in our brain. Neurons consist of a lot of connections going in and out of the central node which performs the required operations. …
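The simple perceptron mentioned above can be sketched in a few lines: incoming connections carry weighted inputs, and the central node sums them and decides whether to fire. The step activation, the function names, and the hand-picked weights below are assumptions for illustration only.

```python
def step(z):
    """Fire (1) when the weighted sum reaches the threshold, otherwise stay off (0)."""
    return 1 if z >= 0 else 0


def perceptron(inputs, weights, bias):
    """Weighted sum of the incoming connections, passed through the step activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(z)


# Hypothetical hand-picked weights that make the node behave like a logical AND.
and_weights, and_bias = [1.0, 1.0], -1.5

outputs = [
    perceptron([a, b], and_weights, and_bias)
    for a in (0, 1)
    for b in (0, 1)
]
```

With these weights the node fires only when both inputs are 1. In a trained network the weights and bias are learned rather than chosen by hand.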
In this blog, we will go over why we need categorical encoding and the different methods to perform it.
As the name implies, categorical encoding is performed on variables of categorical type. This means every category in the variable is assigned a numeric value to represent the actual value. Encoding is required because most machine learning algorithms cannot handle categorical values as they are. They need numeric values to function, and sometimes the performance of the algorithm depends on how the encoding is performed.
There are various methods to perform encoding such as -
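Two commonly used methods are label encoding and one-hot encoding. Treating them as representative here is an assumption, and the helper names and the sample column are hypothetical. A dependency-free sketch of both:

```python
def label_encode(values):
    """Map each distinct category to an integer, assigned in sorted order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Turn each value into a 0/1 vector with one position per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories


colors = ["red", "green", "blue", "green"]   # hypothetical categorical column

labels, mapping = label_encode(colors)
one_hot, categories = one_hot_encode(colors)
```

Label encoding is compact but implies an ordering (here blue < green < red) that some algorithms may treat as meaningful. One-hot encoding avoids that at the cost of one extra column per category.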
Precision — Proportion of data points classified as true that are actually true. It can also be understood as “How many of the predicted items are relevant?”
Specificity — Proportion of the actual False values that are correctly predicted as False. This is also called the True Negative Rate (TNR): TN/(TN+FP).
Accuracy — This is the proportion of all predictions that are correct. It is not necessarily the best measure of a model's predictive power, as it depends on the balance of True and False values in the target variable. If there…
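The three metrics above can be computed directly from the confusion-matrix counts. The sketch below uses a small, hypothetical, imbalanced set of labels (2 positives out of 10) to show why accuracy alone can mislead. The helper names are illustrative and not taken from any library.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def precision(tp, fp):
    return tp / (tp + fp)


def specificity(tn, fp):
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


actual    = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # hypothetical ground truth
predicted = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]   # hypothetical model output
all_false = [0] * len(actual)                # baseline that never predicts True

tp, tn, fp, fn = confusion_counts(actual, predicted)
model_precision = precision(tp, fp)
model_specificity = specificity(tn, fp)
model_accuracy = accuracy(tp, tn, fp, fn)

baseline_accuracy = accuracy(*confusion_counts(actual, all_false))
```

Here the model and the "always False" baseline reach the same accuracy, even though the baseline never finds a single positive. Precision and specificity expose the difference that accuracy hides.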