KITTI Dataset

Introduction

The KITTI Vision Benchmark Suite is a repository of real-world data for autonomous driving created by the Karlsruhe Institute of Technology and the Toyota Technological Institute at Chicago [1]. The series of datasets in KITTI is widely used for research because...

Region Proposal Network

Region proposal networks (RPNs) are a significant component in many object detection models. The input to RPNs is typically an extracted feature map. Every pixel in the feature map is an anchor point. Each anchor point can have multiple anchors, which are candidate...
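
As an illustration, here is a minimal NumPy sketch of a Faster R-CNN-style anchor layout; the stride, scales, and aspect ratios below are hypothetical values, not taken from any particular model.

    import numpy as np

    def generate_anchors(fm_h, fm_w, stride=16,
                         scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
        """Generate anchor boxes (x1, y1, x2, y2) for every feature-map pixel."""
        anchors = []
        for y in range(fm_h):
            for x in range(fm_w):
                # Anchor point: centre of the receptive field in image coordinates.
                cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
                for s in scales:
                    for r in ratios:
                        w, h = s * np.sqrt(r), s / np.sqrt(r)
                        anchors.append([cx - w / 2, cy - h / 2,
                                        cx + w / 2, cy + h / 2])
        return np.array(anchors)

    # A 2x2 feature map with 3 scales x 3 ratios gives 2*2*9 = 36 anchors.
    print(generate_anchors(2, 2).shape)  # (36, 4)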

Sparsely Embedded Convolutional Detection

VoxelNet, proposed in [1], is a milestone because the model minimizes manual preprocessing effort through automatic feature extraction. The sparsely embedded convolutional detection model proposed in [2] replaces the regular convolutional layers with sparse...
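
To see why sparsity helps, here is a rough NumPy sketch of the sparse representation such models operate on; it is not the SECOND implementation, only an illustration of storing occupied voxels, and the point cloud, voxel size, and feature dimension are made up.

    import numpy as np

    # A LIDAR voxel grid is mostly empty, so instead of a dense
    # (D, H, W, C) tensor we keep only the non-empty voxels:
    # integer coordinates plus a feature vector per voxel.
    rng = np.random.default_rng(0)
    points = rng.uniform(0, 40, size=(1000, 3))     # toy point cloud (metres)
    voxel_size = 0.5

    coords = np.unique((points // voxel_size).astype(np.int32), axis=0)
    features = rng.normal(size=(len(coords), 64))   # would feed the sparse conv layers

    dense_cells = int(np.prod(points.max(axis=0) // voxel_size + 1))
    print(f"{len(coords)} occupied voxels out of ~{dense_cells} grid cells")
    # A sparse convolution computes outputs only at (or near) these
    # occupied coordinates, skipping the empty cells entirely.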

Introduction to Neural Networks

Two key components of a neural network are neurons and the connections between them. Each neuron, or node, in the neural network performs a function on its input and optionally applies a nonlinear activation function before producing an output. A connection transfers the weighted...

Deep learning for Object Detection

Object detection is the task of detecting and recognizing objects in media such as images. Objects of interest need to be recognized, localized, and classified. Object detection has been popular in 2D computer vision since the success of CNNs in image recognition....

Perception in Self-driving

Sensors

Various types of sensors are used for perception in the self-driving industry. Most current solutions rely on either a camera or LIDAR as the main sensor. A camera passively captures reflected light and stores the data as 2D images. In contrast, LIDAR...

Kaggle Overview

Kaggle is a platform for data science developers. It supports two programming languages, Python and R. It has many useful features: you can find and use datasets in your machine learning applications, and you can find datasets at the link...

The Python Project

I am a Python programmer. I have four years of experience with Python. I am ready to take on the Python project. Keras, TensorFlow, Scipy, Numpy, artificial neural networks in Python, image processing in Python, OpenCV, Pybrain, Matplotlib, Scikit-Learn, Pandas...

Jupyter: Using It and Its Great Functions

The Jupyter notebook is a great friend of the data scientist. It allows the user to write code and create visualizations of data, all in the same tab of their browser. It is included in the standard Anaconda distribution and can be launched from the command line...

Efficient retrieval

Retrieval can be slow because brute-force matching is used. Matching can be made faster using approximate nearest neighbour search. The curse of dimensionality also kicks in, as shown in the following figure: with every increase in dimension, complexity increases as the...
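
As a sketch, the following compares brute-force matching with an approximate nearest-neighbour index, assuming the Annoy library is available (pip install annoy); the database size and dimensionality are arbitrary.

    import numpy as np
    from annoy import AnnoyIndex

    dim = 128
    rng = np.random.default_rng(0)
    db = rng.normal(size=(10000, dim)).astype(np.float32)  # toy feature database

    # Build an approximate nearest-neighbour index (10 random projection trees).
    index = AnnoyIndex(dim, 'euclidean')
    for i, vec in enumerate(db):
        index.add_item(i, vec)
    index.build(10)

    query = rng.normal(size=dim).astype(np.float32)

    # Approximate search: visits only a fraction of the database.
    approx = index.get_nns_by_vector(query, 5)

    # Brute-force search for comparison: O(N * dim) distance computations.
    exact = np.argsort(np.linalg.norm(db - query, axis=1))[:5].tolist()
    print(approx, exact)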

Building the retrieval pipeline

The sequence of steps to get the best matches from target images for a query image is called the retrieval pipeline. The retrieval pipeline has multiple steps or components. The features of the image database have to be extracted offline and stored in a database. For...
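
A minimal sketch of the two phases follows; extract_features is a hypothetical placeholder standing in for a real CNN feature extractor.

    import numpy as np

    def extract_features(image):
        # Placeholder: in a real pipeline this would be a CNN forward pass,
        # e.g. the activations of a pre-trained network's last pooling layer.
        return np.asarray(image, dtype=np.float32).ravel()

    # Offline phase: extract features for every target image once and store them.
    target_images = [np.random.rand(8, 8) for _ in range(100)]
    database = np.stack([extract_features(img) for img in target_images])

    # Online phase: extract the query's features and rank targets by distance.
    query = np.random.rand(8, 8)
    q = extract_features(query)
    distances = np.linalg.norm(database - q, axis=1)
    best_matches = np.argsort(distances)[:5]   # indices of the top-5 targets
    print(best_matches)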

Content-based image retrieval

Content-based image retrieval (CBIR) takes a query image as input and produces as output a ranking of the images in a database of target images. CBIR is an image-to-image search engine with a specific goal. A database of target images is required for...

Classification

Image classification is the task of labelling a whole image with an object or concept, together with a confidence score. Applications include gender classification given an image of a person's face, identifying the type of pet, tagging photos, and so on. The following is an...

Deep learning for computer vision

Computer vision enables the properties of human vision on a computer. The computer could take the form of a smartphone, a drone, a CCTV camera, an MRI scanner, and so on, with various sensors for perception. The sensors produce images in digital form that have to be interpreted by...

Long short-term memory (LSTM)

Long short-term memory (LSTM) can store information for longer periods of time and is hence efficient at capturing long-term dependencies. The following figure illustrates how an LSTM cell is designed. An LSTM has several gates: forget, input, and output....
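
The gating can be written out directly; the following NumPy sketch of a single LSTM step is for illustration, with arbitrary dimensions and random parameters.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h_prev, c_prev, W, U, b):
        """One LSTM step; W, U, b hold the stacked parameters of all four gates."""
        z = x @ W + h_prev @ U + b             # all four gate pre-activations
        n = h_prev.shape[-1]
        f = sigmoid(z[0*n:1*n])                # forget gate: what to erase
        i = sigmoid(z[1*n:2*n])                # input gate: what to write
        o = sigmoid(z[2*n:3*n])                # output gate: what to expose
        g = np.tanh(z[3*n:4*n])                # candidate cell values
        c = f * c_prev + i * g                 # cell state carries long-term memory
        h = o * np.tanh(c)                     # hidden state (the cell's output)
        return h, c

    x_dim, h_dim = 4, 3
    rng = np.random.default_rng(0)
    W = rng.normal(size=(x_dim, 4 * h_dim))
    U = rng.normal(size=(h_dim, 4 * h_dim))
    b = np.zeros(4 * h_dim)
    h, c = np.zeros(h_dim), np.zeros(h_dim)
    h, c = lstm_step(rng.normal(size=x_dim), h, c, W, U, b)
    print(h)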

Recurrent neural networks (RNN)

Recurrent neural networks (RNN) can model sequential information. They do not assume that the data points are independent. They perform the same task for every element of a sequence, with the output depending on the previous computations. This can also be thought of as memory. RNNs cannot...
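
The recurrence is compact enough to sketch in NumPy; the dimensions and random weights below are arbitrary.

    import numpy as np

    def rnn_step(x, h_prev, Wx, Wh, b):
        # The same weights are applied at every position of the sequence;
        # h_prev is the "memory" carried over from the previous step.
        return np.tanh(x @ Wx + h_prev @ Wh + b)

    x_dim, h_dim = 4, 3
    rng = np.random.default_rng(0)
    Wx = rng.normal(size=(x_dim, h_dim))
    Wh = rng.normal(size=(h_dim, h_dim))
    b = np.zeros(h_dim)

    h = np.zeros(h_dim)
    for x in rng.normal(size=(5, x_dim)):   # a sequence of 5 inputs
        h = rnn_step(x, h, Wx, Wh, b)
    print(h)                                # final state summarizes the sequence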

Max pooling in Convolutional neural network

Pooling layers are placed between convolution layers. Pooling layers reduce the size of the image across layers by sampling. The sampling is done by selecting the maximum value in a window. Average pooling averages over the window. Pooling also acts as a...
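
A minimal NumPy sketch of non-overlapping max pooling, for illustration:

    import numpy as np

    def max_pool2d(image, window=2):
        """Non-overlapping max pooling: keep the maximum of each window."""
        h, w = image.shape
        h, w = h - h % window, w - w % window   # crop to a multiple of the window
        patches = image[:h, :w].reshape(h // window, window, w // window, window)
        return patches.max(axis=(1, 3))         # use .mean() for average pooling

    img = np.arange(16).reshape(4, 4)
    print(max_pool2d(img))
    # [[ 5  7]
    #  [13 15]]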

Kernel in Convolutional neural network

The kernel is the parameter of a convolution layer that is used to convolve the image. The convolution operation is shown in the following figure. The convolution has two parameters, called stride and kernel size. The size can be any rectangular dimension. Stride is the number of...
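
A naive NumPy sketch of the operation, for illustration (real frameworks use much faster implementations):

    import numpy as np

    def conv2d(image, kernel, stride=1):
        """Valid 2D convolution (no padding)."""
        kh, kw = kernel.shape
        out_h = (image.shape[0] - kh) // stride + 1
        out_w = (image.shape[1] - kw) // stride + 1
        out = np.zeros((out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                # Slide the kernel by `stride` pixels and take the dot product.
                patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
                out[i, j] = np.sum(patch * kernel)
        return out

    image = np.random.rand(6, 6)
    kernel = np.random.rand(3, 3)                  # kernel size: 3x3
    print(conv2d(image, kernel, stride=1).shape)   # (4, 4)
    print(conv2d(image, kernel, stride=2).shape)   # (2, 2)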

Convolutional neural network

Convolutional neural networks (CNN) are similar to the neural networks described in the previous sections. Like them, CNNs have weights and biases and produce outputs through a nonlinear activation. In a regular neural network, the neurons of each layer are fully connected to those of the next layer....
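
As a sketch, a small CNN can be assembled in a few lines, assuming TensorFlow 2.x with tf.keras is installed; the layer sizes below are arbitrary.

    import tensorflow as tf

    # A small CNN: local connections and shared kernels in the convolution
    # layers, followed by a fully connected classifier head.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, kernel_size=3, activation='relu',
                               input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(pool_size=2),
        tf.keras.layers.Conv2D(64, kernel_size=3, activation='relu'),
        tf.keras.layers.MaxPooling2D(pool_size=2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.summary()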

Playing with TensorFlow playground

TensorFlow playground is an interactive visualization of neural networks. Visit http://playground.tensorflow.org/ and play with the parameters to see how the previously mentioned terms work together. Here is a screenshot of the playground: ...

Stochastic gradient descent

SGD is the same as gradient descent, except that only a subset of the data is used for each training step. The size of the subset is a parameter called the mini-batch size. Theoretically, even one example can be used for training. In practice, it is better to experiment with various sizes. In...
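
A minimal NumPy sketch of mini-batch SGD on a linear model, with arbitrary learning rate and batch size, for illustration:

    import numpy as np

    # Mini-batch SGD on a linear model y = X @ w with squared loss.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    true_w = np.array([1., 2., 3., 4., 5.])
    y = X @ true_w

    w, lr, batch_size = np.zeros(5), 0.1, 32   # batch_size=1000 would be plain GD
    for epoch in range(20):
        idx = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            batch = idx[start:start + batch_size]    # only a subset per update
            err = X[batch] @ w - y[batch]
            grad = X[batch].T @ err / len(batch)     # gradient on the mini-batch
            w -= lr * grad
    print(w.round(2))   # close to true_w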

Softmax in Neural Network

Softmax is a way of forcing the outputs of a neural network to sum to 1. The output values of the softmax function can therefore be treated as a probability distribution. This is useful in multi-class classification problems. Softmax is a kind...
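
A minimal NumPy sketch, subtracting the maximum for numerical stability (a standard trick, not required by the definition):

    import numpy as np

    def softmax(logits):
        # Subtracting the max leaves the result unchanged but avoids overflow.
        e = np.exp(logits - np.max(logits))
        return e / e.sum()

    scores = np.array([2.0, 1.0, 0.1])
    probs = softmax(scores)
    print(probs, probs.sum())   # [0.659 0.242 0.099] 1.0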

Gradient descent

The gradient descent algorithm performs multidimensional optimization. When minimizing a loss function, the objective is to reach the global minimum. Gradient descent is a popular optimization technique used in many machine-learning models. It is used to improve or optimize the model prediction. One...
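
A minimal sketch on a one-dimensional quadratic, where the minimum is known to be at x = 3:

    # Minimize f(x) = (x - 3)^2, whose gradient is f'(x) = 2 * (x - 3).
    x, lr = 0.0, 0.1
    for step in range(50):
        grad = 2 * (x - 3)
        x -= lr * grad      # move against the gradient, downhill
    print(x)                # approximately 3.0, the global minimum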

Backpropagation

A backpropagation algorithm is commonly used for training artificial neural networks. The weights are updated backwards, layer by layer, based on the calculated error, as shown in the following image. After calculating the error, gradient descent can be used to calculate the weight...
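
A minimal NumPy sketch of backpropagation through a two-layer network follows; the toy task, layer sizes, and learning rate are arbitrary choices for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(64, 2))
    y = (X[:, :1] * X[:, 1:] > 0).astype(float)     # toy XOR-like target

    W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
    W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

    for step in range(500):
        # Forward pass.
        h = np.tanh(X @ W1 + b1)
        out = 1 / (1 + np.exp(-(h @ W2 + b2)))      # sigmoid output
        # Backward pass: propagate the error from the output layer backwards.
        d_out = (out - y) / len(X)                  # error signal at the output
        dW2, db2 = h.T @ d_out, d_out.sum(0)
        d_h = (d_out @ W2.T) * (1 - h**2)           # tanh derivative
        dW1, db1 = X.T @ d_h, d_h.sum(0)
        # Gradient descent update using the backpropagated gradients.
        for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
            p -= 0.5 * g
    print(((out > 0.5) == y).mean())                # training accuracy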

Training neural networks

Training an ANN is tricky as it contains several parameters to optimize. The procedure for updating the weights is called backpropagation. The procedure for minimizing the error is called optimization. 

L1 and L2 regularization

L1 regularization penalizes the absolute value of the weights and tends to make the weights exactly zero. L2 regularization penalizes the squared value of the weights and tends to make the weights smaller during training. Both regularizers assume that models with smaller weights are better.
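
A minimal sketch of the two penalty terms; the weights, regularization strength, and data loss value are made up.

    import numpy as np

    weights = np.array([-0.5, 0.0, 2.0, -1.5])
    lam = 0.01                                   # regularization strength

    l1_penalty = lam * np.sum(np.abs(weights))   # pushes weights towards exactly 0
    l2_penalty = lam * np.sum(weights ** 2)      # shrinks all weights smoothly

    # During training, a penalty is simply added to the data loss:
    data_loss = 0.42                             # placeholder value
    total_loss = data_loss + l1_penalty
    print(l1_penalty, l2_penalty, total_loss)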

Batch normalization

Batch normalization, or batch-norm, increases the stability and performance of neural network training. It normalizes the output of a layer to zero mean and a standard deviation of 1. This reduces overfitting and makes the network train faster. It is very useful in...
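
A minimal NumPy sketch of the training-time computation, ignoring the running statistics used at inference; gamma and beta are the learnable scale and shift.

    import numpy as np

    def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
        """Normalize a batch of activations to zero mean and unit variance,
        then rescale with the learnable parameters gamma and beta."""
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + eps)
        return gamma * x_hat + beta

    batch = np.random.randn(32, 4) * 10 + 5   # activations with large mean/scale
    out = batch_norm(batch)
    print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ~0 and ~1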

Dropout in Neural Network

Dropout is an effective way of regularizing neural networks to avoid overfitting. During training, the dropout layer cripples the neural network by removing hidden units stochastically, as shown in the following image. Note how the neurons are...
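
A minimal NumPy sketch of inverted dropout (the common variant that rescales at training time so inference needs no change):

    import numpy as np

    def dropout(activations, rate=0.5, training=True):
        """Randomly zero units and rescale the survivors so the
        expected activation stays the same at test time."""
        if not training:
            return activations              # dropout is disabled at inference
        mask = np.random.rand(*activations.shape) >= rate
        return activations * mask / (1.0 - rate)

    h = np.ones((2, 8))
    print(dropout(h, rate=0.5))   # roughly half the units are zeroed out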

Cross-entropy

Cross-entropy measures the distance between the softmax output and the one-hot encoded target. Cross-entropy is a loss function whose value has to be minimized. A neural network estimates the probability that the given data belongs to each class. The probability has to be...
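
A minimal NumPy sketch, with made-up probability vectors:

    import numpy as np

    def cross_entropy(probs, one_hot, eps=1e-12):
        # Only the probability assigned to the true class contributes,
        # because all other entries of the one-hot vector are 0.
        return -np.sum(one_hot * np.log(probs + eps))

    probs = np.array([0.7, 0.2, 0.1])      # softmax output
    target = np.array([1.0, 0.0, 0.0])     # one-hot encoded true class
    print(cross_entropy(probs, target))    # -log(0.7) ~= 0.357

    # A confident wrong prediction is penalized much more heavily:
    print(cross_entropy(np.array([0.1, 0.2, 0.7]), target))  # -log(0.1) ~= 2.303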

One-hot encoding

One-hot encoding is a way to represent the target variables or classes in a classification problem. The target variables can be converted from string labels to one-hot encoded vectors. A one-hot vector is filled with 1 at the index of the target class but...
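
A minimal NumPy sketch with made-up labels:

    import numpy as np

    labels = ["cat", "dog", "cat", "bird"]
    classes = sorted(set(labels))           # ['bird', 'cat', 'dog']

    eye = np.eye(len(classes))
    one_hot = np.stack([eye[classes.index(l)] for l in labels])
    print(one_hot)
    # [[0. 1. 0.]    cat
    #  [0. 0. 1.]    dog
    #  [0. 1. 0.]    cat
    #  [1. 0. 0.]]   bird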

Perceptron

An artificial neuron, or perceptron, takes several inputs and performs a weighted summation to produce an output. The weights of the perceptron are determined during the training process, based on the training data. The following is a diagram of the perceptron:...
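
A minimal sketch with a step activation; the weights below are hand-picked to implement a logical AND rather than learned:

    import numpy as np

    def perceptron(x, w, b):
        # Weighted summation of the inputs, followed by a step activation.
        return 1 if np.dot(w, x) + b > 0 else 0

    w, b = np.array([1.0, 1.0]), -1.5
    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x, perceptron(np.array(x), w, b))   # only (1, 1) outputs 1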