Artificial Intelligence 🤖
Introduction

A neural network is a collection of neurons. A single neuron (a Perceptron) performs linear regression without applying an activation function: it takes inputs, either real-valued or boolean, and calculates the weighted sum:

z = wx + b

Based on a pre-defined threshold, the Perceptron can predict an output. Perceptrons are limited, though: they can only output binary values, and a small change in either the weights or the bias term can flip the output entirely. To smooth the output, we apply the sigmoid function, which produces outputs between 0 and 1 that can be interpreted as probabilities. This is the basis of logistic regression.
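
As a minimal sketch (in NumPy, with example weights and inputs chosen arbitrarily), here is a hard-threshold Perceptron next to a sigmoid neuron computing the same weighted sum z = wx + b:

```python
import numpy as np

def perceptron(x, w, b, threshold=0.0):
    """Classic Perceptron: weighted sum plus a hard threshold -> binary output."""
    z = np.dot(w, x) + b
    return 1 if z > threshold else 0

def sigmoid_neuron(x, w, b):
    """Sigmoid neuron: the same weighted sum, squashed smoothly into (0, 1)."""
    z = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0])   # example inputs (arbitrary values)
w = np.array([0.8, 0.2])    # example weights
b = 0.1                     # bias term

print(perceptron(x, w, b))      # hard 0/1 decision
print(sigmoid_neuron(x, w, b))  # smooth, probability-like output (~0.574)
```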

Sigmoid Neuron Representation

By stacking these neurons in layers, we can create a Deep Neural Network. These can have multiple hidden layers for more complex tasks:

Deep Neural Network
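
A deep network is just this same weighted-sum-plus-activation computation, stacked. A minimal forward-pass sketch, with layer sizes and random weights chosen purely for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Arbitrary sizes for illustration: 3 inputs -> 4 hidden units -> 1 output.
W1, b1 = rng.standard_normal((4, 3)), np.zeros(4)
W2, b2 = rng.standard_normal((1, 4)), np.zeros(1)

def forward(x):
    """Each layer repeats the same pattern: weighted sum, then activation."""
    a1 = sigmoid(W1 @ x + b1)   # hidden layer
    a2 = sigmoid(W2 @ a1 + b2)  # output layer
    return a2

print(forward(np.array([0.5, -1.0, 2.0])))
```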

Supervised Learning

Neural networks often learn through Supervised Learning. You provide labeled data (inputs X and outputs Y) to train the network to find the function that maps X to Y.
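
As a toy illustration of "finding the function that maps X to Y", here is gradient descent recovering an assumed linear mapping y = 2x + 1 from labeled examples (all values are made up):

```python
import numpy as np

# Toy labeled data: inputs X with targets Y from y = 2x + 1 (assumed for illustration).
X = np.linspace(-1, 1, 50)
Y = 2 * X + 1

w, b = 0.0, 0.0   # arbitrary initial guess for the mapping
lr = 0.1          # learning rate

for _ in range(500):
    pred = w * X + b
    # Mean-squared-error gradients with respect to w and b
    grad_w = 2 * np.mean((pred - Y) * X)
    grad_b = 2 * np.mean(pred - Y)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)   # converges near the true mapping: w ~ 2, b ~ 1
```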

Neural Network Types

Different problems require different types of neural networks:

  • CNNs: Best for image tasks.
  • RNNs: Good for sequence tasks like speech.
  • Standard Neural Networks: Used for structured data.
  • Hybrid Networks: For complex, varied tasks.

Structured data is organized, like the rows and columns you find in databases. Unstructured data includes things like images or text.
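
As a rough illustration of this taxonomy, libraries such as PyTorch (covered later) provide a building block for each family; the layer sizes below are arbitrary:

```python
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3)       # CNN block: slides filters over an image
rnn = nn.RNN(input_size=10, hidden_size=32)  # RNN block: processes a sequence step by step
dense = nn.Linear(20, 1)                     # fully connected layer, typical for structured data
```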

Key Drivers for why DL is Taking Off

Deep learning is advancing due to three main factors:

  1. Data
  2. Computation
  3. Algorithms

Data

Data Chart

  • Small data: With small amounts of data, a neural network performs at a level similar to linear regression or an SVM.
  • Big data: With more data, even a small neural network outperforms an SVM.
  • Bigger data: At this scale, the bigger the neural network, the more it outperforms smaller ones.

The explosion of data sources has facilitated this:

  • Mobile devices: Ubiquitous and always on.
  • Internet of Things (IoT): A network of devices, all generating data.

Computation

The second pillar is computational power. The rise of powerful GPUs and CPUs, along with distributed computing, has made training complex neural networks faster and more efficient. ASICs have also contributed to this rapid growth.

Algorithms

Creative new algorithms and techniques have appeared that change the way we work with neural networks, such as using the Rectified Linear Unit (ReLU) instead of the sigmoid function to mitigate problems like the vanishing gradient.
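
A quick way to see why: the sigmoid's gradient peaks at 0.25 and vanishes as |z| grows, while ReLU's gradient stays at 1 for any positive input. A small NumPy check:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # peaks at 0.25, vanishes for large |z|

def relu_grad(z):
    return 1.0 if z > 0 else 0.0  # stays at 1 for any positive input

for z in [0.5, 5.0, 10.0]:
    print(f"z={z:5.1f}  sigmoid'={sigmoid_grad(z):.6f}  relu'={relu_grad(z):.1f}")
```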

Deep learning frameworks

Deep learning frameworks have become essential tools in the field of Artificial Intelligence, providing pre-built functions, layers, and training algorithms that streamline the development of neural network models. Rather than building every component from the ground up, these frameworks enable you to focus on designing and implementing models tailored to specific problems. Some key frameworks are:

  • Torch/PyTorch: Known for its flexibility and dynamic computation graph, it's favored in research for rapid prototyping and experimentation.
  • TensorFlow: Developed by Google, it's renowned for its powerful production deployment capabilities and extensive community support.
  • Apache MXNet: Offers a balance between efficiency in both training and inference, and it's scalable across multiple GPUs and machines.
  • Caffe/Caffe2: Stands out for its speed in convolutional neural network (CNN) models and its simplicity in feeding data to networks.
  • CNTK: The Microsoft Cognitive Toolkit is recognized for its performance and scalability across multiple GPUs.
  • DL4J: A framework for Java and JVM, it brings deep learning to enterprise environments.
  • Keras: Acts as an interface for TensorFlow, prioritizing human-friendly code and modularity.
  • Lasagne: A lightweight library to build and train neural networks in Theano.
  • PaddlePaddle: Baidu's offering, known for its ease of use and scalability.
  • Theano: Although development has ceased, it laid the groundwork for many concepts in current frameworks.
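
As a minimal sketch of what these frameworks buy you, here is a tiny PyTorch model and one training step built entirely from pre-built components (layer sizes, data, and hyperparameters are arbitrary):

```python
import torch
import torch.nn as nn

# Layers, activation, loss, and optimizer are all pre-built components.
model = nn.Sequential(
    nn.Linear(3, 4),   # 3 inputs -> 4 hidden units
    nn.ReLU(),
    nn.Linear(4, 1),   # 4 hidden units -> 1 output
)

loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# One training step on toy data (random values, for illustration only).
x = torch.randn(8, 3)   # batch of 8 examples
y = torch.randn(8, 1)   # matching targets
loss = loss_fn(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(loss.item())
```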

When selecting a deep learning framework, I consider the following:

  • Ease of Programming: How simple is it to write and maintain code? The development experience can greatly affect productivity.
  • Running Speed: Evaluate the framework's performance, especially if you're working with large datasets or complex models.
  • Openness: An active open-source community and transparent governance are vital for long-term support and collaboration.

These frameworks are improving month by month. A comparison between them can be found here.

