Artificial Intelligence 🤖
Recurrent Neural Networks (RNNs)
Introduction

Sequence Model Motivation

Sequence models, such as RNNs and LSTMs, have transformed how we learn from sequences. Applications of sequence data include:

  1. Speech Recognition (Sequence to Sequence):

    • X: Wave sequence
    • Y: Text sequence
  2. Music Generation (One to Sequence):

    • X: Nothing or an integer
    • Y: Wave sequence
  3. Sentiment Classification (Sequence to One):

    • X: Text sequence
    • Y: Integer rating (1 to 5)
  4. DNA Sequence Analysis (Sequence to Sequence):

    • X: DNA sequence
    • Y: DNA labels
  5. Machine Translation (Sequence to Sequence):

    • X: Text sequence (in one language)
    • Y: Text sequence (in another language)
  6. Video Activity Recognition (Sequence to One):

    • X: Video frames
    • Y: Activity label
  7. Named Entity Recognition (Sequence to Sequence):

    • X: Text sequence
    • Y: Label sequence
    • Useful for search engines to index different types of words within a text.

Each of these problems, with varying input and output formats, can be approached as supervised learning with labeled data $(X, Y)$ as the training set. $X$ and $Y$ can have different lengths, and sometimes only one of them is a sequence.

Notation

We will adopt the following notation throughout this section, taking Named Entity Recognition as our motivating example (a short code sketch illustrating the indexing follows this list):

  • $X^{(1)}$: "Harry Potter and Hermione Granger invented a new spell."

  • $Y^{(1)}$: 1 1 0 1 1 0 0 0 0

    • Both sequences have a length of 9.
    • 1 indicates a name, while 0 indicates otherwise.
  • $X^{(i)<t>}$: The $t$-th element in the input sequence of the $i$-th training example.

    • For example, $X^{(1)<1>}$ = "Harry" and $X^{(1)<2>}$ = "Potter".
  • $Y^{(i)<t>}$: The $t$-th element in the output sequence of the $i$-th training example.

    • For example, $Y^{(1)<1>} = 1$ and $Y^{(1)<2>} = 1$.
  • $T_x^{(i)}$: Length of the input sequence for the $i$-th training example.

    • Varies across different examples.
  • $T_y^{(i)}$: Length of the output sequence for the $i$-th training example; it may differ from $T_x^{(i)}$.
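
As a minimal sketch of this notation (assuming plain Python lists as the data structure; the variable names are only illustrative):

```python
# One training example (i = 1) from the Named Entity Recognition task above.
x_1 = "Harry Potter and Hermione Granger invented a new spell".split()
y_1 = [1, 1, 0, 1, 1, 0, 0, 0, 0]   # 1 = part of a name, 0 = otherwise

T_x = len(x_1)   # T_x^{(1)} = 9
T_y = len(y_1)   # T_y^{(1)} = 9

# The notation is 1-indexed, while Python lists are 0-indexed:
print(x_1[0], x_1[1])   # X^{(1)<1>} = "Harry", X^{(1)<2>} = "Potter"
print(y_1[0], y_1[1])   # Y^{(1)<1>} = 1,       Y^{(1)<2>} = 1
```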

Representing Words:

In NLP (Natural Language Processing), a key question is how to represent words. A standard approach involves two steps:

  1. Vocabulary List:

    • Contains all the words in the target set.
    • Example: [a, ..., And, ..., Harry, ..., Potter, ..., Zulu]
      • Each word has a unique index.
      • Sorted alphabetically.
    • Typical vocabulary sizes range from 30,000 to 50,000 words; larger companies use vocabularies of up to a million words.
    • Build the vocabulary by analyzing a text corpus and keeping the most frequent words.
  2. One-Hot Encoding:

    • Create a one-hot encoded vector for each word based on the vocabulary.
    • Handle unknown words with a special token, such as <UNK>, in the vocabulary.

Example:
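
A minimal sketch of both steps (assuming NumPy and a tiny toy vocabulary; `word_to_index` and `one_hot` are illustrative names, not a standard API):

```python
import numpy as np

# Toy corpus; real vocabularies are built from large text collections.
corpus = "harry potter and hermione granger invented a new spell".split()

# Step 1: vocabulary list, sorted alphabetically, plus an unknown-word token.
vocab = sorted(set(corpus)) + ["<UNK>"]
word_to_index = {w: i for i, w in enumerate(vocab)}

# Step 2: one-hot encoding of a single word relative to the vocabulary.
def one_hot(word):
    vec = np.zeros((len(vocab), 1))            # column vector of vocabulary size
    vec[word_to_index.get(word, word_to_index["<UNK>"])] = 1.0
    return vec

print(one_hot("harry").T)       # a single 1 at the index of "harry"
print(one_hot("voldemort").T)   # unseen word falls back to <UNK>
```

Each word of an input sentence is then replaced by its one-hot vector, so a sentence of length $T_x$ becomes a sequence of $T_x$ vocabulary-sized vectors.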

The objective is to learn a mapping from this representation of $X$ to the target output $Y$ as part of a supervised learning problem.