Access all tutorials at https://www.muratkarakaya.net
Code: https://colab.research.google.com/dri...
Text Generation Playlist: Text Generation in Deep Learning with...
TensorFlow Input Pipeline Playlist: TensorFlow Data Pipeline: How to Desi...
All About LSTM playlist: All About LSTM
Character Level Text Generation with an LSTM Model
This tutorial is the fifth part of the "Text Generation in Deep Learning with Tensorflow & Keras" series, which covers the topics related to text generation with sample implementations in Python, TensorFlow, and Keras. In this part, we will focus on how to build a Language Model using the Keras LSTM layer for character-level text generation. First, we will download a sample corpus (a text file). After opening the file, we will apply the TensorFlow input pipeline that we developed in Part B to prepare the training dataset by preprocessing the text and splitting it into input character sequences (X) and output characters (y). Then, we will design an LSTM-based Language Model and train it on the training set. Finally, we will apply the sampling methods that we implemented in Part D to generate text and observe how each method affects the generated text. In the end, we will have a trained LSTM-based Language Model for character-level text generation with three sampling methods.
If you would like to learn more about Deep Learning with practical coding examples, please subscribe to the Murat Karakaya Akademi YouTube channel or follow my blog on Medium.
You can access this Colab Notebook using the link given in the video description below.
If you are ready, let's get started!
Text Generation in Deep Learning with Tensorflow & Keras Series:
Part A: Fundamentals
Part B: Tensorflow Data Pipeline for Character Level Text Generation
Part C: Tensorflow Data Pipeline for Word Level Text Generation
Part D: Sampling in Text Generation
Part E: Recurrent Neural Network (LSTM) Model for Character Level Text Generation
Part F: Encoder-Decoder Model for Character Level Text Generation
Part G: Recurrent Neural Network (LSTM) Model for Word Level Text Generation
Part H: Encoder-Decoder Model for Word Level Text Generation
You can watch all these parts on the Murat Karakaya Akademi channel on YouTube in ENGLISH or TURKISH.
I assume that you have already watched the previous parts. If not, please review them first to get the most out of this part.
References
What is a Data Pipeline?
tf.data: Build TensorFlow input pipelines
Text classification from scratch
Working with Keras preprocessing layers
Character-level text generation with LSTM
Toward Controlled Generation of Text
Attention Is All You Need
What is the difference between word-based and char-based text generation RNNs?
The survey: Text generation models in deep learning
Generative Adversarial Networks for Text Generation
FGGAN: Feature-Guiding Generative Adversarial Networks for Text Generation
How to sample from language models
How to generate text: using different decoding methods for language generation with Transformers
Hierarchical Neural Story Generation
Text generation with LSTM
A guide to language model sampling in AllenNLP
Generating text from the language model
How to Implement a Beam Search Decoder for Natural Language Processing
Controllable Neural Text Generation
What is Character Level Text Generation?
A Language Model can be trained to generate text character by character. In this case, each input and output token is a character, and the Language Model outputs a conditional probability distribution over the character set, that is, the probability of each possible next character given the characters seen so far.
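To make the idea of a conditional probability distribution concrete, here is a minimal sketch. It is not the LSTM model built in this tutorial: it assumes a toy corpus and estimates the probability of the next character given only the previous character, simply by counting character pairs. The names corpus, counts, and next_char_distribution are chosen for illustration only.

```python
from collections import Counter, defaultdict

# Toy corpus (an assumption for illustration, not the tutorial's dataset).
corpus = "hello world"

# counts[prev][nxt] = how many times character `nxt` follows character `prev`.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1


def next_char_distribution(prev):
    """Return P(next character | previous character) as a dict."""
    total = sum(counts[prev].values())
    return {ch: n / total for ch, n in counts[prev].items()}


# Distribution over the characters that can follow "l" in the toy corpus.
dist = next_char_distribution("l")
print(dist)

# Greedy choice: pick the most probable next character after "o".
dist_o = next_char_distribution("o")
print(max(dist_o, key=dist_o.get))
```

An LSTM-based Language Model plays the same role as next_char_distribution here, except that it conditions on a whole input sequence rather than a single character and learns the probabilities during training instead of counting them.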
1. BUILD A TENSORFLOW INPUT PIPELINE
For more information, please refer to Part B: Tensorflow Data Pipeline for Character Level Text Generation on YouTube (ENGLISH / TURKISH) or Medium.
What is a Data Pipeline?
A data pipeline is an automated process that involves extracting, transforming, combining, validating, and loading data for further analysis and visualization.
It speeds up the end-to-end flow of data by eliminating errors and reducing bottlenecks and latency.
It can process multiple data streams at once.
In short, it is an absolute necessity for today’s data-driven solutions.
If you are not familiar with data pipelines, you can check my tutorials in English or Turkish.
What will we do in this Text Data pipeline?
We will create a data pipeline that prepares the training data for a character-level text generator. The pipeline will:
convert the text into a sequence of characters
remove unwanted characters such as punctuation, HTML tags, extra white space, etc.
generate input (X) and output (y) pairs as character sequences
cache, prefetch, and batch the train data for performance
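The steps above can be sketched with tf.data as follows. This is a minimal sketch under stated assumptions, not the pipeline developed in Part B, which may be built differently. The sample text, the clean_text helper, and the SEQ_LEN and BATCH_SIZE values are all hypothetical choices made for illustration.

```python
import re

import tensorflow as tf

SEQ_LEN = 5      # assumed length of each input sequence (X)
BATCH_SIZE = 4   # assumed batch size

# Hypothetical sample text standing in for the downloaded corpus.
raw_text = "<p>Hello,   World!</p> We generate text one character at a time."


def clean_text(text):
    """Remove HTML tags, punctuation, and extra white space (illustrative rules)."""
    text = re.sub(r"<[^>]+>", " ", text)           # drop HTML tags
    text = re.sub(r"[^a-z ]", " ", text.lower())   # keep lowercase letters and spaces
    return re.sub(r"\s+", " ", text).strip()       # collapse repeated white space


cleaned = clean_text(raw_text)

# Convert the text into a sequence of characters, encoded as integer ids.
vocab = sorted(set(cleaned))
char_to_id = {ch: i for i, ch in enumerate(vocab)}
ids = [char_to_id[ch] for ch in cleaned]

dataset = (
    tf.data.Dataset.from_tensor_slices(ids)
    # Slide a window of SEQ_LEN + 1 characters over the text, one step at a time.
    .window(SEQ_LEN + 1, shift=1, drop_remainder=True)
    .flat_map(lambda w: w.batch(SEQ_LEN + 1, drop_remainder=True))
    # Split each window into the input sequence (X) and the next character (y).
    .map(lambda seq: (seq[:-1], seq[-1]))
    .cache()                      # keep the prepared pairs in memory
    .batch(BATCH_SIZE)            # group pairs into batches
    .prefetch(tf.data.AUTOTUNE)   # overlap data preparation with training
)

# Inspect the first batch: X holds SEQ_LEN ids per row, y holds one id per row.
for x, y in dataset.take(1):
    print(x.shape, y.shape)
    print("".join(vocab[i] for i in x[0].numpy()), "->", repr(vocab[int(y[0])]))
```

Calling cache before batch stores the individual (X, y) pairs, so the batch size can be changed later without recomputing the windows.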