In this simple TensorFlow multiclass prediction problem in Python, we will build an NLP model and attempt to classify poems as Affection, Death, Environment, or Love. Poems are very difficult to classify because the writing is so abstract, which makes it hard to get a high score on this dataset. Our goal will be an accuracy of 0.50, which would normally be very low but is difficult to achieve on poem classification.
We will build our model today with the Sequential API in TensorFlow. The first layer in our model will be an Embedding layer. We will follow the Embedding layer with a special type of dropout layer called Spatial Dropout. Instead of turning off individual activations independently, Spatial Dropout drops entire embedding channels across every token position at once. Because the mask is drawn at random, the total number of neurons turned off can be higher or lower from one training step to the next. This type of dropout tends to work well for NLP problems.
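To make the difference between ordinary dropout and Spatial Dropout concrete, here is a minimal NumPy sketch of the two masking schemes. It is an illustration of the idea, not the Keras implementation; the function names, tensor shape, and rate are assumptions chosen for the example.

```python
import numpy as np

def element_dropout(x, rate, rng):
    """Ordinary dropout: every activation is kept or dropped independently."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def spatial_dropout_1d(x, rate, rng):
    """Spatial dropout sketch: one keep/drop decision per (sample, channel),
    broadcast across all time steps, so whole channels vanish together."""
    batch, steps, channels = x.shape
    keep = rng.random((batch, 1, channels)) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)

# Hypothetical embedded batch: 2 poems, 5 tokens each, 8 embedding channels.
x = np.ones((2, 5, 8))

dropped = spatial_dropout_1d(x, rate=0.25, rng=rng)

# For each poem, a channel is either zero at every token or kept at every token.
zeroed = dropped == 0
print(zeroed[:, 0, :])  # the same pattern repeats for every token position
```

In a real model, the `SpatialDropout1D` layer in Keras applies this kind of per-channel mask during training only.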
After our Spatial Dropout layer, we use a Global Average Pooling layer, and that's it. This simple style of NLP model doesn't work well on all types of datasets, but it tends to work well when the meaning of the words is abstract, as is often true in poems.
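The architecture described above could be sketched roughly as follows. The vocabulary size, embedding width, dropout rate, optimizer, and loss are all assumptions, since the text does not state them, and the final Dense softmax layer is added here only so the sketch can output the four classes.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 10000   # assumed vocabulary size
EMBED_DIM = 32       # assumed embedding width
NUM_CLASSES = 4      # Affection, Death, Environment, Love

model = tf.keras.Sequential([
    # Map each token id to a dense vector.
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    # Drop whole embedding channels during training (assumed rate).
    tf.keras.layers.SpatialDropout1D(0.2),
    # Average the token vectors into one vector per poem.
    tf.keras.layers.GlobalAveragePooling1D(),
    # Assumed output layer: one probability per class.
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer="adam",                        # assumed optimizer
    loss="sparse_categorical_crossentropy",  # assumes integer class labels
    metrics=["accuracy"],
)

# Dummy batch of 3 poems, each padded to 20 token ids, to check the shapes.
dummy_batch = np.zeros((3, 20), dtype="int32")
probabilities = model(dummy_batch).numpy()
print(probabilities.shape)
```

Averaging over tokens keeps the model small, which may be part of why this style holds up on a dataset where individual words carry loose, abstract meaning.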
Follow Data Science Teacher Brandyn
Here we first tried to use replace in Pandas, but \xa0 turns out to be a special character that represents a non-breaking space, and we had to use the split and join functions to fix this.
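A minimal sketch of the split-and-join cleanup is shown below. The function name and sample text are hypothetical; the approach relies on Python's `str.split()` treating `\xa0` as whitespace when called without arguments.

```python
def clean_text(text: str) -> str:
    """Collapse every run of whitespace, including \xa0, into a single space."""
    return " ".join(text.split())

# Hypothetical line of a poem containing non-breaking spaces.
raw = "Shall I compare thee\xa0to a summer's\xa0\xa0day?"
cleaned = clean_text(raw)
print(cleaned)
```

With a DataFrame, the same function could be applied to a text column with something like `df["Poem"].apply(clean_text)`, where the column name is an assumption.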