Emoji Modelling - Building an Emoji Autocomplete through Deep Neural Nets

Large Language Models have taken the world by storm, and it’s important to understand how they work internally. In this edition I will focus on embeddings and how language models use word embeddings to represent words and capture the similarities (or relationships) between them.

LLMs, like other NLP models, use a technique called word embeddings (or more generally, token embeddings). These embeddings are dense vector representations of words (or sub-word units called tokens). The key idea is that words with similar meanings or that appear in similar contexts are located closer to each other in this high-dimensional vector space.
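As a toy illustration (separate from the model we build below), here is roughly what a token embedding table looks like in PyTorch. The vocabulary size, embedding dimension, and token ids are made-up values, and the similarity is only meaningful once the table has been trained.

```python
import torch
import torch.nn.functional as F

# Hypothetical vocabulary of 10,000 tokens, each mapped to a 64-dimensional vector.
embedding = torch.nn.Embedding(num_embeddings=10_000, embedding_dim=64)

# Suppose a tokenizer assigned "happy" the id 42 and "joyful" the id 108.
happy = embedding(torch.tensor(42))
joyful = embedding(torch.tensor(108))

# After training, tokens that appear in similar contexts end up with a higher
# cosine similarity; at initialization this value is essentially random.
print(F.cosine_similarity(happy, joyful, dim=0).item())
```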

Keeping this idea in mind, I will apply the same concept, not to textual data but to emojis. Yes, emojis! 🙂

We use emojis heavily while chatting, and if we look at texts or tweets that contain emojis we can see clear patterns. A text with a happy or positive sentiment is often followed by a smiley or a heart emoji, while a text or tweet with a negative or outraged sentiment tends to carry a sad or angry face emoji.

We will use this idea to predict or autocomplete the next set of emojis given an initial emoji.




Setting up the Problem Statement

Let’s set up a problem statement for our neural network to work on so that we can build our autocomplete engine. A neural network learns through backpropagation: we feed it input data along with the desired outcome, and at every iteration we measure how far the network’s prediction is from that outcome. Backpropagation then uses this loss value at every step to update the weights, making the network’s predictions more accurate and general.

Now let’s frame the problem for our neural network: given a set of emojis as input, it should predict the emojis that come next. For this, the network needs to be trained on a dataset consisting of series of emojis.

We pick this data from a public dataset of tweets. When we extract emojis from tweets, their context is preserved, something we cannot achieve by picking emojis from a standard tabular dataset. By "context is preserved" I mean that, in tweets, positive emojis follow positive emojis; it is very unlikely that we will see positive emojis following negative ones (that wouldn’t make sense).



This way we preserve the order of the emojis so that the sequences make sense, and a neural network trained on them can predict meaningful emojis. Following up on that, I picked a tweet dataset containing emojis and extracted all the emojis out of it while preserving their order, roughly as sketched below.
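A rough sketch of that extraction step is shown here. The regex only covers the most common emoji Unicode blocks (it ignores skin-tone modifiers and ZWJ sequences), and the file name `tweets.txt` is a placeholder; the original pipeline may differ.

```python
import re

# Simplified pattern covering the most common emoji Unicode blocks
# (symbols & pictographs, emoticons, transport, supplemental symbols, misc symbols).
EMOJI_PATTERN = re.compile(
    "[\U0001F300-\U0001F5FF"
    "\U0001F600-\U0001F64F"
    "\U0001F680-\U0001F6FF"
    "\U0001F900-\U0001F9FF"
    "\u2600-\u27BF]+"
)

def extract_emojis(tweet: str) -> list[str]:
    """Return the emojis of a tweet, one per element, in their original order."""
    return [ch for chunk in EMOJI_PATTERN.findall(tweet) for ch in chunk]

# Hypothetical usage over a file with one tweet per line.
with open("tweets.txt", encoding="utf-8") as f:
    emoji_sequences = [extract_emojis(line) for line in f]
emoji_sequences = [seq for seq in emoji_sequences if seq]  # drop tweets without emojis
```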

Below are some sample tweets from the dataset used to train the neural net.

[Image: sample tweets containing emojis from the dataset]

From the content above we can see that, by preserving the order of the emojis, we get multiple groups of emojis clubbed together that express a sentiment: happy, sad, enraged, angry, fearful, loving, and many more.


Once the series of emojis are extracted from the dataset, we turn them into 3:1 pairs: 3 consecutive emojis form the input and the 4th one becomes the target. In this way we slide over the entire series of emojis to build the training dataset for our network, as sketched below.
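Here is a minimal sketch of that windowing step, assuming the `emoji_sequences` list from the extraction sketch above and a context size of 3.

```python
import torch

CONTEXT_SIZE = 3

# Build the emoji vocabulary (the article's dataset ends up with 839 distinct emojis).
vocab = sorted({e for seq in emoji_sequences for e in seq})
stoi = {e: i for i, e in enumerate(vocab)}
itos = {i: e for e, i in stoi.items()}

# Slide a window of 3 emojis over every sequence; the 4th emoji is the target.
xs, ys = [], []
for seq in emoji_sequences:
    ids = [stoi[e] for e in seq]
    for i in range(len(ids) - CONTEXT_SIZE):
        xs.append(ids[i : i + CONTEXT_SIZE])
        ys.append(ids[i + CONTEXT_SIZE])

X = torch.tensor(xs)  # shape: (num_examples, 3)
Y = torch.tensor(ys)  # shape: (num_examples,)
```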


After processing the tweets, extracting the emojis, and building the input dataset for our neural network, we end up with approximately 90K sets of emojis with their targets.


Neural Network Architecture

With our dataset in place, it’s time to build our neural network architecture. I will be using a fairly simple architecture, shown below.

Sequential(
  (0): Linear(in_features=30, out_features=100, bias=True)
  (1): Tanh()
  (2): Linear(in_features=100, out_features=50, bias=True)
  (3): Tanh()
  (4): Linear(in_features=50, out_features=839, bias=True)
)



I have used a 3-layer neural network with a Tanh activation after each of the first two layers. The last layer returns logits, which give the probability of each emoji being the next one in the sequence. By comparing the predicted probability of the target emoji against the truth, we can train the network to improve through backprop.

We have also used an emoji embedding: a table that represents every emoji in our dataset by a 10-dimensional vector. The three input emojis are looked up in this table and concatenated, which is where the 30 input features in the first layer come from. Training the network also trains these embedding vectors so that the network can make meaningful predictions.
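Putting the embedding table and the MLP together, a sketch of the full model could look like this. The layer sizes match the printout above; the variable names and the way the lookup is wired into the forward pass are my own.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 839   # number of distinct emojis
EMB_DIM = 10       # each emoji is a 10-dimensional vector
CONTEXT_SIZE = 3   # 3 input emojis -> 3 * 10 = 30 input features

emb = nn.Embedding(VOCAB_SIZE, EMB_DIM)
mlp = nn.Sequential(
    nn.Linear(CONTEXT_SIZE * EMB_DIM, 100),
    nn.Tanh(),
    nn.Linear(100, 50),
    nn.Tanh(),
    nn.Linear(50, VOCAB_SIZE),   # logits over all emojis
)

def forward(x):
    """x: (batch, 3) tensor of emoji indices -> (batch, 839) logits."""
    e = emb(x)                          # (batch, 3, 10)
    return mlp(e.view(x.shape[0], -1))  # flatten to (batch, 30), then the MLP
```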


Training the Neural Network

We trained the neural net defined above on the emoji dataset, with batches of 1024 emoji sets picked at random, for about 50K iterations, and reached a loss of approximately 2.5.
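A sketch of that training loop, assuming the `X`, `Y` tensors and the `emb`, `mlp`, `forward` pieces from the sketches above; the optimiser and learning rate are assumptions, since the article does not state them.

```python
import torch
import torch.nn.functional as F

# Assumed optimiser and learning rate; the original training setup is not specified.
optimizer = torch.optim.Adam(list(emb.parameters()) + list(mlp.parameters()), lr=1e-3)

for step in range(50_000):
    # Pick a random batch of 1024 (context, target) pairs.
    idx = torch.randint(0, X.shape[0], (1024,))
    logits = forward(X[idx])                # (1024, 839)
    loss = F.cross_entropy(logits, Y[idx])  # compare against the true next emoji

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if step % 5_000 == 0:
        print(f"step {step}: loss {loss.item():.3f}")
```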



Inference

Now it’s time to use the trained network to predict the next set of emojis from a starter emoji. Let’s see how well our network does.
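Here is a sketch of the inference loop: the starter emoji is repeated to fill the 3-emoji context (that padding strategy is my assumption), and the next emoji is sampled from the predicted distribution one step at a time.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def autocomplete(start_emoji: str, length: int = 5) -> str:
    # Pad the context with the starter emoji so the first input has 3 entries (an assumption).
    context = [stoi[start_emoji]] * CONTEXT_SIZE
    out = [start_emoji]
    for _ in range(length):
        logits = forward(torch.tensor([context]))      # (1, 839)
        probs = F.softmax(logits, dim=-1)
        nxt = torch.multinomial(probs, num_samples=1).item()
        out.append(itos[nxt])
        context = context[1:] + [nxt]                  # slide the 3-emoji window
    return "".join(out)

print(autocomplete("😀"))
```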

These were some predictions made by our network.

[Image: sample predictions made by the network]

You can see our model predicting some decent, related emojis!

A more sophisticated model trained on a denser, cleaner dataset could yield an even better emoji autocomplete.


Conclusion

You can check out my YouTube channel for more technical content like this.

Meanwhile, you can Like and Share this edition with your peers, and Subscribe to this newsletter to get notified when I publish more content in the future.

Until next time, Dive Deep and Keep Learning!


