{"id":10459,"date":"2024-07-17T10:11:58","date_gmt":"2024-07-17T10:11:58","guid":{"rendered":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/"},"modified":"2024-11-25T11:36:27","modified_gmt":"2024-11-25T11:36:27","slug":"semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21","status":"publish","type":"post","link":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/","title":{"rendered":"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80"},"content":{"rendered":"<p class=\"wp-block-paragraph\"><\/p>\n<p class=\"wp-block-paragraph\">If you are on social media like Twitter or LinkedIn, you have probably noticed that emojis are creatively used in both informal and professional text-based communication. For example, the <em>Rocket<\/em> emoji \ud83d\ude80 is often used on LinkedIn to symbolize high aspirations and ambitious goals, while the <em>Bullseye<\/em> \ud83c\udfaf emoji is used in the context of achieving goals. Despite this growth of creative emoji use, most social media platforms lack a utility that assists users in choosing the right emoji to effectively communicate their message. I therefore decided to invest some time to work on a project I called Emojeez \ud83d\udc8e, an AI-powered engine for emoji search and retrieval. You can experience Emojeez \ud83d\udc8e live using this fun <a href=\"https:\/\/emojeez.streamlit.app\/\">interactive demo<\/a>.<\/p>\n<p class=\"wp-block-paragraph\">In this article, I will discuss my experience and explain how I employed advanced <strong>natural language processing<\/strong> (NLP) technologies to develop a <strong>semantic search engine<\/strong> for emojis. 
Concretely, I will present a case study on embedding-based semantic search with the following steps:<\/p>\n<ol class=\"wp-block-list\">\n<li>How to use <strong>LLMs<\/strong> \ud83e\udd9c to generate semantically rich emoji descriptions<\/li>\n<li>How to use Hugging Face \ud83e\udd17 <strong>Transformers<\/strong> for multilingual embeddings<\/li>\n<li>How to integrate the <strong>Qdrant<\/strong> \ud83e\uddd1\ud83c\udffb\u200d\ud83d\ude80 vector database to perform efficient semantic search<\/li>\n<\/ol>\n<p class=\"wp-block-paragraph\">I made the full code for this project available on <a href=\"https:\/\/github.com\/badrex\/emojeez\">GitHub<\/a>.<\/p>\n<h2 class=\"wp-block-heading\">Inspiration\ud83d\udca1<\/h2>\n<p class=\"wp-block-paragraph\">Every new idea begins with a spark of inspiration. For me, the spark came from Luciano Ramalho&#8217;s book <em>Fluent Python<\/em>. It is a fantastic read that I highly recommend for anyone who likes to write truly Pythonic code. In chapter 4 of his book, Luciano shows how to search over Unicode characters by querying their names in the Unicode standard. He created a Python utility that takes a query like &quot;cat smiling&quot; and retrieves all Unicode characters that have both &quot;cat&quot; and &quot;smiling&quot; in their names. Given the query &quot;cat smiling&quot;, the utility retrieves three emojis: \ud83d\ude3b, \ud83d\ude3a, and \ud83d\ude38. Pretty cool, right?<\/p>\n<p class=\"wp-block-paragraph\">From there, I started thinking about how modern AI technology could be used to build an even better emoji search utility. 
By &quot;better,&quot; I envisioned a search engine that not only has better emoji coverage but also supports user queries in multiple languages beyond English.<\/p>\n<h2 class=\"wp-block-heading\">Limitations of Keyword Search \ud83d\ude13<\/h2>\n<p class=\"wp-block-paragraph\">If you are an emoji enthusiast, you know that \ud83d\ude3b, \ud83d\ude3a, and \ud83d\ude38 aren&#8217;t the only smiley cat emojis out there. Some cat emojis are missing, notably \ud83d\ude3c and \ud83d\ude39. This is a known limitation of keyword search algorithms, which rely on string matching to retrieve relevant items. Keyword, or <strong>lexical search<\/strong>, algorithms are known among information retrieval practitioners to have <strong>high precision<\/strong> but <strong>low recall<\/strong>. High precision means the retrieved items usually match the user query well. On the other hand, low recall means the algorithm might not retrieve all relevant items. In many cases, the lower recall is due to string matching. For example, the emoji \ud83d\ude39 does not have &quot;smiling&quot; in its name &#8211; <em>cat with tears of joy<\/em>. Therefore, it cannot be retrieved with the query &quot;cat smiling&quot; if we search for both terms <em>cat<\/em> and <em>smiling<\/em> in its name.<\/p>\n<p class=\"wp-block-paragraph\">Another issue with lexical search is that it is usually <strong>language-specific<\/strong>. In Luciano&#8217;s Fluent Python example, you can&#8217;t find emojis using a query in another language because all Unicode characters, including emojis, have English names. To support other languages, we would need to translate each query into English first using machine translation. This would add more complexity and might not work well for all languages.<\/p>\n<p class=\"wp-block-paragraph\">But hey, it&#8217;s 2024 and AI has come a long way. We now have solutions to address these limitations. 
In the rest of this article, I will show you how.<\/p>\n<h2 class=\"wp-block-heading\">Embedding-based Semantic Search \u2728<\/h2>\n<p class=\"wp-block-paragraph\">In recent years, a new search paradigm has emerged with the popularity of deep neural networks for NLP. In this paradigm, the search algorithm does not look at the strings that make up the items in the search database or the query. Instead, it operates on numerical representations of text, known as <strong>vector embeddings<\/strong>. In embedding-based search algorithms, the search items, whether text documents or visual images, are first converted into data points in a vector space such that <strong>semantically relevant<\/strong> items are nearby. Embeddings enable us to perform similarity search based on the meaning of the emoji description rather than the keywords in its name. Because they retrieve items based on <strong>semantic similarity<\/strong> rather than keyword similarity, embedding-based search algorithms are known as semantic search.<\/p>\n<p class=\"wp-block-paragraph\">Using semantic search for emoji retrieval solves two problems:<\/p>\n<ol class=\"wp-block-list\">\n<li>We can go beyond keyword matching and use semantic similarity between emoji descriptions and user queries. This improves the coverage of the retrieved emojis, leading to higher recall.<\/li>\n<li>If we represent emojis as data points in a <strong>multilingual embedding<\/strong> space, we can enable user queries written in languages other than English, without needing translation into English. That is very cool, isn&#8217;t it? Let&#8217;s see how \ud83d\udc40 <\/li>\n<\/ol>\n<h2 class=\"wp-block-heading\">Step 1: Generating Rich Emoji Descriptions using LLMs \ud83e\udd9c<\/h2>\n<p class=\"wp-block-paragraph\">If you use social media, you probably know that many emojis are almost never used literally. For example, \ud83c\udf46  and \ud83c\udf51  rarely denote an <em>eggplant<\/em> and <em>peach<\/em>. 
Social media users are very creative in assigning meanings to emojis that go beyond their literal interpretation. This creativity means that the emoji names in the Unicode standard often fail to capture how emojis are actually used. A notable example is the \ud83c\udf08 emoji, whose Unicode name is simply <em>rainbow<\/em>, yet it is commonly used in contexts related to diversity, peace, and the LGBTQ+ community.<\/p>\n<p class=\"wp-block-paragraph\">To build a useful search engine, we need a rich semantic description for each emoji that defines what the emoji represents and what it symbolizes. Given that there are more than 5000 emojis in the current Unicode standards, doing this manually is not feasible. Luckily, we can employ <strong>Large Language Models<\/strong> (LLMs) to assist us in generating metadata for each emoji. Since LLMs are trained on large portions of the web, they have likely seen how each emoji is used in context.<\/p>\n<p class=\"wp-block-paragraph\">For this task, I used the \ud83e\udd99 <strong>Llama 3<\/strong> LLM to generate metadata for each emoji. I wrote a prompt to define the task and what the LLM is expected to do. As illustrated in the figure below, the LLM generated a rich semantic description for the <em>Bullseye<\/em> \ud83c\udfaf emoji. These descriptions are more suitable for semantic search than Unicode names. I released the LLM-generated descriptions as a Hugging Face <a href=\"https:\/\/huggingface.co\/datasets\/badrex\/llm-emoji-dataset\">dataset<\/a>.<\/p>\n<p class=\"wp-block-paragraph\"><\/p>\n<h2 class=\"wp-block-heading\">Step 2: Representing Emojis as Embeddings using Sentence Transformers \ud83d\udd04<\/h2>\n<p class=\"wp-block-paragraph\">Now that we have a rich semantic description for each emoji in the Unicode standard, the next step is to represent each emoji as a vector embedding in a multidimensional space that captures the meaning of the emoji description. 
For this task, I used a multilingual transformer based on the <strong>BERT<\/strong> architecture, fine-tuned for sentence similarity across 50 languages. You can see the supported languages in the model <a href=\"https:\/\/huggingface.co\/sentence-transformers\/paraphrase-multilingual-MiniLM-L12-v2\">card<\/a> on the Hugging Face \ud83e\udd17 Hub.<\/p>\n<p class=\"wp-block-paragraph\">So far, I have only discussed the embedding of emoji descriptions generated by the LLM, which are in English. But how can we support languages other than English?<\/p>\n<p class=\"wp-block-paragraph\">Well, here&#8217;s where the magic of multilingual transformers comes in. The multilingual support is enabled through the embedding space itself. This means we can take user queries in any of the 50 supported languages and match them to emojis based on their English descriptions. The multilingual sentence encoder (or embedding model) maps semantically similar text phrases to nearby points in its embedding space. Let me show you what I mean with the following illustration.<\/p>\n<p class=\"wp-block-paragraph\"><\/p>\n<p class=\"wp-block-paragraph\">In the figure above, we see that semantically similar phrases end up as nearby data points in the embedding space, even if they are expressed in different languages. Multilingual sentence Transformers enable <strong>cross-lingual search<\/strong> applications, so user queries and indexed search items do not have to be expressed in the same language.<\/p>\n<h2 class=\"wp-block-heading\">Step 3: Integrating Qdrant&#8217;s Vector Database \ud83e\uddd1\ud83c\udffb\u200d\ud83d\ude80<\/h2>\n<p class=\"wp-block-paragraph\">Once we have our emojis represented as vector embeddings, the next step is to build an index over these embeddings in a way that allows for efficient search operations. 
For this purpose, I chose to use <strong>Qdrant<\/strong>, an open-source vector similarity search engine that provides high-performance search capabilities.<\/p>\n<p class=\"wp-block-paragraph\">Setting up Qdrant for this task is as simple as the code snippet below (you can also check out this Jupyter <a href=\"https:\/\/github.com\/badrex\/emojeez\/blob\/main\/notebooks\/emoji_search_notebook.ipynb\">Notebook<\/a>).<\/p>\n<pre class=\"wp-block-code\"><code>import pickle\nfrom typing import Any, Dict\n\nimport numpy as np\nfrom qdrant_client import QdrantClient, models\n\n# Load the emoji dictionary from a pickle file\n# (file_path is the path to the pickled emoji dictionary)\nwith open(file_path, &#039;rb&#039;) as file:\n    emoji_dict: Dict[str, Dict[str, Any]] = pickle.load(file)\n\n# Set up the Qdrant client and populate the database\nvector_DB_client = QdrantClient(&quot;:memory:&quot;)\n\nembedding_dict = {\n    emoji: np.array(metadata[&#039;embedding&#039;])\n    for emoji, metadata in emoji_dict.items()\n}\n\n# Remove the embeddings from the dictionary so it can be used\n# as payload in Qdrant\nfor emoji in list(emoji_dict):\n    del emoji_dict[emoji][&#039;embedding&#039;]\n\nembedding_dim: int = next(iter(embedding_dict.values())).shape[0]\n\n# Create a new collection in Qdrant\nvector_DB_client.create_collection(\n    collection_name=&quot;EMOJIS&quot;,\n    vectors_config=models.VectorParams(\n        size=embedding_dim,\n        distance=models.Distance.COSINE\n    ),\n)\n\n# Upload vectors to the collection\nvector_DB_client.upload_points(\n    collection_name=&quot;EMOJIS&quot;,\n    points=[\n        models.PointStruct(\n            id=idx,\n            vector=embedding_dict[emoji].tolist(),\n            payload=emoji_dict[emoji]\n        )\n        for idx, emoji in enumerate(emoji_dict)\n    ],\n)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Now the search index <em>vector_DB_client<\/em> is ready to take queries. All we need to do is transform the incoming user query into a vector embedding using the same embedding model we used to embed the emoji descriptions. 
This can be done through the function below.<\/p>\n<pre class=\"wp-block-code\"><code>from typing import List\n\nfrom qdrant_client import QdrantClient, models\nfrom sentence_transformers import SentenceTransformer\n\ndef retrieve_relevant_emojis(\n        embedding_model: SentenceTransformer,\n        vector_DB_client: QdrantClient,\n        query: str,\n        num_to_retrieve: int) -&gt; List[models.ScoredPoint]:\n    &quot;&quot;&quot;\n    Return the Qdrant hits (scored points whose payload holds the emoji)\n    most relevant to the query, using the sentence encoder and Qdrant.\n    &quot;&quot;&quot;\n\n    # Embed the query\n    query_vector = embedding_model.encode(query).tolist()\n\n    # Retrieve the nearest neighbors of the query vector\n    hits = vector_DB_client.search(\n        collection_name=&quot;EMOJIS&quot;,\n        query_vector=query_vector,\n        limit=num_to_retrieve,\n    )\n\n    return hits<\/code><\/pre>\n<p class=\"wp-block-paragraph\">To display the retrieved emojis along with their similarity scores and Unicode names, I wrote the following helper function.<\/p>\n<pre class=\"wp-block-code\"><code>import emoji as em  # assumption: em refers to the emoji package\n\ndef show_top_10(query: str) -&gt; None:\n    &quot;&quot;&quot;\n    Show emojis that are most relevant to the query.\n    &quot;&quot;&quot;\n    # sentence_encoder is the SentenceTransformer model that was used\n    # to embed the emoji descriptions\n    emojis = retrieve_relevant_emojis(\n        sentence_encoder,\n        vector_DB_client,\n        query,\n        num_to_retrieve=10\n    )\n\n    for i, hit in enumerate(emojis, start=1):\n\n        emoji_char = hit.payload[&#039;Emoji&#039;]\n        score = hit.score\n\n        space = len(emoji_char) + 3\n\n        # demojize returns a name like &#039;:nazar_amulet:&#039;\n        unicode_desc = &#039; &#039;.join(\n           em.demojize(emoji_char).split(&#039;_&#039;)\n        ).upper()\n\n        print(f&quot;{i:&lt;3} {emoji_char:&lt;{space}}&quot;, end=&#039;&#039;)\n        print(f&quot;{score:&lt;7.3f}&quot;, end=&#039;&#039;)\n        # Strip the leading and trailing colons from the name\n        print(f&quot;{unicode_desc[1:-1]:&lt;55}&quot;)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Now everything is set up, and we can look at a few examples. Remember the &quot;cat smiling&quot; query from Luciano&#8217;s book? 
Let&#8217;s see how semantic search is different from keyword search.<\/p>\n<pre class=\"wp-block-code\"><code>&gt;&gt;&gt; show_top_10(&#039;cat smiling&#039;)\n1   \ud83d\ude3c    0.651  CAT WITH WRY SMILE\n2   \ud83d\ude38    0.643  GRINNING CAT WITH SMILING EYES\n3   \ud83d\ude39    0.611  CAT WITH TEARS OF JOY\n4   \ud83d\ude3b    0.603  SMILING CAT WITH HEART-EYES\n5   \ud83d\ude3a    0.596  GRINNING CAT\n6   \ud83d\udc31    0.522  CAT FACE\n7   \ud83d\udc08    0.513  CAT\n8   \ud83d\udc08\u200d\u2b1b    0.495  BLACK CAT\n9   \ud83d\ude3d    0.468  KISSING CAT\n10  \ud83d\udc06    0.452  LEOPARD<\/code><\/pre>\n<p class=\"wp-block-paragraph\">Awesome! Not only did we get the expected cat emojis like \ud83d\ude38, \ud83d\ude3a, and \ud83d\ude3b, which the keyword search retrieved, but we also got the smiley cats \ud83d\ude3c, \ud83d\ude39, \ud83d\udc31, and \ud83d\ude3d. This showcases the higher recall, or higher coverage of the retrieved items, I mentioned earlier. Indeed, more cats is always better!<\/p>\n<h2 class=\"wp-block-heading\">The Real Power of Semantic Search \ud83e\ude84<\/h2>\n<p class=\"wp-block-paragraph\">The previous &quot;cat smiling&quot; example shows how embedding-based semantic search can retrieve a broader and more meaningful set of items, improving the overall search experience. However, I don&#8217;t think this example truly shows the power of semantic search.<\/p>\n<p class=\"wp-block-paragraph\">Imagine looking for something but not knowing its name. For example, take the \ud83e\uddff object. Do you know what it&#8217;s called in English? I sure didn&#8217;t. 
But I know a bit about it. In Middle Eastern and Central Asian cultures, the \ud83e\uddff is believed to protect against the evil eye. So, I knew what it does but not what it&#8217;s called.<\/p>\n<p class=\"wp-block-paragraph\">Let&#8217;s see if we can find the emoji \ud83e\uddff with our search engine by describing it using the query &quot;protect from evil eye&quot;.<\/p>\n<pre class=\"wp-block-code\"><code>&gt;&gt;&gt; show_top_10(&#039;protect from evil eye&#039;)\n1   \ud83e\uddff   0.409  NAZAR AMULET\n2   \ud83d\udc53    0.405  GLASSES\n3   \ud83e\udd7d   0.387  GOGGLES\n4   \ud83d\udc41    0.383  EYE\n5   \ud83e\uddb9\ud83c\udffb    0.382  SUPERVILLAIN LIGHT SKIN TONE\n6   \ud83d\udc40    0.374  EYES\n7   \ud83e\uddb9\ud83c\udfff    0.370  SUPERVILLAIN DARK SKIN TONE\n8   \ud83d\udee1\ufe0f    0.369  SHIELD\n9   \ud83e\uddb9\ud83c\udffc    0.366  SUPERVILLAIN MEDIUM-LIGHT SKIN TONE\n10  \ud83e\uddb9\ud83c\udffb\u200d\u2642    0.364  MAN SUPERVILLAIN LIGHT SKIN TONE<\/code><\/pre>\n<p class=\"wp-block-paragraph\">And voil\u00e0! It turns out that the \ud83e\uddff is actually called <em>Nazar Amulet<\/em>. I learned something new \ud83d\ude04<\/p>\n<h2 class=\"wp-block-heading\">Going Beyond English \ud83c\udf0d \ud83c\udf0f \ud83c\udf0e<\/h2>\n<p class=\"wp-block-paragraph\">One feature I really wanted this search engine to have is support for as many languages besides English as possible. So far, we have not tested that. 
Let&#8217;s test the multilingual capabilities using the description of the <em>Nazar Amulet<\/em> \ud83e\uddff emoji by translating the phrase &quot;protection from evil eyes&quot; into other languages and using the translations as queries, one language at a time. Here are the results for some languages.<\/p>\n<h2 class=\"wp-block-heading\">Arabic<\/h2>\n<pre class=\"wp-block-code\"><code>&gt;&gt;&gt; show_top_10(&#039;\u064a\u062d\u0645\u064a \u0645\u0646 \u0627\u0644\u0639\u064a\u0646 \u0627\u0644\u0634\u0631\u064a\u0631\u0629&#039;) # Arabic\n1   \ud83e\uddff   0.442  NAZAR AMULET\n2   \ud83d\udc53    0.430  GLASSES\n3   \ud83d\udc41    0.414  EYE\n4   \ud83e\udd7d   0.403  GOGGLES\n5   \ud83d\udc40    0.403  EYES\n6   \ud83e\uddb9\ud83c\udffb    0.398  SUPERVILLAIN LIGHT SKIN TONE\n7   \ud83d\ude48    0.394  SEE-NO-EVIL MONKEY\n8   \ud83e\udee3   0.387  FACE WITH PEEKING EYE\n9   \ud83e\udddb\ud83c\udffb    0.385  VAMPIRE LIGHT SKIN TONE\n10  \ud83e\uddb9\ud83c\udffc    0.383  SUPERVILLAIN MEDIUM-LIGHT SKIN TONE<\/code><\/pre>\n<h2 class=\"wp-block-heading\">German<\/h2>\n<pre class=\"wp-block-code\"><code>&gt;&gt;&gt; show_top_10(&#039;Vor dem b\u00f6sen Blick sch\u00fctzen&#039;) # German\n1   \ud83d\ude37    0.369  FACE WITH MEDICAL MASK\n2   \ud83e\udee3   0.364  FACE WITH PEEKING EYE\n3   \ud83d\udee1\ufe0f    0.360  SHIELD\n4   \ud83d\ude48    0.359  SEE-NO-EVIL MONKEY\n5   \ud83d\udc40    0.353  EYES
\n6   \ud83d\ude49    0.350  HEAR-NO-EVIL MONKEY\n7   \ud83d\udc41    0.346  EYE\n8   \ud83e\uddff   0.345  NAZAR AMULET\n9   \ud83d\udc82\ud83c\udfff\u200d\u2640\ufe0f    0.345  WOMAN GUARD DARK SKIN TONE\n10  \ud83d\udc82\ud83c\udfff\u200d\u2640    0.345  WOMAN GUARD DARK SKIN TONE<\/code><\/pre>\n<h2 class=\"wp-block-heading\">Greek<\/h2>\n<pre class=\"wp-block-code\"><code>&gt;&gt;&gt; show_top_10(&#039;\u03a0\u03c1\u03bf\u03c3\u03c4\u03b1\u03c4\u03ad\u03c8\u03c4\u03b5 \u03b1\u03c0\u03cc \u03c4\u03bf \u03ba\u03b1\u03ba\u03cc \u03bc\u03ac\u03c4\u03b9&#039;) # Greek\n1   \ud83d\udc53    0.497  GLASSES\n2   \ud83e\udd7d   0.484  GOGGLES\n3   \ud83d\udc41    0.452  EYE\n4   \ud83d\udd76\ufe0f    0.430  SUNGLASSES\n5   \ud83d\udd76    0.430  SUNGLASSES\n6   \ud83d\udc40    0.429  EYES\n7   \ud83d\udc41\ufe0f    0.415  EYE\n8   \ud83e\uddff   0.411  NAZAR AMULET\n9   \ud83e\udee3   0.404  FACE WITH PEEKING EYE\n10  \ud83d\ude37    0.391  FACE WITH MEDICAL MASK<\/code><\/pre>\n<h2 class=\"wp-block-heading\">Bulgarian<\/h2>\n<pre class=\"wp-block-code\"><code>&gt;&gt;&gt; show_top_10(&#039;\u0417\u0430\u0449\u0438\u0442\u0435\u0442\u0435 \u043e\u0442 \u043b\u043e\u0448\u043e\u0442\u043e \u043e\u043a\u043e&#039;) # Bulgarian\n1   \ud83d\udc53    0.475  GLASSES\n2   \ud83e\udd7d   0.452  GOGGLES
\n3   \ud83d\udc41    0.448  EYE\n4   \ud83d\udc40    0.418  EYES\n5   \ud83d\udc41\ufe0f    0.412  EYE\n6   \ud83e\udee3   0.397  FACE WITH PEEKING EYE\n7   \ud83d\udd76\ufe0f    0.387  SUNGLASSES\n8   \ud83d\udd76    0.387  SUNGLASSES\n9   \ud83d\ude1d    0.375  SQUINTING FACE WITH TONGUE\n10  \ud83e\uddff   0.373  NAZAR AMULET<\/code><\/pre>\n<h2 class=\"wp-block-heading\">Chinese<\/h2>\n<pre class=\"wp-block-code\"><code>&gt;&gt;&gt; show_top_10(&#039;\u9632\u6b62\u90aa\u773c&#039;) # Chinese\n1   \ud83d\udc53    0.425  GLASSES\n2   \ud83e\udd7d   0.397  GOGGLES\n3   \ud83d\udc41    0.392  EYE\n4   \ud83e\uddff   0.383  NAZAR AMULET\n5   \ud83d\udc40    0.380  EYES\n6   \ud83d\ude48    0.370  SEE-NO-EVIL MONKEY\n7   \ud83d\ude37    0.369  FACE WITH MEDICAL MASK\n8   \ud83d\udd76\ufe0f    0.363  SUNGLASSES\n9   \ud83d\udd76    0.363  SUNGLASSES\n10  \ud83e\udee3   0.360  FACE WITH PEEKING EYE<\/code><\/pre>\n<h2 class=\"wp-block-heading\">Japanese<\/h2>\n<pre class=\"wp-block-code\"><code>&gt;&gt;&gt; show_top_10(&#039;\u90aa\u773c\u304b\u3089\u5b88\u308b&#039;) # Japanese\n1   \ud83d\ude48    0.379  SEE-NO-EVIL MONKEY\n2   \ud83e\uddff   0.379  NAZAR 
AMULET\n3   \ud83d\ude49    0.370  HEAR-NO-EVIL MONKEY\n4   \ud83d\ude37    0.363  FACE WITH MEDICAL MASK\n5   \ud83d\ude4a    0.363  SPEAK-NO-EVIL MONKEY\n6   \ud83e\udee3   0.355  FACE WITH PEEKING EYE\n7   \ud83d\udee1\ufe0f    0.355  SHIELD\n8   \ud83d\udc41    0.351  EYE\n9   \ud83e\uddb9\ud83c\udffc    0.350  SUPERVILLAIN MEDIUM-LIGHT SKIN TONE\n10  \ud83d\udc53    0.350  GLASSES<\/code><\/pre>\n<p class=\"wp-block-paragraph\">For languages as diverse as Arabic, German, Greek, Bulgarian, Chinese, and Japanese, the \ud83e\uddff emoji always appears in the top 10! This is pretty fascinating, since these languages have different linguistic features and writing systems. It is possible thanks to the massive multilinguality of our \ud83e\udd17 sentence Transformer.<\/p>\n<h2 class=\"wp-block-heading\">Limits of AI \ud83d\ude48<\/h2>\n<p class=\"wp-block-paragraph\">The last thing I want to mention is that no technology, no matter how advanced, is perfect. Semantic search is great for improving the recall of information retrieval systems. This means we can retrieve more relevant items even if there is no keyword overlap between the query and the items in the search index. However, this comes at the expense of precision. Remember from the \ud83e\uddff emoji example that in some languages, the emoji we were looking for didn&#8217;t show up in the top 5 results. For this application, this is not a big problem since it&#8217;s not cognitively demanding to quickly scan through emojis to find the one we desire, even if it&#8217;s ranked at the 50th position. 
But in other cases, such as searching through long documents, users may have neither the patience nor the resources to skim through dozens of documents. Developers need to keep in mind users&#8217; cognitive and resource constraints when building search engines. Some of the design choices I made for the Emojeez \ud83d\udc8e search engine may not work as well for other applications.<\/p>\n<p class=\"wp-block-paragraph\">Another thing to mention is that AI models are known to learn <strong>socio-cultural biases<\/strong> from their training data. There is a large volume of documented research showing how modern language technology can amplify <strong>gender stereotypes<\/strong> and be unfair to <strong>minorities<\/strong>. So, we need to be aware of these issues and do our best to tackle them when deploying AI in the real world. If you notice such unwanted biases and unfair behaviors in Emojeez \ud83d\udc8e, please let me know and I will do my best to address them.<\/p>\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n<p class=\"wp-block-paragraph\">Working on the Emojeez \ud83d\udc8e project was a fascinating journey that taught me a lot about how modern AI and NLP technologies can be employed to address the limitations of traditional keyword search. By harnessing the power of Large Language Models for enriching emoji metadata, multilingual transformers for creating semantic embeddings, and Qdrant for efficient vector search, I was able to create a search engine that makes emoji search more fun and accessible across 50+ languages. Although this project focuses on emoji search, the underlying technology has potential applications in multimodal search and recommendation systems.<\/p>\n<p class=\"wp-block-paragraph\">For readers who are proficient in languages other than English, I am particularly interested in your feedback. Does Emojeez \ud83d\udc8e perform equally well in English and your native language? Did you notice any differences in quality or accuracy? 
Please give it a try and let me know what you think. Your insights are invaluable.<\/p>\n<p class=\"wp-block-paragraph\">Thank you for reading, and I hope you enjoy exploring Emojeez \ud83d\udc8e as much as I enjoyed building it.<\/p>\n<p class=\"wp-block-paragraph\">Happy Emoji search! \ud83d\udcc6\ud83d\ude0a\ud83c\udf0d\ud83d\ude80<\/p>\n<p class=\"wp-block-paragraph\"><em>Note: Unless otherwise noted, all images are created by the author.<\/em><\/p>","protected":false},"excerpt":{"rendered":"<p>If you are on social media like Twitter or LinkedIn, you have probably noticed that emojis are creatively used in both informal and professional text-based communication. For example, the Rocket emoji \ud83d\ude80 is often used on LinkedIn to symbolize high aspirations and ambitious goals, while the Bullseye \ud83c\udfaf emoji is used in the context of [&hellip;]<\/p>\n","protected":false},"author":18,"featured_media":10460,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"is_member_only":false,"sub_heading":"Develop an AI-powered semantic search for emojis using Python and open-source NLP libraries","footnotes":""},"categories":[17,21,22],"tags":[463,447,450,446,2342],"sponsor":[],"coauthors":[27119],"class_list":["post-10459","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence","category-large-language-models","category-machine-learning","tag-ai","tag-artificial-intelligence","tag-large-language-models","tag-machine-learning","tag-semantic-search"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80 | Towards Data Science<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80 | Towards Data Science\" \/>\n<meta property=\"og:description\" content=\"If you are on social media like Twitter or LinkedIn, you have probably noticed that emojis are creatively used in both informal and professional text-based communication. For example, the Rocket emoji \ud83d\ude80 is often used on LinkedIn to symbolize high aspirations and ambitious goals, while the Bullseye \ud83c\udfaf emoji is used in the context of [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/\" \/>\n<meta property=\"og:site_name\" content=\"Towards Data Science\" \/>\n<meta property=\"article:published_time\" content=\"2024-07-17T10:11:58+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-11-25T11:36:27+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png\" \/>\n\t<meta property=\"og:image:width\" content=\"877\" \/>\n\t<meta property=\"og:image:height\" content=\"630\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Badr Alabsi, PhD\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@TDataScience\" \/>\n<meta name=\"twitter:site\" content=\"@TDataScience\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Badr Alabsi, PhD\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"14 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/\"},\"author\":{\"name\":\"TDS Editors\",\"@id\":\"https:\/\/towardsdatascience.com\/#\/schema\/person\/f9925d336b6fe962b03ad8281d90b8ee\"},\"headline\":\"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80\",\"datePublished\":\"2024-07-17T10:11:58+00:00\",\"dateModified\":\"2024-11-25T11:36:27+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/\"},\"wordCount\":2244,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/towardsdatascience.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png\",\"keywords\":[\"AI\",\"Artificial Intelligence\",\"Large Language Models\",\"Machine Learning\",\"Semantic Search\"],\"articleSection\":[\"Artificial Intelligence\",\"Large Language Models\",\"Machine Learning\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/\",\"url\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/\",\"name\":\"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80 | Towards Data 
Science\",\"isPartOf\":{\"@id\":\"https:\/\/towardsdatascience.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png\",\"datePublished\":\"2024-07-17T10:11:58+00:00\",\"dateModified\":\"2024-11-25T11:36:27+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#primaryimage\",\"url\":\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png\",\"contentUrl\":\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png\",\"width\":877,\"height\":630},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/towardsdatascience.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/towardsdatascience.com\/#website\",\"url\":\"https:\/\/towardsdatascience.com\/\",\"name\":\"Towards Data Science\",\"description\":\"Publish AI, ML &amp; data-science insights to a 
global community of data professionals.\",\"publisher\":{\"@id\":\"https:\/\/towardsdatascience.com\/#organization\"},\"alternateName\":\"TDS\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/towardsdatascience.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/towardsdatascience.com\/#organization\",\"name\":\"Towards Data Science\",\"alternateName\":\"TDS\",\"url\":\"https:\/\/towardsdatascience.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/towardsdatascience.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/tds-logo.jpg\",\"contentUrl\":\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/tds-logo.jpg\",\"width\":696,\"height\":696,\"caption\":\"Towards Data Science\"},\"image\":{\"@id\":\"https:\/\/towardsdatascience.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/TDataScience\",\"https:\/\/www.youtube.com\/c\/TowardsDataScience\",\"https:\/\/www.linkedin.com\/company\/towards-data-science\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/towardsdatascience.com\/#\/schema\/person\/f9925d336b6fe962b03ad8281d90b8ee\",\"name\":\"TDS Editors\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/towardsdatascience.com\/#\/schema\/person\/image\/23494c9101089ad44ae88ce9d2f56aac\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g\",\"caption\":\"TDS Editors\"},\"description\":\"Building a vibrant data science and machine learning community. 
Share your insights and projects with our global audience: bit.ly\/write-for-tds\",\"url\":\"https:\/\/towardsdatascience.com\/author\/towardsdatascience\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80 | Towards Data Science","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/","og_locale":"en_US","og_type":"article","og_title":"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80 | Towards Data Science","og_description":"If you are on social media like Twitter or LinkedIn, you have probably noticed that emojis are creatively used in both informal and professional text-based communication. For example, the Rocket emoji \ud83d\ude80 is often used on LinkedIn to symbolize high aspirations and ambitious goals, while the Bullseye \ud83c\udfaf emoji is used in the context of [&hellip;]","og_url":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/","og_site_name":"Towards Data Science","article_published_time":"2024-07-17T10:11:58+00:00","article_modified_time":"2024-11-25T11:36:27+00:00","og_image":[{"width":877,"height":630,"url":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png","type":"image\/png"}],"author":"Badr Alabsi, PhD","twitter_card":"summary_large_image","twitter_creator":"@TDataScience","twitter_site":"@TDataScience","twitter_misc":{"Written by":"Badr Alabsi, PhD","Est. 
reading time":"14 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#article","isPartOf":{"@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/"},"author":{"name":"TDS Editors","@id":"https:\/\/towardsdatascience.com\/#\/schema\/person\/f9925d336b6fe962b03ad8281d90b8ee"},"headline":"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80","datePublished":"2024-07-17T10:11:58+00:00","dateModified":"2024-11-25T11:36:27+00:00","mainEntityOfPage":{"@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/"},"wordCount":2244,"commentCount":0,"publisher":{"@id":"https:\/\/towardsdatascience.com\/#organization"},"image":{"@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#primaryimage"},"thumbnailUrl":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png","keywords":["AI","Artificial Intelligence","Large Language Models","Machine Learning","Semantic Search"],"articleSection":["Artificial Intelligence","Large Language Models","Machine Learning"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/","url":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/","name":"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80 | Towards Data 
Science","isPartOf":{"@id":"https:\/\/towardsdatascience.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#primaryimage"},"image":{"@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#primaryimage"},"thumbnailUrl":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png","datePublished":"2024-07-17T10:11:58+00:00","dateModified":"2024-11-25T11:36:27+00:00","breadcrumb":{"@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#primaryimage","url":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png","contentUrl":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2024\/07\/1G7N_wy3HPArjDdP-tTno-Q.png","width":877,"height":630},{"@type":"BreadcrumbList","@id":"https:\/\/towardsdatascience.com\/semantic-search-for-emojis-in-50-languages-using-ai-f85a36a86f21\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/towardsdatascience.com\/"},{"@type":"ListItem","position":2,"name":"Semantic Search Engine for Emojis in 50+ Languages Using AI \ud83d\ude0a\ud83c\udf0d\ud83d\ude80"}]},{"@type":"WebSite","@id":"https:\/\/towardsdatascience.com\/#website","url":"https:\/\/towardsdatascience.com\/","name":"Towards Data Science","description":"Publish AI, ML &amp; data-science insights to a global community of data 
professionals.","publisher":{"@id":"https:\/\/towardsdatascience.com\/#organization"},"alternateName":"TDS","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/towardsdatascience.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/towardsdatascience.com\/#organization","name":"Towards Data Science","alternateName":"TDS","url":"https:\/\/towardsdatascience.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/towardsdatascience.com\/#\/schema\/logo\/image\/","url":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/tds-logo.jpg","contentUrl":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/tds-logo.jpg","width":696,"height":696,"caption":"Towards Data Science"},"image":{"@id":"https:\/\/towardsdatascience.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/TDataScience","https:\/\/www.youtube.com\/c\/TowardsDataScience","https:\/\/www.linkedin.com\/company\/towards-data-science\/"]},{"@type":"Person","@id":"https:\/\/towardsdatascience.com\/#\/schema\/person\/f9925d336b6fe962b03ad8281d90b8ee","name":"TDS Editors","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/towardsdatascience.com\/#\/schema\/person\/image\/23494c9101089ad44ae88ce9d2f56aac","url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","caption":"TDS Editors"},"description":"Building a vibrant data science and machine learning community. 
Share your insights and projects with our global audience: bit.ly\/write-for-tds","url":"https:\/\/towardsdatascience.com\/author\/towardsdatascience\/"}]}},"distributor_meta":false,"distributor_terms":false,"distributor_media":false,"distributor_original_site_name":"Towards Data Science","distributor_original_site_url":"https:\/\/towardsdatascience.com","push-errors":false,"_links":{"self":[{"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/posts\/10459","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/users\/18"}],"replies":[{"embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/comments?post=10459"}],"version-history":[{"count":0,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/posts\/10459\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/media\/10460"}],"wp:attachment":[{"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/media?parent=10459"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/categories?post=10459"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/tags?post=10459"},{"taxonomy":"sponsor","embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/sponsor?post=10459"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/coauthors?post=10459"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}