{"id":85642,"date":"2023-04-10T18:52:40","date_gmt":"2023-04-10T18:52:40","guid":{"rendered":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/"},"modified":"2025-01-27T00:32:57","modified_gmt":"2025-01-27T00:32:57","slug":"customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6","status":"publish","type":"post","link":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/","title":{"rendered":"Customer Satisfaction Measurement with N-gram and Sentiment Analysis"},"content":{"rendered":"<h3 class=\"wp-block-heading\">Product reviews are an excellent source of information for qualified management decisions. Learn more about the right text mining techniques.<\/h3>\n<figure class=\"wp-block-image size-large\"><img data-dominant-color=\"74706c\" data-has-transparency=\"false\" style=\"--dominant-color: #74706c;\" loading=\"lazy\" decoding=\"async\" width=\"2560\" height=\"1703\" src=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-scaled.jpeg\" alt=\"Photo by Freepik on Freepik\" class=\"wp-image-85643 not-transparent\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-scaled.jpeg 2560w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-300x200.jpeg 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-1024x681.jpeg 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-768x511.jpeg 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-1536x1022.jpeg 1536w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-2048x1363.jpeg 2048w\" sizes=\"auto, (max-width: 2560px) 100vw, 2560px\" \/><figcaption class=\"wp-element-caption\">Photo 
by <a href=\"https:\/\/www.freepik.com\/free-photo\/collage-customer-experience-concept_25053683.htm#query=customer%20satisfaction%20data&amp;position=5&amp;from_view=search&amp;track=ais\">Freepik<\/a> on Freepik<\/figcaption><\/figure>\n<h2 class=\"wp-block-heading\">Introduction<\/h2>\n<p class=\"wp-block-paragraph\">Happy customers drive company growth. This five-word sentence explains why we do our best to maximize customer satisfaction. Product reviews are one of the major data sources collected by large companies like <a href=\"https:\/\/www.amazon.com\/gp\/help\/customer\/display.html?nodeId=GL4WJF8BGV8VL6B8\">Amazon<\/a> and <a href=\"https:\/\/www.apple.com\/contact\/feedback\/#:~:text=To%20comment%20on%20a%20particular,and%20select%20the%20appropriate%20link.\">Apple<\/a>, mid-sized exporters including <a href=\"https:\/\/www.trustpilot.com\/review\/lentiamo.nl\">Lentiamo<\/a>, and local companies running their Facebook pages. Reviews are typically collected repeatedly over time, and factors like quality shifts, marketing communications, and customer care friendliness impact the sentiment expressed by customers.<\/p>\n<figure class=\"wp-block-image size-large\"><img data-dominant-color=\"eaeff4\" data-has-transparency=\"true\" style=\"--dominant-color: #eaeff4;\" loading=\"lazy\" decoding=\"async\" width=\"5088\" height=\"1300\" class=\"wp-image-547779 has-transparency\" src=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1V3CPZVr42GG0qnsWXwsOKw.png\" alt=\"Note: Image by author, based on the review of Karim (2011), Baker and Wurgler (2006), Merrin et al. (2013), and Eachempati et al. 
(2022)\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1V3CPZVr42GG0qnsWXwsOKw.png 5088w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1V3CPZVr42GG0qnsWXwsOKw-300x77.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1V3CPZVr42GG0qnsWXwsOKw-1024x262.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1V3CPZVr42GG0qnsWXwsOKw-768x196.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1V3CPZVr42GG0qnsWXwsOKw-1536x392.png 1536w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1V3CPZVr42GG0qnsWXwsOKw-2048x523.png 2048w\" sizes=\"auto, (max-width: 5088px) 100vw, 5088px\" \/><figcaption class=\"wp-element-caption\">Note: Image by author, based on the review of Karim (2011), Baker and Wurgler (2006), Merrin et al. (2013), and Eachempati et al. (2022)<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">The Business Intelligence (BI) role should be to analyze product reviews, identify potential problems, and develop hypotheses to solve them. In the next stage, these recommendations are scrutinized by other departments, depending on the company structure. <em><strong>This article takes a closer look at the analytics of customer satisfaction measurement using product review data.<\/strong><\/em><\/p>\n<p class=\"wp-block-paragraph\">The end-to-end process includes:<\/p>\n<ul class=\"wp-block-list\">\n<li>How to perform an exploratory analysis of time-series product reviews<\/li>\n<li>How to quickly evaluate sentiment in product reviews over time<\/li>\n<li>How to display the most frequent satisfaction factors over time<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">We&#8217;ll work in Python, which most BI and data analysts routinely use.<\/p>\n<h2 class=\"wp-block-heading\">2. Data<\/h2>\n<p class=\"wp-block-paragraph\">Getting unscraped product review data with a flexible data license is generally difficult. 
Synthetic data from the <strong><a href=\"https:\/\/osf.io\/tyue9\/\">Fake Reviews Dataset<\/a><\/strong>, distributed under the <a href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">Attribution 4.0 International license<\/a>, is an excellent option in this case.<\/p>\n<p class=\"wp-block-paragraph\">The data looks like this:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-dominant-color=\"eeeeee\" data-has-transparency=\"false\" style=\"--dominant-color: #eeeeee;\" loading=\"lazy\" decoding=\"async\" width=\"897\" height=\"269\" class=\"wp-image-547782 not-transparent\" src=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1NILSfG1GiPoxGJ03p5NvSQ.png\" alt=\"Image 1. Fake Reviews Dataset\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1NILSfG1GiPoxGJ03p5NvSQ.png 897w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1NILSfG1GiPoxGJ03p5NvSQ-300x90.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1NILSfG1GiPoxGJ03p5NvSQ-768x230.png 768w\" sizes=\"auto, (max-width: 897px) 100vw, 897px\" \/><figcaption class=\"wp-element-caption\">Image 1. Fake Reviews Dataset<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">The <em>text<\/em> column contains product reviews, and <em>period<\/em> marks the review date. The subset we&#8217;ll work with contains 3848 reviews of clothing, shoes, and jewelry products.<\/p>\n<h2 class=\"wp-block-heading\">3. Exploratory data analysis (EDA)<\/h2>\n<p class=\"wp-block-paragraph\">EDA on this type of dataset should assess the <strong>completeness<\/strong> of the dataset for each period of interest and the <strong>presence of noise<\/strong> that does not bring valuable information \u2013 digits and special characters. We want to avoid highly imbalanced datasets with very few reviews in some periods, and noisy datasets with too many digits and special characters (\/, @, &amp;, ;, ?, 
etc.).<\/p>\n<p class=\"wp-block-paragraph\"><strong>3.1. Checking for data completeness<\/strong><\/p>\n<p class=\"wp-block-paragraph\">First, let&#8217;s check the data for completeness in each period. We make annual comparisons and therefore count the reviews for each year:<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import pandas as pd\nimport matplotlib.pyplot as plt\n\n# Count reviews per year (data is the reviews DataFrame loaded beforehand)\ndata[&#039;year&#039;] = pd.DatetimeIndex(data[&#039;period&#039;]).year\nrows_count = data.groupby(&#039;year&#039;, as_index=False).size()\nrows_count.columns = [&#039;year&#039;, &#039;reviews&#039;]\n\n# Generate a line plot\nrows_count.plot.line(x=&#039;year&#039;, y=&#039;reviews&#039;)\nplt.show()<\/code><\/pre>\n<p class=\"wp-block-paragraph\">A simple line plot shows the yearly review frequencies. No graph formatting is needed here:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-dominant-color=\"f8f9fa\" data-has-transparency=\"true\" style=\"--dominant-color: #f8f9fa;\" loading=\"lazy\" decoding=\"async\" width=\"552\" height=\"435\" class=\"wp-image-547783 has-transparency\" src=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1pT8b812kyEQhhF0-YkvAbg.png\" alt=\"Image 2. Yearly product review frequencies. Image by author.\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1pT8b812kyEQhhF0-YkvAbg.png 552w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1pT8b812kyEQhhF0-YkvAbg-300x236.png 300w\" sizes=\"auto, (max-width: 552px) 100vw, 552px\" \/><figcaption class=\"wp-element-caption\">Image 2. Yearly product review frequencies. Image by author.<\/figcaption><\/figure>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p>The dataset is not heavily imbalanced. 
We have roughly 200 or more reviews per year, which makes a solid dataset for sentiment and n-gram analyses.<\/p><\/blockquote>\n<p class=\"wp-block-paragraph\"><strong>3.2. Calculating letter-to-other characters ratio<\/strong><\/p>\n<p class=\"wp-block-paragraph\">Next, let&#8217;s check that the data does not consist mainly of numbers and special characters, which could bias later text mining procedures.<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import re\n\n# Convert product reviews to a single string\ntext = data[&#039;text&#039;].to_string(index=False)\n\n# Remove newline characters\ntext = re.sub(&#039;\\n&#039;, &#039;&#039;, text)\n\n# Count digits, letters, spaces, and all other characters\nnumbers = sum(c.isdigit() for c in text)\nletters = sum(c.isalpha() for c in text)\nspaces  = sum(c.isspace() for c in text)\nothers  = len(text) - numbers - letters - spaces<\/code><\/pre>\n<p class=\"wp-block-paragraph\">We calculate two complementary metrics as percentages of all characters: <em>cleanness<\/em> (the share of letters and spaces) and <em>dirtiness<\/em> (the share of digits and special characters):<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\"># Calculate metrics\ndirtiness = ((numbers + others) \/ len(text)) * 100\ncleanness = ((letters + spaces) \/ len(text)) * 100\n\nprint(dirtiness)\nprint(cleanness)<\/code><\/pre>\n<p class=\"wp-block-paragraph\">The print output is:<\/p>\n<ul class=\"wp-block-list\">\n<li>dirtiness is 8.77 %<\/li>\n<li>cleanness is 91.22 %<\/li>\n<\/ul>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p>Around 9 % of the characters in the data are digits and special characters. This volume of noise should not bias sentiment analysis in the later stage.<\/p><\/blockquote>\n<h2 class=\"wp-block-heading\">4. 
Customer satisfaction measurement<\/h2>\n<p class=\"wp-block-paragraph\">Text mining here aims to find out (1) how customers&#8217; sentiment developed over time and (2) which factors contributed to changes in customer satisfaction. We&#8217;ll use <a href=\"https:\/\/pypi.org\/project\/arabica\/\">Arabica<\/a>, a Python library for time-series text mining, to explore both.<\/p>\n<p class=\"wp-block-paragraph\"><em><strong>EDIT Jul 2023<\/strong>: Arabica has been updated. Check the <strong>documentation<\/strong> for the full list of parameters.<\/em><\/p>\n<h3 class=\"wp-block-heading\">4.1. Sentiment analysis<\/h3>\n<p class=\"wp-block-paragraph\">Arabica offers the <code>coffee_break<\/code> module for sentiment and breakpoint analysis. Its <a href=\"https:\/\/arabica.readthedocs.io\/en\/latest\/Breakpoint%20identification.html\">documentation<\/a> provides more details about the models and methodology of sentiment evaluation.<\/p>\n<p class=\"wp-block-paragraph\">This code strips punctuation and numbers from the data, removes other redundant strings (<em>&quot;&lt;br \/&gt;&quot;, &quot;\/n&quot;,<\/em> and <em>&quot;Another Long String&quot;<\/em> are used as examples), calculates sentiment for each review, aggregates the sentiment by year, and identifies two major breakpoints in the sentiment time series:<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">from arabica import coffee_break\n\ncoffee_break(text = data[&#039;text&#039;],\n             time = data[&#039;period&#039;],\n             date_format = &#039;us&#039;,     # Use US-style date format to read dates\n             preprocess = True,      # Remove digits and punctuation\n             skip = [&#039;&lt;br \/&gt;&#039;,       # Remove additional unwanted strings\n                     &#039;\/n&#039;,\n                     &#039;Another Long String&#039;],\n             n_breaks = 2,           # Identify 2 breakpoints\n             time_freq = &#039;Y&#039;)        # Yearly 
aggregation<\/code><\/pre>\n<p class=\"wp-block-paragraph\">The line plot with sentiment and breakpoints:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-dominant-color=\"f9f9fa\" data-has-transparency=\"true\" style=\"--dominant-color: #f9f9fa;\" loading=\"lazy\" decoding=\"async\" width=\"8131\" height=\"5648\" class=\"wp-image-547784 has-transparency\" src=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1m_FbgkPqBbnaQjMi6-ws9w.png\" alt=\"Image 3. Sentiment analysis with breakpoints. Image by author.\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1m_FbgkPqBbnaQjMi6-ws9w.png 8131w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1m_FbgkPqBbnaQjMi6-ws9w-300x208.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1m_FbgkPqBbnaQjMi6-ws9w-1024x711.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1m_FbgkPqBbnaQjMi6-ws9w-768x533.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1m_FbgkPqBbnaQjMi6-ws9w-1536x1067.png 1536w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1m_FbgkPqBbnaQjMi6-ws9w-2048x1423.png 2048w\" sizes=\"auto, (max-width: 8131px) 100vw, 8131px\" \/><figcaption class=\"wp-element-caption\">Image 3. Sentiment analysis with breakpoints. Image by author.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">We can see fluctuations in sentiment over time. These changes are modest, as sentiment stays within the range of 0.6 to 0.7. The higher the sentiment value, the better customers perceive the products, and vice versa.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p>In 1991 and 2000, there were positive and negative breakpoints in sentiment, respectively. In the next stage, let&#8217;s check what caused these shifts.<\/p><\/blockquote>\n<h3 class=\"wp-block-heading\">4.2. 
Factors driving customer satisfaction<\/h3>\n<p class=\"wp-block-paragraph\">We&#8217;ll use a heatmap of n-gram frequencies from the <code>cappuccino<\/code> module to visually derive the factors that influenced customers&#8217; sentiment. Read the <a href=\"https:\/\/arabica.readthedocs.io\/en\/latest\/Heatmap.html\">documentation<\/a> for more technical details.<\/p>\n<p class=\"wp-block-paragraph\">This code plots a heatmap of the eight most frequent bigrams (i.e., two consecutive words) in each year. Pre-processing removes English stopwords, numbers, and three unwanted strings (<em>&quot;&lt;br \/&gt;&quot;, &quot;\/n&quot;<\/em>, and <em>&quot;Another Long String&quot;<\/em>). Punctuation and blank rows are removed automatically:<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">from arabica import cappuccino\n\ncappuccino(text = data[&#039;text&#039;],\n           time = data[&#039;period&#039;],\n           date_format = &#039;us&#039;,       # Use US-style date format to parse dates\n           plot = &#039;heatmap&#039;,\n           ngram = 2,                # N-gram size, 1 = unigram, 2 = bigram\n           time_freq = &#039;Y&#039;,          # Aggregation period, &#039;M&#039; = monthly, &#039;Y&#039; = yearly\n           max_words = 8,            # Display 8 most frequent bigrams for each period\n           stopwords = [&#039;english&#039;],  # Remove English stopwords\n           skip = [&#039;&lt;br \/&gt;&#039;,         # Remove additional unwanted strings\n                   &#039;\/n&#039;,\n                   &#039;Another Long String&#039;],\n           numbers = True,           # Remove numbers\n           lower_case = True)        # Lowercase text<\/code><\/pre>\n<p class=\"wp-block-paragraph\">The bigram heatmap:<\/p>\n<figure class=\"wp-block-image size-large\"><img data-dominant-color=\"eff0f0\" data-has-transparency=\"true\" style=\"--dominant-color: #eff0f0;\" loading=\"lazy\" decoding=\"async\" 
width=\"8722\" height=\"5654\" class=\"wp-image-547789 has-transparency\" src=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1p-o36MDGHvtUVBXhCdSlTA.png\" alt=\"Image 4. Bigram heatmap. Image by author.\" srcset=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1p-o36MDGHvtUVBXhCdSlTA.png 8722w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1p-o36MDGHvtUVBXhCdSlTA-300x194.png 300w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1p-o36MDGHvtUVBXhCdSlTA-1024x664.png 1024w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1p-o36MDGHvtUVBXhCdSlTA-768x498.png 768w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1p-o36MDGHvtUVBXhCdSlTA-1536x996.png 1536w, https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1p-o36MDGHvtUVBXhCdSlTA-2048x1328.png 2048w\" sizes=\"auto, (max-width: 8722px) 100vw, 8722px\" \/><figcaption class=\"wp-element-caption\">Image 4. Bigram heatmap. Image by author.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">Bigram frequencies indicate that:<\/p>\n<ul class=\"wp-block-list\">\n<li><em><strong>2000s drop<\/strong><\/em>: the absence of <em>&quot;would recommend&quot;, &quot;fits perfectly&quot;, and &quot;fit perfect&quot;<\/em> from the top 8, together with high frequencies of <em>&quot;wide foot&quot;<\/em> and <em>&quot;size fit&quot;<\/em>, suggests that we might be selling shoes that do not fit our customers&#8217; feet. Note that stop words such as <em>&quot;don&#8217;t&quot;<\/em> were removed, so <em>&quot;size fit&quot;<\/em> may stem from negated phrases.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\">Conclusions<\/h2>\n<p class=\"wp-block-paragraph\">This type of text-mining analytics should help make qualified <strong>data-informed decisions<\/strong> and improve the quality of our products and services. 
Based on the analytical results, we formulate a set of hypotheses about what might be wrong and where there is room for improvement.<\/p>\n<p class=\"wp-block-paragraph\">From the text mining analysis in this article, we can see potential problems with product quality. The products no longer fit our customers&#8217; needs as well as before. Having received these reviews by 2001, analysts should formulate a set of problem hypotheses such as:<\/p>\n<ul class=\"wp-block-list\">\n<li><em>The new shoes don&#8217;t fit well.<\/em><\/li>\n<li><em>We don&#8217;t sell wide-foot shoes, but customers demand them.<\/em><\/li>\n<li><em>There&#8217;s a quality problem in the production of new products.<\/em><\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">You will likely come up with many other hypotheses, since you know your products better than anyone else. Other departments (Marketing, Customer Care, Logistics, etc.) should then find out where the problem is and fix it with specific improvements. The response, of course, depends on the company&#8217;s size and structure and other firm-specific factors.<\/p>\n<p class=\"wp-block-paragraph\">The Jupyter notebook with the code and data is on my <a href=\"https:\/\/github.com\/PetrKorab\/Customer-Satisfaction-Measurement-with-N-gram-and-Sentiment-Analysis\">GitHub<\/a>.<\/p>\n<h2 class=\"wp-block-heading\">References<\/h2>\n<p class=\"wp-block-paragraph\">Baker, M., Wurgler, J., 2006. Investor sentiment and the cross-section of stock returns. <em>Journal of Finance<\/em> 61 (4).<\/p>\n<p class=\"wp-block-paragraph\">Eachempati, P., Srivastava, P. R., Kumar, A., Munoz, J., Delen, D., 2022. 
<a href=\"https:\/\/www.sciencedirect.com\/science\/article\/abs\/pii\/S0040162521006995?via%3Dihub\">Can customer sentiment impact firm value?<\/a> An integrated text mining approach. <em>Technological Forecasting &amp; Social Change<\/em> 174 (1).<\/p>\n<p class=\"wp-block-paragraph\">Karim, B., 2011. Corporate name change and shareholder wealth effect: empirical evidence in the French stock market. <em>Journal of Asset Management<\/em> 12 (3).<\/p>\n<p class=\"wp-block-paragraph\">Merrin, R. P., Hoffmann, A. O., Pennings, J. M., 2013. Customer satisfaction as a buffer against sentimental stock-price corrections. <em>Marketing Letters<\/em> 24 (1).<\/p>","protected":false},"excerpt":{"rendered":"<p>Product reviews are an excellent source of information for qualified management decisions. Learn more about the right text mining&#8230;<\/p>\n","protected":false},"author":18,"featured_media":85643,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"is_member_only":true,"sub_heading":"Product reviews are an excellent source of information for qualified management decisions. 
Learn more about the right text mining...","footnotes":""},"categories":[44],"tags":[550,448,467,694,1604],"sponsor":[],"coauthors":[30697],"class_list":["post-85642","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science","tag-business-intelligence","tag-data-science","tag-python","tag-sentiment-analysis","tag-text-mining"],"yoast_head_json":{"title":"Customer Satisfaction Measurement with N-gram and Sentiment Analysis | Towards Data Science","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/","og_locale":"en_US","og_type":"article","og_title":"Customer Satisfaction Measurement with N-gram and Sentiment Analysis | Towards Data Science","og_description":"Product reviews are an excellent source of information for qualified management decisions. Learn more about the right text mining...","og_url":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/","og_site_name":"Towards Data Science","article_published_time":"2023-04-10T18:52:40+00:00","article_modified_time":"2025-01-27T00:32:57+00:00","og_image":[{"width":2560,"height":1703,"url":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-scaled.jpeg","type":"image\/jpeg"}],"author":"Petr Kor\u00e1b","twitter_card":"summary_large_image","twitter_creator":"@TDataScience","twitter_site":"@TDataScience","twitter_misc":{"Written by":"Petr Kor\u00e1b","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/#article","isPartOf":{"@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/"},"author":{"name":"TDS Editors","@id":"https:\/\/towardsdatascience.com\/#\/schema\/person\/f9925d336b6fe962b03ad8281d90b8ee"},"headline":"Customer Satisfaction Measurement with N-gram and Sentiment Analysis","datePublished":"2023-04-10T18:52:40+00:00","dateModified":"2025-01-27T00:32:57+00:00","mainEntityOfPage":{"@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/"},"wordCount":1151,"commentCount":0,"publisher":{"@id":"https:\/\/towardsdatascience.com\/#organization"},"image":{"@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/#primaryimage"},"thumbnailUrl":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-scaled.jpeg","keywords":["Business Intelligence","Data Science","Python","Sentiment Analysis","Text Mining"],"articleSection":["Data Science"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/","url":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/","name":"Customer Satisfaction Measurement with N-gram and Sentiment Analysis | Towards Data 
Science","isPartOf":{"@id":"https:\/\/towardsdatascience.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/#primaryimage"},"image":{"@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/#primaryimage"},"thumbnailUrl":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-scaled.jpeg","datePublished":"2023-04-10T18:52:40+00:00","dateModified":"2025-01-27T00:32:57+00:00","breadcrumb":{"@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/#primaryimage","url":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-scaled.jpeg","contentUrl":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/04\/1x7PfPrnhhCZ_f9nwJ0FePw-scaled.jpeg","width":2560,"height":1703,"caption":"Photo by Freepik on Freepik"},{"@type":"BreadcrumbList","@id":"https:\/\/towardsdatascience.com\/customer-satisfaction-measurement-with-n-gram-and-sentiment-analysis-547e291c13a6\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/towardsdatascience.com\/"},{"@type":"ListItem","position":2,"name":"Customer Satisfaction Measurement with N-gram and Sentiment Analysis"}]},{"@type":"WebSite","@id":"https:\/\/towardsdatascience.com\/#website","url":"https:\/\/towardsdatascience.com\/","name":"Towards Data Science","description":"Publish AI, ML &amp; data-science 
insights to a global community of data professionals.","publisher":{"@id":"https:\/\/towardsdatascience.com\/#organization"},"alternateName":"TDS","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/towardsdatascience.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/towardsdatascience.com\/#organization","name":"Towards Data Science","alternateName":"TDS","url":"https:\/\/towardsdatascience.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/towardsdatascience.com\/#\/schema\/logo\/image\/","url":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/tds-logo.jpg","contentUrl":"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2025\/02\/tds-logo.jpg","width":696,"height":696,"caption":"Towards Data Science"},"image":{"@id":"https:\/\/towardsdatascience.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/TDataScience","https:\/\/www.youtube.com\/c\/TowardsDataScience","https:\/\/www.linkedin.com\/company\/towards-data-science\/"]},{"@type":"Person","@id":"https:\/\/towardsdatascience.com\/#\/schema\/person\/f9925d336b6fe962b03ad8281d90b8ee","name":"TDS Editors","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/towardsdatascience.com\/#\/schema\/person\/image\/23494c9101089ad44ae88ce9d2f56aac","url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","caption":"TDS Editors"},"description":"Building a vibrant data science and machine learning community. 
Share your insights and projects with our global audience: bit.ly\/write-for-tds","url":"https:\/\/towardsdatascience.com\/author\/towardsdatascience\/"}]}},"distributor_meta":false,"distributor_terms":false,"distributor_media":false,"distributor_original_site_name":"Towards Data Science","distributor_original_site_url":"https:\/\/towardsdatascience.com","push-errors":false,"_links":{"self":[{"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/posts\/85642","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/users\/18"}],"replies":[{"embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/comments?post=85642"}],"version-history":[{"count":0,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/posts\/85642\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/media\/85643"}],"wp:attachment":[{"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/media?parent=85642"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/categories?post=85642"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/tags?post=85642"},{"taxonomy":"sponsor","embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/sponsor?post=85642"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/towardsdatascience.com\/wp-json\/wp\/v2\/coauthors?post=85642"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}