
Predicting Movie Ratings: NLP Tools is What Film Studios Need

Key Takeaways

  • Natural Language Processing (NLP) techniques can help predict the success of a new film, including its box office results
  • Movie viewers' comments can be used to predict movie ratings
  • Sources of movie reviews include dedicated review websites and social media
  • Sentiment analysis of movie reviews and opinions shared on social media platforms can help marketers predict ratings
  • Analysis of movie reviews can also be used to classify movies into genres and to improve movie recommendation systems


Movie Reviews and Ratings

Everyone knows how profitable the film industry is. According to PwC, global box office revenue amounted to about 38 billion U.S. dollars in 2015. People are now spoiled for choice: more than 690 movies were released in North America alone in 2015. Yet only a small number of movies have a long lifetime. Most of them get to the top of the list quickly, but are soon dethroned by newer releases.

Studios understand that the competition is high and that their movies may fall short of box office expectations (e.g. "Superman Returns", released in 2006). They work hard to improve the likelihood of success, and movie industry players are increasingly interested in gaining access to success and failure predictions. Some studies show that there is a relation between a movie's ratings and its subsequent sales. For instance, Gilad Mishne and Natalie Glance found a good correlation between references to movies in blog posts and the financial success of the movies.

People tend to rely on the opinions of others when they look for a movie to watch. Since not only movie critics but also ordinary people share their reviews on the Internet, such reviews can become a fertile source for predicting movie ratings and box office results. Natural language processing (NLP) tools are a good choice for analyzing movie reviews. This article focuses on how these tools can be used for the analysis and on the challenges that their developers face.

Movie Review Data

There are many websites that specialize in reviews of movies and TV shows. Rotten Tomatoes and IMDb are among the most popular review hubs. Movie reviews are not limited to these websites: people post their opinions on movie forums and publish their reviews in online magazines and journals. So, researchers get an ocean of extractable data for free.

Posts from social media (e.g. Twitter) should also be considered, since around 6,000 tweets are sent every second and many of them are movie-related. The role of Twitter as a source of data was demonstrated by Bernard J. Jansen et al. in a study that investigated the power of tweets as electronic word of mouth.

It is not hard to find tweets that mention movies, as people use hashtags to make their posts searchable. But researchers don’t need to search for tweets manually. They can take full advantage of automated search thanks to Twitter's Search API and Streaming APIs. Another way to get the desired data is to purchase it from a reseller.

YouTube has the potential to become a rich bank of data for researchers, as well. Users actively express their opinions about movies in comments under movie trailers (official or not). See below for some of the comments posted under the official La La Land trailer.

[Image: comments posted under the official La La Land trailer on YouTube]

Once a movie is available in theaters, YouTube vloggers and other YouTubers upload their reviews to their channels. These reviews can also be used by researchers: speech recognition software can transform the speech into text, which can then be analyzed with linguistic tools. And it goes without saying that comments posted under such reviews should be used, too.

Why Natural Language Processing

Clearly, the full volume of movie reviews cannot be analyzed without computers. But machines are built to deal with highly structured languages. That is why they cannot understand the context of natural language (the language spoken by humans) by themselves.

Technological advances changed the situation: new approaches and algorithms gave computers a chance to understand natural speech. For instance, machine learning and natural language processing (NLP) make use of different techniques (e.g. Bayesian and hidden Markov model-based ones) to recognize speech and "understand" natural language.

What is NLP used for today? For instance, it is employed in question answering systems like Cortana and Siri. Summarizers based on NLP can process texts to create short summaries. Text Summarizer is one such solution: users can enter the URL of the page they want to summarize or paste text directly into the text box. NLP tools are also used to identify languages, recognize named entities, and search for related facts.

Sentiment analysis is one of the major areas of NLP. It helps machines detect the general sentiment of a text message. Tools can easily detect emotions when a video or an audio recording is analyzed. The task is a little more difficult when it comes to the analysis of text. Marketers often use NLP tools in opinion mining to learn what people think about a product or service. Film studios can use sentiment analysis in the same way to find out people's views about a film.

Sentiment Analysis Accuracy

When it comes to the automatic classification of movie reviews, researchers may choose one of the existing approaches or combine two or more of them. Each approach is reasonably accurate on its own: some researchers reported approximately 65% sentiment classification accuracy. They also showed that higher accuracy (67.931%) could be reached by combining statistical-based, bag-of-words-based, content-based, and lexicon-based approaches.

Higher accuracy (75-83%) was achieved by combining three components (namely, Categorizer, Comparator and Sentiment Analyzer) of Intellexer SDK to analyze hotel and restaurant reviews. You can see how it works here.
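To make the idea of a lexicon-based approach and of classification accuracy more concrete, here is a minimal Python sketch. It is not the method used in the studies mentioned above: the word lists, the sample reviews and the helper names (lexicon_label, accuracy) are all assumptions made up for illustration.

```python
import re

# Toy sentiment lexicon (an assumption); real lexicons are far larger and curated.
POSITIVE = {"good", "great", "brilliant", "enjoyable", "love"}
NEGATIVE = {"bad", "boring", "dull", "awful", "hate"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_label(text):
    """Label a review by counting positive and negative lexicon hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def accuracy(predicted, expected):
    """Share of reviews whose predicted label matches the human label."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hand-labeled toy reviews (hypothetical data).
labeled_reviews = [
    ("A great cast and a brilliant script", "positive"),
    ("Boring plot and awful dialogue", "negative"),
    ("So bad that it is good", "positive"),  # mixed wording defeats simple counting
    ("I love the soundtrack", "positive"),
]

predictions = [lexicon_label(text) for text, _ in labeled_reviews]
expected = [label for _, label in labeled_reviews]
print("Accuracy:", accuracy(predictions, expected))
```

The third review shows why a single simple approach leaves room for improvement: the positive and negative words cancel out, so the review is labeled neutral instead of positive.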

Intellexer Sentiment Analyzer is a linguistic tool that utilizes linguistic and statistical information along with a set of semantic rules.

How Sentiment Analyzer Works

Let’s see how Intellexer Sentiment Analyzer can extract sentiments from Rotten Tomatoes' reviews of "Fifty Shades Darker". The sample code is available here. To run it yourself, you need access to the Intellexer cloud API and an installed Python interpreter.

You should take the following steps to start using the API:

  1. Create an account.
  2. Read the documentation to choose the method appropriate for your task (the analyzeSentiments method is suitable for the analysis of film reviews).
  3. Execute a GET/POST HTTP request and parse response results.

Movie reviews are passed in the POST body as a JSON array, where each array item contains ‘id’ (the review ID) and ‘text’ (the review text).

There are two types of weight (‘w’):

  • the sentiment weight of the opinion (negative or positive values are used for opinion phrases, zero values – for objects or ontology categories);
  • the sentiment weight of the review. This parameter is used to classify the whole text of a review as expressing a positive, neutral or negative opinion.
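As a rough sketch of how the two kinds of weight might be read once a response has been parsed, the Python snippet below labels a review by the sign of its weight and collects the opinion phrases from an opinion tree. The zero threshold, the helper names and the sample tree are assumptions for illustration; only the key names ('t', 'w', 'children') are taken from the sample code in this article.

```python
def review_label(weight):
    """Map a review-level weight to a label (the zero threshold is an assumption)."""
    if weight > 0:
        return "positive"
    if weight < 0:
        return "negative"
    return "neutral"


def opinion_phrases(node):
    """Collect (text, weight) pairs for every tree node with a non-zero weight."""
    found = []
    if node.get("w", 0) != 0:
        found.append((node.get("t"), node.get("w")))
    for child in node.get("children") or []:
        found.extend(opinion_phrases(child))
    return found


# Hypothetical opinion tree, hand-written for illustration (not real API output).
sample_opinions = {
    "t": "movie", "w": 0,
    "children": [
        {"t": "plot", "w": 0,
         "children": [{"t": "predictable", "w": -0.5, "children": []}]},
        {"t": "cast", "w": 0,
         "children": [{"t": "charming", "w": 0.8, "children": []}]},
    ],
}

print(opinion_phrases(sample_opinions))
print(review_label(-1.2))
```

Nodes with a zero weight (objects and ontology categories) are skipped, so only the opinion phrases themselves are returned.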

The code below illustrates how Sentiment Analyzer works:

# Python 3
import json
import urllib.error
import urllib.request

# list of reviews; each item carries the review ID and the review text
reviews = [
    {
        "id": "snt1",
        "text": "I know that “Fifty Shades Darker” isn’t supposed to be good — it’s supposed to be bad, in need of a spanking. This sequel is almost so bad that it’s good, and if only the filmmakers would submit to making campy comedy of E.L. James’ naughty novels, this just might be quality trash cinema."
    },
    {
        "id": "snt2",
        "text": "Fifty Shades Darker opens with a smack. Not the erotic sound of palm hitting rump, but of junkies brawling as their 4-year-old son, BDSM-billionaire-to-be Christian Grey, cowers under a table. Months later, his birth mother dies of a heroin overdose. Doing the math, she could have been shooting up with fellow Seattle addict Kurt Cobain. The orphaned boy will be adopted by tycoons and upgrade from grunge to glam. His childhood pain will mutate into a fetish for whips, slaps, and sad-eyed brunettes who look like his mommy — a pathology diagnosed by a college kid who skipped most of Psychology 101. And so, in the film's first five minutes, Fifty Shades author E.L. James sets up the series's strange sanctimony: You're screwed up if you think this sex-torture stuff is hot. But hey, isn't it kinda hot?"
    }
]

# set the URL for the POST request: specify the URL, the parameters for information processing and the API key for authorization purposes (change YourAPIKey to the Intellexer API key)
api_url = ""

# print categorized opinions as an indented tree
def print_tree(node, height):
    line = "\t" * height + str(node.get("t"))
    # zero weights mark objects or ontology categories, so only non-zero weights are printed
    if node.get("w") != 0:
        line += "\t" + str(node.get("w"))
    print(line)
    for child in node.get("children") or []:
        print_tree(child, height + 1)

# print response results
def print_response(response):
    print("Sentences with sentiment objects and phrases:")
    for sent in response.get("sentences"):
        print("Sentence Weight = ", sent.get("w"), "\t", sent.get("text"))
    # print categorized opinions
    print("\nCategorized Opinions with sentiment polarity (positive/negative)")
    print_tree(response.get("opinions"), 0)

# create a request to the Sentiment Analyzer API service and print the parsed response
def request_api(url, data):
    header = {"Content-Type": "application/json"}
    body = json.dumps(data).encode("utf-8")
    req = urllib.request.Request(url, body, header)
    with urllib.request.urlopen(req) as conn:
        json_response = json.loads(conn.read().decode("utf-8"))
    print_response(json_response)

# perform the request
try:
    request_api(api_url, reviews)
except urllib.error.HTTPError as error:
    print("HTTP error - %s" % error)

Here is the output:

[Images: output of the Sentiment Analyzer sample code]

Film Review Analysis: Challenges

There are some challenges that custom software development companies have to address. I have listed some of the most common ones below:

  1. One review may contain multiple opinions (even about the same entities). Sentence-level approaches, as a rule, are not able to discover opinions about each entity and/or its aspects. The aspect-based approach is more suitable in such a case since it can evaluate two opinion targets of the same entity.
  2. Neutral or objective tweets may change the overall rating. Such tweets are believed to be "just a fact, without any sentiment or opinions associated with them".
  3. Polysemy and homographs. For example, the word "firm" can mean something secure/solid or a business organization/company depending on the context.
  4. Distinguishing the name from the description. A movie title may include words such as "war" or "monster", which an NLP solution may treat as negative, skewing the total rating.
  5. The use of anaphora. NLP solutions may have difficulty determining what a pronoun, a noun or a phrase refers to. E.g. in "I ate my lunch and watched the movie. It was great", "it" may refer to the lunch or to the movie.
  6. Slang is another challenge. People do use slang in their reviews and tweets. For instance, they may say "That's a bad shirt, man" when they mean it as a compliment to a friend.
  7. Sarcasm and subtlety: People like playing with words, and sarcasm and irony are part of this game. Big data solutions are not always able to recognize a deeply buried meaning. What is more, sarcasm differs across cultures.
  8. Special characters: Some movie titles contain accented characters (foreign movies, in particular) or apostrophes. Such titles may cause encoding problems.
  9. Misspelling: People make mistakes in their reviews and social media posts, and NLP tools may not classify such words correctly. E.g., Google found out that people living in California often confuse "dessert" and "desert", while people from Alaska often misspell "Hawaii".
  10. Geographic restrictions: A movie may be very popular in one region and panned in others. Because only a small number of tweets are geotagged, opinions from different regions are hard to separate, and the resulting ratings may be mixed.
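Challenge 4 from the list above can be illustrated with a short Python sketch: known movie titles are replaced with a neutral placeholder before the text is scored, so that the words in the title do not count as opinions. The tiny lexicon, the placeholder and the helper names are assumptions for illustration, and the scoring is a toy word count rather than a real sentiment analyzer.

```python
import re

# Toy lexicons (assumptions); note that "war" is treated as a negative word.
POSITIVE = {"great", "fun"}
NEGATIVE = {"war", "monster", "bad"}


def mask_titles(text, titles, placeholder="TITLE"):
    """Replace every known movie title with a neutral placeholder."""
    # Longer titles first, so a short title never clips a longer one.
    for title in sorted(titles, key=len, reverse=True):
        text = re.sub(re.escape(title), placeholder, text, flags=re.IGNORECASE)
    return text


def score(text):
    """Count positive hits minus negative hits."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


review = "War of the Worlds was great"
masked = mask_titles(review, ["War of the Worlds"])

print(score(review))   # the title word "war" cancels out "great"
print(masked, score(masked))
```

Without masking, the positive review is scored as neutral because the word "war" in the title offsets the word "great".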

How Else Can Sentiment Analysis be Applied?

The role of NLP tools is not limited to the classification of reviews into negative and positive ones. Negative and positive reviews can be grouped by the subject under discussion: script, actors, atmosphere (the special mood or feeling a film creates among viewers, e.g. a mysterious atmosphere), etc. The reviews can be further analyzed to extract information on what exactly viewers liked and disliked in a movie.
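A minimal Python sketch of such grouping is shown below. It assigns each sentence of a review to the aspects whose keywords it mentions. The keyword lists, the naive sentence splitter and the sample review are assumptions for illustration; a production system would rely on proper linguistic analysis rather than keyword matching.

```python
import re

# Hypothetical aspect keywords; a real system would learn or curate these.
ASPECT_KEYWORDS = {
    "script": {"script", "plot", "dialogue", "story"},
    "actors": {"actor", "actress", "cast", "performance"},
    "atmosphere": {"atmosphere", "mood", "mysterious"},
}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def group_by_aspect(text):
    """Map each aspect to the sentences that mention one of its keywords."""
    groups = {aspect: [] for aspect in ASPECT_KEYWORDS}
    for sentence in split_sentences(text):
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if tokens & keywords:
                groups[aspect].append(sentence)
    return groups


review = ("The plot drags in the second half. The cast is wonderful! "
          "A mysterious mood runs through every scene.")
groups = group_by_aspect(review)

for aspect, sentences in groups.items():
    print(aspect, "->", sentences)
```

Once sentences are grouped this way, each group can be passed to a sentiment analyzer separately to get a per-aspect opinion.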

Owners of film review websites will be able to create a more flexible movie rating system, thus offering users a chance to access opinions of others on each aspect of the movie to find out why it has such a rating. For instance, they will be able to learn that other people liked the leading actor due to emotions they experienced while watching the film, but they did not like the soundtrack since it did not correspond to the topic.

Some steps in this direction have already been taken: Subhabrata Mukherjee and Pushpak Bhattacharyya explored how to identify feature-specific expressions of opinion in product reviews that describe different features and contain mixed emotions.

Movie Reviews and Genre Classification

At present, movie genres are mainly identified manually by the people who moderate websites. These people may have a strong passion for movies, but they may still fail to identify a movie genre correctly.

As I mentioned previously, NLP tools can give researchers a helping hand in the identification of movie genres. Reviews of movies that belong to the same genre share some common features, which enables NLP tools to group them together effectively and quickly.

Nevertheless, developers of such tools will have to solve one issue: they need to choose the movie genre scheme they are going to use. Nowadays, movies rarely belong to a single genre; they represent a combination of multiple genres. E.g. IMDb says that "Star Trek Beyond", released in 2016, belongs to the following genres: action, adventure, sci-fi, and thriller. And this is true, as this movie contains features of all these genres (and some others that are not mentioned). This publication explores issues related to the genre classification more deeply (from the machine learning perspective).
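Because one movie can carry several genres at once, genre identification is a multi-label problem. The Python sketch below illustrates this with a toy keyword count over reviews: every genre that collects enough hits is assigned. The genre scheme, the term lists, the min_hits threshold and the sample reviews are all assumptions for illustration, not a description of how IMDb or any particular NLP tool works.

```python
import re
from collections import Counter

# Hypothetical genre scheme with toy indicator terms.
GENRE_TERMS = {
    "action": {"fight", "chase", "explosion", "battle"},
    "adventure": {"journey", "quest", "expedition", "explore"},
    "sci-fi": {"space", "alien", "starship", "future"},
    "romance": {"love", "kiss", "romance", "wedding"},
}


def genre_scores(reviews):
    """Count, per genre, how many indicator terms occur across all reviews."""
    counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z]+", review.lower())
        for genre, terms in GENRE_TERMS.items():
            counts[genre] += sum(t in terms for t in tokens)
    return counts


def assign_genres(reviews, min_hits=2):
    """Return every genre with at least min_hits term occurrences (multi-label)."""
    scores = genre_scores(reviews)
    return sorted(genre for genre, hits in scores.items() if hits >= min_hits)


reviews = [
    "The starship battle in deep space is stunning",
    "An alien ambush turns the journey into a long chase",
    "The crew's quest to explore the nebula drags a little",
]

print(assign_genres(reviews))
```

Returning a list of genres instead of a single best match is the design choice that matters here: it lets a movie be action, adventure and sci-fi at the same time.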

NLP and Similar Movies

Movies can belong to different genres, but have a similar impact on viewers. For instance, you may like "X-Men" (classified as an action, adventure and sci-fi movie by IMDb) thanks to the love story described in it. But if you try to find similar movies using existing review websites, you will be recommended another sci-fi movie, not the love story you are looking for.

NLP tools can do much more than sentiment analysis and genre classification. NLP solutions like Comparator can compare reviews and determine the degree of similarity between them. This case study describes how NLP solutions can help to manage media content.
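One common way to express the degree of similarity between two texts is cosine similarity over bag-of-words vectors. The Python sketch below shows this technique as a general illustration; it is not a description of how Comparator works internally, and the sample reviews are made up.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count how often each lowercase word occurs in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up review snippets for illustration.
love_story_1 = bag_of_words("a touching love story with a tender ending")
love_story_2 = bag_of_words("a tender love story that stays with you")
space_battle = bag_of_words("loud space battles and alien invaders")

print(cosine_similarity(love_story_1, love_story_2))
print(cosine_similarity(love_story_1, space_battle))
```

Reviews that talk about the same things score closer to 1, and reviews with no words in common score 0, regardless of the genre label attached to the movies.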


NLP is a powerful solution that can take the movie review system to the next level. The information obtained by these tools can be used by site owners to create extended movie reviews with a focus on specific aspects and to classify movies on the basis of both genre and similarity. It can also be used to make targeted advertising more effective.

Do not limit the sources of data for research to the ones devoted to movie reviews. Social media like Twitter and YouTube can rival such websites in terms of extractable data.



About the Author

Tatsiana Levdikova is a Tech Journalist at EffectiveSoft. She writes about software development, UI and UX, natural language processing, Big Data, AI, and other IT-related topics.
