Problem 5 - Twitter analysis

Q (Optional): What is the fraction of tweets that are neutral, i.e. neither happy nor sad?

How do I calculate this? In the problem set, there are about 10 tweets: 6 of them are happy and 2 are sad. How do I find the tweets that are neutral?
Are 2 tweets neutral? Out of 10 tweets, 6 are happy and 2 are sad, which leaves the other 2 neutral.
Will we have neutral tweets if number of happy tweets == number of sad tweets?

I’m not sure if this is the most efficient way to calculate the number of neutral tweets, but it is how I approached the problem:

It is easier to classify tweets as happy/sad than to identify those that are neither, i.e. filtering by

for i in range(number_of_tweets):
    for h_word in happy_words:

as opposed to

for i in range(number_of_tweets):
    for h_word not in happy_words:

By definition, we have that

number_of_neutral_tweets = number_of_tweets - (number_of_happy_tweets + number_of_sad_tweets)

This can be achieved by first assuming that all tweets are neutral, then removing happy/sad tweets from this number in a for loop using the happy/sad filters (which I will leave for you to complete 🙂).

You are correct though, there are 2 neutral tweets in the dataset.
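
For anyone who wants to check their own attempt, here is a minimal sketch of the "assume everything is neutral, then subtract" idea described above. The word lists, the sample tweets and the helper names (`contains_any`, `count_neutral_tweets`) are all made up for illustration and are not the course's data or solution. It also assumes that a tweet counts as happy/sad when it contains one of the listed words as a whole token.

```python
# Hypothetical word lists and tweets, for illustration only.
happy_words = ["happy", "great", "love"]
sad_words = ["sad", "awful", "hate"]

tweets = [
    "I love this course",
    "what a great day",
    "I hate mondays",
    "meeting at noon",
    "the bus is late",
]


def contains_any(tweet, words):
    """Return True if any of the given words appears as a token in the tweet."""
    tokens = tweet.lower().split()
    return any(word in tokens for word in words)


def count_neutral_tweets(tweets, happy_words, sad_words):
    """Start by assuming every tweet is neutral, then remove happy/sad ones."""
    number_of_neutral_tweets = len(tweets)
    for tweet in tweets:
        # A tweet that matches either list is not neutral; it is removed once,
        # even if it happens to match both lists.
        if contains_any(tweet, happy_words) or contains_any(tweet, sad_words):
            number_of_neutral_tweets -= 1
    return number_of_neutral_tweets


print(count_neutral_tweets(tweets, happy_words, sad_words))  # 2
```

Splitting on whitespace is the simplest possible tokenizer; punctuation attached to a word (e.g. "love!") would need extra handling.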


This question nudges you toward a for loop, but I think that is the most expensive solution in terms of computational cost.

neutral_fraction = 1 - ((number_of_happy_tweets + number_of_sad_tweets) / number_of_tweets)

Plain and simple. Am I wrong?
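
As a quick check of the formula above, here is a small sketch that plugs in the numbers quoted in this thread (10 tweets, 6 happy, 2 sad). The function name `neutral_fraction` and the use of `fractions.Fraction` (to avoid floating-point rounding) are my own choices, not something the problem set asks for. Note that the happy and sad counts still have to come from classifying each tweet somewhere.

```python
from fractions import Fraction


def neutral_fraction(number_of_tweets, number_of_happy_tweets, number_of_sad_tweets):
    """Fraction of tweets that are neither happy nor sad.

    Assumes the happy and sad counts do not overlap.
    """
    if number_of_tweets <= 0:
        raise ValueError("number_of_tweets must be positive")
    classified = number_of_happy_tweets + number_of_sad_tweets
    return 1 - Fraction(classified, number_of_tweets)


result = neutral_fraction(10, 6, 2)
print(result)         # 1/5
print(float(result))  # 0.2
```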

Since happy, sad and neutral tweets are mutually exclusive (a happy tweet cannot also be sad or neutral), your first approach is fine to implement.
Yes, we may have neutral tweets even when number of happy tweets == number of sad tweets.
For instance:
total number of tweets = 10
number of happy tweets = 2
number of sad tweets = 2
neutral tweets = total - happy - sad = 10 - 2 - 2 = 6