This function takes a data frame of tweets as input. The input data frame must contain a column named 'tweet' that holds the tweet text. The function cleans the text in the 'tweet' column by removing http links, punctuation, and stop words, converting letters to lowercase, and stemming words. It then matches each word to a 'positive' or 'negative' sentiment using the 'bing' lexicon from tidytext. The output is a data frame that lists the words used in the tweets, the sentiment assigned to each word, and the number of times each word appears, sorted by that count in descending order.
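The steps described above can be approximated with a short tidytext pipeline. The following is a minimal sketch, not the package's actual source: the name `sentiment_sketch` and the demo data are hypothetical, and the URL pattern is an assumption about what "http component" covers. Stemming is left out here because stemmed forms (for example "happi" for "happy") may no longer match entries in the lexicon, and this page does not show where the original function applies it.

```r
library(dplyr)
library(stringr)
library(tidytext)

# Hypothetical re-implementation for illustration only.
sentiment_sketch <- function(tweets) {
  tweets %>%
    # Strip URLs first so link fragments are not tokenized as words.
    mutate(tweet = str_replace_all(tweet, "https?://\\S+", "")) %>%
    # unnest_tokens() splits text into one word per row; by default it
    # also lowercases the words and drops punctuation.
    unnest_tokens(word, tweet) %>%
    # Remove common stop words using the list shipped with tidytext.
    anti_join(stop_words, by = "word") %>%
    # Keep only words found in the Bing lexicon, attaching their sentiment.
    inner_join(get_sentiments("bing"), by = "word") %>%
    # Count occurrences per word and sort by frequency, highest first.
    count(word, sentiment, sort = TRUE)
}

# Made-up input with the required 'tweet' column.
demo <- data.frame(
  tweet = c("I love this! https://example.com",
            "Love it, so happy",
            "What an awful day"),
  stringsAsFactors = FALSE
)
result <- sentiment_sketch(demo)
```

The result has the same three columns as the output of `sentiment_analysis()`: `word`, `sentiment`, and `n`.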
sentiment_analysis(tweet)
| Argument | Type |
|---|---|
| tweet | data.frame |
tweet_result: a data.frame with the columns word, sentiment, and n
#> # A tibble: 126 x 3
#>    word      sentiment     n
#>    <chr>     <chr>     <int>
#>  1 love      positive     40
#>  2 happy     positive     25
#>  3 awesome   positive      4
#>  4 killed    negative      4
#>  5 beautiful positive      3
#>  6 blow      negative      3
#>  7 cool      positive      3
#>  8 fuck      negative      3
#>  9 magic     positive      3
#> 10 merry     positive      3
#> # ... with 116 more rows