Training with More Data Boosts Accuracy of Sentiment Classification to 89%
The article examines how the amount of training data affects the accuracy of sentiment classification using Naïve Bayes. Sentiments are categorized as positive, negative, or neutral, with users classifying tweets based on specific keywords. The study uses five training sets of increasing size (5, 10, 25, 50, and 100 tweets) and measures classification accuracy for each. The results show that accuracy improves as the training set grows. For instance, accuracy ranges from 46% with 5 training tweets to 89% with 25 training tweets.
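The experiment described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's implementation: the tweets in `TRAIN` and `TEST` are invented toy examples, the function names are hypothetical, and the classifier is a plain multinomial Naïve Bayes with add-one (Laplace) smoothing, which the article does not specify. The sketch trains on the first n tweets for several values of n and reports the accuracy on a fixed test set.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data (NOT from the article), labeled by hand.
TRAIN = [
    ("i love this phone", "positive"),
    ("this service is terrible", "negative"),
    ("the store opens at nine", "neutral"),
    ("great battery and great screen", "positive"),
    ("i hate the slow delivery", "negative"),
    ("the package arrives on monday", "neutral"),
]
TEST = [
    ("i love the great screen", "positive"),
    ("i hate this terrible service", "negative"),
]


def train_naive_bayes(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):  # sorted for deterministic tie-breaking
        score = math.log(label_counts[label] / total_docs)  # log prior
        words_in_label = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing (an assumption of this sketch).
            likelihood = (word_counts[label][word] + 1) / (words_in_label + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, test_set):
    """Fraction of test tweets whose predicted label matches the true label."""
    correct = sum(classify(model, text) == label for text, label in test_set)
    return correct / len(test_set)


def accuracy_by_training_size(train, test, sizes):
    """Train on the first n examples for each n and record test accuracy."""
    return {n: accuracy(train_naive_bayes(train[:n]), test) for n in sizes}


if __name__ == "__main__":
    for n, acc in accuracy_by_training_size(TRAIN, TEST, [1, 3, 6]).items():
        print(f"{n} training tweets: accuracy {acc:.0%}")
```

To mirror the study's setup, the `sizes` argument would be `[5, 10, 25, 50, 100]` and `TRAIN` would hold at least 100 labeled tweets. The numbers produced by this toy data say nothing about the figures reported in the article.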