Sentiment Analysis on IMDb Movie Reviews Using Hybrid Feature Extraction Method

Authors

H. M. Keerthi Kumar, B. S. Harish, H. K. Darshan

DOI:

https://doi.org/10.9781/ijimai.2018.12.005

Keywords:

Classification, Sentiment Analysis, Hybrid Features, Short Text
Supporting Agencies
H M Keerthi Kumar has been financially supported by UGC under Rajiv Gandhi National Fellowship (RGNF) Letter no: F1-17.1/2016-17/RGNF-2015-17-SC-KAR-6370/(SA-III Website), JSSRF (University of Mysore), Karnataka, India.

Abstract

Social networking sites have become popular and common places for sharing a wide range of emotions through short texts. These emotions include happiness, sadness, anxiety, and fear. Analyzing short texts helps identify the sentiment expressed by the crowd. Sentiment analysis on IMDb movie reviews identifies the overall sentiment or opinion a reviewer expresses towards a movie. Many researchers are working on refining sentiment analysis models so that they clearly distinguish a positive review from a negative one. In the proposed work, we show that hybrid features, obtained by concatenating machine learning features (TF, TF-IDF) with lexicon features (positive-negative word count, connotation), give better results in terms of both accuracy and complexity when tested with classifiers such as SVM, Naïve Bayes, KNN, and Maximum Entropy. The proposed model clearly differentiates between positive and negative reviews. Because understanding the context of a review plays an important role in classification, hybrid features help capture that context and hence increase classification accuracy.
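
The concatenation of TF-IDF features with lexicon features described above can be illustrated with a minimal sketch. It is not the authors' implementation: the word lists `POSITIVE` and `NEGATIVE` are hypothetical toy lexicons standing in for the published opinion and connotation lexicons, the smoothed IDF formula is one common variant chosen here as an assumption, and the function names are illustrative only.

```python
import math
from collections import Counter

# Hypothetical toy lexicons (assumption); a real system would load
# published opinion and connotation lexicons instead.
POSITIVE = {"good", "great", "excellent", "loved"}
NEGATIVE = {"bad", "boring", "awful", "hated"}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF; this particular formula is an assumption of the sketch.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        length = len(doc) or 1  # guard against empty documents
        vectors.append([tf[t] / length * idf[t] for t in vocab])
    return vocab, vectors


def lexicon_features(doc):
    """Count positive and negative lexicon words in one tokenized document."""
    pos = sum(tok in POSITIVE for tok in doc)
    neg = sum(tok in NEGATIVE for tok in doc)
    return [pos, neg]


def hybrid_features(texts):
    """Concatenate each TF-IDF vector with its lexicon counts."""
    docs = [tokenize(t) for t in texts]
    vocab, tfidf = tfidf_vectors(docs)
    return vocab, [vec + lexicon_features(doc) for vec, doc in zip(tfidf, docs)]


reviews = ["great film loved it", "boring plot and awful acting"]
vocab, features = hybrid_features(reviews)
```

Each resulting row holds one TF-IDF weight per vocabulary term followed by the two lexicon counts, and could then be passed to any of the classifiers named in the abstract.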

Published

2019-06-01

How to Cite

Keerthi Kumar, H. M., Harish, B. S., and Darshan, H. K. (2019). Sentiment Analysis on IMDb Movie Reviews Using Hybrid Feature Extraction Method. International Journal of Interactive Multimedia and Artificial Intelligence, 5(5), 109–114. https://doi.org/10.9781/ijimai.2018.12.005
