Automatic Irony Detection using Feature Fusion and Ensemble Classifier
DOI: https://doi.org/10.9781/ijimai.2019.07.002
Keywords: Classification, Clustering, Feature Selection, Ensemble Methods, Sentiment Analysis, Feature Fusion, Irony, K-means
Abstract
With the advent of micro-blogging sites, users are pioneers in expressing their sentiments and emotions on global issues through text. Automatic detection and classification of sentiments such as sarcastic or ironic content in microblogging reviews is a challenging task. It requires a system that manages some kind of knowledge to interpret the sentiment expressed in text. The available approaches are quite limited in their capability and scope to detect ironic utterances present in the text. In this regard, the paper proposes feature fusion to provide knowledge to the system through alternative sets of features obtained from linguistic and content-based text features. The proposed work extracts five sets of linguistic features and fuses them with features selected using a two-stage feature selection method. To demonstrate the effectiveness of the proposed method, we conduct extensive experiments with different feature subsets. The performance of the proposed method is evaluated using Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Decision Tree (DT) and ensemble classifiers. The experimental results show that the proposed approach significantly outperforms the conventional methods.
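The idea of fusing linguistic and content-based features and then combining several classifiers can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not list the five linguistic feature sets, the two selection stages, or the voting scheme, so the cues in `linguistic_features`, the hand-picked `vocabulary`, and the unweighted `majority_vote` are all assumptions made for illustration.

```python
import re
from collections import Counter


def linguistic_features(text):
    """Hypothetical surface cues; the paper's five feature sets are not given in the abstract."""
    tokens = text.split()
    return [
        float(text.count("!")),  # number of exclamation marks
        float(text.count("?")),  # number of question marks
        float(sum(1 for t in tokens if t.isupper() and len(t) > 1)),  # all-caps words
    ]


def content_features(text, vocabulary):
    """Term counts over a vocabulary assumed to come from a prior feature selection step."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return [float(counts[word]) for word in vocabulary]


def fuse(text, vocabulary):
    """Feature fusion as plain concatenation of the two feature vectors (an assumption)."""
    return linguistic_features(text) + content_features(text, vocabulary)


def majority_vote(predictions):
    """Unweighted majority vote over the labels predicted by the base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    vocabulary = ["great", "delay"]  # illustrative selected terms
    print(fuse("Oh GREAT, another delay!", vocabulary))
    # Labels as they might come from SVM, LR, RF and DT for one review.
    print(majority_vote(["ironic", "ironic", "non-ironic", "ironic"]))
```

In practice the fused vector would be passed to each trained base classifier, and the vote would be taken over their predictions for the same review.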