Classifying Professional Photographers on Instagram: Data Collection and Processing for Computational Learning
DOI:
https://doi.org/10.9781/ijimai.2026.2211
Keywords:
Computational Social Science, Data-Driven Evaluation, Data Mining, Instagram, Photography Capabilities, User Expertise
Abstract
The surge in open data on the internet allows researchers to investigate and broaden the understanding of many disciplines. However, methodologies for identifying artistic skills remain underdeveloped, particularly in the field of expertise finding, because such skills are subjective and datasets are scarce. We therefore saw an opportunity in the popularity of photo-sharing platforms to create a dataset for identifying the profiles of professional photographers. Our first contribution is a comprehensive, multimodal dataset that encompasses a wide array of attributes from 29 679 Instagram posts, originating from 1042 user profiles labelled as professional or non-professional photographers. Using this dataset, we explored different machine learning (ML) models to assess their efficacy in classifying these profiles into their respective categories. The Random Forest (RF) model showed the best performance, capturing the common structure of professional photographers’ Instagram profiles. Further statistical analysis revealed significant differences between the two types of profiles. The most important features for identifying a professional photographer are the number of users tagged, the technical score of their posts, and the height variance of their pictures. These results can inform future research and offer practical applications across multiple real-world scenarios.