Classifying Professional Photographers on Instagram: Data Collection and Processing for Computational Learning

Authors

DOI:

https://doi.org/10.9781/ijimai.2026.2211

Keywords:

Computational Social Science, Data-Driven Evaluation, Data Mining, Instagram, Photography Capabilities, User Expertise

Abstract

The surge in open data on the internet allows researchers to investigate and broaden the understanding of many disciplines. However, methodologies for identifying artistic skills remain underdeveloped, particularly in the field of expertise finding, because such skills are subjective and few datasets are available. We therefore saw an opportunity in the popularity of photo-sharing platforms to create a dataset for identifying the profiles of professional photographers. Our first contribution is a comprehensive, multimodal dataset that encompasses a wide array of attributes from 29 679 Instagram posts, originating from 1042 user profiles labelled as belonging to professional or non-professional photographers. Using this dataset, we explored different machine learning (ML) models to assess their efficacy in classifying these profiles into their respective categories. The Random Forest (RF) model showed the best performance, capturing the common structure of professional photographers’ Instagram profiles. Further statistical analysis revealed significant differences between the two types of profiles. The most important features for identifying a professional photographer are the number of users tagged, the technical score of their posts, and the height variance of their pictures. The results of this work can inform future research and offer practical applications across multiple real-world scenarios.
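
The three features named in the abstract are profile-level quantities derived from post-level data. The following minimal sketch shows one way such an aggregation could look. It is not the authors' pipeline: the field names (`profile`, `tagged_users`, `technical_score`, `height`), the sample values, and the choice of population variance are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Hypothetical post records; field names and values are illustrative only
# and are not taken from the published dataset.
posts = [
    {"profile": "a", "tagged_users": 3, "technical_score": 6.1, "height": 1350},
    {"profile": "a", "tagged_users": 1, "technical_score": 5.9, "height": 1080},
    {"profile": "b", "tagged_users": 0, "technical_score": 4.2, "height": 1080},
    {"profile": "b", "tagged_users": 0, "technical_score": 4.8, "height": 1080},
]


def profile_features(posts):
    """Aggregate post-level records into one feature dict per profile."""
    grouped = defaultdict(list)
    for post in posts:
        grouped[post["profile"]].append(post)

    features = {}
    for profile, items in grouped.items():
        features[profile] = {
            "n_posts": len(items),
            # Total number of users tagged across the profile's posts.
            "tagged_users_total": sum(p["tagged_users"] for p in items),
            # Mean technical quality score of the profile's posts.
            "technical_score_mean": mean(p["technical_score"] for p in items),
            # Population variance of image heights (assumed; sample
            # variance would be an equally plausible choice).
            "height_variance": pvariance([p["height"] for p in items]),
        }
    return features


features = profile_features(posts)
```

A table of this shape, one row per profile, is the kind of input a classifier such as a Random Forest would typically be trained on.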

Author Biographies

Sofia Strukova, University of Zurich

Sofia Strukova has an interdisciplinary background in computer science and Big Data. She earned her B.Sc. in computer science from Moscow Power Engineering Institute, Russia, and subsequently her M.Sc. in Big Data and Ph.D. in Computational Social Science from the University of Murcia, Spain. Her research interests revolve around computational social science, expertise finding, educational technology, data mining and data science in general. More info at https://strukovas.github.io/

Daniel Sánchez-Rodríguez, Universidad de Murcia

Daniel Sánchez-Rodríguez received his B.Sc. degree in computer science from the University of Murcia. His research interests include artificial intelligence, software development, data analysis, and computer science in general.

José A. Ruipérez-Valiente, Universidad de Murcia

José A. Ruipérez-Valiente received his B.Eng. degree in telecommunications from Universidad Católica de San Antonio de Murcia in 2011 and an M.Eng. degree in telecommunications in 2013, together with his M.Sc. and Ph.D. degrees (2014 and 2017) in telematics from Universidad Carlos III of Madrid while conducting research with Institute IMDEA Networks in the area of learning analytics and educational data mining. He was a postdoctoral associate at MIT. He has received more than 20 academic/research awards and fellowships, has authored more than 130 scientific publications in high-impact venues, and has participated in over 24 funded projects. He is currently an Associate Professor of Computer Science and Artificial Intelligence at the University of Murcia. More info at https://webs.um.es/jruiperez

Published

2026-03-10

How to Cite

Strukova, S., Sánchez-Rodríguez, D., and Ruipérez-Valiente, J. A. (2026). Classifying Professional Photographers on Instagram: Data Collection and Processing for Computational Learning. International Journal of Interactive Multimedia and Artificial Intelligence. https://doi.org/10.9781/ijimai.2026.2211

Issue

Section

Regular Articles