A Practical Cybersecurity Ontology Generator Based on Hierarchical Clustering and Multi-way Tree

Authors

DOI:

https://doi.org/10.9781/ijimai.2026.6499

Keywords:

Cybersecurity, Hierarchical Clustering, Multi-way Tree, Ontology Engineering, Ontology Learning

Supporting Agencies

This work is supported by the National Natural Science Foundation of China under grant 62572356 and the Innovation Funding Plan by Beijing TOPSEC Technologies Science and Technology Inc.

Abstract

Cybersecurity ontology development is typically carried out by cybersecurity experts and ontology engineers. While some existing works focus on extracting cybersecurity knowledge from either textual or structured data, few address the challenge of handling both types of data simultaneously. This paper presents Locust, a tool that integrates structured data and a domain corpus for comprehensive cybersecurity ontology generation. We use open-source cybersecurity specifications as structured input to build the skeleton of the ontology, and use the domain corpus to enrich and finalise it. Additionally, we propose a methodology for filtering and simplifying the ontology using hierarchical clustering and a multi-way tree. Experimental results demonstrate the effectiveness of our approach in acquiring a cybersecurity ontology from domain-specific data sources. Locust is implemented in Java and is available as an open-source tool.

Author Biographies

Yixuan Wang, Wuhan University

Yixuan Wang received his M.S. degree in Computer Science from the University of Electronic Science and Technology of China. After graduation, he worked as a software engineer at the Nanjing Research Institute of Electronic Engineering. He is now a Ph.D. candidate in the School of Cyber Science and Engineering, Wuhan University. His research interests include the semantic web and network security.

Bo Zhao, Wuhan University

Dr. Zhao received his Ph.D. degree in Computer Science from Wuhan University, China. He is now a professor in the School of Cyber Science and Engineering, a member of the China Cryptography Society, and a senior member of the China Computer Society. His current research interests include trusted computing, system security, and network security. As project team leader, he has successfully completed many high-quality research projects, including projects sponsored by the National Science Fund of China. He has published over 100 journal and conference papers as the first or corresponding author, and has been granted more than 30 patents by the State Intellectual Property Office of China. He has also published the book Trusted Computing, which has been adopted as a textbook by many universities.

Xiaofu Song, Wuhan University

Xiaofu Song received her M.S. degree in Computer Science from the Central China Normal University and worked as a security engineer at China Telecommunication Corporation after graduation. She is currently a Ph.D. candidate in the School of Cyber Science and Engineering, Wuhan University. Her research interests include intrusion detection and network security.

Jiahui Zhu, University of Electronic Science and Technology of China

Jiahui Zhu received the M.S. degree in Computer Science from the University of Electronic Science and Technology of China. He is currently pursuing the Ph.D. degree with the School of Information and Software Engineering, University of Electronic Science and Technology of China. His research interests include big data and data mining.

Published

2026-03-13

How to Cite

Wang, Y., Zhao, B., Song, X., and Zhu, J. (2026). A Practical Cybersecurity Ontology Generator Based on Hierarchical Clustering and Multi-way Tree. International Journal of Interactive Multimedia and Artificial Intelligence, 1–13. https://doi.org/10.9781/ijimai.2026.6499

Section

Regular Articles