A Practical Cybersecurity Ontology Generator Based on Hierarchical Clustering and Multi-way Tree
DOI:
https://doi.org/10.9781/ijimai.2026.6499
Keywords:
Cybersecurity, Hierarchical Clustering, Multi-way Tree, Ontology Engineering, Ontology Learning
Abstract
Cybersecurity ontology development is typically carried out by cybersecurity experts and ontology engineers. While some existing works focus on extracting cybersecurity knowledge from either textual or structured data, few address the challenge of handling both types of data simultaneously. This paper presents Locust, a tool that integrates structured data and a domain corpus for comprehensive cybersecurity ontology generation. We use open source cybersecurity specifications as structured input to build the skeleton of the ontology, and use the domain corpus to enrich and finalise it. Additionally, we propose a methodology for filtering and simplifying the ontology using hierarchical clustering and a multi-way tree. Experimental results demonstrate the effectiveness of our approach in acquiring a cybersecurity ontology from domain-specific data sources. Locust is implemented in Java and is available as an open source tool.