Using Local Grammar for Entity Extraction from Clinical Reports
DOI:
https://doi.org/10.9781/ijimai.2015.332
Keywords:
Information Technology, NLP, Medical Entities
Abstract
Information Extraction (IE) is a natural language processing (NLP) task that analyzes texts written in natural language to extract structured, useful information such as named entities and the semantic relations linking them. It is an important task for many applications, such as biomedical literature mining, customer care, community websites, and personal information management. The growing volume of information in patient clinical reports is difficult to access: because it is often in unstructured text form, doctors need tools that let them access and search it. Hence, a system that extracts this information in a structured form can benefit healthcare professionals. The work presented in this paper uses a local grammar approach to extract medical named entities from French patient clinical reports. Experimental results show that the proposed approach achieved an F-measure of 90.06%.