Role of Semantic Web in Overcoming Challenges in Biomedical Data Integration

Author: Krishna Kumar Kookal

Primary Advisor: Muhammad Walji, PhD

Committee Members: Dean F. Sittig, PhD; Todd R. Johnson, PhD

Master's thesis, The University of Texas School of Biomedical Informatics at Houston.


Data integration in the biomedical domain has great potential to improve healthcare by providing researchers and physicians with a large body of knowledge to help them understand diseases and make better treatment decisions. Biomedical data are inherently complex and present various challenges during integration. Every data integration technique in the biomedical domain must deal with the high volume and diversity of the data. This paper identifies the most important and common data integration challenges through a literature review. Various data integration approaches are in use in the biomedical domain today. This review focuses specifically on the semantic web and its role in overcoming these data integration challenges. The semantic web provides a robust data and knowledge representation platform that accommodates the rich and complex nature of biomedical data. Given the dynamic nature of the biomedical domain, the semantic web also proves to be a platform that can accommodate new changes and discoveries without modification of the underlying infrastructure, thus encouraging reusability. The most important benefit of the semantic web over current data integration approaches is its ability to deal with the meaning, or semantics, of the participating resources in addition to their syntax. Many semantic web standards and components are still in the early stages of development and continue to evolve. Recent advances in semantic web technologies look promising and represent an important step towards solving the data integration problem in the biomedical domain.