Hua Xu, PhD
Associate Professor
Dr. Hua Xu is an associate professor at the School of Biomedical Informatics at The University of Texas Health Science Center at Houston (UTHealth), where he directs the Center for Computational Biomedicine. He currently chairs the American Medical Informatics Association (AMIA) Natural Language Processing (NLP) working group. Dr. Xu received his Ph.D. in Biomedical Informatics from Columbia University in 2008. He also holds a B.S. in Biochemistry from Nanjing University in China and an M.S. in Computer Science from the New Jersey Institute of Technology. Dr. Xu is an expert in biomedical text processing and data mining. His primary research interests are: 1) natural language processing of clinical text; 2) text mining of biomedical literature; and 3) healthcare data mining. He is the author of many publications on biomedical NLP and text mining, and his research on medication extraction received the Homer Warner Award from AMIA in 2009. Dr. Xu has been principal investigator on a number of grants, including R01s from the National Library of Medicine (NLM) and the National Cancer Institute (NCI).
Contact
Hua.Xu@uth.tmc.edu
School of Biomedical Informatics
7000 Fannin St., Suite 600, Houston, TX 77030
Phone: 713-500-3924
Education
- Ph.D. in Biomedical Informatics, 2008, Columbia University, New York, NY
- M.Phil in Biomedical Informatics, 2007, Columbia University, New York, NY
- M.S. in Computer Science, 2001, New Jersey Institute of Technology, Newark, NJ
- B.S. in Biochemistry, 1998, Nanjing University, Nanjing, P.R. China
Research Areas
NLP Methods:
- named entity recognition (NER)
- abbreviation detection and disambiguation
- syntactic/semantic parsing
- active learning
- temporal information/relation extraction and modeling
NLP Systems:
- Medication information extraction – MedEx
- Development of comprehensive clinical NLP systems
NLP and data mining applications:
- EMR-based epidemiological studies of cancers
- Informatics approaches to Pharmacogenomics
- Drug-ADE detection (pharmacovigilance) from EHR
- Literature mining of genes and environmental factors
Funding
Current Grants:
Repurposing Existing Drugs for Cancer Treatment using Electronic Health Records
CPRIT (Cancer Prevention & Research Institute of Texas), Rising Star (PI – Hua Xu)
2012 - 2017 (actual starting date is pending)
Role: PI
An In-silico Method for Epidemiological Studies Using Electronic Medical Records
NCI R01CA141307 (PI – Hua Xu)
09/03/2009 – 07/31/2013
Role: PI (20%)
Real-time Disambiguation of Abbreviations in Clinical Notes
NLM R01LM010681 (PI – Hua Xu)
05/31/2010 – 05/30/2013
Role: PI (20%)
Autonomic Cardiovascular Regulation
NHLBI P01HL056693 (PI - David Robertson)
05/01/2012 – 04/30/2017
Role: Co-investigator (5%)
Completed Grants:
An Informatics-based Approach to Pharmacogenetic Studies of Warfarin
NIH UL1 RR024975-KL2 Scholar Award (PI – Hua Xu)
07/01/2009 – 06/30/2010
Role: PI (70%)
MOMENT (Monitoring for Outpatient Medication Effects and New Toxicities) in TIME
NLM R01 LM007995 (PI - Randy Miller)
02/01/2004 – 06/14/2010
Role: Co-Investigator (10%)
VGER – Vanderbilt Genomic Electronic Medical Records (eMERGE-I)
NHGRI U01 HG004603 (PI - Dan Roden)
09/28/2007 – 07/31/2011
Role: Co-Investigator (10%)
VESPA - Vanderbilt Electronic Systems for Pharmacogenomic Assessment
NIH RC2GM092618 (PI - Dan Masys & Dan Roden)
09/30/2009 – 08/31/2011
Role: Co-Investigator (20%)
VGER – Vanderbilt Genomic Electronic Records Project (eMERGE-II)
NHGRI U01 HG006378 (PI - Dan Roden)
08/15/2011 – 07/31/2015
Role: Co-investigator (5%), Participated between 2011-2012
PGRN - Pharmacogenomics of Arrhythmia Therapy (PAT)
NHLBI U19HL065962 (PI - Dan Roden)
09/01/2010 – 06/30/2015
Role: Co-investigator (30%), Participated between 2010-2012
Evidence-based diagnostic tools for translational and clinical research (eTfor2)
NLM R01LM010828 (PI – Randy Miller)
09/30/2010 – 09/29/2013
Role: Co-investigator (10%), Participated between 2010-2012
From GWAS to PheWAS: Scanning the EMR Phenome for Gene-Disease Associations
NLM R01LM010685 (PI - Josh Denny)
09/01/2011 – 08/31/2014
Role: Co-investigator (5%), Participated between 2011-2012
Publications
- Xu H, Wu Y, Elhadad N, Stetson PD, Friedman C. A new clustering method for detecting rare sense of abbreviations in clinical notes. J Biomed Inform. 2012, In Press.
- Liu M, Wu Y, Chen Y, Sun J, Zhao Z, Chen X, and Xu H. Large-scale prediction of adverse drug reaction by integrating chemical, biological, and phenotypic properties of drugs. J Am Med Inform Assoc. 2012. 19(e1): e28-e35. [PMCID:PMC3392844]
- Denny JC, Schildcrout JS, Bowton EA, Gregg W, Pulley JM, Basford MA, Cowan J, Xu H, Ramirez AH, Crawford DC, Ritchie MD, Peterson JF, Masys DR, Wilke RA, Roden DM. Optimizing drug outcomes through pharmacogenetics: A case for preemptive genotyping. Clin Pharmacol Ther. 2012. In Press.
- Wu Y, Levy MA, Micheel CM, Yeh P, Tang B, Cantrell MJ, Cooreman SM, Xu H. Identifying the status of genetic lesions in cancer clinical trial documents using machine learning. BMC Genomics 2012. In Press.
- Han B, Chen XW, Talebizadeh Z, Xu H. Genetic studies of complex human diseases: characterizing SNP-disease associations using Bayesian networks. BMC Syst Biol. 2012. In Press.
- Lu Y, Xu H, Peterson NB, Dai Q, Jiang M, Denny JC, Liu M. Extracting epidemiological exposure and outcome terms from literature using machine learning approaches. International Journal of Data Mining and Bioinformatics, 2012. In Press.
- Roden DM, Xu H, Denny JC, Wilke RA. Electronic Medical Records as a Tool in Clinical Pharmacology: Opportunities and Challenges. Clin Pharmacol Ther. 2012, Apr 25. [PMID:22534870]
- Delaney JT, Ramirez AH, Bowton EA, Pulley JM, Basford MA, Schildcrout JS, Shi Y, Zink R, Oetjens M, Xu H, Cleator JH, Jahangir E, Ritchie MD, Masys DR, Roden DM, Crawford DC, Denny JC. Predicting clopidogrel response using DNA samples linked to an electronic health record. Clin Pharmacol Ther. 2012 Feb;91(2):257-63. [PMID: 22190063]
- Birdwell KA, Grady B, Choi L, Xu H, Bian A, Denny JC, Jiang M, Vranic G, Basford M, Cowan JD, Richardson DM, Robinson MP, Ikizler TA, Ritchie MD, Stein CM, Haas DW. The use of a DNA biobank linked to electronic medical records to characterize pharmacogenomic predictors of tacrolimus dose requirement in kidney transplant recipients. Pharmacogenet Genomics. 2012 22(1):32-42. [PMCID: PMC3237759]
- Ramirez AH, Shi Y, Schildcrout JS, Delaney JT, Xu H, Oetjens MT, Zuvich RL, Basford MA, Bowton EA, Jiang M, Speltz P, Zink R, Cowan J, Pulley JM, Ritchie MD, Masys DR, Roden DM, Crawford DC, Denny JC. Predicting warfarin dosage in European and African Americans using DNA samples linked to an electronic health record. Pharmacogenomics. 2012, 13(4):407-18. [PMCID: PMC3361510]
- Sun J, Xu H, Zhao Z. Network-assisted investigation of antipsychotic drugs and their targets. Chem Biodivers. 2012, 9(5): 900-10. [PMID:22589091]
- Sun J, Wu Y, Xu H, Zhao Z. DTome: a web-based tool for drug-target interactome construction. BMC Bioinformatics. 2012, 13(Suppl 9): 57.
- Carroll RJ, Thompson WK, Eyler AE, Mandelin AM, Cai T, Zink RM, Pacheco JA, Boomershine CS, Lasko TA, Xu H, Karlson EW, Perez RG, Gainer VS, Murphy SN, Ruderman EM, Pope RM, Plenge RM, Kho AN, Liao KP, Denny JC. Portability of an algorithm to identify rheumatoid arthritis in electronic health records. J Am Med Inform Assoc. 2012 Feb 28. [PMID: 22374935]
- Doan S, Collier N, Xu H, Pham HD, and Tu MP. Recognition of medication information from discharge summaries using ensembles of classifiers. BMC Medical Informatics and Decision Making. 2012, 12(1):36. [PMID: 22564405]
- Chen Y, Mani S, Xu H. Applying active learning to assertion classification of concepts in clinical text. J Biomed Inform 2012, 45(2): 265-272. [PMCID: PMC3306548]
- Wilke RA, Xu H, Denny JC, Roden DM, Krauss RM, McCarty CA, Davis RL, Skaar T, Lamba J, and Savova G. The emerging role of electronic medical records in pharmacogenomics. Clin Pharmacol Ther. 2011, 89(3): 379-86. [PMCID: PMC3204342]
- Rosenbloom ST, Denny JC, Xu H, Lorenzi N, Stead WW, Johnson KB. Data from clinical notes: a perspective on the tension between structure and flexible documentation. J Am Med Inform Assoc. 2011, 18(2):181-6. [PMCID: PMC3116264]
- Jiang M, Chen Y, Liu M, Rosenbloom ST, Mani S, Denny JC, Xu H. A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. J Am Med Inform Assoc. 2011, 18(5):601-6. [PMCID: PMC3168315]
- Xu H, Jiang M, Oetjens M, Bowton EA, Ramirez AH, Jeff JM, Basford MA, Pulley JM, Cowan JD, Wang X, Ritchie MD, Masys DR, Roden DM, Crawford DC, Denny JC. Facilitating pharmacogenetic studies using electronic health records and natural language processing: a case study of warfarin. J Am Med Inform Assoc. 2011; 18(4): 387-91. [PMCID: PMC3128409]
- Xu H, AbdelRahman S, Lu Y, Denny JC, Doan S. Applying semantic-based probabilistic context free grammar to medical language processing – a preliminary study on parsing medication sentences. J Biomed Inform 2011, 44(6): 1068-75. [PMCID: PMC3226929]
- Xu H, Stenner SP, Doan S, Johnson KB, Waitman LR, Denny JC. MedEx – A Medication Information Extraction System for Clinical Narratives. J Am Med Inform Assoc. 2010; 17(1):19-24. [PMCID: PMC2995636]
- Denny JC, Peterson JF, Choma NN, Xu H, Miller RA, Bastarache L, Peterson NB. Development of a Natural Language Processing System to Identify Timing and Status of Colonoscopy Testing in Electronic Medical Records. J Am Med Inform Assoc. 2010; 17(4): 393-8. [PMCID: PMC2815478]
- Doan S, Bastarache L, Klimkowski S, Denny JC, Xu H. Integrating Existing NLP Tools for Medication Extraction from Discharge Summaries. J Am Med Inform Assoc. 2010, 17:528-31. [PMCID: PMC2995674]
- Xu H, Stetson P, Friedman C. Methods for Building Sense Inventories of Abbreviations in Clinical Notes. J Am Med Inform Assoc. 2009 16(1):103-108. [PMCID: PMC2605589]
- Chen ES, Hripcsak G, Xu H, Markatou M, Friedman C. Automated Acquisition of Disease-Drug Knowledge from Biomedical and Clinical Documents. J Am Med Inform Assoc. 2008, 15(1):87-98. [PMCID: PMC2274872]
- Tulipano KP, Tao Y, Millar WS, Zanzonico P, Kolbert K, Xu H, Yu H, Chen L, Lussier YA, Friedman C. Natural language processing and visualization in the molecular imaging domain. J Biomed Inform. 2007; 40:3, 270-281. [PMID: 17084109]
- Fan JW, Xu H, Friedman C. Using Contextual and lexical features to restructure and validate the classification of biomedical concepts. BMC Bioinformatics. 2007; 8: 264. [PMCID: PMC2014782]
- Xu H, Fan JW, Hripcsak G, Mendonça EA, Markatou M, Friedman C. Gene symbol disambiguation using knowledge-based profiles. Bioinformatics, 2007 23(8):1015-1022. [PMID: 17314123]
- Xu H, Markatou M, Dimova R, Liu H, Friedman C. Machine learning and word sense disambiguation in the biomedical domain: design and evaluation issues. BMC Bioinformatics. 2006; 7:334. [PMCID: PMC1550263]
- Lee HT, Krichevsky IE, Xu H, Ota-Setlik A, D'Agati VD, Emala CW. Local anesthetics worsen renal function after ischemia-reperfusion injury in rats. Am J Physiol Renal Physiol. 2004; 286(1):F111-9. [PMID: 14519592]
- Lee HT, Xu H, Nasr SH, Schnermann J, Emala CW. A1 adenosine receptor knockout mice exhibit increased renal injury following ischemia and reperfusion. Am J Physiol Renal Physiol. 2004; 286(2):F298-306. [PMID: 14600029]
- Lee HT, Xu H, Ota-Setlik A, Emala CW. Oxidant preconditioning protects human proximal tubular cells against lethal oxidant injury via p38 MAPK and heme oxygenase-1. Am J Nephrol. 2003; 23(5):324-33. [PMID: 12915776]
- Lee HT, Ota-Setlik A, Xu H, D'Agati VD, Jacobson MA, Emala CW. A3 adenosine receptor knockout mice are protected against ischemia- and myoglobinuria-induced renal failure. Am J Physiol Renal Physiol. 2003; 284(2):F267-73. [PMID: 12388399]
- Lee HT, Xu H, Siegel CD, Krichevsky IE. Local anesthetics induce human renal cell apoptosis. Am J Nephrol. 2003; 23(3):129-39. [PMID: 12586958]
Peer Reviewed Articles - Conference:
- Tang B, Cao H, Wu Y, Jiang M, Xu H. Clinical Entity Recognition using Structural Support Vector Machines with Rich Features. ACM Sixth International Workshop on Data and Text Mining in Biomedical Informatics (DTMBIO), 2012, In Press.
- Wiley LK, Shah A, Xu H, Bush WS. ICD-9 Tobacco Use Codes are Effective Identifiers of Smoking Status. Translational Bioinformatics Conference (TBC), 2012, Korea, In Press.
- Xu H, Stetson PD, Friedman C. Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations. AMIA Annu Symp Proc. 2012. In Press.
- Wu Y, Denny JC, Rosenbloom ST, Miller RA, Giuse DA, Xu H. A comparative study of current clinical natural language processing systems on handling abbreviations in discharge summaries. AMIA Annu Symp Proc. 2012. In Press.
- Liu M, Shah A, Min J, Peterson NB, Dai Q, Aldrich MC, Chen Q, Bowton EA, Liu H, Denny JC, Xu H. A study of transportability of an existing smoking status detection module across institutions. AMIA Annu Symp Proc. 2012. In Press.
- Jiang M, Denny JC, Tang B, Cao H, Xu H. Extracting semantic lexicons from discharge summaries using machine learning and c-value method. AMIA Annu Symp Proc. 2012. In Press.
- Wu Y, Liu M, Zheng W, Zhao Z, Xu H. Ranking gene-drug relationships in biomedical literature using latent dirichlet allocation. Pac Symp Biocomput. 2012: 422-33. [PMID: 22174297]
- Xu H, Fu Z, Shah A, Chen Y, Peterson NB, Chen Q, Mani S, Levy MA, Dai Q, Denny JC. Extracting and integrating data from entire electronic health records for detecting colorectal cancer cases. AMIA Annu Symp Proc. 2011, 1564-72. [PMCID: PMC3244156]
- Liu M, Kawai VK, Stein CM, Denny JC, Roden DM, Xu H. Modeling drug exposure data in electronic medical records: an application to warfarin. AMIA Annu Symp Proc. 2011, 815-23. [PMCID: PMC3243123]
- Wu Y, Rosenbloom ST, Denny JC, Miller RA, Mani S, Giuse DA, Xu H. Detecting abbreviations in discharge summaries using machine learning methods. AMIA Annu Symp Proc. 2011, 1541-9. [PMCID: PMC3243185]
- Xu H, AbdelRahman S, Jiang M, Fan JW, Huang Y. An Initial Study of Full Parsing of Clinical Text using the Stanford Parser. International Workshop on Biomedical and Health Informatics, IEEE Conference of Bioinformatics and Biomedicine (BIBM), 2011.
- Xu H, Doan S, Birdwell KA, Cowan JD, Vincz AJ, Haas DW, Basford MA, Denny JC. An automated approach to calculating the daily dose of tacrolimus in electronic health records. AMIA Summits Transl Sci Proc. 2010:71-5. [PMCID: PMC3041548]
- Denny JC, Speltz P, Maddox R, Stein G, Xu H, Spickard A. Comparing Content Coverage in Medical Curriculum to Trainee-Authored Clinical Notes. AMIA Annu Symp Proc. 2010, 157-161. [PMCID: PMC3041398]
- Doan S and Xu H. Recognizing Medication related Entities in Hospital Discharge Summaries using Support Vector Machine. COLING 2010, the 23rd International Conference on Computational Linguistics, 259-266.
- Xu H, Lu Y, Jiang M, Liu M, Denny JC, Dai Q, Peterson NB. Mining Biomedical Literature for Terms related to Epidemiologic Exposures. AMIA Annu Symp Proc. 2010, 897-901. [PMCID: PMC3041399]
- Fan JW, Xu H, Friedman C. Using Distributional Analysis to Semantically Classify UMLS Concepts. In Proceedings of Medinfo. 2007; 519-23. [PMID: 17911771]
- Xu H, Fan JW, Friedman C. Combining multiple evidence for gene symbol disambiguation. ACL 2007, BioNLP Workshop, p41-48.
- Xu H, Stetson P, Friedman C. A Study of Abbreviations in Clinical Notes. AMIA Annu Symp Proc. 2007; 821-5.
- Xu H, Anderson K, Grann V, Friedman C. Facilitating Cancer Research using Natural Language Processing of Pathology Reports. In Proceedings of Medinfo. 2004; 565-72. [PMID: 15360876]

