Apache cTAKES Components

Sentence boundary detection

Apache OpenNLP technology with a model trained on manually annotated clinical data (see Savova et al, 2010)

Tokenization

Rule-based (see Savova et al, 2010)

Morphologic normalization

National Library of Medicine's Lexical Variant Generation (LVG) tool: http://www.nlm.nih.gov/research/umls/new_users/online_learning/LEX_004.htm

POS tagging

Apache OpenNLP technology with a model trained on manually annotated clinical data (see Savova et al, 2010; upcoming 2013 publication)

Shallow parsing

Apache OpenNLP technology with a model trained on manually annotated clinical data (see Savova et al, 2010)

Named Entity Recognition

See Savova et al, 2010

Assertion module

Discovers negation, degree of certainty and the subject/experiencer of the clinical event (upcoming 2013 publication)

Dependency parser

Detects dependency relations between words (machine learning with a model trained on manually annotated clinical data) (see Choi and Palmer, 2011a; Choi and Palmer, 2011b; upcoming 2013 publication)

Constituency parser

Apache OpenNLP technology with a model trained on manually annotated clinical data (see Zheng et al, 2012)

Semantic Role Labeler

Assigns the predicate-argument structure of the sentence (who did what to whom, when, and where) (see Choi and Palmer, 2011a; Choi and Palmer, 2011b; upcoming 2013 publication)

Co-reference resolver

Resolves co-referring entities (machine learning with a model trained on manually annotated clinical data) (see Zheng et al, 2012)

Relation extractor

Discovers attributes such as the location and the severity of a clinical condition (machine learning with a model trained on manually annotated clinical data) (upcoming 2013 publication)

Drug Profile module

Discovers drug-specific attributes such as dosage, duration, form, frequency, route, strength (see Sohn et al, 2010; Savova et al, 2011)

Smoking status classifier

Classifies document/patient as past smoker, current smoker, non-smoker, smoker, unknown (see Savova et al, 2008)
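
As a rough illustration of how the earlier components in the list feed the later ones, the Python sketch below chains a toy sentence splitter, a toy rule-based tokenizer, and a cue-word negation check. Everything in it is an assumption made for illustration: the function names, the regular expressions, and the NEGATION_CUES set are hypothetical and are not part of cTAKES, whose components rely on models trained on manually annotated clinical data rather than on these shortcuts.

```python
import re

# Hypothetical cue words, for illustration only (not the cTAKES assertion model).
NEGATION_CUES = {"no", "not", "denies", "without"}


def split_sentences(text):
    """Toy sentence boundary detection: split after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Toy rule-based tokenization: runs of word characters or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def annotate(text, terms):
    """Return one record per term mention, marked negated if a cue word precedes it
    within the same sentence."""
    mentions = []
    for sentence in split_sentences(text):
        tokens = [t.lower() for t in tokenize(sentence)]
        for i, tok in enumerate(tokens):
            if tok in terms:
                negated = any(t in NEGATION_CUES for t in tokens[:i])
                mentions.append({"term": tok, "negated": negated})
    return mentions


note = "Patient denies fever. She reports a persistent cough."
result = annotate(note, {"fever", "cough"})
print(result)
```

On the sample note, "fever" is marked as negated and "cough" is not. The point of the sketch is only the ordering: sentence boundaries constrain tokenization, and tokens within one sentence are what a later step such as negation detection inspects.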

Select Publications:

Choi J, Palmer M. Getting the most out of Transition-based Dependency Parsing. 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT 2011). Portland, OR, 2011a.

Choi J, Palmer M. Transition-based Semantic Role Labeling Using Predicate Argument Clustering. RELMS 2011: Relational Models of Semantics, held in conjunction with ACL-HLT 2011. Portland, OR, 2011b.

Savova G, Olson J, Murphy S, Cafourek V, Couch F, Goetz M, Ingle J, Suman V, Chute C, Weinshilboum R. The electronic medical record and drug response research: automated discovery of drug treatment patterns for endocrine therapy of breast cancer. Journal of the American Medical Informatics Association 2011.

Savova GK, Masanz JJ, Ogren PV, Zheng J, Sohn S, Kipper-Schuler KC, Chute CG. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. Journal of the American Medical Informatics Association 2010;17(5):507-13.

Savova G, Ogren P, Duffy P, Buntrock J, Chute C. Mayo Clinic System for patient smoking status classification. Journal of the American Medical Informatics Association 2008;15(1):25-8. PMID: 17947622

Sohn S, Murphy SP, Masanz JJ, Kocher JP, Savova GK. Classification of medication status change in clinical narratives. AMIA Annual Symposium Proceedings 2010;2010:762-6.

Zheng J, Chapman W, Miller T, Lin C, Crowley R, Savova G. A system for coreference resolution for the clinical narrative. Journal of the American Medical Informatics Association 2012. doi:10.1136/amiajnl-2011-000599

Copyright © 2011-2013 The Apache Software Foundation, Licensed under the Apache License, Version 2.0.
Apache and the Apache feather logo are trademarks of The Apache Software Foundation.