cDeid is a de-identification tool for clinical letters.
Copyright (C) 2015 Azad Dehghan.

CONTACT: azad.dehghan@gmail.com

SUMMARY:
cDeid v0.1 (US version) is a de-identification tool with state-of-the-art performance. The current version includes the following NERs: PATIENT, DOCTOR, USERNAME, STREET, ZIP, STATE, COUNTRY, PHONE, FAX, URL, EMAIL, AGE, MEDICALRECORD and IDNUM. The tool was developed and validated using the i2b2/UTHealth 2014 Track I data.

TODO:
- Include NERs: DATE, HOSPITAL, ORGANIZATION and PROFESSION
- Include ML models
- Include priority sorting

USAGE:
cDeid is a simple command line tool. The source code should be straightforward to dissect and integrate into your own application (see Controller.java for an example). In addition, validationtools.jar can be used for further development and validation (see the commented-out 'TESTING' code in Controller.java).

Basic usage of the cDeid executable:

Use case 1: Print the help screen.

    java -jar cDeid.jar -h

Use case 2: Process a set of plain text files in directory inputdir/ and save the results in outputdir/.

    java -jar cDeid.jar --xml inputdir/ outputdir/

CONTRIBUTORS:
...

REFERENCE:
I would appreciate it if you would cite the following paper when using or referring to cDeid:

[1] A. Dehghan et al., Combining knowledge- and data-driven methods for de-identification of clinical narratives, J Biomed Inform (2015), http://dx.doi.org/10.1016/j.jbi.2015.06.029
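
EXAMPLE:
The command from use case 2 can also be launched from another Java program as a child process. The following is a minimal sketch, not part of the cDeid source: the class CDeidRunner and its methods are hypothetical names, the paths are placeholders, and only the command line shown under USAGE is assumed. How cDeid reports failures (for example through its exit code) is not documented here, so the sketch simply prints whatever exit code it receives.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical wrapper class for illustration; not part of the cDeid source.
class CDeidRunner {

    // Builds the command line from use case 2:
    //   java -jar <jarPath> --xml <inputDir> <outputDir>
    static List<String> buildCommand(String jarPath, String inputDir, String outputDir) {
        return List.of("java", "-jar", jarPath, "--xml", inputDir, outputDir);
    }

    // Starts cDeid as a child process, waits for it, and returns its exit code.
    static int run(String jarPath, String inputDir, String outputDir)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(jarPath, inputDir, outputDir));
        pb.inheritIO(); // forward the child's stdout/stderr to this process
        return pb.start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder paths; adjust them to your own layout.
        int exitCode = run("cDeid.jar", "inputdir/", "outputdir/");
        System.out.println("cDeid exited with code " + exitCode);
    }
}
```

Running cDeid as a separate process keeps the caller independent of cDeid's internal classes. For tighter integration, Controller.java remains the place to start, as noted under USAGE.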