I am trying to search pathology reports, stored as strings, to identify only those with a positive result for the bacterium H. pylori. The difficulty is that the reports are not written in a uniform way, they contain misspellings, and the term "H. pylori" often appears whenever the organism was tested for, not just when the result was positive.
What I have tried so far includes removing spaces, converting everything to lowercase, and using regexm in a number of iterations: replace hp_present = 0 if regexm(`path_report', "h.pylori is not seen"), replace hp_present = 0 if regexm(`path_report', "no h.pylori"), replace hp_present = 0 if regexm(`path_report', "negative for h.pylori"), and so on.
Then replace hp_present = 1 if regexm(`path_report', "h.pylori is seen"), replace hp_present = 1 if regexm(`path_report', "h.pylori positive"), and so on.
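To make the rule-based idea concrete, here is a minimal sketch written in Python rather than Stata, purely for illustration. It is not the code from this post. It assumes three steps: collapse spelling variants of the organism name into one token, drop history and rule-out mentions, then judge each sentence-like clause separately so that a later positive addendum can override an earlier negative line. The function name `classify` and all of the cue lists are hypothetical and would need tuning against real reports.

```python
import re

# Spelling variants of the organism name: "h pylori", "h. pylori", "hpylori",
# "helicobacter pylori", plus a few letter slips inside "pylori".
HP = re.compile(r"\b(?:h\.?|helicobacter)\s*p[yi]l+or[iy]\b")

# Mentions that only say the organism was suspected or previously present.
HISTORY = re.compile(r"\b(?:hx of|history of|r/o|rule out)\s+hpylori\b")

# Illustrative negation cues before or after the organism token.
NEGATED = re.compile(
    r"\b(?:no|not|negative|without)\b.*hpylori"
    r"|hpylori.*\b(?:not (?:seen|identified|present)|negative|absent)\b"
)


def classify(report):
    """Return 1 (positive), 0 (only negated mentions) or None (no finding)."""
    # Normalize the name first so the period in "H. pylori" does not split a clause.
    text = HP.sub("hpylori", report.lower())
    text = HISTORY.sub(" ", text)
    result = None
    for clause in re.split(r"[.;]", text):
        if "hpylori" not in clause:
            continue
        if NEGATED.search(clause):
            if result is None:
                result = 0
        else:
            # Any non-negated mention marks the report as positive.
            result = 1
    return result


print(classify("Stomach biopsies: Achtive chronic H. pylori gastritis."))
print(classify("Gastric antrum biopsies: no H. pylori. ADDENDUM: POSITIVE HPYLORI"))
```

The design choice here is to match a tolerant name pattern once and then reason about the words around it, instead of listing every full phrase. In Stata, the same patterns could probably be applied with the Unicode regex functions, although that is untested here.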
The problem is that in an internal validation, sensitivity was only 50% (I missed many of the reports that were positive for H. pylori). Does anyone have advice on how to approach this, where spelling errors, non-uniform wording, and content all need to be taken into account?
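For reference, sensitivity is the share of truly positive reports that the rules flag as positive, TP / (TP + FN). The sketch below, in Python for illustration only, computes it alongside specificity and lists the positions of the missed positives so those reports can be read for new patterns. The function name `validate` is hypothetical, and the flags are made-up toy values, not data from this validation.

```python
def validate(predicted, truth):
    """Compare rule-based flags with hand-reviewed labels (equal-length 0/1 lists)."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fn = sum(1 for p, t in pairs if p != 1 and t == 1)
    tn = sum(1 for p, t in pairs if p != 1 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    sensitivity = tp / (tp + fn) if tp + fn else None
    specificity = tn / (tn + fp) if tn + fp else None
    # Positions of truly positive reports that the rules failed to flag.
    missed = [i for i, (p, t) in enumerate(pairs) if p != 1 and t == 1]
    return sensitivity, specificity, missed


# Toy example only: four true positives, two of which the rules missed.
truth = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 0, 0]
print(validate(predicted, truth))  # (0.5, 1.0, [2, 3])
```

Reading the reports at the `missed` positions is usually the quickest way to find which phrasings or misspellings the current patterns do not cover.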
Examples of path_report strings are given below.
Any help is much appreciated.
nal diagnosis 1. "Duodenum biopsies: histologically unremarkable duodenal mucsa with slight vascular congestion. 2. Stomach biopsies: Achtive chronic H. pylori gastritis. 3. Tranverse colon polyp" polypectomy: TUbular adenoma fragmented. 4. Sessile cecum polyp polpyectomy: Hyperplastic poylpfragments. es PATHOLOGIST,MD Date Jun 08 2009

BRIEF CLINICAL HISTORY: GERD, hx of H pylori chronic gastritis OPERATIVE FINDINGS: POSTOPERATIVE DIAGNOSIS: Surgeon: Surgeon MD GROSS DESCRIPTIO: Specimen is submitted in formalin and labeled biopsies gastric antrum. The one fragment shows mucosal lymphoid aggregate with herminal center. Giemsa stains show organisms the morphology of which is consistent with H. pylori. Gastric antrum biopsies: Active chronic gastritis associated with H. pylori.

BRIEF CLINICAL HISTORY: GERD, hx of H pylori chronic gastritis OPERATIVE FINDINGS: POSTOPERATIVE DIAGNOSIS: Surgeon: Surgeon MD GROSS DESCRIPTIO: Specimen is submitted in formalin and labeled biopsies gastric antrum. The one fragment shows mucosal lymphoid aggregate with herminal center. Warthin-Starry stains pending. Gastric antrum biopsies: Active chronic gastritis not seen, no H. pylori.

BRIEF CLINICAL HISTORY: GERD, hx of H pylori chronic gastritis OPERATIVE FINDINGS: POSTOPERATIVE DIAGNOSIS: Surgeon: Surgeon MD GROSS DESCRIPTIO: Specimen is submitted in formalin and labeled biopsies gastric antrum. The one fragment shows mucosal lymphoid aggregate with herminal center. Warthin-Starry stains pending. Gastric antrum biopsies: Active chronic gastritis not seen, no H. pylori. ADDENDUM: POSITIVE HPYLORI

MEDICAL RECORD: 78907689 SURGEONPHYSICIAN: Surgeon MD PREOPERATIVE DX: r/o Helicobacter pylori Final diagnosis: rare comma shaped organisms seen consistent with H pylori

MEDICAL RECORD: 78907689 SURGEONPHYSICIAN: Surgeon MD PREOPERATIVE DX: r/o Helicobacter pylori Final diagnosis: no comma shaped organisms seen consistent with H pylori