Hi,

I work with ICD-10 codes regularly, and I'm currently involved in a project where several thousand of them are relevant. I use the icd10cm commands to generate categories and descriptions, but I'm hoping to find a higher-level categorization scheme, at the level of "Diabetes", "Heart disease", "STDs", etc. For example, icd10cm currently reduces the codes to "Type 1 diabetes mellitus", "Type 2 diabetes mellitus", "Other specified diabetes mellitus", "Diabetes mellitus due to underlying condition", etc., whereas I'd prefer a single category called "Diabetes".

So far I've been building these groups manually using string matches and validating the results afterward, but it's a lengthy and involved process. Has anyone found Stata code for quickly reducing ICD-10 codes to major categories, or does anyone know of a dataset that maps ICD-10 codes to these top-level categories?
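
To illustrate the kind of grouping I mean, here is a minimal sketch rather than a finished solution. It assumes a string variable, hypothetically named dx, that holds the codes. The variable names and the broad labels are made up for illustration. The ranges are the ICD-10-CM blocks E08-E13 (diabetes mellitus), I20-I25 (ischemic heart diseases, which is narrower than "Heart disease" in general), and A50-A64 (infections with a predominantly sexual mode of transmission).

```stata
* Hypothetical example data: dx is an assumed variable name
clear
input str8 dx
"E10.9"
"E11.65"
"I25.10"
"A54.00"
"J45.909"
end

* Three-character category: the part of the code before the decimal point
generate str3 cat3 = upper(substr(dx, 1, 3))

* Broad groups defined by three-character ranges (labels are illustrative)
generate str30 broad = ""
replace broad = "Diabetes"               if inrange(cat3, "E08", "E13")
replace broad = "Ischemic heart disease" if inrange(cat3, "I20", "I25")
replace broad = "STDs"                   if inrange(cat3, "A50", "A64")

* Codes that matched no range stay blank and show up as missing here
tabulate broad, missing
```

Range comparisons on the three-character category avoid matching on description text, but each range still has to be entered and checked by hand, which is the part I'd like to replace with an existing mapping.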