Medical Dictionary Coding

Recording and storing data in a controlled, consistent and reproducible manner for data retrieval and analysis is a necessity for regulatory compliance and clinical study success. To provide control and consistency, a variety of medical coding dictionaries may be used to process, analyze and report collected data. These dictionaries range in size and complexity from simple code lists with a few entries to large and complex dictionary systems containing thousands of entries and related tables. Two examples of commonly used dictionaries are the Medical Dictionary for Regulatory Activities (MedDRA) and the World Health Organization Drug Dictionary (WHO Drug). Processes must be established for managing the release of multiple versions of the same dictionary, handling different dictionaries or versions that have been used, and integrating data coded with different dictionaries or versions.


AE dictionaries are needed to group data for meaningful analysis. MedDRA is the ICH-developed and recommended dictionary for all medical events captured in clinical trials, including, but not limited to, AEs. MedDRA is not just another dictionary; it is a distinct approach to thinking about medical information. Managers of medical information have an imperative to understand the flexibility of MedDRA as well as the implications that its storage and implementation can have on safety reporting. Use of MedDRA requires an understanding of its levels of terms and of its multi-axial functionality. The levels of terms used in MedDRA, from most to least specific, are as follows: lowest level term (LLT), preferred term (PT), high level term (HLT), high level group term (HLGT), and system organ class (SOC).

Recognizing the increase in global studies and in the submission of marketing applications to multiple regulatory agencies, the International Conference on Harmonisation (ICH) undertook the development of a global dictionary, which resulted in MedDRA. The US Food and Drug Administration (FDA) is currently using MedDRA in its Adverse Event Reporting System (AERS), per the FDA Reviewer Guidance: Conducting a Clinical Safety Review of a New Product Application and Preparing a Report on the Review and Attachment B, the Clinical Safety Review of an NDA. Also available is the FDA guidance on classifying significant drug safety issues. Coding specialists can be certified via the MSSO Certified MedDRA Coder (CMC) exam.

The organization responsible for publishing and maintaining MedDRA is the MSSO (Maintenance and Support Services Organization). Guidelines developed by the ICH and the MSSO include MedDRA® Term Selection: Points to Consider and MedDRA® Terms Selection: Data Retrieval and Presentation, an additional study guide, and the Recommendations for MedDRA Versioning for Summary Report. Since the initial release of MedDRA, revisions have addressed topics in the following areas: updated assignments to system organ class (SOC); consistent use of terminology; retirement of terms from current status; and addition of new terms identified during implementation of the dictionary in clinical studies. MedDRA is a multiaxial dictionary, meaning that a preferred term (PT) may be associated with multiple SOCs. Each PT, however, is associated with only one primary SOC, regardless of the number of secondary SOCs with which it is associated.
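The multiaxial structure described above can be sketched as a small data model. This is only an illustration: the class name, the function name, and the placeholder terms ("PT-A", "SOC-1", and so on) are assumptions and are not taken from any MedDRA release. It shows why summaries are usually counted by primary SOC, so that a PT linked to several SOCs is counted once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferredTerm:
    """A PT with exactly one primary SOC and any number of secondary SOCs."""
    name: str
    primary_soc: str
    secondary_socs: frozenset = frozenset()

    def all_socs(self) -> set:
        # Every SOC under which this PT can be retrieved.
        return {self.primary_soc} | set(self.secondary_socs)


def count_by_primary_soc(coded_pts):
    """Count coded events per SOC using only the primary SOC of each PT."""
    counts = {}
    for pt in coded_pts:
        counts[pt.primary_soc] = counts.get(pt.primary_soc, 0) + 1
    return counts


# Placeholder terms for illustration only (not real MedDRA content).
pt_a = PreferredTerm("PT-A", "SOC-1", frozenset({"SOC-2", "SOC-3"}))
pt_b = PreferredTerm("PT-B", "SOC-2")

print(count_by_primary_soc([pt_a, pt_b, pt_a]))
print(sorted(pt_a.all_socs()))
```

Counting by primary SOC avoids double-counting an event, while all_socs() supports retrieval along the secondary axes.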

CTCAE: The NCI’s Common Terminology Criteria for Adverse Events (CTCAE) is used to classify the nature and severity of adverse events. Version 4.0 was released in 2009. Work is currently underway to integrate CTCAE with MedDRA.

WHO Drug: WHO Drug is one of the more commonly used subscription-based dictionaries. It was designed by the World Health Organization (WHO) for coding medications in clinical studies. A variety of dictionaries or medication references provide information about prescription, generic, and over-the-counter (OTC) medications, as well as herbal supplements. In 2005, the Uppsala Monitoring Centre (UMC) introduced the WHO Drug Dictionary Enhanced (WHO-DDE) Browser. The WHO-DDE combines data from the original WHO Drug Dictionary (WHO-DD) with additional country-specific drug information collected through the UMC’s collaboration with IMS Health (an international consulting and data services company). The WHO-DDE is therefore several times larger than the WHO-DD. The dictionary is introduced in two documents: Best Practices for the use of the WHO Drug Dictionaries and Introduction to the WHO Drug Dictionary.

Other Dictionaries: Although MedDRA and WHO Drug are the most commonly used dictionaries for clinical studies and postmarket surveillance, the following list briefly describes a few established but not as widely used dictionaries.

  • WHO ART—WHO Adverse Reactions Terminology, a dictionary for coding adverse reaction terms.
  • COSTART—FDA Coding Symbols for a Thesaurus of Adverse Reaction Terms, which has since been replaced by MedDRA.
  • SNOMED CT—College of American Pathologists Systematized Nomenclature of Medicine–Clinical Terms, for medical history, treatments and outcomes.
  • ICD-9—Published by WHO in 1977, this dictionary consists of codes for diagnoses and procedures. It was subsequently updated to ICD-9-CM.
  • ICD-10—Published by WHO in 1992 and implemented in most of the world, but not adopted in the US. It was originally designed to report mortality; modified versions, ICD-10-CM and ICD-10-PCS, have since been created.


Medical Coding Tools and Methods

In addition to the actual dictionaries and software applications used to house them, CDM personnel and dictionary users should be familiar with the following tools and methods used in dictionary management.

  1. Autoencoders: A programmatically assisted process for matching a reported term to a dictionary term.
  2. Manual Coding: Manual coding refers to a situation where a person selects an appropriate dictionary entry for each reported term, either in the patient database or in a module of the dictionary application that deals with discrepancies. Both integrated and standalone manual coding applications must be fully validated according to current regulatory standards. Additional features to consider for a manual coding application are the ability to review coded terms for accuracy and consistency, the ability to query a term when it cannot be coded, audit trails that record the user and date/time a term was coded, and extensive, easy-to-use search capabilities.
  3. Hybrid Approaches to Coding: A hybrid approach to coding uses an autoencoder to first automatically code those reported terms that match a dictionary term or that match a term that has previously been coded (i.e., a synonym list). The terms that are not autoencoded are then manually coded. Many clinical data management systems and standalone coding applications support this hybrid approach to coding.
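The hybrid approach can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the behavior of any particular coding application: the function names, the normalization rule (case and whitespace only), and the sample dictionary and synonym list are all hypothetical.

```python
def normalize(term):
    """Lower-case and collapse whitespace so trivial differences still match."""
    return " ".join(term.lower().split())


def hybrid_code(reported_terms, dictionary_terms, synonym_list):
    """Autoencode exact and synonym matches; queue the rest for manual coding.

    synonym_list maps a previously coded reported term to its dictionary term.
    """
    dict_index = {normalize(t): t for t in dictionary_terms}
    syn_index = {normalize(k): v for k, v in synonym_list.items()}

    coded = {}          # reported term -> (dictionary term, how it was matched)
    manual_queue = []   # reported terms that a person must code

    for reported in reported_terms:
        key = normalize(reported)
        if key in dict_index:
            coded[reported] = (dict_index[key], "dictionary match")
        elif key in syn_index:
            coded[reported] = (syn_index[key], "synonym list")
        else:
            manual_queue.append(reported)
    return coded, manual_queue


# Hypothetical sample data for illustration only.
dictionary_terms = ["Headache", "Nausea"]
synonym_list = {"head ache": "Headache"}
reported = ["HEADACHE", "head  ache", "feeling weird"]

coded, manual_queue = hybrid_code(reported, dictionary_terms, synonym_list)
print(coded)
print(manual_queue)
```

Recording how each term was matched alongside the result supports the review and audit-trail needs noted for manual coding.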

Overall, auto-encoding is a highly recommended practice for applying a dictionary to AEs. Because auto-encoding depends on the quality of the reported text, training on recording AE text should include guidelines such as the following:

  • Avoid use of adjectives as initial words (e.g., “weeping wound” may be coded to “crying”; “faint rash” may be coded to “syncope”).
  • Avoid the use of symbols and abbreviations in the AE text, as they may be interpreted differently.
  • Avoid inclusion of severity in the AE text (e.g., “severe headache” in the AE text inhibits auto-encoding; severity should be recorded in the severity field, not the AE text).
  • Ensure that AE text has a clinical meaning (e.g., “bouncing off the walls” and “feeling weird” are difficult to interpret).
  • Ensure that AE text has a clear meaning (e.g., “cold feeling” may be interpreted as “chills” or “flu symptoms”).
  1. Hard-coding: Hard-coding, or coding outside the clinical database, is generally a dangerous practice. Conventionally, many sponsors make use of quotation marks to indicate verbatim text that is passed through by a program to the preferred-term field. Any use of hard-coding requires careful documentation.
  2. Lumping and Splitting: Coders can be categorized into “lumpers” and “splitters.” No universally agreed-upon method exists for handling AE text with more than one event.

When two events are reported in the same text field (e.g., “indigestion and diarrhea”) and splitting is done by the data management staff rather than the site, inconsistencies within the database may result.  Medical judgment may also be inadvertently introduced into the database by the data manager. If the severity of the compound event is recorded as “severe,” the duplication of the attributes of the AE imputes “severe” to the other event(s). However, this outcome may not reflect the physician’s judgment for that particular component of the AE. Coding of AEs has significant impact on the analysis and interpretation of the safety data for a product. The perspective that coding is a clerical function is naïve and risky.
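Some of the AE text guidelines, together with the compound-event problem just described, lend themselves to a simple screening step that flags text for a site query instead of having data management split or reinterpret it. The sketch below is only illustrative: the function name, the severity word list, and the regular expressions are assumptions, and a flag is a prompt for human review, not a coding decision.

```python
import re

# Hypothetical word list and patterns; a real study would define its own.
SEVERITY_WORDS = {"mild", "moderate", "severe"}
COMPOUND_PATTERN = re.compile(r"\band\b|[,;/&]", re.IGNORECASE)
SYMBOL_PATTERN = re.compile(r"[<>#@+=%~]")


def screen_ae_text(text):
    """Return a list of reasons the AE text should be reviewed or queried."""
    issues = []
    words = set(re.findall(r"[a-z]+", text.lower()))
    if SEVERITY_WORDS & words:
        issues.append("severity in AE text")
    if COMPOUND_PATTERN.search(text):
        issues.append("possible compound event")
    if SYMBOL_PATTERN.search(text):
        issues.append("symbol in AE text")
    return issues


for sample in ["indigestion and diarrhea", "severe headache", "BP >140", "headache"]:
    print(sample, "->", screen_ae_text(sample))
```

A pattern this simple will also flag legitimate single events whose names contain "and", which is why the result goes to a reviewer or back to the site and is never used to split a term automatically.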


A computerized tool used to aid in accessing terms in a specified dictionary is called a browser. Browsers are designed to quickly find terms of interest and should be flexible, intuitive, and quick to use.

  • Stand-alone browsers—These are applications that allow for the easy search and review of dictionaries. Some also possess a capability for limited linking to external applications (e.g., study databases), where one may not be able to effect a term or coding change from the browser, but would be able to call (or open) the browser from within the dictionary application.
  • WHO Drug—Several WHO Drug browsers with differing feature sets exist, including one produced by the Uppsala Monitoring Centre (which is an entity of WHO that works with international drug monitoring).
  • MedDRA—An application has been provided by the MSSO for searching the MedDRA dictionary, but other vendor-created browsers also exist, with differing feature sets.
  • Browsers that are contained within dictionary management systems have enhanced capabilities, although the availability of these enhanced capabilities varies across available systems. Some of these systems can act as a browser, as well as a vehicle for importing and exporting individual reported terms or a batch of reported terms. The various coding approaches outlined above can be performed once the terms are imported into the system.


Change Control

Dictionary version control is crucial because updated versions of dictionaries frequently change pathways to body systems or organ classes. Such changes in a dictionary can have a substantial effect on conclusions regarding a product’s effects on the body. Thus, the version of a dictionary used for classification of AEs into body systems can impact the labeling of the product. To ensure reproducibility, the version of the dictionary used in any study should be stored with the database.

The practice of modifying published dictionaries is clearly discouraged by the ICH for the MedDRA dictionary. Coding dictionaries may be available in electronic and/or printed format, and multiple versions may be released or published. The dictionary and version used for a given project, time period, or data set should be clearly documented. Where this information is documented may vary between organizations, but the dictionary and version should be referenced in clinical study reports or integrated summaries that report on the coded terms. For multiple ongoing studies, the study team should determine which dictionary and version will be used for coding each study. A systematic process and instructions should be in place to ensure the consistent use of the appropriate dictionary and version. Processes should be established for evaluation of the extent of changes between versions, the impact of changes on previously coded terms, and criteria for recoding and implementing the latest version. Using different dictionaries or versions over a period of time increases the importance of version control, documentation and standardized data reconciliation processes. Dictionary and version information may be maintained within the clinical database, within the autoencoder as the dictionary files are loaded, or within the metadata of data sets containing coded data. Process steps for installing and upgrading to new dictionary versions may vary between organizations and specific dictionaries.
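Evaluating the extent of changes between versions, and their impact on previously coded terms, can be sketched as a comparison of two term-to-SOC mappings. This is a minimal illustration, not a real dictionary upgrade procedure: the function names, the record fields ("dictionary", "version", "pt"), the version labels, and the placeholder terms are all hypothetical.

```python
def diff_versions(old_map, new_map):
    """Compare two {PT: primary SOC} mappings from different versions."""
    changed = {
        pt: (old_map[pt], new_map[pt])
        for pt in old_map
        if pt in new_map and old_map[pt] != new_map[pt]
    }
    removed = sorted(set(old_map) - set(new_map))
    added = sorted(set(new_map) - set(old_map))
    return changed, removed, added


def records_needing_review(coded_records, changed, removed):
    """Select coded records whose PT was reassigned or is no longer present."""
    affected = set(changed) | set(removed)
    return [rec for rec in coded_records if rec["pt"] in affected]


# Placeholder mappings and records for illustration only.
version_old = {"PT-A": "SOC-1", "PT-B": "SOC-2", "PT-C": "SOC-3"}
version_new = {"PT-A": "SOC-1", "PT-B": "SOC-4", "PT-D": "SOC-3"}

# Each coded record stores the dictionary and version used to code it.
coded_records = [
    {"subject": "001", "pt": "PT-A", "dictionary": "EXAMPLE", "version": "old"},
    {"subject": "002", "pt": "PT-B", "dictionary": "EXAMPLE", "version": "old"},
    {"subject": "003", "pt": "PT-C", "dictionary": "EXAMPLE", "version": "old"},
]

changed, removed, added = diff_versions(version_old, version_new)
to_review = records_needing_review(coded_records, changed, removed)
print(changed, removed, added)
print([rec["subject"] for rec in to_review])
```

Keeping the dictionary name and version on every coded record is what makes a comparison like this reproducible when data from several studies or time periods are integrated.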