Background: Unconscious bias within the U.S. health care system has been linked with disparities in the treatment of patients by age, gender, and race (1). While many factors contribute to these disparities, implicit bias may play a significant role. Stigmatizing language often reflects the implicit bias that health care providers hold toward patients (2). Recent research suggests that stigmatizing language is prevalent within medical records, is used more often in reference to minority patients, and is associated with passing bias on to other providers (3-5). However, we have limited information on the patterns of stigmatizing language in medical records to aid in the development of future interventions. The objective of this study was to characterize variation in the usage of stigmatizing terms across age, gender, and race.

Methods: We analyzed data from all clinical notes written at an academic medical center between January 1, 2015 and December 31, 2020. Our analysis included a variety of clinical notes such as history & physical examinations and discharge summaries. These notes were written by physicians, residents, advanced practice providers, registered nurses, and pharmacists in inpatient, outpatient, and emergency room settings. We used a simple regular-expression approach to identify twelve terms: compliance, poor historian, drug abuse, addict, lovely, cooperative, good historian, pleasant, defensive, manipulative, refused, and agitated. We selected these words from a list of stigmatizing words and phrases developed by the Centers for Disease Control and Prevention (CDC) and the American Psychological Association (APA). Our search covered the simple present and past tense forms of these terms. We applied second-order rules to exclude unrelated uses of a term (e.g., ‘lung compliance’). We performed our analyses using the Python programming language and the Apache Spark engine.
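
The term search and exclusion step described in the Methods can be sketched roughly as follows. This is a minimal illustration in plain Python, not the study's actual expressions or pipeline: the tense variants in `VARIANTS`, the single exclusion in `EXCLUDED_PRECEDING`, the context window size, and the function name `count_terms` are all assumptions made for the example.

```python
import re
from collections import Counter

# The twelve terms named in the Methods.
TERMS = [
    "compliance", "poor historian", "drug abuse", "addict", "lovely",
    "cooperative", "good historian", "pleasant", "defensive",
    "manipulative", "refused", "agitated",
]

# Assumed tense variants (hypothetical; the abstract does not list them).
VARIANTS = {
    "refused": r"refus(?:e|es|ed)",
    "agitated": r"agitat(?:e|es|ed)",
}

# One case-insensitive, whole-word pattern per term.
PATTERNS = {
    term: re.compile(r"\b" + VARIANTS.get(term, re.escape(term)) + r"\b",
                     re.IGNORECASE)
    for term in TERMS
}

# Second-order rule: skip a match when the preceding word marks an
# unrelated use. Only 'lung compliance' comes from the abstract.
EXCLUDED_PRECEDING = {"compliance": {"lung"}}

def count_terms(note: str) -> Counter:
    """Count occurrences of each term in one note, after exclusions."""
    counts = Counter()
    for term, pattern in PATTERNS.items():
        excluded = EXCLUDED_PRECEDING.get(term, set())
        for match in pattern.finditer(note):
            # Look at a short window of text just before the match.
            window = note[max(0, match.start() - 20):match.start()].split()
            previous = window[-1].strip(".,;:()").lower() if window else ""
            if previous in excluded:
                continue
            counts[term] += 1
    return counts

example = ("Patient refused medication. Lung compliance decreased. "
           "Poor compliance with diet; pleasant and cooperative.")
example_counts = count_terms(example)
```

In this sketch, the first ‘compliance’ in the example note is skipped because it follows ‘lung’, while the second is counted. At the scale of the corpus described here, a function like this would presumably be applied per note inside a distributed job rather than in a single process.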

Results: The corpus included 104,456,653 notes from 1,960,689 unique patients and contained 192,688,381,883 characters of text. Our analysis suggests that the use of stigmatizing language varies across age (Figure 1) and gender (Figure 2), with ‘compliance’ and ‘cooperative’ found more often than the other terms. However, use of these terms remained relatively consistent by race. Understanding variation in usage by patient characteristics has the potential to aid in future study and inform targeted interventions to reduce the presence of stigmatizing language.

Conclusions: Understanding variation in usage by age, gender identity, and racial identity has the potential to aid in future study of the role of these terms in perpetuating stigma and bias toward patients. With the passage of the 21st Century Cures Act, hospitals are required to offer patients access to their health records, extending the impact of clinicians’ words to the patient-physician relationship.

IMAGE 1: Figure 1

IMAGE 2: Figure 2