Larry Cook, MStat, PhD
Associate Professor
University of Utah
Dr. Lawrence Cook is an Associate Professor at the University of Utah School of Medicine’s Department of Pediatrics. He has two decades of experience with probabilistic linkage theory and its application to motor vehicle crash and health care databases. As the principal investigator for the Utah Crash Outcome Data Evaluation System (CODES) Project and the CODES Data Network Technical Resource Center, Dr. Cook led an effort to standardize probabilistic linkage practices and coding of data sets among all participating states. He has authored more than 40 papers and technical reports on probabilistic linkage theory and analysis of linked databases.
Cody Olsen, MS
Biostatistician
University of Utah
Cody Olsen is a biostatistician in the Division of Critical Care and Department of Pediatrics at the University of Utah. Cody works with investigators throughout the country to study emergency care for children, rare pediatric diseases, and injury. He provides statistical support for clinical trials carried out by the Pediatric Emergency Care Applied Research Network (PECARN) and for research projects within the Utah Crash Outcome Data Evaluation System (Utah CODES) and the National Pediatric Multiple Sclerosis Centers (NPMSC) research network. Cody has particular interest and experience in probabilistic linkage, multiple imputation, centralized statistical monitoring, and non-parametric methods.
For this introductory course, participants should be familiar with basic concepts of collecting and storing variables in databases. Participants should also have an introductory exposure to statistics.
The goal of this workshop is to provide participants with a foundation for understanding and applying probabilistic linkage methodology.
At the conclusion of the workshop, participants will be able to:
In an Injury Prevention Editor’s Blog post, Dr. Scott Parker states, ‘I am not an expert in data linkage, nor am I up to the challenge of linking various data sources, however I am acutely aware that NOT linking data is a huge obstacle for injury prevention.’ This workshop is aimed at researchers who are motivated to overcome this obstacle. Participants will gain a firm understanding of basic probabilistic linkage methodology and will learn how probabilistic linkage can aid injury control research and surveillance efforts.
Often the information required to examine an injury control problem, perform surveillance, or answer a research question resides in separate, disparate databases. For example, event information is often available in motor vehicle crash, poison control or law enforcement databases, while information regarding medical or other outcomes is contained in separate databases, including emergency medical services, hospital billing, vital records or judicial court records. In the era of big data, where many (often large) databases are available electronically, the ability to link data sources will become essential. If the necessary databases do not share a common unique identifier, then obtaining the desired result may seem impossible.
Probabilistic linkage is used in a wide range of injury control areas to successfully link disparate databases when common unique identifiers do not exist. This workshop will cover the essentials of probabilistic linkage for non-statisticians. Through examples and descriptions of methodological and practical issues, participants will learn the strengths, weaknesses, dos and don’ts of probabilistic linkage. We will begin with several motivating examples drawn from injury control. A brief overview of the history, main concepts and ethical concerns of probabilistic linkage will follow. Technical details will be explained, including how to calculate match weights and probabilities. We will explain how multiple imputation may be used when commonly used powerful identifiers, such as names or dates of birth, are not available. We will also cover how to determine the feasibility of a linkage based on the available variables. An overview of several different linkage software packages will be provided.
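To give a flavor of the match weight and probability calculations mentioned above, here is a minimal sketch in Python. It assumes the widely used Fellegi-Sunter formulation, which this description does not name explicitly, and it treats fields as conditionally independent. The field names, the m- and u-probabilities, and the prior are hypothetical values chosen only for illustration; they are not taken from the workshop materials.

```python
import math

# Hypothetical values for illustration only.
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
FIELDS = {
    "date_of_birth": (0.95, 0.01),
    "sex": (0.98, 0.50),
    "zip_code": (0.90, 0.05),
}


def field_weight(m, u, agrees):
    """Log2 likelihood ratio contributed by one field comparison."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_weight(agreement):
    """Sum the field weights for a record pair.

    `agreement` maps each field name to True (agrees) or False (disagrees).
    Summing assumes the fields are conditionally independent.
    """
    return sum(
        field_weight(*FIELDS[name], agrees)
        for name, agrees in agreement.items()
    )


def match_probability(weight, prior):
    """Convert a total weight into a posterior match probability.

    `prior` is the assumed probability that a randomly chosen record
    pair is a true match before any fields are compared.
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 2 ** weight
    return posterior_odds / (1 + posterior_odds)


if __name__ == "__main__":
    pair = {"date_of_birth": True, "sex": True, "zip_code": False}
    w = match_weight(pair)
    print(f"weight = {w:.2f}, probability = {match_probability(w, 0.001):.4f}")
```

In this sketch, agreement on a rarely coinciding field such as date of birth adds far more weight than agreement on sex, which is why the loss of powerful identifiers makes a linkage harder.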