There are four basic types of statistical analyses commonly used in epidemiological research, and the analysis you pick depends on two main criteria.
The first criterion is the type of data you have, which can be either individual data or binned data, which is also called group data.
So, for example, let’s say we want to know how many people out of 100 developed lung cancer in the past 5 years.
With individual data, we have information about each person, so we can tell whether or not each of the 100 people developed lung cancer.
So let’s say that 6 people developed lung cancer. If we have individual data, we can look at the individual characteristics of each of those 6 people, like their sex, age, race, or past history of migraines, and we can compare them to the people who didn’t develop lung cancer.
On the other hand, if we have group data, we don’t actually know which specific individuals out of the 100 people developed lung cancer.
So even though we know that 6 people developed lung cancer, we don’t know which 6 people they were or any of their individual characteristics.
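The difference between individual and group data can be sketched in a few lines of Python. Everything here is made up for illustration: the field names (`id`, `age`, `sex`, `lung_cancer`) and the values are assumptions, chosen only so that 6 of the 100 people are cases, as in the example.

```python
# Hypothetical individual-level data: one record per person.
# All values are invented; 6 of the 100 people are marked as cases.
people = [
    {
        "id": i,
        "age": 40 + i % 30,
        "sex": "F" if i % 2 else "M",
        "lung_cancer": i < 6,
    }
    for i in range(100)
]

# With individual data we can look up the characteristics of each case.
cases = [p for p in people if p["lung_cancer"]]
case_ages = [p["age"] for p in cases]

# Group (binned) data keeps only the totals, so we can no longer tell
# which people were the cases or what their characteristics were.
group_data = {"n_people": len(people), "n_cases": len(cases)}

print(len(cases), case_ages)
print(group_data)
```

Notice that `group_data` can be built from `people`, but not the other way around.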
The second criterion is the type of outcome or y-variable you’re measuring, which can be quantitative, categorical, or time to event.
Quantitative variables have a numeric value, like a person’s forced expiratory volume, or FEV, which is the amount of air, in liters, that a person can forcefully exhale in a set amount of time, usually one second.
A very fit person might have an FEV of 5 liters, while a less fit person might have an FEV of 3 liters.
On the other hand, categorical variables have distinct levels.
For example, we could use a categorical variable to record whether or not a person was diagnosed with lung cancer in the past five years.
And finally, time to event variables describe how long a person was followed before the event or outcome occurred.
For example, if we started following a person at age 50 and they developed lung cancer at age 53, then their time to event would be 3 years.
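As a minimal sketch, the time to event in this example is just the difference between the two ages. The function name `time_to_event` and its parameters are hypothetical, used only for illustration.

```python
def time_to_event(age_at_start, age_at_event):
    """Return the years of follow-up before the event occurred."""
    if age_at_event < age_at_start:
        raise ValueError("the event cannot occur before follow-up starts")
    return age_at_event - age_at_start


# Followed from age 50, lung cancer at age 53.
years = time_to_event(50, 53)
print(years)
```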
Now, one of the simplest and most widely used types of analysis is linear regression.
Linear regression uses individual data, and the outcome variable is always quantitative, while the exposure variable can be either categorical or quantitative.
For example, let’s say we want to figure out if there’s an association between the number of cigarettes smoked and FEV, so we ask 100 people how many cigarettes they smoke in a day and then measure each person’s FEV. In this study, the exposure is the number of cigarettes, so it’s quantitative, and the outcome is FEV, which is also quantitative.
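To make the cigarettes and FEV example concrete, here is a rough sketch of a simple linear regression fitted by ordinary least squares in plain Python. The helper `fit_line` is a hypothetical name, and the five data points are invented (and deliberately fall on a straight line) to stand in for the 100 people in the study; real data would be noisier.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented individual data: cigarettes per day (exposure) and FEV in liters (outcome).
cigarettes = [0, 5, 10, 15, 20]
fev = [4.5, 4.2, 3.9, 3.6, 3.3]

intercept, slope = fit_line(cigarettes, fev)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

In this made-up data, the slope is the estimated change in FEV, in liters, for each additional cigarette smoked per day, and the intercept is the estimated FEV for someone who smokes none.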
Linear regression is one of several regression methods, each with its own strengths and weaknesses. The most commonly used are linear regression, logistic regression, and Poisson regression.
Linear regression is used when the outcome is quantitative and its relationship with the exposure is assumed to be roughly linear. Logistic regression is used when the outcome is binary (e.g., success/failure, yes/no), while Poisson regression is used for count outcomes that are assumed to follow a Poisson distribution.
Copyright © 2024 Elsevier, its licensors, and contributors. All rights are reserved, including those for text and data mining, AI training, and similar technologies.