Logistic regression is a type of statistical method that’s used to describe the relationship between an outcome variable and one or more exposure variables.
In logistic regression, the outcome variable is always categorical, and the exposure variables can be either categorical or quantitative.
For example, let’s say you want to figure out if smoking more cigarettes increases the chance of having a heart attack. In this case, the number of cigarettes is a quantitative exposure and whether or not a person has a heart attack is a categorical outcome.
Now, to figure this out, you might ask 200 people how many cigarettes they smoke in a day, and then follow that group of people for five years and see who has a heart attack and who doesn’t.
You could organize your data in a table where the first column, or variable, is the number of cigarettes a person smokes, the second column is whether they had a heart attack, and the remaining columns are other characteristics, or variables, that you collected about each person, like their age, sex, and body mass index (BMI).
Usually, for binary variables, like yes or no, we use the numbers zero and 1 to represent the two possible answers.
So, for the heart attack variable, we might say that zero represents “no” and 1 represents “yes”. We could do the same thing for sex, where zero represents females and 1 represents males.
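As a small sketch of this coding scheme, the Python snippet below turns yes/no and female/male answers into zeros and ones. The three records, the field names, and the `encode` helper are all made up for illustration; they are not data from the study described here.

```python
# Hypothetical records; every value is invented purely for illustration.
records = [
    {"cigarettes_per_day": 0, "heart_attack": "no", "sex": "female"},
    {"cigarettes_per_day": 20, "heart_attack": "yes", "sex": "male"},
    {"cigarettes_per_day": 5, "heart_attack": "no", "sex": "male"},
]

# Coding scheme from the text: zero = "no" / female, 1 = "yes" / male.
HEART_ATTACK_CODES = {"no": 0, "yes": 1}
SEX_CODES = {"female": 0, "male": 1}


def encode(record):
    """Return a copy of the record with its binary variables coded as 0 or 1."""
    return {
        "cigarettes_per_day": record["cigarettes_per_day"],
        "heart_attack": HEART_ATTACK_CODES[record["heart_attack"]],
        "sex": SEX_CODES[record["sex"]],
    }


encoded = [encode(record) for record in records]
```

Which answer gets the zero and which gets the 1 is a convention, so it helps to keep the mapping written down next to the data.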
Now, let’s look at just the first two variables: how many cigarettes each person smokes and whether they had a heart attack. You could plot these measurements, or data points, on a scatterplot, with the number of cigarettes on the x-axis and heart attack on the y-axis, where each data point represents one individual.
This scatterplot might look a little odd, because all of the data points sit at one of two values on the y-axis: either zero, which represents no, or 1, which represents yes.
This scatterplot can help us figure out how the odds of having a heart attack change as people smoke more and more cigarettes.
And that’s the goal of logistic regression.
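To make that goal concrete, here is a minimal sketch of fitting a one-exposure logistic regression in plain Python. It assumes the usual model, in which the log of the odds of the outcome is a straight-line function of the exposure, and fits it by simple gradient ascent on the log-likelihood. The ten data points, the `fit_logistic` helper, and its learning rate and step count are all assumptions chosen for illustration. In practice you would likely use a statistics package rather than hand-written fitting code.

```python
import math

# Invented data for illustration only:
# cigarettes smoked per day, and heart attack coded as 0 = no, 1 = yes.
cigarettes = [0, 2, 5, 8, 10, 12, 15, 20, 25, 30]
heart_attack = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    """Convert log-odds z into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, learning_rate=0.01, steps=20000):
    """Fit log-odds = intercept + slope * x by gradient ascent on the log-likelihood."""
    intercept, slope = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        grad_intercept = 0.0
        grad_slope = 0.0
        for xi, yi in zip(x, y):
            # Difference between the observed outcome and the predicted probability.
            error = yi - sigmoid(intercept + slope * xi)
            grad_intercept += error
            grad_slope += error * xi
        intercept += learning_rate * grad_intercept / n
        slope += learning_rate * grad_slope / n
    return intercept, slope


intercept, slope = fit_logistic(cigarettes, heart_attack)

# exp(slope) is the factor by which the odds are multiplied
# for each additional cigarette per day (the odds ratio).
odds_ratio_per_cigarette = math.exp(slope)
```

With this made-up data the fitted slope is positive, so the estimated odds of a heart attack rise with each additional cigarette. The exact numbers mean nothing beyond the illustration.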
Now, in statistics, probability and odds are often confused with one another, so let’s break down the difference.
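As a rough sketch using the standard definitions: probability is the chance that an event happens out of all possible outcomes, while odds are the probability that the event happens divided by the probability that it doesn't. The function names below are illustrative only.

```python
def probability_to_odds(p):
    """Odds = probability of the event divided by probability of no event."""
    if not 0 <= p < 1:
        raise ValueError("p must be at least 0 and less than 1")
    return p / (1 - p)


def odds_to_probability(odds):
    """Probability = odds divided by (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)
```

For example, a probability of 0.5 corresponds to odds of 1 (often said as "1 to 1"), and a probability of 0.2 corresponds to odds of 0.25.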
Logistic regression is a statistical method used to describe the relationship between an outcome variable and one or more exposure variables. Logistic regression can help to figure out the effect of an exposure variable (e.g. the number of cigarettes per day) on a categorical outcome variable (e.g. having a heart attack). Note that the outcome variable is always categorical, but the exposure variables can be either categorical or quantitative.
Copyright © 2024 Elsevier, its licensors, and contributors. All rights are reserved, including those for text and data mining, AI training, and similar technologies.