Let's say I brought people into the lab, and had them sit in rooms of three different colors (red, blue, green). After 20 minutes, I measured how angry they were. That experiment is testing whether anger level is dependent upon the color of the room. Thus "Anger" is the Dependent Variable, because it is the one that depends upon something else.
Note the role of experimental context: I could have done a different experiment. I could have brought people into a white room and tried to make them angry for 20 minutes, then given them a choice of which room to go into. That experiment would only make sense if I thought maybe the color of the selected room might be dependent on how angry the participant was. Thus, in this experiment, "Color of Room" is the dependent variable.
Once you know which variable is "dependent", the other is "independent." Easy as that!
Why that label? The values of the independent variables do not depend on the values of the other variables in your study.
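As a rough sketch, the room-color experiment can be written out as data. The anger scores and the 0-10 scale below are invented purely for illustration, and `mean_by_group` is a hypothetical helper name. The point is only that the experimenter sets the independent variable (room color), while the dependent variable (anger) is what gets measured and summarized for each room.

```python
from statistics import mean

# Hypothetical anger scores on a 0-10 scale, invented for illustration only.
# Room color is the independent variable: the experimenter assigns it.
# Anger is the dependent variable: it is measured after the 20 minutes.
observations = [
    ("red", 7), ("red", 6), ("red", 8),
    ("blue", 3), ("blue", 4), ("blue", 2),
    ("green", 4), ("green", 5), ("green", 3),
]

def mean_by_group(rows):
    """Average the dependent variable within each level of the independent variable."""
    groups = {}
    for level, score in rows:
        groups.setdefault(level, []).append(score)
    return {level: mean(scores) for level, scores in groups.items()}

mean_anger = mean_by_group(observations)
for color, avg in mean_anger.items():
    print(f"{color}: mean anger = {avg}")
```

Notice that the code groups by room color and averages anger, never the other way around. That asymmetry is the dependent/independent distinction in action.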
Outside an Experimental Context
What happens when it is not an experiment? Sometimes the answer is obvious. For example:
Why might you be studying the hair length of men and women? Would it be because you think a person's dangly bits depend upon the length of their hair? Or would it be because you think that the length of a person's hair (on average) depends on the type of dangly bits they have? Note that, while it is technically possible to experimentally change people's dangly bits, we are not doing that in our study, nor are we changing their hair length. That said, I think we can all be pretty certain that cutting someone's hair doesn't change what dangly bits they have. Thus, the only thing we could reasonably suspect is that hair length will be dependent upon gender, and so we have to label "Hair Length" as the dependent variable.
Sometimes, however, it is much less obvious which variable is which. For example:
Suppose you had national-level data on GDP per capita (the average amount of value produced per person in a given country) and on the proportion of doctors in each country. With that data set, you could make an argument either way. Maybe you think there are more doctors when a country generates more money per citizen. Maybe you think having more doctors allows a workforce to be healthier and more productive. In this context, the correct way to use the labels will depend on how you think the causality works. Whichever one you think depends on the other, you call that one your dependent variable.
Thus, when you read a non-experimental study, and someone labels their dependent and independent variables, you should always pause for a second. In that pause, ask yourself, "Does that causal direction make sense? Do I believe that their Dependent Variable might depend on their Independent Variable?" If the story doesn't sound likely, continue reading with caution.
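To see why the labels in the GDP and doctors example are a judgment call, here is a minimal sketch that fits a straight line in both directions. The numbers are invented for illustration only, and `slope` is a hypothetical helper that computes an ordinary least-squares slope. Nothing in the data objects to either choice of dependent variable.

```python
from statistics import mean

def slope(x, y):
    """Least-squares slope for predicting y from x (y is treated as dependent)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical country-level values, invented for illustration only.
gdp_per_capita = [5, 12, 20, 33, 45, 60]            # thousands of dollars
doctors_per_1000 = [0.5, 1.6, 1.9, 3.1, 3.0, 4.4]   # doctors per 1,000 workers

# Story A: doctors depend on GDP (GDP is the independent variable).
doctors_on_gdp = slope(gdp_per_capita, doctors_per_1000)

# Story B: GDP depends on doctors (doctors are the independent variable).
gdp_on_doctors = slope(doctors_per_1000, gdp_per_capita)

print(f"doctors ~ GDP slope: {doctors_on_gdp:.4f}")
print(f"GDP ~ doctors slope: {gdp_on_doctors:.4f}")
```

Both fits run without complaint, and each gives a different line: the two slopes are reciprocals of each other only when the correlation is perfect. The arithmetic cannot tell you which story is right. That part comes from your causal reasoning.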
Making and Reading Graphs
As a general rule, if you are making a graph, the independent variable is on the horizontal "X-Axis", and the dependent variable is on the vertical "Y-Axis". Thus, if you were reading a paper studying the relationship between "GDP per Capita" and "Doctors per 1,000 workers", and you saw a graph with GDPPC on the Y-Axis, that suggests that the authors think GDPPC depends on the proportion of doctors in a given workforce.
(Graph linked from the Narragansett School Website)
Alternative Labels
Because these terms can be confusing, some authors use alternative terms. Wikipedia lists many options:
Alternatives to "Independent Variable" - predictor variable, regressor, control variable, manipulated variable, explanatory variable, exposure variable, risk factor, feature, input variable.
Alternatives to "Dependent Variable" - response variable, regressand, measured variable, responding variable, explained variable, outcome variable, experimental variable, output variable.
Part 2: Null Hypothesis Testing