
# Percentage of the variance of the dependent variable explained by multiple independent variables

Asked by cwc on 7 May 2012

Hi,

I am trying to analyze how different independent variables contribute to a dependent variable, and I want to understand how important each independent variable is to the output. Is there a statistical procedure and a MATLAB function that can do that, possibly by computing the percentage of the variance of the dependent variable that is explained by each independent variable?

Thanks


Answer by Tom Lane on 8 May 2012

If you have the latest version of the Statistics Toolbox, then it sounds like you want the anova method of LinearModel.

Unfortunately, the problem as you described it isn't uniquely determined. That's why there are various "types" of sums of squares in anova. For example, if x1 and x2 are highly correlated with each other and with y, then it could turn out that each x variable is important individually, but once you have either one, the other is less important.
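To illustrate the order dependence, here is a minimal sketch on synthetic data (the variables, coefficients, and seed are made up for illustration). It assumes a Statistics Toolbox release that provides `fitlm`; in releases from around 2012 the corresponding call was `LinearModel.fit`. It asks `anova` for Type 1 (sequential) sums of squares with x1 entered first and then entered last.

```matlab
% Synthetic, illustrative data: x1 and x2 are strongly correlated.
rng(0);                                  % reproducible random numbers
n  = 200;
x1 = randn(n,1);
x2 = 0.9*x1 + 0.3*randn(n,1);            % x2 is mostly a copy of x1
y  = x1 + x2 + randn(n,1);

% Fit the same model with the predictors entered in both orders.
m12 = fitlm([x1 x2], y);                 % column 1 = x1, column 2 = x2
m21 = fitlm([x2 x1], y);                 % column 1 = x2, column 2 = x1

% Type 1 (sequential) sums of squares depend on the order of entry.
t12 = anova(m12, 'component', 1);
t21 = anova(m21, 'component', 1);

sst        = sum((y - mean(y)).^2);      % total sum of squares
shareFirst = t12.SumSq(1) / sst;         % x1 entered first
shareLast  = t21.SumSq(2) / sst;         % x1 entered after x2

fprintf('x1 share of variance: %.3f (entered first), %.3f (entered last)\n', ...
        shareFirst, shareLast);
```

With data like this, the share attributed to x1 should be much larger when it is entered first than when it is entered after x2, even though the fitted model is the same.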

I'm assuming you already have data. If not, and you have the opportunity to collect data, then you may be able to make the x variables orthogonal so that you can accurately attribute variance percentages to each x variable.
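As a rough sketch of why orthogonality helps, the following uses a made-up replicated two-level factorial design, in which the x columns are centered and orthogonal to each other. Only base MATLAB is used, and the response coefficients and seed are illustrative assumptions. In that case the R^2 values of the single-predictor fits add up to the R^2 of the joint fit.

```matlab
% Replicated 2x2 factorial design: x1 and x2 are centered and orthogonal.
x1 = repmat([-1; -1;  1; 1], 10, 1);
x2 = repmat([-1;  1; -1; 1], 10, 1);
n  = numel(x1);

rng(2);                                  % reproducible random numbers
y  = 2*x1 + x2 + randn(n,1);             % illustrative response

% R^2 of a least-squares fit of y on the columns of X plus an intercept.
sst    = sum((y - mean(y)).^2);
design = @(X) [ones(n,1) X];
r2     = @(X) 1 - sum((y - design(X)*(design(X)\y)).^2) / sst;

R2x1   = r2(x1);                         % x1 alone
R2x2   = r2(x2);                         % x2 alone
R2both = r2([x1 x2]);                    % x1 and x2 together

fprintf('R2(x1) + R2(x2) = %.4f, R2(x1,x2) = %.4f\n', R2x1 + R2x2, R2both);
```

Because the two sums agree up to rounding, each predictor's share of the variance does not depend on which other predictor is in the model.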

cwc on 8 May 2012

I'm thinking of using linear regression to compute the coefficient of determination for each single variable and for each combination of variables. The R^2 value should indicate the importance of each variable or combination of variables. How would that differ from the anova method?

And would you use principal component analysis to make the x variables orthogonal?

Thanks

Tom Lane on 9 May 2012

Yes, R^2 is a function of the sum of squared residuals, so it's equivalent to anova in that sense. The basic issue is this. How important is x1 in predicting y? You could compute R^2 for sets of predictors A={none}, B={x1}, C={x2}, D={x1 and x2}. You could measure the importance of x1 by comparing B to A, or D to C. Often you will get different measures.
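The comparison of A, B, C, and D can be sketched in base MATLAB as follows. The data are synthetic and the coefficients and seed are assumptions chosen only so that x1 and x2 are correlated; the helper names `design` and `r2` are hypothetical.

```matlab
% Synthetic, illustrative data with correlated predictors.
rng(1);                                  % reproducible random numbers
n  = 200;
x1 = randn(n,1);
x2 = 0.8*x1 + 0.6*randn(n,1);
y  = x1 + x2 + randn(n,1);

% R^2 of a least-squares fit of y on the columns of X plus an intercept.
sst    = sum((y - mean(y)).^2);
design = @(X) [ones(n,1) X];
r2     = @(X) 1 - sum((y - design(X)*(design(X)\y)).^2) / sst;

R2A = r2(zeros(n,0));                    % A = {none}: intercept only
R2B = r2(x1);                            % B = {x1}
R2C = r2(x2);                            % C = {x2}
R2D = r2([x1 x2]);                       % D = {x1 and x2}

gainAlone   = R2B - R2A;                 % importance of x1 with nothing else in the model
gainGivenX2 = R2D - R2C;                 % importance of x1 once x2 is already in the model

fprintf('x1 alone: %.3f, x1 given x2: %.3f\n', gainAlone, gainGivenX2);
```

With correlated predictors like these, the two gains should differ noticeably, which is the ambiguity described above.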

You can use principal components also, but since every x may contribute to every component, you'd again be stuck trying to devise a measure of importance that depends both on the predictive ability of various models and the extent to which each x contributes to those models.
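A small sketch of that point, using only base MATLAB (`svd` on the centered data rather than a toolbox PCA function) and made-up correlated data: the component scores come out uncorrelated, but every column of the loading matrix has a nonzero entry for both x1 and x2.

```matlab
% Synthetic, illustrative data with correlated predictors.
rng(3);                                  % reproducible random numbers
n  = 200;
x1 = randn(n,1);
x2 = 0.8*x1 + 0.6*randn(n,1);
X  = [x1 x2];

% Principal components via the SVD of the centered data.
Xc        = X - repmat(mean(X), n, 1);   % center each column
[~, ~, V] = svd(Xc, 'econ');             % columns of V are the loadings
scores    = Xc * V;                      % principal component scores

c = corrcoef(scores);                    % off-diagonal entries are ~0

disp('Loadings (rows: x1, x2; columns: components):');
disp(V);
fprintf('Correlation between component scores: %.2e\n', c(1,2));
```

The scores are orthogonal, but since each component is a mix of x1 and x2, any variance attributed to a component still has to be divided between the original variables somehow.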