In R, we can fit a linear regression model directly using the ‘lm’ function. I will save this model in a variable ‘m’.

m = lm(data$Systolic_blood_pressure ~ data$Age)

Call:
lm(formula = data$Systolic_blood_pressure ~ data$Age)

The Coefficients section of the output shows that the intercept (beta0) is 94.872 and the slope (beta1) is 0.635. So the linear regression equation becomes:

y = 94.872 + 0.635 * Age

As we considered only one explanatory variable, there is no x2, x3, or beta2, beta3.

Here, the intercept of 94.872 means that if the age is zero or very close to zero, the model still gives a systolic blood pressure of 94.872. In this dataset, the minimum age is 18 (feel free to check on your own). So, an age of zero is far outside the range of this dataset. That’s why the intercept does not have a reasonable interpretation in this case.

The slope of 0.635 means that if the age increases by 1 unit, the systolic blood pressure will increase by 0.635 unit on average.

Using this equation, you can calculate the systolic blood pressure of a person if you know the age. For example, if a person is 32 years old, the calculated systolic blood pressure will be:

94.872 + 0.635 * 32 = 115.192

How correct this estimate is, we will determine later in this article.

In the model ‘m’, we considered only one explanatory variable, ‘Age’. This time we will have two explanatory variables: Age and Weight. It can be done using the same ‘lm’ function, and I will save this model in a variable ‘m1’.

m1 = lm(data$Systolic_blood_pressure ~ data$Age + data$Weight)

Call:
lm(formula = data$Systolic_blood_pressure ~ data$Age + data$Weight)

In the Coefficients section of the output, notice that the intercept is different from the intercept in ‘m’ (94.87). This time the slope (beta1) for the Age variable is 0.63, which is not so different from the beta1 in model ‘m’. This slope means that if Age increases by 1 unit, systolic blood pressure will increase by 0.63 unit on average when the Weight variable is controlled or fixed.

On the other hand, the slope for the Weight variable (beta2) is 0.1386, which means that if weight increases by 1 unit, systolic blood pressure will increase by 0.1386 unit on average when the Age variable is controlled or fixed.

If you know a person’s Age and Weight, you will be able to estimate that person’s systolic blood pressure using the fitted equation.

Lastly, we add BMI to this model to see if BMI changes the dynamic of this model. Let’s use the ‘lm’ function again and save this model in a variable named ‘m2’.

m2 = lm(data$Systolic_blood_pressure ~ data$Age + data$Weight + data$BMI)

The Coefficients section of the output now has four entries: (Intercept), data$Age, data$Weight, and data$BMI.

After adding BMI to the model, the values of beta0, beta1, and beta2 changed noticeably. The slope for weight is 0.3209, while it was 0.1386 in the previous model. The slope of the BMI variable is -0.7244.

Woo! Our multiple linear regression model is ready! Now if we know the age, weight, and BMI of a person, we will be able to calculate the systolic blood pressure of that person!

How accurate is the systolic blood pressure calculated from this equation?

One very common and popular way to assess the fit of the data in multiple linear regression is the coefficient of determination (R-squared). The formula for R-squared is the same as in simple linear regression:

R-squared = sum((Y_calc - Y_mean)^2) / sum((Y - Y_mean)^2)

Y_calc is the calculated value of the response variable. In this case, these are the values of systolic blood pressure calculated using the linear regression model.

Y_mean is the mean of the original systolic blood pressure values.

Y is the original systolic blood pressure values from the dataset.

The R-squared value represents the proportion of the variation in the response variable that can be explained by the explanatory variables.

We have three models, and we saved them in three different variables: m, m1, and m2. It will be good to see the fit of each model. I will calculate the R-squared value for all three models.

Here is the R-squared value for the first model ‘m’, where the explanatory variable was only ‘Age’.

R_squared1 = sum((fitted(m) - mean(data$Systolic_blood_pressure))^2) / sum((data$Systolic_blood_pressure - mean(data$Systolic_blood_pressure))^2)

That means 37.95% of the variation in systolic blood pressure can be explained by Age.

The R-squared value for the second model ‘m1’, where the explanatory variables were ‘Age’ and ‘Weight’:

R_squared2 = sum((fitted(m1) - mean(data$Systolic_blood_pressure))^2) / sum((data$Systolic_blood_pressure - mean(data$Systolic_blood_pressure))^2)

39.58% of the variation in systolic blood pressure can be explained by ‘Age’ and ‘Weight’ together. Look, R-squared improved a bit after adding Weight to the model.
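The R-squared calculation and the single-age prediction described in this article can be sketched in one short, runnable script. This is only an illustration under stated assumptions: the article's dataset is not reproduced here, so the data below is simulated with the same column names, and the helper name r_squared is hypothetical. The models are also fitted with the data= argument rather than data$ columns, which is a choice made here so that predict() can accept new values.

```r
# Hypothetical data: simulated stand-in for the article's dataset,
# using the same column names. The numbers are NOT the article's data.
set.seed(42)
n <- 100
data <- data.frame(
  Age    = round(runif(n, 18, 80)),
  Weight = round(rnorm(n, 160, 25))
)
data$Systolic_blood_pressure <-
  95 + 0.6 * data$Age + 0.1 * data$Weight + rnorm(n, sd = 12)

# Fit the one-variable and two-variable models.
# Passing data= (instead of data$Age) lets predict() take a newdata frame.
m  <- lm(Systolic_blood_pressure ~ Age, data = data)
m1 <- lm(Systolic_blood_pressure ~ Age + Weight, data = data)

# R-squared as defined in the article:
# sum((Y_calc - Y_mean)^2) / sum((Y - Y_mean)^2)
r_squared <- function(model, y) {
  sum((fitted(model) - mean(y))^2) / sum((y - mean(y))^2)
}

R_squared1 <- r_squared(m,  data$Systolic_blood_pressure)
R_squared2 <- r_squared(m1, data$Systolic_blood_pressure)

# Cross-check against the R-squared that summary() reports for each model.
print(all.equal(R_squared1, summary(m)$r.squared))
print(all.equal(R_squared2, summary(m1)$r.squared))

# Estimated systolic blood pressure for a 32-year-old under model m,
# equivalent to intercept + slope * 32.
pred_32 <- predict(m, newdata = data.frame(Age = 32))
print(pred_32)
```

Because m1 contains every variable in m plus one more, its R-squared can never be lower than that of m, so a small improvement on its own does not show that the added variable is useful.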