In week 5 I learned about linear regression.
The concept
If we have data whose variables are related, the aim of linear regression is to find a linear equation that describes the relationship between those variables.
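As a minimal sketch of what "finding a linear equation" means, the snippet below fits y = intercept + slope * x using the closed-form least-squares formulas. The function name fit_line and the sample points are made up for illustration; they are not from the course material.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum of co-deviations of x and y / sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up points that lie exactly on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)  # prints 1.0 2.0
```

Because the sample points sit exactly on a line, the fit recovers that line; with real, noisy data the result would be the line that minimises the squared errors instead.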
Why we are doing this
Machine learning requires lots of algorithms that can process data so that the machine can “learn” from it. Linear regression is one of the algorithms that can find the relationship in data, provided the variables in the data are linearly related.
What I did
- Firstly, I studied the equations in the document, which showed me how to calculate the coefficients of the linear equation and how to calculate the cost, which measures how close the predictions are to the original values of the dependent variable.
- Secondly, I ran through the data provided and successfully generated the graph.
- Thirdly, I downloaded air quality data from the ACT Open Data Portal.
- Finally, I selected the PM2.5 and AQI_PM2.5 columns from the data and generated a new result. The prediction is almost identical to the raw data, and the fitted linear equation is approximately: AQI_PM2.5 = -0.1497 + 3.9995 * PM2.5.
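The cost mentioned in the first step can be sketched as follows. This assumes the cost is the mean squared error; the course document may define it differently (for example with an extra factor of 1/2). The coefficients are the ones from the fitted equation, while the function names and the PM2.5 readings are hypothetical and chosen only to show the calculation.

```python
def predict(pm25, intercept=-0.1497, slope=3.9995):
    """Predict AQI_PM2.5 from a PM2.5 reading using the fitted equation."""
    return intercept + slope * pm25


def mean_squared_error(actual, predicted):
    """Average of the squared differences; 0 means a perfect fit."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical PM2.5 readings, NOT taken from the ACT dataset
pm25_values = [5.0, 10.0, 20.0]
predictions = [predict(v) for v in pm25_values]

# Assumption for illustration only: pretend the observed AQI is exactly 4 * PM2.5
observed = [4.0 * v for v in pm25_values]

cost = mean_squared_error(observed, predictions)
print(predictions, cost)
```

A small cost means the line's predictions sit close to the observed values, which is what "almost identical to the raw data" describes.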


Conclusion
It was a bit rushed this week because I was working on my physics report and didn't have much time for the linear regression topic, so I worked on it late on Sunday night and finished everything at 0:30 on Monday of week 6. I think I need to spend more time on this task this week. However, on the academic side I am totally fine, and I now understand all the maths used to calculate the linear regression.