In week 2 of term 4 (week 12), we were supposed to work through the seaborn examples and find a large dataset.
The reason
The reason for doing this is to practise generating different visualisations from specific data. Visualisations make it easier for people to understand what the data is telling them.
What I did
As I was very satisfied with my previous week's work (it has a huge database I can use, and I had already made some visualisations), this week I reworked its visualisation part. I switched from matplotlib to seaborn (I think matplotlib is the better library, but the assignment required seaborn), and I made some improvements to my work (e.g. making the SQL command more flexible and creating more functions to make my code clearer).
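To illustrate the two improvements, here is a minimal sketch of a flexible SQL command wrapped in small functions. It is not the code from the project: the table `records`, the columns `category` and `region`, and the helpers `make_demo_db`, `fetch_counts` and `plot_counts` are all made up for this example, and the database is a tiny in-memory SQLite one.

```python
import sqlite3

# Hypothetical column names; a real table would have its own.
ALLOWED_COLUMNS = {"category", "region"}


def make_demo_db():
    """Create a small in-memory database so the example runs on its own."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (category TEXT, region TEXT)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [("a", "north"), ("b", "south"), ("a", "south"),
         ("c", "north"), ("a", "north")],
    )
    return conn


def fetch_counts(conn, column, limit=None):
    """Count rows per value of `column`, using only the first `limit` rows if given."""
    # Column names cannot be bound as SQL parameters, so check them against a whitelist.
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column}")
    sql = (
        f"SELECT {column}, COUNT(*) "
        f"FROM (SELECT {column} FROM records ORDER BY rowid LIMIT ?) "
        f"GROUP BY {column} ORDER BY {column}"
    )
    # In SQLite a negative LIMIT means "no limit".
    return conn.execute(sql, (-1 if limit is None else limit,)).fetchall()


def plot_counts(rows, column):
    """Draw the counts as a seaborn bar chart and return the matplotlib Axes."""
    # Imported here so the query functions work even without the plotting libraries.
    import pandas as pd
    import seaborn as sns

    df = pd.DataFrame(rows, columns=[column, "count"])
    return sns.barplot(data=df, x=column, y="count")


conn = make_demo_db()
print(fetch_counts(conn, "category"))
print(fetch_counts(conn, "region", limit=3))
# With pandas and seaborn installed, a chart could be drawn with:
# plot_counts(fetch_counts(conn, "category"), "category")
```

Keeping the query and the plot in separate functions means the same query can feed different charts, and changing the column or row limit does not require editing the SQL string by hand.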
Conclusion
Everything went well, and the code can be run here.
In conclusion, I realised that random sampling can also be used to verify the outcome. As shown in my work, I took random samples of 1,000, 10,000 and 100,000 rows and generated three bar graphs, and all of them look similar and show the same trend.
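The idea can be sketched with made-up data. The example below does not use the project's dataset: it assumes a synthetic population of 200,000 rows with three hypothetical categories, and the helper `category_shares` is invented for illustration. It draws random samples at the three sizes mentioned above and computes the share of each category, which is what each bar graph would show.

```python
import random
from collections import Counter


def category_shares(rows, sample_size, seed=0):
    """Draw a random sample without replacement and return each category's share of it."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample = rng.sample(rows, sample_size)
    counts = Counter(sample)
    return {cat: counts[cat] / sample_size for cat in sorted(counts)}


# Synthetic population (an assumption for this sketch): about 50% "a", 30% "b", 20% "c".
population = random.Random(42).choices(
    ["a", "b", "c"], weights=[0.5, 0.3, 0.2], k=200_000
)

shares_by_size = {
    n: category_shares(population, n) for n in (1_000, 10_000, 100_000)
}

for n, shares in shares_by_size.items():
    print(n, {cat: round(share, 3) for cat, share in shares.items()})
```

If the three sets of shares are close to each other, the bar graphs will have the same shape, which suggests the trend is a property of the data and not an accident of one particular sample.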