Yr12 Journal 11

In Week 4 I finished my assignment project.

What I did

Following on from last week, when my program could already crawl news articles from The New York Times, this week's work focused on implementing the NER (named entity recognition) algorithm. I used Beautiful Soup to extract the article content and spaCy to find all the named entities in the fetched article. I also used a library called locationtagger to sort the locations in the article into groups such as countries, cities, and states. Then I used the Faker library to generate fake locations, names, and institutions, and replaced and highlighted the relevant words in the original article. Finally, I output both the original article and the fake version.
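The replace-and-highlight step could look roughly like the sketch below. It is not the project's actual code: the function names, the use of <mark> tags for highlighting, the en_core_web_sm model, and the choice to reuse the same fake value for a repeated name are all assumptions made for illustration. The locationtagger grouping is left out, so every GPE entity simply gets a fake city.

```python
import html
from typing import Callable, Dict, List, Tuple

# (start_char, end_char, label), the same offsets spaCy exposes on each entity.
Span = Tuple[int, int, str]


def replace_entities(text: str, spans: List[Span],
                     fakers: Dict[str, Callable[[], str]]) -> str:
    """Return HTML where each known entity is swapped for a highlighted fake value."""
    cache: Dict[Tuple[str, str], str] = {}  # same original -> same fake (assumption)
    out: List[str] = []
    cursor = 0
    for start, end, label in sorted(spans):
        # Skip labels we have no generator for, and spans overlapping an earlier one.
        if label not in fakers or start < cursor:
            continue
        key = (label, text[start:end])
        if key not in cache:
            cache[key] = fakers[label]()
        out.append(html.escape(text[cursor:start]))
        out.append("<mark>" + html.escape(cache[key]) + "</mark>")
        cursor = end
    out.append(html.escape(text[cursor:]))
    return "".join(out)


def anonymize(text: str) -> str:
    """Hypothetical wrapper: spaCy finds the entities, Faker supplies the fakes."""
    # Imported lazily so replace_entities works without these packages installed.
    import spacy
    from faker import Faker

    nlp = spacy.load("en_core_web_sm")  # assumed model; any English pipeline with NER works
    fake = Faker()
    fakers = {"PERSON": fake.name, "GPE": fake.city, "ORG": fake.company}
    spans = [(e.start_char, e.end_char, e.label_) for e in nlp(text).ents]
    return replace_entities(text, spans, fakers)
```

Keeping the replacement logic separate from spaCy and Faker means it can be checked with hand-written spans, without downloading a language model.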

Issue I faced

Since my laptop has an M1 chip, its ARM architecture is not compatible with the library that renders HTML code from Python, so I had to change my approach. Instead, I wrote the results out as HTML files, used pdfkit to render the HTML to PDF, and then called the system method to open the generated PDF files. This means I can either open the HTML file in my browser manually or use the generated PDF to demonstrate how the project works.
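A minimal sketch of this workaround is shown below. The function names are hypothetical, and it uses subprocess rather than whichever system call the project actually made. pdfkit.from_file is the library's real entry point for converting an HTML file, and it needs the wkhtmltopdf program installed separately.

```python
import subprocess
import sys
from typing import List


def viewer_command(path: str, platform: str = sys.platform) -> List[str]:
    """Build the command that opens a file in the default viewer for the platform."""
    if platform == "darwin":  # macOS, including M1 machines
        return ["open", path]
    if platform.startswith("win"):
        # The empty string is the window title argument that `start` expects.
        return ["cmd", "/c", "start", "", path]
    return ["xdg-open", path]  # most Linux desktops


def html_to_pdf_and_open(html_path: str, pdf_path: str) -> None:
    """Render an HTML file to PDF with pdfkit, then open the PDF."""
    # Imported lazily so viewer_command can be used without pdfkit installed.
    import pdfkit

    pdfkit.from_file(html_path, pdf_path)
    subprocess.run(viewer_command(pdf_path), check=False)
```

For example, html_to_pdf_and_open("fake.html", "fake.pdf") would write fake.pdf next to the HTML file and open it; both file names are placeholders.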

Result

The final result was very satisfying.

The program successfully replaced the key information without affecting the readability of the original article, and the fake news seems real, since almost all of the information is genuine and only the names, locations, and institutions are fake.

Plan for the following week

As I have finished my project, next week I will start preparing my presentation.

Conclusion

This week’s progress was amazing. Although I faced some challenges, I successfully overcame them.