Blog


Stay in the loop with our blog posts! From educational guides to opinion articles about data science in the real world, they’re here for you!

Welcome to another UNSW DataSoc blog! In this post, we are excited to present the intriguing findings of our internal sleep study. With the participation of 22 unique students, we delved into the sleep habits and patterns of our society members. The study involved the collection of sleep data for all 7 days of the week using a digital survey, resulting in a wealth of information that sheds light on the sleep behaviours of our diverse community. 🌟🌙

“F*** it, I’m going to ChatGPT the answer”

Over the last term, I’ve said this phrase countless times in a last-ditch effort to scrape together any marks on my quizzes or assignments.

Since its launch in November 2022, ChatGPT has taken the world by storm - becoming the fastest-growing consumer app in internet history by reaching 100 million users within two months. Of course, this rise does not come without controversy, including:

Major Data and Privacy Breaches

OpenAI’s current CEO confirmed numerous data breaches that were caused by a vulnerability in an open-source library used by the code.

This data breach is reported to have exposed another active user’s first and last name, email address, payment address, credit card type and the last four digits (only) of a credit card number, and credit card expiration date.

Read more on this data breach here.

Additionally, ChatGPT’s privacy policy, under which every conversation is saved and logged as training data, including any personal data you share, has resulted in ChatGPT being temporarily banned in Italy. In its statement, Italy’s watchdog states that there was no legal basis to justify

“the mass collection and storage of personal data for the purpose of ‘training’ the algorithms underlying the operation of the platform”.

And

“exposes minors to absolutely unsuitable answers compared to their degree of development and awareness”.

With Italy making headlines as the first Western country to block ChatGPT, a precedent has now been set, and the European Union is reported to be working on the world’s first artificial intelligence legislation.

A Revolution in Education & Office Work

Aside from the data and privacy issues that stem from generative AI, controversy has also arisen in education - with UNSW students being caught for academic misconduct over their use of generative AI.

Following this, all UNSW students were emailed a statement regarding Turnitin’s new AI detection tool:

“While the technology changes, our values around academic integrity do not. Your work must be your own and where the use of AI tools like ChatGPT have been permitted by your course convener, they must be properly credited, and your submissions must be substantially your own work. In cases where the use of AI has been prohibited, please respect this and be aware that where unauthorised use is detected, penalties will apply.”

In the workplace, reception towards usage of generative AI has been all over the spectrum - where firms such as Samsung, Amazon, Bank of America, and JP Morgan outright banned the use of ChatGPT, voicing concerns regarding copyright infringement and disclosure of data.

On the other hand, firms like Citadel are currently negotiating an enterprise-wide license for the tool. Similarly, a case can be made for automating menial office tasks - where generative AI points to a future where many of the boring parts of white-collar life are easily completed.

On the other side of the coin, Goldman Sachs anticipates that 300 million jobs will be displaced by AI as the aforementioned low-skilled tasks and jobs become automated. However, it is likely that jobs which require more ‘complexity’ and ‘creativity’ will be safe from the jaws of automation.

For more information on how generative AI will change the workplace, read here.

TLDR: ChatGPT is revolutionising the workplace, but doesn’t come without its drawbacks, including privacy breaches, and an increase in academic misconduct.

Whether you’ve never entered a casino or are an experienced and avid gambler, you have likely heard the phrase “The house always wins”. After winning a whopping $35 playing blackjack, I got curious about just how lucky I had been to turn a profit whilst gambling in a casino. Today we will be exploring the statistics behind the most popular casino games and why the catchphrase “The house always wins” holds true.
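As a rough illustration of the idea (not taken from the post itself), here is a minimal Python sketch of the expected profit on two common roulette bets. It assumes a European wheel with 37 pockets and the standard payouts of 35:1 for a single number and 1:1 for red; the function name `expected_value` is ours, purely for illustration.

```python
from fractions import Fraction


def expected_value(p_win: Fraction, payout: int) -> Fraction:
    """Expected profit per 1 unit staked: win `payout` with probability p_win, otherwise lose the stake."""
    return p_win * payout - (1 - p_win)


POCKETS = 37  # assumption: European wheel, numbers 1-36 plus a single zero

# Straight-up bet on one number pays 35:1.
straight_up = expected_value(Fraction(1, POCKETS), 35)
# Bet on red covers 18 of the 37 pockets and pays 1:1.
red = expected_value(Fraction(18, POCKETS), 1)

print(straight_up, red)  # -1/37 -1/37
```

Under these assumptions both bets lose about 2.7 cents per dollar staked on average, which is the sense in which the house "always" wins over the long run.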

Learning Machine Learning... with Kaggle

09 Aug 2022 | William Dang

Somehow, in my one and a half years as a student studying a Bachelor of Data Science and Decisions, I’ve managed to learn close to nothing about what Machine Learning actually is (unless you count the one or two DataSoc workshops I’ve attended). So, because I’ve had a term off uni (and am in desperate need to upskill myself), I decided to check out Kaggle’s Introduction to Machine Learning course, and see what it’s all about.

In this blog post, Diwa takes a look at two things he’s particularly passionate about - Data Science in the NBA, and Harry Styles’ new album.

Movie Review: AlphaGo documentary

05 Apr 2022 | Ayra Islam

Artificial intelligence (AI) often gets a bad rap in popular culture. In Hollywood films such as The Terminator and The Matrix, we see AI systems going rogue and taking over humanity, but in reality, the future of AI is less obtuse. Greg Kohs’ 2017 documentary, AlphaGo, offers a fantastic introduction to the world of AI.

Honestly, the first year of university can be intimidating. So if you’re feeling a little overwhelmed (just as I did last year) - we’ve got you covered! Looking back on my first year, I’d like to think I had a pretty good experience, but there were also many things I would do differently. So, I wanted to see if some people at DataSoc felt the same way, and asked them about the experiences they had in first year - both the good, and the bad. Here are the top 5 tips I was able to come up with!

Interview with Professor Jake Olivier

18 Nov 2021 | Julian Garratt and Gordon Huang

In the first of a series of video blogs by DataSoc, we chat with Professor Jake Olivier, who is a researcher and lecturer at the UNSW School of Mathematics and Statistics, as well as the Deputy Director of the Transport and Road Safety Research Centre at UNSW. Join us as we find out more about his journey of studying and teaching mathematics, as well as his research in public road safety.

Hey Apple, Take My Money

28 Oct 2021 | William Feng

It was 19th October 2021 and I woke up at 4am. FOUR IN THE MORNING. Was it because I had a plane flight? Imagine having the opportunity to travel during these times. Maybe I had an assignment due the next day? In that case, I probably wouldn’t have been sleeping beforehand (jokes aside, please don’t leave it to the last minute). The reason was ridiculous: there was an Apple event. Exactly why do so many people love Apple products?

Did Somebody Say Crypto?

21 Oct 2021 | Julian Garratt

As a wannabe Data Scientist I love analysing marketing data and especially cryptocurrency market data. But in a world where Bitcoin’s price is as chaotic as WallStreetBets, how can we possibly get any sort of advantage over the casual trader who puts their life savings in Bitcoin in an effort to become the next Warren Buffett? So instead, let’s look away from technical analysis and towards market sentiment and become the next Jim Simons.

Meet IT/Pubs

14 Oct 2021 | 2021 DataSoc IT/Publications Portfolio

Check out our first video blog ever - introducing the 🤖 2021 DataSoc IT/Pubs portfolio 🤖 and their near death experiences… 🚽 👀 💀

The Choices We Make

07 Oct 2021 | Aileen Wang

Suppose I get out of bed one morning and the sky is cloudy. I have a date with a friend at a café soon (a pre-COVID indulgence). I get up, I brush my teeth, I dress. As I prepare to leave my house, I am faced with a pressing decision: do I bring my umbrella?

A/B Testing: The basics

30 Sep 2021 | Amber Dang

Have you ever wondered how Google, YouTube, Facebook, or Netflix decide to make small changes to their website now and then, especially after a short trial period? What are their motivations and why are they doing these trials? In this article, we will look into how data analytics and statistics have fuelled the decisions surrounding these changes.
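To give a flavour of the statistics involved, here is a minimal sketch of one common way to compare two variants: a two-proportion z-test. This is an illustration under our own assumptions rather than the method used in the article; the function name `two_proportion_z` and the visitor and conversion counts are made up.

```python
import math


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return the z statistic and two-sided p-value for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally often.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal, via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical trial: 100 of 1000 visitors convert on page A, 130 of 1000 on page B.
z, p_value = two_proportion_z(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference between the two pages is unlikely to be down to chance alone, though the threshold for "small" is a choice made before the trial starts.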

Observing distant intergalactic objects that are hundreds of millions of light years away is like trying to read a newspaper on the moon — from Earth. But radio astronomy makes it possible to capture invisible wavelengths of light, unveiling the most elusive corners of the observable universe.

The world is a complicated place, and every day we see millions of different problems to solve. How can we use one of nature’s secret strengths to power our technology?

'Save the Bees' - with Data!

09 Sep 2021 | Maggie Chan

With the threats of habitat destruction, urbanisation and use of pesticides, various species of the bee population have rapidly declined - threatening the harmony of the ecosystem. Read more to learn about how data and emerging technologies like AI are being used to help ‘Save the Bees’!

Cause or Correlation, That is the Question

02 Sep 2021 | Aileen Wang

Every statistics course will introduce at some point the big statistics taboo: confusing correlation and causation. With the rise of data as a driving force in our world, the joke has become famous. Correlation does not imply causation. But what does this actually mean?

Low Fertility Among Female Graduates

26 Aug 2021 | James Franklin and Sarah Chee Tueno

Australian women who are university graduates have fewer children than non-graduates. In most cases this appears to be the result of circumstantial pressures, not preference. Long years of study fill the most fertile years of women students, and new graduates need further time to establish their careers. The chance of medical infertility increases with age so, for some, this means that childbearing is not postponed but ruled out.

We all thought that pursuing a degree within computer science would be safe. As we enter a digital era driven by technological advancements worldwide, an alarming number of occupations are becoming outdated, automated or just rendered redundant. Surely a career in a STEM-related field would be immune to unemployment? But now we’ve gone to the extent of developing something that will write code for us. Could this be the beginning of how “artificial intelligence could spell the end of the human race”?

You wake up. 10:00 am. The sound of your mum vacuuming outside and the constant, repeated yell of the kookaburra make your ears bleed. You turn to your side and grab your phone.

Have you ever considered getting started with deep learning but instead got frustrated by the steep learning curve and huge body of knowledge required to even start hacking up a simple “hello world” program in PyTorch? Thanks to like-minded individuals, fastai was developed to provide high-level API access, making the creation of state-of-the-art deep neural networks as easy as a couple of lines of code.

This piece is intentionally satirical and does not reflect the views or nature of DataSoc.

Dear Internet User,

On behalf of all internet corporates, I thank you for the eagerness with which you have accepted our browser cookies into your life. Since the conception of the “cookie” in 1994, what has resulted is a beautiful partnership that ties us – marketers and consumers – closer than ever before. Today, I wish to acknowledge the role you have played in this wildly enabling relationship.

The new ‘gold rush’ towards a data-driven culture that promotes capitalising on leading-edge insights from data has become ever more prominent among organisations seeking to gain a competitive advantage in the dynamic business environment. With this comes an emerging need for employees across the organisation, not just in data science teams, to be data literate.

The age-old mining industry supplies invaluable natural resources on which our modern technology and infrastructure depend. To meet the ever-increasing global demand for natural resources such as metal, oil, battery lithium and nuclear uranium fuel, mining firms undertake monumental projects in highly unpredictable economic and natural environments. In this article we will explore how the mining industry leverages data to maximise the efficiency and productivity of its operations.

The Visual Design of the Humble Dashboard

17 Jun 2021 | Aileen Wang

Anyone who’s ever had to make a PowerPoint has inevitably come up against the wall of a presentation that is nominally informative, but – to put it bluntly – puts the audience to sleep. If we haven’t been the presenter: well, we’ve all been the audience.

With the rise of Tesla, autonomous cars have been the talk of the car industry for their electric battery advantage and self-driving abilities. Have you ever wondered how exactly a car drives itself?

Modelling Disease Interactions with Networks

03 Jun 2021 | Julian Garratt

As COVID-19 remains in the zeitgeist (hello to everyone reading this in a post-COVID world), the rising number of charts depicting infections and deaths over time continues to flood the media.

Yet, I find that charts and single statistics don’t do justice to the magnitude and scale of disease transmission. More specifically, how can we better visualise the spread and scale of diseases?

We all know how command+F (or ctrl+F) allows you to instantly search for a particular word within thousands of lines in a file or webpage. But what if we wanted to search with more parameters on the results? Is there a more powerful alternative to find matches straight from our terminal? The answer is YES – and it’s Regex!
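For a quick taste of what a pattern can do that a plain word search cannot, here is a minimal sketch using Python’s built-in `re` module (swapped in here for a terminal tool, purely for illustration). The log text and the pattern are made up for this example.

```python
import re

# Hypothetical log text to search through.
text = """error: disk full on 2021-05-20
warning: retry scheduled
error: timeout on 2021-05-27"""

# Match only lines that start with "error" and capture the ISO-style date at the end.
# re.MULTILINE makes ^ and $ match at the start and end of every line.
pattern = re.compile(r"^error: .*?(\d{4}-\d{2}-\d{2})$", re.MULTILINE)

dates = pattern.findall(text)
print(dates)  # ['2021-05-20', '2021-05-27']
```

A plain search for "error" would find the same two lines, but the pattern also pulls out just the dates in a single pass.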

Data visualisation has come a long way since the era of making column and bar charts in Excel. New visualisation tools like Tableau and Google Charts offer a user-friendly solution that caters to a range of users of varying skills. Translating bland tables of data into graphical modules of data and information has helped accelerate the trend of ‘data-driven’ decisions across organisations.

Unregulated deforestation practices are destroying the world’s forests at an alarming rate, with devastating effects on ecosystems and the global climate. How can contemporary data science techniques be used to combat unsustainable deforestation on a global level?

The machine learning techniques we’ve got today are great, but are still left with some sizeable problems in their creation and use. Today I’ll be exploring what happens if you have very little data to work with.

In the days of snail mail, the idea of knowing what was happening on the other end of the world in anything close to real time was a pipe dream, or the stuff of adventure novels.

Errors and Bias - Variance Decomposition

22 Apr 2021 | Amber Dang

One of the most popular applications of machine learning or data science in solving real-world problems is building meaningful models that not only explain the variation in the target variable but also provide reliable predictions for unseen observations. But what if our model has high error metrics? What can we do to mitigate the situation?

Trading Stocks Based on Sentiment

15 Apr 2021 | Julian Garratt

Quantitative Analysis is an ever-growing field in finance that draws on Data Science skills to create trading strategies. In fact, the field is so difficult that institutions often call for Math PhDs to develop their strategies.

But what if we could subvert the intricacies of developing complex models and instead trade stocks on the sentiment of the crowd?

Given their predictive accuracy and ease of creation, models trained on massive amounts of data are everywhere, and are increasingly used to inform important decisions. How much your insurance premium costs, whether you get a home loan or how much your next plane ticket costs are reduced to mathematical functions. They input your personal data as parameters and spit out a final value which guides the decision.

Half the term has passed, and you’re scrambling to watch the week two lectures at 2x speed, contemplating your life choices because you’re struggling to keep up with the boring burden of your uni degree. Or maybe you’re that top HD student who goes exploring beyond the scope of the course outline, with plenty of time for work, socialising, and making yourself competitive careers-wise.

Conversation with an AI

11 Mar 2021 | Aileen Wang

As AIs become increasingly powerful, they become more and more able to imitate and generate human speech. Last time, I went on a long ramble on the fictional hypotheses about the societal, moral and philosophical questions that might eventually arise around AI. This time, I thought I might put the fictional to the test – by personally engaging with an AI myself.

Do Robots Dream?

15 Feb 2021 | Aileen Wang

Since the advent of the technological age, as constant scientific breakthroughs became the norm, the unknown future of a world changing more rapidly than ever before has captured the minds of countless artists, poets and writers. The future could no longer reliably be considered the mere extension of the past. Old knowledge was no longer secure in its vault. Anything could change at any moment. Could humankind journey to the centre of the earth? The bottom of the sea? The surface of the moon and the outer rim of space?

So you want to get hired as a data scientist, but just what skills do you need to qualify for the job?

At DataSoc, we care about you and your career opportunities, and we want you to get your dream job in your dream field (assuming it’s data science. This might not be the blog post for you if you’re an aspiring screenwriter). So, today, we present a checklist of essential skills for every data scientist.

Languages are, at their basis, systems for organizing smaller, less meaningful components into larger, more meaningful components. A string of sounds is a word, a string of words makes a sentence, a series of sentences becomes a paragraph, and enough paragraphs make a book, an essay, a speech, or whatever other form one might envision. Language – text, speech, tweets and so on – is one of the most common forms of data we are exposed to and produce. It’s one of the primary forms of human communication – but while we understand each other perfectly fine, is it possible for the computer to understand, and even participate, as well?

“Numbers don’t lie…but people do.”

As data-literate individuals in a data-drenched world, we need to keep our wits about us and retain our critical thinking when people present us with studies and visualisations that seem to have all the answers. Here’s why, with some examples.

Why bother with Uni Societies?

18 Jul 2020 | Victor Tsang

Societies are a polarising subject among students - depending on who you ask, you may hear that “You have to participate in a uni society for your career!” or “Societies are toxic and cult-like”. So what’s the big deal? What are societies, why should you get involved, and why shouldn’t you get involved?

In other words, why bother?

As a society whose name and purpose revolve around the field of data science, it seems fitting to take a step back and examine data science itself. What constitutes this degree and this career? It might be ‘the sexiest job of the 21st century’, but sexy is hardly precise or specific, especially when the subject is an academic field rather than the aesthetic appreciation of the human (or non-human) body.

More often than not, anything to do with data processing gets grouped under the umbrella term of data science. This can lead to bewilderment when terms such as data analytics come up, which, while sometimes used interchangeably with data science, in actuality refers to an interconnected but distinct field. Both areas are ripe with opportunity and worthy of study, so it is important and useful to distinguish between them, the differing skillsets they require, and the roles they play in the industry.