Simply Understanding the Basics of Big Data

Fasrin Aleem
3 min read · Apr 11, 2022

By 2020, there were around 44 zettabytes of data in the world, and given the amount of data created each day, that figure is projected to reach 175 zettabytes by 2025 (Seed Scientific, 2021). Big data has been around for a long time, but there are still many misconceptions about it.

Although we are probably unaware of it, big data affects our daily lives constantly. In this blog, I’ll do my best to describe big data as simply as possible.

Let’s have a look at a simple example of big data:

When we try to attach a 100 TB document to an email, we can’t, because the email system doesn’t support an attachment of that size. From the email system’s point of view, that 100 TB attachment is big data.

Big data is a term that refers to a collection of datasets so large and complicated that they can’t be processed using traditional data processing technologies. It’s a collection of structured, semi-structured, and unstructured data gathered by enterprises that can be analyzed and used in advanced analytics applications like machine learning and predictive modeling (Shahzan, 2019).
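To make those three categories concrete, here is a small sketch in Python. The records and field names are made up for illustration; they just show the difference in shape between structured, semi-structured, and unstructured data:

```python
import json

# Structured: a fixed schema — every record has the same fields,
# so it fits neatly into rows and columns (like a database table).
structured = [
    {"order_id": 1, "customer": "Alice", "amount": 19.99},
    {"order_id": 2, "customer": "Bob", "amount": 5.49},
]

# Semi-structured: self-describing and nested (e.g. JSON);
# fields can vary from one record to the next.
semi_structured = json.loads(
    '{"user": "alice", "events": [{"type": "click", "page": "/home"}]}'
)

# Unstructured: free-form text (or images, audio, video) with no schema;
# it needs extra processing (e.g. text analytics) before it can be queried.
unstructured = "Loved the product, but shipping took two weeks to my address."
```

Structured data is easy to query directly, while the other two kinds make up most of the data enterprises collect and are what big data tools are largely built to handle.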

“Big data is high volume, high velocity and high variety information assets that demand cost-effective, innovative forms of information processing” (Hanna Wallach, 2014).

In this definition, volume refers to the amount of data being handled, velocity refers to the speed at which that data is received and/or processed, and variety refers to the range of data types and sources involved.

Let’s look at some of the major drawbacks of storing and processing large amounts of data with traditional methods.

- Expensive: Implementing or upgrading such a system requires a large investment, so small and mid-sized companies cannot afford it.
- Hard to scale: As the data grows, scaling the system becomes an increasingly challenging task.
- Time-consuming: It takes a long time to evaluate and extract important information from data stored on legacy computing platforms.
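The scaling problem above can be sketched in a few lines. Rather than loading an entire dataset into memory at once (which a single legacy machine cannot do at big data scale), one common workaround is to stream the file in fixed-size chunks. This is a minimal, hypothetical example (the file path and chunk size are assumptions, not from the original post):

```python
def count_lines(path, chunk_size=1024 * 1024):
    """Count the lines in a file while reading at most
    chunk_size bytes into memory at a time."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # end of file reached
            total += chunk.count(b"\n")
    return total
```

Only one chunk is ever in memory, so this works no matter how large the file is. Distributed frameworks such as Hadoop and Spark generalize the same idea by spreading the chunks across many machines, which is one reason they replaced traditional single-machine processing for big data workloads.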

I hope this explains why big data is not stored or processed using traditional approaches on legacy computing platforms.

Conclusion

You cannot hide from the tremendous power of technology. Big data is already transforming the world, quietly seeping into our cities, homes, and devices. Many large companies rely on big data to differentiate themselves from their competitors, and in many industries both new entrants and established organizations use data-driven strategies to compete, capture market share, and innovate. In fact, big data applications are found in almost every industry, from information technology to healthcare.

Hope you found this post informative. Please don’t hesitate to 👏👏👏👏👏 for it (an open secret: you can clap up to 50 times for a post, and the best part is, it won’t cost you anything), and feel free to share it. It really means a lot to me.

References

Seed Scientific, 2021. How Much Data Is Created Every Day?
Available at: https://seedscientific.com/how-much-data-is-created-every-day
[Accessed 11th April 2022]

Hanna Wallach, 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency.
Available at: https://hannawallach.medium.com/big-data-machine-learning-and-the-social-sciences-927a8e20460d
[Accessed 11th April 2022]

Shahzan, 2019. Big Data Explained in Plain and Simple English.
Available at: https://medium.com/swlh/big-data-explained-38656c70d15d
[Accessed 11th April 2022]
