You've probably seen a buzz phrase making its way across social media for at least the better part of a year: the Modern Data Stack. You're also probably wondering what the heck it is! No worries, this article gives you the CliffsNotes version so you can feel confident whenever you see the phrase. So here goes!
The Modern Data Stack (MDS) is a collection of cloud-based tools that enable businesses to store, manage, and analyze data efficiently. It emerged as a response to scalability issues with the traditional data stack. Organizations are realizing that data is becoming increasingly complex and that their traditional data stacks simply cannot cope. As data volumes grow rapidly, traditional tools and methodologies struggle to keep up: data can take days or weeks to arrive, on-premises software can’t handle the volume coming in, and configuring and managing tools delays insights even further. All of this makes it hard for businesses to keep up with the competition. The Modern Data Stack aims to solve these problems and help companies consistently derive business value from their data.
What tools are included in the MDS?
At minimum, the MDS typically consists of:
- A data warehouse or lakehouse
- A data ingestion tool
- A data transformation tool
- A data visualization platform
* None of these components is set in stone, because each company’s data stack will vary based on its business needs, goals, and requirements.
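To make those four pieces a little more concrete, here is a minimal sketch of how they fit together. Everything in it is an assumption for illustration: an in-memory SQLite database stands in for a cloud warehouse, the `RAW_ORDERS` sample rows and the `ingest`, `transform`, and `report` function names are made up, and a real stack would use dedicated tools for each step.

```python
import sqlite3

# Hypothetical raw records, standing in for what an ingestion tool would pull
# from a source system.
RAW_ORDERS = [
    ("2024-01-01", "widget", 3, 4.00),
    ("2024-01-01", "gadget", 1, 10.00),
    ("2024-01-02", "widget", 2, 4.00),
]


def ingest(conn, rows):
    """Ingestion: land the raw rows in the warehouse untouched."""
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_date TEXT, product TEXT, quantity INTEGER, unit_price REAL)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def transform(conn):
    """Transformation: build an analysis-ready table with SQL inside the warehouse."""
    conn.execute(
        """
        CREATE TABLE daily_revenue AS
        SELECT order_date, SUM(quantity * unit_price) AS revenue
        FROM raw_orders
        GROUP BY order_date
        """
    )


def report(conn):
    """Visualization stand-in: fetch the rows a dashboard would chart."""
    query = "SELECT order_date, revenue FROM daily_revenue ORDER BY order_date"
    return conn.execute(query).fetchall()


def run_pipeline(rows):
    # SQLite in memory plays the role of the warehouse (an assumption for this sketch).
    conn = sqlite3.connect(":memory:")
    try:
        ingest(conn, rows)
        transform(conn)
        return report(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    for order_date, revenue in run_pipeline(RAW_ORDERS):
        print(order_date, revenue)
```

Notice that the raw data is loaded first and transformed afterwards, inside the warehouse. That load-then-transform order is what people mean by ELT.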
Okay, so what is the goal of the MDS?
The ideas behind the MDS are to:
1. Streamline business data so that companies can quickly turn insights into business value
2. Make cloud infrastructure easier to manage
What are the pros of the Modern Data Stack?
- Scalability: Using a modern data stack allows businesses to handle an influx of data without worrying about the storage, performance optimization, and configuration issues that can occur with on-prem infrastructure. Cloud data services often provide pay-as-you-go models for storage and computation, which is often more cost-effective than expanding on-prem services. As a result, data teams can focus on extracting business value from their fast-growing data, rather than requesting IT support to fix and upgrade their data infrastructure.
- Modularity: One of the best things about the MDS is its ability to be customized. Depending on your organization's needs at a certain time, you can choose which tools to integrate into your data stack. Tools can easily be swapped out if your needs change! Furthermore, as your organization grows, you will be able to add in more tools to optimize your data stack.
- Managed services: The tools are hosted and managed by the vendor, which makes them easier to set up and maintain. The vendor handles any outages as well.
- Pre-built integrations: Since most data tools have been built to integrate with other common data tools, data engineers and analysts don’t need to worry as much about how to get data from one system to another. This saves engineers a lot of time, so the team can focus on analysis, operational efficiency, and paying down technical debt.
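The modularity point is easier to see in code. The sketch below is an illustration under stated assumptions, not how any particular vendor works: `Ingestor`, `CsvIngestor`, `InMemoryIngestor`, and `total_amount` are hypothetical names. The idea is that the rest of the stack depends only on a small interface, so one tool can be swapped for another without rewriting what comes after it.

```python
import csv
import io
from typing import Protocol


class Ingestor(Protocol):
    """Hypothetical interface: anything that can hand over a list of row dicts."""

    def extract(self) -> list[dict[str, str]]: ...


class CsvIngestor:
    """One interchangeable tool: reads rows from CSV text."""

    def __init__(self, csv_text: str) -> None:
        self.csv_text = csv_text

    def extract(self) -> list[dict[str, str]]:
        return list(csv.DictReader(io.StringIO(self.csv_text)))


class InMemoryIngestor:
    """Another interchangeable tool: serves rows that are already in memory."""

    def __init__(self, rows: list[dict[str, str]]) -> None:
        self.rows = rows

    def extract(self) -> list[dict[str, str]]:
        return list(self.rows)


def total_amount(ingestor: Ingestor) -> float:
    """Downstream step that only knows about the interface, not the tool behind it."""
    return sum(float(row["amount"]) for row in ingestor.extract())


if __name__ == "__main__":
    print(total_amount(CsvIngestor("amount\n1.5\n2.5\n")))
    print(total_amount(InMemoryIngestor([{"amount": "4"}])))
```

Swapping `CsvIngestor` for `InMemoryIngestor` changes nothing in `total_amount`, which is the same property that lets a team replace one tool in the stack while keeping the rest.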
What are the cons?
- Ecosystem/Product Bloat: The modern data stack ecosystem is growing rapidly. With so many tools to choose from, companies can have a tough time figuring out which ones best fit their organization's needs. This becomes even more troublesome because a lot of companies don’t have the culture or support to understand WHY they need certain tools. This problem alone pretty much defeats the main point of the MDS, since business value cannot be derived from a data stack that no one truly understands how to use and that doesn't have the capacity to support business goals.
- Piggybacking off the previous point, companies tend to adopt the most popular or flashiest tools on the market, even though popular isn't always what makes the most sense for the organization. Some of those tools might not be user friendly, and if users find them difficult to use, it leads to two outcomes:
  - Messy systems, because users don't understand how to use the tool
  - Low adoption rates, because the software gives users headaches. They will go find tools that work for them, which opens the organization up to more problems when everyone uses their own software to process and analyze data.
- More data flowing through ELT pipelines does not necessarily mean more business value. Not all data is useful in relation to the goals and needs of a business. In fact, having more data can lead to worse data quality if proper data governance is not in place.
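As a small illustration of that last point, here is a sketch of the kind of basic check a governance process might run before data is trusted. It is only an example under stated assumptions: `check_quality`, the field names, and the sample rows are all hypothetical, and real data quality tooling covers far more than missing values and duplicate keys.

```python
def check_quality(rows, required_fields, key_field):
    """Return a list of human-readable issues found in the rows.

    Two illustrative rules only: required fields must be filled in,
    and the key field must not repeat.
    """
    issues = []
    seen_keys = set()
    for index, row in enumerate(rows):
        # Rule 1: every required field must be present and non-empty.
        for field in required_fields:
            if row.get(field) in (None, ""):
                issues.append(f"row {index}: missing {field}")
        # Rule 2: the key field must be unique across rows.
        key = row.get(key_field)
        if key is not None and key in seen_keys:
            issues.append(f"row {index}: duplicate {key_field} {key!r}")
        seen_keys.add(key)
    return issues


if __name__ == "__main__":
    # Hypothetical sample rows, made up for this sketch.
    sample_rows = [
        {"id": 1, "email": "a@example.com"},
        {"id": 1, "email": ""},
        {"id": 2, "email": "b@example.com"},
    ]
    for issue in check_quality(sample_rows, ["id", "email"], "id"):
        print(issue)
```

Running checks like these on every load is one way to keep a growing volume of data from quietly lowering its quality.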
What does the Modern Data Stack mean for data professionals?
- PICK UP CLOUD TECHNOLOGY - The MDS is cloud based, which means that more and more job listings will require some knowledge of the data services of one (or more) of the top cloud providers. This can set you apart from other candidates who might not have taken the time to pick up some cloud skills.
- RICHES IN THE NICHES - Earlier in the post, we mentioned that the data product ecosystem is growing. With that growth comes demand for experts skilled in each of those products. Companies buy tools and immediately need someone who understands how to configure, maintain, and use them in a way that will drive value for their business. Those who can do that will be extremely valuable, especially for tools that are less user friendly or less commonly used.
- FUNDAMENTALS ARE KEY - No matter how many data services come on the market, they all rest on the same fundamental concepts that drive data analysis, data science, machine learning, and so on. If you understand the fundamentals of your job role, picking up any tool will be easier because you already understand the end goal of the product and how it relates to the issue your company is trying to solve. That leaves you needing to learn only how the tool itself works: for example, which button to press or which keyword to enter.
- NOTHING IS SET IN STONE, ALWAYS BE LEARNING - As the data field continues to evolve, so will the products and services in this space. Therefore, it is important to be flexible with tools and to keep an eye on emerging trends in order to stay relevant in the field. You don't always have to have a course in progress or a bunch of projects; just reading articles and watching YouTube videos will help you a lot with knowing what's coming and what's going on. It can even help you suggest what could be improved or built at your current job, helping you build your reputation as a subject matter expert (SME).