Turning Data Noise into a Melody: First Considerations for Your Data Warehouse Planning
In today’s world, data is everywhere. It touches everything we do, and we keep creating more of it at an ever faster rate. Perhaps you are overwhelmed. Let’s start with two questions, both key things to consider as you explore an effective data strategy.
1. Is Your Data Just Noise?
In our personal lives, we take pictures with our phones, our smart homes accrue usage and telemetry data, our purchase histories are preserved by retailers, credit card companies, and Internet statistics firms, and our cars “phone home” with endless data points.
Professionally, we also create data on many levels. In our on-prem and IaaS environments, we create virtual machines and virtual drives to contain machine and block-level data. In PaaS products, we consume storage, compute, and transport resources that generate database entries, files, logs, and monitoring data. Our resources take snapshots that let us roll back environments to recover from failed deployments, thwart ransomware attacks, and restore the files that everyone thought were protected from deletion but that vanished when someone deleted the wrong folder. We version everything. We track everything. We accumulate databases packed with tables full of data.
At some point, data becomes noise to most people, and that noise resonates at a frequency that is, apparently, easily ignored.
So, if that’s what we’re doing, why do we pay so much to store this “noise” for longer and longer terms? Shouldn’t we be doing something with all this data – to discern some melody out of the noise?
If you are asking yourself those questions, you’re on the right path. Keep reading.
2. What to Do with All the Data?
All of us should figure out how to turn the noise into a melody. All of those 0s and 1s stacked up together in an efficient, accessible, relevant way may not be the kind of melody you’re used to, but to a Data Scientist or Business Intelligence professional, those 0s and 1s are professional “music to the ears.”
Whatever project, deployment, or changes you are working on, stop for a minute. See if you can answer this question: “What are we going to do with all of the data that we’re collecting?” If you do not have an answer that includes the words, “and then that transformed data lands in our data warehouse,” you are doing it wrong. Tapping into the Data Economy to not only collect, transform, and store data, but also to learn how to create better data that is more valuable from the outset, must be considered as a part of your development cycle. Your applications are built around functionality. Your infrastructure is built around solid security. Your data needs to be built around usability and efficacy.
Complicated Questions ≠ Complicated Solutions
For many, this is new territory – defining data lineage standards for current data that will be relevant now and in the future. How do you define a data lineage strategy – a data life cycle policy – on data that is being generated now for one purpose, but may be used for another in the future?
Complicated questions do not require complicated solutions. They should, however, include consultation with Data Scientists, internal stakeholders, and experienced technologists to create the most usable, efficient, and performant data strategy for your applications and architecture.
So, after that pause, ask yourself: Should your deployment continue? Or do you need to take a step back to define a better data strategy? Refuse to accept data strategy as a category of technical debt. Today, we have the tools available to create effective data strategies, with proper Extract, Transform, and Load (ETL) processes to give your data the melody that it has been missing.
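If ETL is new to you, the idea is smaller than the acronym suggests. Here is a minimal sketch in Python, using only the standard library, with an in-memory SQLite database standing in for a real data warehouse. The sample data, the orders table, and the function names are all hypothetical and exist only to illustrate the three steps; a production pipeline would look quite different.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this might come from an application log or an API.
RAW_CSV = """order_id,customer,amount,ordered_at
1001, alice ,19.99,2024-01-05
1002,BOB,5.00,2024-01-06
1003,carol,not-a-number,2024-01-06
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: tidy customer names, convert amounts to integer cents,
    and skip rows whose amount cannot be parsed."""
    clean = []
    for row in rows:
        try:
            cents = round(float(row["amount"]) * 100)
        except ValueError:
            continue  # unusable row; a real pipeline would log or quarantine it
        clean.append(
            (int(row["order_id"]), row["customer"].strip().lower(), cents, row["ordered_at"])
        )
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount_cents INTEGER, ordered_at TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_pipeline(text, conn):
    """Run extract, transform, and load, then return (row count, total cents)."""
    load(conn, transform(extract(text)))
    return conn.execute("SELECT COUNT(*), SUM(amount_cents) FROM orders").fetchone()


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    print(run_pipeline(RAW_CSV, warehouse))
```

The point of the sketch is the shape, not the tooling: the messy row is dropped during the transform step, names are normalized, and what lands in the table is data someone can actually query.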
So, go ahead. Let your data sing.