Learning on the Job: How to Implement a Cloud-Based Data Platform
As the number, variety, and size of business data sources continue to grow, the tooling and infrastructure used to handle them need to evolve just as fast. To accommodate this, we’ve seen cloud-based data platforms rise in popularity. These data platforms are centralized locations that ingest, process, transform, and aggregate data. Being cloud-based, they allow for quick iteration in development, ease of innovation, elasticity, and high reliability. When an enterprise introduces a new cloud-based data platform, the effects go beyond the technology itself to touch company culture, people, and processes, and can lead to significant improvements across the board. That said, the transition comes with a learning curve.
We recently partnered with a global management consultancy to build a cloud-based data platform that would help to improve cross-team knowledge management and access to data. While a data platform typically provides analytics, which makes sense of data by uncovering meaningful trends, the platform we were tasked with building would take data analysis a step further, leading to actionable insights that would create business value.
The journey from our first introduction to our final knowledge transfer offered several big learning opportunities, which we discuss below.
SPR was approached by the CIO of a Chicago-based consulting firm to build out a data platform. The core platform was to be built in-house by the IT team, while our role was to make the platform accessible and usable by other business units within the firm by centralizing data assets using a common architecture across each unit.
The separation of business units (BUs) was a key driver for the project. The firm had 14 different BUs divided by focus (for instance, health care, government, and the private sector), and each had its own knowledge management system and data handling capabilities. This meant that data was fragmented and stored in disparate, disconnected silos, where there was no governance or cross-department communication, and valuable insights were not being derived.
It was our goal to centralize the data in a single platform in order to standardize data quality, integrity, and security, make data access more seamless across the company, and allow for more consistent insights.
Guiding the project was a set of business use cases that provided clear expectations of functional and nonfunctional requirements that would help to drive platform adoption across the company. We realized if we built the platform in its entirety, it’d be such a beast that no one would want to touch it. Hence, the pilot phase.
After initial conversations with the CIO, and reviewing the knowledge management systems, data ecosystem, and business use cases, our team identified the top three business units to launch the pilot, with intent to eventually onboard all units.
Over the course of the nine-month engagement, we developed an AWS solution to ingest, process, and house data. Harnessing AWS Lambda, S3, Glue, and RDS, we built a lightweight and flexible data platform. Everything from resource configuration to security and permissions was captured in code, allowing the platform to be deployed or updated with a single click in any environment. As mentioned above, three departments were onboarded to the platform, each with its own data, development teams, and use cases. By our last day, we had documented all decisions and processes and provided a full knowledge transfer to the client-side team.
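To illustrate the infrastructure-as-code approach described above, here is a minimal, hypothetical CloudFormation fragment. The resource names, properties, and handler logic are our own illustrative assumptions, not the actual project configuration; the point is how an ingest bucket, a processing Lambda, and its permissions can all live in one versioned template:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Illustrative sketch of a per-business-unit ingest stack

Resources:
  # S3 bucket that receives raw data drops from a business unit
  IngestBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub "bu-ingest-${AWS::AccountId}"

  # Execution role scoped to reading the ingest bucket plus basic logging
  IngestFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal: { Service: lambda.amazonaws.com }
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: ReadIngestBucket
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: s3:GetObject
                Resource: !Sub "${IngestBucket.Arn}/*"

  # Lambda that processes each new object (placeholder body)
  IngestFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt IngestFunctionRole.Arn
      Code:
        ZipFile: |
          def handler(event, context):
              # Placeholder: validate the object, then hand off to Glue/RDS
              return {"records": len(event.get("Records", []))}
```

A template like this deploys or updates with one command per environment (for example, `aws cloudformation deploy --template-file stack.yaml --stack-name bu-ingest --capabilities CAPABILITY_IAM`), which is what makes the one-click, any-environment deployment practical.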
The project naturally presented numerous unexpected hurdles in both architecture and process. Thankfully, we were able to adapt quickly by relying on the agility of cloud services; we could deploy a new platform in minutes if needed. In the end, we felt both wins and losses, and learned a few crucial lessons:
- Leverage the benefits of use cases. Not only do use cases drive architecture decisions, but they also demonstrate real ROI, and we were able to speak to this at various milestones. In addition, use cases help to distinguish what the client wants from what they actually need. For example, if a client wants the platform to support streaming data, but there is no legitimate use case requiring it, it’s an easy argument to forgo implementation. By basing the project on business use cases, our development team had clear functional requirement guidance and could provide context and support to client-side teams.
- Working knowledge of Agile is necessary. Before undertaking a project such as this, it’s important that the entire team understands how to operate in an Agile environment. To have even a 90 percent understanding of the practice essentially equates to zero: either you are Agile or you’re not. This organization, in particular, said they “kind of knew” Agile, and we moved forward. In retrospect, we should have completed a three-day Agile boot camp to get everyone on board before project kickoff.
- A project manager is key. Projects of this magnitude need ongoing support and oversight by a project manager (PM) or similar. If the budget does not justify a dedicated PM, at a minimum the team should employ a hybrid role that wears multiple hats and contributes to project management. As a data practice, we learned to be explicit about our PM support requirements. Doing so helps elevate the right priorities and avoid distractions from interesting, but potentially unnecessary, technology.
- Involve end-users early on. We encountered a lack of adoption and learned it’s not always an “if you build it, they will come” scenario. To combat this, we could have employed a grassroots movement that engaged business users earlier and more frequently. Our communication with end-users was limited to sharing results, which proved disorienting: because users had not been along for the ride, they did not know how to interpret what they were seeing. Had we kept up ongoing interactions, we could have better tailored the tool to their needs and wants, which also naturally promotes adoption.
- Employ a strong internal steering committee. It’s vital for the client to have a project sponsor who helps to sell the new platform to internal stakeholders and team members. We found that most of our progress meetings turned into sales meetings: we had to sell the tool to each business unit multiple times over, justifying that it was the best option for their department and the challenges they were facing. We did not anticipate this need to resell the tool since they had already invested in it, and many conversations were derailed as a result. Our internal project sponsor also checked out early and de-prioritized this project in favor of other pursuits. While he believed in the platform and continued to fund it, he didn’t continue to champion how the details from this project could provide lessons learned and cross-departmental adoption. Project sponsors need to be ready to engage in data pipeline projects for the long haul, across many stakeholders, and across varying degrees of adoption (from totally enthusiastic to completely resistant). A strong sponsor will shepherd value through all these scenarios to maximize project investment.
Ultimately, the three business units that did adopt the new platform were pleased with the product, and both our team and the client gained invaluable experience in its development. There is still uncertainty, however, on whether the remaining 11 business units will move forward with platform adoption. As outlined above, this project taught us many significant lessons. Cloud-based platforms certainly are the future for most companies, and we know our lessons learned will prove helpful to other organizations undergoing a shift from an on-premises to a cloud-based data platform.