Here at Velocity Insight, we see a bunch of databases. Like LOTS of databases.
And since most of us have been in the industry for a decade or two, we’ve had a front-row seat to a big change that’s occurring in the data stack of E&P companies – a trend that we call The Shattering.
It used to be that E&P companies all handled their enterprise databases in roughly the same way. There was a consensus in the industry, with relatively minor variations on a theme:
- On-premises servers that were managed end-to-end by an in-house IT department, whether physically at the headquarters office or at a co-located data center
- Infrequent, high-effort upgrade cycles that were planned months (or even years) in advance, requiring tight coordination between software vendors, networking, hardware specialists, and third-party contractors
- Standardization on one of a handful of technologies – Microsoft SQL Server (MSSQL) was dominant, with much lesser market share for tools like Oracle, IBM Db2, Teradata, SAP, Unidata, and Linux flavors. Call it MSSQL and the Seven Also-Rans.
This consensus had some important advantages for digital work. With on-prem MSSQL being so dominant for key enterprise software and data warehousing applications, skills were highly transferable from E&P to E&P. You could pick up a business analyst who was familiar with a few key apps (especially from the Big Six) and drop them into a new organization with near-immediate productivity.
But over the last 10-20 years, that consensus has been broken. We have entered The Shattering. What’s changed?
- Huge volumes of data are being lifted-and-shifted from on-prem servers to cloud databases with existing tech
- Another big tranche of data is being redesigned on top of cloud-native architectures like microservices, data lakes, and parallelizable Spark engines
- SaaS vendors are pushing their E&P customers to store data in the SaaS vendor’s preferred technology, which could be any of a dizzying array of back-end architectures hosted on any of a number of cloud providers
So where are E&Ps storing their data now? In short, everywhere and in every format. They’re using AWS and Azure and Google Cloud. They’re using Azure SQL and Unidata and Databricks and Snowflake. They’re using Data Warehouses and Data Lakes and Data Marts. They’re accessing it with SQL endpoints and flat files and APIs. The data is coming from everywhere – even inside the house!
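To make that variety concrete, here is a minimal Python sketch of what it takes to pull the same record through three of those access paths: a SQL query, a flat file, and an API response. Everything in it is a made-up assumption for illustration: the `wells` table, the `well_id` and `oil_bbl` fields, the JSON payload shape, and the sample values. An in-memory SQLite database stands in for a real SQL endpoint, and hard-coded strings stand in for a real file and a real API call.

```python
import csv
import io
import json
import sqlite3


def from_sql():
    # Stand-in for a SQL endpoint: an in-memory SQLite table (hypothetical schema).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wells (well_id TEXT, oil_bbl REAL)")
    conn.execute("INSERT INTO wells VALUES ('W-001', 125.5)")
    row = conn.execute("SELECT well_id, oil_bbl FROM wells").fetchone()
    conn.close()
    return {"well_id": row[0], "oil_bbl": float(row[1])}


def from_flat_file():
    # Stand-in for a CSV export: every value arrives as text and must be cast.
    text = "well_id,oil_bbl\nW-001,125.5\n"
    row = next(csv.DictReader(io.StringIO(text)))
    return {"well_id": row["well_id"], "oil_bbl": float(row["oil_bbl"])}


def from_api():
    # Stand-in for an API response: nested JSON with different field names.
    payload = '{"wellId": "W-001", "production": {"oilBbl": 125.5}}'
    body = json.loads(payload)
    return {
        "well_id": body["wellId"],
        "oil_bbl": float(body["production"]["oilBbl"]),
    }


# Three sources, three parsing styles, one normalized record shape.
records = [from_sql(), from_flat_file(), from_api()]
```

Even in this toy version, each path needs its own connection logic, its own type handling, and its own field mapping before the three records line up. Multiply that by every vendor and cloud provider in the stack.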
This diversification comes with an enormous soft cost. Since the sources of data are so varied, the skills and competencies required to successfully do Data Engineering have become overwhelmingly broad. It’s not possible for any one person to have all the necessary skills – you can’t get it done with “A Data Guy”. You need a team. That’s part of why we formed VI – in the belief that when modern E&P companies need an enormously wide range of data skills to be successful, outsourcing beats insourcing in most cases.
I believe that The Shattering will eventually give way to consolidation. A few technologies are going to end up winning on the combination of Speed, Cost, and Quality. But as of now, it’s still fairly unclear which will be winners and which will be losers.
At this point, we’re far enough along that I know which horses I’d be willing to place bets on. Cloud is winning. Vanilla SQL is more transferable than the funky flavors like SparkSQL, HiveQL, or RockyRoadQL (okay, I made that one up). Simple, transparent relational databases win over more opaque approaches like NoSQL or data swamps.
So what’s your experience? Is your company still on a traditional on-prem architecture? There’s a reason it ruled for so long, but what value propositions are you missing out on? On the flip side, if you’ve gone fully cloud, how are you dealing with the overwhelming menu of choices and connection types?
We love helping our clients with this stuff, so give us a call and let’s start putting the pieces back together.