Reverse ETL – The What, The Why, and The Who!
The industry is making a lot of fuss about reverse ETL nowadays, and for a very good reason; Operationalizing your analytics makes the modern data stack much more powerful than ever before.
Modern no-code automated data stacks have been a dream for user-centric companies all over the world. In fact, it’s starting to gain pace, with more and more companies getting comfortable with the idea of fetching data from operational or transactional platforms such as Facebook and Salesforce into a central data warehouse for analytics and decision-making.
But, just pushing data into a single place does not ensure that it gets used across the organization in their day-to-day work. So, how about pulling processed data from a data warehouse into these platforms so that your teams can use it in tools and settings already familiar to them?
What is reverse ETL?
Traditionally, ETL is about extracting data from a transactional database or platform and putting it into a data warehouse, where it can eventually be used to create reports and analytics.
So, what exactly is the reverse of this?
Well, it’s kind of cool and mind-blowing. To put it simply, what we are doing with reverse ETL is this – we’re taking data that has already been structured and organized for analytical reasons in a data warehouse and putting it back into operational applications such as Facebook or Google Sheets, where your teams can use this data for their day to day work.
Well, that’s interesting.
However, as I went digging into this whole idea of reverse ETL, one thing kept bothering me over and over again — Since reverse ETL is as easy as extracting and storing data, Why didn’t it appear up until now?
Surely we could have just reverse-engineered traditional ETL or ELT and saved a lot of time & money that goes into ineffective manual processes?
To be honest, the issues were really fairly simple.
Platforms or applications, in general, want to prevent sourcing from one big “SINGLE SOURCE OF TRUTH (SSOT).”
Okay! But Why would they do this?
It’s because it’s very rare that you can put all of your data into one place for usage, where you can also establish a common understanding of what different data elements mean for the various platforms that you are trying to put data into. Let that settle in.
Another big headache comes when you try to use analytical platforms that we use for reporting and analytics such as Looker, Power BI, or Tableau for reverse ETL. With these platforms, you need to use a power query to be able to put data in a format that makes sense for the transaction or user application platform that you are trying to put data into.
So, if you need an analyst every time you want to move a single line of data back into say Facebook to make it useful, it really isn’t seamless and it requires you to set context every single time.
It is starting to look scary now, why would someone want to disturb the traditional ways of extracting and loading data when it is just too complicated?
The next sections cover just that. Read on!
Why use Reverse ETL?
So, now that we know a little bit about what reverse ETL is all about, let’s move on to ‘the why’. Why would you want to use reverse ETL?
Let’s begin with two familiar platforms (or applications) that you are currently using in your daily business activities.
Salesforce, and Mailchimp.
If you haven’t heard of them, Salesforce is a CRM tool, and Mailchimp is used for email marketing.
Now think about this.
As a marketer, you’re looking for ways to organize your email outreach for reaching out to potential customers in your target segment. To do this, you’ll pull records of potential customers into a cloud-based CRM – say something like Salesforce, then dance around with this data in an analytical fashion to make sense of things and then put it back into something like Mailchimp for running email marketing campaigns.
This is really what forward and reverse ETL looks like in practice.
Here are some more common use cases where this part of the data stack really shines
- Syncing key metrics and definitions that you built in your data warehouse into operational systems such as CRMs where your teams can actually consume them. These can include lifetime value, propensity score, and lead score, among others.
- Fulfiling ad hoc data needs without the assistance of the data teams. “Can I get a list of *** in excel format please”?, Sure.
- Personalising and hyper-targeting your market segment by syncing customer and audience lists with your marketing platforms.
- Keeping product, sales, and support teams in sync (while using their respective platforms) about customer touchpoints, actions, and overall journey.
I could go on and on, but I’m sure you catch my drift.
The requirements are always evolving and may differ from one company to the next, but the pursuit remains unchanged.
You have the needful data in a data warehouse, and all you need now is a means to push it into the platforms you use every day with ease.
Now, it was alright if it was the 60s and the world did not keep changing overnight, sure you could achieve this manually. But, it’s 2022 and brands are competing against rising privacy restrictions, greater consumer expectations, rising volume and value of data, and so much more.
Here is a screenshot of why manual reverse ETL is not sustainable in the long run.
- It is time-consuming, inefficient, expensive, and above all very SLOW!
- It requires someone with a good amount of knowledge and skills to waste hours in a redundant and boring day in day out processing.
- One bad INSERT, UPDATE, DELETE can ruin the entire setup and force you to start again.
- As you grow, scaling manual reverse ETL (or the usual ETL for that matter) becomes impossible.
- And, above all, if keeping TRACK of data in real-time was that easy, you would never have been struggling with data silos.
Interestingly, a lot of Gartner and world-renowned research backs up the same idea and says that the difference between the leaders and laggards today is highly influenced by how automated, codeless, and real-time their data management practices are.
But, still, we want to either keep it manual (read slow) or in-house (read expensive).
Who offers Reverse ETL
Until recently, companies wrote their own API connectors from various operational platforms such as Salesforce, Facebook, and Mailchimp to their data warehouse, and back from the data warehouse to these SaaS platforms, to pipe their data into an operational or analytical format.
But, we have heard straight from the horse’s mouth – It is no fun!
- Writing data connectors (forward or reverse) is time-consuming, inefficient, expensive, and above all extremely SLOW!
- Most APIs are not built to handle real-time data transfers and hence need constant batching, retries, and checkpointing to bypass rate limits.
- Mapping fields from the data warehouse to the operational platform and vice-versa takes a lot of time and precision.
- API specs vary as they upgrade and it becomes a total nightmare to maintain connectors over time.
- As your company grows, scaling connector volume becomes overwhelming and unsustainable.
No worries at all!
Enter 3rd party solutions such as DataChannel
DataChannel – An integrated ETL & Reverse ETL solution
- 100+ Data Sources. DataChannel’s ever-expanding list of supported data sources includes all popular advertising, marketing, CRM, financial, and eCommerce platforms and apps along with support for ad-hoc files, google sheets, cloud storages, relational databases, and ingestion of real-time data using webhooks. If we do not have the integration you need, reach out to our team and we will build it for you for free.
- Powerful scheduling and orchestration features with granular control over scheduling down to the exact minute.
- Granular control over what data to move. Unlike most tools which are highly opinionated and dictate what data they would move, we allow you the ability to choose down to field level what data you need. If you need to add another dimension or metric down the line, our easy to use UI lets you do that in a single click without any breaking changes to your downstream process.
- Extensive Logging, fault tolerance and automated recovery allows for dependable and reliable pipelines. If we are unable to recover, the extensive notifications will alert you via slack, app and email for taking appropriate action.
- Built to scale at an affordable cost. Our best in class platform is built with all ETL best practices built to handle billions of rows of data and will scale with your business when you need them to, while allowing you to only pay for what you use today.
- Get started in minutes. Get started in minutes with our self-serve UI or with the help of our on-boarding experts who can guide you through the process. We provide extensive documentation support and content to guide you all the way.
- Managed Data Warehouse. While cloud data warehouses offer immense flexibility and opportunity, managing them can be a hassle without the right team and resources. If you do not want the trouble of managing them in-house, use our managed warehouse offering and get started today. Whenever you feel you are ready to do it in-house, simply configure your own warehouse and direct pipelines to it.
- Activate your data with Reverse ETL. Be future-ready and don’t let your data sit idle in the data warehouse or stay limited to your BI dashboards. The unidimensional approach toward data management is now undergoing a paradigm change. Instead, use DataChannel’s reverse ETL offering to send data to the tools your business teams use every day. Set up alerts & notifications on top of your data warehouse and sync customer data across all platforms converting your data warehouse into a powerful CDP (Customer Data Platform). You can even preview the data without ever leaving the platform.