Data stack 2024 | incident.io

It’s been almost two years since we last wrote about our data stack, and a lot has changed. This post walks through our current setup and a few key highlights:

  • Why we switched to Omni for internal BI tooling and Explo for customer-facing embedded insights, and how we use Hex for ad-hoc analysis
  • The improvements we’ve made to our local development setup
  • How we use our own On-call product to detect data pipeline issues and notify us about them


🧱 Our current setup

Let’s start with a snapshot of our data stack. The setup revolves around three main areas: Extract & load (EL), Transform (T), and Analyze.

Extract & load:

  • We rely on Fivetran to sync data from external sources and our GCP Postgres database into BigQuery
  • Event data from our product is streamed directly into BigQuery for real-time analysis
  • Segment brings activity data from our website, web app, and mobile app into BigQuery

Transform:

  • We are enthusiastic users of dbt for SQL transformations and data modeling
  • CircleCI handles our CI/CD processes and runs our dbt SQL code regularly
  • We employ Docker for running dbt on CircleCI
  • Hightouch helps us sync transformed data to Salesforce and send alerts to Slack

Analyze:

  • We turn to Omni for internal dashboards, Explo for customer insights, and Hex for ad-hoc analysis

Explo

Our Engineering team recently migrated to Explo for our Insights product, offering customers deeper data insights.

Explo’s embedded analytics support fast iteration, greater customization, and drill-down capabilities.

Omni

Earlier this year, we adopted Omni for our internal BI needs, and it has fit our requirements well.

Omni’s concept of model layers (schema, shared, and workbook models) has streamlined our analytics workflow.

Hex

For in-depth analysis and complex transformations, we use Hex, a powerful tool for ad-hoc data exploration.

🔧 Our dev setup

We recently made significant improvements to our development setup, particularly to how we run dbt locally.

Our Data Engineer introduced a smart solution: using the latest production manifest file from Google Cloud Storage to streamline local dbt runs.

This approach has significantly boosted productivity and efficiency in our development process.
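
To make the idea concrete, here is a minimal sketch of one common way to use a production manifest locally: dbt's state comparison and deferral (`--select state:modified+ --defer --state`). Whether our setup is wired exactly like this is an assumption for illustration, and the bucket URI, local directory, and function names below are all hypothetical.

```python
from __future__ import annotations

import subprocess
from pathlib import Path

# Hypothetical locations: the real bucket and local path are not specified here.
PROD_MANIFEST_URI = "gs://example-dbt-artifacts/prod/manifest.json"
STATE_DIR = Path("target/prod-state")


def fetch_manifest_cmd(uri: str = PROD_MANIFEST_URI, state_dir: Path = STATE_DIR) -> list[str]:
    """Build the gsutil command that copies the production manifest locally."""
    return ["gsutil", "cp", uri, str(state_dir / "manifest.json")]


def dbt_run_cmd(state_dir: Path = STATE_DIR, select: str = "state:modified+") -> list[str]:
    """Build a dbt command that runs only modified models (and their children),
    deferring references to unmodified models to the production state."""
    return ["dbt", "run", "--select", select, "--defer", "--state", str(state_dir)]


def run_locally() -> None:
    """Download the latest manifest, then run dbt against it."""
    STATE_DIR.mkdir(parents=True, exist_ok=True)
    subprocess.run(fetch_manifest_cmd(), check=True)
    subprocess.run(dbt_run_cmd(), check=True)
```

With a setup along these lines, a local run only rebuilds what has changed and reads everything else from production, which is where the time savings would come from.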

📟 Using our own On-call product

At incident.io, we practice what we preach by extensively using our own On-call product within the Data team.

Automated incident declaration and intelligent paging mechanisms help us stay on top of any data pipeline issues round the clock.

Whether it’s an actual issue or a transient glitch, our On-call product ensures prompt resolution and detailed incident tracking.
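
As a rough illustration of how a pipeline failure could be turned into an alert, the sketch below builds a payload with a stable deduplication key, so repeated failures of the same step group into one alert instead of paging repeatedly. The field names, the `PipelineFailure` type, and the helper functions are hypothetical and are not the incident.io API schema.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineFailure:
    pipeline: str  # name of the job or sync that failed
    step: str      # the step within that pipeline
    error: str     # error message captured from the run


def dedup_key(failure: PipelineFailure) -> str:
    """Stable key derived from pipeline and step only, so the error text
    can vary between runs without creating a new alert."""
    raw = f"{failure.pipeline}:{failure.step}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


def build_alert(failure: PipelineFailure, resolved: bool = False) -> str:
    """Serialize an illustrative alert payload as JSON."""
    payload = {
        "title": f"{failure.pipeline} failed at {failure.step}",
        "status": "resolved" if resolved else "firing",
        "deduplication_key": dedup_key(failure),
        "description": failure.error,
    }
    return json.dumps(payload, sort_keys=True)
```

Under this scheme, a failed run would send a "firing" payload, and a later successful run could send the same key with "resolved", so a transient glitch closes itself out while a persistent failure stays open.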

✍️ Summary

Our data stack has evolved over time, with key tools like Fivetran and dbt continuing to play crucial roles in our operations.

As the data landscape evolves, we remain agile in adopting new solutions while maintaining a strong foundation for our data processes.

Stay tuned for more updates on our data journey. We won’t keep you waiting another two years!
