Jaffle Shop
How to use PipeRider with dbt
Adding PipeRider to an existing dbt project takes almost no configuration, because PipeRider has built-in support for dbt.
This guide uses dbt's Jaffle Shop project to show how PipeRider works alongside a dbt project.
In this guide you will do the following:
1. Configure the Jaffle Shop project
Clone the Jaffle Shop repository
Follow the ‘Running this project’ instructions in the Jaffle Shop repository to install and configure the dbt project, or use the instructions below to set up the project using DuckDB.
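The DuckDB setup can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not the official setup: the repository URL, the `dbt-duckdb` adapter, the `jaffle_shop` profile name, and the `jaffle_shop.duckdb` file name are assumptions, and `setup_jaffle_shop_duckdb` is a hypothetical helper name.

```shell
# Sketch: clone Jaffle Shop and point it at a local DuckDB file.
# Assumptions: git, python, and pip are on PATH; the repository URL and the
# "jaffle_shop" profile name match your copy of the project.
setup_jaffle_shop_duckdb() {
  git clone https://github.com/dbt-labs/jaffle_shop.git || return 1
  cd jaffle_shop || return 1
  python -m pip install dbt-duckdb || return 1
  # Project-local profile; the database file name is an arbitrary choice.
  cat > profiles.yml <<'EOF'
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: jaffle_shop.duckdb
EOF
  # Seed, run, and test the project in one step.
  dbt build
}
```

Call `setup_jaffle_shop_duckdb` from the directory where you want the clone. The helper changes into the project directory, so later commands run from there.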
2. Install and add PipeRider to the Jaffle Shop project
Install PipeRider
Install PipeRider with the required connector for the data source you used to configure the Jaffle Shop project in step #1.
For example, to install PipeRider with the DuckDB connector, you would use the following command:
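A minimal sketch of the install step, written as a helper so the connector can be swapped. `install_piperider` is a hypothetical name, and the default of `duckdb` assumes you followed the DuckDB setup; pass the connector that matches your own data source.

```shell
# Sketch: install PipeRider together with one connector extra.
# The connector name is passed as the first argument and defaults to duckdb.
install_piperider() {
  connector="${1:-duckdb}"
  python -m pip install "piperider[${connector}]"
}
```

For example, `install_piperider` installs the DuckDB connector, while `install_piperider snowflake` would request the Snowflake one.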
Verify PipeRider configuration
Ensure that PipeRider can connect to the data source by running the diagnose command.
If everything is configured correctly, you’ll see the You are all set! message.
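The check can be wrapped so that later steps stop early when the connection fails. This is a sketch that relies only on the exit status of `piperider diagnose`; `verify_piperider` and its messages are hypothetical.

```shell
# Sketch: run the connection check and report the outcome.
# Assumes piperider is installed and the current directory is the dbt project.
verify_piperider() {
  if piperider diagnose; then
    echo "PipeRider can reach the data source."
  else
    echo "piperider diagnose failed; check the dbt profile and the connector." >&2
    return 1
  fi
}
```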
3. Run PipeRider
You can now run PipeRider to generate your first report, which lists all of the sources, seeds, models, and schema definitions. To profile your models, however, you first need to add the PipeRider tag, as described in the next section.
Tag models to enable profiling
Enable profiling by adding the piperider tag to the models you wish to be profiled. You can set the tag in the project file, dbt_project.yml, by adding piperider to the +tags list of the relevant entry under models.
Alternatively, you can add the tag config to the top of individual model files, e.g.:
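As a sketch of the per-file approach, the helper below prepends a dbt `config` line carrying the tag to a model file. `tag_model` is a hypothetical helper, and it assumes the model has no `config()` call of its own yet; edit such files by hand instead.

```shell
# Sketch: add the piperider tag to the top of one model file.
# The file is left untouched if it already contains this exact config line.
tag_model() {
  model_file="$1"
  tag_line="{{ config(tags=['piperider']) }}"
  if grep -qF "$tag_line" "$model_file"; then
    return 0
  fi
  tagged_copy="$(mktemp)" || return 1
  # Write the config line, a blank line, then the original model SQL.
  { printf '%s\n\n' "$tag_line"; cat "$model_file"; } > "$tagged_copy" || return 1
  cat "$tagged_copy" > "$model_file" && rm -f "$tagged_copy"
}
```

For example, `tag_model models/customers.sql` tags the customers model, assuming that is where the file lives in your project.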
After tagging models, verify your configuration by listing the tagged models.
Run PipeRider again. This time, the report will be filled with the data profiling statistics of your tagged models.
The report contains profiling statistics for each of the profiled models.
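The list-then-run sequence above can be sketched as one helper. `profile_tagged_models` is a hypothetical name; the sketch assumes both `dbt` and `piperider` are on PATH and that you run it from the project directory.

```shell
# Sketch: show which models carry the piperider tag, then profile them.
profile_tagged_models() {
  # Stop if dbt cannot resolve the selection (for example, a broken project file).
  dbt list -s tag:piperider --resource-type model || return 1
  piperider run
}
```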
Add metrics to query
In dbt, you have the ability to define metrics that specify how to query your time series data. PipeRider offers automatic report generation based on these defined metrics.
To add a metric to your project, create a new file, models/revenue.yml, with the following content.
Note that metrics also require the piperider tag, indicating that PipeRider should automatically query this metric.
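One possible shape for such a file is sketched below. It is an illustration, not the original content: it assumes a dbt version earlier than 1.6, where metrics are declared with `calculation_method`, `expression`, `timestamp`, and `time_grains`, and it assumes the orders model exposes `amount` and `order_date` columns. The metric name, label, and description, and the helper name `write_revenue_metric`, are hypothetical.

```shell
# Sketch: write models/revenue.yml with a tagged revenue metric.
# The optional first argument is the project directory (defaults to ".").
write_revenue_metric() {
  project_dir="${1:-.}"
  mkdir -p "$project_dir/models" || return 1
  # Legacy (pre-1.6) dbt metric syntax; adjust for newer dbt versions.
  cat > "$project_dir/models/revenue.yml" <<'EOF'
version: 2

metrics:
  - name: revenue
    label: Revenue
    model: ref('orders')
    description: "Total revenue of the jaffle business"
    calculation_method: sum
    expression: amount
    timestamp: order_date
    time_grains: [day, week, month, year]
    tags:
      - piperider
EOF
}
```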
Check that the metric is configured correctly.
Run PipeRider again
The report includes metric queries for your data. Because the Jaffle Shop data covers only the year 2018, data appears only in the yearly report, which displays the last 10 years.
Commit the change
To follow the compare tutorial below, first commit the current changes.
4. Add PipeRider to your development workflow
When you develop a new feature, you likely follow the GitHub flow, which consists of the following steps:
Create a branch
Make changes
Create a pull request
Address review comments
Merge your pull request
Based on these steps, PipeRider can integrate with your process and generate a comparison report to aid with code review in your dbt project.
Create a branch for development
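Because the comparison depends on the earlier changes being committed, branch creation can be guarded with a clean-tree check. This is a sketch; `start_feature_branch` is a hypothetical helper, and it assumes a git version that provides `git switch`.

```shell
# Sketch: create a development branch only when nothing is left uncommitted.
# Untracked files count as uncommitted here, so generated artifacts may need
# a .gitignore entry.
start_feature_branch() {
  branch="${1:-}"
  if [ -z "$branch" ]; then
    echo "usage: start_feature_branch <branch-name>" >&2
    return 2
  fi
  if [ -n "$(git status --porcelain)" ]; then
    echo "Uncommitted changes found; commit them before branching." >&2
    return 1
  fi
  git switch -c "$branch"
}
```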
Make changes
Add a new column to the customers table.
Add a filter to the orders table.
Build the project
Test your changes, and ensure that the project can be built without error.
Create the compare report
The PipeRider compare command will compare your data before and after making dbt project changes.
The report will show the following changes to your project:
Added a new column
Row counts change in the orders table
Metric change due to the orders table definition change
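The build and compare steps above can be chained so that a failed build never produces a comparison. This is a minimal sketch; `build_and_compare` is a hypothetical helper that assumes `dbt` and `piperider` are on PATH and that you are on your development branch.

```shell
# Sketch: build the dbt project, then compare only if the build succeeded.
build_and_compare() {
  if ! dbt build; then
    echo "dbt build failed; skipping piperider compare." >&2
    return 1
  fi
  piperider compare
}
```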
Add the comparison summary to your pull request comment
The compare command also outputs a markdown file, summary.md, which is specifically designed to be pasted into a GitHub pull request (PR) comment.
The pull request comment now contains detailed information about how your code changes have affected the data. This improves the code review process and helps ensure that unexpected changes do not make their way into production.
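If you use the GitHub CLI, posting the summary can be sketched as follows. This assumes `gh` is installed and authenticated and that the current branch already has an open pull request; `post_compare_summary` is a hypothetical helper, and you pass it the path to the summary.md that your compare run produced.

```shell
# Sketch: post a compare summary as a comment on the current branch's PR.
post_compare_summary() {
  summary_file="${1:-}"
  if [ ! -f "$summary_file" ]; then
    echo "Summary file not found: $summary_file" >&2
    return 1
  fi
  # --body-file reads the comment text from the given file.
  gh pr comment --body-file "$summary_file"
}
```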
Next Step: Automate the process in the CI
The process described above is manual. To automate it, integrate PipeRider into your CI workflow.