PipeRider
Custom Assertions
How to create custom assertions to check the quality of your data.
PipeRider provides a few built-in assertions and also supports custom assertions as plugins, so you can write data quality checks tailored to your own needs. This page explains how they work and walks you through creating your first custom assertion.

How to Create an Assertion Function

Plugins

By default, PipeRider automatically loads Python files under .piperider/plugins as custom assertion functions. The .piperider/plugins directory is created by piperider init, together with a scaffold file for a custom assertion function, customized_assertions.py. You can rename this file or define assertion functions in other Python files in the same directory.
The plugin search path can be overridden with the environment variable PIPERIDER_PLUGINS. Set the variable to the path you want PipeRider to use.
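The lookup described above can be pictured with a small sketch. This is not PipeRider's own loader; the helper names resolve_plugin_dir and list_plugin_files are hypothetical, and the only details taken from the text are the default directory and the PIPERIDER_PLUGINS variable.

```python
import os
from pathlib import Path

# Default location described above; PIPERIDER_PLUGINS overrides it.
DEFAULT_PLUGIN_DIR = Path(".piperider") / "plugins"


def resolve_plugin_dir(environ=os.environ) -> Path:
    """Return the plugin directory, honoring PIPERIDER_PLUGINS when it is set.

    Hypothetical helper: it only mirrors the behavior described in the text.
    """
    override = environ.get("PIPERIDER_PLUGINS")
    return Path(override) if override else DEFAULT_PLUGIN_DIR


def list_plugin_files(plugin_dir: Path) -> list:
    """List the Python file names that would be candidates for loading."""
    if not plugin_dir.is_dir():
        return []
    return sorted(p.name for p in plugin_dir.glob("*.py"))


if __name__ == "__main__":
    plugin_dir = resolve_plugin_dir()
    print(plugin_dir, list_plugin_files(plugin_dir))
```

Running this from a project directory prints the directory that would be searched and the Python files found in it, which can be a quick way to confirm that the variable is set the way you expect.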

Metrics

piperider run generates profiling results containing many metrics that your assertions can refer to when measuring data quality. Metrics are saved in .piperider/outputs/<run>/.profiler.json, and the content is organized by table and column, with column metrics depending on the data type: String, Datetime, or Numeric.
Here is an example of these types of profiling metrics. The table PRICE contains the columns SYMBOL (string), DATE (datetime), and OPEN (numeric).
sample .profiler.json
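To make the table/column organization concrete, here is a minimal sketch of how an assertion might look up a column metric. The key names ("tables", "columns", "row_count", "type", "distinct") and the helper column_metric are assumptions for illustration, and the numbers are placeholders borrowed from the assertion example on this page, not real profiling output. Check your own .profiler.json for the actual keys.

```python
# Hypothetical excerpt of profiling metrics. Key names and values are
# illustrative assumptions, not the real .profiler.json schema.
metrics = {
    "tables": {
        "PRICE": {
            "row_count": 157881,  # placeholder value
            "columns": {
                "SYMBOL": {"type": "string", "distinct": 505},  # placeholder value
                "DATE": {"type": "datetime"},
                "OPEN": {"type": "numeric"},
            },
        }
    }
}


def column_metric(metrics, table, column, key, default=None):
    """Safely read one metric for a column, returning default when anything is missing."""
    return (
        metrics.get("tables", {})
        .get(table, {})
        .get("columns", {})
        .get(column, {})
        .get(key, default)
    )


if __name__ == "__main__":
    print(column_metric(metrics, "PRICE", "SYMBOL", "distinct"))
    print(column_metric(metrics, "PRICE", "OPEN", "type"))
```

Chaining .get() calls with empty-dict defaults means a missing table or column yields the default instead of raising KeyError, which is usually what an assertion wants when a metric is absent.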

Assertion

The first time you execute piperider run, you are asked whether to generate recommended assertions. If you answer yes, recommended assertions are generated; otherwise, assertion scaffolding is generated in .piperider/assertions.
No assertion found
Do you want to auto generate recommended assertions for this datasource [Yes/no]
The assertion YAML scaffolding looks like this:
your_table_name:
  tests: []
  columns:
    your_column_name_a:
      tests: []
    your_column_name_b:
      tests: []
...
You can add built-in and custom assertions for tables and columns here. For example:
PRICE:  # Table Name
  # Test Cases for Table
  tests:  # assertion functions
    - name: assert_row_count_in_range  # built-in assertion function takes a parameter
      assert:
        count: [0, 157881]
  columns:
    SYMBOL:  # Column Name
      # Test Cases for Column
      tests:  # assertion functions
        - name: alphanumeric_only  # custom assertion function without parameters
        - name: distinct_count  # custom assertion function takes a parameter
          assert:
            count: 505
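One way to read the example is as a list of (table, column, assertion name, parameters) entries. The sketch below writes the same structure as the plain dict a YAML parser would produce and flattens it; iter_tests is a hypothetical helper for illustration, not part of PipeRider.

```python
# The PRICE example written as the nested dict a YAML parser would return.
assertions = {
    "PRICE": {
        "tests": [
            {"name": "assert_row_count_in_range", "assert": {"count": [0, 157881]}},
        ],
        "columns": {
            "SYMBOL": {
                "tests": [
                    {"name": "alphanumeric_only"},
                    {"name": "distinct_count", "assert": {"count": 505}},
                ]
            }
        },
    }
}


def iter_tests(config):
    """Yield (table, column, name, params); column is None for table-level tests."""
    for table, spec in config.items():
        # Table-level tests come first.
        for test in spec.get("tests") or []:
            yield table, None, test["name"], test.get("assert", {})
        # Then the tests attached to each column.
        for column, col_spec in (spec.get("columns") or {}).items():
            for test in col_spec.get("tests") or []:
                yield table, column, test["name"], test.get("assert", {})


if __name__ == "__main__":
    for entry in iter_tests(assertions):
        print(entry)
```

Note that an assertion without an assert block, such as alphanumeric_only, simply receives an empty parameter dict in this sketch.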

Scaffolding of Assertion Function

This is the content of customized_assertions.py. A custom assertion class has to implement BaseAssertionType and its functions name(), execute(), and validate().
Assertion function scaffolding
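As a rough illustration of the shape such a class might take, the sketch below implements the distinct_count assertion used in the YAML example. It deliberately avoids PipeRider imports so that it runs on its own: the BaseAssertionType here is a local stand-in, and the method signatures, return values, and metric key names are assumptions. Only the three method names come from the text, so compare against the generated customized_assertions.py before adapting it.

```python
from abc import ABC, abstractmethod


class BaseAssertionType(ABC):
    """Local stand-in for PipeRider's base class (signatures are assumed)."""

    @abstractmethod
    def name(self):
        """Return the name used in the assertion YAML."""

    @abstractmethod
    def execute(self, asserts, table, column, metrics):
        """Run the check against profiling metrics."""

    @abstractmethod
    def validate(self, asserts):
        """Check that the YAML parameters are well formed."""


class DistinctCount(BaseAssertionType):
    def name(self):
        return "distinct_count"

    def execute(self, asserts, table, column, metrics):
        # Assumed layout: metrics["tables"][table]["columns"][column]["distinct"]
        col = metrics.get("tables", {}).get(table, {}).get("columns", {}).get(column, {})
        actual = col.get("distinct")
        expected = asserts.get("count")
        # Pass only when the metric exists and matches the expected count.
        return actual is not None and actual == expected

    def validate(self, asserts):
        # Return a list of error messages; an empty list means the input is valid.
        errors = []
        if not isinstance(asserts.get("count"), int):
            errors.append("distinct_count expects an integer 'count'")
        return errors


if __name__ == "__main__":
    sample = {"tables": {"PRICE": {"columns": {"SYMBOL": {"distinct": 505}}}}}
    check = DistinctCount()
    print(check.name(), check.execute({"count": 505}, "PRICE", "SYMBOL", sample))
```

Keeping validate() separate from execute() lets a malformed assert block be reported before any metric is read, which is presumably why the base class asks for both.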