config.yml
Using the PipeRider config.yml file
config.yml is the main PipeRider project configuration file. It contains data sources and related profiling settings.
An example config.yml file for a Postgres project:
config.yml
dataSources:
- name: salesData
  type: postgres
profiler:
#   table:
#     # the maximum row count to profile. (Default unlimited)
#     limit: 1000000
#     duplicateRows: false
# The tables to include/exclude
# includes: []
# excludes: []
# Include views or not
# include_views: true
# tables:
#   my-table-name:
#     # description of the table
#     description: "this is a table description"
#     columns:
#       my-col-name:
#         # description of the column
#         description: "this is a column description"
telemetry:
  id: ABC123
The config.yml file is created when a new project is initialized, and stores the following information for your PipeRider project.
- The name of your data source
- The type of data source e.g. sqlite, postgres etc.
- (sqlite) The path to the database file
- (dbt) the dbt project information (profile, target, project path)
- Telemetry ID for anonymized tracking
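As a rough illustration of the items listed above, a SQLite project's config.yml might look like the sketch below. The `dbpath` key name and the file path are assumptions made for illustration, not values taken from this page; the data source name and telemetry ID reuse the placeholders from the Postgres example.

```yaml
# Hypothetical SQLite project; the dbpath key name and all values are assumptions
dataSources:
- name: salesData
  type: sqlite
  # path to the database file (assumed key name)
  dbpath: data/sales.db
telemetry:
  id: ABC123
```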
If the data source requires credentials, they are stored separately in credentials.yml. For SQLite projects, it is also possible to store the database path in credentials.yml if desired.
The following settings enable you to configure the behavior of the profiler. Uncomment and adjust the settings as required.
Set the maximum number of rows to profile per datasource table. In the following example, a maximum number of 10,000 rows will be profiled:
config.yml
...
profiler:
  table:
    # the maximum row count to profile. (Default unlimited)
    limit: 10000
...
Enable duplicateRows to have the profiler find duplicate rows in each table. It is disabled by default because it can be time-consuming, depending on the dataset.
config.yml
...
profiler:
  table:
    # whether to find duplicate rows; set to true to enable. (Default false)
    duplicateRows: false
...
By default, PipeRider will profile all existing tables. To include or exclude specific tables from profiling, add them to the includes and excludes arrays. PipeRider will profile tables specified in includes and ignore tables specified in excludes.
...
# The tables to include/exclude
includes: [sales, customers]
excludes: [stg_raw_sales]
...
An empty array means no tables are specified. To profile all tables, leave these options commented.
By default, profiling views is not enabled. To allow PipeRider to profile views, uncomment the following line:
include_views: true
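Taken together, the profiling settings described so far could be combined as in the sketch below. The nesting is an assumption reconstructed from the commented template in the generated file: limit and duplicateRows sit under profiler.table, while includes, excludes, and include_views are shown as top-level keys. The table names are the ones used in the earlier example.

```yaml
# Sketch only: key nesting is assumed from the commented template in config.yml
profiler:
  table:
    # profile at most 10,000 rows per table
    limit: 10000
    # leave duplicate-row detection off (the default)
    duplicateRows: false

# tables to profile
includes: [sales, customers]
# tables to ignore
excludes: [stg_raw_sales]

# also profile views (off by default)
include_views: true
```

Because YAML is indentation-sensitive, a key placed at the wrong level is likely to be ignored rather than rejected, so it is worth checking the generated report to confirm a setting took effect.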
Add table and column descriptions which will be shown on your PipeRider report.
config.yml
...
tables:
  sales:
    # description of the table
    description: "Yearly sales figures by platform"
    columns:
      NA_Sales:
        # description of the column
        description: "North American sales figures"
      EU_Sales:
        # description of the column
        description: "EU sales figures"
      Global_sales:
        # description of the column
        description: "Global sales figures"
...
The telemetry id is the anonymous project ID that was created during project initialization.
config.yml
...
telemetry:
  id: abc123