"You know nothing" : can we start from the basics ?
For now, we haven't connected the Store to the database. To connect the Store to the database and start materializing our views and tables, we need to tell DBT where to look for the database credentials and connection information. We do this by creating a dbt_profile.yml
file.
What the heck is a dbt_profile.yml
?
dbt_profile.yml
file saved under ~/.dbt/dbt_profile.yml
to store the connection information to the database. The database's IP and user credential, live tin the dbt_profile.yml
file.The need for a DBT profile file arises from the fact that dbt needs to know where your data is stored, how to connect to it, and other configuration details. By having a centralized configuration file, you can manage these settings in one place and switch between different environments (e.g., development, production) without modifying your dbt models.
Tere are some key elements you might find in a DBT profile file:
- Database Connection Details: It includes information such as the type of database (e.g., BigQuery, Snowflake, Redshift, SQL Serve), host, database name, schema, and authentication details.
- Profiles: You can define multiple profiles within the file, allowing you to switch between different environments or data warehouses easily.
In summary, the DBT profile file is essential for configuring and connecting dbt to your data warehouse, streamlining the data transformation process.
This file is not located in the project itself, because it contains credentials we don't want exposed through git. Instead, it's usually located in the ~/.dbt
folder.
How does DBT know wchich profile to use ?
profile
key, at the top of cssXX.dashboards_store/dbt_project.yml
refers to the name of the profile you want to use from the ~/.dbt/dbt_profile.yml
file.When you run dbt commands in a specific project, dbt determines which profile to use based on the configuration specified in the profiles.yml
file. The profiles.yml
file is typically located in the ~/.dbt/
directory.
Here's how dbt knows which profile to use for a specific project:
- Default Profile:
- If you have a single profile in your
profiles.yml
file, dbt will use that profile by default. This is the case when you only have one environment or data warehouse connection.
- If you have a single profile in your
- Project Configuration:
- In the
dbt_project.yml
file within your dbt project directory, you can specify the profile to use for that specific project using theprofile
key. If this key is not set, dbt will use the default profile from theprofiles.yml
file.
- In the
Example dbt_project.yml
:
name: my_project
version: 1.0
profile: my_custom_profile
In this example, dbt would use the profile named "my_custom_profile" for the "my_project" project.
By providing the profile information in either the DBT_PROFILE
environment variable, the --profile
command-line flag, or the dbt_project.yml
file, you can instruct dbt on which profile to use for a specific project.
What the difference between a profile
and a target
?
In the context of dbt, a profile and a target serve different purposes:
- Profile:
- A dbt profile is a configuration file named profiles.yml that contains information about your data warehouse connection.
- It includes details such as the type of database (e.g., BigQuery, Snowflake, Redshift), host, database name, schema, and authentication details.
- You can define multiple profiles within the profiles.yml file, allowing you to configure connections for different environments (e.g., development, production).
- Profiles are mainly concerned with specifying how dbt connects to a data warehouse.
- The dashboards_store, schould only use one profile
- Target:
- A dbt target is a specific environment or deployment of your dbt project.
- Targets are configurations for running dbt commands in a specific context, such as development or production.
- Targets are often specified using the --target flag when running dbt commands, allowing you to switch between different environments easily.
- Targets can reference specific profiles within the profiles.yml file, allowing you to use different data warehouse connections for different environments.
- In summary, a profile is the overall configuration for connecting dbt to a data warehouse and is defined in the profiles.yml file. A target, on the other hand, is a specific configuration for running dbt in a particular context or environment, and it can reference a specific profile. Targets help you manage and switch between different deployment environments for your dbt project.
Got it. I need a profile. Can you help me get started ?
cssXX.dashboards_store
, you actually created a profiles.yml
sample that's almost ready to use.When you bootstraped the cssXX.dashboards_store
, you were prompted for some information :
- What's your database IP ?
- What's your database port ?
Thoose information were used to render a profiles.yml
file, that you can find under cssXX.dashboards_store/profiles-sample.yml
.
You now need to populate and move your profiles-sample.yml
. You can do so by :
- Adding your user's database password to connect to the database : edit the
profiles-sample.yml
file, and replace thepassword: dontLookAtMeImSecret
with your user's actual password. - Move the
profiles-sample.yml
to~/.dbt/profiles.yml
:
mv profiles-sample.yml ~/.dbt/profiles.yml
~/.dbt/profiles.yml
, just copy the content of the profiles-sample.yml
to your existing ~/.dbt/profiles.yml
.Why do I have three targets in my profiles.yml
?
Developping
You can see a target as an environnement
. Targets allow DBT to handle the different environments you might have. The profiles.yml
's default target is dev
. Dev is configured to use your own name as a schema, and is the target you will mainly use to develop your models. Since it's using your name as a base database schema
every table you materialize and any code modification you do, won't alter other peoples owns materialization of the same project, since their materialization will live in the dev
schema, prefixed by their name. By using targets
, you can all work on the same project, on the same database, without stepping on each other's toes.
Staging and production
The profiles.yml
contains two other profiles :
- staging : The staging profile is used to test your code before deploying it to production. It's a good practice to test your code before deploying it to production, since you don't want to break your production database. The staging profile is configured to use the
dbo
schema in thestore_dev
database. You can use the staging schema to run your code at night, and preview it before actually runing against the production database. - prod : The prod profile is used to deploy your code to production. It's configured to use the
dbo
schema in thestore
database. You schouldn't explicetely use it unless you know what you are doing.
Ideally, your unreleased Power Bi dashboards schould be plugged to the staging
environnement, and your released Power Bi dashboards schould be plugged to the prod
environnement. By doing so, you can test and try some modifications to the dashboards / scripts without actually affecting your end users.