List of FAQs

#1. What other model configurations are there?

You can also configure:

  • tags to support easy categorization and graph selection
  • custom schemas to split your models across multiple schemas
  • aliases if your view/table name should differ from the filename
  • snippets of SQL to run at the start or end of a model, known as hooks
  • warehouse-specific configurations for performance (e.g. sort and dist keys on Redshift, partitions on BigQuery)

Check out the docs on model configurations to learn more.
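
Several of these options can be combined in a single config() block at the top of a model file. A minimal sketch (the model, schema, tag, and role names here are made up for illustration):

```sql
-- models/marketing/customer_rollup.sql (hypothetical model)
{{ config(
    materialized='table',
    schema='marketing',                 -- build into a custom schema
    alias='customers',                  -- table name differs from the filename
    tags=['nightly', 'marketing'],      -- for categorization and node selection
    post_hook='grant select on {{ this }} to reporter'  -- SQL to run after the model builds
) }}

select * from {{ ref('stg_customers') }}
```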

#2. What materializations are available in dbt?

dbt ships with four materializations: view, table, incremental and ephemeral. Check out the documentation on materializations for more information on each of these options.

You can also create your own custom materializations if required; however, this is an advanced feature of dbt.

#3. What tests are available for me to use in dbt?

Out of the box, dbt ships with the following tests:

  • unique
  • not_null
  • accepted_values
  • relationships (i.e. referential integrity)

You can also write your own custom schema tests.

Some additional custom schema tests have been open-sourced in the dbt-utils package; check out the docs on packages to learn how to make these tests available in your project.
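
These tests are attached to columns in a .yml file. A sketch, using illustrative model and column names:

```yaml
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
      - name: customer_id
        tests:
          - relationships:
              to: ref('customers')
              field: id
```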

#4. What is a beta release?

This is a chance to try out brand-new functionality. You get to start planning for use cases that the next minor version will unlock. We get to hear from you about unexpected behavior and nasty bugs, so that the release candidate has more polish and fewer surprises.

#5. How can I set up the right permissions in BigQuery?

To use this functionality, first create the service account you want to impersonate. Then grant users that you want to be able to impersonate this service account the roles/iam.serviceAccountTokenCreator role on the service account resource. Then, you also need to grant the service account the same role on itself. This allows it to create short-lived tokens identifying itself, and allows your human users (or other service accounts) to do the same. More information on this scenario is available here.

Once you've granted the appropriate permissions, you'll need to enable the IAM Service Account Credentials API. Enabling the API and granting the role are eventually consistent operations: they can take up to 7 minutes to complete, but usually propagate within 60 seconds. Give it a few minutes, then add the impersonate_service_account option to your BigQuery profile configuration.
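
For reference, a BigQuery profile using impersonation might look roughly like this (the profile, project, dataset, and service account names are placeholders):

```yaml
my-bigquery-profile:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: my-gcp-project
      dataset: dbt_dev
      threads: 4
      impersonate_service_account: dbt-runner@my-gcp-project.iam.gserviceaccount.com
```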

#6. Why would I want to impersonate a service account?

You may want your models to be built using a dedicated service account that has elevated access to read or write data in the specified project or dataset. Typically, this requires you to create a service account key for running in development or on your CI server. By specifying the email address of the service account you want to build models as, you can use Application Default Credentials or the service's configured service account (when running in GCP) to assume the identity of the service account with elevated permissions.

This allows you to reap the advantages of using federated identity for developers (via ADC) without needing to grant individuals access to read and write data directly, and without needing to create separate service accounts and keys for each user. It also allows you to completely eliminate the need for service account keys in CI, as long as your CI is running on GCP (Cloud Build, Jenkins, GitLab/GitHub runners, etc.).

#7. How do I build one seed at a time?

As of v0.16.0, you can use a --select option with the dbt seed command, like so:

$ dbt seed --select country_codes

There is also an --exclude option.

Check out more in the model selection syntax documentation.

Prior to v0.16.0, there was no way to build one seed at a time.

#8. How can I see the SQL that dbt is running?

To check out the SQL that dbt is running, you can look in:

  • dbt Cloud:
    • Within the run output, click on a model name, and then select "Details"
  • dbt CLI:
    • The target/compiled/ directory for compiled select statements
    • The target/run/ directory for compiled create statements
    • The logs/dbt.log file for verbose logging.

#9. What is the difference between dbt Core, the dbt CLI and dbt Cloud?

dbt Core is the software that takes a dbt project (.sql and .yml files) and a command and then creates tables/views in your warehouse. dbt Core includes a command line interface (CLI) so that users can execute dbt commands using a terminal program. dbt Core is open source and free to use.

dbt Cloud is an application that helps teams use dbt. dbt Cloud provides a web-based IDE to develop dbt projects, a purpose-built scheduler, and a way to share dbt documentation with your team. dbt Cloud offers a number of features for free, as well as additional features in paid tiers (check out the pricing here).

#10. Can I store my seeds in a directory other than the `data` directory in my project?

By default, dbt expects your seed files to be located in the data subdirectory of your project.

To change this, update the data-paths configuration in your dbt_project.yml file, like so:

data-paths: ["seeds"]

#11. Can I store my data tests in a directory other than the `test` directory in my project?

By default, dbt expects your singular test files to be located in the tests subdirectory of your project.

To change this, update the test-paths configuration in your dbt_project.yml file, like so:

test-paths: ["data-tests"]

#12. Can I store my models in a directory other than the `models` directory in my project?

By default, dbt expects your model files to be located in the models subdirectory of your project.

To change this, update the source-paths configuration in your dbt_project.yml file, like so:

source-paths: ["transformations"]

#13. Can I store my snapshots in a directory other than the `snapshot` directory in my project?

By default, dbt expects your snapshot files to be located in the snapshots subdirectory of your project.

To change this, update the snapshot-paths configuration in your dbt_project.yml file, like so:

snapshot-paths: ["archives"]

Note that you cannot co-locate snapshots and models in the same directory.

#14. Can I connect my dbt project to two databases?

The meaning of the term 'database' varies across the major data warehouses. Hence, the answer to "can a dbt project connect to more than one database?" depends on the warehouse in your tech stack.

  • dbt projects connecting to warehouses like Snowflake or BigQuery—these empower one set of credentials to draw from all datasets or 'projects' available to an account—are sometimes said to connect to more than one database.
  • dbt projects connecting to warehouses like Redshift and Postgres—these tie one set of credentials to one database—are said to connect to one database only.

Sidestep the 'one database problem' by relying on ELT thinking (i.e. extract -> load -> transform). Remember, dbt is not a loader: with few exceptions, it doesn't move data from sources to a warehouse. dbt is a transformer. It enters the picture after extractors and loaders have funneled sources into a warehouse. It moves and combines data inside the warehouse itself.

Hence, instead of thinking "how do I connect my dbt project to two databases", ask "what loader services will best prepare our warehouse for dbt transformations."

For more on the modern 'ELT-powered' data stack, see the "dbt and the modern BI stack" section of this dbt blog post.

#15. Do I need to create my target schema before running dbt?

Nope! dbt will check if the schema exists when it runs. If the schema does not exist, dbt will create it for you.

#16. How do I create dependencies between models?

When you use the ref function, dbt automatically infers the dependencies between models.

For example, consider a model, customer_orders, like so:

select
  customer_id,
  min(order_date) as first_order_date,
  max(order_date) as most_recent_order_date,
  count(order_id) as number_of_orders
from {{ ref('stg_orders') }}
group by 1

There's no need to explicitly define these dependencies. dbt will understand that the stg_orders model needs to be built before the above model (customer_orders). When you execute dbt run, you will see these being built in order:

$ dbt run
Running with dbt=0.16.0
Found 2 models, 28 tests, 0 snapshots, 0 analyses, 130 macros, 0 operations, 0 seed files, 3 sources
11:42:52 | Concurrency: 8 threads (target='dev_snowflake')
11:42:52 |
11:42:52 | 1 of 2 START view model dbt_claire.stg_jaffle_shop__orders........... [RUN]
11:42:55 | 1 of 2 OK creating view model dbt_claire.stg_jaffle_shop__orders..... [CREATE VIEW in 2.50s]
11:42:55 | 2 of 2 START view model dbt_claire.customer_orders................... [RUN]
11:42:56 | 2 of 2 OK creating view model dbt_claire.customer_orders............. [CREATE VIEW in 0.60s]
11:42:56 | Finished running 2 view models in 15.13s.

To get some practice with this, we recommend you complete the tutorial to build your first dbt project.

#17. Can I set test failure thresholds?

This is not currently supported in dbt natively.

Instead, consider:

#18. What privileges does my database user need to use dbt?

Your user will need to be able to:

  • select from raw data in your warehouse (i.e. data to be transformed)
  • create schemas, and therefore create tables/views within that schema¹
  • read system views to generate documentation (i.e. views in information_schema)

On Postgres, Redshift, and Snowflake, use a series of grants to ensure that your user has the correct privileges.

On BigQuery, use the "BigQuery User" role to assign these privileges.

¹Alternatively, a separate user can create a schema for the dbt user, and then grant the user privileges to create within this schema. We generally recommend granting your dbt user the ability to create schemas, as it is less complicated to implement.

#19. What parts of Jinja are dbt-specific?

There are certain expressions that are specific to dbt — these are documented in the Jinja function reference section of these docs. Further, docs blocks, snapshots, and materializations are custom Jinja blocks that exist only in dbt.

#20. How do I debug my Jinja?

You should get familiar with checking the compiled SQL in target/compiled/<your_project>/ and the logs in logs/dbt.log to see what dbt is running behind the scenes.

You can also use the log function to debug Jinja by printing objects to the command line.
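
For example, you could inspect a variable while a model compiles (the variable and values here are illustrative):

```sql
{% set payment_methods = ['bank_transfer', 'credit_card', 'gift_card'] %}
{% do log("payment_methods: " ~ payment_methods, info=true) %}

select 1 as placeholder
```

Running dbt compile (or dbt run) will print the logged message to your terminal.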

#21. How do I define a column type?

Your warehouse's SQL engine automatically assigns a datatype to every column, whether it's found in a source or a model. To force SQL to treat a column as a certain datatype, use cast functions:

select
  cast(order_id as integer),
  cast(order_price as double(6,2)) -- a more generic way of doing type conversion
from {{ ref('stg_orders') }}

Many modern data warehouses also support :: syntax as a shorthand for cast(<column> as <type>):

select
  order_price::numeric(6,2) -- you might find this in Redshift, Snowflake, and Postgres
from {{ ref('stg_orders') }}

Be warned: reading in data and casting it may not always yield the expected results, and every warehouse has its own subtleties. Certain casts may not be allowed (e.g. on BigQuery, you can't cast a boolean-type value to a float64). Casts that involve a loss of precision (e.g. float to integer) rely on your SQL engine to make a best guess or follow rules specific to that engine. When performing casts, it's imperative that you are familiar with your warehouse's casting rules so you can best label fields in your sources and models.

Thankfully, popular database services tend to document their types: see the Redshift and BigQuery type docs.

#22. [Error] Could not find package 'my_project'

If a package name is included in the search_order of a project-level dispatch config, dbt expects that package to contain macros which are viable candidates for dispatching. If an included package does not contain any macros, dbt will raise an error like:

Compilation Error
In dispatch: Could not find package 'my_project'

This does not mean the package or root project is missing—it means it contains no macros, and so it is excluded from the search spaces available to dispatch.

#23. Can I render docs for multiple projects?

Yes! To do this, you'll need to create a "super project" that lists each project as a dependent package in a packages.yml file. Then run dbt deps to install the projects as packages, prior to running dbt docs generate.

If you are going down the route of multiple projects, be sure to check out our advice on the topic.

#24. Do I need to add a yaml entry for a column for it to appear in the docs site?

Fortunately, no!

dbt will introspect your warehouse to generate a list of columns in each relation, and match it with the list of columns in your .yml files. As such, any undocumented columns will still appear in your documentation!

#25. Can I document things other than models, like sources, seeds, and snapshots?

Yes! You can document almost everything in your project using the description: key. Check out the reference docs on descriptions for more info!

#26. How do I document macros?


To document macros, use a schema file and nest the configurations under a macros: key, like so:

version: 2

macros:
  - name: cents_to_dollars
    description: A macro to convert cents to dollars
    arguments:
      - name: column_name
        type: string
        description: The name of the column you want to convert
      - name: precision
        type: integer
        description: Number of decimal places. Defaults to 2.

#27. Are there any example dbt projects?


  • Getting Started Tutorial: You can build your own example dbt project in the Getting Started Tutorial
  • Jaffle Shop: A demonstration project (closely related to the tutorial) for a fictional ecommerce store (source code)
  • MRR Playbook: A demonstration project that models subscription revenue (source code, docs)
  • Attribution Playbook: A demonstration project that models marketing attribution (source code, docs)
  • GitLab: GitLab's internal dbt project is open source and is a great example of how to use dbt at scale (source code, docs)

If you have an example project to add to this list, suggest an edit.

#28. How do I exclude a table from a freshness snapshot?

Some tables in a data source may be updated infrequently. If you've set a freshness property at the source level, this table is likely to fail checks.

To work around this, you can set the table's freshness to null (freshness: null) to "unset" the freshness for a particular table:

version: 2

sources:
  - name: jaffle_shop
    database: raw
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    loaded_at_field: _etl_loaded_at
    tables:
      - name: orders
      - name: product_skus
        freshness: null # do not check freshness for this table

#29. What happens if one of my runs fails?

If you're using dbt Cloud, we recommend setting up email and Slack notifications (Account Settings > Notifications) for any failed runs. Then, debug these runs the same way you would debug any runs in development.

#30. One of my tests failed, how can I debug it?

To debug a failing test, find the SQL that dbt ran by:

  • dbt Cloud:
    • Within the test output, click on the failed test, and then select "Details"
  • dbt CLI:
    • Open the file path returned as part of the error message.
    • Navigate to the target/compiled/schema_tests directory for all compiled test queries

Copy the SQL into a query editor (in dbt Cloud, you can paste it into a new Statement), and run the query to find the records that failed.
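
As a rough illustration, the compiled query for a unique test looks something like this (schema, table, and column names are placeholders, and the exact SQL varies by dbt version):

```sql
select count(*) as validation_errors
from (
    select order_id
    from analytics.orders
    where order_id is not null
    group by order_id
    having count(*) > 1
) validation_errors
```

Running just the inner subquery shows you the duplicated values that caused the failure.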

#31. The columns of my seed changed, and now I get an error when running the `seed` command, what should I do?

If you changed the columns of your seed, you may get a Database Error:

$ dbt seed
Running with dbt=0.16.0-rc2
Found 0 models, 0 tests, 0 snapshots, 0 analyses, 130 macros, 0 operations, 1 seed file, 0 sources
12:12:27 | Concurrency: 8 threads (target='dev_snowflake')
12:12:27 |
12:12:27 | 1 of 1 START seed file dbt_claire.country_codes...................... [RUN]
12:12:30 | 1 of 1 ERROR loading seed file dbt_claire.country_codes.............. [ERROR in 2.78s]
12:12:31 |
12:12:31 | Finished running 1 seed in 10.05s.
Completed with 1 error and 0 warnings:
Database Error in seed country_codes (data/country_codes.csv)
000904 (42000): SQL compilation error: error line 1 at position 62
invalid identifier 'COUNTRY_NAME'

In this case, you should rerun the command with a --full-refresh flag, like so:

dbt seed --full-refresh

Why is this the case?

Typically, when you run dbt seed, dbt truncates the existing table and reinserts the data. This pattern avoids a drop cascade command, which may cause downstream objects (that your BI users might be querying!) to be dropped.

However, when column names are changed, or new columns are added, these statements will fail as the table structure has changed.

The --full-refresh flag will force dbt to drop cascade the existing table before rebuilding it.
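
Conceptually, the two code paths differ like this (schema, table, and column names are placeholders, not the SQL dbt literally emits):

```sql
-- dbt seed: reuse the existing table structure
truncate table analytics.country_codes;
insert into analytics.country_codes (country_code, country_name)
values ('AU', 'Australia');

-- dbt seed --full-refresh: rebuild the table from scratch
drop table if exists analytics.country_codes cascade;
create table analytics.country_codes (country_code varchar, country_name varchar);
```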

#33. If models can only be `select` statements, how do I insert records?

For those coming from an ETL (Extract Transform Load) paradigm, there's often a desire to write transformations as insert and update statements. In comparison, dbt will wrap your select query in a create table as statement, which can feel counterintuitive.

  • If you wish to use insert statements for performance reasons (i.e. to reduce the data that is processed), consider incremental models
  • If you wish to use insert statements because your source data is constantly changing (e.g. to create "Type 2 Slowly Changing Dimensions"), consider snapshotting your source data, and building models on top of your snapshots.

#34. My compiled SQL has a lot of spaces and new lines, how can I get rid of them?

This is known as "whitespace control".

Use a minus sign (-, e.g. {{- ... -}}, {%- ... %}, {#- ... -#}) at the start or end of a block to strip whitespace before or after the block (more docs here). Check out the tutorial on using Jinja for an example.

Take caution: it's easy to fall down a rabbit hole when it comes to whitespace control!
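
A minimal before-and-after sketch (the table and column names are illustrative):

```sql
-- without whitespace control, the set statement leaves a blank line in the compiled SQL
{% set payment_method = 'credit_card' %}
select * from payments where method = '{{ payment_method }}'

-- with whitespace control, the set statement compiles away cleanly
{%- set payment_method = 'credit_card' -%}
select * from payments where method = '{{ payment_method }}'
```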

#35. How do I preserve leading zeros in a seed?

If you need to preserve leading zeros (for example in a zipcode or mobile number):

  1. v0.16.0 onwards: Include leading zeros in your seed file, and use the column_types configuration with a varchar datatype of the correct length.
  2. Prior to v0.16.0: Use a downstream model to pad the leading zeros using SQL, for example: lpad(zipcode, 5, '0')
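
For option 1, the configuration might look like this in your dbt_project.yml (the project and seed names are illustrative):

```yaml
seeds:
  jaffle_shop:
    warehouse_locations:
      column_types:
        zipcode: varchar(5)
```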

#36. Can I use seeds to load raw data?

Seeds should not be used to load raw data (for example, large CSV exports from a production database).

Since seeds are version controlled, they are best suited to files that contain business-specific logic, for example a list of country codes or user IDs of employees.

Loading CSVs using dbt's seed functionality is not performant for large files. Consider using a different tool to load these CSVs into your data warehouse.

#37. How do I load data into my warehouse?

dbt assumes that you already have a copy of your data in your data warehouse. We recommend you use an off-the-shelf tool like Stitch or Fivetran to get data into your warehouse.

Can dbt be used to load data?

No, dbt does not extract or load data. It focuses on the transformation step only.

#38. How do I write long-form explanations in my descriptions?

If you need more than a sentence to explain a model, you can:

  1. Split your description over multiple lines (yaml docs), like so:

version: 2

models:
  - name: customers
    description: >
      Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
      tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
      quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo

  2. Use a docs block to write the description in a Markdown file.

#39. Can I build my models in a schema other than my target schema?

Or: Can I split my models across multiple schemas?

Yes! Use the schema configuration in your dbt_project.yml file, or use a config block:

name: jaffle_shop

models:
  jaffle_shop:
    mapping:
      schema: marketing # models in the `models/mapping/` subdirectory will use the marketing schema

#40. Should I use separate files to declare resource properties, or one large file?

It's up to you:

  • Some folks find it useful to have one file per model (or source / snapshot / seed etc)
  • Some find it useful to have one per directory, documenting and testing multiple models in one file

Choose what works for your team. We have more recommendations in our guide on structuring dbt projects.

#41. Do all my tests go in one file?

No! You can use as many files as you want! Some folks find it useful to have one file per model, we tend to have one per directory.

#42. Why are there "prerelease" docs?

We want to give beta testers the information they need to try out new features, without adding confusion to the current docs site. This is served from a long-lived next branch of the docs repo. Feedback on prerelease docs is also welcome—use the "Edit this page" feature at the bottom.

#43. Can I use environment variables in my profile?

Yes! Check out the docs on environment variables for more information.

#44. What should I name my profile?

We typically use a company name for a profile name, and then use targets to differentiate between dev and prod. Check out the docs on managing environments for more information.

#45. What should I name my dbt project?

The name of your company (in snake_case) often makes for a good project name.

#46. Can I add tests and descriptions in a config block?

In v0.21, dbt added the ability to define node configs in .yml files, in addition to config() blocks and dbt_project.yml. But the reverse isn't true for every property: some things can only be defined in .yml files.

Certain properties are special, because:

  • They have a unique Jinja rendering context
  • They create new project resources
  • They don't make sense as hierarchical configuration
  • They're older properties that haven't yet been redefined as configs

These properties are:

#47. Why do I need to quote column names in Jinja?

In the macro example, we passed the column name amount in quotes:

{{ cents_to_dollars('amount') }} as amount_usd

We have to use quotes to pass the string 'amount' to the macro.

Without the quotes, the Jinja parser will look for a variable named amount. Since this doesn't exist, it will compile to nothing.

Quoting in Jinja can take a while to get used to! The rule is that if you're within a Jinja expression or statement (i.e. within {% ... %} or {{ ... }}), you'll need to use quotes for any arguments that are strings.

Single and double quotes are equivalent in Jinja – just make sure you match them appropriately.

And if you do need to pass a variable as an argument, make sure you don't nest your curlies.
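
Putting these rules together, using the cents_to_dollars macro from above:

```sql
-- correct: 'amount' is passed as a string
{{ cents_to_dollars('amount') }} as amount_usd

-- incorrect: Jinja looks for an (undefined) variable named amount
{{ cents_to_dollars(amount) }} as amount_usd

-- passing a variable as the argument: no nested curlies inside the expression
{% set my_column = 'amount' %}
{{ cents_to_dollars(my_column) }} as amount_usd
```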

#48. What tests should I add to my project?

We recommend that every model has a test on a primary key, that is, a column that is unique and not_null.

We also recommend that you test any assumptions on your source data. For example, if you believe that your payments can only be one of three payment methods, you should test that assumption regularly — a new payment method may introduce logic errors in your SQL.

In advanced dbt projects, we recommend using sources and running these source data-integrity tests against the sources rather than models.

#49. How can I reference models or macros in another project?

You can use packages to add another project to your dbt project, including other projects you've created. Check out the docs for more information!

#50. How do I remove deleted models from my data warehouse?

If you delete a model from your dbt project, dbt does not automatically drop the relation from your schema. This means that you can end up with extra objects in schemas that dbt creates, which can be confusing to other users.

(This can also happen when you switch a model from being a view or table to ephemeral.)

When you remove models from your dbt project, you should manually drop the related relations from your schema.
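
For example (the schema and relation names are placeholders; check whether the relation is a view or a table before dropping it):

```sql
drop view if exists analytics.my_old_model cascade;
-- or, if the model was materialized as a table:
drop table if exists analytics.old_customer_rollup cascade;
```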

#51. If I can name these files whatever I'd like, what should I name them?

It's up to you! Here's a few options:

  • Default to the existing terminology: schema.yml (though this does make it hard to find the right file over time)
  • Use the same name as your directory (assuming you're using sensible names for your directories)
  • If you test and document one model (or seed, snapshot, macro etc.) per file, you can give it the same name as the model (or seed, snapshot, macro etc.)

Choose what works for your team. We have more recommendations in our guide on structuring dbt projects.

#52. How do I run models downstream of a seed?

You can run models downstream of a seed using the model selection syntax, and treating the seed like a model.

For example, the following would run all models downstream of a seed named country_codes:

$ dbt run --select country_codes+

#53. If I rerun dbt, will there be any downtime as models are rebuilt?

Nope! The SQL that dbt generates behind the scenes ensures that any relations are replaced atomically (i.e. your business users won't experience any downtime).

The implementation of this varies on each warehouse, check out the logs to see the SQL dbt is executing.

#54. How do I run one model at a time?

To run one model, use the --select flag (or -s flag), followed by the name of the model:

$ dbt run --select customers

Check out the model selection syntax documentation for more operators and examples.

#55. How do I run one snapshot at a time?

To run one snapshot, use the --select flag, followed by the name of the snapshot:

$ dbt snapshot --select order_snapshot

Check out the model selection syntax documentation for more operators and examples.

#56. How do I run models downstream of one source?

To run models downstream of a source, use the source: selector:

$ dbt run --select source:jaffle_shop+

(You can also use the -s shorthand here instead of --select)

To run models downstream of one source table:

$ dbt run --select source:jaffle_shop.orders+

Check out the model selection syntax for more examples!

#57. What should my profiles.yml file look like for my warehouse?

The structure of a profile looks different on each warehouse. Check out the supported databases page, and navigate to the Profile Setup section for your warehouse.

#58. Does my `.yml` file containing tests and descriptions need to be named `schema.yml`?

No! You can name this file whatever you want (including whatever_you_want.yml), so long as:

  • The file is in your models/ directory¹
  • The file has .yml extension

Check out the docs for more information.

¹If you're declaring properties for seeds, snapshots, or macros, you can also place this file in the related directory — data/, snapshots/ and macros/ respectively.

#59. Can I build my seeds in a schema other than my target schema?

Or: Can I split my seeds across multiple schemas?

Yes! Use the schema configuration in your dbt_project.yml file.

name: jaffle_shop

seeds:
  jaffle_shop:
    schema: mappings # all seeds in this project will use the mappings schema by default
    mapping:
      schema: marketing # seeds in the `data/mapping/` subdirectory will use the marketing schema

#60. How do I set a datatype for a column in my seed?

dbt will infer the datatype for each column based on the data in your CSV.

You can also explicitly set a datatype using the column_types configuration like so:

seeds:
  jaffle_shop: # you must include the project name
    country_codes: # the name of the seed
      column_types:
        zipcode: varchar(5)

#61. Do hooks run with seeds?

Yes! The following hooks are available:

Configure these in your dbt_project.yml file.

#62. Why are profiles stored outside of my project?

Profiles are stored separately to dbt projects to avoid checking credentials into version control. Database credentials are extremely sensitive information and should never be checked into version control.

#63. How do I share my documentation with my team members?

If you're using dbt Cloud to deploy your project and have the Team Plan, you can have up to 50 read-only users, who will be able to access the documentation for your project.

#64. How often should I run the snapshot command?

Snapshots are a batch-based approach to change data capture. The dbt snapshot command must be run on a schedule to ensure that changes to tables are actually recorded! While individual use-cases may vary, snapshots are intended to be run between hourly and daily. If you find yourself snapshotting more frequently than that, consider if there isn't a more appropriate way to capture changes in your source data tables.

#65. Are the results of freshness stored anywhere?


The snapshot-freshness command will output a pass/warning/error status for each table selected in the freshness snapshot.

Additionally, dbt will write the freshness results to a file in the target/ directory, called sources.json by default. To override this destination, use the -o flag with the snapshot-freshness command.

#66. Do hooks run with snapshots?

Yes! The following hooks are available for snapshots:

#67. What happens if I add new columns to my snapshot query?

When the columns of your source query change, dbt will attempt to reconcile this change in the destination snapshot table. dbt does this by:

  1. Creating new columns from the source query in the destination table
  2. Expanding the size of string types where necessary (e.g. varchars on Redshift)

dbt will not delete columns in the destination snapshot table if they are removed from the source query. It will also not change the type of a column beyond expanding the size of varchar columns. That is, if a string column is changed to a date column in the snapshot source query, dbt will not attempt to change the type of the column in the destination table.

#68. Why is there only one `target_schema` for snapshots?

Snapshots build into the same target_schema, no matter who is running them.

In comparison, models build into a separate schema for each user — this helps maintain separate development and production environments.

So, why the difference?

Let's assume you are running your snapshot regularly. If the model had a different target in dev (e.g. dbt_claire) compared to prod (e.g. analytics), when you ref the model in dev, dbt would select from a snapshot that has not been run regularly. This can make it hard to build models since the data differs from prod.

Instead, in the models that ref your snapshots, it makes more sense to select from the production version of your snapshot, even when developing models. In this way, snapshot tables are more similar to source data than they are to proper dbt models.

For this reason, there is only one target_schema, which is not environment-aware by default.

However, this can create problems if you need to run a snapshot command when developing your models, or during a CI run. Fortunately, there are a few workarounds — check out this Discourse article.

#69. How do I snapshot freshness for one source only?

Use the --select flag to snapshot freshness for specific sources. Eg:

# Snapshot freshness for all jaffle_shop tables:
$ dbt source freshness --select jaffle_shop
# Snapshot freshness for a particular source table:
$ dbt source freshness --select jaffle_shop.orders
# Snapshot freshness for multiple particular source tables:
$ dbt source freshness --select jaffle_shop.orders jaffle_shop.customers

See the source freshness command reference for more information.

#70. What if my source is in a poorly named schema or table?

By default, dbt will use the name: parameter to construct the source reference.

If these names are a little less-than-perfect, use the schema and identifier properties to define the names as per the database, and use your name: property for the name that makes sense!

version: 2

sources:
  - name: jaffle_shop
    schema: postgres_backend_public_schema
    database: raw
    tables:
      - name: orders
        identifier: api_orders

In a downstream model:

select * from {{ source('jaffle_shop', 'orders') }}

Will get compiled to:

select * from raw.postgres_backend_public_schema.api_orders

#71. What if my source is in a different database to my target database?

Use the database property to define the database that the source is in.

version: 2

sources:
  - name: jaffle_shop
    database: raw
    tables:
      - name: orders
      - name: customers

#72. I need to use quotes to select from my source, what should I do?

This is reasonably common on Snowflake in particular.

By default, dbt will not quote the database, schema, or identifier for the source tables that you've specified.

To force dbt to quote one of these values, use the quoting property:

version: 2

sources:
  - name: jaffle_shop
    database: raw
    quoting:
      database: true
      schema: true
      identifier: true
    tables:
      - name: order_items
      - name: orders
        # This overrides the `jaffle_shop` quoting config
        quoting:
          identifier: false
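With a config like this, a hedged sketch of what the source references would compile to:

```sql
-- {{ source('jaffle_shop', 'order_items') }} compiles to:
select * from "raw"."jaffle_shop"."order_items"

-- {{ source('jaffle_shop', 'orders') }} compiles to (identifier quoting
-- overridden to false, so the table name is left unquoted):
select * from "raw"."jaffle_shop".orders
```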

#73. How do I specify column types?

Simply cast the column to the correct type in your model:

select
    id,
    created::timestamp as created
from some_other_table

You might have this question if you're used to running statements like this:

create table dbt_alice.my_table (
    id integer,
    created timestamp
);

insert into dbt_alice.my_table (
    select id, created from some_other_table
);

In comparison, dbt would build this table using a create table as statement:

create table dbt_alice.my_table as (
    select id, created from some_other_table
);

So long as your model queries return the correct column type, the table you create will also have the correct column type.

To define additional column options:

  • Rather than enforcing uniqueness and not-null constraints on your column, use dbt's testing functionality to check that your assertions about your model hold true.
  • Rather than creating default values for a column, use SQL to express defaults (e.g. coalesce(updated_at, current_timestamp()) as updated_at)
  • In edge-cases where you do need to alter a column (e.g. column-level encoding on Redshift), consider implementing this via a post-hook.
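For the last case, a minimal sketch of a post-hook that sets a column encoding on Redshift (the model, column, and encoding names are illustrative):

```sql
-- models/orders.sql (illustrative model)
{{ config(
    materialized='table',
    post_hook="alter table {{ this }} alter column order_id encode az64"
) }}

select order_id, created_at
from {{ ref('stg_orders') }}
```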

#74. Which SQL dialect should I write my models in?


Which SQL dialect does dbt use?

dbt can feel like magic, but it isn't actually magic. Under the hood, it's running SQL in your own warehouse — your data is not processed outside of your warehouse.

As such, your models should just use the SQL dialect of your own database. Then, when dbt wraps your select statements in the appropriate DDL or DML, it will generate the correct statements for your warehouse; all of this logic is built into dbt.

You can find more information about the databases, platforms, and query engines that dbt supports in the Supported Adapters docs.

Want to go a little deeper on how this works? Consider a snippet of SQL that works on each warehouse:

select 1 as my_column

To replace an existing table, here's an illustrative example of the SQL dbt will run on different warehouses (the actual SQL can get much more complicated than this!)

-- you can't `create or replace` on Redshift, so use a transaction
-- to swap the tables atomically
begin;
create table "dbt_alice"."test_model__dbt_tmp" as (
    select 1 as my_column
);
alter table "dbt_alice"."test_model" rename to "test_model__dbt_backup";
alter table "dbt_alice"."test_model__dbt_tmp" rename to "test_model";
commit;
drop table if exists "dbt_alice"."test_model__dbt_backup" cascade;
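For comparison, on a warehouse that supports atomic replacement natively (e.g. Snowflake), the same operation is far simpler; an illustrative sketch:

```sql
create or replace table "dbt_alice"."test_model" as (
    select 1 as my_column
);
```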

#75. What happens if the SQL in my query is bad?


I got a Database Error, what does that mean?

If there's a mistake in your SQL, dbt will return the error that your database returns.

$ dbt run --select customers
Running with dbt=0.15.0
Found 3 models, 9 tests, 0 snapshots, 0 analyses, 133 macros, 0 operations, 0 seed files, 0 sources
14:04:12 | Concurrency: 1 threads (target='dev')
14:04:12 |
14:04:12 | 1 of 1 START view model dbt_alice.customers.......................... [RUN]
14:04:13 | 1 of 1 ERROR creating view model dbt_alice.customers................. [ERROR in 0.81s]
14:04:13 |
14:04:13 | Finished running 1 view model in 1.68s.
Completed with 1 error and 0 warnings:
Database Error in model customers (models/customers.sql)
Syntax error: Expected ")" but got identifier `grand-highway-265418` at [13:15]
compiled SQL at target/run/jaffle_shop/customers.sql

Any models downstream of this model will also be skipped. Use the error message and the compiled SQL to debug any errors.

#76. How should I structure my project?

There's no one best way to structure a project! Every organization is unique.

If you're just getting started, check out how we (dbt Labs) structure our dbt projects.

#77. What should I name my target?

We typically use targets to differentiate between development and production runs of dbt, naming the targets dev and prod respectively. Check out the docs on managing environments for more information.
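As a sketch, a profiles.yml using these target names (the adapter type, hosts, and credentials are illustrative; the part that matters here is the differing schema per target):

```yaml
jaffle_shop:
  target: dev  # the default target when you run dbt locally
  outputs:
    dev:
      type: postgres             # illustrative adapter
      host: localhost
      user: alice
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: analytics
      schema: dbt_alice          # developer-specific schema
      threads: 4
    prod:
      type: postgres
      host: prod-db.example.com  # illustrative host
      user: dbt_prod
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: analytics
      schema: analytics          # shared production schema
      threads: 8
```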

#78. How do I test one model at a time?

Running tests on one model looks very similar to running a model: use the --select flag (or -s flag), followed by the name of the model:

dbt test --select customers

Check out the model selection syntax documentation for full syntax, and test selection examples in particular.

#79. How do I run tests on sources only?

It is possible! You need to use the source: selection method:

$ dbt test --select source:*

Check out the model selection syntax documentation for more operators and examples.

#80. How do I test and document seeds?


To test and document seeds, use a schema file and nest the configurations under a seeds: key


version: 2

seeds:
  - name: country_codes
    description: A mapping of two letter country codes to country names
    columns:
      - name: country_code
        tests:
          - unique
          - not_null
      - name: country_name
        tests:
          - unique
          - not_null

#81. How do I run tests on just my sources?

To run tests on all sources, use the following command:

$ dbt test --select source:*

(You can also use the -s shorthand here instead of --select)

To run tests on one source (and all of its tables):

$ dbt test --select source:jaffle_shop

And, to run tests on one source table only:

$ dbt test --select source:jaffle_shop.orders

Yep, we know this syntax is a little less than ideal, so we're hoping to improve it in a future release. Check out the model selection syntax for more examples!

#82. Do model names need to be unique?

Yes! To build dependencies between models, you need to use the ref function. The ref function only takes one argument — the model name (i.e. the filename). As a result, these model names need to be unique, even if they are in distinct folders.

Often, this question comes up because users want to give two models the same name in their warehouse, splitting them across separate schemas (e.g. stripe.users and app.users). Check out the docs on custom aliases and custom schemas to achieve this.
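As a sketch, two model files with unique filenames that both build a relation called users (filenames and schema names are illustrative; note that dbt's default behavior prefixes custom schemas with your target schema, so you may also need to override generate_schema_name):

```sql
-- models/stripe_users.sql (illustrative filename)
{{ config(alias='users', schema='stripe') }}
select * from {{ source('stripe', 'users') }}
```

```sql
-- models/app_users.sql (illustrative filename)
{{ config(alias='users', schema='app') }}
select * from {{ source('app', 'users') }}
```

Downstream models still reference them by their unique filenames, e.g. {{ ref('stripe_users') }}.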

#83. Can I test the uniqueness of two columns?

Yes! There are a few different options.

Consider an orders table that contains records from multiple countries, and the combination of ID and country code is unique:


Here are some approaches:

1. Create a unique key in the model and test that

In your model:

country_code || '-' || order_id as surrogate_key,

In your .yml file:

version: 2

models:
  - name: orders
    columns:
      - name: surrogate_key
        tests:
          - unique

2. Test an expression

version: 2

models:
  - name: orders
    tests:
      - unique:
          column_name: "(country_code || '-' || order_id)"

3. Use the dbt_utils.unique_combination_of_columns test

This is especially useful for large datasets since it is more performant. Check out the docs on packages for more information.

version: 2

models:
  - name: orders
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - country_code
            - order_id

#84. I got an "unused model configurations" error message, what does this mean?

You might have forgotten to nest your configurations under your project name, or you might be trying to apply configurations to a directory that doesn't exist. Check out this article to understand more.

#85. When should I run my tests?

You should run your tests whenever you are writing new code (to ensure you haven't broken any existing models by changing SQL), and whenever you run your transformations in production (to ensure that your assumptions about your source data are still valid).

#86. Which docs should I use when writing Jinja or creating a macro?

If you are stuck with a Jinja issue, it can get confusing where to check for more information. We recommend you check (in order):

  1. Jinja's Template Designer Docs: This is the best reference for most of the Jinja you'll use
  2. Our Jinja function reference: This documents any additional functionality we've added to Jinja in dbt.
  3. Agate's table docs: If you're operating on the result of a query, dbt will pass it back to you as an agate table. This means that the methods you call on the table belong to the Agate library rather than Jinja or dbt.
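As an illustrative sketch of point 3, a macro that runs a query and then calls agate methods on the result (the macro name and query are hypothetical):

```sql
{% macro log_order_statuses() %}
  {% set results = run_query("select distinct status from " ~ ref('orders')) %}
  {% if execute %}
    {# `results` is an agate.Table; `columns[0].values()` is an agate
       method, not a Jinja or dbt one #}
    {% for value in results.columns[0].values() %}
      {{ log(value, info=true) }}
    {% endfor %}
  {% endif %}
{% endmacro %}
```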

#87. Which materialization should I use for my model?

Start out with views, and then change models to tables when required for performance reasons (i.e. downstream queries have slowed).

Check out the docs on materializations for advice on when to use each materialization.

#88. How did dbt choose which schema to build my models in?

By default, dbt builds models in your target schema. To change your target schema:

  • If you're developing in dbt Cloud, these are set for each user when you first use a development environment.
  • If you're developing with the dbt CLI, this is the schema: parameter in your profiles.yml file.

If you wish to split your models across multiple schemas, check out the docs on using custom schemas.

Note: on BigQuery, dataset is used interchangeably with schema.

#89. Why can't I just write DML in my transformations?


I'm already familiar with DML, and can write these statements manually, why should I use dbt to do this?

select statements make transformations accessible

More people know how to write select statements than DML, making the transformation layer accessible to more people!

Writing good DML is hard.

If you write the DDL / DML yourself, you can end up tangled in problems like:

  • What happens if the table already exists? Or this table already exists as a view, but now I want it to be a table?
  • What if the schema already exists? Or, should I check if the schema already exists?
  • How do I replace a model atomically (such that there's no downtime for someone querying the table)?
  • What if I want to parameterize my schema so I can run these transformations in a development environment?
  • What order do I need to run these statements in? If I run a cascade does it break other things?

Each of these problems can be solved, but they are unlikely to be the best use of your time.

dbt does more than generate SQL

You can test your models, generate documentation, create snapshots, and more!

You reduce your vendor lock-in

SQL dialects tend to diverge the most in DML and DDL (rather than in select statements) — check out the example here. By writing less SQL, it can make a migration to a new database technology easier.

If you do need to write custom DML, there are ways to do this in dbt using custom materializations.

#90. Why does my dbt output have so many macros in it?

The output of a dbt run reports over 100 macros in your project, even if you haven't written any!

$ dbt run
Running with dbt=0.17.0
Found 1 model, 0 tests, 0 snapshots, 0 analyses, 138 macros, 0 operations, 0 seed files, 0 sources

This is because dbt ships with its own project, which also includes macros! You can learn more about this here.

#91. Why do model and source yml files always start with `version: 2`?

Once upon a time, the structure of these .yml files was very different (s/o to anyone who was using dbt back then!). Adding version: 2 allowed us to make this structure more extensible.

Currently, Version 2 is the only supported version for these files. We kept version: around as a required key so that in the future, if we need to introduce a new structure for these files, we'll be able to do this more easily.

#92. Can I use a yaml file extension?

No. At present, dbt only searches for files with a .yml file extension. A future release of dbt will also search for files with a .yaml file extension.