Setting Up Your Local Machine for dbt Core: A Comprehensive Guide
dbt™ is an awesome SQL-first transformation workflow that lets teams quickly and collaboratively deploy analytics code following software engineering best practices like modularity, portability, CI/CD, and documentation.
With dbt, anyone on a data team can safely contribute to production-grade data pipelines seamlessly.
Getting Started: Setting Up for Success
This article is written primarily for Windows, but most of it applies just as well to macOS and Linux.
I've also added helpful links throughout this article; hold down the CTRL key when you click so that each link opens in a new tab. At the time of this writing, it does not look like authors are able to add the HTML "_blank" target attribute to links.
Both Python and Git for Windows will need to be installed before you get started.
If you do not have Python installed on your local workstation, you can download and install it from the python.org downloads page.
Check that Python is installed, and which version, with the following command in your command prompt or terminal:
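On most systems this looks like the following (depending on how Python was installed, the launcher may instead be `py` or `python3`):

```shell
# Print the installed Python version
# (try "py --version" or "python3 --version" if "python" is not on your PATH)
python --version
```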
If you don't already have Git SCM installed, now would be a good time to download and set it up. Check out the Git online Reference Manual for the official and comprehensive man pages included in the Git package itself.
While you're at it, a good source-code editor like VS Code would probably be good to have installed as well.
Streamlining Your Workflow: Mastering Package Installation for Efficiency
First, you'll need to install dbt Core on your Windows system. You can do this using pip, the Python package manager.
pip is a package-management system written in Python, and it is the preferred installer program. Starting with Python 3.4, pip is included by default with the Python binary installers.
Install dbt Core by running the following in your command prompt or terminal:
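A typical invocation looks like the following; the Postgres adapter here is just an illustration, so substitute the adapter for your own warehouse:

```shell
# Install dbt Core itself
pip install dbt-core

# Install the adapter for your data platform, e.g. Postgres
# (substitute dbt-snowflake, dbt-bigquery, dbt-sqlserver, etc.)
pip install dbt-postgres
```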
Install only the dbt-&lt;adapter&gt; package(s) you need for your data platform(s), or install them all.
If prompted to upgrade pip, execute the following command:
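The standard way to do that is to invoke pip through Python itself:

```shell
# Upgrade pip to the latest release
python -m pip install --upgrade pip
```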
I recommend verifying and keeping track of package versions, so that you are aware of what you're working with. Keeping track of software versions can prove to be extremely helpful when troubleshooting and working with various packages.
You can also use pip to upgrade dbt-core to the latest version from the terminal.
Check the current version of dbt Core by running:
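This command reports dbt Core's version along with any installed adapter plugins:

```shell
# Show the installed versions of dbt Core and adapter plugins
dbt --version
```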
All adapters build on top of dbt-core. Some also depend on other adapters: for example, dbt-redshift builds on top of dbt-postgres. In that case, you would see those adapters included by your specific installation, too.
To upgrade dbt Core to the latest version, use pip:
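The upgrade is a single pip command:

```shell
# Upgrade dbt Core in place to the latest release
pip install --upgrade dbt-core
```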
After the installation completes, you can verify that dbt-core has been upgraded to the latest version by running the dbt --version command.
Note that you may also have to upgrade the dbt-&lt;adapter&gt; packages you have installed, since adapters pin specific dbt-core versions and you can otherwise run into dependency conflicts, as can happen with the SQL Server adapter.
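For example, if you hit a conflict with the SQL Server adapter, upgrading it alongside dbt-core usually resolves it (the package name here is just an illustration; use whichever adapter you installed):

```shell
# Upgrade the adapter to a release compatible with your dbt-core version
pip install --upgrade dbt-sqlserver
```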
Navigate to your user directory (e.g. C:\Users\YOUR-USERNAME\ on Windows), and create a .dbt folder (i.e. ~/.dbt/). This is where you will create and store your profiles.yml file.
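You can create the folder from the command line as well:

```shell
# Create the .dbt folder in your home directory if it doesn't exist
# (in cmd.exe: mkdir %USERPROFILE%\.dbt)
mkdir -p ~/.dbt
```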
Connection profiles
When developing locally, dbt connects to your Data Warehouse (DW) using a profile, which is a YAML file with all of the connection details to your DW.
To use dbt Core, you'll need a profiles.yml file that contains the connection details for your data platform.
In your profiles.yml file, you can store as many profiles as you need. Typically, you would have one profile for each warehouse you use. Most organizations only have one profile.
Note that dbt uses YAML in a few different places. If you're new to YAML, it would be worth learning how arrays, dictionaries, and strings are represented.
This file generally lives outside of your dbt project to avoid sensitive credentials being checked in to version control, but profiles.yml can be safely checked in when using environment variables to load sensitive credentials.
A profiles.yml file defines one or more named profiles, each with a default target and the connection options your adapter requires.
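As an illustration, here is a minimal profiles.yml sketch for a Postgres warehouse; the profile name, host, credentials, and schema are placeholders, and the exact fields vary by adapter:

```yaml
# Hypothetical profiles.yml; field names depend on your dbt-<adapter>
my_dbt_project:          # must match the "profile" in dbt_project.yml
  target: dev            # default target to use
  outputs:
    dev:
      type: postgres
      host: localhost
      user: my_user
      password: "{{ env_var('DBT_PASSWORD') }}"  # load secret from env
      port: 5432
      dbname: analytics
      schema: dbt_dev
      threads: 4
```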
You may want to have your profiles.yml file stored in a different directory than ~/.dbt/ – for example, if you are using environment variables to load your credentials, you might choose to include this file in the root directory of your dbt project.
Note that the file always needs to be called profiles.yml, regardless of which directory it is in.
Unlocking Project Configurations: Optimizing Your Workflow with dbt
Every dbt project needs a dbt_project.yml file — this is how dbt knows a directory is a dbt project. It also contains important information that tells dbt how to operate your project.
By default, dbt will look for dbt_project.yml in your current working directory and its parents, but you can set a different directory using the --project-dir flag or the DBT_PROJECT_DIR environment variable.
The dbt_project.yml file supports many configurations: the project name, the profile to connect with, file paths, and model-level settings, among others.
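A minimal sketch of a dbt_project.yml, with placeholder names, looks something like this:

```yaml
# Hypothetical dbt_project.yml; names are placeholders
name: my_dbt_project
version: "1.0.0"
profile: my_dbt_project   # must match a profile in profiles.yml

model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
macro-paths: ["macros"]

target-path: "target"                      # compiled artifacts go here
clean-targets: ["target", "dbt_packages"]  # removed by `dbt clean`

models:
  my_dbt_project:
    staging:
      +materialized: view
    marts:
      +materialized: table
```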
When you run dbt Core from the command line, it reads your dbt_project.yml file to find the profile name, and then looks for a profile with the same name in your profiles.yml file. This profile contains all the information dbt needs to connect to your data platform.
Later, you will see how you can utilize the dbt init command to easily generate both profiles.yml and dbt_project.yml files.
Creating The Repository
Create a repository in your github.com account. Alternatively, you can use the GitHub Desktop App if you prefer a GUI, want to simplify your development workflow, and/or find yourself fighting with Git on a regular basis.
Navigate to your Documents folder (or your preferred working directory for cloning repositories), and create a GitHub folder.
You can also create your working directory with the following command:
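For example:

```shell
# Create a GitHub working folder under Documents if it doesn't exist
# (in cmd.exe: mkdir "%USERPROFILE%\Documents\GitHub")
mkdir -p ~/Documents/GitHub
```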
Clone the repository you created earlier to your GitHub working directory.
To clone your repository locally over HTTPS using the command line, copy the URL displayed on GitHub with the HTTPS option selected.
In your CLI, type git clone, and then paste the URL you copied earlier. Your command will look similar to the following example:
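The username and repository name below are placeholders; paste your own URL in their place:

```shell
# Clone over HTTPS; replace the URL with the one you copied from GitHub
git clone https://github.com/YOUR-USERNAME/YOUR-REPOSITORY.git
```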
Now that you have cloned your GitHub repository to your local machine, you are ready to start structuring your project.
Crafting Successful dbt Projects: Best Practices and Strategies
Analytics engineering, at its core, is about helping groups of human beings collaborate on better decisions at scale.
Humans generally have limited bandwidth for making decisions.
Humans also, as a cooperative social species, rely on systems and patterns to optimize collaboration with others.
This combination of traits means that for collaborative projects it's crucial to establish consistent and comprehensible norms such that a team’s limited bandwidth for decision making can be spent on unique and difficult problems, not deciding where folders should go or how to name files.
Building a great dbt project is an inherently collaborative endeavor, bringing together domain knowledge from every department to map the goals and narratives of the entire company.
As such, it is especially important to establish a deep and broad set of patterns to ensure as many people as possible are empowered to leverage their particular expertise in a positive way, and to ensure that the project remains approachable and maintainable as your organization scales.
Steve Jobs and the Analogous Brilliance of dbt
Famously, Steve Jobs wore the same outfit every day to reduce decision fatigue. You can think of this overview similarly, as a black turtleneck and New Balance sneakers for your company’s dbt project.
A dbt project’s power outfit, or more accurately its structure, is composed not of material fabric but of files, folders, naming conventions, and programming patterns.
How you label things, group them, split them up, or bring them together — the system you use to organize the data transformations encoded in your dbt project — this is your project’s structure.
Structuring dbt projects
We structure the files, folders, and models around three primary layers in the models directory, each building on the last:
Staging — creating our atoms, our initial modular building blocks, from source data
Intermediate — stacking layers of logic with clear and specific purposes to prepare our staging models to join into the entities we want
Marts — bringing together our modular pieces into a wide, rich vision of the entities our organization cares about
Below is a complete file tree of the jaffle_shop sample project from dbt Docs, and a nice example of the default monolithic dbt structure:
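An abbreviated sketch of that layout follows; the individual model file names are illustrative, so see the dbt Docs for the full tree:

```
jaffle_shop
├── README.md
├── dbt_project.yml
├── packages.yml
├── models
│   ├── staging
│   │   └── jaffle_shop
│   │       ├── stg_jaffle_shop__customers.sql
│   │       └── stg_jaffle_shop__orders.sql
│   ├── intermediate
│   │   └── int_orders_pivoted.sql
│   └── marts
│       ├── customers.sql
│       └── orders.sql
├── seeds
├── macros
└── tests
```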
Getting Started: Creating Your New dbt Project
The quickest and easiest way to get started with a new dbt project is with the init command. The dbt init command will begin setting up your profile and prompt for inputs.
Then, it will create a new folder with your project name and sample files, and create a connection profile on your local machine.
To create a new dbt project, execute the dbt init command in your terminal:
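The project name below is a placeholder; dbt will create a folder with whatever name you supply:

```shell
# Scaffold a new dbt project and walk through the connection prompts
dbt init my_dbt_project
```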
If you've already cloned or downloaded an existing dbt project, dbt init can still help you set up your connection profile so that you can start working quickly.
The command will prompt you for connection information, as above, and add a profile (using the profile name from the project) to your local profiles.yml, or create the file if it doesn't already exist.
Harnessing Git: Best Practices for Committing Changes
When finished, it's important to commit your changes, so that the repository contains the latest code.
To commit your changes, link to the GitHub repository you created for your dbt project by running the following commands in Terminal:
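A typical sequence looks like this; the commit message and branch name are placeholders, and your default branch may be named differently:

```shell
# Stage everything in the project directory
git add .

# Record the snapshot locally
git commit -m "Initial dbt project setup"

# Publish the commit to GitHub
git push origin main
```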
Return to your GitHub repository to verify your new files have been added.
Now that you set up your dbt project, you can get to the fun part — building models!
Thanks for reading. Your feedback and suggestions are always welcome. Stay tuned for more engaging content ahead!