Using Service Principal (SPN) with the COPY INTO command in Fabric Warehouse
When working with data ingestion in Microsoft Fabric Warehouse, especially from firewall-protected storage accounts, security becomes a central concern. In many organizations, not all users are allowed access to the storage layer due to strict governance policies. So, how do you securely run a COPY INTO command to ingest data from such storage using a Service Principal (SPN)?
Let’s dive into the challenge, understand why it happens, and walk through a detailed step-by-step guide to successfully make it work using best practices.
The problem: Running COPY INTO with firewall enabled on your storage account and SPN
When a firewall is enabled on your Azure Data Lake Storage (ADLS) Gen 2, Workspace Identity becomes mandatory in Microsoft Fabric. This identity acts as the secure conduit for accessing storage resources behind firewalls.
Allow me to clarify some key concepts.
Storage account behind firewalls
Imagine your Microsoft Fabric workspace as a department in a big digital office building. This department needs to access a secure vault, in this case your ADLS Gen2 storage account, which is protected behind a strict firewall.
But this isn’t just any vault. It only opens for traffic coming from approved IP addresses or virtual networks. It doesn’t recognize individuals, it checks who’s knocking based on network source. So, even if you personally have permission to access the data, the vault's firewall doesn’t see you. Instead, it sees the shared, dynamic IP addresses used by Microsoft Fabric. And that’s a problem because trusting all Microsoft Fabric traffic would be a major security risk.
What is a workspace identity?
To solve this, Microsoft Fabric gives your workspace its own Workspace Identity — a special type of Managed Identity. Think of it as assigning your department its own robot assistant. This robot has its own secure key and doesn’t need to borrow yours. It represents your entire workspace and can securely authenticate to other Azure services, like ADLS Gen2, without any manual credential management.
This identity is registered in Microsoft Entra ID and is fully managed by Azure, meaning it’s secure, unique, and consistent. It’s like giving your department its own company ID card that can be used at the vault’s door.
Learn more about Workspace Identity here Workspace identity - Microsoft Fabric | Microsoft Learn
Why a workspace identity is needed
Since the vault (ADLS Gen2) only opens for specific sources, you need a trusted, stable identity that the firewall can recognize. This is exactly what the Workspace Identity provides. Rather than letting every team member knock individually, the workspace itself becomes the verified caller. If the Workspace Identity is on the vault’s approved list, the firewall opens the door.
Trusted workspace access
To allow Microsoft Fabric to securely access an ADLS Gen2 protected by a firewall, you can use Trusted Workspace access. This method doesn't require Private Endpoints or opening your firewall to broad IP ranges.
Instead of relying on IP-based rules or private network configurations, you can explicitly grant access to a specific Fabric workspace using its Workspace Identity. You can follow the documentation to configure this trust directly on the ADLS Gen 2 - Trusted workspace access in Microsoft Fabric
The role of workspace identity
It offers a consistent, auditable identity that the firewall can recognize. Without it, ADLS Gen2 can't distinguish your traffic from other Microsoft Fabric users, blocking secure access.
Now that you know the basics, let's return to the COPY INTO command.
Fabric Warehouse supports the COPY INTO command using a Service Principal, but there's a catch: the command only works when the process executing it is directly authenticated as the SPN. Check out this documentation Service Principals in Fabric Data Warehouse - Microsoft Fabric | Microsoft Learn
Here’s what makes things tricky:
Even if you grant the SPN access to the warehouse (via workspace roles or item permissions), trying to use it via SQL Server Management Studio (SSMS) will result in an error, something like:
This happens because SSMS and other IDEs can’t generate the Entra ID token that Fabric needs for authorization. And to enforce security, Microsoft Fabric requires a fresh sign-in every 30 days to ensure token validity.
In modern security, systems should never be able to create an access token on behalf of a user.
Systems should not generate user tokens because it gives them too much power to impersonate users, creating a huge security risk if the system is compromised. It's like giving a servant the power to forge your identity papers.
User-initiated operations mean when a user initiates the login, they are explicitly authenticating themselves with a trusted identity provider like Entra ID and giving consent for the application to receive a specific, limited-use token. The system never sees the user's password. Security best practices prioritize user control.
Tokens are more secure because they eliminate credential exposure, have limited lifespans and scopes, are verifiable, and enforce principles like least privilege, significantly reducing the impact of potential breaches compared to direct password usage.
So how do you automate or work around this limitation?
That’s where the magic trick comes in: authenticate the SPN using a Microsoft Fabric API call first. This generates the required Entra ID token behind the scenes, which then lets you run the COPY INTO command successfully even from SSMS.
Let's review the step-by-step process for running COPY INTO using an SPN. Here’s the full process you need to follow to get things working:
Step 1: Create a Workspace Identity
You must enable Workspace Identity for the Fabric workspace accessing the protected storage.
This identity is used to securely connect to the storage account.
Documentation: Workspace identity - Microsoft Fabric | Microsoft Learn
Step 2: Enable Trusted Workspace Access
Fabric allows trusted access to storage accounts with firewalls enabled by using Entra credentials.
Once Workspace Identity is enabled, Fabric can securely access storage behind firewalls.
This allows seamless integration without exposing the storage account publicly.
Documentation: Trusted workspace access in Microsoft Fabric - Microsoft Fabric | Microsoft Learn
Step 3: Assign SPN permissions
Configure role assignments at the storage account level for your SPN.
Recommended roles:
Storage Blob Data Contributor
Documentation: Grant the identity permissions on the storage account - Microsoft Fabric | Microsoft Learn
Step 4: Authenticate using the SPN via API
Use your Service Principal (SPN) to authenticate through an API, not through the Fabric UI or SSMS.
Make sure this process is automated or repeated every 30 days, as tokens expire.
Step 5: Trigger an API call
Now that your SPN is authenticated, you must trigger any Fabric API call to complete token generation.
This will generate the Entra ID token required by Fabric for future operations.
You can use Python notebooks, Azure CLI, PowerShell, or other REST clients.
For example, call the Get Warehouse API to list metadata.
Documentation: Items - Get Warehouse - REST API (Warehouse) | Microsoft Learn
This is the key step that makes everything else work. Without it, the Entra ID token doesn’t get generated and Fabric will deny access.
I have developed a PowerShell module that enables authentication using the SPN. For more details, check out the link: Fabric Toolbox PowerShell Module
The following PowerShell code assists with the execution of steps 4 and 5:
Step 6: Connect to the Warehouse using SSMS
Now that your SPN is authenticated and trusted:
Open SQL Server Management Studio (SSMS).
Use the SPN credentials to connect to your Microsoft Fabric Warehouse.
You must use Service Principal authentication modes.
Documentation: Connect to client applications using SPN - Microsoft Fabric | Microsoft Learn
Step 7: Run the COPY INTO command
Here’s the final step. Use your authenticated session to run the command without specifying a CREDENTIAL parameter.
The previously generated Entra ID token from the API call is what grants access behind the scenes.
That’s it! Your data is now securely ingested into Microsoft Fabric Warehouse from a firewall-protected the storage account using an SPN.
[EDIT 14/08/2025] - Adding OPENROWSET command
Fabric Warehouse provides support for OPENROWSET, and it consistently applies the same behavior as COPY INTO when accessing a storage account protected by a firewall.
Just make sure to go through the same steps so you can run it without any issues.
The function enables ad-hoc reading of external file formats like Parquet, CSV, or JSONL and returns the data as a rowset.
This makes it ideal for previewing file contents before loading into warehouse tables.
To inspect the schema (column names and types) without previewing actual data, use the stored procedure:
This returns metadata about the result set, helping you design your destination tables with accurate types before running the COPY INTO command.
Explore the documentation for a deeper understanding - Browse File Content Before Ingestion with the OPENROWSET function - Microsoft Fabric | Microsoft Learn
[END EDIT]
Security Best Practices
Use least privilege: Only assign the minimum roles and permissions needed for the SPN.
Automate token refresh: Remember that Entra ID tokens expire every 30 days.
Avoid public access: Use Workspace Identity and Trusted Access to work securely behind firewalls.
Final Thoughts
Using a Service Principal to execute the COPY INTO command in Microsoft Fabric may seem daunting at first especially with firewalls, token-based auth, and storage security in play. But with the right steps and a proper understanding of how authentication flows through Microsoft Fabric, it becomes a secure and repeatable process.
Principal Product Manager at Striim driving data insights and delivery.
1wTiago Balabuch, Thanks for sharing this article. I have a follow up question. What happens when you issue a COPY INTO command from fabric warehouse SQL endpoint UI, with a workspace identity enabled and the ADLS Gen2 storage is behind a firewall. I see it doesn't work. Any explanation on why it doesn't work ? cc: Freddie Santos
Microsoft Fabric | SQL Server | Integration | Information Architecture | Power BI | Monitoring | Mirroring | Cloud Migration | Orchestration
2wGreat idea, Tiago Balabuch. For bronze and silver ingestion, we use COPY INTO and TRUNCATE TABLE / INSERT INTO often for Fabric Data Warehouse from external data sources. Interesting concept you raise here. Loading 500 GB into ADLSg2 this week, and will begin Purview and AI Search processing next week for a client. Let's talk about this soon, if you have 30 minutes, Tom
Empowering data-driven innovation with Microsoft Fabric | Transforming data ecosystems with Lakehouse, Warehouse, Spark, Eventhouse, OneLake & Data Integration | Speaker & Trainer | Sr Program Manager | Fabric ACE Team
2wA big shoutout to Freddie Santos for taking the time to review it. I truly appreciated your help.