Working with Git integration on PowerBI (and Fabric) workspaces helps you overcome many of the limitations of working with PowerBI Desktop and PBIX files. This is a vast topic, so in this post I will limit myself to versioning PowerBI reports, and to an inconvenience that arises specifically with reports that use a Live connection.
Setup
The explanation assumes the following setup:
- You have a Fabric / PowerBI Premium workspace that has Git integration enabled with an Azure DevOps project on the “main” branch
- The Azure DevOps project contains a Repository (and for simplicity we’ll just have 1 branch “main”, to which we commit changes)
- On the developer’s machine you:
  - author reports in PowerBI Desktop
  - use VSCode to move changes back and forth between the local repository and the Git repository in DevOps
Use Cases
This post will just look at managing a PowerBI Report, no other Fabric items. I’m also assuming you will connect the “development” workspace to Git. All other stages (Test workspace, Production workspace) are handled by the Fabric Deployment pipelines, a great way to promote items between workspaces.
Use case 1 – PowerBI Report with Semantic model
This is the simplest type of report, where you design the semantic model together with the report.
When you publish it to the PowerBI Service, two items become visible in the workspace: the Report (containing the layout, colors, etc.) and the Semantic model (containing the structure and data of the data model):
Plenty of tutorials already cover setting up source control for such a basic PowerBI report, so I will not go into detail on this use case.
Use case 2 – PowerBI Report with a Live connection
A Live connection (which is different from a Direct Query connection) lets the report render data defined in another semantic model available in the workspace (or in any other workspace, for that matter). This is considered a best practice as it:
- avoids creating multiple semantic models with the same structure and the same data (a maintenance nightmare, because the copies never stay 100% identical: small changes made in model 1 fail to make it to models 2, 3, 4, 5, etc.)
- avoids creating one report with hundreds of pages to which everyone has access, and which then has to be managed through a PowerBI app with different audiences. Filters on “all pages” can cause unexpected issues on pages that should not have them, and there are many other concerns, from ownership to segregation of responsibilities
Problem
PBIP’s definition.pbir
Working with Git integration requires you to adapt the way you store PowerBI reports. Where in the past reports were saved as a PBIX file (an archive of all the files necessary to design the report and, optionally, the semantic model), you will now use the PowerBI Project (PBIP) format, which is a clear text representation of the definition.
When saving a PowerBI Report (with a Live connection) as PBIP, a set of files will be created like below.
The root will contain the PBIP file and a <reportname>.Report folder containing the report details.
The definition.pbir file will specify how the report connects to the semantic model.
byPath vs byConnection
A report with an included semantic model (use case 1) has a definition.pbir file that looks like this:
{
  "version": "4.0",
  "datasetReference": {
    "byPath": {
      "path": "../Budget.SemanticModel"
    },
    "byConnection": null
  }
}
It simply points to the Semantic model folder at the same level.
A report with a Live Connection (use case 2) will have a definition.pbir file that looks like this:
{
  "version": "4.0",
  "datasetReference": {
    "byPath": null,
    "byConnection": {
      "connectionString": "Data Source=powerbi://api.powerbi.com/v1.0/myorg/budget;Initial Catalog=Budget;Access Mode=readonly;Integrated Security=ClaimsToken",
      "pbiServiceModelId": null,
      "pbiModelVirtualServerName": "sobe_wowvirtualserver",
      "pbiModelDatabaseName": "aa71f564-2daf-5555-8c0c-b097e77da2e0",
      "name": "EntityDataSource",
      "connectionType": "pbiServiceXmlaStyleLive"
    }
  }
}
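To see quickly which of the two forms a given definition.pbir currently uses, a small text check can help. The sketch below is only a heuristic: it assumes the file is formatted as in the examples above (the opening brace on the same line as the key), the function name `pbir_reference_kind` is made up for illustration, and the Live connection sample is trimmed to a single property.

```shell
#!/usr/bin/env bash
set -eu

# Print which kind of dataset reference a definition.pbir file uses.
# Heuristic: the active reference is the key whose value opens an object.
pbir_reference_kind() {
  if grep -Eq '"byConnection"[[:space:]]*:[[:space:]]*\{' "$1"; then
    echo "byConnection"
  elif grep -Eq '"byPath"[[:space:]]*:[[:space:]]*\{' "$1"; then
    echo "byPath"
  else
    echo "unknown"
  fi
}

# Two throwaway sample files mirroring the examples above.
samples=$(mktemp -d)

cat > "$samples/model.pbir" <<'EOF'
{
  "version": "4.0",
  "datasetReference": {
    "byPath": {
      "path": "../Budget.SemanticModel"
    },
    "byConnection": null
  }
}
EOF

cat > "$samples/live.pbir" <<'EOF'
{
  "version": "4.0",
  "datasetReference": {
    "byPath": null,
    "byConnection": {
      "connectionType": "pbiServiceXmlaStyleLive"
    }
  }
}
EOF

pbir_reference_kind "$samples/model.pbir"   # prints byPath
pbir_reference_kind "$samples/live.pbir"    # prints byConnection
```

A JSON-aware tool would be more robust than grep if the file formatting ever changes.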
The real problem
You could think “This seems like a long and unnecessary explanation without any problems at all!”
However, reports with a Live connection are stored in the Fabric workspace with a definition.pbir file that uses “byPath”. It doesn’t specify a “byConnection” property at all. This has the following consequences:
- When you open a report with a Live connection (one that has been synchronized from the service to DevOps to your local repository) in PowerBI Desktop, the report appears to have its own semantic model directly available. This has two major disadvantages:
  - You have to refresh the entire data model on your local machine before you can test any change to the report. For large semantic models that is unworkable.
  - You could edit the semantic model and publish changes to it, affecting all other reports, perhaps unintentionally.
- When you commit and sync a report with a Live connection to the Git repository from VSCode, the definition.pbir file is stored correctly in DevOps. When you synchronize these changes in the Fabric workspace, the report updates fine, but behind the scenes Fabric changes the definition.pbir file to byPath and shows this as “uncommitted changes” in the workspace. You have to commit these changes back to Git and update your local repository as well. The flow becomes:
  - commit changes in VSCode
  - sync changes from local to DevOps
  - sync workspace in Fabric
  - Update All items in workspace in Fabric
  - commit workspace in Fabric and sync to DevOps
  - get changes in VSCode
  - the definition.pbir file is now back to a “byPath” file, so you can start by refreshing your semantic model again
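The round trip above can be imitated with plain Git to see why the local file ends up as byPath again. This is only a simulation under stated assumptions: a bare repository named `devops.git` stands in for Azure DevOps, a second clone named `workspace` stands in for the Fabric workspace, and the file contents are shortened stand-ins for the real definition.pbir.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
pbir=Budget.Report/definition.pbir

# "devops.git" stands in for the Azure DevOps repository.
git init -q --bare "$demo/devops.git"
git -C "$demo/devops.git" symbolic-ref HEAD refs/heads/main

# Developer machine: commit a byConnection file and push it.
git init -q "$demo/local"
git -C "$demo/local" symbolic-ref HEAD refs/heads/main
git -C "$demo/local" remote add origin "$demo/devops.git"
mkdir -p "$demo/local/Budget.Report"
echo '{ "datasetReference": { "byPath": null, "byConnection": { "name": "EntityDataSource" } } }' \
  > "$demo/local/$pbir"
git -C "$demo/local" add "$pbir"
git -C "$demo/local" commit -q -m "Report with Live connection"
git -C "$demo/local" push -q origin main

# "workspace" stands in for Fabric: it rewrites the file to byPath and commits.
git clone -q "$demo/devops.git" "$demo/workspace"
echo '{ "datasetReference": { "byPath": { "path": "../Budget.SemanticModel" } } }' \
  > "$demo/workspace/$pbir"
git -C "$demo/workspace" commit -q -am "Workspace rewrite to byPath"
git -C "$demo/workspace" push -q origin main

# Back on the developer machine: the pull replaces the byConnection file.
git -C "$demo/local" pull -q --ff-only origin main
cat "$demo/local/$pbir"
```

After the final pull, the developer's copy contains the byPath definition, which is the state that forces the local refresh described above.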
Solution
Perhaps you can live with this flow. Perhaps you don’t mind the coffee time when refreshing the semantic model. And perhaps you don’t mind just changing the definition.pbir file to byConnection every time you want to make a change to a report and then changing it back to byPath before pushing to Git.
But another solution is to signal to Git that you intend to keep the definition.pbir file different between the Git repository and your local repository. Your goal is to have a local file that you can change freely, whatever the content of that file is in the Git repository. There are other scenarios for such a use case, e.g. developers who want to set certain parameters in a file they work with locally, while the configuration file that goes into test/production needs to have different “versioned” values.
To accomplish this, we can use the skip-worktree bit that can be set on a file with the Git update-index command. This basically tells Git to treat the file as unchanged in the working directory. The file keeps being tracked, which is mandatory as it is an integral part of the report definition.
The workflow becomes quite simple: you only have a manual initialisation to perform once (assuming you’re working with an unpublished report with a Live connection).
- change the definition.pbir file temporarily to byPath (see above for the content; this is a one-time change, just to feed the repository with a byPath definition.pbir file. If you already have the report synchronised from the workspace to the repository, you can skip these first 2 steps)
- commit and sync to DevOps -> the “byPath” file is now in the repository and will be used by the workspace
- in VSCode, in the terminal, execute:
  git update-index --skip-worktree Budget.Report/definition.pbir
- change the definition.pbir file back to byConnection (as you will use this while working locally)
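The initialisation steps can be tried out safely in a scratch repository first. The sketch below assumes a temporary folder and shortened stand-in file contents. It then checks the result with `git status` and `git ls-files -v`, which tags skip-worktree files with `S`.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir Budget.Report
pbir=Budget.Report/definition.pbir

# Steps 1-2: commit the byPath version that the workspace expects.
echo '{ "datasetReference": { "byPath": { "path": "../Budget.SemanticModel" } } }' > "$pbir"
git add "$pbir"
git commit -q -m "Add report with byPath definition"

# Step 3: tell Git to treat the working-tree copy as unchanged.
git update-index --skip-worktree "$pbir"

# Step 4: switch the local copy to byConnection for day-to-day work.
echo '{ "datasetReference": { "byPath": null, "byConnection": { "name": "EntityDataSource" } } }' > "$pbir"

git status --porcelain     # prints nothing: the local edit is not reported
git ls-files -v "$pbir"    # prints: S Budget.Report/definition.pbir

# To publish a deliberate change to the file later, clear the bit first:
# git update-index --no-skip-worktree "$pbir"
```

The bit lives in the local index only, so every developer has to set it once in their own clone.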
The development workflow is now optimized:
- Edit the report in PowerBI Desktop
- Commit and sync changes to DevOps using VSCode (the definition.pbir file will not show up)
- Update All in the Fabric workspace
The report is now updated.
No Solution
What doesn’t work:
- using the .gitignore file to ignore “definition.pbir”, as this will either:
  - exclude the file from the repository, but the file is mandatory. Without it, Updating the workspace with the report from Git fails with the error “Failed to resolve dependencies”, and the report is never deployed there.
  - have no effect when the file is already tracked: the file stays in the repository and its changes keep being synchronized, which is not what we want
- the assume-unchanged bit, which only tells Git to skip checking the file for performance reasons (e.g. an entire SDK directory that doesn’t change, where you don’t want to check every file all the time). It only makes the local repository assume there is no change. However, when a change is detected from the remote repository, Git overwrites the file again, which is again not what we want.
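Two of these points can be checked in a scratch repository. The sketch below assumes a temporary folder and shortened stand-in file contents. It shows that a .gitignore entry does not hide changes to a file that is already tracked, and that assume-unchanged and skip-worktree are separate bits that `git ls-files -v` reports with different tags.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir Budget.Report
pbir=Budget.Report/definition.pbir
echo '{ "datasetReference": { "byPath": { "path": "../Budget.SemanticModel" } } }' > "$pbir"
git add "$pbir"
git commit -q -m "Track definition.pbir"

# .gitignore has no effect on a file that is already tracked.
echo "definition.pbir" > .gitignore
echo '{ "datasetReference": { "byPath": null } }' > "$pbir"
tracked_status=$(git status --porcelain -- "$pbir")
echo "$tracked_status"      # prints: " M Budget.Report/definition.pbir"
git checkout -q -- "$pbir"  # restore the committed version

# assume-unchanged is shown as a lowercase tag...
git update-index --assume-unchanged "$pbir"
assume_tag=$(git ls-files -v "$pbir")
echo "$assume_tag"          # prints: h Budget.Report/definition.pbir

# ...while skip-worktree is shown as S.
git update-index --no-assume-unchanged "$pbir"
git update-index --skip-worktree "$pbir"
skip_tag=$(git ls-files -v "$pbir")
echo "$skip_tag"            # prints: S Budget.Report/definition.pbir
```

Running `git ls-files -v` is also a quick way to find out later which files in a clone carry either bit.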