Generated using Microsoft Designer (first attempt only) |
Instead
of removing unused columns in Power Query to reduce the size of a semantic model, select only the columns you need. Why does this
distinction matter? In short: it makes your solution more resilient.
Default scenario (how I thought, prior to this insight)
A field is added to a database table by a data developer in the development data environment. You are using the table, and sometime when you applied changes to your Semantic Model, a dependency was automatically created on that new field, even though you weren't directly using it: it became a field in your semantic model. When your solution is migrated to UAT and repointed to a UAT database, the refresh fails because that field is not found in UAT (yet?).
You jump on the problem in the development environment in Power BI Desktop and right click and remove that field from the table in Power Query to eliminate it from your semantic model, then migrate your solution again. Unfortunately, the refresh fails in UAT again because there is a Power Query dependency on that field in the "Removed Columns" step's M code. To underscore that: ironically, removing unused fields creates dependencies on the fields you aren't using.
RemoveColumns M code with explicit reference to unused field, resulting from right click/remove column |
Imagine this alternate scenario
Following best practices, you plan to get rid of a few fields you aren't using, but instead of removing the columns you don't want, you select the columns you did want. (The easiest way to do it is to use the Choose Columns menu at the top, where you can uncheck the columns you don't need.) As a result, any columns added to the source table will not hit your semantic model because they aren't in the list of selected columns. Additionally, if anything changes about the fields you removed, it won't matter because your M code and semantic model have no explicit references to them.