Just what I needed. Another place to run DAX. |
Opening DAX Studio and seeing a blank query pane is intimidating because it seems like the main thing to do is write DAX, then try to rewrite it to run faster, but it's not. Most use cases I have found for DAX Studio have nothing to do with writing DAX, but instead improving my data model.
DAX Studio is a free, open-source tool designed to help you work with Data Analysis Expressions (DAX) queries in Power BI, but it also generates useful metadata about Semantic Models, helps you understand how a DAX query is executed against your data model structure, and more.
Getting started
Get DAX Studio directly from the official website, https://daxstudio.org/
Once installed, you can connect to a Power BI semantic model in two ways:
- Local Semantic Model: Open your PBIX in Power BI Desktop and navigate to the External Tools menu. DAX Studio will launch pre-connected.
- Semantic Model hosted in Power BI Service: Open the Settings page of the target Semantic Model in the Power BI Service, then the Server settings. Copy the Connection string (Data Source=powerbi://api.powerbi.com/...) to your clipboard. Open DAX Studio manually, which will bring up a Connect panel. Select the Tabular Server radio button, then paste the Connection string in the input box and click Connect.
Use Cases
Analyze and optimize the data model
Setup, viewing the semantic model summary
- Click the Advanced menu at the top, then View Metrics. A new tab appears at the bottom called VertiPaq Analyzer. Click VertiPaq Analyzer.
- Under VertiPaq Analyzer, click the Summary menu. This gives you a starting point to understand whether your Semantic Model is large enough (Semantic model size in memory) to be worthwhile to invest time in reducing its size. Revisiting the size after refinements also gives you a way to measure progress in reducing Semantic Model size.
Tables
- Focus on the tables that take the most space ("Total Size") for optimization.
- For Tables, the Cardinality column contains the row count.
- The RI Violations column also gives you a preview of a referential integrity issue to check into later, under Relationships. (Hover over any column header for a brief definition throughout.)
- "LocalDate…" tables indicate AutoDateTime is being used, which should be disabled.
Columns
- Prioritize the largest columns ("Total Size") for optimization.
- Don't overly normalize, remove what fields you can, and reduce the distinct values of the fields that remain.
Relationships
- First, review the relationship cardinalities, which should generally all be many-to-one.
- Next, review the key field cardinalities. Two items to note regarding the cardinalities are:
- A very high From Cardinality is likely to be a slower relationship for performance
- An excessively high To Cardinality (higher than the From Cardinality) may be mean you are carrying extraneous reference data, ex., a full Patient dimension table with Patients for all-time when your fact table only has Patient transactions for the last year.
- Lastly, the Missing Keys, Invalid Rows, and Sample Violations provide information about missing reference data -- the inverse of the last example with the Patient dimension. This indicates that facts will not roll up to any dimensional values from a reference data table.
Measure and understand query performance
Besides simply measuring the run time of a query -- essential to measure progress when tuning performance -- DAX Studio helps you understand the query logic behind your visuals in a way you can actually understand (i.e., not DAX, which can be hard to understand). Here's how to analyze a specific visual's DAX query:
- Open the Power BI Desktop report and navigate to the Performance Recorder under the Optimize menu.
- Start recording, then open the page with your target visual. Stop recording once done.
- Locate the relevant activity in the list and copy the DAX query. Note: Selecting a step will also highlight the visual, giving you some feedback that you have found the relevant query.
- Launch DAX Studio and paste the query into the query pane.
- Click Server Timings (opens Server Timings tab), then Clear Cache, then Run.
Click the Server Timings tab to review what it captured about your query -- or really, the data model:
- First, note whether multiple SE (Storage Engine) queries were required. This can happen with complex DAX functions or expressions or as a result of complex data models, such as those containing many-to-many relationships.
- Next, look at the individual queries by selecting each to display xmSQL output in the right pane. This is a SQL-like representation of the requests that the Formula Engine sent to the Storage Engine, so it's easier to understand the work that Power BI is doing based on the combination of the data model structure and DAX query running against it.
Two lesser known features
Impact analysis
Dynamic Management Views (DMVs)
- How to connect Power BI Desktop to Analysis Services DMVs
- A very simple example of output you could generate
- And another
- The schema behind the VertiPaq Analyzer (scroll down to "Internal Structure")
Coincidentally, Microsoft just released a lot of these DMVs as DAX functions, so another interesting use case might be to build a descriptive interface to browse metadata when you create a Semantic Model to serve up data for thin reports instead of leaving the report pages completely blank.