Articles in this section

Four Easy Ways to Reduce the Size and Improve the Performance of Your PBIX File

Updated:

Power BI files can quickly become large and slow if development best practices aren’t followed. Large models not only impact performance but also make development, sharing, and refreshing data more difficult. Fortunately, there are a few straightforward practices that can significantly reduce file size and improve overall efficiency without sacrificing functionality.

First: Disable Auto Date/Time

One of the easiest and most impactful steps is disabling Auto Date/Time in the settings. By default, Power BI automatically creates hidden date tables for every date column in your model. This means that Power BI automatically generates Year, Quarter, Month, and Day for each date field. While convenient, this feature can dramatically increase model size, especially in datasets with numerous date fields. Each of these hidden tables consumes memory and adds unnecessary complexity. Turning off Auto Date/Time forces you to use a proper, centralized date table instead, which is both more efficient and aligns with best practices for time intelligence calculations. This step alone can significantly reduce the file size and performance of your PBIX file and reports.

To disable Auto Date/Time, go to File > Options & Settings > Options > Data Load (either under global or current file settings). Under Time intelligence, uncheck Auto date/time.

Picture1.png

Second: Incorporate Query Folding

Another key optimization is structuring your data transformations (in the query editor) to support query folding. Query folding allows Power BI to push transformations back to the source system—such as SQL Server—rather than processing them locally in the PBIX file. When transformations are folded, the heavy lifting happens at the database level, which is typically far more efficient. Poorly structured steps, such as adding custom columns too early or using non-foldable operations, can break query folding and force Power BI to process large volumes of data in memory. By carefully ordering and designing transformation steps, you can keep folding intact and significantly reduce resource usage.

In practice, this means applying transformations in a logical sequence: first reduce the size of the dataset (filter rows, remove columns), then perform joins and aggregations, and only introduce more complex or custom logic after the dataset has been minimized. The earlier query folding is preserved, the less data Power BI needs to import and process locally.

There are several common transformations that typically support query folding when working with relational sources like SQL Server. These operations are generally translatable into native SQL and should be prioritized early in your query:

  • Filtering rows
  • Selecting or removing columns
  • Renaming columns
  • Changing data types (simple casts)
  • Basic aggregations (Group By)
  • Merging queries (joins) within the same data source
  • Appending queries from the same source
  • Removing duplicates (in many cases, depending on source support)

On the other hand, some transformations commonly break query folding (note: this does not break the transformation process itself) because they cannot be easily translated into source queries or require row-by-row evaluation. These should be used as late as possible in the transformation process:

  • Adding custom columns with complex logic (if statements, text parsing, multi-step calculations)
  • Using functions like Text.Contains, Text.Proper, or other advanced text transformations (depending on source support)
  • Adding an index column
  • Pivoting and unpivoting data (in many scenarios)
  • Using Table.Buffer or other in-memory enforcement functions
  • Invoking custom functions row-by-row
  • Merging data across different data sources
  • Complex conditional replacements or transformations

It’s important to note that query folding behavior can vary depending on the data source and connector. Some modern databases can handle more advanced transformations than others. Because of this, it’s a best practice to regularly check whether folding is still occurring by right-clicking a step in Power Query and selecting “View Native Query.” If this option is disabled, folding has been broken at that step.

Third: Remove Unnecessary Columns, Especially Those with High Cardinality

Removing unnecessary columns is another simple yet powerful way to shrink your data model. Every column imported into Power BI consumes memory, but columns with high cardinality—those with many unique values, such as transaction IDs or timestamps—are particularly memory-consuming. If these columns are not actively used in visuals, relationships, or calculations, they should be removed as early as possible in the data loading process. This not only reduces file size but also improves compression and speeds up query performance.

Why is reducing the number of high cardinality columns important? Power BI stores data in a columnar format, meaning each column is stored separately rather than row by row. This allows for high compression, especially for low-cardinality columns, because repeating values can be stored efficiently using techniques like dictionary encoding. High-cardinality columns compress less effectively, so removing unnecessary unique columns from fact tables can improve performance and reduce file size.

Another way to streamline your data model is to avoid including descriptive columns (like names) in fact tables that are already linked to an ID. For instance, a fact table might have both ItemID and ItemName. However, the ItemName column isn’t necessary in the fact table because the fact table should connect to a dimension table—such as dim_Item—that contains the ItemName (as in the star schema). Any reporting or dashboard use of ItemName should reference the dimension table rather than the fact table.

Fourth: Avoid Creating Calculated Columns as Much as Possible

Finally, minimizing the use of calculated columns can lead to substantial performance gains. Calculated columns are computed and stored in the model, increasing its size and memory footprint. In many cases, their logic can be replaced with measures, which are calculated on the fly and do not consume storage space. If you need to perform a calculation that iterates over rows in a table, use the DAX X-functions instead (e.g., SUMX; AVERAGEX). Measures are generally more efficient and flexible, especially for aggregations and dynamic calculations. By shifting logic away from calculated columns and into measures where appropriate, you can create a leaner and more performant model.

If you really need to add another column, instead of creating a calculated column with DAX, it’s best practice to add the column in either the query editor or in the original data source.

Conclusion

By implementing these four steps—disabling Auto Date/Time, preserving query folding, removing unnecessary columns (especially those with high cardinality), and avoiding calculated columns when possible—you can significantly optimize your PBIX files. These practices not only improve performance but also create cleaner, more maintainable data models that scale more effectively as your datasets grow.

Was this article useful?
Like
Dislike
Help us improve this page
Please provide feedback or comments
Access denied
Access denied