PBIX bloat: a cause and the “solution”


Had a customer report that their 100 KB Excel file was bloating to 77 MB when imported into Power BI Desktop.

Turns out the behavior the customer was hitting is by design.

Which raises the questions: what is happening, and why?

Before we go into the underlying cause, it is important to remember that the analytics engine Power BI leverages was designed for a hosted service, and many of its design decisions favor speed over file size. In this case the customer only had around 40 columns and 20,000 rows in their Excel spreadsheet, but had two date columns whose ranges would span millions of rows if there were a row for every date.

 

To improve performance, under the covers we do create indexes and pad out the data structure so that it has a row for every date in the range! This can really help query performance, at the cost of file size.
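To get a feel for how a handful of dates can turn into millions of rows, here is a rough back-of-the-envelope sketch in Python. It is not how the engine is implemented; it simply assumes the padding amounts to one row per calendar day between the earliest and latest value in a column. The function name `padded_row_count` and the sample dates (including a "far future" placeholder value) are made up for illustration.

```python
from datetime import date


def padded_row_count(first: date, last: date) -> int:
    """Rows needed for one row per calendar day from first to last, inclusive.

    Assumption: padding is a simple day-by-day fill of the column's range.
    """
    return (last - first).days + 1


# Hypothetical date column: a few real dates plus one placeholder
# "end of time" value, a common pattern in exported spreadsheets.
values = [date(2015, 3, 14), date(2016, 7, 1), date(9999, 12, 31)]

rows = padded_row_count(min(values), max(values))
print(f"{len(values)} distinct dates -> {rows:,} padded rows")
```

Under that assumption, a single outlier date is enough to stretch the range across thousands of years, which is the kind of span that would explain a small spreadsheet producing a very large file.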

 

If file size is more important, you can control this behavior by unchecking the Time Intelligence option (Auto Date/Time) in the Data Load settings under Options (see image below).

 

[Image: the time intelligence option in the Power BI Desktop Options dialog]

 

compression, file size, increase, corrupt, corruption
