Lately a lot of questions came up around what our different compression types in SQL Server 2008 would achieve in reduction of database volume. In this article and some following articles we would like to answer this question as well as the question on how to get an existing database to use SQL Server 2008 compression methods.
Lets clarify one thing first: SQL Server 2008 Row Level Compression is not really a compression in the sense of applying an algorithm to compress some data. It simply is a different row format which is more efficiently storing all numeric data types and fixed length character data types. In all our tests with SAP systems we so far didn’t experience any negative performance impact by applying Row Level Compression. Hence any SAP application which will install against SQL Server 2008 will create ALL tables Row Level compressed. Nonclustered Indexes are created non-compressed since SAP is not yet ready with the implementation of compressing non-clustered indexes. SQL Server 2008 itself can compress clustered as well as non-clustered indexes.
But now to our usage case in Microsoft’s own ERP system. The challenge looked like:
· We were looking at our SAP ERP system having 6.5TB data volume
· We were not allowed to take any additional downtime than the monthly 2h in order to perform additional work to compress the data. This forced us to perform compression of the tables during operational times using the possibility to rebuild the tables online
· We were not allowed to perform the compression during the week, but on the weekends
· Some Business User groups had concerns and we needed to arrange to have certain tables being rebuild on specific weekends
· The goal was to get ALL the tables onto the new Row level Compression as if the system would have been just installed by SAP against SQL Server 2008
From a tool side there are two possibilities. The functionality SAP developed to compress tables and a procedure I developed. The SAP tool has clear advantages in cases where the database is smaller (up to 1TB) and the possibility exists to take the system down for 24-36h. Whereas the tool I developed has advantages in situations as we encountered them here in Microsoft. I am in the process of writing a longer paper on both tools which we will publish on SDN and/or on Microsoft.com in the next few months. The procedure I developed will be released in one of the articles in this blog after some more testing.
Based on data we collected applying Row Level Compression in our test system it became clear that we will need to spread this activity over 5-6 weekends. We were allowed to start Friday evening 7pm, but needed to finish latest on Sunday afternoon 4pm every weekend. Based on the timing we got from the test system, we basically built 6 groups of tables. Each group was planned to be worked through on one weekend. The group for the first weekend contained all but around 120 tables from the empty tables up to small to midsized tables in the range of a few GB. At the other end of the spectrum, at the last weekend we just planned for compressing our largest table. Our largest tables in the SAP ERP system is a customer created table supporting a customer created functionality and has a volume of more than 1 TB with around 2.3 billion rows. The strategy was to perform compression on the smallest tables first in order to create free space which is necessary to perform compression on the larger tables.
Where are we now? Our progress was faster than expected. We were able to compress on two weekends all but the 6 largest tables plus some of the tables with varbinary(max) columns which can’t be rebuild online. All in all we already worked through 4TB of the original 6.5TB data volume. The 4TB of original data volume ended up to be 3TB volume after the compression. Means we cut a nice 1TB database volume out of our productive database. With at least another 2 weekends to go, we are pretty optimistic to cut out another 600GB at least. Sure, applying Row Compression to a table also will defragment a table. However we need to keep in mind that the productive database of Microsoft’s SAP ERP system got completely exported/imported for the Unicode conversion in February 2007. Hence fragmentation on some of the tables was rather limited at least to significant parts of the tables.
Another good experience was that our synchronous Database Mirroring could keep up with the rebuild activity of the tables on the productive database. There never was a need to resynchronize either the mirror or the Log-Shipping destination. Given the fact that we have 10 copies of our productive database within our SAP landscape, cutting 1TB out of it, meant saving all in all 10TB already in the whole SAP landscape.
We are looking forward to finish our activity in another 2 weekends after getting permission from our Finance Department who blocked all further compression activities while running Quarter End Reporting. Stay tuned for more news