Fragmentation in SQL Azure!


Do you think you are paying for whatever space you are using or you are paying extra?

Most of the people think that they won’t have fragmentation in SQL Azure, but that’s not right SQL Azure is also SQL residing on machines which are present in cloud environment and we have as equal probability of fragmentation in Azure as it is there in on-premise environment.

Let me give you some recent problems on which I worked on,

One of my client was having database of 120GB, he was using federation and having multiple shards, he did split operation which does not involve any data movement but it just creates new shard which will hold data of next month and after Split operation finished his 120GB shard was showing size as 80 GB

Now what happened to those 40 GB?

Number of rows were same there was no data loss so was he paying more than what he was actually storing?

Reason of that data size difference was fragmentation, when he did Split operation SQL Azure engine did rebuild of indexes in background which removed fragmentation and which resulted in reduction in size of database shard.

In another example database size became 350 MB from original 1.5 GB after running query which does rebuild on all indexes of database.(isn’t that great considering how much dollar it would have saved!!!!)

Fragmentation becomes nightmare when we have unique identifier as primary key and large columns like varchar(1000).

Now question comes how to find out whether my database is having fragmentation or not.

 

First let’s find size of all tables in database, most of the people use database size feature of SQL Azure management portal to find out size , I would say instead use SQL Query to find perfect number.

Query:

 

select obj.name, sum(reserved_page_count) * 8.0 as "size in KB" from sys.dm_db_partition_stats part, sys.objects obj where part.object_id = obj.object_id group by obj.name

 

This query gives size of each table in given database.

Connect to database in concern and run this query.

 

 

select sum(reserved_page_count) * 8.0 / 1024 as "size in MB" from sys.dm_db_partition_stats

This query gives size of overall database… In case you are not interested in size of each table!

 

Now next task is finding out % of fragmentation…

 

For that also we have query which uses SQL Azure DMV’s to find out % of fragmentation

 

SELECT
DB_NAME() AS DBName
,OBJECT_NAME(ps.object_id) AS TableName
,i.name AS IndexName
,ips.index_type_desc
,ips.avg_fragmentation_in_percent
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.indexes i
ON ps.object_id = i.object_id
AND ps.index_id = i.index_id
CROSS APPLY sys.dm_db_index_physical_stats(DB_ID(), ps.object_id, ps.index_id, null, 'LIMITED') ips
ORDER BY ps.object_id, ps.index_id

 

Running this query will give you result something like…

 

 

 

So it gives,

1)      Database Name

2)      Table Name

3)      Index Name

4)      Index type

And

5)      % of Fragmentation for that index

 

Any index which shows more than 10% of fragmentation needs rebuild or reorganization of index.

If fragmentation is less than 30% we can go with reorganization of index and if fragmentation is more than 30% then we need to rebuild indexes.

When we rebuild indexes we can do it with ONLINE=ON option which will keep table live while rebuilding happens, it will slow down overall rebuilding process.

 

There is a limitation that when we have any table which has columns which has large data like

varchar(50) varbinary(max) etc , we cannot do rebuild operation with ONLINE=ON .

 

For such table when we do rebuild of indexes table won’t be accessible.

 

Now question is how to do rebuild or reorganization of indexes…?

 

We have query for that also…

 

Query:

 

DECLARE @TableName varchar(255)

DECLARE TableCursor CURSOR FOR
(
SELECT '[' + IST.TABLE_SCHEMA + '].[' + IST.TABLE_NAME + ']' AS [TableName]
FROM INFORMATION_SCHEMA.TABLES IST
WHERE IST.TABLE_TYPE = 'BASE TABLE'
)

OPEN TableCursor
FETCH NEXT FROM TableCursor INTO @TableName
WHILE @@FETCH_STATUS = 0

BEGIN
PRINT('Rebuilding Indexes on ' + @TableName)
Begin Try
EXEC('ALTER INDEX ALL ON ' + @TableName + ' REBUILD with (ONLINE=ON)')
End Try
Begin Catch
PRINT('Cannot do rebuild with Online=On option, taking table ' + @TableName+' down for douing rebuild')
EXEC('ALTER INDEX ALL ON ' + @TableName + ' REBUILD')
End Catch
FETCH NEXT FROM TableCursor INTO @TableName
END

CLOSE TableCursor
DEALLOCATE TableCursor

 

 

 

You need to connect to database which is having fragmentation problem and run this query, it will do rebuild on all tables present in database.

If it is possible to rebuild indexes while keeping database online it will do that else it will run without online=on option.

 

Output of this query will be something like,

 

 

Once this query finishes executing run query to find database size again and see the magic!

 

Send some gifts if this blog help you in saving some money!!! J

 

Query to find fragmentation on specific database and specific table

 

 

SELECT a.index_id, name, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(N'<Database name>'), OBJECT_ID(N'<table name>'), NULL, NULL, NULL) AS a
JOIN sys.indexes AS b ON a.object_id = b.object_id AND a.index_id = b.index_id;
GO

 

 

Some links which gives more details,

 

http://msdn.microsoft.com/en-us/library/ms189858%28v=sql.105%29.aspx

http://msdn.microsoft.com/en-us/library/ms188917.aspx

http://beyondrelational.com/modules/2/blogs/76/posts/15290/index-fragmentation-in-sql-azure.aspx

 

 

 

Comments (12)

  1. Someone Asked me whether it is possible to rebuild one index at a time,

    Yes it is possible, i manage to automate it using below query

    DECLARE @TableName varchar(255)

    DECLARE @IndexName varchar(255)

    Declare IndexCursor CURSOR FOR

    (

    select '['+ind.name+']' index_name,'[' + IST.TABLE_SCHEMA + '].[' + IST.TABLE_NAME + ']'  table_name from

    sys.indexes as ind

    inner join sys.tables as tab on tab.object_id=ind.object_id

    inner join information_schema.tables ist on tab.name=ist.TABLE_NAME

    )

    OPEN IndexCursor

    FETCH NEXT FROM IndexCursor INTO @IndexName,@TableName

    WHILE @@FETCH_STATUS = 0

    BEGIN

    Begin Try

    exec('ALTER INDEX '+@Indexname+' ON ' + @TableName + ' REBUILD;')

    print('Rebuild done for INDEX '+@Indexname+' ON ' + @TableName)

    End Try

    Begin Catch

    print('Rebuild failed for INDEX '+@Indexname+' ON ' + @TableName)

    End Catch

    FETCH NEXT FROM IndexCursor INTO @IndexName,@TableName

    END

    CLOSE IndexCursor

    DEALLOCATE IndexCursor

  2. Alexandros says:

    Can you make a rough estimate on how mush time it will take to rebuild the indexes of a 2GB database?

  3. Dilkush says:

    It will actually depend on how fragmented indexes are.

    I do not have estimates but it could have 30 min to somewhere around 1 hour max.

    Do online rebuild operation so it will not impact your production(unless it uses up all tempdb space, in which case do index by index rebuild)

    And then do offline for which online fails.

  4. G says:

    I tried to run the script ("Now question is how to do rebuild or reorganization of indexes") but after a while it throw:

    Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.

    Any way to fix the timeout?

  5. G says:

    Nevermind,

    The timeout was the client (azure's SQL database manager), so I run the script per table

    ALTER INDEX ALL ON dbo.SMSMessages REBUILD;

  6. Dilkush-MSFT says:

    G You can also use per index fragmentation mentioned in first comment.

  7. Jathin says:

    Can we rebuild index from the command prompt?

    I am getting the following error when i tried to use sqlcmd to rebuild my index.

    >sqlcmd -Q "ALTER INDEX PK_Index ON dbo.scheme.tablename REBUILD with (ONLINE = on)" -U user@server -P password -S sqlserver.database.windows.net

    Msg 40515, Level 15, State 1, Server sqlserver, Line 16

    Reference to database and/or server name in 'dbo.scheme.tablename' is not supported in this version of SQL Server.

  8. Dilkush-MSFT says:

    I think command is something wrong,

    ALTER INDEX PK_Index ON dbo.scheme.tablename R

    Azure does not support three part query… So it can be schema.tablename but it cant be dbname,schema.tablename

  9. I've been fighting with a DotNetNuke install hosted on Azure for a while. We've been testing as there's a lot we like about Azure but the performance when editing DotNetNuke has caused us to go a different route…but that's another..

  10. MyWish says:

    Hi

    beyondrelational.com/…/index-fragmentation-in-sql-azure.aspx

    Blog is saying

    FILLFACTOR (it is defaulted to 100%, and cannot be changed)

    REORGANIZE (we have REBUILD only)

    SORT_IN_TEMPDB

    DATA_COMPRESSION

    are won't work on SQL Azure ..Is it true ?

  11. yup that would be correct as far as i know.

    if you look at below article,

    msdn.microsoft.com/…/ms188388.aspx

    there is no re-organize for Azure alter index command.

    and as tempdb is shared across all tenants in azure I think sorting or usage of tempdb would be limited otherwise it will blow up temp db causing server issues for overall server.

    Azure world is changing rapidly so not sure whether this features will come up sometime or not but at this point no!!!! (you can research on other two as I didn't check whether they are supported or not)

  12. Ian Bates says:

    Wow – thanks for this post! – I was puzzling why I took a copy of our Azure db and one of the tables used less than 1/5 of the space on Azure. Turns out index fragmentation was 99%!