Doc storage in SQLce and the 4gb limit


We’ve been getting a number of questions regarding Doc/blob storage and the current 4gb limit. 
For doc storage, I’m hearing two concerns. 



  • Perf when SQLce is used for blob storage.

  • Removing the 4gb limit so that you can store more docs in the database. 

On devices, space was confined, so these just weren’t mainline scenarios.  On desktop, we realize they will be.  For SQLce 3.1, we didn’t do a ton of work, as the goal was to get the existing SQL Server Mobile product out and available for general desktop usage.  Most of the work went into MSI packaging, private deployment and, of course, the several name changes the team had to deal with along the way. <g>


Sravanth, our Program Manager running the Perf lab for SQLce, has done a lot of great perf work.  His team identified several opportunities to increase perf, such as bumping up the .sdf file allocation when more space is needed.  The team incorporated several of their findings into the 3.1 release, and will incorporate additional enhancements in the 3.5 release.  However, the team hasn’t yet done extensive testing of blob storage for desktop scenarios, but will soon.  They’re also working on a perf white paper based on their findings.  I’ll post some snippets as a preview to the doc soon. 
Regarding blob storage: I don’t want readers to think the team doesn’t consider these important scenarios for desktop usage.  I’m just being upfront that on devices this wasn’t the case, so it may not be optimal in the current release.  For the desktop scenarios where SQLce has been used in products like Media Center PC, the data didn’t involve large amounts of storage, or blobs.  As Sravanth’s team gathers more perf data, we’ll post the findings and let you know what we plan to do about it.  The obvious place to start is to keep blobs in separate tables.  Keeping a blob column in the same table as the rest of your data will certainly slow things down for any number of reasons.  Beyond that, …please stand by… and we’ll have real numbers soon.
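As a rough sketch of the separate-table approach (the table and column names here are hypothetical, not from a shipping sample), the idea is to keep the blob in a 1:1 companion table so scans over the main table never touch the blob pages:

```sql
-- Hypothetical schema: the frequently-queried data lives by itself.
CREATE TABLE Documents (
    DocId    INT NOT NULL PRIMARY KEY,
    Title    NVARCHAR(200),
    Modified DATETIME
);

-- 1:1 companion table holding only the blob, joined on demand.
CREATE TABLE DocumentContent (
    DocId   INT NOT NULL PRIMARY KEY,
    Content IMAGE,  -- blob type in SQLce 3.x
    CONSTRAINT FK_DocumentContent_Documents
        FOREIGN KEY (DocId) REFERENCES Documents (DocId)
);
```

Queries over Documents stay fast, and you only pay the blob I/O cost when you explicitly join to DocumentContent.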


For the 4gb limit, we recognize the need, but because the product was optimized around this size limit for devices, it’s not a simple thing for the team to change.  On devices, 4gb is beyond the normal size scenario.  Anyone seen a Pocket PC device with more than 4gb of user storage?  I’m sure there are some, but they’re certainly not common.  The desktop, or laptop, comparison would likely be a 1 terabyte 7200rpm drive.  Unlike Express Edition, SQLce isn’t meant to be the free database with a constrained set of features to get people started.  So, we will either remove this constraint, or push it into a range where it just won’t be an issue.  As for when, well, that’s a great question.  We have lots of requests, and it’s all a matter of prioritization.  If it were as simple as setting a bit somewhere, we’d run some perf tests to make sure it didn’t topple over and just change the bit.  Because the engine was optimized to assume a 4gb limit, it will take a bit more work, and the team needs to prioritize this cost against several other features.


I’ll see if we can post the list of features we’re considering so you can post your comments on our priority list.  But, we’ve heard 64bit, XML, Eventing (something to fill the Trigger gap), and of course the sync runtime.  The 3.5 release is mostly focused on enabling the Sync Services (OCS) features I’ve been talking about here.  When talking to customers, as soon as they become comfortable with using SQLce as their local data store, most of the time the very next question is how to synchronize it with their central server, and how to sync via services.  As part of the 3.5 release we’ll also have some additional features to round out the product, such as the TimeStamp datatype for better, consistent concurrency checking. 


So, we hear you on the blob storage and 4gb limit.  What would help us is to get a feel for your specifics.



  • Do you need blobs in the same table, or can they be stored in a related table, 1:1 or 1:many?  Today, we don’t support column-level synchronization in the new Sync Services.  Is storing objects in separate tables a reasonable solution?

  • What range of file sizes do you think you’ll be storing, on average?

  • How many objects do you think you’ll need to store on average?

  • Are these blobs mostly reference data originating from the server, or created on the client and synchronized up?  Or both? 

  • Will these blobs have lots of changes locally before or after they are synchronized with the server? 

Thanks,
Steve


Comments (20)

  1. jamome says:

    Steve, I do see where you are going with this post.  And I can totally empathize with your position of bridging the roads between SQLce team development aspects and the needs/wants of your customers (other development people). But… your questions are somewhat hard to answer.  For example: “Do you need blobs in the same table?”  Personally, it makes me squirm a bit to think of choosing a DBMS that would constrain me in such a manner as how I can lay out my data relations.  In similar regards, I’m just as disenchanted with the other questions b/c to me (a developer) I’m not thrilled about estimating worst cases when I don’t even know what they are.  So please: don’t take my post as criticism, b/c technically the questions you’ve asked are reasonable, very good, cut-and-dry questions that you technically need to ask.  Judging by the deafening silence of everyone else who jumped on the “me too” > 4GB bandwagon, I’m guessing these other people are just as leery as I am of committing to some worst-case design conditions on a potential solution that’s possibly a definite maybe.  Since you’ve probably already looked at the SQLce 4GB roadblock, could you talk about that a bit?  I’m guessing a major factor is that SQLce is using 32-bit unsigned integers (UINT) throughout the code, and as any knowledgeable C++ developer can tell you, UINT gives 4GB.  Could you compare/contrast, a bit, some possible development tracks required to extend past 32 bit – and some general performance trade-offs?  Personally, I would find this very, very interesting.  How might SQLce be extended past 4GB?  Would you have to mess with a 32bit far and 32bit near pointer (Yuck!)?  Or might you have some 64-bit emulation library for use of 64 or 48 bit ints on 32bit processors (serious performance hit!?!)?  
Does the SQL Server 2005 team have any basic principles you can apply to add a new type of table, such as “CREATE LARGE_TABLE”?  This table type could have special work-around code (logic) that would take a performance hit when other tables are joined, but it would allow you to support a larger table type.  After having written a bit, it’s hitting me that a SQLce database must have some sort of master space allocation scheme (mapping) that’s using 32-bit UINTs.  Could you have a different space allocation scheme where each table got up to 4GB?  Could you manage space on 4 or 8 KB pages (like SQL Server)?  If so, a 32-bit collection of 1KB pages would provide a max DB size of 4TB, right?  But this requires serious re-architecture… and you’d probably need to re-architect the table code to get around the current max table size.  Which brings me to: currently, what are the max constraints in SQLce: max table size, max column width, max field widths (varchar 8000? exactly what data types are supported?), etc.   I’m just throwing darts here, but at a very interesting topic!  Steve: keep up the good work!  And I’m looking forward to trying SQLce even in a 4GB capacity.  

    Regarding Perf:  I see SQLce as an Access replacement, so… could you possibly (please) talk a bit about general performance differences between Access and SQLce?   And I’m not talking about all the extra Access capabilities, extensions, tools, etc. (GUI editor, VBA integration, reports, etc.), I’m just talking about SQL ISUD performance: INSERTs, UPDATEs, SELECTs and DELETEs compared to Microsoft Access.

  2. Peter says:

    Hi

    We plan to use SQLCe for our measurement equipment. The application might run on the equipment itself or on a PC. In both cases, Windows XP is running on PC hardware. There will be no connection to a SQL server (no synchronisation needed).

    100 bytes per measurement

    10 measurements per second

    10 channels

    10 kilo bytes per second

    600 kilo bytes per minute

    36 mega bytes per hour

    864 mega bytes per day

    25.92 giga bytes per month

    311.04 giga bytes per year

    Some of our measurement equipment is installed on the top of a mountain and might collect data for 1 year.
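Peter’s back-of-envelope numbers check out; a quick sketch, using his figures (decimal units and 30-day months, as in the comment):

```python
# Verify the data rates from the figures above.
bytes_per_second = 100 * 10 * 10          # 100 B/measurement x 10/s x 10 channels
bytes_per_day = bytes_per_second * 86_400  # seconds in a day
bytes_per_year = bytes_per_day * 30 * 12   # 30-day months

print(bytes_per_second)       # 10 KB/s
print(bytes_per_day)          # 864 MB/day
print(bytes_per_year / 1e9)   # ~311 GB/year, far beyond the 4gb limit
```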

  3. Steve.Lasker says:

    Hi Peter,

    This is a great scenario to add to the matrix.  While we will expand the limit, this is one of those scenarios where it might make sense to physically break up the files.  As with web logs, you might want to break up the files per week, month, quarter, or even year.  
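As a minimal sketch of the split-by-period idea (the file-naming scheme and helper here are purely hypothetical, not a shipping API), the application would route new rows to the database file for the current period:

```python
from datetime import date

def sdf_name_for(day: date) -> str:
    # Hypothetical naming scheme: one .sdf database file per month.
    # New measurements go into the current month's file; older files
    # become read-only archives that can be moved off the device.
    return f"measurements_{day.year:04d}_{day.month:02d}.sdf"

print(sdf_name_for(date(2007, 3, 15)))  # measurements_2007_03.sdf
```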

    When exploring the perf metrics of large databases, this is another example of how a database could grow with lots of little entries, as compared to the larger blob entries that have been discussed.

    Thanks for the example,

    Steve

  4. Well, after several name changes, power outages, the holidays, and crippling winter storms, we’ve released

  5. Stacey.Hopkins says:

    Hey Steve,

    I started thinking about your questions and worked with our product experts to come up with some answers that actually made sense…

    For a little background, the main application I’m looking at using SQLce for deals with floorplans and building layouts.  There is actual data, like # of floors, square footage, etc., as well as images.  The images can be anything from PDF to JPG to AutoCAD files.

    Answers:

    1)  I currently split the images into a separate table

    2 & 3)  File sizes vary from a few hundred KB to hundreds of MBs per image.  Also, there can be multiple images per floor, building, complex, etc.

    4 & 5)  The images originally come from the server.  They can be used as reference data, or the user can make changes to them to reflect changes or errors in the original image.  These changes then need to be synched back to the server, and from there will be propagated to other clients.

    Hope this helps in some way, and I really appreciate the work y’all are doing!

    Stacey

  6. We’ve started to get a number of requests asking whether SQL Server Compact will support 64bit. The answer is

  7. We’ve started to get a number of requests asking whether SQL Server Compact will support 64bit. The answer

  8. JohnGalt says:

    The way we’re planning on using SQLce is to use it as an ultra-light-weight easily deployable system that can be put on our "remote users".

    The database file is encrypted and has all documents loaded into it. We have our own synchronization system, so we don’t need the MS one.  We just read the blob byte by byte from the database using a data reader, and stream that up to the server.

    We stream out of and into the database using the same sort of commands as in SQL Server 2005, but the actual blob column is almost always in a separate table, so we don’t have issues along those lines.

  9. Farhan says:

    I need to know how many rows I can insert into a table in SQL CE.

    I am getting the error "not enough space to store data in database",

    although I have plenty of space on my SD card.

  10. Steve.Lasker says:

    Hi Farhan,

    I’m guessing you’re hitting the Max Database Size of 128mb, which is the default value.  You can change this in the connection string.  We use the Max Database Size to determine how much shared memory we pre-allocate.
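For reference, the cap is set through the Max Database Size connection-string property, with the value in MB (the file name here is just an example):

```
Data Source=MyData.sdf;Max Database Size=256;
```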

    We don’t have a physical limit on the number of rows or tables; it’s more a practical limitation that varies with the type of data.

    Steve

  11. Max Kosenko says:

    Hello Steve.

    Do you have any new plans on removing size limit?

    We need to store a sort of cache in the db, and that can easily go beyond the limits, even in the PDA version. We hoped the limit would be gone with 3.5, but it’s not. So, are there any thoughts on what the plans for that might be?

  12. Steve.Lasker says:

    Hi Max,

    By removing the size limit, are you referring to the 4gb limit, or the 128mb default value?  For the default value, you can simply change it in the connection string.  For the 4gb limit, due to the complexity, and its priority based on customer feedback, we’ve pushed >4gb work ’till our next major release.  For now, one should assume that is with the next major release of Visual Studio, but we haven’t firmed up specific feature work just yet.

    Steve

  13. Max Kosenko says:

    Hi Steve,

    Thanks for your answer. Yes, I mean the 4Gb limit. So SQL CE major updates are tied to VS releases, as I’ve understood. That probably means something like 2010 or 2011?

    Max

  14. Steve.Lasker says:

    In general, we tend to align our major releases of SQL Server Compact with VS as we focus on our developer audience.  

    That said, VS has been working to ship on 24-month cycles, so I wouldn’t assume 2010, but more like 2009.  It’s still early in the cycle, and we haven’t committed to a given schedule, so don’t hold me to that.  

    As you probably read, you can use multiple .sdf files to get larger storage, but there are limitations, such as you won’t be able to do cross database joins.  

    Feel free to email me directly to describe the types of data you’re storing that need >4gb, as it would be helpful to understand the specifics,

    Steve

  15. Max Kosenko says:

    >use multiple .sdf files to get larger storage, but there are limitations, such as you won’t be able to do cross database joins.

    Thanks Steve.

    That’s the main problem actually – we can’t query several tables in different dbs at once.

    Something like attaching database files with different tables in the same connection would help a lot (at least we’d have an option in 2009).

    Best Regards. Max.

  16. Latha says:

    Hi,

      What is the limit on the length of a string we can put in an nvarchar column?

  17. Steve.Lasker says:

    Hi Latha,

    Here’s some info on sizes:

    http://msdn2.microsoft.com/en-us/library/ms172451(SQL.100).aspx

    Steve

  18. Prashanth says:

    Hi,

    We have a device which has 2GB of internal memory & 2GB of external memory (optional). We are planning to use SQLCE as the database for our app. In case the size exceeds the available space, can we span the database across the memory cards? Or is there any other workaround for this issue, except "taking a backup to the PC :-)"?

  19. Steve.Lasker says:

    Hi Prashanth,

    Compact doesn’t provide a way to split or partition a database across multiple drives.  That’s actually a fairly complicated piece of infrastructure, since it amounts to creating a virtual file system.  If the Windows Mobile operating system provided a way to create a virtual directory that spans two partitions, such as drive striping, I suppose you could create a Compact database that happened to span those two drives, but I don’t believe Windows Mobile enables this.

    Steve

  20. Tushar says:

    Hi! Blobs are very important for storing pure binary files, encrypted files, images, compressed files, etc. Therefore the 4 GB limit is a bad thing. It is just like the Microsoft Access database, which runs into the same kind of limit. People are leaving and taking up SQLite and other such magnificent database systems. Now the time has come for you to release a version which would support at least 1 TB of database file size, as an optional parameter, so that the Pocket PC would function with 4 GB and the desktop could scale to as large as 1 TB, all in one and the same library. Oracle, too, has released Berkeley DB, and it's a great gift to those who desire it. Then why shouldn't we desire the same in SQLce?

    Best wishes and grace, 🙂

    Tushar