Consistency and Corruption in Enterprise Databases


One topic that is frequently avoided or trivialized, unless it is personally affecting you, is database corruption.   It is not a pleasant topic to consider.   But expanded awareness can be valuable when handling corruption situations.

For this blog post, “consistency” and “corruption” are used as antonyms to each other.   i.e. When data is consistent or a database has consistency then there is no corruption in it, and conversely when a database is determined to be corrupt it is correspondingly inconsistent.  The word “inconsistency” carries less negative connotation than does “corruption”, so it gets used sometimes to soften the psychological stress surrounding corrupt database situations.   Corruption in a critical line-of-business database warrants serious attention.   For practical purposes, an inconsistent database is synonymous with a corrupt database.

There are different ways to describe the consistency in a Relational DataBase Management System (RDBMS) database.  They distill into two main areas: RDBMS owned and Application owned.  Those things that are under the control of the RDBMS, and those things that are outside of the control of the RDBMS but under the control of the application which uses that data.  The distinction is somewhat subtle but significant and is the purpose for this post.  

RDBMS consistency is assured on two levels: Physical and Logical.  RDBMS Physical consistency (RDBMSPhysCon) is best thought of as the correct storage and retrieval of all the zeros and ones that were saved from computer memory to disk and retrieved back from disk as intended.   RDBMS Logical Consistency (RDBMSLogCon) involves the correct interpretation of all the metadata that a RDBMS uses to make appropriate sense out of the zeros and ones that define the data of interest.  Taken together, they can be referenced as RDBMS Consistency (RDBMSCon). 

Application consistency (AppCon) relates to everything between the RDBMS and the end consumption of the data.  AppCon consists in all additional metadata used by an application that the RDBMS is  NOT aware of but is still critical to the correct and complete interpretation of the information stored in the database.  For instance: Header-Detail relationship information that is not maintained through RDBMS Declarative Referential integrity, or proprietary Data Dictionaries that keep an additional layer of data indirection on top of what is managed by the RDBMS.

These three aspects of consistency (or of corruption) can be visualized as three concentric spheres:  RDBMSPhysCon on the inside, RDBMSLogCon in the middle, and AppCon on the outside.   Some applications store all metadata about their data and all relationships between their tables inside of the RDBMS database itself, such that there is no metadata that the application uses that resides outside of the logical consistency in the RDBMS database.   In those application environments the APPCon outer consistency sphere shrinks to nothing or is effectively transparent.   Such applications have RDBMS encapsulated consistency.    Some enterprise class applications, however, do not rely on RDBMS encapsulated consistency.   That is, they depend on the outer sphere of Application Consistency for the application to operate properly.  This is significant when considering and handling corruption in such databases.  It is always delivered application consistency that the applications,and the end-users of the applications, expect for productive use of the
critical line-of-business database.

 

Elaborating on this further

A page, in the RDBMS world is a fixed amount of zeros and ones to be considered as a unit.   In SQL Server a ‘page’ has the fixed value of 8 Kilobytes or 8,192 bytes or 65, 536 bits.   So a page in SQL Server is really just a continuous stream of 65,536 zeros and ones.   “Tables” {in SQL Server vernacular} are made up of lists of connected pages to keep the data identifiable as a unit.  Relational databases are large collections of lots of these pages that are structured for fast access and efficient storage.   On these pages are indicators of how many rows should be on a particular page and where each next row should begin.   And there are also indicators of where pages came from and go to next.   More information can be found on the web {e.g. here}

RDBMSPhysCon as mentioned is the reliable storage and retrieval of a sequence of zeros and ones between memory and disk and back again. But a RDBMS has no knowledge of your data stored in it to know what is different from what you had intended to be there.   So all an RDBMS can do is check for the consistency of the data, by asking itself things about that data that it does know:  Is there the correct number of rows on this page compared to what should be?  Is the total number of bits retrieved for a page the same number of bits that were saved?   These are things that the physical consistency checking in the database can quickly and easily verify as it checks a page.   “Torn pages” are a common form of physical corruption where a fragment of a page was incompletely written out to disk.   These are usually recognized when the RDBMS contacts the actual pages (as in when an application tries to access that page or when DBCC CHECKDB is run and the consistency is explicitly assessed).

A physically consistent database is one where all of the pages of data are consistent in sequence, not fragmented or torn, and consistently mapped to a particular object.  RDBMSPhysCon is the most fundamental aspect of consistency.  Without physical consistency, one cannot deliver meaningful data to an application in a completely consistent fashion.   Most corruption starts out as physical corruption.

RDBMSLogCon presumes physical consistency but additionally draws in the rest of the database centered components that work together to make sense of the data from the database perspective.   RDBMSLogCon
is concerned with things like: are all the column data values consistent with their column data types specifications, are all of the nonclustered indexes correctly referenced to valid clustered index keys or Heap RIDs, and are all the referential integrity rules and constraints valid across the domain of tables and keys to which they apply (i.e. that all foreign keys in tables have valid primary keys)  The SQL Server consistency checking utility (DBCC CHECKDB) is able to check both RDBMSPhysCon and RDBMSLogCon from the perspective of the RDBMS.

If your application has RDBMS encapsulated consistency and if it is clean of all signs of physical and logical database corruption, then that database is fully consistent.   Since there is no additional layer of application consistency, because it has RDBMS encapsulated consistence, then everything within the database makes complete sense for any application that would be accessing the database.  All the consistency needs are entirely within the scope of the database.  But if the application that uses that database has other expectations of AppCon then this is NOT necessarily the case.

Many enterprise class applications do not rely on RDBMS encapsulated consistency.   e.g. Some choose to not use SQL Server’s DRI features.   For instance, they can have separate tables acting in Parent-Child relationships with no foreign keys assigned inside of the database (that information is managed within the application; not within the RDBMS).   As a result, there is no way for the database consistency checking (DBCC)
utility to determine whether the relationship between two such tables is consistent or not.  DBCC CHECKDB can only assess RDBMSPhysCon and RDBMSLogCon.  Enterprise class applications can also use a data dictionary to provide their own proprietary wrappers around many RDBMS features to deliver a consistent interface across the different RDBMS providers.    The RDBMS isn’t able to meaningfully inspect proprietary objects to interpret what consistency means in regards to the physical instantiation of those objects inside of the RDBMS.   That is, there is nothing that SQL Server, or any RDBMS, can do to ensure the AppCon of any database.   AppCon is outside of what an RDBMS can assess.

What are the attributes of database applications where AppCon is significant?   Most have one or more of the following attributes:

  • A proprietary application centered meta data repository (or data dictionary).  
  • Cross table references that are managed outside of the RDBMS’s Declarative Referential Integrity (DRI) management.  i.e. There can be Parent and Child relationships that the database has no awareness of.
  • Nested metadata internal to the data itself.  Data about the data or its relationship to other data, kept within the bits of data itself that the application understands but that the RDBMS does not.
  • The code of the application is built to work across multiple RDBMS vendors.   Therefore different nuances of a specific RDBMS vendor might not be supported. This typically implies the existence of a data dictionary.

 

So why is this blog post important?  Why should one care about this topic of Application Consistency rather than simply Database Consistency?  

 

  • To amplify why a database restore is so very valuable when fixing corruption problems.
  • To encourage you to make sure you have a backup strategy in place, that it is known how it works, and that it is tested to ensure that it works.
  • To grasp the impact to Application Consistency that might arise through efforts to eliminate RDBMS corruption when restoring from backup is not an option.
  • To consider what might be done to better prepare for impacts to Application Consistency when restoring from backup is not an option.

 

Database corruption happens.  And will continue to, so long as digital computing is deemed valuable for civilization.  When it is encountered, after eliminating the source cause of corruption, there is a need to get the database back to a fully consistent state (RDBMSPhysCon, RDBMSLogCon, and AppCon).   Restoring from BACKUP achieves this.   It delivers all three.  There will usually be some data loss owing to the need to go back to an earlier point-in-time, but it WILL deliver a fully consistent state of the database at that time.   And for most enterprise class applications that completely consistent state is very crucial.  Consider the impact of  lost financial details, in accurate books, and failed SOX compliance.  By restoring from backup, additional data may need to be re-entered that occurred after the time that the backup was restored to, but there will be no cause for concern for any “holes” in the data before that time.  Confidence in the reliability of the RDBMS data can be maintained.

 

It is important when working with corruption inside of an enterprise database, to remember that AppCon is the final goal of any effort to remove corruption from a database and return it to productive use.   RDBMSCon is not the only thing to consider.  One should not simply remove database corruption {support experts have guidance for this} without understanding the impact of such actions on AppCon.  The database in such situations will not truly be viable, from an end-user perspective, until full Application Consistency is re-established.  

 

In the very regrettable state where corruption is encountered but there is no viable backup that can be used, RDBMSCon must be forced {with the help of corruption handling experts or support professionals}.  In that
situation, steps should be considered to mitigate the impact to AppCon as a result of forcing RDBMSCon.   Specifically, for systems on SQL Server what this means; always try to get at least the Primary Key columns from all tables expected to be impacted by any physical consistency repair effort.  When restoring from backup is not an option and the corruption is forcibly removed (RDBMSCon forced, through collaboration with support professionals) generally a loss of data will be encountered, and the availability of Primary Keys can be valuable for application experts to identify the missing data and sometimes even successfully resynthesize it.   These are things that all individuals working with corrupt databases should consider as they strive to renew Application Consistency for expected productive use.

 

RECOMMENDED ACTIONS

  1. Know your backups are good.  Validate that you have viable Backups of all critical databases, and know how to restore them.    Test and validate that you have a functional Disaster Recovery Plan.
  2. Understand Consistency.  Understand the difference between RDBMSPhysCon, RDBMSLogCon, and AppCon. 
  3. Grasp how Applications use the data.  Know, at a high level, the data inside of the databases for which you are responsible.   Know if Application Consistency is important to the application or not.
  4. Understand what DBCC CHECKDB does and doesn’t do.  Know that DBCC CHECKDB cannot assess AppCon.   That is CHECKDB can be clean of all errors but the database might still be corrupt as to Application Consistency and productive functional use.  This is relevant if you ever end up needing to force RDBMSCon by some drastic actions (in collaboration with support professionals) because no viable backup exists.   One should not simply stop with, “DBCC CHECKDB now shows no errors.”, for AppCon problems need to still be considered.
  5. Involve trained experts when considering database corruption. If you encounter corruption in a database always involve a trained professional.  
  6. IF AppCon is important AND corruption is encountered in that database AND restoring from backup is not an option, such that RDBMSCon needs to be forced: Make efforts to obtain Primary Key
    information (when possible), for that can be helpful for efforts to re-establish AppCon.
Comments (0)

Skip to main content