Considerations When Building Database Schema with Federations in SQL Azure

Article
04/16/2011

When working with federations, database schema requires a special consideration.

With federations, you take parts of your schema and scale it out. With each federation in the root database, a subset of objects are scaled-out. You may create multiple federations because distribution characteristics and scalability requirements may vary across sets of tables. For example, with an ecommerce app, you may both have a large customer-orders set of tables and a very large product catalog that may have completely different distribution requirements.

All the schema artifacts in the root and in federation members are scoped to the database. Meaning objects in the root database are only visible when connected to root and objects in the members are only visible in the member. There is no schema enforcement across members of a federation to have exactly matching schema elements. So federation member 1 and 2 may have completely different schema.

The root database, provide the containment boundary for all information required to access all the information app cares about – root know about all federations and all federation members in these federation members and their distribution scheme and current distribution layout. This is all exposed through metadata for federations. New bunch of system tables like sys.federation, sys.federation_members and more will give you this information in the root and in federation members. Root’s containment boundary also serves as a control point for authentication and account management across all federations and their members. the following figure represent the setup I described above;

To express all this, there are surprisingly few annotations required in the federated schema. All object other than tables do not require any special annotations. Stored procedures, indexes, functions or triggers work within the scope of the database they are created in, regardless of the specific location of the object: in the root or in any federation member. The only annotations required are on the federated tables. Lets take a step back and take a look at the types of tables in databases with federations. Federations introduce 3 types of tables; federated tables, reference tables and central tables.

Federated Tables:
Refer to tables that contain data that is distributed by the federation. Federated tables are created in federation members and CREATE TABLE syntax contain a federation distribution key annotated with the FEDERATED ON(federation_distribution_key = column_name) clause.

 CREATE TABLE t1(…) FEDERATED ON (distribution_key=column_name)

In the figure above, federated tables are marked light blue. Federated tables contain part of the scaled out data. There can be many federated tables in a federation member. When a SPLIT operation is issued; federated table schema is fully copied to the destination members. However, federated table data is filtered based federation key and based on the value of the split point to the destination federation members.

First thing to remember is that column specified in the federated tables as the distribution key is required the exactly match the data type of the federation schemes data type definition. This is specified as part of the CREATE FEDERATION statement for the federation.

Another important aspect is that the column that is the federation key is present in each federated table. For some tables, this may require denormalization. For example, order_details table may not contain customer_id but to annotate the table as part of the classic orders & order_details schema, customer_id needs to be added to the order_details table as well as the orders table.

There are a few others requirements to remembers;

Federation key column can only be one of the following types in v1: INT, BIGINT, UNIQUEIDENTIFIER, VARBINARY(900). In future iterations, you can expect the list to expand.
Federation key column value in the table cannot be a nullable type and cannot be updated.
Federation distribution key is required to be part of every unique key.
Foreign key relationship between federated tables are required to include the federation key at the same ordinal. This restriction does not apply to foreign key relationships to reference tables.
In v1, federation key column cannot be a computed column.
Require all foreign key relationships between federated tables to include federation distribution key.
Due to repartitioning operations, database scoped functions are challenging with federation members as well. Federation members do not support identity property on tables and
Due to similar limitations, timestamp data type is not supported in federation members.
Schema for the federated tables can only be changed in an UNFILTERING connection.

Reference Tables:
Refer to tables that contain reference information to optimize lookup queries in federations. Reference tables are created in federation members and no special annotation is required when creating reference tables. The absence of the FEDERATE ON clause is all that is needed. Reference tables typically contain small lookup information useful for query processing such as looking up zipcodes that is cloned to each federation member.

In the figure above, reference tables are marked green. When a SPLIT operation is issued, reference table schema and data is cloned to both destination members.

A few things to remember with reference tables;

The same identity and timestamp limitations apply to reference tables as well.
Updating reference tables schema and data can only be done in an UNFILTERING connection.
Reference tables cannot have foreign key relationships referring to federated tables.

Central Tables:

Refer to tables that are created in the federation root for typically low traffic objects such as application metadata. No special annotation is required for the these. In the figure above, central tables are marked orange. Central tables simply only exist in the root database context and are accessible only when you are in the root database. Some of the limitations listed above do not apply to the root database and tables created in the root database. Since centralized tables are in the root database only, they do not participate in any repartitioning operations. I refer to this above as well but one thing to remember is that centralized tables are only visible when connected to the root and not accessible when connected to the member databases.

Other Considerations for Federation Member Schema

Finally, a few other limitation in v1 exist in federation member schema.

This is implied above but to be explicit; object_ids between federation members for objects with same name are independent.
Indexed views are not supported in federation members.
All changes that impact the global view of the atomic unit require connections that has turned off filtering with FILTERING=OFF.
- Schema updates to the federation members can only be done with none-filtering connections..
- Updates to the reference table data also require none-filtering connections.
Much like the login and user relationship that exist today between master database and user databases, user accounts in federation members have to match the user account name at the root at create time.

The above list summarizes the list of requirements for the schema. As always, if you have questions, feel free to reach out through the blog.

-Cihan

Considerations When Building Database Schema with Federations in SQL Azure

Additional resources