Version Control migration - 2006 makes a design decision

Last time I bribed my 1999 self to answer 20 questions about what was needed from a version control migration process. Below is a summary of the key points from the answers:

1) We need to support migrating only a portion of the entire source tree at a single time.

2) We need to support migrating in multiple iterations.

3) We need to support migrating history from the start of time.

4) We need to support migrating history from a fixed time or label.

5) Label migration is an important feature to allow the promotion model to continue.

6) The label based promotion model will likely be phased out soon.

7) The labels could be recreated by hand at the cost of several days of work.

8) Migrating a subset of the labels would help mitigate many short-term problems.

9) Integration history is nice but not strictly needed.

10) There is ambiguity around how some areas should be migrated but it is expected that it will “just work” in the end.

11) They would like invalid file names to be mapped to valid ones and not just skipped.

12) Migrating security settings for users and groups would be nice, but it is not a huge deal as the permissions are not very granular and will therefore be easy to set up.

13) File encodings must be properly migrated.

14) Workspaces should not be migrated.

15) Work item linking is important and something they would like to do if possible.

16) Change metadata (author, comments, etc.) should be retained as much as possible.

17) There are some domain users who are no longer with the company. Their changes should still migrate successfully.

18) Crash recovery is very important as this will be a long running migration.

19) Log consolidation is important because the migration will be done in stages.

20) Correctness is more important than speed.

21) Not having to intervene manually is more important than speed.

 

Additionally, here is some cost/benefit information:

1) The cost of not migrating history is large: several hundred thousand dollars of lost productivity over the first 12 months.

2) The cost of not having label migrations is a moderate amount of immediate pain that quickly diminishes and eventually goes away.

3) Migrating security settings manually will be cheap.

4) Migrating workspaces will actually create more work than it saves.

 

So what does this all mean?

What I’m really looking for here are the big architectural things. Issues like:

· Migrations can be run over multiple sessions

· Migrations can recover gracefully from a crash (and pick up where they left off)

· Migration logs can be viewed in a consolidated form

All of these point to the same thing – we need a method of recording state in a persistent and transactional form.

Additionally, label migration and work item linking could be written as custom tools that are not integrated with the core VC migration functionality. Both would need to be able to map a changeset in the source system to the migrated changeset in the target system. If the migration recorded the source changeset and the target changeset for everything it migrated, this would open up new possibilities for other tools that could be written out of band with the core migration tool.
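To make that concrete, here is a rough sketch of the lookup an out-of-band label tool would lean on. Everything in it is an assumption for illustration: the names ChangesetMap and translate_label are hypothetical, a label is simplified to a mapping from file path to source changeset number, and the pairs are held in memory rather than wherever the real migration would record them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ChangesetMap:
    """Hypothetical record of which source changeset became which target changeset."""

    _pairs: Dict[int, int] = field(default_factory=dict)

    def record(self, source_id: int, target_id: int) -> None:
        # Called by the core migration each time a changeset lands in the target.
        self._pairs[source_id] = target_id

    def target_for(self, source_id: int) -> Optional[int]:
        # Returns None when the source changeset has not been migrated (yet).
        return self._pairs.get(source_id)


def translate_label(
    label: Dict[str, int], mapping: ChangesetMap
) -> Tuple[Dict[str, int], List[str]]:
    """Re-point a label (path -> source changeset) at target changesets.

    Returns the translated label plus the paths that could not be mapped,
    so the tool can report them instead of silently dropping them.
    """
    translated: Dict[str, int] = {}
    unmapped: List[str] = []
    for path, source_id in label.items():
        target_id = mapping.target_for(source_id)
        if target_id is None:
            unmapped.append(path)
        else:
            translated[path] = target_id
    return translated, unmapped
```

A work item linking tool could use the same target_for lookup: read the link against the source changeset, then recreate it against whatever target changeset comes back.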

In other words we need to know what we are going to do, what we are doing and what we have done. We need this to be a reliable storage mechanism that supports transactions. We need to be able to query against the data and it has to be in a well-known format since other applications will need to make use of this information.

In short, we need a SQL database.
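As a minimal sketch of what that could look like, the example below uses SQLite through Python's standard sqlite3 module purely because it is self-contained; the actual database engine is not decided here. The table name changeset_state, its columns, the three status values, and the migrate_one callback are all hypothetical. It also assumes that retrying a changeset that was in progress when the process died is safe, which a real tool would have to verify against the target system.

```python
import sqlite3

# Hypothetical schema: one row per source changeset, tracking what we plan
# to do ('pending'), are doing ('in_progress'), and have done ('done').
SCHEMA = """
CREATE TABLE IF NOT EXISTS changeset_state (
    source_id INTEGER PRIMARY KEY,
    status    TEXT NOT NULL
              CHECK (status IN ('pending', 'in_progress', 'done')),
    target_id INTEGER
);
"""


def open_state(path):
    """Open (or create) the migration state database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def plan(conn, source_ids):
    """Record the changesets we intend to migrate; safe to call repeatedly."""
    with conn:  # commits on success, rolls back on an exception
        conn.executemany(
            "INSERT OR IGNORE INTO changeset_state (source_id, status) "
            "VALUES (?, 'pending')",
            [(sid,) for sid in source_ids],
        )


def resume(conn, migrate_one):
    """Migrate everything not yet done, picking up after a crash if needed."""
    # Assumption: work interrupted mid-changeset can simply be retried.
    with conn:
        conn.execute(
            "UPDATE changeset_state SET status = 'pending' "
            "WHERE status = 'in_progress'"
        )
    rows = conn.execute(
        "SELECT source_id FROM changeset_state "
        "WHERE status = 'pending' ORDER BY source_id"
    ).fetchall()
    for (source_id,) in rows:
        with conn:
            conn.execute(
                "UPDATE changeset_state SET status = 'in_progress' "
                "WHERE source_id = ?",
                (source_id,),
            )
        target_id = migrate_one(source_id)  # the real migration work
        with conn:
            conn.execute(
                "UPDATE changeset_state SET status = 'done', target_id = ? "
                "WHERE source_id = ?",
                (target_id, source_id),
            )
```

Because each status change is committed in its own transaction, a second run of resume only sees the changesets that never reached 'done'. Any other tool can then ask for the target of a source changeset with a plain query such as SELECT target_id FROM changeset_state WHERE source_id = ?.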

Other architectural thoughts next time – most importantly, “what type of process (console, WinForms, Win32 service, web service, etc.) should the migration be?”