Migrating from Team Foundation Version Control to Git (and preferably at scale)

I've been wanting to write this post for a while (a few years) now, but I never took the time. At some point I thought it wasn't relevant anymore, but since I'm working on a project in a similar context and I feel that the need still exists, I could not stop myself from writing it down.

If you don't care about the "What/Why", then I encourage you to scroll down to the "How" 😃

Here we go:

What/Why?

There was a time, only a few years ago, when Azure DevOps (TFS/Visualstudio.com/VSTS/... at the time) only supported its own version control system, called Team Foundation Version Control (TFVC for short). A lot of people have worked with it at some point in their career and, most likely, a lot of you still do.

Since then, a lot has changed: Microsoft added support for Git in Azure DevOps Services (and Server) and bought GitHub. All this is part of a bigger scheme that goes way beyond the contents of this post. The fact that Microsoft added a "second" version control system to Azure DevOps (Server) raised a set of questions, and one of them was a very practical one: how can we migrate from TFVC to Git? I'm not sure how it was in the beginning, but Microsoft quickly recognized the need for a migration path from TFVC to Git. In the beginning, there was a tool called git-tfs that allowed for 2-way (manual) syncing, but to be honest, I never really liked it (to me it felt like "being active in 2 parallel worlds" and having to support both).

Since the base concepts of Git and TFVC do not align, you just cannot "import" the one into the other and expect that you can keep your history "and" your branches... To give you an idea of the "basic" differences:

 | TFVC | Git
 | Centralized (you have to be connected) | Decentralized (you have your copy of the changes)
Changes are called: | Check-ins & changesets | Commits & pushes
Branches are: | Path-based; branches are used mostly as long-standing constructs to isolate risk of change among feature teams and releases | Lightweight and path independent

More info on this topic can be found here on the Microsoft docs page.

Of course you can decide to go for a "breaking" (and labor-intensive) transition where you:

  1. download the code that you need from a TFVC repo,
  2. create a new git repo,
  3. clone the new repo
  4. copy the code over to this new repo,
  5. add the files (git action),
  6. commit the changes
  7. and finally push those changes.

And from this point on, you will be able to work with the new repo.

There is of course a so-called "catch": your history no longer lives next to your code. One of the consequences is that if you need to find out "why something happened", you have to step out of your "zone" and switch to another version control system to find whatever you are looking for. The really negative point in this story is that you have to switch (mentally) to another way of working, over and over again. Over time you will need that history less and less (as you are constantly creating new history), and at some point in the (hopefully) not so distant future, you will have enough "new history" and will not need the "old history" anymore.

Even though this is a valid approach, the time it takes to evolve from the one way of working to the other is relatively long, and this might become quite "costly"... So be aware: doing the transition in this fashion might be cheap at first, but it comes at a (rather big) hidden cost that will follow you for a while. (Additionally, this approach is not really scalable.) This always makes me think about this quote by Benjamin Franklin:

The bitterness of poor quality remains long after the sweetness of low price is forgotten
Benjamin Franklin

How?

What is provided by Azure DevOps:

As you can see, it is not that straightforward to move from the one to the other and keep some characteristics (such as history). Microsoft saw this too, with the idea in mind that at some point in the (hopefully not so distant) future, TFVC would be out of support. A lot of development teams, all over the world, have been using TFVC for years, so this transition in itself is no small feat. Therefore, Microsoft has created a solution to facilitate it.

It is easy to find in the "Repos" tab of each Azure DevOps project: at the top level, there is a drop-down where you can choose between the repos that are available in your project, and at the bottom of that list, one of the entries is "Import Repository". When you click on it, you get a "flyout". Important to note is that you basically need to take 4 steps before you are ready to do an import:

  1. choose whether you want to import from Git or TFVC (in our context, it is the latter)
  2. Provide the path that you want to import
  3. Decide whether you want to import the history and how much history that you want (180 days is the maximum)
  4. The name of the Git repo where you are going to import into

It is also pointed out that this change can be "disruptive" and that you should read up on it here to completely understand what you are getting into. (To be clear, I have done this procedure a lot of times and by keeping in mind the basic concepts, I never had real issues)

These are a few of the catches (which are not necessarily bad):

  1. You cannot specify (and thus import) a path that has branches on lower levels in the tree structure. This makes sense to me: you want to move from one way of working to a new one, and it makes sense to adopt the new one as soon as possible (you can keep the TFVC structure around for specific cases where the imported history is not enough)
  2. History is limited to a maximum of 180 days

After starting the import, you get a fancy screen that indicates that your import is ongoing and that you will be notified when it's ready.

In most cases, the import is ready quite fast. The exceptions are cases where the volume of code that you want to import is quite big.

And that's it for the "manual" import process provided by Azure DevOps.

The same, but now at (a bigger) scale!

Now, I think we can agree that this way of working might be enough for small teams that do not have that many projects... But imagine what would happen in an environment where there are potentially hundreds of imports that need to be done. You might suspect that a lot of time would be lost. And that is why I decided to try and automate this process. One principle I always try to adhere to is

In my approach, I use the API of Azure DevOps to "replicate the behavior" of the Azure DevOps UI, but in such a way that I can automate the process and make it repeatable without having to lose too much time. The process is actually really simple and can be broken down into a few steps:

  1. Setting up the prerequisites
  2. Getting your variables (can be with input arguments, by reading from a (config) file or whatever works for you)
  3. Setting up the (secured) connection to Azure DevOps
  4. "Requesting" the GitClient
  5. Creating the Import request (from TFVC in our case)
  6. Actively waiting for the process to finish (this is optional, e.g. if you want to do some post-processing)

1. Setting up the prerequisites

In this context, I will be working in C# and I will use the full .NET Framework (the old thing, yes) as it allows for a fancy login experience. You can also do this in a .NET Core context, but then there is no interactive login provided by the NuGet package that we are going to use (Microsoft.TeamFoundationServer.ExtendedClient). In that case, you will have to authenticate with a PAT (Personal Access Token). For automation, this might even be better... (be sure to store this token in a safe spot, as anyone who gets their hands on it can do anything the token is authorized to do, in your name).

Also, install the NuGet package Microsoft.TeamFoundationServer.ExtendedClient (it will pull in a bunch of others too).

2. Getting your variables

To get started, you need to ensure that you have some relevant information:

  • the TFVC source path that you want to import
  • the team project where you want to import into
  • the name of the Git repo that you are going to create.

I will 'just' work with some variables, for the sake of this example:

var tfvcSourcePath = "$/<your TFVC sourcepath>/";
var destinationTeamProject = "TheProjectNameWhereYouAreImportingInto";
var newrepo = "TheAutomaticallyGeneratedRepo";
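
If you would rather pass these values in as input arguments than hard-code them, a minimal sketch could look like the one below. The `ImportSettings` type and the order of the arguments are assumptions made for illustration; they are not part of the Azure DevOps client library.

```csharp
using System;

// Hypothetical holder for the three values used throughout this post.
public sealed class ImportSettings
{
    public string TfvcSourcePath { get; }
    public string DestinationTeamProject { get; }
    public string NewRepo { get; }

    public ImportSettings(string tfvcSourcePath, string destinationTeamProject, string newRepo)
    {
        TfvcSourcePath = tfvcSourcePath;
        DestinationTeamProject = destinationTeamProject;
        NewRepo = newRepo;
    }

    // Expects exactly three arguments: <tfvcSourcePath> <teamProject> <repoName>
    public static ImportSettings FromArgs(string[] args)
    {
        if (args == null || args.Length != 3)
        {
            throw new ArgumentException("Usage: <tfvcSourcePath> <teamProject> <repoName>");
        }

        // TFVC server paths are rooted at "$/", so reject anything else early.
        if (!args[0].StartsWith("$/", StringComparison.Ordinal))
        {
            throw new ArgumentException("A TFVC source path must start with \"$/\".");
        }

        return new ImportSettings(args[0], args[1], args[2]);
    }
}
```

Failing fast on a malformed path is cheap here and saves you from creating an empty Git repo for an import that cannot succeed.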

3. Setting up the (secured) connection to Azure DevOps

This part is easy:

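using System;
using Microsoft.VisualStudio.Services.Client;
using Microsoft.VisualStudio.Services.Common;
using Microsoft.VisualStudio.Services.WebApi;

// Interactively ask the user for credentials, caching them so the user isn't constantly prompted
VssCredentials creds = new VssClientCredentials();
creds.Storage = new VssClientCredentialStorage();

// Connect to Azure DevOps (formerly VSTS)
VssConnection connection = new VssConnection(new Uri("<TheUrlOfYourAzureDevopsAccount>"), creds);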
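
For the PAT route mentioned in the prerequisites, the connection could be set up roughly as follows. This is a sketch, not the setup used in the rest of this post: the `PatConnection` helper and the `AZDO_PAT` environment variable name are assumptions, chosen only to keep the token out of the source code.

```csharp
using System;
using Microsoft.VisualStudio.Services.Common;
using Microsoft.VisualStudio.Services.WebApi;

public static class PatConnection
{
    // AZDO_PAT is a hypothetical variable name; use whatever secret store fits your setup.
    public static VssConnection Create(string organizationUrl)
    {
        string pat = Environment.GetEnvironmentVariable("AZDO_PAT");
        if (string.IsNullOrWhiteSpace(pat))
        {
            throw new InvalidOperationException("Set the AZDO_PAT environment variable first.");
        }

        // With a PAT, the user name part is left empty and the token acts as the password.
        var creds = new VssBasicCredential(string.Empty, pat);
        return new VssConnection(new Uri(organizationUrl), creds);
    }
}
```

The rest of the steps stay the same, since they only depend on the resulting `VssConnection`.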

4. "Requesting" the GitClient

This part is also easy. From the connection object, you can request the correct API client that will be used in the communication with Azure DevOps.

// GitHttpClient lives in the Microsoft.TeamFoundation.SourceControl.WebApi namespace
var gitclient = connection.GetClient<GitHttpClient>();

5. Creating (and executing) the Import request

In the UI, this might look like one step, but in the background, "the creation of the Git repo" and the "import from TFVC" are 2 separate things.

This means that you first need to "create the Git repo":

var creationResult = gitclient.CreateRepositoryAsync(
    new GitRepository()
    {
        Name = newrepo
    },
    destinationTeamProject).Result; // block until the repo exists; its Id is needed later on
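
Since the whole point is to make the process repeatable, it can help to make this step safe to re-run. The sketch below looks up a repo by name and only creates it when it is missing. `RepoHelper` and `GetOrCreateAsync` are hypothetical names; `GetRepositoriesAsync` comes from the same `GitHttpClient`. Whether an import into a repo that already contains commits is accepted is something I would verify before relying on this.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.TeamFoundation.SourceControl.WebApi;

public static class RepoHelper
{
    // Returns the repo with the given name in the project, creating it if it does not exist yet.
    public static async Task<GitRepository> GetOrCreateAsync(
        GitHttpClient client, string project, string repoName)
    {
        var repos = await client.GetRepositoriesAsync(project);
        var existing = repos.FirstOrDefault(
            r => string.Equals(r.Name, repoName, StringComparison.OrdinalIgnoreCase));

        if (existing != null)
        {
            return existing;
        }

        return await client.CreateRepositoryAsync(new GitRepository { Name = repoName }, project);
    }
}
```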

And after that has been done, you can instruct Azure DevOps to run the import process:

// First create an import request; this will be used in the actual call to Azure DevOps.
// Please note that some info has to be specified (as discussed earlier)
GitImportRequest request = new GitImportRequest()
{
    Parameters = new GitImportRequestParameters()
    {
        TfvcSource = new GitImportTfvcSource()
        {
            ImportHistory = true, // specify true if you want to import the history
            ImportHistoryDurationInDays = 180, // specify how much history you want (180 is the maximum)
            Path = tfvcSourcePath // specify the TFVC path that you want to import
        }
    }
};
// Then launch the "actual import"!
var executedrequest = gitclient.CreateImportRequestAsync(request, destinationTeamProject, newrepo).Result;
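
To give an idea of what "at scale" could look like, here is a sketch that queues the two calls above for a whole list of TFVC paths. It only reuses the calls already shown; the `BatchImport` class, the `QueueAllAsync` method and the path-to-repo-name dictionary are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.TeamFoundation.SourceControl.WebApi;

public static class BatchImport
{
    // Key = TFVC source path, value = name of the Git repo to create for it.
    public static async Task QueueAllAsync(
        GitHttpClient client, string project, IReadOnlyDictionary<string, string> repoNameByTfvcPath)
    {
        foreach (var pair in repoNameByTfvcPath)
        {
            try
            {
                await client.CreateRepositoryAsync(new GitRepository { Name = pair.Value }, project);

                var request = new GitImportRequest
                {
                    Parameters = new GitImportRequestParameters
                    {
                        TfvcSource = new GitImportTfvcSource
                        {
                            ImportHistory = true,
                            ImportHistoryDurationInDays = 180,
                            Path = pair.Key
                        }
                    }
                };

                var queued = await client.CreateImportRequestAsync(request, project, pair.Value);
                Console.WriteLine($"Queued import {queued.ImportRequestId}: {pair.Key} -> {pair.Value}");
            }
            catch (Exception ex)
            {
                // Log and continue, so that one bad path does not stop the rest of the batch.
                Console.WriteLine($"Failed to queue {pair.Key}: {ex.Message}");
            }
        }
    }
}
```

The imports are queued one after the other on purpose; firing them all in parallel is possible, but I would first check how your Azure DevOps organization reacts to that.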

6. Actively waiting for the process to finish

Now, if you want to use the same API to start manipulating this newly imported repo, then I must warn you: the last method call "only" launches an async process that can take some time (depending on the amount of work that needs to be done). At first, I was a bit surprised by this and it took me some time to figure it out, so I'll add the steps you need to take if you want to wait for the import to "actually complete" 😃

The idea is that CreateImportRequestAsync returns an object that can be used to retrieve the actual status of the import. Most likely, that status was "Queued" or "InProgress" at that time, and depending on the stage of the process, it can change. This means that you can use another method call (GetImportRequestAsync) to retrieve the most recent status. If you know the different stages and their order, you can "follow up" on the progress of your import request and take action when the import is "Completed".

A basic implementation of this logic can be found here:

// Needs: using System.Linq; (ElementAt) and using System.Threading; (Thread.Sleep)
var importStatus = gitclient
    .GetImportRequestAsync(destinationTeamProject, creationResult.Id, executedrequest.ImportRequestId).Result;
var currentindex = -1;
do
{
    if (currentindex != importStatus.DetailedStatus.CurrentStep - 1)
    {
        // The import moved on to a new step: print its name
        currentindex = importStatus.DetailedStatus.CurrentStep - 1;
        Console.WriteLine(importStatus.DetailedStatus.AllSteps.ElementAt(currentindex));
    }
    else
    {
        // Just some logic to wait, otherwise you might get throttled by Azure DevOps
        Thread.Sleep(1000);
        Console.Write(".");
    }
    // Request the status of the import
    importStatus = gitclient
        .GetImportRequestAsync(destinationTeamProject, creationResult.Id, executedrequest.ImportRequestId).Result;
} while (importStatus.Status == GitAsyncOperationStatus.Queued ||
         importStatus.Status == GitAsyncOperationStatus.InProgress);

if (importStatus.Status == GitAsyncOperationStatus.Failed ||
    importStatus.Status == GitAsyncOperationStatus.Abandoned)
{
    Console.WriteLine("oops, something went wrong");
}
else
{
    // Only report success when the import did not fail or get abandoned
    Console.WriteLine("Your import has completed!");
}
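
The loop above waits forever if an import gets stuck. If that worries you, the waiting part can be pulled into a small generic helper with a timeout. This is a sketch under my own assumptions: `Poller` and `PollUntilAsync` are hypothetical names and the helper knows nothing about Azure DevOps, it just keeps calling whatever you hand it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Poller
{
    // Calls fetch() until isDone(result) is true, waiting `interval` between calls.
    // Throws TimeoutException when the next wait would go past the timeout.
    public static async Task<T> PollUntilAsync<T>(
        Func<Task<T>> fetch,
        Func<T, bool> isDone,
        TimeSpan interval,
        TimeSpan timeout,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            T current = await fetch().ConfigureAwait(false);
            if (isDone(current))
            {
                return current;
            }

            if (DateTime.UtcNow + interval > deadline)
            {
                throw new TimeoutException("The operation did not finish within the given timeout.");
            }

            await Task.Delay(interval, cancellationToken).ConfigureAwait(false);
        }
    }
}
```

With the names used in this post, `fetch` would be `() => gitclient.GetImportRequestAsync(destinationTeamProject, creationResult.Id, executedrequest.ImportRequestId)` and `isDone` would check that the status is neither `Queued` nor `InProgress`. You still have to inspect the returned status for `Failed` or `Abandoned` afterwards.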

And with this, you can now start working with your brand new repo!

Conclusion

In this post, we discussed the process of evolving from TFVC to Git in Azure DevOps and the ways in which this transition can be performed. If you only have to do one or two imports, then I really suggest that you use the functionality in the UI (as provided by Microsoft). If you want to scale up, then you can use the "way of working" that I have laid out in the previous section. The beauty of it is that it contains almost no "custom" import functionality, as we use the APIs provided by Microsoft!

If you have questions, do not hesitate to contact me!

I really hope that it helps you!

Tim
