Extract method to class (part 1)

Article
09/11/2004

I've been working on a refactoring i like to call "extract method to class". It's a refactoring I like to do whenever i notice the following smells:

You have a method that is seriously far too large for its own good.
You need to change the logic in the method in such a way that the current procedural code just won't be good enough.
You can do extract method several times but you notice that each time you do an extract method you end up passing a large number of parameters around, and most of the extracted methods pass along the same set of parameters.

I'm currently using this refactoring to help with our implementation of

interface IVsLanguageDebugInfo : IUnknown

{

...

// Validate the given position as a place to set a breakpoint. If the

// location is valid, and pCodeSpan is non-NULL, the span is filled in

// with the extent of the statement at which execution would stop. If

// the position is known to not contain code, this will return S_FALSE.

// If this function fails, the breakpoint is set, pending validation

// during debug session startup.

HRESULT ValidateBreakpointLocation ([in] IVsTextBuffer *pBuffer,

[in] long iLine,

[in] long iCol,

[out] TextSpan *pCodeSpan);

...

};

This is the method that calls into us when the user tries to set a breakpoint on a line. It allows us to give find grained control over where the breakpoint is going to be (as opposed to just setting it on the line). An example of that would be:

for (Position pos = map.GetStartPosition();

pos != null;

pos = map.GetNextPosition(pos)) //<-- trying to set the breakpoint here

{

}

If you try to set a breakpoint on that last line then we will highlight the expression “pos = map.GetNextPosition(pos)” to indicate that that’s what we’ll stop on.

The code for our implementation exists inside “class Source”. That class represents a C# source file and wraps the underlying text buffer with C# specific information. (Note: names of classes and code have been simplified for the purpose of this example). It tries to take the line and column passed in to determine the most relevant expression to set a breakpoint on. Now, this basic idea happens a lot in the C# language service. When you ask for a completion list or parameter help we are doing the exact same thing. However, in the latter cases we have a far easier time with it. Why? Well normally there is a caret position that gives us both the line and column that we are on whereas normally when someone sets a breakpoint they are just clicking on the gutter of the source code which just gives us a line and a column position of 1. i.e:

do {

foo();

}

while (BlahBlah());

//^

//|

//User wants to set a breakpoint on BlahBlah(), but the query comes in a column 1

So the purpose of this method is to a lot of futzing (ßwow, that apparently isn’t a spelling mistake according to Word) with the token stream, the parse tree, and the semantic graph to determine what the most appropriate location is to put this on. It’s not that hard, but there are a lot of special cases which is why this method has grown so huge.

I was going to start by doing a lot of extract methods to break up the functionality, but I noticed that the code has the following form:

public TextSpan ValidateBreakpointLocation(long line, long column)

{

IParseTree parseTree = this.ParseTree;

IMemberParseTree memberParseTree = this.MemberParseTree;

ITokenStream tokenStream = this.TokenStream;

//stuff...

}

So if i extract method i’m going to see a lot of methods that look like this.

private void DoSomething1(long line, long column, IParseTree parseTree, IMemberParseTree memberParseTree, ITokenStream tokenStream, //SomeOtherStuff ...

private void DoSomething2(long line, long column, IParseTree parseTree, IMemberParseTree memberParseTree, ITokenStream tokenStream, //SomeOtherStuff ...

(Actually there would be even more parameters, but i wanted to keep this kind of consise). It's pretty icky. So the first step I did was to encapsulate all that state into a dummy container object which I could then pass around to every method. Slightly cleaner, but not great. i.e:

class ValidateBreakpointState

{

long line, long column;

public IParseTree parseTree;

public IMemberParseTree memberParseTree;

public ITokenStream tokenStream;

}

class Source : IVsLanguageDebugInfo

{

public TextSpan ValidateBreakpointLocation(long line, long column)

{

ValidateBreakpointState state = new ValidateBreakpointState();

state.line = line;

state.column= column;

state.parseTree = this.ParseTree;

state.memberParseTree = this.MemberParseTree;

state.tokenStream = this.TokenStream;

//stuff...

}

private void DoSomething1(ValidateBreakpointState, //SomeOtherStuff ...

private void DoSomething2(ValidateBreakpointState, //SomeOtherStuff ...

Now the next smell I noticed was that after extracting the methods almost all the data that was used came from the ValidateBreakpointState class instead of the Source class. Looked like a good time for “Move Method”. I moved ValidateBreakpoint from Source to ValidateBreakpointState passing along the “this” Source as a parameter because it was going to be used a little.

Now this raised an interesting problem. The data inside Source that was needed in ValidateBreakpointLocation was private to Source. How then should ValidateBreakpointState get access to it? In C++ I could use friend classes to do this (ick ick gag gag), but instead I’ll just make ValidateBreakpointState a nested class inside of Source. Now accessing privates isn’t a problem, although now this class is highly tied to the source class (which isn’t really a bad thing if it’s accessing privates). In the future I could do a few ‘Encapsulate Field’s and then wean the two classes apart from each other.

Now that I’m in this position I can start doing some ‘Extract Method’s to get the code into the state I want.

Ok. I’ve just spent a lot of time just getting the code into a form where I can start fixing the original few bugs that i started out with. Why did i have to spend like a good half hour on this? Why couldn’t we do most of the work for you here? Next post will cover how I think we can do that.

Extract method to class (part 1)

Additional resources