Understanding the Visual Studio Load Agent Controller

Visual Studio 2005 Team Edition for Software Testers

Visual Studio 2005 Team Test Load Agent


Summary: A deeper look at the settings and at how the controller in Microsoft Visual Studio 2005 Team Test Load Agent works.

Introduction

The Microsoft Visual Studio 2005 Team Test Load Agent controller (controller) is a Windows service that can manage one or more agent machines.  The configuration settings of a test run and the attributes of the test cases in the run determine the actions and scheduling the controller performs.  This document describes the algorithms the controller works through to start and monitor test runs.

What Happens At Startup

 

Configuration File

 

When the controller service starts, it first opens the configuration file, QTController.exe.config.  The settings in the configuration file determine which port the controller listens on, the working directory for temporary files, whether to create a log file, and a few other items described here: 

 

  • WorkingDirectory: Base directory for staging deployment and reverse deployment files. Setting this value overrides a call to ‘Environment.SpecialFolder.LocalApplicationData’.  The directory you specify here must already exist.  Subdirectories are created for each test run submitted to the controller and have the form: VSEQT\QTController\<GUID>.  The GUID is a generated value and uniquely identifies each test run.
  • CreateTraceListener: Indicates that the service should create a trace log file (VSTTController.log) in the installation directory.  The default is "no"; override with "yes".  When troubleshooting issues with test runs, this log file is very helpful, although it can grow large quickly depending on the trace level.
  • ControllerServicePort: TCP port the controller listens on.  If you change this value you must also change the controller port the agent service uses to connect to the controller; the two values must match.
  • BindTo: The name of the NIC to bind to for communication.  This allows you to dedicate a single NIC to test tools infrastructure communication and leave any other cards for load tests and other network activities.
  • AgentSyncTimeoutInSeconds: The number of seconds before an agent is considered unresponsive while synchronizing the start of a test run.  The test run is aborted when this timeout is reached.  This applies only to load tests.
  • LogSizeLimitInMegs: Maximum allowable size of the VSTTController.log file.  Setting this value to ‘0’ means the size is limited only by disk space.  The default is 20 MB.  When the log file reaches this size it is backed up and a new file is created; if a backup already exists it is overwritten.  The controller checks the log file size every 20 seconds, so the file will usually not be backed up at exactly the size limit.
  • ControllerJobSpooling: Set to false to turn off automatic test run spooling.  When spooling is enabled, test run events are captured and stored on the controller.  This allows you to disconnect from a test run and reconnect later through the ‘Test Runs’ window in Visual Studio, even if the run has completed; all the events are played back to the connecting instance of Visual Studio.  (An example of adding this key appears after the configuration samples below.)
  • ControllerUsersGroup: When the controller is installed, setup creates a group called ‘TeamTestControllerUsers’.  By default this setting references that group, which the controller uses for role checking.  Any user added to this group can submit, stop, and view their own tests; they cannot view the details of, or stop, another user’s test runs.
  • ControllerAdminsGroup: A setting used for administrative role checking.  By default it references a group created during setup, ‘TeamTestControllerAdmins’.  Users in this group can submit, stop, and view any test and perform administrative actions.  Note that members of ‘Administrators’ on the controller machine can perform these same actions.

 

If any of these settings are changed, the controller service needs to be stopped and restarted.  Removing some settings simply causes the controller to fall back to a default value.  Removing functional settings such as ‘ControllerServicePort’ will keep the controller service from starting.  Any event that keeps the controller from starting is logged in the event viewer, although some events may not pinpoint the exact cause; for instance, if the file contains invalid XML an error is logged, but the exact line or value is not always identified.  Removing the ‘ControllerUsersGroup’ and ‘ControllerAdminsGroup’ settings effectively turns off role validation.  It is recommended that you keep role validation enabled as a security measure.
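Granting a user access is then just a matter of group membership.  As a hedged sketch, run on the controller machine with the built-in net command (the account names CONTOSO\tester and CONTOSO\testlead are hypothetical placeholders):

rem Add an example user to the controller users group, and an example lead to the admins group
net localgroup TeamTestControllerUsers CONTOSO\tester /add
net localgroup TeamTestControllerAdmins CONTOSO\testlead /add

The first command lets the account submit, stop, and view its own test runs; the second grants the administrative rights described above.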

 

Here are the default values set in the qtcontroller.exe.config file:

  <appSettings>

     <add key="LogSizeLimitInMegs" value="20"/>

     <add key="AgentSyncTimeoutInSeconds" value="120"/>

     <add key="ControllerServicePort" value="6901"/>

     <add key="ControllerUsersGroup" value="TeamTestControllerUsers"/>

     <add key="ControllerAdminsGroup" value="TeamTestControllerAdmins"/>

     <add key="CreateTraceListener" value="no"/>

  </appSettings>

 

Say you want to change the working directory to “c:\ControllerWorkingDir”.  First create the directory on the controller machine, and then add the ‘WorkingDirectory’ key to the ‘<appSettings>’ section of the config file:

 

  <appSettings>

     <add key="WorkingDirectory" value="c:\ControllerWorkingDir"/>

     <add key="LogSizeLimitInMegs" value="20"/>

     <add key="AgentSyncTimeoutInSeconds" value="120"/>

     <add key="ControllerServicePort" value="6901"/>

     <add key="ControllerUsersGroup" value="TeamTestControllerUsers"/>

     <add key="ControllerAdminsGroup" value="TeamTestControllerAdmins"/>

     <add key="CreateTraceListener" value="no"/>

  </appSettings>
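Settings that do not appear in the default file are added in the same way.  As a hedged sketch only, the fragment below turns off test run spooling and removes the log size cap, as described in the settings list above; the values are illustrative and the remaining default keys are left unchanged:

  <appSettings>
     <add key="ControllerJobSpooling" value="false"/>
     <add key="LogSizeLimitInMegs" value="0"/>
     <!-- ...remaining default keys unchanged... -->
  </appSettings>

As with any change to this file, the controller service must be restarted for the new values to take effect.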

 

Agent Settings

 

After reading the settings above, the controller reads the list of agents it will manage and the associated agent attributes (persisted in qtcontrollerconfig.xml).  There is a user interface for modifying this file, so you should not need to edit it by hand.  Any agent not in this list is ignored if it tries to connect to the controller, and an agent is denied the connection if the account the agent service runs as is not in the ‘TeamTestAgentService’ group.

 

Submitting a Test Run – What Happens?

When a test run is submitted to the controller it is placed in a queue.  If any files are deployed with the test run, a deployment job is created, and deployment must complete before the test run is placed into a ‘ready to execute’ state.  If a problem occurs while deploying from the host application, such as Visual Studio, the test run is aborted.

Deployment

If files are deployed as part of the test run, the controller does not use a regular file copy but copies the data over the connection from Visual Studio.  The path of each file in the file list is relative to the client machine (the machine from which the test run was submitted).  The controller calls back into the host application and asks it for a block of the file currently being copied.  The files are copied into a directory under the WorkingDirectory.  Using the example above:

 

c:\ControllerWorkingDir\VSEQT\QTController\620bd97a-94aa-401e-b58e-0ec1c09558ca\deployment

 

Once all the blocks of a file have been copied, the same steps are taken for the next file.  Once all the files are copied, the test run is placed in a ‘ready to execute’ state.  When the test run begins to execute, the same deployment process occurs between the controller and each of the agents the test is submitted to, before actual test execution can begin.

 

When an agent completes the test run it may return files to the controller; these are stored here:

c:\ControllerWorkingDir\VSEQT\QTController\620bd97a-94aa-401e-b58e-0ec1c09558ca\results\<agent_machine_name>

Test Run Scheduling – Part 1

After completion of the deployment stage the test run remains idle until it is popped off the queue for execution.  How the test run is scheduled depends on a number of variables in the test run configuration and on the list of test cases in the run.

Initially the controller splits the test cases into a set of tasks.  This allows multiple test cases to execute simultaneously.  In other words, a test run can be thought of as a job, and the job manages a set of one or more tasks.  Before going further, some terminology needs to be defined.

Terminology

  • Bucketing is the term used for splitting up a large number of non-load tests.  Simply put, a bucket is a subset of the tests specified in a test run.

 

    • BucketThreshold: The number of non-load tests a test run must contain before buckets are created.  The default value is 1000.  This default is optimized for unit tests, which are small and fast running.  Consider a much smaller bucket size for slower tests, as it will increase the parallelism achieved in the run (an example appears later in this section).
    • BucketSize: The size of each bucket once the threshold has been reached.  The default value is 200.

 

An agent can only execute one test at a time, and should a catastrophic error occur during execution, the run may be aborted, leaving many tests unexecuted.  Bucketing helps in two ways: a) it allows test cases to execute simultaneously on different agent machines; b) if a catastrophic error occurs during execution, only the remaining tests in the current bucket go unexecuted.

One bucket at a time is sent to each agent that has been selected to execute the test run.  If more than one agent is available to execute a test run, then more than one bucket can be executing at a time.  As an agent completes the execution of a bucket, it is sent another if more exist.

 

If 1000 or more non-load tests exist in a submitted test run, buckets of size 200 will be created.  You can change these values in a test run configuration file, but note that there is no user interface for making the changes.  In the test run configuration file (*.testrunconfig), the following two tags exist:

 

<bucketSize type="System.Int32">200</bucketSize>

<bucketThreshold type="System.Int32">1000</bucketThreshold>

 

Change these values at your own risk, and do not save the file without making a backup.  If the file is saved with invalid XML formatting it may become unreadable and you will not be able to run tests with this test run configuration.
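If you do edit the file, the change amounts to nothing more than new values in those two tags.  A hedged sketch for a run made up of slower tests, using illustrative values only (a lower threshold and smaller buckets to increase parallelism, as discussed above):

<bucketSize type="System.Int32">50</bucketSize>

<bucketThreshold type="System.Int32">100</bucketThreshold>

With these values, any run containing 100 or more non-load tests is split into buckets of 50 tests each.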

 

  • A Controller Job (job) is the executing component of a test run. A job manages a set of tasks that need to happen for a test run to execute and complete.

 

  • A Controller Task (task) is managed by a Controller Job.  It manages a specific set of tests and agent selection to execute those tests.  The set of tests can be a bucket as defined above, a load test, or a simple test list. Tasks can execute independently of one another as long as the agent resources exist.

Test Run Scheduling – Part 2

When a job is scheduled to execute it first creates a set of tasks to execute the tests within a test run. Tasks are created using the following algorithm:

 

  • If the test is a load test, a task is created to manage it; a task manages a single load test.  If there is more than one load test in a test run, a task is created for each.
  • Once tasks have been created for the load tests, the number of remaining tests is checked.  If the remaining total is at least the bucketThreshold, the tests are broken up into buckets of size bucketSize.  Each bucket gets its own task.
  • If agent constraints are specified, so that only agents meeting specific criteria are used, this overrides bucketing.  For instance, say 4 available agents meet the selection criteria and there are 100 tests in the test run: each agent will get 25 tests, and they will execute simultaneously.

 

For example, say you have a test run with the following tests in it:

 

  • LoadTest1
  • UnitTest1
  • LoadTest2
  • OrderedTest1
  • UnitTest2-UnitTest999

 

Two tasks are created, one for each load test:

Task1: LoadTest1

Task2: LoadTest2

 

All the non-load tests total 1000 (UnitTest1-UnitTest999 + OrderedTest1).  Five tasks would be created:

 

Task3: OrderedTest1 + 199 unit tests

Task4: 200 unit tests

Task5-Task7: 200 unit tests each

 

If there are enough agent machines, it is possible Task3 through Task7 will execute simultaneously.  Load tests use all available agents, so Task1 and Task2 remain idle until all agents are available to execute.

 

Multiple test runs can execute at the same time.  The initial test run popped from the queue will always have first rights to any agents meeting the test run configuration criteria.  Subsequent test runs can use any agents still available that meet test run configuration criteria associated with the test run.  Note that an agent must be marked online to be available for execution.

 

See section ‘Miscellaneous’, ‘Examples of Agent Selection and Scheduling’ for some example test run submissions.

Executing Test Types

Load tests require a synchronized start.  As each agent selected to execute a load test completes any deployment, it signals the controller that it is ready to execute the test.  Once all the agents have entered this state, the controller sends a start event to all agents, and the agents then begin execution.  Load tests use a thread pool when executing Web tests and multiple threads to execute non-Web tests.  This allows multiple tests to execute simultaneously.  This is all managed by the test adapter plug-in.

 

Non-load tests do not require a synchronized start.  If more than one agent is used to execute non-load tests, each agent begins executing immediately after its deployment completes, regardless of the state of the other agents.  These tests are executed synchronously, one after another.

Monitoring Test Runs

The controller task is charged with monitoring the test list the job assigned to it, making sure the tests and agents remain responsive.  It does this by monitoring events returned by the agents.  If an agent does not return a test result within the timeout period specified in the test run configuration, the task attempts to abort the current test case and execute the next one on the agent.  This applies to all test types.  Note that load tests typically send aggregated result messages back every few seconds.

 

The controller job is charged with monitoring the overall execution time of the entire test run.  If the test run does not complete within the specified run timeout, the job stops all the tasks and eventually the entire run.

 

Timeouts can be accessed from the Visual Studio ‘Test’ menu.  Click ‘Test’, select ‘Edit Run Configurations’, and select the run configuration you want to edit.  A dialog will appear; in the left pane select ‘Test Timeouts’.

 

Miscellaneous

  • Connecting to the Controller

 

Firewalls and some proxy devices can block communication between the controller and agents, or between the controller and Visual Studio.  Be sure to open the listening ports used by the controller and agent to allow communication; see the InstallGuide.htm on the installation CD for more information.  As noted above, the listening port setting for the controller is in the qtcontroller.exe.config file.  The setting for the agent is in its respective configuration file, qtagentservice.exe.config or qtagentserviceui.exe.config. 

 

By default Visual Studio uses any available port to communicate with the controller.  Since the port can change from run to run, you would need to open all ports on your firewall device for proper communication between the controller and Visual Studio. You can set a specific port or port range for Visual Studio to use when communicating with the controller by doing the following:

 

Add the following registry key:

 

HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\VisualStudio\8.0\EnterpriseTools\QualityTools\ListenPortRange

 

After creating the key, add two new DWORD values:

 

PortRangeStart

PortRangeEnd

 

To use a single port, specify a value only in ‘PortRangeStart’.  If you specify values for both ‘PortRangeStart’ and ‘PortRangeEnd’, Visual Studio attempts to listen on the ports from ‘PortRangeStart’ through ‘PortRangeEnd’; once it successfully opens a port it stops searching the range.
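As a hedged example, the key and values can also be created from a command prompt (run with administrative rights) using reg.exe.  The port numbers below are only an illustration; 6905 matches the netstat sample further down, but any ports you are willing to open will work:

rem The ports 6905-6910 are example values only
reg add "HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\VisualStudio\8.0\EnterpriseTools\QualityTools\ListenPortRange" /v PortRangeStart /t REG_DWORD /d 6905
reg add "HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\VisualStudio\8.0\EnterpriseTools\QualityTools\ListenPortRange" /v PortRangeEnd /t REG_DWORD /d 6910

reg add creates the key if it does not already exist; you can of course create the key and DWORD values by hand with regedit.exe instead.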

 

If there is a proxy or firewall device between Visual Studio and the controller machine you will need to open the port or port range you add.

 

You can verify Visual Studio is listening on the specified port or a port in the specified range by:

 

netstat -a

 

You should see one of the ports you specified in a listening state.  For example:

 

TCP    <machine>:6905          mymachine.microsoft.com:0  LISTENING

 

  • Examples of Agent Selection and Scheduling

For more information on ‘Agent Constraints’, see Distributed Functional Testing with Visual Studio 2005 Team System.

Here are some examples of controller scheduling and agent selection given various test run configurations:

 

In each of the following examples, the controller is managing 3 agents.

 

    • Example 1

 

bucketSize  = 5

bucketThreshold  = 6

Using a mix of 12 Unit and Web tests.   

No agent constraints specified.

 

Submit the test run.

 

The test list is split into 3 buckets of sizes 5, 5, and 2 and the buckets are executed simultaneously.

 

    • Example 2

 

bucketSize  = 5

bucketThreshold  = 6

Using a mix of 12 Unit and Web tests.   

The same name/value attributes are assigned to each agent.  All 3 agents meet the selection criteria. 

 

Submit the test run.

 

The test list is split into 3 buckets of sizes 4, 4, and 4 and the buckets are executed simultaneously.

 

    • Example 3

 

Using a mix of 12 Unit and Web tests.  

The same name/value attributes are assigned to two agents.  Two agents meet the selection criteria. 

 

Submit the test run.

Only the two agents are used for the test run.

 

    • Example 4

 

Using a mix of 12 Unit and Web tests. 

The same name/value attributes are assigned to two agents.  Two agents meet the selection criteria. 

 

Submit the test run. 

 

While the test is executing, submit another test run with 10 more Unit and Web tests.  This run will execute on the unused third agent.

 

    • Example 5

 

Using a load test.

The same name/value attributes are assigned to two agents.  Two agents meet the selection criteria. 

 

Submit the test run.

The load test is executed on the two agents meeting the criteria.

 

    • Example 6

 

Using a load test.

The same name/value attributes are assigned to two agents.  Two agents meet the selection criteria. 

 

Submit the test run. 

 

The load test is executed on the two agents meeting the criteria.  While the test is executing, submit another test run with 10 more Unit and Web tests.  This run will execute on the unused third agent.

 

    • Example 7

 

Using a load test.

The same name/value attributes are assigned to two agents.  Two agents meet the selection criteria. 

 

Submit the test run. 

 

The load test is executed on the two agents meeting the criteria.  While the test is executing, submit another load test with no agent constraints specified.  This run will remain idle until all agents become available for execution.

 

    • Example 8

 

Using a load test.

The same name/value attributes are assigned to two agents.  Two agents meet the selection criteria. 

Assign a different name/value attribute to the third agent.

 

Submit the test run. 

 

The load test is executed on the two agents meeting the criteria.  While the test is executing, submit another load test with agent constraints specified that match the third agent.  This run will execute without waiting for the other two agents to become available.

Conclusion

The Microsoft Visual Studio 2005 Team Test Load Agent controller offers a lot of flexibility for executing test runs.  A combination of the test run configuration and the settings in the qtcontroller.exe.config file gives you a number of knobs to adjust.  The controller uses these settings as hints for agent selection, the scheduling of test runs, and the scheduling of test cases so that agents are used efficiently.  It also monitors test runs and takes appropriate action if a test or test run gets into a bad state.