This issue shows up in several scenarios, and the resolutions are similar with a few differences. I will add more scenarios as I work through customer support cases for TFS.
First Scenario - Access Level Missing
Customer created several domain user accounts to use for the Build Agent when configured to run as a service. Half of the agents were green and working, and the other half were red and not enabled.
Unconfigured one of the non-working agents and removed it from the agent listing in TFS.
Installed and configured the agent using one of the non-working domain accounts. It registered, but still showed red in the listing of agents in TFS.
Checked the agent's log files (_diag folder), and the latest log file showed the following error:
BaseLogger.LogConsoleMessage(scope.JobId = 00000000-0000-0000-0000-000000000000, message = Authenticating to the server http://localhost:8080/tfs)
JobManager.LogConsoleMessage (scope.JobId = 00000000-0000-0000-0000-000000000000, message = Authenticating to the server http://localhost:8080/tfs)
JobManager.LogConsoleMessage - job not found in dictionary (scope.JobId = 00000000-0000-0000-0000-000000000000)
Authenticating to the server http://localhost:8080/tfs
Failed to create session. Sleeping for 10 seconds before next retry. Attempts=1/10.
Microsoft.TeamFoundation.DistributedTask.WebApi.TaskAgentPoolNotFoundException: No agent pool found with identifier 1.
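When several agents are red, checking each _diag folder by hand gets tedious. The following is a minimal sketch, not a supported tool, that looks for the exception shown above in the most recently modified file of a given folder. The function names are hypothetical, and it assumes the log is a plain text file readable as UTF-8.

```python
import re
from pathlib import Path

# Pattern copied from the log excerpt above; the pool identifier varies.
POOL_NOT_FOUND = re.compile(
    r"TaskAgentPoolNotFoundException: No agent pool found with identifier (\d+)"
)


def find_pool_errors(log_text):
    """Return the pool identifiers named in TaskAgentPoolNotFoundException lines."""
    return [int(m.group(1)) for m in POOL_NOT_FOUND.finditer(log_text)]


def scan_latest_log(diag_dir):
    """Scan the most recently modified file in diag_dir.

    Returns (path, pool_ids); path is None if the folder holds no files.
    """
    logs = [p for p in Path(diag_dir).iterdir() if p.is_file()]
    if not logs:
        return None, []
    latest = max(logs, key=lambda p: p.stat().st_mtime)
    # errors="replace" keeps the scan going if the file is not clean UTF-8 (assumption).
    text = latest.read_text(encoding="utf-8", errors="replace")
    return latest, find_pool_errors(text)
```

Calling scan_latest_log with the path to an agent's _diag folder returns the file it inspected and the list of pool identifiers reported as not found. An empty list means this particular error was not in the latest log.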
Verified the domain account was a member of the Agent Pool Administrators role and added the domain account to the local Administrators group (not required, but tried anyway).
The domain accounts used for the working build agents were members of the Basic access level, and the non-working domain accounts were not. Added the non-working domain accounts to the Basic access level and the agents authenticated and started working. All Green!
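With many service accounts, the comparison above can be done mechanically. This is a small sketch under stated assumptions: both lists are gathered by hand (the agent service accounts and the members shown for the Basic access level), the function name is hypothetical, and the account names in the usage note are made up for illustration. Names are compared case-insensitively because Windows account names are not case-sensitive.

```python
def accounts_missing_basic(service_accounts, basic_members):
    """Return the service accounts that do not appear among the Basic members.

    Both arguments are iterables of account names such as DOMAIN\\user.
    The comparison ignores case; the original spelling is preserved in the result.
    """
    basic = {name.casefold() for name in basic_members}
    return [acct for acct in service_accounts if acct.casefold() not in basic]
```

For example, passing ["CONTOSO\\svc-build1", "CONTOSO\\svc-build2"] as the service accounts and ["contoso\\SVC-BUILD1"] as the Basic members returns only the second account, which is the one to add to the Basic access level.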
Second Scenario - Default Access Level not Set Properly
Customer is using domain accounts for the Build Agent service accounts and the agents register, but stay red and will not initialize.
The default access level for the TFS instance is set to Stakeholder, which does not provide build capabilities for domain users.
Set the default access level to Basic and restart the agent services.
Third Scenario - Only Certain Agents are Offline - Red
In the agent listing, the BuildDude Agent shows red (offline) while the Foo Agent shows green.
Either the default access level is set to Stakeholder, or the default level is Basic and the domain account is not a member of that access level (see the first scenario). The BuildDude Agent runs as a domain account and the Foo Agent is configured for NetworkService. The NetworkService account is allowed access because it is a trusted account rather than a domain account.
Add the domain account used for the Agent's service account to the Basic Access Level and restart the service.