TSQL coding challenge

Today’s entry is another T-SQL puzzle. Steve Kass took the prize for the best solution to my last T-SQL puzzle, and several others came up with some pretty original solutions of their own. I especially liked the ones that were T-SQL specific – ones that weren’t translated from other languages.

Today’s challenge is to detect whether a cross-row relationship that exists in a particular table has been corrupted. The scenario is this:

You have a table of employee-to-mentor relationships that looks like this:

CREATE TABLE Employees

(EmpId int PRIMARY KEY,

Mentor int REFERENCES Employees(EmpId)

)

Every employee has exactly one mentor and every mentor has exactly one associated employee. The company's first employee, the CEO, has no mentor, so her mentor column is NULL. Here's what a simplified version of the table might look like:

INSERT Employees VALUES (0,NULL)

INSERT Employees VALUES (1,0)

INSERT Employees VALUES (2,1)

INSERT Employees VALUES (3,2)

INSERT Employees VALUES (4,3)

INSERT Employees VALUES (5,4)

INSERT Employees VALUES (6,5)

INSERT Employees VALUES (7,6)

INSERT Employees VALUES (8,7)

INSERT Employees VALUES (9,8)

The most recently added employee doesn’t mentor anyone for obvious reasons. His employee number is tracked in a separate table so that it is readily available as a mentor when another employee is added. When a new employee joins, he is assigned the employee that was previously the last employee as his mentor, regardless of his or her title or rank in the company. There is otherwise no relationship between the rows in the table, and, in particular, employee numbers are not necessarily sequential or related in any way.

The company that uses this table has a stored proc that runs on a nightly basis, determines who is mentoring whom, and, using singleton SELECTs, returns the complete list of mentor-employee pairs beginning with the mostly recently added employee (whose ID is cached in a separate table) and working up to the CEO. The Employee table is so huge that it doesn't attempt to store these relationships anywhere; it just returns each one using simple PRINT commands as it determines them. Due to a bug in the app that updates the table when an employee leaves and a lack of RI on the table itself, the relationships between the employees and mentors occasionally become corrupted. Specifically, an employee higher up in the chain erroneously has his mentor set to someone lower in the chain. In other words, his Mentor column is being set to someone being mentored (perhaps indirectly) by him.

The net effect of this is that the stored procedure that returns the mentor-employee pairs gets stuck in an infinite loop. It never reaches the CEO because the relationship chain leading to her is broken. Here's the code for the proc and a function it uses:

CREATE FUNCTION dbo.Mentor(@EmpId int)

RETURNS int

AS BEGIN

RETURN (SELECT Mentor FROM Employees WHERE EmpId=@EmpId)

END

GO

CREATE PROC ListMentors(@LastEmpIdAdded int) AS

DECLARE @i int

SELECT @i = @LastEmpIdAdded -- ID of last employee added (known in advance)

PRINT 'Employee: '+CAST(@i AS varchar(10))

SELECT @i = dbo.Mentor(@i)

PRINT 'Mentor: '+CAST(@i AS varchar(10))

WHILE (@i IS NOT NULL)

BEGIN

PRINT 'Employee: '+CAST(@i AS varchar(10))

SELECT @i = dbo.Mentor(@i)

PRINT 'Mentor: '+CAST(@i AS varchar(10))

END

And here's some code to simulate the buggy app by intentionally breaking one of the mentor-employee relationships in the sample table:

UPDATE Employees SET Mentor = 7 WHERE EmpId=5

If you run the proc against the sample table after running the above UPDATE, it will get stuck in an infinite loop:

EXEC ListMentors 9 -- 9 is the ID of the last employee added (known in advance)

Your task is to change the proc's singleton SELECT code so that it can detect this type of corruption in the mentor-employee relationships as it traverses the table and return an error message rather than get stuck in an infinite loop. Some rules of the game:

1. Because the company is huge and space on the system is limited, you can’t store values from the Employee table elsewhere (either at the client or on the server, on disk or in memory).

2. For the same reason and because of permissions issues, you can’t modify the table in any way or create any other tables or objects.

3. Your code must run on both SQL Server 2000 and SQL Server 2005.

4. You can't replace the proc's singleton SELECT code with set-oriented code. The company has its reasons for using singleton SELECT logic here, and it's your job to modify the singleton SELECT loop to avoid running infinitely, not rewrite it. This includes changing the logic in the Mentor function. In fact, you can't change the Mentor function at all. Your alterations must be limited to the proc.

5. To keep this interesting, you can't use techniques that rely on counting the number of employees or mentors in advance. For purposes of this exercise, you have no way of knowing how many employees, mentors, or rows are in the table or that meet a particular set of criteria. There may be 10,000 or there may be 10,000,000 -- you have no way of knowing and can't use COUNT() or anything similar to tell you.

6. Because I’ve discussed this with various MS people, if you work for MS (or ever have), you can’t play J

Happy Coding!