This blog post is about a difficult Visual Studio Tools for Microsoft Dynamics GP (VSTools) performance issue that I worked on in a Citrix server environment. I'll list the symptoms reported in the case, then the troubleshooting we did to narrow down the problem, and the results. The case started with the following:
The client was experiencing an issue with windows developed using Visual Studio Tools for Dynamics GP. The first time certain users open these windows after logging into GP, the windows take a very long time to display, sometimes as long as 115 seconds. After that initial open, the same windows open almost instantly for the rest of the Dynamics GP session. Dynamics GP windows themselves work fine; it is only the VSTools windows where the slowness occurs.
This issue is occurring for only certain users, meaning that some users working in the same environment (Citrix – GP Client & SQL Server) are experiencing this problem while others are not. Also, the slowness seems to be related to the Windows credentials rather than the GP user. If Windows user ‘AD1’ is experiencing the slowness described, then it does not matter what GP login he/she uses; it is slow. If Windows user ‘AD2’ is not having the problem, he/she can log on with the same GP login as Windows user ‘AD1’ and will not see the slowness.
So the first thing we did was the following.
- In the situation where it does take a long time for a .NET add-in form to open, let's get a SQL profile of that form opening. That should tell us what type of SQL activity, if any, is happening when that form is opened, and we might be able to find a bottleneck.
- This appears to be a Windows user issue, but VSTools isn't aware of Windows users unless the add-in code does something with them. Since this is a Citrix environment, let's try making a new user profile and see if the new profile works.
- Is this problem only for one user or is it more? Is it always the same users that have the trouble? This we need to narrow down for sure. We have to be able to determine if this is definitely Windows user specific or not.
- If we can identify a certain form where this happens, then we might have to get the code for that form in so we can take a look at what it's doing. At least the code on one form.
The answers came back with the following:
- We got the SQL profile in and we actually got one for a good user and one for a bad user. I went through and there definitely was a lot of SQL activity, but as far as the times went, on the bad user, there was really no significant slowdown. The times for both profiles matched up pretty well, so I ruled out SQL being an issue.
- A new user profile had the same slowness issues.
- The problem was definitely specific to certain users. There was a group of Windows users that were always slow and a group of Windows users that were always fast. GP users didn't factor into the equation.
- We got the code in for one of the forms. When the form opened, the code was accessing SQL through ADO.NET and returning a record set of about 30,000 records. That is significant, but the ADO.NET code was not using integrated security; it was using GPConnNet. So the code wasn't really doing anything with the Windows credentials.
At this point, I don't think it's SQL but I'm not sure which way to go. Screen sharing is not an option as this customer is in Singapore and they had processes running in their off hours that they were not willing to shut down. So the next steps were:
- The performance issue only happens for certain Windows users on the Citrix server. So what we would like the customer to do is install a GP client on a local machine and point it to the same SQL database that the users are logging into, then test on the local machine with the problem users. If there are no performance issues, then we know it is some sort of Citrix server/user profile issue.
- The developer is making a new version of the app that will log the time it takes to open the windows. We will have that loaded on the terminal server and the local machine. That way we have a definitive log so we can see how long these windows take.
- Double check that the users that work are actually pulling the same number of records as the users who don't work, just to make sure.
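The timing log in the second step can be very simple. The customer's add-in was a .NET application and its logging code is not shown here, so the following is only a minimal Python sketch of the idea: wrap the window-open call, record the elapsed time, and write one line per open. The names `timed` and `open_window` are hypothetical stand-ins, not part of VSTools.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log):
    """Append (label, elapsed_seconds) to log when the wrapped block ends."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log.append((label, time.perf_counter() - start))


def open_window():
    # Hypothetical stand-in for the code that opens the add-in form.
    time.sleep(0.01)


timings = []
with timed("open VSTools window", timings):
    open_window()

# One line per window open; in a real test these lines would go to a file
# so runs on the terminal server and the local machine can be compared.
for label, seconds in timings:
    print(f"{label}: {seconds:.2f}s")
```

Running the same logged build as a good user and a bad user, on both machines, gives four numbers that can be compared directly.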
The answers came back with the following:
- On the local install, all users were fast, even the bad users. So I think now we can rule out SQL Server performance for sure.
- We had the client log the opening of the windows on the local client and on the terminal server for the good and bad users. On the local machine, the VSTools windows opened in about 2 seconds for all users. On the terminal server, the good users opened in about 3 seconds and the bad users took about 93 seconds. So definitely a significant time issue.
- All the users were returning the same record set of about 30,000 records.
Now this is definitely a Citrix server/user profile issue. Our new plan of action is as follows:
- Have the users log into the Citrix server and do a Start >> Run >> %temp%. This will show where the temp directory is pointing for these user profiles. Do this for both the good and bad users. If the temp location is a network location, or it differs between the good and bad users, that could be causing the issue.
- The second thing would be to get Process Monitor from www.sysinternals.com, start it, then log into GP and open the VSTools form. Do this for good and bad users and see if we can find where the bottleneck is.
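The %temp% check in the first step is easy to do by hand, but with many users it could also be scripted. This is a rough Python sketch, assuming it is run inside each user's session; the helper name `is_network_path` is made up for illustration, and it only recognizes UNC paths, so a temp folder on a mapped network drive letter would not be detected.

```python
import tempfile


def is_network_path(path):
    # Treat UNC paths (two leading backslashes) as network locations.
    # Mapped drive letters are NOT detected by this simple check.
    normalized = path.replace("/", "\\")
    return normalized.startswith("\\\\")


# tempfile.gettempdir() resolves the temp directory for the current user.
temp_dir = tempfile.gettempdir()
location = "network" if is_network_path(temp_dir) else "local"
print(f"{temp_dir} looks like a {location} path")
```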
The answers came back with the following:
- Both the good and bad users had %temp% pointing to the local temp directory, so none were pointing to network locations.
- The Process Monitor logs had some interesting information. We received a log for a good user and a log for a bad user, and here is what we saw:
  - The bad user has the dexsql.log turned on. So they definitely need to turn the dexsql.log off for the bad user and delete the existing dexsql.log.
  - We could see that they have Symantec Antivirus running. They need to add the following file types as exceptions in the antivirus software: .dat, .idx, .tmp, .mdf, .ldf and .dic. They need to do this on any machine running Symantec Antivirus, including the SQL Server.
  - The biggest issue is in the network activity for the bad user: it shows CRL.VERISIGN.NET:http, and that is significantly bogging things down. While the VSTools window opens, CRL.VERISIGN.NET:http alone accounts for 1 minute and 31 seconds. The good user's Process Monitor log shows no activity for CRL.VERISIGN.NET:http at all. I believe it has something to do with SSL, certificates, or network encryption, but whatever it is, if you open the Process Monitor log for the bad user and unmark everything except network activity, you will see that is where the time is being spent. The good user has no such network activity, and that is why his windows open quickly.
So at this point, it looks like the accessing of CRL.VERISIGN.NET:http is causing the slowdown and we need to find out what this is and why it's running for certain users.
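I found this by filtering in the Process Monitor window, but the same question ("which remote host is eating the time?") can be asked of a CSV export. The sketch below is an illustration only. It assumes the log was saved as CSV with `Operation`, `Path`, and `Duration` columns (the Duration column has to be enabled before exporting), and that network events have a path of the form `local:port -> remote:port`. The sample rows are invented for the example and are not taken from the customer's log.

```python
import csv
import io
from collections import defaultdict


def network_time_by_endpoint(csv_file):
    """Sum the Duration of TCP events per remote endpoint."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        if not row["Operation"].startswith("TCP"):
            continue  # skip file and registry events
        # Assumed path format: "local:port -> remote:port"
        remote = row["Path"].split("->")[-1].strip()
        totals[remote] += float(row["Duration"] or 0)
    return dict(totals)


# Invented sample rows, for illustration only.
sample = io.StringIO(
    '"Operation","Path","Duration"\n'
    '"TCP Connect","CTX01:1034 -> CRL.VERISIGN.NET:http","1.5"\n'
    '"TCP Receive","CTX01:1034 -> CRL.VERISIGN.NET:http","2.0"\n'
    '"TCP Send","CTX01:1035 -> SQL01:ms-sql-s","0.25"\n'
    '"ReadFile","C:\\Temp\\example.tmp","0.001"\n'
)

totals = network_time_by_endpoint(sample)
# Print the slowest endpoints first.
for endpoint, seconds in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{endpoint}: {seconds:.2f}s")
```

For a real export, pass an open file instead of the sample; opening it with `encoding="utf-8-sig"` avoids trouble if the file begins with a byte-order mark.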
The reply came back with the following:
This issue seems to be a “CryptoAPI certificate chain validation” problem. Here are our most recent findings:
- Users experiencing slowness either do not have the ‘CryptoAPI Cache’ folder in their Windows profile at all (C:\Documents and Settings\UserName\Application Data\Microsoft\CryptnetUrlCache) or have no CRLs in the folder.
Apparently when the above is true, the CryptoAPI goes directly to the Certificate Authorities (e.g. VeriSign) in an attempt to obtain a valid certificate.
- From a review of the various Windows profiles, users who are not experiencing the problem have an existing CryptoAPI cache. The directories and certification cache files are created automatically when an internet connection is established under the user profile. The default internet connection on their servers was not defined with a default proxy setting, so each Citrix user had to configure the proxy setting themselves. Once this was done, the issue was resolved.
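The cache check the customer describes could be repeated across profiles with a small script. This is a hedged Python sketch, not something that was run in the case: it uses the folder layout quoted above, the function name `crl_cache_status` is hypothetical, and it only reports whether the folder exists and contains any files, not whether those files are valid CRLs.

```python
import tempfile
from pathlib import Path


def crl_cache_status(profile_dir):
    """Return 'missing', 'empty', or 'populated' for a profile's CRL cache."""
    cache = (Path(profile_dir) / "Application Data"
             / "Microsoft" / "CryptnetUrlCache")
    if not cache.is_dir():
        return "missing"
    # Count files in the folder and any subfolders.
    has_files = any(p.is_file() for p in cache.rglob("*"))
    return "populated" if has_files else "empty"


# Demonstration against an empty throwaway folder standing in for a profile.
with tempfile.TemporaryDirectory() as fake_profile:
    print(crl_cache_status(fake_profile))
```

On the Citrix server, the argument would be each user's folder under C:\Documents and Settings, and the "missing" and "empty" profiles should line up with the slow users.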
This is maybe more of an anatomy of a support case than just the answer to this issue. In the end, the dexsql.log being turned on also wasn't helping matters, and the antivirus should have those exclusions. So that was a little bit of the problem, but the main issue was that when the .NET form was being opened, it was trying to download certificate information from the certificate authority for that user.
In hindsight, I should have gone to Process Monitor earlier in the case, but hindsight is always 20/20! It's a great tool for diagnosing performance issues and I plan to use it more in the future.