Recently, a customer had issues with an ASP.Net application connecting to an Oracle database using the DataDirect provider. The customer was seeing runaway memory usage and was using a 40 minute App Pool recycle schedule to combat it. This worked for a while, but during App Pool recycles, they were seeing connection timeouts to the database.
It turned out that the timeouts were directly caused by a patch addressing mutex issues that needed to be applied to Oracle. However, many connection settings in the application contributed to the problem, and in the process of debugging the issue I learned a lot about connections that I’ll try to share here.
In this system, all application users are also database users. Many consider this to be a security best practice in Oracle, but I’ll leave discussion of that for a later date. The problem arises because DataDirect does connection pooling based on unique connection strings (every unique connection string gets its own pool). In this setup, ever user was getting his own pool to each database in the application, which led to rapidly increased memory usage under load.
One issue we identified that was contributing to the high memory usage was that the Load Balance Timeout connection string parameter was set to 1 hour. This meant that, combined with the 40 minute App Pool recycle, that user connections were never being gracefully expired from the pool. Every user had their own connection pool that lived until the App Pool recycled and killed every pool at once.
Changing this timeout parameter to 10 minutes allowed us to change the App Pool recycle to a daily event. Users did not see any impact, because their connections only expired after 10 minutes of inactivity, and the penalty for re-establishing a connection was rare and comparatively minor.