← Back to Dashboard
Critical Incident Report

MSSQL Connection Timeouts & Resource Exhaustion

1. Executive Summary:
The server experienced connection timeouts because active database connections (1,745) exceeded the Linux ulimit (1,024). This prevents the SQL service from opening new network sockets, causing connection attempts to hang.

2. Diagnostics & Verification

Use these commands to verify if you are hitting OS-level limits:

# Check Active Connections (OS Level)
netstat -an | grep 1433 | wc -l

# Check Connection States (Look for high ESTABLISHED counts)
netstat -an | grep 1433 | awk '{print $6}' | sort | uniq -c

# Check Current OS Ceiling
ulimit -Sn

3. Immediate Resolution

Increase the file descriptor limit for the MSSQL service to allow its full enterprise capacity.

# 1. Modify the Service Configuration
sudo systemctl edit mssql-server

# 2. Add these lines in the editor:
[Service]
LimitNOFILE=64000

# 3. Apply and Restart
sudo systemctl daemon-reload
sudo systemctl restart mssql-server

4. Root Cause: Addressing Connection Leaks

Increasing limits is a "band-aid." 1,700+ connections usually indicates an application-level connection leak.

Identify Source IP
Run ss -atp | grep :1433 to find which remote server is flooding the database.
Code Review
Ensure all DB calls use Dispose, Close, or using blocks.

Check for "Sleeping" Sessions

Run this in SQL Management Studio to find idle sessions that weren't closed properly:

SELECT session_id, login_name, host_name, status 
FROM sys.dm_exec_sessions 
WHERE status = 'sleeping' AND is_user_process = 1;

5. Enterprise Best Practices