Symptoms
Network services (w3svc, named, ftpsvc, rdp) do not work inside containers or stop working after some time.
The following errors appear in the Event Viewer on the hardware node:
Event ID 2020
Event Type: Error
Event Source: Srv
Event Category: None
Event ID: 2020
Description: The server was unable to allocate from the system paged pool because the pool was empty.
Event ID 2019
Event Type: Error
Event Source: Srv
Event Category: None
Event ID: 2019
Description: The server was unable to allocate from the system Non-Paged pool because the pool was empty.
Various processes may produce a WIN32 1450 error:
C:\>net helpmsg 1450
1450: Insufficient system resources exist to complete the requested service
or fail with code 0x8:
C:\>net helpmsg 8
Not enough storage is available to process this command
For example, containers may fail to start with one of these errors:
C:\>vzctl start 101
Starting container ...
Parallels Virtuozzo Containers API function call 'VZVolumeMountExW' failed (C:\vz\Private\101\root.efd, {04588fbf-09b5-42a1-af9b-5f0031dd511c}) err = 1450
Parallels Virtuozzo Containers API function call 'dq_mount' failed
Cannot set disk quota for container 101
Cannot mount disk for container 101
Or
C:\>vzctl start 101
Starting container ...
Parallels Virtuozzo Containers API function call 'VzkrnlStartVps' failed dwErr=0x00000008
Container 101 is not started
The following error code may appear as well:
Virtuozzo API function call 'VzkrnlStartVps' failed dwErr=0x0000013D
Container 101 is not started
Exec '@VzOnShutdown' failed in container 101
Or a backup of a container cannot be created:
* Operation backup_env with the Env(s) ct101 is started
* Backing up environment ct101 to backupnode
...
* Operation backup_env with the Env(s) ct101 is finished with errors:
Failed to backup partition Acronis Error: Module=7 Code=4, "Write error"{=0}, Tag=0
Acronis Error: Module=4 Code=3, "Error writing the file."{function="WriteFile"}, Tag=7ceb2cdc9fb1200f
Acronis Error: Module=0 Code=fff0, "Insufficient system resources exist to complete the requested service."{code=ffffffff800705aa}, Tag=bd28fdbd64edb816 .
Backup operation for node 'vz1' failed: Failed to backup partition Acronis Error: Module=7 Code=4, "Write error"{=0}, Tag=0
Acronis Error: Module=4 Code=3, "Error writing the file."{function="WriteFile"}, Tag=7ceb2cdc9fb1200f
Acronis Error: Module=0 Code=fff0, "Insufficient system resources exist to complete the requested service."{code=ffffffff800705aa}, Tag=bd28fdbd64edb816
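The codes in the logs above are related: the `code=ffffffff800705aa` value in the Acronis output is the same Win32 error 1450, wrapped in an HRESULT and sign-extended to 64 bits. The following Python sketch shows the decoding; the function name is illustrative and not part of any Parallels or Acronis tool.

```python
from typing import Optional

FACILITY_WIN32 = 7  # HRESULT facility used for wrapped Win32 error codes


def win32_code_from_hresult(hresult: int) -> Optional[int]:
    """Return the Win32 error code wrapped in an HRESULT, or None if the
    HRESULT does not belong to FACILITY_WIN32."""
    value = hresult & 0xFFFFFFFF       # drop the 64-bit sign extension
    facility = (value >> 16) & 0x7FF   # bits 16-26 hold the facility
    if facility != FACILITY_WIN32:
        return None
    return value & 0xFFFF              # low 16 bits hold the error code


# Value taken from the backup log above
print(win32_code_from_hresult(0xFFFFFFFF800705AA))  # 1450
```

The result can then be passed to `net helpmsg`, as shown earlier for codes 1450 and 8.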
Cause
The server is running short of nonpaged (NP) pool memory, which the kernel uses for critical operations. On Windows 2003 x86 systems, the NP pool is limited to 256 MB.
Because of memory management constraints on 32-bit operating systems, the limit is small:
- 32-bit Windows Server 2003 with 2GB or more of RAM will have a nonpaged pool limit of 256MB
- 32-bit Windows Server 2008 will have a nonpaged pool limit of either 2GB or slightly more than 75% of physical memory, whichever is smaller
On 64-bit operating systems, which have a much larger address space, the nonpaged pool has higher limits:
- 64-bit Windows Server 2003 will have a nonpaged pool limit of either 128GB or 40% of physical memory, whichever is smaller
- 64-bit Windows Server 2008 (or 2008 R2) will have a nonpaged pool limit of either 128GB or slightly more than 75% of physical memory, whichever is smaller
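The limits listed above can be turned into a quick estimate for a given node. This is a rough Python sketch based only on the figures in this article: the function name and the `"2003"`/`"2008"` labels are illustrative, "slightly more than 75%" is rounded down to 75%, and 32-bit Windows Server 2003 hosts with less than 2GB of RAM are not covered.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def nonpaged_pool_limit(os_version: str, bits: int, ram_bytes: int) -> int:
    """Approximate nonpaged pool limit in bytes for the listed OS variants."""
    if os_version == "2003" and bits == 32:
        if ram_bytes < 2 * GB:
            # The 256MB figure is only stated for hosts with 2GB of RAM or more
            raise ValueError("not covered: 32-bit 2003 with less than 2GB RAM")
        return 256 * MB
    if os_version == "2003" and bits == 64:
        return min(128 * GB, ram_bytes * 40 // 100)
    if os_version == "2008" and bits == 32:
        return min(2 * GB, ram_bytes * 75 // 100)
    if os_version == "2008" and bits == 64:  # also used here for 2008 R2
        return min(128 * GB, ram_bytes * 75 // 100)
    raise ValueError("unknown OS variant")


# Compare the variants for a node with 16GB of RAM
for version, bits in (("2003", 32), ("2008", 32), ("2003", 64), ("2008", 64)):
    limit = nonpaged_pool_limit(version, bits, 16 * GB)
    print(f"Windows Server {version} {bits}-bit: ~{limit // MB} MB")
```

The comparison makes it clear why a 32-bit Windows Server 2003 node is the most likely to hit the limit regardless of how much RAM is installed.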
When the NP pool is exhausted, the system becomes slow and unresponsive, and some software components stop working normally (for example, IIS starts refusing connections).
An NP pool shortage can be caused by memory leaks in third-party software, by malware, or by overloading the system with resource-intensive operations.
Analyzing kernel memory usage with Pool Monitor
Use Windows Task Manager to check the nonpaged pool value. If it is high (more than 200MB on a 32-bit system), analyze how the pool is used and tune the server accordingly.
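The same value can also be read from a script, which is convenient for periodic checks. The following is a minimal Python sketch that calls the Win32 `GetPerformanceInfo` function through `ctypes`; it assumes a Python interpreter is available on the node, and the helper names and the default threshold (the 200MB figure mentioned above) are illustrative.

```python
import ctypes
import sys


class PERFORMANCE_INFORMATION(ctypes.Structure):
    # Layout of the Win32 PERFORMANCE_INFORMATION structure (psapi.h)
    _fields_ = [
        ("cb", ctypes.c_ulong),
        ("CommitTotal", ctypes.c_size_t),
        ("CommitLimit", ctypes.c_size_t),
        ("CommitPeak", ctypes.c_size_t),
        ("PhysicalTotal", ctypes.c_size_t),
        ("PhysicalAvailable", ctypes.c_size_t),
        ("SystemCache", ctypes.c_size_t),
        ("KernelTotal", ctypes.c_size_t),
        ("KernelPaged", ctypes.c_size_t),
        ("KernelNonpaged", ctypes.c_size_t),
        ("PageSize", ctypes.c_size_t),
        ("HandleCount", ctypes.c_ulong),
        ("ProcessCount", ctypes.c_ulong),
        ("ThreadCount", ctypes.c_ulong),
    ]


def nonpaged_pool_bytes() -> int:
    """Return the current nonpaged pool size in bytes (Windows only)."""
    if sys.platform != "win32":
        raise OSError("GetPerformanceInfo is available on Windows only")
    info = PERFORMANCE_INFORMATION()
    info.cb = ctypes.sizeof(info)
    if not ctypes.windll.psapi.GetPerformanceInfo(ctypes.byref(info), info.cb):
        raise ctypes.WinError()
    # KernelNonpaged is reported in pages, so convert it to bytes
    return info.KernelNonpaged * info.PageSize


def is_high(pool_bytes: int, threshold_mb: int = 200) -> bool:
    """True if the pool exceeds the threshold suggested for 32-bit systems."""
    return pool_bytes > threshold_mb * 1024 * 1024


if sys.platform == "win32":
    used = nonpaged_pool_bytes()
    print(f"Nonpaged pool: {used // (1024 * 1024)} MB, high: {is_high(used)}")
```

On 64-bit nodes the 200MB threshold is not meaningful, so pass a `threshold_mb` value that matches the limit of the system being checked.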
Download and install the tools package that contains the poolmon.exe utility:
- For Windows 2008 and higher: Windows Driver Kit (after installation, available in ...\Windows Kits\8.1\Tools\)
- For Windows 2003: Windows Support Tools
Use poolmon to check the usage of the paged and nonpaged memory pools and identify the tags that consume the most memory. Examples of using poolmon are available here: PoolMon Examples.
Here is an example of analyzing kernel memory usage based on poolmon output:
Check the Bytes column. This is the exact utilization.
Note the TPLA and NDpp tags, which can be affected directly. Both originate from the TCP/IP stack and the ndis.sys driver. TPLA always occupies roughly 5MB x NUM_CPUs. NDpp belongs to the packet scheduler.
If the CPU load on the server is not high, the node may have more logical CPUs than it needs. If you limit the number of logical CPUs to a smaller value, TPLA usage will decrease accordingly.
The NDpp share can be reduced (or eliminated completely) by turning off packet schedulers, including the PVC traffic shaper and QoS settings configured for containers.
In this example, there is only one PVC-related tag (Drre), and its nonpaged pool share is only 14MB (out of up to 40MB allowed by design). At this point, there is no sign of an NP pool leak caused by PVC components.
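To judge whether reducing the number of logical CPUs is worthwhile, the "roughly 5MB x NUM_CPUs" rule for TPLA can be compared with the 256MB limit of 32-bit Windows Server 2003. The following Python sketch only does this arithmetic; the constants come from this article and the function names are illustrative.

```python
TPLA_MB_PER_CPU = 5          # "roughly 5MB x NUM_CPUs", an approximation
NP_LIMIT_MB_2003_X86 = 256   # nonpaged pool limit on 32-bit Windows 2003


def tpla_estimate_mb(logical_cpus: int) -> int:
    """Estimated nonpaged pool used by the TPLA tag, in MB."""
    return TPLA_MB_PER_CPU * logical_cpus


def share_of_limit(used_mb: float, limit_mb: float = NP_LIMIT_MB_2003_X86) -> float:
    """Fraction of the nonpaged pool limit taken by the given usage."""
    return used_mb / limit_mb


for cpus in (4, 8, 16, 24):
    used = tpla_estimate_mb(cpus)
    print(f"{cpus:2d} logical CPUs: ~{used} MB TPLA, "
          f"{share_of_limit(used):.0%} of the {NP_LIMIT_MB_2003_X86} MB limit")
```

For example, 16 logical CPUs give an estimate of about 80MB, which is close to a third of the whole pool on such a system.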
Analyzing per-process kernel memory usage
Sometimes specific processes leak kernel memory. Such leaks can be detected by analyzing per-process paged and nonpaged pool usage:
- Open Windows Task Manager, click the Processes tab, go to View -> Select columns on the top menu, and select the PID, Session ID, Handles, Memory - Paged Pool, and Memory - Nonpaged Pool checkboxes.
- Sort by the corresponding columns to find the processes with the largest paged pool usage, nonpaged pool usage, and handle count. These are the candidates for memory leaks.
- Find out which containers and services these processes belong to, stop or disable them, and restart the container. Verify that the paged/nonpaged pool usage has stabilized.
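The sorting step above can also be scripted. The following Python sketch assumes the third-party `psutil` package is installed on the node (an assumption, it is not part of Windows or of PVC); on Windows, its `memory_info()` result exposes `paged_pool` and `nonpaged_pool`, and `num_handles()` returns the handle count. The helper names are illustrative.

```python
import sys


def collect_process_stats():
    """Collect per-process pool usage and handle counts (Windows only)."""
    import psutil  # third-party package; an assumption of this sketch

    stats = []
    for proc in psutil.process_iter(["pid", "name"]):
        try:
            mem = proc.memory_info()
            stats.append({
                "pid": proc.info["pid"],
                "name": proc.info["name"],
                "paged_pool": mem.paged_pool,
                "nonpaged_pool": mem.nonpaged_pool,
                "handles": proc.num_handles(),
            })
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue  # the process exited or cannot be inspected
    return stats


def top_consumers(stats, key, count=5):
    """Return the `count` entries with the largest value of `key`."""
    return sorted(stats, key=lambda row: row[key], reverse=True)[:count]


if sys.platform == "win32":
    rows = collect_process_stats()
    for key in ("nonpaged_pool", "paged_pool", "handles"):
        print(f"Top processes by {key}:")
        for row in top_consumers(rows, key):
            print(f"  {row['pid']:>6} {row['name']:<30} {row[key]}")
```

Running the script several times at intervals and comparing the results helps distinguish a steadily growing process from one that is simply large.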
Additional resources
Pushing the Limits of Windows: Paged and Nonpaged Pool
How to use Memory Pool Monitor (Poolmon.exe) to troubleshoot kernel mode memory leaks
PoolMon Examples
Who's Using the Pool?
(Non)paged memory pool limit, it might be smaller then you expect
Troubleshooting Nonpaged and Paged Pool Errors in Windows
How to find pool tags that are used by third-party drivers
Windows NT Kernel memory pool tags
Managing Container Memory Pools