Symptoms

Network services (w3svc, named, ftpsvc, rdp) do not work inside containers or stop working after some time.

The following errors appear in the Event Viewer on the hardware node:

Event ID 2020
Event Type: Error
Event Source: Srv
Event Category: None
Event ID: 2020
Description: The server was unable to allocate from the system paged pool because the pool was empty.

Event ID 2019
Event Type: Error
Event Source: Srv
Event Category: None
Event ID: 2019
Description: The server was unable to allocate from the system Non-Paged pool because the pool was empty.

Various processes may produce a WIN32 1450 error:

C:\>net helpmsg 1450
1450: Insufficient system resources exist to complete the requested service

or fail with code 0x8:

C:\>net helpmsg 8
Not enough storage is available to process this command

For example, containers may fail to start with this error:

C:\>vzctl start 101     
Starting container ...
Parallels Virtuozzo Containers API function call 'VZVolumeMountExW' failed (C:\vz\Private\101\root.efd, {04588fbf-09b5-42a1-af9b-5f0031dd511c}) err = 1450 
Parallels Virtuozzo Containers API function call 'dq_mount' failed 
Cannot set disk quota for container 101
Cannot mount disk for container 101

Or

C:\>vzctl start 101
Starting container ...
Parallels Virtuozzo Containers API function call 'VzkrnlStartVps' failed dwErr=0x00000008
Container 101 is not started

The following error code may appear as well:

Virtuozzo API function call 'VzkrnlStartVps' failed dwErr=0x0000013D
Container 101 is not started
Exec '@VzOnShutdown' failed in container 101

Or a backup of a container cannot be created:

* Operation backup_env with the Env(s) ct101 is started
* Backing up environment ct101 to backupnode
...
* Operation backup_env with the Env(s) ct101 is finished with errors: 
Failed to backup partition Acronis Error: Module=7 Code=4, "Write error"{=0}, Tag=0 
Acronis Error: Module=4 Code=3, "Error writing the file."{function="WriteFile"}, Tag=7ceb2cdc9fb1200f 
Acronis Error: Module=0 Code=fff0, "Insufficient system resources exist to complete the requested service."{code=ffffffff800705aa}, Tag=bd28fdbd64edb816 .
Backup operation for node 'vz1' failed: Failed to backup partition Acronis Error: Module=7 Code=4, "Write error"{=0}, Tag=0 
Acronis Error: Module=4 Code=3, "Error writing the file."{function="WriteFile"}, Tag=7ceb2cdc9fb1200f 
Acronis Error: Module=0 Code=fff0, "Insufficient system resources exist to complete the requested service."{code=ffffffff800705aa}, Tag=bd28fdbd64edb816
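
The code=ffffffff800705aa value in the Acronis log appears to be an HRESULT that wraps the same Win32 error 1450 shown earlier. The sketch below (Python, with a hypothetical helper named win32_from_hresult) extracts the Win32 code from such a value and converts the hexadecimal dwErr values from the vzctl output to decimal, so they can be passed to net helpmsg. It assumes the standard HRESULT layout, in which facility 7 (FACILITY_WIN32) marks a wrapped Win32 error.

```python
def win32_from_hresult(hresult: int) -> int:
    """Return the Win32 error code wrapped in an HRESULT (hypothetical helper)."""
    hresult &= 0xFFFFFFFF              # drop sign extension, e.g. ffffffff800705aa
    facility = (hresult >> 16) & 0x7FF  # facility occupies bits 16-26
    if facility != 7:                   # 7 = FACILITY_WIN32 (assumed layout)
        raise ValueError("HRESULT does not wrap a Win32 error code")
    return hresult & 0xFFFF             # low 16 bits hold the Win32 code


# Value taken from the Acronis log line above.
print(win32_from_hresult(0xFFFFFFFF800705AA))  # 1450

# dwErr values from the vzctl output, converted for "net helpmsg <code>".
for dw_err in ("0x00000008", "0x0000013D"):
    print(dw_err, "=", int(dw_err, 16))
```

The decimal results can then be looked up on the node with net helpmsg, as in the examples above.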

Cause

The server is running short of nonpaged (NP) pool memory, which the kernel uses for critical operations. On Windows Server 2003 x86 systems, the NP pool is limited to 256 MB.

Because of address-space constraints on 32-bit operating systems, the NP pool limit is comparatively small:

  • 32-bit Windows Server 2003 with 2GB or more of RAM will have a nonpaged pool limit of 256MB
  • 32-bit Windows Server 2008 will have a nonpaged pool limit of either 2GB or slightly more than 75% of physical memory, whichever is smaller

On 64-bit operating systems, which have a much larger address space, the nonpaged pool has higher limits:

  • 64-bit Windows Server 2003 will have a nonpaged pool limit of either 128GB or 40% of physical memory, whichever is smaller
  • 64-bit Windows Server 2008 (or 2008 R2) will have a nonpaged pool limit of either 128GB or slightly more than 75% of physical memory, whichever is smaller
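
The limits listed above can be expressed as a small calculation. The following Python sketch is only an illustration: the function name nonpaged_pool_limit and its parameters are hypothetical, "slightly more than 75%" is approximated as exactly 75%, and the case of 32-bit Windows Server 2003 with less than 2GB of RAM is rejected because this article does not state a limit for it.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def nonpaged_pool_limit(os_version: str, bits: int, ram_bytes: int) -> int:
    """Approximate nonpaged pool limit in bytes, following the list above."""
    if os_version == "2003":
        if bits == 32:
            if ram_bytes < 2 * GB:
                # The article gives no figure for this case.
                raise ValueError("limit not stated for less than 2GB of RAM")
            return 256 * MB
        return min(128 * GB, ram_bytes * 40 // 100)
    if os_version in ("2008", "2008 R2"):
        cap = 2 * GB if bits == 32 else 128 * GB
        # "Slightly more than 75%" is approximated as 75% (assumption).
        return min(cap, ram_bytes * 75 // 100)
    raise ValueError("unsupported OS version: " + os_version)


for os_version, bits, ram_gb in (("2003", 32, 4), ("2008", 32, 4),
                                 ("2003", 64, 16), ("2008", 64, 16)):
    limit = nonpaged_pool_limit(os_version, bits, ram_gb * GB)
    print(f"Windows Server {os_version} {bits}-bit, {ram_gb}GB RAM: "
          f"about {limit // MB} MB")
```

On a real server, the effective limit may differ from these estimates, so treat the result as a rough guide only.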

When the NP pool is exhausted, the system becomes slow and unresponsive, and some software components stop working normally (for example, IIS starts refusing connections).

The NP memory pool shortage can be caused by memory leaks in third-party software, malware, or generally overstraining the system with resource-intensive operations.

Analyzing kernel memory usage with Pool Monitor

Use Windows Task Manager to check the nonpaged pool value. If it is high (more than 200 MB on a 32-bit system), analyze its utilization and fine-tune the server.

Download and install the tools pack for your Windows version that contains the poolmon.exe utility.

Using poolmon, check the usage of the paged and nonpaged memory pools and identify the pool tags that consume the most memory. For usage examples, see PoolMon Examples under Additional resources.

Here is an example of analyzing kernel memory usage based on the poolmon output.

Check the Bytes column, which shows the exact amount of memory used by each tag.

Note the TPLA and NDpp tags, which can be influenced directly. Both originate from the TCP/IP stack and the ndis.sys driver. TPLA always occupies roughly 5 MB x NUM_CPUs. NDpp belongs to the packet scheduler.

If the CPU load on the server is not high, the server may have more logical CPUs than it needs. If you limit the number of logical CPUs to a smaller value, TPLA usage decreases accordingly.
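
To get a feel for how much of the NP pool TPLA can take, the rule of thumb above (roughly 5 MB per logical CPU) can be turned into a quick estimate. This Python sketch uses hypothetical helper names and compares the result with the 256 MB limit of 32-bit Windows Server 2003. The CPU counts are illustrative, not measurements from a real node.

```python
MB = 1024 ** 2
TPLA_PER_CPU = 5 * MB        # rule of thumb from this article
NP_POOL_LIMIT = 256 * MB     # 32-bit Windows Server 2003


def tpla_estimate(logical_cpus: int) -> int:
    """Estimated TPLA usage in bytes for the given number of logical CPUs."""
    return logical_cpus * TPLA_PER_CPU


def pool_share(used_bytes: int, limit_bytes: int = NP_POOL_LIMIT) -> float:
    """Percentage of the nonpaged pool limit taken by used_bytes."""
    return 100.0 * used_bytes / limit_bytes


# Illustrative CPU counts only.
for cpus in (16, 8, 4):
    used = tpla_estimate(cpus)
    print(f"{cpus} logical CPUs: about {used // MB} MB, "
          f"{pool_share(used):.2f}% of a 256 MB pool")
```

With these assumptions, halving the number of logical CPUs halves the estimated TPLA usage.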

The NDpp share can be reduced (or eliminated completely) by turning off packet schedulers, including the PVC traffic shaper and QoS-configured containers.

In this example, there is only one PVC-related tag (Drre), and its share of the NP pool is only 14 MB (out of up to 40 MB allowed by design). This means the NP pool shortage here is not caused by a leak in PVC components.

Analyzing per-process kernel memory usage

Sometimes specific processes leak kernel memory. This can be detected by analyzing per-process paged and nonpaged pool usage:

  1. Open Windows Task Manager, click the Processes tab, go to View -> Select columns on the top menu, and select the PID, Session ID, handles, memory paged pool and nonpaged pool checkboxes.

  2. Identify the processes with the largest paged pool usage, nonpaged pool usage, and handle count by sorting the corresponding columns. These are the candidates for memory leaks.

  3. Find out which containers and services these processes belong to, stop or disable them, and restart the container. Verify that the paged and nonpaged pool usage has stabilized.
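
The sorting in step 2 can also be scripted when the per-process figures are available as data. The Python sketch below ranks process records by a chosen column. The helper name top_pool_consumers, the record layout, and all sample values are hypothetical and serve only to illustrate the idea. On Windows, the third-party psutil package is one possible source of such figures, as its documented memory_info() result includes paged_pool and nonpaged_pool fields.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, int]]


def top_pool_consumers(processes: List[Record], column: str,
                       count: int = 3) -> List[Record]:
    """Return the `count` records with the largest value in `column`."""
    return sorted(processes, key=lambda p: p[column], reverse=True)[:count]


# Hypothetical sample data (sizes in KB); not taken from a real node.
sample: List[Record] = [
    {"pid": 1204, "name": "w3wp.exe", "paged_pool": 310,
     "nonpaged_pool": 95, "handles": 1850},
    {"pid": 2316, "name": "svchost.exe", "paged_pool": 120,
     "nonpaged_pool": 640, "handles": 22400},
    {"pid": 3480, "name": "sqlservr.exe", "paged_pool": 540,
     "nonpaged_pool": 210, "handles": 3100},
]

for column in ("paged_pool", "nonpaged_pool", "handles"):
    leaders = top_pool_consumers(sample, column, count=2)
    print(column, "->", [(p["pid"], p["name"]) for p in leaders])
```

A process that leads in several columns at once, or whose values keep growing between checks, is a likely candidate for step 3.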

Additional resources

Pushing the Limits of Windows: Paged and Nonpaged Pool
How to use Memory Pool Monitor (Poolmon.exe) to troubleshoot kernel mode memory leaks
PoolMon Examples
Who's Using the Pool?
(Non)paged memory pool limit, it might be smaller then you expect
Troubleshooting Nonpaged and Paged Pool Errors in Windows
How to find pool tags that are used by third-party drivers
Windows NT Kernel memory pool tags
Managing Container Memory Pools
