
basic rctl process.max-stack-size (value 8388608) exceeded. Sun(c)

from dhelios.blogspot.com


SYMPTOMS

Messages like the following are logged to the messages file on Solaris[TM] 10:

May 23 17:48:32 g4as7030 genunix: [ID 883052 kern.notice] basic rctl process.max-stack-size (value 8388608) exceeded by process 20431

CHANGES

Logging of these messages is not enabled by default; enabling it with rctladm(1M) makes them visible, but they cannot be correctly interpreted without some knowledge of how the internal implementation works.

CAUSE

For those very rare processes which do need more than a few KB of stack, it will be beneficial to map large pages into their stack segments, when possible, because these processes will then run more efficiently.  The system will allocate pages whenever another page is needed so the process can make progress.  This is one of the main responsibilities of the virtual-memory subsystem of any modern operating system and it happens hundreds of times every minute.  But that has little to do with stack-growing.

Growing a stack happens whenever a process is exec(2)'d (bringing the first 8K page of its stack into existence) and then whenever the process needs more room on its stack than it currently has.  The messages are an artifact of an internal implementation detail, in this case a side effect of variable page sizes.  The vast majority of processes need only 8K, and can get along for their entire lifetime with a single stack page of this smallest size.  A minority of processes need 16K or 24K of stack, and their stacks are grown by one or two pages as needed.  A vanishingly small minority of processes need more than 24K; these have their stacks grown several times during the process lifetime, switching to larger pages when appropriate (which is when these rctl checks happen and when these messages may get logged, if the logging has been enabled).

The max-stack-size rctl is also used to keep other address-space mappings outside the range reserved for this growth, so that the stack does not bump into a shared library or anonymous mapping when we later do try to grow it.

Growth is triggered simply by memory accesses into the "yellowzone" below the already existing stack pages.  The pagefault trap handler recognizes them as accesses to memory destined to become part of the stack, and calls a kernel function named grow().
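The ceiling that these growth attempts are checked against is simply the process's stack limit. As a quick illustration (a sketch using Python's standard resource module, not anything Solaris-specific), any process can read its own ceiling; on Solaris 10 and later the soft limit reported here is backed by the basic process.max-stack-size rctl, while on other POSIX systems it is the classic RLIMIT_STACK value:

```python
import resource

# Read the soft and hard stack limits of the current process.  The soft
# limit is the ceiling stack growth is checked against; 8388608 bytes
# (8 MB) is a common default, matching the value in the logged message.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("soft:", "unlimited" if soft == resource.RLIM_INFINITY else soft)
print("hard:", "unlimited" if hard == resource.RLIM_INFINITY else hard)
```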

In the past, when all sun4u pages were 8K in size, things were simple: one check against the stack size limit sufficed.  If the check passed, the stack would be grown by enough zero-initialized pages to contain the desired address.  If the check failed, no growth was possible, and the process would then be in trouble, typically receiving a signal (indicating that the pagefault could not be satisfied, rather than arising from a resource-control action) and almost always dumping core, unless the process was programmed to catch and handle the signal on an alternate stack.

But nowadays we have large pages, and processes benefit in performance from using them even in the stack segment, once it has grown large enough to make this worthwhile.  So the grow() algorithm has become more complex.  It now tries to convert the existing stack segment to a larger page size when the segment is large enough to benefit from that and when large pages are available.  It also tries to use the larger pages for the piece to be grafted on, when possible.  Depending on how large the segment already is, on its current page size, on what page sizes the platform offers, on whether large pages are ready to be used, and on where the requested address falls in the yellowzone, this may require a few successive attempts before something fits.  So a single stack-growing memory access can now result in several checks (up to four on these platforms) against max-stack-size: one check for each page size being tried.
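The retry loop can be modeled in a few lines. This is a toy sketch, not the actual kernel code: it simply tries candidate page sizes largest-first, counts each candidate whose rounded-up stack size would exceed the ceiling as one failed rctl check (one potential syslog message), and settles on the first size that fits. The function name and the simplified fitting rule are inventions for illustration; the page-size list matches the four sizes the T2000 supports.

```python
def pick_growth_pagesize(current_size, needed, ceiling, pagesizes):
    """Toy model of grow(): return (chosen_pagesize, failed_checks).

    Tries page sizes largest-first; a candidate fails when the stack,
    rounded up to that page size, would exceed the ceiling.  Each
    failure corresponds to one max-stack-size rctl check (and one
    potential logged message).  Returns (None, n) if nothing fits.
    """
    failed = 0
    for ps in sorted(pagesizes, reverse=True):
        new_size = -(-(current_size + needed) // ps) * ps  # round up to ps
        if new_size <= ceiling:
            return ps, failed
        failed += 1
    return None, failed  # growth impossible: the process will get a signal

# The four T2000 page sizes and the 8 MB default ceiling from the message:
sizes = [8192, 65536, 4 * 1024**2, 256 * 1024**2]
print(pick_growth_pagesize(24576, 8192, 8 * 1024**2, sizes))
# → (4194304, 1): the 256M attempt fails (one message), the 4M attempt fits
```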

The messages which are displayed come from such attempts to fit a large-page piece onto the growing stack.  This can bump into the default 8M ceiling (or any other configured ceiling), especially when a 4M page on the V240, or a 4M or 256M page on the T2000, is being tried.  The failing attempts result in the messages being logged.  They are indeed denials of the operation attempted by the page-fault handler at that point, but they are not fatal to the process: the handler goes on to try the next smaller page size.  If the 8M ceiling has not yet been exhausted, it will eventually succeed, using one or more 8K pages.  In no case will stack growth ever go beyond the ceiling, nor can the ceiling be raised to make more room inside a running process (though it can be raised to take effect on a future exec(2) of the process itself or of its future children), since ld.so.1 and shared-library mappings are going to be sitting in the way.  (This applies to SPARC and amd64 architectures.  On x86, the stack is at the bottom of the address space and a perpetually unmapped redzone page at virtual address 0x0 is in the way, for catching null-pointer abuses.)
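The "future exec" point above can be demonstrated portably. The sketch below (plain POSIX rlimit behavior via Python's standard resource and subprocess modules, standing in for the Solaris rctl) lowers the soft stack limit in the current process and shows that a freshly exec'd child starts life under the new ceiling, even though the parent's own address-space layout is already fixed:

```python
import resource, subprocess, sys

# A limit changed now cannot reshape the current process's mappings,
# but it is inherited by children across exec(2).
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
new_soft = 4 * 1024 * 1024                 # 4 MB, below the 8 MB default
if hard != resource.RLIM_INFINITY:
    new_soft = min(new_soft, hard)         # never exceed the hard limit
resource.setrlimit(resource.RLIMIT_STACK, (new_soft, hard))

# Exec a child that reports the soft stack limit it was born with.
child = subprocess.run(
    [sys.executable, "-c",
     "import resource; print(resource.getrlimit(resource.RLIMIT_STACK)[0])"],
    capture_output=True, text=True)
print(child.stdout.strip())  # the inherited soft limit
```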

Running
# /bin/pmap -sx PID[,...]

against a process or processes of interest will display, among other things, the page sizes currently in use for each mapped segment.  You may find a few processes on your system which do use larger pages for their stack segments (at the top of the address space and thus near the end of the pmap output, and marked as "[ stack ]"), which will confirm that this mechanism is indeed getting triggered under the workloads of this system and that it is indeed operating successfully.
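When scanning many processes, picking the stack segments out of the pmap output can be scripted. The sketch below parses the `-sx` column layout; the SAMPLE text is made-up illustrative output, not captured from a real system, and real output may contain additional segment lines:

```python
# Illustrative parser for `pmap -sx` output: report the page size in use
# for each segment marked "[ stack ]".  SAMPLE is invented example data
# following the pmap(1) -sx column layout (Pgsz is the sixth field).
SAMPLE = """\
 Address  Kbytes     RSS    Anon  Locked Pgsz Mode   Mapped File
FFB80000     512     512     512       -  64K rw--    [ stack ]
FFBFC000      16      16      16       -   8K rw--    [ stack ]
"""

def stack_pagesizes(pmap_output):
    """Return the Pgsz column value for every stack segment line."""
    sizes = []
    for line in pmap_output.splitlines():
        if "[ stack ]" in line:
            sizes.append(line.split()[5])  # the Pgsz column
    return sizes

print(stack_pagesizes(SAMPLE))  # → ['64K', '8K']
```

A stack segment showing 64K or larger in the Pgsz column is exactly the large-page conversion described above having succeeded.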

SOLUTION

This is NOT a "problem". The messages merely reflect an aspect of normal system operation.

The messages may be quieted in syslog by using:
# rctladm -d syslog process.max-stack-size
