SQL2043N and the Linux randomize_va_space feature

Date: 2022-03-14 04:06


[db2inst1@limt ~]$ db2 ? SQL2043N

SQL2043N  Unable to start a child process or thread.
Unable to start up the child processes or threads required during the
processing of a database utility. There may not be enough available
memory to create the new process or thread. The utility stops
User response: 
Ensure the system limit for number of processes or threads has not been
reached (either increase the limit or reduce the number of processes or
threads already running). Ensure that there is sufficient memory for the
new process or thread. Resubmit the utility command  

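Judging from the description, the database failed while requesting memory, yet memory should have been plentiful: the machine already had 16 GB back on Red Hat 4.6, and it has since been upgraded to Red Hat 5.5.
I then found that other systems in our group also hit SQL2043N from time to time, so this did not look like a one-off. That evening I went home and searched Baidu for SQL2043N, which turned up the following: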

ASLR or Address Space Layout Randomization is a feature that is activated by default on some of the newer linux distributions. It is designed to load shared memory objects in random addresses.

In DB2, multiple processes map a shared memory object at the same address across the processes. It was found that DB2 cannot guarantee the availability of address for the shared memory object when ASLR is turned on.

Important note: DB2 10.1 has been enhanced so that ASLR can be safely enabled.

This conflict in the address space means that a process trying to attach a shared memory object to a specific address may not be able to do so, resulting in a failure in shmat subroutine. However, on subsequent retry (using a new process) the shared memory attachment may work. The result is a random set of failures. Some processes that have been known to see this error are: db2pd, db2egcf, and db2vend.
Some of the behaviors seen include the following:
For the db2pd command, it will report no data found even though the instance / database is active:
Database SAMPLE not activated on database partition 0.

For the db2egcf process, used for HA monitoring, the db2egcf may incorrectly determine the instance is down and initiate a failover.

For the db2vend process, backup and log archive methods might fail with an error indicating a child process could not be started:
SQL2043N  Unable to start a child process or thread.  

Diagnosing the problem
When this problem is suspected, check db2diag.log for the shmat failure like the following. Note that the same error message can also occur for a different cause. Hence, it's important to note the process that reported this error.

FUNCTION: DB2 UDB, SQO Memory Management, sqlocshr, probe:180
MESSAGE : ZRC=0x850F0005=-2062614523=SQLO_NOSEG
          "No Storage Available for allocation"
          DIA8305C Memory allocation failure occurred.
CALLED  : OS, -, shmat                    OSERR: EINVAL (22)
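
The check described above can be sketched as a small script. This is only a minimal sketch under stated assumptions: records in db2diag.log are taken to be separated by blank lines, the `CALLED` and `OSERR` fields are assumed to look like the excerpt above, and a `PROC :` field naming the reporting process is assumed to be present in real records (it is not shown in the excerpt). The function name `find_shmat_failures` is hypothetical.

```python
import re


def find_shmat_failures(log_text):
    """Return one dict per log record that reports shmat failing with EINVAL."""
    hits = []
    # Assumption: records are separated by one or more blank lines.
    for record in re.split(r"\n\s*\n", log_text):
        if not re.search(r"CALLED\s*:\s*OS,\s*-,\s*shmat\b", record):
            continue
        if "EINVAL" not in record:
            continue
        func = re.search(r"FUNCTION\s*:\s*(.+)", record)
        # Assumption: the record names the reporting process in a PROC field.
        proc = re.search(r"PROC\s*:\s*(\S+)", record)
        hits.append({
            "function": func.group(1).strip() if func else None,
            "process": proc.group(1) if proc else None,
        })
    return hits


# The excerpt quoted above, used as sample input.
SAMPLE = """\
FUNCTION: DB2 UDB, SQO Memory Management, sqlocshr, probe:180
MESSAGE : ZRC=0x850F0005=-2062614523=SQLO_NOSEG
          "No Storage Available for allocation"
          DIA8305C Memory allocation failure occurred.
CALLED  : OS, -, shmat                    OSERR: EINVAL (22)
"""

if __name__ == "__main__":
    for hit in find_shmat_failures(SAMPLE):
        print(hit)
```

Because the same message can have other causes, the sketch returns the reporting function and process for each hit rather than just a count, so that entries from db2pd, db2egcf or db2vend can be told apart from unrelated ones.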

Resolving the problem
1) Disable ASLR temporarily (change is only effective until next boot):
Run "sysctl -w kernel.randomize_va_space=0" as root.

2) Disable ASLR immediately and on all subsequent reboots:

Add the following line to /etc/sysctl.conf:
kernel.randomize_va_space=0
and then run "sysctl -p" as root to make the change take effect immediately.  
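
To confirm that the persistent setting is really in place, the contents of /etc/sysctl.conf can be checked programmatically. The following is a minimal sketch, assuming the usual `key = value` layout with `#` or `;` comment lines; the function name `persisted_aslr_value` is hypothetical and is not part of any DB2 or Linux tool.

```python
def persisted_aslr_value(conf_text):
    """Return the value sysctl.conf assigns to kernel.randomize_va_space, or None."""
    value = None
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "kernel.randomize_va_space":
            # Settings are applied in file order, so a later line wins.
            value = val.strip()
    return value


if __name__ == "__main__":
    sample = "# disable ASLR for DB2\nkernel.randomize_va_space = 0\n"
    print(persisted_aslr_value(sample))
```

In practice the text would come from reading /etc/sysctl.conf; a return value of `None` means the file does not set the parameter at all, so the change made with "sysctl -w" would be lost at the next boot.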

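Roughly speaking, Linux's address randomization feature keeps DB2 processes from attaching correctly to a shared memory object. So why does Linux enable this feature in the first place?
Searching Baidu for the keyword randomize_va_space: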

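The Linux kernel introduced the concept of address space layout randomization for security reasons. Consider: if the addresses of the stack space were all fixed, malicious code could easily ...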

If the value in /proc/sys/kernel/randomize_va_space is 0, all randomization is turned off; if it is 1, randomization of the mmap base, the stack and the VDSO page is turned on; if ...
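
The current runtime value can be read and interpreted with a few lines of code. This is a minimal sketch: the meanings of 0 and 1 are the ones given above, while the meaning of 2 is taken from the Linux kernel sysctl documentation rather than from this post. The names `ASLR_MODES`, `describe_aslr` and `read_aslr` are hypothetical.

```python
ASLR_MODES = {
    "0": "all randomization disabled",
    "1": "mmap base, stack and VDSO page randomized",
    # Not from the text above: per the kernel sysctl documentation,
    # 2 additionally randomizes the heap (brk).
    "2": "as 1, plus heap (brk) randomization",
}


def describe_aslr(raw):
    """Map the raw file content (e.g. '2\\n') to a short description."""
    value = raw.strip()
    return ASLR_MODES.get(value, "unrecognized value: " + repr(value))


def read_aslr(path="/proc/sys/kernel/randomize_va_space"):
    """Read the runtime setting from procfs; this only works on Linux."""
    with open(path) as f:
        return describe_aslr(f.read())


if __name__ == "__main__":
    print(describe_aslr("0\n"))
```

On a host running a DB2 release older than 10.1, anything other than the first description suggests that the shmat failures discussed above are possible.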


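With that in mind, I suddenly remembered that db2pd also reports SQL2043N now and then in everyday use, and that simply running it again works. That fits, because db2pd attaches to DB2's shared memory to obtain database ...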




