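Many services on our servers were started by shell scripts. To make them easier to manage, we switched them over to systemd units controlled with systemctl.
After a restart in the morning everything worked, but during the evening traffic peak a large number of users could not connect to the server.
The server process logs were full of warning entries.
A senior colleague tracked down the cause: a process started through systemctl does not pick up the resource limits configured for login shells (the values that ulimit reports). Once the process reached systemd's default limit, it could not open any more files.
The limits in a root login shell: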
[root@kilig ~]# ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 15082
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 102400
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 102400
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Compare that with the limits of the process started by systemctl:
[root@kilig ~]# cat /proc/1024/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             1024                 1024                 processes
Max open files            1024                 1024                 files
Max locked memory         8048                 8048                 bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       1508                 1508                 signals
Max msgqueue size         8192                 8192                 bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
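The difference is obvious: the shell allows 102400 open files and 102400 processes, while the service process is capped at 1024 for both.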
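To repeat this comparison without reading both listings by eye, something like the following could be used. It is only a sketch: the function names soft_nofile and compare_nofile are made up for illustration, and it assumes a Linux /proc filesystem and a shell whose ulimit builtin accepts -n (bash and dash do).

```shell
#!/bin/sh
# Print the soft "Max open files" value from a /proc/<pid>/limits file.
# Fields are split on whitespace: Max open files <soft> <hard> files
soft_nofile() {
    awk '/^Max open files/ { print $4 }' "$1"
}

# Compare the current shell's soft open-files limit with a running process.
# Usage: compare_nofile <pid>   (helper name is hypothetical)
compare_nofile() {
    shell_limit=$(ulimit -n)
    proc_limit=$(soft_nofile "/proc/$1/limits")
    echo "shell: $shell_limit  process $1: $proc_limit"
    if [ "$shell_limit" != "$proc_limit" ]; then
        echo "limits differ: the process did not get the shell's setting"
    fi
}
```

Calling compare_nofile with the service's PID (1024 in the listing above) would print both values side by side.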
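The fix is to set the limits explicitly in the [Service] section of the unit file with systemd's Limit* directives (LimitNOFILE for open files, LimitNPROC for processes, LimitCORE for core dumps):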
[Unit]
Description=kilig.systemctl
[Service]
LimitCORE=infinity
LimitNOFILE=102400
LimitNPROC=102400
WorkingDirectory=xxxxxxx
ExecStart=xxxxxxxxxxxxxx
ExecStopPost=xxxxxxxxxxxxxxxxxxxxxx
Restart=always
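After editing the unit file, systemd has to reload it and the service has to be restarted before the new limits apply. A rough sketch of how that could be done and checked follows. The function name apply_and_check is hypothetical, the unit name is whatever your service is called, and the --value flag assumes a reasonably recent systemd.

```shell
#!/bin/sh
# Reload unit files, restart the service, then print the limits systemd
# has configured for the unit and the limits of the live main process.
# Usage: apply_and_check <unit>   (helper name is hypothetical)
apply_and_check() {
    unit="$1"
    systemctl daemon-reload || return 1
    systemctl restart "$unit" || return 1

    # Limits as systemd understands them for this unit.
    systemctl show "$unit" -p LimitNOFILE -p LimitNPROC

    # Limits of the running process, read from /proc.
    pid=$(systemctl show "$unit" -p MainPID --value)
    if [ -z "$pid" ] || [ "$pid" = "0" ]; then
        echo "no main process found for $unit" >&2
        return 1
    fi
    grep -E '^Max (open files|processes)' "/proc/$pid/limits"
}
```

Run it as root with the unit name as the only argument. If the "Max open files" line still shows 1024, the unit file that systemd loaded is probably not the one that was edited.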