many hanging zmu processes are not exiting, process leak
Re: many hanging zmu processes are not exiting, process leak
I'm sure we're not the only ones with this problem, so our workaround idea may help others too... hopefully, someone will be able to fix this problem in one of the next versions...
Re: many hanging zmu processes are not exiting, process leak
I agree. Something tells me that we're the only ones keeping such metrics, and since it doesn't affect performance, people probably don't notice it.
Re: many hanging zmu processes are not exiting, process leak
Thank you for this.
I have the same problem on a somewhat different setup, and I think the proposed workaround of killing off stray zmu processes will "solve" it for me.
Background:
I've been running Zoneminder for years on an old Fedora server, currently zoneminder-1.36.5-1.fc34.x86_64.
I'm using ZMNinja on my phone, and the API works fine.
I have 7 active remote monitors currently, both wired and wireless, both Modect and Nodect.
The zoneminder service runs as the apache user, same as httpd.
When I activate the zoneminder integration in Home Assistant, and add the cameras to the panel, the httpd (apache) service on the zoneminder server will stall after about 24 hours. The reason seems to be too many zmu processes (I think it stalls at 51), and then the httpd error log says:
Code: Select all
[mpm_event:error] [pid 483768:tid 483768] AH00484: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
As soon as the stray zmu processes are killed, the httpd log bursts with all the requests that are finally coming through.
So I think this is a combined problem connected to the ZoneMinder API, maybe triggered by "wrong" use by the ZoneMinder integration; I'm not sure.
I have put the following line in my root crontab on the server:
Code: Select all
0 */3 * * * pkill --signal kill "^zmu$" >> /dev/null
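If the blanket pkill ever proves too aggressive (it would also kill a zmu that is legitimately mid-request), a more targeted variant could be used instead. This is just a sketch, assuming a procps ps that supports the etimes field; the 10-minute threshold is arbitrary:
Code: Select all
# kill only zmu processes that have been running for more than 10 minutes (600 s)
ps -C zmu -o pid=,etimes= | awk '$2 > 600 {print $1}' | xargs -r kill -9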
Re: many hanging zmu processes are not exiting, process leak
I definitely see this with the Home Assistant integration (which has been throwing various errors since about 3-4 HA versions back, but the stream still works, which is the only thing I care about).
I think this is more about having to change the software that accesses ZM in this "DDoS"-ish manner. That makes me sad, because there has been no activity on the ZM HA integration in quite some time, and I think I'll have to look for a different CCTV app, as compatibility with Home Assistant is the highest priority for me: everything in my home and on my devices is connected and managed by HA. But after all these years of using ZM, that's not something I'm too keen on.
edit:
Perhaps it's related to how often the HA ZM integration logs in? It looks like it pulls a still image from each camera every 10 seconds, and that apparently hammers the DB. I originally noticed this issue when I couldn't watch my camera streams; the ZM web UI showed the DB at 151/151 in yellow, and syslog had messages about the DB not being accessible. The log excerpt below shows the every-10-seconds pattern.
edit2: also worth noting that although I've been using this setup for over 2 years, I first saw this issue only about 4 days ago, and the ZM integration has always behaved like this. It's problems like this where I'm never quite sure whether to ask on the ZM forum or the HA forum.
Code: Select all
zoneminder | Aug 8 09:32:16 1b8519f654dc zms_m6[5534]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:32:16 1b8519f654dc zms_m9[5535]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:32:31 1b8519f654dc zms_m9[5583]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:32:31 1b8519f654dc zms_m6[5584]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:32:41 1b8519f654dc zms_m6[5590]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:32:41 1b8519f654dc zms_m9[5591]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:32:51 1b8519f654dc zms_m6[5600]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:32:51 1b8519f654dc zms_m9[5601]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:01 1b8519f654dc zms_m6[5639]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:01 1b8519f654dc zms_m9[5640]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:11 1b8519f654dc zms_m6[5649]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:11 1b8519f654dc zms_m9[5648]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:21 1b8519f654dc zms_m9[5681]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:21 1b8519f654dc zms_m6[5682]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:31 1b8519f654dc zms_m6[5700]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:31 1b8519f654dc zms_m9[5701]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:40 1b8519f654dc zms_m6[5706]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:40 1b8519f654dc zms_m9[5707]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:41 1b8519f654dc zms_m9[5711]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:42 1b8519f654dc zms_m6[5713]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:42 1b8519f654dc zms_m1[5712]: INF [zms_m1] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:42 1b8519f654dc zms_m2[5710]: INF [zms_m2] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:50 1b8519f654dc zms_m6[5747]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:33:50 1b8519f654dc zms_m9[5748]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:34:00 1b8519f654dc zms_m9[5757]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:34:10 1b8519f654dc zms_m9[5768]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:34:19 1b8519f654dc zms_m9[5797]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:34:30 1b8519f654dc zms_m9[5805]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:34:40 1b8519f654dc zms_m9[5817]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:34:50 1b8519f654dc zms_m9[5848]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:35:00 1b8519f654dc zms_m9[5856]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:35:10 1b8519f654dc zms_m9[5868]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:35:20 1b8519f654dc zms_m9[5897]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:35:30 1b8519f654dc zms_m9[5905]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:35:40 1b8519f654dc zms_m9[5918]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:35:50 1b8519f654dc zms_m9[5948]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:35:59 1b8519f654dc zms_m9[5956]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:36:09 1b8519f654dc zms_m9[5967]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:36:19 1b8519f654dc zms_m9[5996]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:36:29 1b8519f654dc zms_m6[6001]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:36:29 1b8519f654dc zms_m1[6000]: INF [zms_m1] [Authenticated user 'ha']
zoneminder | Aug 8 09:36:29 1b8519f654dc zms_m2[5998]: INF [zms_m2] [Authenticated user 'ha']
zoneminder | Aug 8 09:36:29 1b8519f654dc zms_m9[5999]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:36:29 1b8519f654dc zms_m9[6012]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:36:39 1b8519f654dc zms_m9[6025]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:36:49 1b8519f654dc zms_m9[6054]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:00 1b8519f654dc zms_m9[6061]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:09 1b8519f654dc zms_m9[6075]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:15 1b8519f654dc zms_m6[6077]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:25 1b8519f654dc zms_m6[6106]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:25 1b8519f654dc zms_m9[6107]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:35 1b8519f654dc zms_m6[6126]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:35 1b8519f654dc zms_m9[6127]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:45 1b8519f654dc zms_m6[6136]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:45 1b8519f654dc zms_m9[6137]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:55 1b8519f654dc zms_m9[6167]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:37:55 1b8519f654dc zms_m6[6168]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:38:05 1b8519f654dc zms_m6[6188]: INF [zms_m6] [Authenticated user 'ha']
zoneminder | Aug 8 09:38:05 1b8519f654dc zms_m9[6187]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:38:15 1b8519f654dc zms_m9[6191]: INF [zms_m9] [Authenticated user 'ha']
zoneminder | Aug 8 09:38:15 1b8519f654dc zms_m6[6192]: INF [zms_m6] [Authenticated user 'ha']
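For anyone who wants to confirm how hard the integration is polling, a rough per-monitor count of those authentication lines can be pulled out of the log. Just a sketch, assuming the zms messages land in /var/log/syslog (adjust the path if you use journald or Docker logs):
Code: Select all
# count "Authenticated user 'ha'" lines per zms monitor stream
grep "Authenticated user 'ha'" /var/log/syslog | grep -o "zms_m[0-9]\+" | sort | uniq -c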
Re: many hanging zmu processes are not exiting, process leak
Hi,
This is my first post on the ZoneMinder forum, but I have been using the product for about 2 years.
I also use it with Home Assistant; however, when I upgraded to 1.36, I noticed a lot of errors in the HA logs about not being able to connect to the server (I run it on Ubuntu 18.04).
However, the cameras still appeared to work.
But after a while I noticed that ZoneMinder had crashed and the web page was displaying "Unable to connect to ZM db. SQLSTATE[08004] [1040] Too many connections".
I removed the integration from Home Assistant, and so far it seems to be working.
But it is strange that this only started once I went to 1.36; as I said, I have been using this for about 2 years without any issues...
Did anyone else post on the Home Assistant forums?
Re: many hanging zmu processes are not exiting, process leak
There's an open issue on the zm-py GitHub (zm-py being what the HA integration uses to query ZM, as I understand it) describing the same kind of breakage: https://github.com/rohankapoorcom/zm-py/issues/48. The thing is, the last activity on the zm-py GitHub was 10 months ago...
In the meantime I've found that the workaround of killing off the zmu processes works well. I added the following to my crontab (as I run ZM in Docker) and everything is peachy:
Code: Select all
0 */3 * * * docker exec zoneminder pkill --signal kill "^zmu$" >/dev/null 2>&1
Re: many hanging zmu processes are not exiting, process leak
Having the same problem over here. Each zmu process uses one MySQL connection. I have a maximum of 150, and then the MySQL server stops accepting connections because the limit is reached.
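You can watch how close you are getting to the limit while the zmu processes pile up; a quick check, assuming you can log in to MySQL as root:
Code: Select all
mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'; SHOW VARIABLES LIKE 'max_connections';"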
As a workaround I increased the limit to 1000 and restart apache2 every night.
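In case it helps anyone copying this workaround, it could look something like the lines below; the config file path varies by distro and the 03:30 time is arbitrary:
Code: Select all
# in the MySQL/MariaDB config (e.g. /etc/my.cnf or /etc/mysql/mysql.conf.d/mysqld.cnf), under [mysqld]:
#   max_connections = 1000
# root crontab entry for the nightly apache2 restart:
30 3 * * * systemctl restart apache2 > /dev/null 2>&1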
Re: many hanging zmu processes are not exiting, process leak
I created a cron job to restart ZoneMinder (not apache2) every night, which seems to have helped. I am also gathering data on RAM usage of the various processes to try to figure out what is going wrong. I know the developers have worked hard on resolving these RAM leak issues. 1.36.12 is coming soon, perhaps today, so please upgrade and post whether it has resolved your problems.
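For anyone wanting to do the same, a sketch of what that could look like in the root crontab; the service name, schedule, and log path are only examples and may differ on your system:
Code: Select all
# restart ZoneMinder every night at 03:15
15 3 * * * systemctl restart zoneminder > /dev/null 2>&1
# every 10 minutes, log per-process RSS for the zm*/php-fpm processes to spot growth over time
*/10 * * * * (date; ps -eo rss,comm --sort=-rss | grep -E "zmu|zmc|zma|php-fpm") >> /var/log/zm-rss.log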
Re: many hanging zmu processes are not exiting, process leak
Adding some info about this issue here in case it helps anyone. The visible symptom is that the web server becomes unresponsive and then breaks Home Assistant, zmNinja, and anything using the web interface.
I'm fairly certain it is triggered by Home Assistant's ZoneMinder integration causing the number of active php-fpm processes to grow until it hits the maximum and can no longer serve requests. I tried it on an existing ZM installation and also on a brand new server, with the same result. I'm running 8 Dahua cameras in Mocord mode, hosted on a dedicated Fedora 35 virtual machine. ZM is installed from dnf packages and the settings are generally default other than adding the cameras and storage. Turning on the HA integration will reliably cause the issue within a few hours; with HA off, ZM is stable.
On the server, pairs of hung php-fpm and zmu processes accumulate until they are killed. The PIDs are generally not immediately adjacent - it looks like the php-fpm process serves a number of successful requests before eventually blocking on a failed zmu request. zmu works fine most of the time, but in my case it hangs roughly 5 times per hour. The command line of the hanging zmu is always "/usr/bin/zmu -mX -s", where X is a monitor ID between 1 and 8. The -s flag returns the monitor state as an int, so it's likely hanging somewhere in that function, but I haven't been able to dig further.
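One way to narrow it down is to poll zmu yourself under a timeout and watch for calls that never return. Just a sketch - the monitor ID is an example, and you may need to add -U/-P credentials if OPT_USE_AUTH is enabled:
Code: Select all
# query monitor 1's state every 10 seconds; a hang shows up as exit status 124 from timeout
while true; do timeout 10 /usr/bin/zmu -m1 -s; echo "exit: $?"; sleep 10; done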
If you enable and monitor the php-fpm status page, you can see the number of active processes increase linearly. You can delay the inevitable crash by increasing pm.max_children in /etc/php-fpm.d/www.conf, but eventually it will hit the max and then stop responding. Running "sudo systemctl restart php-fpm httpd" reliably kills the hanging processes and brings the count back to normal without a reboot. I've added that command to a regular cron job until a better fix is available.
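For completeness, the cron entry and the www.conf bits might look roughly like this; the schedule and the status path are just examples, so adjust them to taste:
Code: Select all
# root crontab: restart php-fpm and httpd every 6 hours to clear the stuck workers
0 */6 * * * systemctl restart php-fpm httpd > /dev/null 2>&1
# /etc/php-fpm.d/www.conf: raise the worker cap and expose the status page
#   pm.max_children = 100
#   pm.status_path = /status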
I can provide more info if the devs need anything, but my guess is that Home Assistant's repeated querying occasionally causes zmu to hang while checking monitor state, which eventually causes pm.max_children to block new process creation.
edit: I bolded the workaround for anyone just looking for a quick answer.
Also, this is running on ZM version 1.36.12; the package is "zoneminder-common-1.36.12-1.fc35.x86_64.rpm".
Here is a sample ps aux output from approximately 3 hours after the last reset, if it helps:
Code: Select all
[root@vm-zoneminder ~]# ps aux|grep "zmu\|php-fpm"
root 907317 0.0 0.1 246896 28236 ? Ss 12:01 0:00 php-fpm: master process (/etc/php-fpm.conf)
apache 907318 0.0 0.1 247688 24760 ? S 12:01 0:11 php-fpm: pool www
apache 907319 0.0 0.1 247688 24844 ? S 12:01 0:03 php-fpm: pool www
apache 907320 0.1 0.1 247700 25360 ? S 12:01 0:14 php-fpm: pool www
apache 907321 0.0 0.1 247676 24528 ? S 12:01 0:03 php-fpm: pool www
apache 907322 0.0 0.1 247676 24196 ? S 12:01 0:04 php-fpm: pool www
apache 907349 0.0 0.1 247712 25200 ? S 12:01 0:09 php-fpm: pool www
apache 907368 0.0 0.1 247680 24032 ? S 12:01 0:01 php-fpm: pool www
apache 908269 0.0 0.1 247680 24076 ? S 12:10 0:02 php-fpm: pool www
apache 910424 0.0 0.2 224512 35060 ? Sl 12:28 0:00 /usr/bin/zmu -m5 -s
apache 912220 0.0 0.2 224512 35356 ? Sl 12:45 0:00 /usr/bin/zmu -m5 -s
apache 912264 0.1 0.1 247680 24164 ? S 12:45 0:11 php-fpm: pool www
apache 913080 0.0 0.2 224512 35136 ? Sl 12:53 0:00 /usr/bin/zmu -m8 -s
apache 913094 0.0 0.1 247680 24140 ? S 12:53 0:01 php-fpm: pool www
apache 913099 0.0 0.2 224512 35368 ? Sl 12:53 0:00 /usr/bin/zmu -m3 -s
apache 913146 0.0 0.1 247680 24152 ? S 12:53 0:03 php-fpm: pool www
apache 914098 0.0 0.1 247680 24076 ? S 13:02 0:01 php-fpm: pool www
apache 915079 0.0 0.2 224512 35528 ? Sl 13:10 0:00 /usr/bin/zmu -m3 -s
apache 915269 0.0 0.2 224512 35356 ? Sl 13:12 0:00 /usr/bin/zmu -m7 -s
apache 915297 0.1 0.1 247680 24152 ? S 13:12 0:09 php-fpm: pool www
apache 916059 0.0 0.1 247680 24152 ? S 13:19 0:02 php-fpm: pool www
apache 916197 0.0 0.2 224512 35416 ? Sl 13:20 0:00 /usr/bin/zmu -m3 -s
apache 916241 0.0 0.1 247680 24156 ? S 13:21 0:06 php-fpm: pool www
apache 919632 0.0 0.2 224512 35500 ? Sl 13:51 0:00 /usr/bin/zmu -m2 -s
apache 919894 0.1 0.1 247680 24172 ? S 13:54 0:07 php-fpm: pool www
apache 920415 0.0 0.2 224512 35316 ? Sl 13:58 0:00 /usr/bin/zmu -m4 -s
apache 921714 0.0 0.1 247680 24176 ? S 14:10 0:05 php-fpm: pool www
apache 922612 0.0 0.2 224512 35656 ? Sl 14:18 0:00 /usr/bin/zmu -m6 -s
apache 923794 0.1 0.1 247680 24176 ? S 14:28 0:04 php-fpm: pool www
apache 926695 0.0 0.2 224512 35576 ? Sl 14:55 0:00 /usr/bin/zmu -m7 -s
apache 926972 0.0 0.2 224512 35604 ? Sl 14:57 0:00 /usr/bin/zmu -m7 -s
apache 927026 0.0 0.1 247680 24108 ? S 14:58 0:00 php-fpm: pool www
apache 927989 0.0 0.2 224512 35780 ? Sl 15:06 0:00 /usr/bin/zmu -m2 -s
apache 928053 0.1 0.1 247680 24172 ? S 15:07 0:02 php-fpm: pool www
apache 928365 0.1 0.1 247680 24116 ? S 15:10 0:01 php-fpm: pool www
apache 930517 0.0 0.2 224512 35772 ? Sl 15:29 0:00 /usr/bin/zmu -m2 -s
apache 930636 0.0 0.2 224512 35668 ? Sl 15:30 0:00 /usr/bin/zmu -m4 -s
apache 930675 0.1 0.1 247680 24180 ? S 15:31 0:00 php-fpm: pool www
apache 931357 0.2 0.2 224512 36124 ? Sl 15:37 0:00 /usr/bin/zmu -m4 -s
apache 931404 0.1 0.1 247680 23872 ? S 15:37 0:00 php-fpm: pool www