many hanging zmu processes are not exiting, process leak

Discussions related to the 1.36.x series of ZoneMinder
fmeili1
Posts: 14
Joined: Fri Jun 11, 2021 4:15 pm

Re: many hanging zmu processes are not exiting, process leak

Post by fmeili1 »

:wink:
I'm sure we are not the only ones with this problem, so our workaround idea may help others too. Hopefully someone will be able to fix it in one of the next versions...
jsylvia007
Posts: 116
Joined: Wed Mar 11, 2009 8:32 pm

Re: many hanging zmu processes are not exiting, process leak

Post by jsylvia007 »

I agree. Something tells me that we're the only ones keeping such metrics, and since it doesn't affect performance, people probably don't notice it.
mgartin
Posts: 1
Joined: Sat Aug 07, 2021 9:45 am

Re: many hanging zmu processes are not exiting, process leak

Post by mgartin »

Thank you for this.
I have the same problem on a somewhat different setup, and I think the proposed workaround of killing off stray zmu processes will "solve" it for me.

Background:
I've been running ZoneMinder for years on an old Fedora server, currently zoneminder-1.36.5-1.fc34.x86_64.
I'm using zmNinja on my phone, and the API works fine.
I have 7 active remote monitors currently, both wired and wireless, both Modect and Nodect.
The zoneminder service runs as the apache user, same as httpd.

When I activate the zoneminder integration in Home Assistant, and add the cameras to the panel, the httpd (apache) service on the zoneminder server will stall after about 24 hours. The reason seems to be too many zmu processes (I think it stalls at 51), and then the httpd error log says:

Code: Select all

[mpm_event:error] [pid 483768:tid 483768] AH00484: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
As soon as the zmu processes are killed, the httpd log fills with the backlog of requests that are finally getting through.

So I think this is a combined problem that is connected with the zoneminder API, maybe triggered by "wrong" use by the zoneminder integration, I'm not sure.

I have put the following line in my root crontab on the server:

Code: Select all

0 */3 * * * pkill --signal kill "^zmu$" >> /dev/null
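The cron line above kills every zmu process, including any that are still doing useful work. A gentler variant could kill only the ones that have been running for a while. This is only a sketch: the 300-second cutoff (MAX_AGE) and the function names are made up for illustration, and it assumes a procps-style ps that supports the etimes (elapsed seconds) column.

```shell
#!/bin/sh
# Sketch: kill only zmu processes older than a cutoff.
# MAX_AGE is a hypothetical threshold in seconds, not a ZoneMinder setting.
MAX_AGE="${MAX_AGE:-300}"

# Reads "PID ELAPSED_SECONDS COMMAND" lines on stdin and prints the PIDs
# of zmu processes that have been running longer than $1 seconds.
stale_zmu_pids() {
    awk -v max="$1" '$3 == "zmu" && ($2 + 0) > (max + 0) { print $1 }'
}

# Lists every process as "pid etimes comm" with no header line.
list_processes() {
    ps -e -o pid= -o etimes= -o comm=
}

# Sends SIGKILL to each stale zmu process, like the pkill in the cron line.
kill_stale_zmu() {
    list_processes | stale_zmu_pids "$MAX_AGE" | while read -r pid; do
        kill -KILL "$pid"
    done
}
```

Saved as a script that ends with a call to kill_stale_zmu, it could replace the pkill command in the crontab entry. The stale processes themselves would still need a proper fix in zmu.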
sammael
Posts: 13
Joined: Fri Jul 31, 2020 11:26 am

Re: many hanging zmu processes are not exiting, process leak

Post by sammael »

Definitely seeing this with the Home Assistant integration (which has been throwing various errors for about 3-4 HA versions now, but the stream still works, which is the only thing I care about).

I think this is more about having to change the software that accesses ZM in this "DDoS"-ish manner. That makes me sad, because there has been no activity on the ZM HA integration in quite some time, and I think I'll have to look for a different CCTV app. Compatibility with Home Assistant is my highest priority, as everything in my home and on my devices is connected to and managed by HA. But after all the years of using ZM, switching is not something I'm too keen on.

edit:
Perhaps it's related to how often the HA ZM integration logs in? It looks like it does so every 10 seconds to pull a still image from each camera, and that apparently hammers the DB. I originally noticed this issue when I couldn't watch my camera streams: the ZM web UI showed DB 151/151 in yellow, and syslog had messages about the DB not being accessible.

Code: Select all

zoneminder    | Aug  8 09:32:16 1b8519f654dc zms_m6[5534]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:32:16 1b8519f654dc zms_m9[5535]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:32:31 1b8519f654dc zms_m9[5583]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:32:31 1b8519f654dc zms_m6[5584]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:32:41 1b8519f654dc zms_m6[5590]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:32:41 1b8519f654dc zms_m9[5591]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:32:51 1b8519f654dc zms_m6[5600]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:32:51 1b8519f654dc zms_m9[5601]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:01 1b8519f654dc zms_m6[5639]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:01 1b8519f654dc zms_m9[5640]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:11 1b8519f654dc zms_m6[5649]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:11 1b8519f654dc zms_m9[5648]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:21 1b8519f654dc zms_m9[5681]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:21 1b8519f654dc zms_m6[5682]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:31 1b8519f654dc zms_m6[5700]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:31 1b8519f654dc zms_m9[5701]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:40 1b8519f654dc zms_m6[5706]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:40 1b8519f654dc zms_m9[5707]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:41 1b8519f654dc zms_m9[5711]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:42 1b8519f654dc zms_m6[5713]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:42 1b8519f654dc zms_m1[5712]: INF [zms_m1] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:42 1b8519f654dc zms_m2[5710]: INF [zms_m2] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:50 1b8519f654dc zms_m6[5747]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:33:50 1b8519f654dc zms_m9[5748]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:34:00 1b8519f654dc zms_m9[5757]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:34:10 1b8519f654dc zms_m9[5768]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:34:19 1b8519f654dc zms_m9[5797]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:34:30 1b8519f654dc zms_m9[5805]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:34:40 1b8519f654dc zms_m9[5817]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:34:50 1b8519f654dc zms_m9[5848]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:35:00 1b8519f654dc zms_m9[5856]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:35:10 1b8519f654dc zms_m9[5868]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:35:20 1b8519f654dc zms_m9[5897]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:35:30 1b8519f654dc zms_m9[5905]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:35:40 1b8519f654dc zms_m9[5918]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:35:50 1b8519f654dc zms_m9[5948]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:35:59 1b8519f654dc zms_m9[5956]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:36:09 1b8519f654dc zms_m9[5967]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:36:19 1b8519f654dc zms_m9[5996]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:36:29 1b8519f654dc zms_m6[6001]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:36:29 1b8519f654dc zms_m1[6000]: INF [zms_m1] [Authenticated user 'ha']
zoneminder    | Aug  8 09:36:29 1b8519f654dc zms_m2[5998]: INF [zms_m2] [Authenticated user 'ha']
zoneminder    | Aug  8 09:36:29 1b8519f654dc zms_m9[5999]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:36:29 1b8519f654dc zms_m9[6012]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:36:39 1b8519f654dc zms_m9[6025]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:36:49 1b8519f654dc zms_m9[6054]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:00 1b8519f654dc zms_m9[6061]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:09 1b8519f654dc zms_m9[6075]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:15 1b8519f654dc zms_m6[6077]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:25 1b8519f654dc zms_m6[6106]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:25 1b8519f654dc zms_m9[6107]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:35 1b8519f654dc zms_m6[6126]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:35 1b8519f654dc zms_m9[6127]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:45 1b8519f654dc zms_m6[6136]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:45 1b8519f654dc zms_m9[6137]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:55 1b8519f654dc zms_m9[6167]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:37:55 1b8519f654dc zms_m6[6168]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:38:05 1b8519f654dc zms_m6[6188]: INF [zms_m6] [Authenticated user 'ha']
zoneminder    | Aug  8 09:38:05 1b8519f654dc zms_m9[6187]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:38:15 1b8519f654dc zms_m9[6191]: INF [zms_m9] [Authenticated user 'ha']
zoneminder    | Aug  8 09:38:15 1b8519f654dc zms_m6[6192]: INF [zms_m6] [Authenticated user 'ha']
edit2: Also worth noting: I've been using this setup for over 2 years and only saw this issue for the first time about 4 days ago, and the ZM integration has always behaved like this. It's problems like this where I'm not really sure whether to ask on the ZM forum or the HA forum.
craigmate
Posts: 1
Joined: Tue Aug 10, 2021 5:41 am

Re: many hanging zmu processes are not exiting, process leak

Post by craigmate »

Hi,
This is my first post on the ZoneMinder forum, but I have been using the product for about 2 years.
I also use it with Home Assistant. When I upgraded to 1.36, I noticed a lot of errors in the HA logs about not connecting to the server (I run it on Ubuntu 18.04).
However, the cameras still appeared to work.
But after a while I noticed that ZoneMinder had crashed and the web page was displaying: Unable to connect to ZM db.SQLSTATE[08004] [1040] Too many connections

I removed the integration from Home Assistant, and so far it seems to be working.
It is strange, though: this only started once I went to 1.36. As I said, I had been using this for about 2 years without any issues...

Has anyone else posted about this on the Home Assistant forums?
sammael
Posts: 13
Joined: Fri Jul 31, 2020 11:26 am

Re: many hanging zmu processes are not exiting, process leak

Post by sammael »

There's this open issue on the zm-py GitHub (which is what the HA integration uses to query ZM, as I understand it), which broke in the way you describe: https://github.com/rohankapoorcom/zm-py/issues/48. Thing is, the last activity on the zm-py GitHub was 10 months ago...

Meanwhile, I've found that the workaround of killing the zmu processes works well. I added

Code: Select all

0 */3 * * * docker exec zoneminder pkill --signal kill "^zmu$" >/dev/null 2>&1
to my crontab (as I run in docker) and everything is peachy.
wallee
Posts: 2
Joined: Fri Oct 23, 2020 9:13 am

Re: many hanging zmu processes are not exiting, process leak

Post by wallee »

Having the same problem over here. Each zmu process uses one MySQL connection. I have a maximum of 150, and then the MySQL server stops accepting connections because the limit is reached.
As a workaround I increased the limit to 1000 and restart apache2 every night.
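To see how close the server is to that limit before it stops, the current and maximum connection counts can be read from MySQL. A rough sketch, assuming the mysql command-line client is installed and can log in without prompting (for example via ~/.my.cnf); the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: print MySQL connection usage as "used/max".
# Assumes the mysql client can authenticate non-interactively.

db_connection_usage() {
    # -N drops the column header, -B gives tab-separated batch output;
    # each query returns one "name<TAB>value" row, so take the 2nd field.
    used=$(mysql -N -B -e "SHOW STATUS LIKE 'Threads_connected'" | awk '{ print $2 }')
    max=$(mysql -N -B -e "SHOW VARIABLES LIKE 'max_connections'" | awk '{ print $2 }')
    echo "$used/$max"
}
```

Calling db_connection_usage from a periodic job and logging the result would show whether connections climb steadily along with the hung zmu processes.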
alabamatoy
Posts: 360
Joined: Sun Jun 05, 2016 2:53 pm

Re: many hanging zmu processes are not exiting, process leak

Post by alabamatoy »

I created a cron job to restart ZoneMinder (not apache2) every night, which seems to have helped. I am also gathering data on the RAM usage of the various processes to try to figure out what is going wrong. I know the developers have worked hard on trying to resolve these RAM leak issues. 1.36.12 is coming soon, perhaps today, so please upgrade and post back on whether it has resolved your problems.
dparring
Posts: 5
Joined: Tue Jan 08, 2019 9:10 pm

Re: many hanging zmu processes are not exiting, process leak

Post by dparring »

Adding some info about this issue here in case it helps anyone. The visible symptom is that the web server becomes unresponsive and then breaks Home Assistant, zmNinja, and anything using the web interface.

I'm fairly certain it is triggered by Home Assistant's ZoneMinder integration causing the number of active php-fpm processes to grow until it hits the maximum and cannot serve more requests. I tried it on an existing ZM installation and also on a brand new server with the same result. I'm running 8 Dahua cameras in mocord mode and hosting on a dedicated Fedora 35 virtual machine. ZM is installed using dnf packages and the settings are generally default other than adding the cameras and storage. Turning on the HA integration will reliably cause the issue within a few hours, if HA is off then ZM is stable.

On the server, pairs of hung php-fpm and zmu processes start to accumulate until they are killed. The pids are generally not immediately adjacent - it looks like the php-fpm process serves a number of successful requests before eventually blocking on a failed zmu request. zmu works fine most of the time, but in my case it hangs with a frequency of about 5 times per hour. The command line for the hanging zmu is always "/usr/bin/zmu -mX -s" where X is the monitor ID between 1-8. The -s flag returns the monitor state as an int so it's likely hanging in that function somewhere but I haven't been able to dig further.

If you enable and monitor the php-fpm status page you can see the number of active processes increases in a linear fashion. You can delay the inevitable crash by increasing pm.max_children in /etc/php-fpm.d/www.conf, but eventually it will hit the max and then stop responding. Running "sudo systemctl restart php-fpm httpd" will reliably kill the hanging processes and bring the count back to normal without reboot. I've added that command to a regular cron job until there is a better fix available.
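Instead of restarting on a fixed schedule, the restart could be tied to the active process count from the status page. This is only a sketch: the function names are made up, and how the status page is fetched (URL, pm.status_path) depends on the local setup.

```shell
#!/bin/sh
# Sketch: decide whether php-fpm needs a restart based on its status page.

# Reads the plain-text php-fpm status page on stdin and prints the value
# of the "active processes" line (not "max active processes").
active_php_fpm() {
    awk -F': *' '$1 == "active processes" { print $2 }'
}

# Succeeds when the active count ($1) has reached the chosen limit ($2).
needs_restart() {
    [ "$1" -ge "$2" ]
}

# Hypothetical usage, assuming the status page is served at /status and
# a limit of 40 has been picked somewhere below pm.max_children:
#   active=$(curl -s http://localhost/status | active_php_fpm)
#   needs_restart "$active" 40 && systemctl restart php-fpm httpd
```

This only postpones restarts until they are needed; the hung zmu processes behind the growth are still there.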

I can provide more info if the devs need anything but my guess is that Home Assistant's repeated querying is causing zmu to hang occasionally when checking monitor state, which eventually causes pm.max_children to block new process creation. Here is a sample ps aux output approximately 3 hours after the last reset if it helps:

Code: Select all

[root@vm-zoneminder ~]# ps aux|grep "zmu\|php-fpm"
root      907317  0.0  0.1 246896 28236 ?        Ss   12:01   0:00 php-fpm: master process (/etc/php-fpm.conf)
apache    907318  0.0  0.1 247688 24760 ?        S    12:01   0:11 php-fpm: pool www
apache    907319  0.0  0.1 247688 24844 ?        S    12:01   0:03 php-fpm: pool www
apache    907320  0.1  0.1 247700 25360 ?        S    12:01   0:14 php-fpm: pool www
apache    907321  0.0  0.1 247676 24528 ?        S    12:01   0:03 php-fpm: pool www
apache    907322  0.0  0.1 247676 24196 ?        S    12:01   0:04 php-fpm: pool www
apache    907349  0.0  0.1 247712 25200 ?        S    12:01   0:09 php-fpm: pool www
apache    907368  0.0  0.1 247680 24032 ?        S    12:01   0:01 php-fpm: pool www
apache    908269  0.0  0.1 247680 24076 ?        S    12:10   0:02 php-fpm: pool www
apache    910424  0.0  0.2 224512 35060 ?        Sl   12:28   0:00 /usr/bin/zmu -m5 -s
apache    912220  0.0  0.2 224512 35356 ?        Sl   12:45   0:00 /usr/bin/zmu -m5 -s
apache    912264  0.1  0.1 247680 24164 ?        S    12:45   0:11 php-fpm: pool www
apache    913080  0.0  0.2 224512 35136 ?        Sl   12:53   0:00 /usr/bin/zmu -m8 -s
apache    913094  0.0  0.1 247680 24140 ?        S    12:53   0:01 php-fpm: pool www
apache    913099  0.0  0.2 224512 35368 ?        Sl   12:53   0:00 /usr/bin/zmu -m3 -s
apache    913146  0.0  0.1 247680 24152 ?        S    12:53   0:03 php-fpm: pool www
apache    914098  0.0  0.1 247680 24076 ?        S    13:02   0:01 php-fpm: pool www
apache    915079  0.0  0.2 224512 35528 ?        Sl   13:10   0:00 /usr/bin/zmu -m3 -s
apache    915269  0.0  0.2 224512 35356 ?        Sl   13:12   0:00 /usr/bin/zmu -m7 -s
apache    915297  0.1  0.1 247680 24152 ?        S    13:12   0:09 php-fpm: pool www
apache    916059  0.0  0.1 247680 24152 ?        S    13:19   0:02 php-fpm: pool www
apache    916197  0.0  0.2 224512 35416 ?        Sl   13:20   0:00 /usr/bin/zmu -m3 -s
apache    916241  0.0  0.1 247680 24156 ?        S    13:21   0:06 php-fpm: pool www
apache    919632  0.0  0.2 224512 35500 ?        Sl   13:51   0:00 /usr/bin/zmu -m2 -s
apache    919894  0.1  0.1 247680 24172 ?        S    13:54   0:07 php-fpm: pool www
apache    920415  0.0  0.2 224512 35316 ?        Sl   13:58   0:00 /usr/bin/zmu -m4 -s
apache    921714  0.0  0.1 247680 24176 ?        S    14:10   0:05 php-fpm: pool www
apache    922612  0.0  0.2 224512 35656 ?        Sl   14:18   0:00 /usr/bin/zmu -m6 -s
apache    923794  0.1  0.1 247680 24176 ?        S    14:28   0:04 php-fpm: pool www
apache    926695  0.0  0.2 224512 35576 ?        Sl   14:55   0:00 /usr/bin/zmu -m7 -s
apache    926972  0.0  0.2 224512 35604 ?        Sl   14:57   0:00 /usr/bin/zmu -m7 -s
apache    927026  0.0  0.1 247680 24108 ?        S    14:58   0:00 php-fpm: pool www
apache    927989  0.0  0.2 224512 35780 ?        Sl   15:06   0:00 /usr/bin/zmu -m2 -s
apache    928053  0.1  0.1 247680 24172 ?        S    15:07   0:02 php-fpm: pool www
apache    928365  0.1  0.1 247680 24116 ?        S    15:10   0:01 php-fpm: pool www
apache    930517  0.0  0.2 224512 35772 ?        Sl   15:29   0:00 /usr/bin/zmu -m2 -s
apache    930636  0.0  0.2 224512 35668 ?        Sl   15:30   0:00 /usr/bin/zmu -m4 -s
apache    930675  0.1  0.1 247680 24180 ?        S    15:31   0:00 php-fpm: pool www
apache    931357  0.2  0.2 224512 36124 ?        Sl   15:37   0:00 /usr/bin/zmu -m4 -s
apache    931404  0.1  0.1 247680 23872 ?        S    15:37   0:00 php-fpm: pool www
edit: I bolded the workaround for anyone just looking for a quick answer.

Also, this is running on ZM version 1.36.12, the package is "zoneminder-common-1.36.12-1.fc35.x86_64.rpm"
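For anyone who wants to compare notes, a ps aux listing like the one above can be tallied per monitor to see whether particular cameras hang more often. A small sketch, assuming the standard ps aux column layout (command in column 11, first argument in column 12); the function name is made up.

```shell
#!/bin/sh
# Sketch: count running zmu processes per monitor flag (-m1, -m2, ...).

# Reads "ps aux" output on stdin and prints "<flag> <count>" per monitor.
hung_zmu_per_monitor() {
    awk '$11 ~ /(^|\/)zmu$/ { n[$12]++ } END { for (m in n) print m, n[m] }' | sort
}

# Usage: ps aux | hung_zmu_per_monitor
```

Note that this counts every zmu process present at that moment, including any that are about to finish normally.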