Web interface fails after 6 days

Posted: Fri Nov 05, 2021 2:04 am
by jmynheir
Fedora 34
5.14.11-200.fc34.x86_64 Kernel
Zoneminder 1.36.8
nginx 1.20.1-6
mariadb 3:10.5.12

Everything runs fine for about 6 days, then I cannot view live feeds or events, and nginx logs the following message in error.log:
2021/11/04 20:38:33 [error] 1122#1122: *188783 FastCGI sent in stderr: "PHP message: ERR [Socket /var/lib/zoneminder/sock/zms-913787s.sock does not exist. This file is created by zms, and since it does not exist, either zms did not run, or zms exited early. Please check your zms logs and ensure that CGI is enabled in apache and check that the PATH_ZMS is set correctly. Make sure that ZM is actually recording. If you are trying to view a live stream and the capture process (zmc) is not running then zms will exit. Please go to http://zoneminder.readthedocs.io/en/lat ... window-etc for more information.]" while reading response header from upstream, client: 192.168.0.1, server: =, request: "GET /zm/index.php?view=request&request=stream&connkey=913787&auth=6e20fb74e8870d05ed1ecbfb7a73661f&command=99 HTTP/2.0", upstream: "fastcgi://unix:/run/php-fpm/www.sock:", host: "[omitted hostname]:8081", referrer: "https://[omitted hostname]:8081/zm/?view=watch&mid=1"
When I look in /var/lib/zoneminder/sock I see many zms .lock files, but no zms .sock files:
# ls
zmdc.sock zms-076094.lock zms-119189.lock zms-197584.lock zms-306950.lock zms-372434.lock zms-438828.lock zms-531659.lock zms-587739.lock zms-684303.lock zms-755243.lock zms-875155.lock zms-981208.lock
zms-017022.lock zms-092636.lock zms-122658.lock zms-262433.lock zms-326439.lock zms-374866.lock zms-500065.lock zms-543023.lock zms-610047.lock zms-684504.lock zms-820759.lock zms-910467.lock zms-988758.lock
zms-031542.lock zms-100095.lock zms-150056.lock zms-265850.lock zms-331815.lock zms-386351.lock zms-500980.lock zms-563864.lock zms-622567.lock zms-684773.lock zms-826640.lock zms-921101.lock zms-994328.lock
zms-046398.lock zms-103567.lock zms-179325.lock zms-276136.lock zms-357394.lock zms-396067.lock zms-504387.lock zms-563947.lock zms-626818.lock zms-731063.lock zms-836795.lock zms-957271.lock
zms-064880.lock zms-118605.lock zms-189086.lock zms-292584.lock zms-369998.lock zms-402043.lock zms-519749.lock zms-575694.lock zms-683177.lock zms-741508.lock zms-846021.lock zms-981052.lock
This can be remedied by restarting the entire computer, after which it works again for about 6 days. Events are still captured while the web interface is failing, however, so zmc is still functioning.
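
The mismatch between lock files and sockets described above can be listed with a short script. This is only a sketch: it assumes zms names its files zms-<connkey>.lock and zms-<connkey>s.sock, which is inferred from the error message and the ls output in this post, and the function name orphaned_locks is made up for illustration.

```shell
#!/usr/bin/env bash
# Print every zms lock file that has no matching socket file.
# Assumption: lock files are zms-<connkey>.lock and sockets are
# zms-<connkey>s.sock (inferred from the log line and ls output above).
orphaned_locks() {
    local dir="${1:-/var/lib/zoneminder/sock}"
    local lock key
    for lock in "$dir"/zms-*.lock; do
        [ -e "$lock" ] || continue      # the glob matched nothing
        key="${lock##*/zms-}"           # strip directory and "zms-" prefix
        key="${key%.lock}"              # strip ".lock" suffix
        [ -e "$dir/zms-${key}s.sock" ] || printf '%s\n' "${lock##*/}"
    done
}
```

Running `orphaned_locks` with no argument checks /var/lib/zoneminder/sock; a different directory can be passed as the first argument. A lock file without a socket is not necessarily a fault on its own, since a lock may simply be left over from a viewer that has closed.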

Please advise.

Re: Web interface fails after 6 days

Posted: Fri Nov 05, 2021 7:42 am
by dougmccrary
You're likely running out of memory and swap.

In case you're actually running 1.36, try the latest version, 1.36.10

Re: Web interface fails after 6 days

Posted: Fri Nov 05, 2021 1:19 pm
by iconnor
I suspect the issue here is a lack of free fastcgi processes. We have seen this before with nginx. I'm not sure what is going on to make it need so many. We also have reports of zms processes staying around and consuming CPU instead of exiting when you stop viewing.
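
One way to check for lingering zms processes is to count them while nobody is viewing a stream. The sketch below assumes a Linux /proc filesystem and that the streaming process shows up under the name zms or nph-zms; both names are assumptions and may differ on a given install. The function name count_procs is made up for illustration.

```shell
#!/usr/bin/env bash
# Count running processes whose command name equals any of the given names.
# Reads /proc/<pid>/comm directly, so it only works on Linux.
count_procs() {
    local n=0 f comm name
    for f in /proc/[0-9]*/comm; do
        comm=$(cat "$f" 2>/dev/null) || continue   # process exited meanwhile
        for name in "$@"; do
            if [ "$comm" = "$name" ]; then
                n=$((n + 1))
                break
            fi
        done
    done
    printf '%s\n' "$n"
}
```

For example, `count_procs zms nph-zms` prints how many such processes exist right now. If the number stays high after every viewer window has been closed, that would be consistent with zms not exiting.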

Re: Web interface fails after 6 days

Posted: Fri Nov 05, 2021 3:59 pm
by jmynheir
Sorry, I mistyped 1.34.8 instead of 1.36.8 and have edited the original post to correct it. I upgraded to 1.36.10 last night and will report back after a week or so.

Re: Web interface fails after 6 days

Posted: Thu Nov 11, 2021 5:02 pm
by jmynheir
So after about 4 days the camera streams failed again. I have narrowed it down to a problem with fcgiwrap. I am able to restore functionality without a full restart by running:

Code:

sudo systemctl restart fcgiwrap@nginx
I have 6 cameras, and DAEMON_PROCS in my /etc/sysconfig/fcgiwrap is set to 14:

Code:

# fcgiwrap configuration parameters

# Specify the number of fcgiwrap processes to prefork
DAEMON_PROCS=14

# Specify additional daemon options. See man fcgiwrap.
DAEMON_OPTS=-f
I have run up to 36 processes, but this did not reduce the failures.
RAM usage during a failure is only 3 GB out of 8 GB available (the same as when running properly).
Swap usage is virtually nonexistent, although 8 GB is allocated.
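
To see whether the configured fcgiwrap workers are all tied up, the DAEMON_PROCS value can be compared with the number of zms processes. This is a rough sketch under the assumption that each streaming zms occupies one fcgiwrap worker for as long as it runs; the function name fcgiwrap_saturated is made up for illustration.

```shell
#!/usr/bin/env bash
# Compare a count of busy CGI processes with DAEMON_PROCS from the
# fcgiwrap sysconfig file.
# Assumption: one long-running zms holds one fcgiwrap worker.
fcgiwrap_saturated() {
    local busy="$1" conf="${2:-/etc/sysconfig/fcgiwrap}" procs
    # Take the last uncommented DAEMON_PROCS=<number> line.
    procs=$(sed -n 's/^DAEMON_PROCS=\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
    if [ -z "$procs" ]; then
        echo "DAEMON_PROCS not found in $conf" >&2
        return 2
    fi
    if [ "$busy" -ge "$procs" ]; then
        echo "saturated: $busy zms processes for $procs fcgiwrap workers"
    else
        echo "ok: $busy zms processes for $procs fcgiwrap workers"
    fi
}
```

A possible invocation is `fcgiwrap_saturated "$(pgrep -c zms)"`, where the process name pattern zms is an assumption. A "saturated" result during a failure would point at workers being held by old zms processes rather than at memory.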

Re: Web interface fails after 6 days

Posted: Sun Nov 14, 2021 7:56 am
by moholstein
Try this parameter:
http://nginx.org/en/docs/http/ngx_http_ ... cache_lock

The only time I've messed with nginx was to get a multi-server system working correctly from a single outside address (a different discussion), but 6 is a suspicious number because of logrotate.conf, which expects a HUP of the pid to act right; if that fails, you end up with a thread trying to write to a place it can't. My guess based on your numbers is below, though.

Are your cams rebooting daily, like most HiSilicon-based ones (Dahua/Hikvision/anything that has the word "Maintain" on the left)? If so, turn that off. The easiest fix here would be to just restart fastcgi at 1am with a cron job entry; the correct way would be to fix fastcgi. I'd use the nginx module for it regardless of what ZoneMinder does, but I'd guess the reason not to is privilege separation in the code. The only thing this would break is stuff you'd have already set in the UI anyhow.
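
The 1am restart workaround could look roughly like the sketch below, which prints a line in /etc/cron.d format. The unit name fcgiwrap@nginx comes from earlier in this thread; the function name cron_entry and the file name /etc/cron.d/restart-fcgiwrap are made up for illustration, and the systemctl path is assumed to be /usr/bin/systemctl.

```shell
#!/usr/bin/env bash
# Print a cron.d-style entry that restarts a systemd unit once a day.
# Fields: minute hour day-of-month month day-of-week user command
# Assumption: systemctl lives at /usr/bin/systemctl.
cron_entry() {
    local hour="${1:-1}" unit="${2:-fcgiwrap@nginx}"
    printf '0 %s * * * root /usr/bin/systemctl restart %s\n' "$hour" "$unit"
}
```

Something like `cron_entry | sudo tee /etc/cron.d/restart-fcgiwrap` would install it. This only hides the symptom; any streams being viewed at that moment will drop when fcgiwrap restarts.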