Web interface fails after 6 days

Discussions related to the 1.36.x series of ZoneMinder
jmynheir
Posts: 14
Joined: Fri Oct 05, 2018 6:24 am

Post by jmynheir »

Fedora 34
5.14.11-200.fc34.x86_64 Kernel
Zoneminder 1.36.8
nginx 1.20.1-6
mariadb 3:10.5.12

Everything runs fine for about 6 days, then I can no longer view live feeds or events, and the following message appears in the nginx error.log:
2021/11/04 20:38:33 [error] 1122#1122: *188783 FastCGI sent in stderr: "PHP message: ERR [Socket /var/lib/zoneminder/sock/zms-913787s.sock does not exist. This file is created by zms, and since it does not exist, either zms did not run, or zms exited early. Please check your zms logs and ensure that CGI is enabled in apache and check that the PATH_ZMS is set correctly. Make sure that ZM is actually recording. If you are trying to view a live stream and the capture process (zmc) is not running then zms will exit. Please go to http://zoneminder.readthedocs.io/en/lat ... window-etc for more information.]" while reading response header from upstream, client: 192.168.0.1, server: =, request: "GET /zm/index.php?view=request&request=stream&connkey=913787&auth=6e20fb74e8870d05ed1ecbfb7a73661f&command=99 HTTP/2.0", upstream: "fastcgi://unix:/run/php-fpm/www.sock:", host: "[omitted hostname]:8081", referrer: "https://[omitted hostname]:8081/zm/?view=watch&mid=1"
When I look in /var/lib/zoneminder/sock, I see many zms .lock files but no zms .sock files (only zmdc.sock):
# ls
zmdc.sock zms-076094.lock zms-119189.lock zms-197584.lock zms-306950.lock zms-372434.lock zms-438828.lock zms-531659.lock zms-587739.lock zms-684303.lock zms-755243.lock zms-875155.lock zms-981208.lock
zms-017022.lock zms-092636.lock zms-122658.lock zms-262433.lock zms-326439.lock zms-374866.lock zms-500065.lock zms-543023.lock zms-610047.lock zms-684504.lock zms-820759.lock zms-910467.lock zms-988758.lock
zms-031542.lock zms-100095.lock zms-150056.lock zms-265850.lock zms-331815.lock zms-386351.lock zms-500980.lock zms-563864.lock zms-622567.lock zms-684773.lock zms-826640.lock zms-921101.lock zms-994328.lock
zms-046398.lock zms-103567.lock zms-179325.lock zms-276136.lock zms-357394.lock zms-396067.lock zms-504387.lock zms-563947.lock zms-626818.lock zms-731063.lock zms-836795.lock zms-957271.lock
zms-064880.lock zms-118605.lock zms-189086.lock zms-292584.lock zms-369998.lock zms-402043.lock zms-519749.lock zms-575694.lock zms-683177.lock zms-741508.lock zms-846021.lock zms-981052.lock
This can be remedied by restarting the entire computer, after which it works again for about 6 days. Events are still captured during the failure, however, so zmc is still functioning.
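
To watch the lock/socket imbalance described above build up in the days before a failure, a small shell helper can count both kinds of file. This is only a sketch: the function name `count_zms_files` is made up for illustration, and the `zms-*.lock` / `zms-*.sock` patterns are assumptions based on the names in the listing and the error message.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZoneMinder): count zms lock and socket
# files directly inside a socket directory.
count_zms_files() {
    dir=$1
    # No -type filter: .sock entries are Unix sockets, not regular files.
    locks=$(find "$dir" -maxdepth 1 -name 'zms-*.lock' | wc -l | tr -d ' ')
    socks=$(find "$dir" -maxdepth 1 -name 'zms-*.sock' | wc -l | tr -d ' ')
    echo "locks=$locks socks=$socks"
}
```

Running `count_zms_files /var/lib/zoneminder/sock` periodically (as a user that can read the directory) would show whether lock files pile up steadily or only appear around the time streaming stops.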

Please advise.
Last edited by jmynheir on Fri Nov 05, 2021 3:59 pm, edited 1 time in total.
dougmccrary
Posts: 1314
Joined: Sat Aug 31, 2019 7:35 am
Location: San Diego

Re: Web interface fails after 6 days

Post by dougmccrary »

You're likely running out of memory and swap.

In case you're actually running 1.36, try the latest version, 1.36.10
iconnor
Posts: 3126
Joined: Fri Oct 29, 2010 1:43 am
Location: Toronto

Re: Web interface fails after 6 days

Post by iconnor »

I suspect the issue here is a lack of fastcgi processes. We have seen this before with nginx. I'm not sure what makes it need so many. We also have reports of zms processes staying around and consuming CPU instead of exiting when you stop viewing.
jmynheir
Posts: 14
Joined: Fri Oct 05, 2018 6:24 am

Re: Web interface fails after 6 days

Post by jmynheir »

Sorry, I mistyped 1.34.8 instead of 1.36.8 and have edited the original post to correct it. I upgraded to 1.36.10 last night and will report back after a week or so.
jmynheir
Posts: 14
Joined: Fri Oct 05, 2018 6:24 am

Re: Web interface fails after 6 days

Post by jmynheir »

So after about 4 days the camera streams failed again. I have narrowed it down to a problem with fcgiwrap. I am able to restore functionality without a full restart by running:

sudo systemctl restart fcgiwrap@nginx
I have 6 cameras, and DAEMON_PROCS in my /etc/sysconfig/fcgiwrap is set to 14:

# fcgiwrap configuration parameters

# Specify the number of fcgiwrap processes to prefork
DAEMON_PROCS=14

# Specify additional daemon options. See man fcgiwrap.
DAEMON_OPTS=-f
I have run up to 36 forks, but this has no effect on reducing failures.
RAM usage during failure is only 3 GB out of 8 GB available (the same as when running properly).
Swap usage is virtually nonexistent, although 8 GB is allocated.
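
One way to test the "lack of fastcgi processes" theory is to compare the configured pool size with the number of zms processes alive at the moment of failure. The sketch below assumes that each fcgiwrap worker serves one request at a time and that the streaming processes show up with "zms" in their process name; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the fcgiwrap pool size with the number
# of zms processes that may each be holding a worker.

# Print the DAEMON_PROCS value from an fcgiwrap sysconfig-style file.
configured_workers() {
    sed -n 's/^DAEMON_PROCS=\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Count running processes whose name contains "zms" (assumed process name).
# pgrep exits non-zero when nothing matches, hence the "|| true".
running_zms() {
    pgrep -c zms || true
}

# Print "saturated" when busy workers meet or exceed the pool size ($1).
pool_status() {
    if [ "$2" -ge "$1" ]; then
        echo "saturated"
    else
        echo "ok"
    fi
}
```

For example, `pool_status "$(configured_workers /etc/sysconfig/fcgiwrap)" "$(running_zms)"` printing `saturated` while nobody is watching a stream would be consistent with zms processes lingering after viewers disconnect.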
moholstein
Posts: 5
Joined: Fri Nov 12, 2021 1:57 pm

Re: Web interface fails after 6 days

Post by moholstein »

Try this parameter:
http://nginx.org/en/docs/http/ngx_http_ ... cache_lock

The only time I've messed with nginx was to get a multi-server system working correctly from a single outside address (a different discussion), but 6 days is a suspicious number because of logrotate.conf, which expects a HUP of the pid to behave correctly; if that fails, you end up with a thread trying to write to a place it can't. My guess based on your numbers is below.

Are your cams rebooting daily, like most HiSilicon-based ones (Dahua/Hikvision/anything that has the word "Maintain" on the left)? If so, turn that off. The easiest fix here would be to just restart fastcgi at 1am with a cron job entry; the correct way would be to fix fastcgi. I'd use the nginx module for it regardless of what ZoneMinder does, but I'd guess the reason not to is privilege separation in the code. The only thing this would break is stuff you'd have already set in the UI anyhow.
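
The cron workaround could be sketched roughly as follows. The script path and the `DRY_RUN` switch are assumptions added for illustration; the unit name `fcgiwrap@nginx` is the one restarted manually earlier in the thread.

```shell
#!/bin/sh
# Hypothetical script, e.g. saved as /usr/local/sbin/restart-fcgiwrap.sh
# (the path is an assumption). An /etc/cron.d entry to run it at 1am as
# root could look like:
#   0 1 * * * root /usr/local/sbin/restart-fcgiwrap.sh

# Restart the fcgiwrap unit. With DRY_RUN=1 the command is only printed,
# which helps when checking the cron wiring without touching the service.
restart_fcgiwrap() {
    unit=${1:-fcgiwrap@nginx}
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "systemctl restart $unit"
    else
        systemctl restart "$unit"
    fi
}

# restart_fcgiwrap   # uncomment when installing this as the cron script
```

This only masks the symptom on a schedule; any stream being viewed at 1am would be cut off when the workers restart.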