Getting periodic bogus events
Posted: Sat Jan 23, 2010 4:01 pm
by BlankMan
openSuSE 11.2
Note the date: this one showed up yesterday, 01/22, yet the date shown is always 12/31 at 18:00. I can't say for sure whether the Cause is always Signal; I just noticed that.
Code:
Id   Name       Monitor    Cause   Time(^)         Duration  Frames  Alarm Frames  Total Score  Avg. Score  Max. Score
293  Event-293  Monitor-1  Signal  12/31 18:00:00  0.00      24      1             100          100         100
In the Monitor display it shows up like this:
Code:
Id   Name       Time                 Secs  Frames  Score
293  Event-293  1969-12-31 18:00:00  0.00  24/1    100/100
When I click on the event no video is displayed, and when I close it nph-zms keeps running at 100% CPU until I kill -9 it, which is the only way it will die. (I have a 4-core Phenom, so that's 100% of one core as displayed by top.) This is happening quite often.
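For reference, 1969-12-31 18:00:00 is what a Unix timestamp of 0 looks like when it is rendered in a UTC-6 time zone, so these events probably carry a zero (unset) start time rather than a random one. A minimal Python sketch of that conversion; the UTC-6 offset is an assumption (the time zone of the server isn't stated here) and `render_timestamp` is just an illustrative name:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the server's local zone is UTC-6 (e.g. US Central Standard Time).
ASSUMED_ZONE = timezone(timedelta(hours=-6))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def render_timestamp(seconds, zone=ASSUMED_ZONE):
    """Format a Unix timestamp in the same style as the event list."""
    moment = EPOCH + timedelta(seconds=seconds)
    return moment.astimezone(zone).strftime("%Y-%m-%d %H:%M:%S")


print(render_timestamp(0))  # 1969-12-31 18:00:00
```

If that is what is happening, the thing to chase is why the start time is never filled in, not why the date is "wrong".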
Posted: Sat Jan 23, 2010 8:41 pm
by cordel
Bogus dates are usually a sign of a badly configured MySQL server and/or an underpowered machine.
Check top and see where your loads are at. If the machine is not showing signs of being overloaded, go to dev.mysql.com and work out your configuration.
Posted: Sun Jan 24, 2010 2:38 am
by BlankMan
A quad core Phenom 9850 with 8G of ram and a real raid controller with battery backed up write back cache underpowered? With ~75% idle time and no iowait time (munin statistics), I think not. I know I didn't state that but I consider the obvious before seeking help. And a poorly configured mySQL, I sure hope not, all my Oracle training and certification would then have been a waste. Again that was not stated, but again, I usually check the obvious. No, something else is going on and I have one theory that surfaced after I posted this but haven't tested it yet, I thought I'd see if others have encountered this and possibly knew why.
Posted: Sun Jan 24, 2010 3:12 am
by jfkastner
running 8GB RAM makes me wonder if you use a VM or 64bit mode - both have their own problems, so check the documentation for SQL problems ... not saying it's you messing it up, but i've had to troubleshoot a sun virtualbox and that was a nightmare!
Posted: Sun Jan 24, 2010 3:33 am
by cordel
BlankMan wrote:A quad core Phenom 9850 with 8G of ram and a real raid controller with battery backed up write back cache underpowered? With ~75% idle time and no iowait time (munin statistics), I think not. I know I didn't state that but I consider the obvious before seeking help. And a poorly configured mySQL, I sure hope not, all my Oracle training and certification would then have been a waste. Again that was not stated, but again, I usually check the obvious. No, something else is going on and I have one theory that surfaced after I posted this but haven't tested it yet, I thought I'd see if others have encountered this and possibly knew why.
Ah, now some of the blanks are filled. I'll leave you to it then.
Posted: Sun Jan 24, 2010 4:05 am
by BlankMan
I was using VMware, with my web server and MTA in one VM, the file server in another, plus a couple of Windoze VMs, but I was getting really poor IO performance. I moved the file server to a DNS-321 and moved the web server and MTA to the main box, but still got really poor IO performance: iowaits that would pause the server for 10, 30 seconds or more. It turned out one of the disks had a Raw Read Error value of "1" and the raid controller wasn't reporting it. But I'm glad the complexity of VMware is gone; I really don't need it.
At this point I don't think the issue is hardware related. All my Weather data, over 5 years' worth, is in a mySQL DB and it flies. I think these have something to do with starting and/or stopping ZM, but I have to see if I can reproduce it.
Posted: Tue Jan 26, 2010 6:54 am
by BlankMan
Ok, my suspicions were correct: I can reproduce this at will. It occurs if the mySQL password is wrong. I cleared out all events, shut down ZM, put a bad password in the conf file, then started it up; of course the startup failed. I put the correct password back in the conf file, started it up, and sure enough I had these 7 bogus events:
Code:
Id    Name        Monitor    Cause   Time(^)         Duration     Frames  Alarm Frames  Total Score  Avg. Score  Max. Score
1362  Event-1362  Monitor-1  Motion  12/31 18:00:00  18946646.36  61      31            2032         65          187
1363  Event-1363  Monitor-1  Motion  12/31 18:00:00  18946646.26  62      32            2054         64          187
1365  Event-1365  Monitor-1  Signal  12/31 18:00:00  0.00         16      1             100          100         100
1364  Event-1364  Monitor-2  Signal  12/31 18:00:00  0.00         11      1             100          100         100
1366  Event-1366  Monitor-2  Signal  12/31 18:00:00  18946706.70  45      18            4203         233         303
1367  Event-1367  Monitor-2  Motion  12/31 18:00:00  18946792.13  64      44            5902         134         297
1368  Event-1368  Monitor-2  Motion  12/31 18:00:00  18946791.93  66      46            6527         141         291
I then cleared those events, i.e. all events again, put a bad password in the conf file while it was running, and then attempted to shut it down. I immediately got 4 more bogus events:
Code:
Id    Name        Monitor    Cause   Time(^)         Duration     Frames  Alarm Frames  Total Score  Avg. Score  Max. Score
1369  Event-1369  Monitor-1  Motion  12/31 18:00:00  18947246.00  54      24            1338         55          184
1370  Event-1370  Monitor-1  Motion  12/31 18:00:00  18947246.31  54      24            1340         55          184
1371  Event-1371  Monitor-1  Motion  12/31 18:00:00  18947846.11  59      29            1673         57          185
1372  Event-1372  Monitor-1  Motion  12/31 18:00:00  18947845.94  52      22            1097         49          185
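A quick way to spot these in an exported listing: the rows above all share the 12/31 18:00:00 stamp, and durations like 18947246.00 seconds work out to roughly 219 days, which cannot be right for an event of about 50 frames. A minimal Python sketch, assuming rows in the whitespace-separated layout shown above; `parse_row`, `looks_bogus`, and the 60-seconds-per-frame threshold are made up for illustration and are not part of ZM:

```python
# Two rows copied from the listing above; columns are whitespace-separated.
ROWS = """\
1369 Event-1369 Monitor-1 Motion 12/31 18:00:00 18947246.00 54 24 1338 55 184
1370 Event-1370 Monitor-1 Motion 12/31 18:00:00 18947246.31 54 24 1340 55 184
"""


def parse_row(line):
    """Split one listing row into the fields the check needs."""
    fields = line.split()
    return {
        "id": int(fields[0]),
        "cause": fields[3],
        "date": fields[4],
        "time": fields[5],
        "duration": float(fields[6]),
        "frames": int(fields[7]),
    }


def looks_bogus(event, max_seconds_per_frame=60.0):
    """Hypothetical heuristic: epoch-style stamp or an impossible duration."""
    epoch_stamp = event["date"] == "12/31" and event["time"] == "18:00:00"
    implausible = event["duration"] > event["frames"] * max_seconds_per_frame
    return epoch_stamp or implausible


events = [parse_row(line) for line in ROWS.splitlines()]
print([e["id"] for e in events if looks_bogus(e)])  # [1369, 1370]
```

The stamp check assumes the listing is rendered in the same time zone as above; a real event that genuinely started at 18:00:00 on 12/31 would be flagged too.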
What I basically emulated here is what happened to me when I was building ZM from source and doing many builds in the process of getting everything working properly. If you make ZM, then want to do a make install, and you're sitting in the source directory where the build has created the default zm.conf file when you try to stop ZM, it uses that default zm.conf file and not the real one in /usr/local/etc/. Ergo it tries to use the wrong mySQL password and, voilà, you get the bogus events.
So you always have to remember to cd out of the source directory, stop ZM, cd back to the source directory, do the make install, change the password in your /usr/local/etc/zm.conf because the make install overwrites it, cd out of the source directory, and then start ZM.
The problem I have with this is:
1. the make install should *not* overwrite the /usr/local/etc/zm.conf
2. ZM should not use a zm.conf file just because it finds one in your pwd
3. the default zm.conf should be created in a subdirectory of the source directory, which would mostly eliminate #2 above, though it could still happen
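To make points 1 and 2 concrete, here is a rough Python sketch of the behaviour I'd expect. It is not how the ZM build or ZM itself works; `install_config` and `resolve_config` are hypothetical names, and writing the fresh default next to the existing file with a `.new` suffix is just one possible convention:

```python
import shutil
from pathlib import Path


def install_config(built: Path, installed: Path) -> Path:
    """Copy the freshly built config into place without clobbering an edited one.

    If a config already exists at the destination, the new default is written
    alongside it with a ".new" suffix and the existing file is left untouched.
    """
    installed.parent.mkdir(parents=True, exist_ok=True)
    if installed.exists():
        target = installed.with_name(installed.name + ".new")
    else:
        target = installed
    shutil.copyfile(built, target)
    return target


def resolve_config(installed: Path) -> Path:
    """Accept only an absolute path, so the current directory never matters."""
    if not installed.is_absolute():
        raise ValueError(f"config path must be absolute, got {installed!s}")
    return installed
```

With something like this, a rebuild would leave the password in /usr/local/etc/zm.conf alone, and stopping ZM from inside the source directory could not pick up the default file sitting there.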
cordel wrote:Bogus dates are usually a sign of a badly configured MySQL server and/or an underpowered machine.
Check top and see where your loads are at. If the machine is not showing signs of being overloaded, go to dev.mysql.com and work out your configuration.
So much for an under powered machine or a bad mySQL setup. Like I said, I usually rule out the obvious before asking for help ergo crying wolf.
Posted: Tue Jan 26, 2010 1:47 pm
by jfkastner
i still have bogus events (1-10 per day) but i have NEVER had the source - i got a deb package from a gentleman at ftp.northern-ridge.com.au which was dated sep 2009 (no idea which SVN that was), and did some customizing - on ubu 904 32bit desktop with zm 1242 on an old P4
the number of events and duration seems very random, but usually less than 30 frames
BUT i have an older slower disk, and noticed that those events coincide with zmaudit running (and cleaning out old events -> lots of IO) ... maybe some bus/timing issue .... microseconds can be very hard to observe!
Posted: Tue Jan 26, 2010 2:00 pm
by BlankMan
Ahh, ok, so others have seen this too, and it possibly can be hardware related. Are yours always 1969-12-31 18:00:00 too? Looks like I may have discovered just one cause.
I had to reload my Weather DB, 5+ years' worth of data, a few weeks ago because of corruption from when I was having disk problems (a bad disk behind the raid controller, thus machine lock-ups). I was pounding the machine doing so: the process was cpu bound with no iowait, with extremely high transaction rates and lots of data munging/formatting/selects/inserts, so I knew I wasn't having slow machine/mySQL problems. Something else was going on.
Posted: Tue Jan 26, 2010 2:41 pm
by jfkastner
yes, good'ol' 1969 ... woodstock ... hendrix ...
Posted: Wed Jan 27, 2010 9:37 am
by cordel
BlankMan wrote:
cordel wrote:Bogus dates are usually a sign of a badly configured MySQL server and/or an underpowered machine.
Check top and see where your loads are at. If the machine is not showing signs of being overloaded, go to dev.mysql.com and work out your configuration.
So much for an under powered machine or a bad mySQL setup. Like I said, I usually rule out the obvious before asking for help ergo crying wolf.
You would be one of the few then, but that is still not an excuse to leave out information that would answer the obvious questions. System information should have been included.
Glad you got it sorted.
As for the configuration file, it will be looked into.
Posted: Sat Jan 30, 2010 11:16 pm
by jfkastner
BTW i just observed such an event:
it starts with zmaudit running and deleting events (which are in my case 900 frames each)
top on the terminal tells me 'rm' is running wild, idle is zero % and WA (=iowait) is over 50%
at some point mysqld kicks in to clean up the database, and ZMA complains about buffer overruns
-> a few lost frames, sql messed up -> timeline goes back to 1969
guess there's just too much IO happening at the same time, maybe sql times out while writing the database ...
don't think it's an IDE bug (like M$ mediacenter had) or some other hardware/mobo bug, since i have NO data corruption on both my EXT3 disks and the system runs for 3 weeks now w/o nasty complaints in the logs or crashes
maybe someone with some mysql knowledge could come up with an idea?
MAYBE SOLVED?!
Posted: Tue Feb 09, 2010 8:19 pm
by jfkastner
another probable cause:
my clock skew (difference between software/hardware/"realtime") was about 30 sec per day - might be a very slow disk that's getting hammered by ZM, the MB is OK and runs for weeks w/o problems (w/o ZM or different OS)
anyways a few days ago i finally set up a cronjob that syncs with an NTP server every day -> the bogus 1969 events went away completely!
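to put that skew in perspective, a quick back-of-the-envelope in Python (only the 30 sec per day figure is from my box, the function names are made up for this sketch):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_ppm(skew_seconds_per_day):
    """Express a daily clock skew as parts per million."""
    return skew_seconds_per_day / SECONDS_PER_DAY * 1_000_000


def accumulated_skew(skew_seconds_per_day, days):
    """Total skew in seconds if the clock is never corrected."""
    return skew_seconds_per_day * days


print(round(drift_ppm(30)))     # 347
print(accumulated_skew(30, 7))  # 210
```

so without the daily sync the clock would be minutes off within a week, and with it the error stays under about 30 sec between syncs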
maybe sql and ZM use the time info in a different way so it creates an overflow/negative value (eg 2^32 - 1) and that screws up the recorded timestamp?
EDIT
there's gotta be more to that, i still get those events, but only 1-2 per day instead of 5-10
however in sql in the field 'note' it always says 'signal lost' or 'reacquired', which is kind'a weird because i have ONLY ip cams, and yes, zma/zmc restart sometimes anyways
Posted: Wed Feb 10, 2010 7:27 pm
by deldued
Anyone got any more ideas?
I see the same issue on my quad core, 4GB 64bit ubuntu box which is only really used for ZM, file server and mail.