Getting periodic bogus events

Posted: Sat Jan 23, 2010 4:01 pm
by BlankMan
openSuSE 11.2

Note the date: this one showed up yesterday, 01/22, but the recorded date is always 12/31 at 18:00. I can't say for sure whether the Cause is always Signal; I just noticed that.

Code: Select all

Id   Name       Monitor    Cause   Time(^)         Duration  Frames  Alarm Frames  Total Score  Avg. Score  Max. Score 	
293  Event-293  Monitor-1  Signal  12/31 18:00:00  0.00      24      1             100          100           100
In the Monitor display it shows up like this:

Code: Select all

Id   Name       Time                 Secs   Frames  Score  	 
293  Event-293  1969-12-31 18:00:00  0.00   24/1    100/100
When I click on the event no video is displayed, and when I close it nph-zms keeps running, taking up 100% CPU until I kill -9 it (the only way it will die). I have a 4-core Phenom, so that's 100% of one core as displayed by top. This is happening quite often.
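For what it's worth, 1969-12-31 18:00:00 is exactly what a Unix timestamp of 0 looks like when displayed in a UTC-6 timezone, so these events most likely have a start time that was stored as zero or never set. The UTC-6 offset is an assumption about this machine's timezone, and `render_epoch` is just an illustrative helper, not anything from ZM. A minimal sketch:

```python
from datetime import datetime, timedelta, timezone

def render_epoch(seconds, utc_offset_hours):
    """Format a Unix timestamp as wall-clock time for a fixed UTC offset."""
    # Build the instant from the epoch explicitly so it works on any platform.
    instant = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=seconds)
    local = instant.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.strftime("%Y-%m-%d %H:%M:%S")

# Timestamp 0 shown at UTC-6 (assumed offset for this box).
print(render_epoch(0, -6))  # prints 1969-12-31 18:00:00
```

If the offset guess is right, the bogus date is a zero timestamp and not a random value.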

Posted: Sat Jan 23, 2010 8:41 pm
by cordel
Bogus dates are usually a sign of a badly configured MySQL server and/or an underpowered machine.
Check top and see where your loads are at. If the machine is not showing signs of being overloaded, go to dev.mysql.com and work out your configuration.

Posted: Sun Jan 24, 2010 2:38 am
by BlankMan
A quad core Phenom 9850 with 8G of ram and a real raid controller with battery backed up write back cache underpowered? With ~75% idle time and no iowait time (munin statistics), I think not. I know I didn't state that but I consider the obvious before seeking help. And a poorly configured mySQL, I sure hope not, all my Oracle training and certification would then have been a waste. Again that was not stated, but again, I usually check the obvious. No, something else is going on and I have one theory that surfaced after I posted this but haven't tested it yet, I thought I'd see if others have encountered this and possibly knew why.

Posted: Sun Jan 24, 2010 3:12 am
by jfkastner
running 8GB RAM makes me wonder if you use a VM or 64bit mode - both have their own problems, so check the documentation for SQL problems ... not saying it's you messing it up, but i've had to troubleshoot a sun virtualbox and that was a nightmare!

Posted: Sun Jan 24, 2010 3:33 am
by cordel
BlankMan wrote:A quad core Phenom 9850 with 8G of ram and a real raid controller with battery backed up write back cache underpowered? With ~75% idle time and no iowait time (munin statistics), I think not. I know I didn't state that but I consider the obvious before seeking help. And a poorly configured mySQL, I sure hope not, all my Oracle training and certification would then have been a waste. Again that was not stated, but again, I usually check the obvious. No, something else is going on and I have one theory that surfaced after I posted this but haven't tested it yet, I thought I'd see if others have encountered this and possibly knew why.
Ah, now some of the blanks are filled. I'll leave you to it then.

Posted: Sun Jan 24, 2010 4:05 am
by BlankMan
I was using VMware, with my web server and MTA in one VM, the file server in another, plus a couple of Windoze VMs, but I was getting really poor IO performance. I moved the file server to a DNS-321 and moved the web server and MTA to the main box and still got really poor IO performance: iowaits that would pause the server for 10, 30 seconds or more. It turned out one of the disks had a Raw Read Error value of "1" and the RAID controller wasn't reporting it. But I'm glad the complexity of VMware is gone; I really don't need it.

At this point I don't think the issue is hardware related. All my weather data, over 5 years' worth, is in a mySQL DB and it flies. I think these events have something to do with starting and/or stopping ZM, but I have to see if I can reproduce it.

Posted: Tue Jan 26, 2010 6:54 am
by BlankMan
Ok, my suspicions were correct: I can reproduce this at will. It occurs if the mySQL password is wrong. I cleared out all events, shut down ZM, put a bad password in the conf file, then started it up; of course the startup failed. I put the correct password back in the conf file, started it up, and sure enough I had these 7 bogus events:

Code: Select all

Id  	Name  	Monitor  	Cause  	Time(^)  	Duration  	Frames  	Alarm Frames 	Total Score 	Avg. Score 	Max. Score 	
1362 	Event-1362 	Monitor-1 	Motion 	12/31 18:00:00 	18946646.36 	61 	31 	2032 	65 	187 	
1363 	Event-1363 	Monitor-1 	Motion 	12/31 18:00:00 	18946646.26 	62 	32 	2054 	64 	187 	
1365 	Event-1365 	Monitor-1 	Signal 	12/31 18:00:00 	0.00 	16 	1 	100 	100 	100 	
1364 	Event-1364 	Monitor-2 	Signal 	12/31 18:00:00 	0.00 	11 	1 	100 	100 	100 	
1366 	Event-1366 	Monitor-2 	Signal 	12/31 18:00:00 	18946706.70 	45 	18 	4203 	233 	303 	
1367 	Event-1367 	Monitor-2 	Motion 	12/31 18:00:00 	18946792.13 	64 	44 	5902 	134 	297 	
1368 	Event-1368 	Monitor-2 	Motion 	12/31 18:00:00 	18946791.93 	66 	46 	6527 	141 	291 	
I then cleared those events (i.e. all events again), put a bad password in the conf file while ZM was running, and then attempted to shut it down. I immediately got 4 more bogus events:

Code: Select all

Id  	Name  	Monitor  	Cause  	Time(^)  	Duration  	Frames  	Alarm Frames 	Total Score 	Avg. Score 	Max. Score 	
1369  	Event-1369  	Monitor-1  	Motion  	12/31 18:00:00  	18947246.00  	54  	24  	1338  	55  	184  	
1370 	Event-1370 	Monitor-1 	Motion 	12/31 18:00:00 	18947246.31 	54 	24 	1340 	55 	184 	
1371 	Event-1371 	Monitor-1 	Motion 	12/31 18:00:00 	18947846.11 	59 	29 	1673 	57 	185 	
1372 	Event-1372 	Monitor-1 	Motion 	12/31 18:00:00 	18947845.94 	52 	22 	1097 	49 	185
What I basically emulated here is what happened to me when I was building ZM from source and doing many builds while getting everything working properly. If you make ZM, then want to do a make install, and you're sitting in the source directory (where the build has created the default zm.conf file) when you try to stop ZM, it uses that default zm.conf and not the real one in /usr/local/etc/. Ergo it tries to use the wrong mySQL password and voilà, you get the bogus events.

So you always have to remember to cd out of the source directory, stop ZM, cd back to the source directory, do the make install, change the password in your /usr/local/etc/zm.conf because the make install overwrites it, cd out of the source directory, and then start ZM.
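The part of that routine where the password has to be re-entered can be avoided by saving the real config before the install and putting it back afterwards. This is only a rough sketch: `install_preserving_config` and `run_install` are hypothetical names of mine, not part of ZM, and the install step is passed in as a callable so nothing here assumes how you invoke make.

```python
import shutil
import tempfile
from pathlib import Path

def install_preserving_config(conf_path, run_install):
    """Run run_install() and restore the original config file afterwards."""
    conf_path = Path(conf_path)
    with tempfile.TemporaryDirectory() as tmp:
        backup = Path(tmp) / conf_path.name
        shutil.copy2(conf_path, backup)      # save the working config first
        try:
            run_install()                    # e.g. a make install that overwrites it
        finally:
            shutil.copy2(backup, conf_path)  # put the real password/settings back
```

For example, `run_install` could be a small function that calls `subprocess.run(["make", "install"], cwd=source_dir, check=True)`, with `conf_path` set to `/usr/local/etc/zm.conf`. ZM still has to be stopped and started from outside the source directory, since that is a separate problem.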

The problem I have with this is:

1. the make install should *not* overwrite the /usr/local/etc/zm.conf
2. ZM should not use a zm.conf file just because it finds one in your pwd
3. the default zm.conf should be created in a subdirectory of the source directory, which would mostly eliminate #2 above, though it could still happen
cordel wrote:Bogus dates are usually a sign of a badly configured MySQL server and/or an underpowered machine.
Check top and see where your loads are at. If the machine is not showing signs of being overloaded, go to dev.mysql.com and work out your configuration.
So much for an underpowered machine or a bad mySQL setup. Like I said, I usually rule out the obvious before asking for help, rather than crying wolf.

Posted: Tue Jan 26, 2010 1:47 pm
by jfkastner
i still have bogus events (1-10 per day) but i have NEVER had the source - i got a deb package from a gentleman at

ftp.northern-ridge.com.au

which was dated sep 2009 (no idea which SVN that was), and did some customizing - on ubu 904 32bit desktop with zm 1242 on an old P4

the number of events and duration seems very random, but usually less than 30 frames

BUT i have an older slower disk, and noticed that those events coincide with zmaudit running (and cleaning out old events -> lots of IO) ... maybe some bus/timing issue .... microseconds can be very hard to observe!

Posted: Tue Jan 26, 2010 2:00 pm
by BlankMan
Ahh, ok, so others have seen this too, and it possibly can be hardware related. Are yours always 1969-12-31 18:00:00 too? Looks like I may have discovered just one cause.

I had to reload my weather DB, 5+ years' worth of data, a few weeks ago because of corruption from when I was having disk problems (a bad disk behind the RAID controller, thus machine lockups). I was pounding the machine doing so: extremely high transaction rates, lots of data munging/formatting/selects/inserts. The process was CPU bound with no iowait, so I knew I wasn't having slow machine/mySQL problems; something else was going on.

Posted: Tue Jan 26, 2010 2:41 pm
by jfkastner
yes, good'ol' 1969 ... woodstock ... hendrix ... :)

Posted: Wed Jan 27, 2010 9:37 am
by cordel
BlankMan wrote:
cordel wrote:Bogus dates are usually a sign of a badly configured MySQL server and/or an underpowered machine.
Check top and see where your loads are at. If the machine is not showing signs of being overloaded, go to dev.mysql.com and work out your configuration.
So much for an underpowered machine or a bad mySQL setup. Like I said, I usually rule out the obvious before asking for help, rather than crying wolf.
You would be one of the few then, but that is still not an excuse to leave out information that answers the obvious questions. System information should have been included.

Glad you got it sorted.
As for the configuration file, it will be looked into.

Posted: Sat Jan 30, 2010 11:16 pm
by jfkastner
BTW i just observed such an event:

it starts with zmaudit running and deleting events (which are in my case 900 frames each)

top on the terminal tells me 'rm' is running wild, idle is zero % and WA (=iowait) is over 50%

at some point mysqld kicks in to clean up the database, and ZMA complains about buffer overruns

-> a few lost frames, sql messed up -> timeline goes back to 1969

guess there's just too much IO happening at the same time, maybe sql times out while writing the database ...

don't think it's an IDE bug (like M$ mediacenter had) or some other hardware/mobo bug, since i have NO data corruption on both my EXT3 disks and the system runs for 3 weeks now w/o nasty complaints in the logs or crashes

maybe someone with some mysql knowledge could come up with an idea?

MAYBE SOLVED?!

Posted: Tue Feb 09, 2010 8:19 pm
by jfkastner
another probable cause:

my clock skew (difference between software/hardware/"realtime") was about 30 sec per day - might be a very slow disk that's getting hammered by ZM, the MB is OK and runs for weeks w/o problems (w/o ZM or different OS)

anyways a few days ago i finally set up a cronjob that syncs with an NTP server every day -> the bogus 1969 events went away completely!

maybe sql and ZM use the time info in a different way so it creates an overflow/negative value (eg 2^32 - 1) and that screws up the recorded timestamp?

EDIT

there's gotta be more to that, i still get those events, but only 1-2 per day instead of 5-10

however in sql in the field 'note' it always says 'signal lost' or 'reacquired', which is kind'a weird because i have ONLY ip cams, and yes, zma/zmc restart sometimes anyways

Posted: Wed Feb 10, 2010 7:27 pm
by deldued
Anyone got any more ideas?

I see the same issue on my quad core, 4GB 64bit ubuntu box which is only really used for ZM, file server and mail.