Zoneminder system runs into disk problems after 2 months...
Posted: Mon Jan 30, 2012 1:33 am
by caseystone
Hello!
I have a remote ZM server that I need to run reliably. I installed ZM 1.25.0 on Ubuntu 10.04 LTS Server following some instructions linked from the ZM site. It works pretty well, but it has run into disk problems twice so far. The first time, I am pretty sure it happened shortly after ZM ran a "purge when full" action. The file system dropped into read-only mode (due to errors being detected) and naturally ZM stopped working! Manual intervention was required to initiate (or, I guess, to confirm) fsck on the next reboot. After repairs the system was back to working fine.
Now it has happened again. I'm not sure if it was after a disk purge, but it seems likely. When I was able to get to the machine and attend to it, I captured the errors; there were at least four like this:
Inode 1048580 was part of the orphaned inode list. FIXED.
So, I suspect these orphaned inodes are being detected and the filesystem gets dropped into read-only mode. It may get worse after that, since I could not SSH into the box remotely, and my remote power cycle control would not get me back into SSH either, I guess because someone needed to press a key to start the fsck.
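A minimal sketch of how the read-only state described above could be detected from a script, assuming the root filesystem is mounted with errors=remount-ro (commonly the default on Ubuntu installs) so that the "ro" flag shows up in /proc/mounts. The function name and the sample device names are made up for illustration:

```python
def read_only_mounts(mounts_text, fstypes=("ext2", "ext3", "ext4", "xfs")):
    """Return (device, mountpoint) pairs that are currently mounted read-only.

    mounts_text is the content of /proc/mounts: one mount per line, with
    whitespace-separated fields device, mountpoint, fstype, options, dump, pass.
    Only the filesystem types listed in fstypes are considered.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        if fstype in fstypes and "ro" in options.split(","):
            found.append((device, mountpoint))
    return found


# Hypothetical sample; on a real machine pass open("/proc/mounts").read().
SAMPLE_MOUNTS = """\
/dev/mapper/vg-root / ext4 ro,relatime,errors=remount-ro 0 0
proc /proc proc rw,noexec,nosuid,nodev 0 0
/dev/sda1 /boot ext2 rw,relatime 0 0
"""

print(read_only_mounts(SAMPLE_MOUNTS))
```

Something like this could be run periodically, but note that once the root filesystem is read-only, anything that needs to write to disk (logs, mail spools) may fail too, so the alert would have to leave the box some other way.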
My question - is Zoneminder doing something wrong to cause this? My file system is EXT4, all on one drive which is a Western Digital AV-GP (dvr optimized) 1TB drive.
I don't really have the budget for a KVM-over-IP (or IPMI) solution. I have placed 'forcefsck' at the root level of the drive, which should force it to actually run fsck if I remotely power cycle the machine (the BIOS will turn it on for me).
I'm no expert in reading smartmontools outputs, but no errors were reported. Is my hard drive bad? Should I use a different OS? How can I combat this?
Thanks for the help.
-Casey
Re: Zoneminder system runs into disk problems after 2 months
Posted: Wed Feb 22, 2012 10:39 pm
by caseystone
Again I find the system has frozen up...
From /var/log/dmesg:
[ 2.068649] EXT4-fs (dm-0): INFO: recovery required on readonly filesystem
[ 2.068653] EXT4-fs (dm-0): write access will be enabled during recovery
[ 3.666203] EXT4-fs (dm-0): orphan cleanup on readonly fs
[ 3.666210] EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1048583
[ 3.666235] EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1048582
[ 3.666240] EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1048581
[ 3.666246] EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1048580
[ 3.666250] EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1048578
[ 3.666255] EXT4-fs (dm-0): 5 orphan inodes deleted
[ 3.666257] EXT4-fs (dm-0): recovery complete
[ 3.957930] EXT4-fs (dm-0): mounted filesystem with ordered data mode
I don't think there were any deletions this time, so I guess it has nothing to do with the purge.
Linux experts -- is it just a bad hard drive?
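For comparing boots against each other, here is a rough sketch that pulls the orphan-cleanup lines out of a dmesg dump like the one above. It assumes the message format is exactly as shown in the log (other kernel versions may word it differently), and the function name is made up for illustration:

```python
import re

# Matches lines such as:
# EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1048583
ORPHAN_RE = re.compile(
    r"EXT4-fs \((?P<dev>[^)]+)\): ext4_orphan_cleanup: "
    r"deleting unreferenced inode (?P<inode>\d+)"
)


def orphan_inodes(log_text):
    """Map each device name to the list of orphan inode numbers cleaned up."""
    result = {}
    for match in ORPHAN_RE.finditer(log_text):
        result.setdefault(match.group("dev"), []).append(int(match.group("inode")))
    return result


# Shortened copy of the log above, used as sample input.
SAMPLE_DMESG = """\
[ 3.666203] EXT4-fs (dm-0): orphan cleanup on readonly fs
[ 3.666210] EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1048583
[ 3.666235] EXT4-fs (dm-0): ext4_orphan_cleanup: deleting unreferenced inode 1048582
[ 3.666255] EXT4-fs (dm-0): 5 orphan inodes deleted
"""

print(orphan_inodes(SAMPLE_DMESG))
```

Keeping the inode numbers from each incident makes it easier to spot repeats: inode 1048580 shows up both in the earlier fsck message and in this dmesg output.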
Re: Zoneminder system runs into disk problems after 2 months
Posted: Thu Feb 23, 2012 10:04 pm
by pezed
The errors you are receiving are filesystem errors, not ZoneMinder errors. It's likely some sort of hardware issue. Run Memtest to begin with (bad memory will corrupt the filesystem over time). Next I'd run the manufacturer's long test on the drive and also check power supply stability.
Re: Zoneminder system runs into disk problems after 2 months
Posted: Fri Feb 24, 2012 10:12 am
by caseystone
Thanks pezed. Next time I have access to the machine I'll run those tests.
Re: Zoneminder system runs into disk problems after 2 months
Posted: Thu Mar 29, 2012 10:49 pm
by stonith
pezed wrote: The errors you are receiving are filesystem errors, not ZoneMinder errors. It's likely some sort of hardware issue. Run Memtest to begin with (bad memory will corrupt the filesystem over time). Next I'd run the manufacturer's long test on the drive and also check power supply stability.
This. I have a software RAID + LVM + XFS filesystem setup for the partition that saves all my events. It is likely that you are having hardware issues that are causing filesystem errors. I've been trying to figure out what the hell was going on with my system. I had two Samsung (now owned by Seagate) 2.5" 5400 RPM 1TB notebook drives for my server in a RAID1. Not a good idea. The amount of writing done by ZoneMinder plus the RAID activity was basically causing some bad foobar stuff with my hard drives. I eventually had data corruption because the active drive in my degraded RAID array was going out too. Anyway, I got them replaced with 1TB 7200 RPM 3.5" drives and it seems to be behaving better (I will know in about 2 months). I'm doing some heavy disk activity with ZoneMinder, cacti, RAID, LVM, squid/sarg, and eventually MythTV. It is possible that you have physical drive errors, bad cables from the drive back to the motherboard, or even bad ports on the motherboard. Do an integrity check on your drives to see if they come back faulty. Seagate has a tool called SeaTools that does this. Maybe WD has something similar. Good luck, I feel your pain.
Re: Zoneminder system runs into disk problems after 2 months
Posted: Thu May 31, 2012 2:33 pm
by caseystone
Hello:
Finally had a chance to get to the system. I ran Memtest 86 for 5 full passes (about 2 hours) which completed without error. Then I let smartctl do a long test which came back without error.
I did set up an advance replacement of the hard drive from WD so my next step is to swap out the drive (I'll also swap out the SATA cable and use a different SATA port for it). I'm also going to rebuild the system using CentOS 6.2 I think.
I'm not sure how to test the power supply (as pezed suggested), but it is at least a quality one (Antec earthwatts green EA-380D list price $60).
If I thought I could buy some new drive to be a 'bulletproof' boot drive and then use the WD A/V drive for events storage I would do that (then maybe I would never get into a situation when the OS will not come up after a remote power cycle) -- but I've always figured a single point of failure is better, so I just use one drive.
Suggestions?
Thanks.
-Casey
Re: Zoneminder system runs into disk problems after 2 months
Posted: Thu May 31, 2012 3:52 pm
by KeithB
Seems WD AV drives are not suitable for main OS / database drive because apparently they skip a lot of error checking to increase their AV streaming performance.
http://www.pcreview.co.uk/forums/wd-av- ... 903p3.html
I use a 500GB Samsung as the main drive on my server and a 1TB WD AV drive just to store events. This has the added bonus of protecting the main drive when the events drive overflows. Instead of having to fix various databases or even re-install, I just delete some events and it's working again.
Re: Zoneminder system runs into disk problems after 2 months
Posted: Thu May 31, 2012 4:21 pm
by caseystone
Thanks Keith. However, my reading of that thread leads me to believe the conclusion is that the drive should be OK as a boot drive. Other web searching does not help much, though the drive's info page on the WD site does not mention it being unsafe as a boot drive.
I think the drive ALLOWS the use of some streaming modes where there will not be retries, but in my setup these would not be in use. The benefit of using the drive (if WD is to be believed) is higher reliability in 24/7 environments due to lower operating temperature and being specifically designed for that type of use, possibly with some other firmware adaptations to improve multiple-stream reads/writes.
-Casey
Re: Zoneminder system runs into disk problems after 2 months
Posted: Tue Jun 05, 2012 9:39 am
by Paranoid
I had similar problems a couple of years ago. I resolved it by changing to ext3 and adding "noatime" to the fstab entry. It solved the problem.
Because I made both changes at the same time I don't know whether both or just one of the changes fixed it.
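As a rough sketch of the "noatime" half of that change only (switching to ext3 means recreating the filesystem, which is not covered here): the helper below appends an option to the fourth field of an fstab entry. The function name and the sample device line are made up for illustration, and column alignment in the file is not preserved:

```python
def add_mount_option(fstab_line, option="noatime"):
    """Return fstab_line with option appended to its mount options (field 4).

    Comments, blank lines, malformed lines, and entries that already carry
    the option are returned unchanged.
    """
    stripped = fstab_line.strip()
    if not stripped or stripped.startswith("#"):
        return fstab_line
    fields = stripped.split()
    if len(fields) < 4:
        return fstab_line  # not a complete fstab entry
    options = fields[3].split(",")
    if option in options:
        return fstab_line
    fields[3] = ",".join(options + [option])
    return " ".join(fields)


# Hypothetical root entry; the real device name will differ.
print(add_mount_option("/dev/mapper/vg-root / ext4 errors=remount-ro 0 1"))
```

After editing /etc/fstab by hand or with something like this, keep a backup copy of the original file; a broken fstab on a remote machine is another way to end up with a box that will not boot unattended.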
You say:
"and my remote power cycle control"
Would this by any chance be Intel's AMT (Active Management Technology)? If it is, then you can also remotely connect to a serial console, which would allow you to remotely start the fsck during a reboot.
Re: Zoneminder system runs into disk problems after 2 months
Posted: Tue Jun 05, 2012 8:35 pm
by caseystone
Thanks Paranoid! I'll try that when I rebuild the system.
No, it's not Intel AMT. It's this, which is pretty cool:
http://www.digital-loggers.com/lpc.html
Re: Zoneminder system runs into disk problems after 2 months
Posted: Tue Jun 05, 2012 11:05 pm
by cordel
Actually, the noatime option helps a lot in cutting down I/O, so I would definitely give that a go. The file system used makes quite a bit of difference too. I use ext2 a lot myself, and others depending on the use.
Re: Zoneminder system runs into disk problems after 2 months
Posted: Tue Jun 05, 2012 11:31 pm
by caseystone
Thanks, Cordel. I think CentOS defaults to ext4 now also. Suggestions for partitioning for a single drive system?