CentOS 5.4 & PV149 & ZoneMinder 1.24.= input dies ov

Posted: Wed Dec 23, 2009 12:43 am
by sck_nogas
On my CentOS 5.4 server using video4linux2 with a PV149 capture board installed with 8 local wired cameras running ZoneMinder - v1.24.2, I have the following problem...
    * Upon reboot, all cameras are good, images appearing, life is happy.
    * Over time, some cameras stop being viewable, and the only errors in /var/log/messages that indicate a hardware failure are...
    Dec 17 19:40:58 house zmdc[2954]: INF [Starting pending process, zmc -d /dev/video0]
    Dec 17 19:40:58 house zmdc[2954]: INF ['zmc -d /dev/video0' starting at 09/12/17 19:40:58, pid = 30269]
    Dec 17 19:40:58 house zmdc[30269]: INF ['zmc -d /dev/video0' started at 09/12/17 19:40:58]
    Dec 17 19:40:58 house zmc_dvideo0[30269]: INF [Debug Level = 0, Debug Log = <none>]
    Dec 17 19:40:58 house zmc_dvideo0[30269]: INF [Starting Capture]
    Dec 17 19:40:58 house kernel: bttv0: timeout: drop=0 irq=41702860/41702860, risc=36cc903c, bits: FMTCHG VSYNC HSYNC RISCI
    Dec 17 19:40:58 house zmc_dvideo0[30269]: WAR [Capture failure, possible signal loss?: Input/output error]
    Dec 17 19:40:58 house zmc_dvideo0[30269]: ERR [Failed to capture image from monitor 1 (0/2)]
    Dec 17 19:40:58 house kernel: bttv0: reset, reinitialize
    Dec 17 19:40:58 house kernel: bttv0: PLL can sleep, using XTAL (28636363).
    Dec 17 19:40:58 house zmdc[2954]: ERR ['zmc -d /dev/video0' exited abnormally, exit status 255]
    * So, the image in ZoneMinder is blank (blue screen) and the device in ZoneMinder goes red.
    * But not all cameras are affected: the video from /dev/video1 and /dev/video3 works, but not /dev/video0 and /dev/video2?
    * Stopping and starting ZoneMinder doesn't solve the issue, but a reboot does.
Interesting factoids?

1) Under Fedora Core [with same PV149] but different motherboard, I never had this problem.
2) It's always the same video devices, but they are on the same board and using the same video chips as the devices that stay working?
3) It is always fixed by a reboot.

So, what I'd like to do is fix the problem permanently or, if that's not possible, detect the problem and fix it without a reboot.

Scott
PS: I'm lying... I have 7 cameras working, since I used a VGA cable rather than one that passes all 16 pins through, so one camera doesn't display. But I know why. :)

Posted: Wed Dec 23, 2009 12:56 am
by sck_nogas
Just digging some more, I decided to look into the /proc filesystem, and I noticed that I could trigger messages in /var/log/messages for the /dev/video devices that weren't working.
[sck@house ~]$ ls -ld /proc/self/root/dev/video*
lrwxrwxrwx 1 root root 6 Dec 12 12:28 /proc/self/root/dev/video -> video0
crw----rw- 1 root root 81, 0 Dec 12 12:28 /proc/self/root/dev/video0
crw----rw- 1 root root 81, 1 Dec 12 12:27 /proc/self/root/dev/video1
crw----rw- 1 root root 81, 2 Dec 12 12:27 /proc/self/root/dev/video2
crw----rw- 1 root root 81, 3 Dec 12 12:27 /proc/self/root/dev/video3
[sck@house ~]$ cat /proc/self/root/dev/video? ; sudo tail -100 /var/log/messages| grep bttv
cat: /proc/self/root/dev/video0: Input/output error
cat: /proc/self/root/dev/video1: Device or resource busy
cat: /proc/self/root/dev/video2: Input/output error
cat: /proc/self/root/dev/video3: Device or resource busy
Dec 22 16:52:21 house kernel: bttv0: reset, reinitialize
Dec 22 16:52:21 house kernel: bttv0: PLL can sleep, using XTAL (28636363).
Dec 22 16:52:22 house kernel: bttv0: timeout: drop=0 irq=41702860/41702860, risc=36cc901c, bits: FMTCHG VSYNC HSYNC RISCI
Dec 22 16:52:22 house kernel: bttv2: reset, reinitialize
Dec 22 16:52:22 house kernel: bttv2: PLL can sleep, using XTAL (28636363).
Dec 22 16:52:22 house kernel: bttv2: timeout: drop=0 irq=10520964/10520964, risc=36e9003c, bits: VSYNC HSYNC RISCI
[sck@house ~]$
Not sure it will help identify the issue, but...

Scott
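
The two messages in the output above seem to separate the inputs cleanly: "Device or resource busy" presumably means zmc still holds the device open, while "Input/output error" marks the inputs that have died. A minimal sketch of that classification, assuming those message texts (English locale); the function name and the state names are made up for illustration:

```shell
#!/bin/bash
# classify_probe MESSAGE: map the text printed when reading a video device
# to a state. The state names (dead, in-use, unknown) are hypothetical.
classify_probe() {
	case "$1" in
		*"Input/output error"*)      echo dead ;;     # read failed, as on video0 and video2
		*"Device or resource busy"*) echo in-use ;;   # open refused, device held by another process
		*)                           echo unknown ;;  # anything else, including no message at all
	esac
}
```

For example, `classify_probe "$(cat /proc/self/root/dev/video0 2>&1)"` would print `dead` for the failing input above. Note that `cat` on a device that is free and working may keep streaming data instead of returning.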

Posted: Wed Dec 23, 2009 8:55 pm
by sck_nogas
To identify my system further, I compiled ZoneMinder 1.24.2 from source, though I'm thinking of pulling the SVN repo to see if there's a change.

The other solution I have to detect and fix the issue is to do something like this..

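Code:

#!/bin/bash

# Capture devices to probe, one per Bt878 chip on the card.
devices="video0 video1 video2 video3"

while true ; do 
	for device in $devices ; do
		# Try to read the device and keep the exit status of cat.
		VIDEO=$(cat /proc/self/root/dev/$device 2>&1) 
		errcode=$?
		if [ "$errcode" -eq "255" ]; then  
			echo " Video Device $device is dead, restarting ZoneMinder"
			# Stop ZoneMinder, reload the bttv driver, then start ZoneMinder again.
			service zm stop; modprobe -r bttv; sleep 1; modprobe  bttv; service zm start
		fi
	done
	sleep 30
done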
But that is such a hack...

Scott

Posted: Fri Dec 25, 2009 6:50 am
by cordel
Check dmesg and /proc/interrupts for problems and conflicts.
Some motherboard BIOSes do not play nicely with BT878 devices; worst case, you might have to replace your motherboard again.

Posted: Sat Dec 26, 2009 4:16 pm
by sck_nogas
Well, my `dmesg` output just contains a lot of...

Code:

[sck@house ~]$ dmesg | tail
bttv2: PLL can sleep, using XTAL (28636363).
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba503c, bits: VSYNC HSYNC RISCI
bttv2: reset, reinitialize
bttv2: PLL can sleep, using XTAL (28636363).
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba501c, bits: VSYNC HSYNC RISCI
bttv2: reset, reinitialize
bttv2: PLL can sleep, using XTAL (28636363).
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba503c, bits: VSYNC HSYNC RISCI
bttv2: reset, reinitialize
bttv2: PLL can sleep, using XTAL (28636363).
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba503c, bits: VSYNC HSYNC RISCI
bttv2: reset, reinitialize
bttv2: PLL can sleep, using XTAL (28636363).
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba501c, bits: VSYNC HSYNC RISCI
bttv2: reset, reinitialize
bttv2: PLL can sleep, using XTAL (28636363).
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba503c, bits: VSYNC HSYNC RISCI
bttv2: reset, reinitialize
bttv2: PLL can sleep, using XTAL (28636363).
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba503c, bits: VSYNC HSYNC RISCI
bttv2: reset, reinitialize
bttv2: PLL can sleep, using XTAL (28636363).
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba503c, bits: VSYNC HSYNC RISCI
[sck@house ~]$ dmesg | wc -l
2369
[sck@house ~]$ dmesg | sort -u 
8636363).
bttv2: PLL can sleep, using XTAL (28636363).
bttv2: reset, reinitialize
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba501c, bits: VSYNC HSYNC RISCI
bttv2: timeout: drop=0 irq=18645223/18645223, risc=10ba503c, bits: VSYNC HSYNC RISCI
[sck@house ~]$
And my script hasn't triggered yet.

Likewise, /proc/interrupts shows nothing... yet...

Code:

[sck@house ~]$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       
  0:  313366768          0          0          0    IO-APIC-edge  timer
  1:          8          0          0          0    IO-APIC-edge  i8042
  4:         10          0          0          0    IO-APIC-edge  serial
  6:          6          0          0          0    IO-APIC-edge  floppy
  7:          1          0          0          0    IO-APIC-edge  parport0
  8:          3          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0   IO-APIC-level  acpi
 12:        126          0          0          0    IO-APIC-edge  i8042
 50:    7712819          0   20946691          0   IO-APIC-level  bttv0
 58:   10673260          0   17494829          0   IO-APIC-level  bttv1
 66:    2908030          0   22055595          0   IO-APIC-level  bttv2
 74:    6951547          0   22264687          0   IO-APIC-level  bttv3
 82:       1571          0          0   56651381         PCI-MSI  eth0
201:          2          0          0          0   IO-APIC-level  ehci_hcd:usb1
209:     197328          0      38194          0   IO-APIC-level  ohci_hcd:usb2
225:      84503     145656    9449494          0         PCI-MSI  ahci
NMI:          0          0          0          0 
LOC:  313355262  313355258  313355470  313355466 
ERR:          1
MIS:          0
[sck@house ~]$ 
I'll keep digging...

Thanks for the assistance,

Scott
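
A single /proc/interrupts listing is hard to judge by eye. One way to watch it over time, assuming a healthy capture chip keeps raising interrupts, is to total the per-CPU counters for each bttv line and compare the totals between samples. A rough sketch; the function name `irq_total` is hypothetical:

```shell
#!/bin/bash
# irq_total NAME [FILE]: sum the per-CPU counters on the /proc/interrupts
# line whose last field is NAME (for example bttv2).
# FILE defaults to /proc/interrupts; pass a saved copy to compare snapshots.
irq_total() {
	awk -v name="$1" '
		$NF == name {
			total = 0
			# Field 1 is the IRQ number ("66:"); the numeric fields after it are per-CPU counts.
			for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
			printf "%.0f\n", total
		}' "${2:-/proc/interrupts}"
}
```

Running `irq_total bttv2` twice, a minute or so apart, and getting the same number both times would suggest that chip has stopped interrupting.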

Posted: Wed Jan 06, 2010 3:36 am
by sck_nogas
Okay, we have an update!!

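My shell script failed to detect the problem since the exit code was never 255 (cat exits with 1 on error), so I wrote an ugly Perl version...

Code:

#!/usr/bin/perl
use strict;
use warnings;

# Capture devices to probe, one per Bt878 chip on the card.
my @devices = qw(video0 video1 video2 video3);

while (1) {
	foreach my $device (@devices) {
		# Run cat on the device and read whatever it prints, including errors.
		open(my $video, '-|', "cat /proc/self/root/dev/$device 2>&1")
			or die "Cannot run cat: $!";
		while (my $line = <$video>) {
			if ($line =~ /Input\/output error/) {
				print " Video Device $device is dead, restarting ZoneMinder\n";
				# Stop ZoneMinder, reload the bttv driver, then start ZoneMinder again.
				system("service zm stop; modprobe -r bttv; sleep 5; modprobe  bttv; service zm start");
			}
		}
		close($video);
	}
	sleep 30;
}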
But that still failed to catch this, so... I looked into this further with /proc/interrupts and dmesg and saw this...

Code:

[sck@house ~]$ sudo cat /proc/self/root/dev/video[0-3]
cat: /proc/self/root/dev/video0: Device or resource busy
cat: /proc/self/root/dev/video1: Device or resource busy
cat: /proc/self/root/dev/video2: Input/output error
cat: /proc/self/root/dev/video3: Input/output error
[sck@house ~]$ cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
  0: 1217368931          0          0          0    IO-APIC-edge  timer
  1:          8          0          0          0    IO-APIC-edge  i8042
  4:         10          0          0          0    IO-APIC-edge  serial
  6:          6          0          0          0    IO-APIC-edge  floppy
  7:          1          0          0          0    IO-APIC-edge  parport0
  8:          3          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0   IO-APIC-level  acpi
 12:        126          0          0          0    IO-APIC-edge  i8042
 50:   12517132          0   95613228          0   IO-APIC-level  bttv0
 58:   21938330          0   31403216          0   IO-APIC-level  bttv1
 66:    2908030          0   22055595          0   IO-APIC-level  bttv2
 74:   13706973          0  171638751          0   IO-APIC-level  bttv3
 82:       1571          0          0  445758629         PCI-MSI  eth0
201:          2          0          0          0   IO-APIC-level  ehci_hcd:usb1
209:     695908          0     217467          0   IO-APIC-level  ohci_hcd:usb2
225:    1806795     172572   43149540          0         PCI-MSI  ahci
NMI:          0          0          0          0 
LOC: 1217325207 1217325204 1217325812 1217325806 
ERR:          1
MIS:          0
[sck@house ~]$ dmesg 
bttv3: unloading
bttv2: unloading
bttv1: unloading
bttv0: unloading
Linux video capture interface: v2.00
bttv: driver version 0.9.16 loaded
bttv: using 8 buffers with 2080k (520 pages) each for capture
bttv: Bt8xx card found (0).
ACPI: PCI Interrupt 0000:02:08.0[A] -> Link [LNKB] -> GSI 19 (level, low) -> IRQ 50
bttv0: Bt878 (rev 17) at 0000:02:08.0, irq: 50, latency: 64, mmio: 0xfbfff000
bttv0: detected: Provideo PV150A-1 [card=98], PCI subsystem ID is aa00:1460
bttv0: using: ProVideo PV150 [card=98,autodetected]
bttv0: gpio: en=00000000, out=00000000 in=00ffffff [init]
bttv0: using tuner=-1
bttv0: i2c: checking for TDA9875 @ 0xb0... not found
bttv0: i2c: checking for TDA7432 @ 0x8a... not found
bttv0: i2c: checking for TDA9887 @ 0x86... not found
bttv0: registered device video0
bttv0: registered device vbi0
bttv0: PLL: 28636363 => 35468950 . ok
bttv: Bt8xx card found (1).
ACPI: PCI Interrupt 0000:02:09.0[A] -> Link [LNKC] -> GSI 18 (level, low) -> IRQ 58
bttv1: Bt878 (rev 17) at 0000:02:09.0, irq: 58, latency: 64, mmio: 0xfbffd000
bttv1: detected: Provideo PV150A-2 [card=98], PCI subsystem ID is aa01:1461
bttv1: using: ProVideo PV150 [card=98,autodetected]
bttv1: gpio: en=00000000, out=00000000 in=00ffffff [init]
bttv1: using tuner=-1
bttv1: i2c: checking for TDA9875 @ 0xb0... not found
bttv1: i2c: checking for TDA7432 @ 0x8a... not found
bttv1: i2c: checking for TDA9887 @ 0x86... not found
bttv1: registered device video1
bttv1: registered device vbi1
bttv1: PLL: 28636363 => 35468950 . ok
bttv: Bt8xx card found (2).
ACPI: PCI Interrupt 0000:02:0a.0[A] -> Link [LNKD] -> GSI 17 (level, low) -> IRQ 66
bttv2: Bt878 (rev 17) at 0000:02:0a.0, irq: 66, latency: 64, mmio: 0xfbffb000
bttv2: detected: Provideo PV150A-3 [card=98], PCI subsystem ID is aa02:1462
bttv2: using: ProVideo PV150 [card=98,autodetected]
bttv2: gpio: en=00000000, out=00000000 in=00ffffff [init]
bttv2: using tuner=-1
bttv2: i2c: checking for TDA9875 @ 0xb0... not found
bttv2: i2c: checking for TDA7432 @ 0x8a... not found
bttv2: i2c: checking for TDA9887 @ 0x86... not found
bttv2: registered device video2
bttv2: registered device vbi2
bttv2: PLL: 28636363 => 35468950 . ok
bttv: Bt8xx card found (3).
ACPI: PCI Interrupt 0000:02:0b.0[A] -> Link [LNKA] -> GSI 16 (level, low) -> IRQ 74
bttv3: Bt878 (rev 17) at 0000:02:0b.0, irq: 74, latency: 64, mmio: 0xfbff9000
bttv3: detected: Provideo PV150A-4 [card=98], PCI subsystem ID is aa03:1463
bttv3: using: ProVideo PV150 [card=98,autodetected]
bttv3: gpio: en=00000000, out=00000000 in=00ffffff [init]
bttv3: using tuner=-1
bttv3: i2c: checking for TDA9875 @ 0xb0... not found
bttv3: i2c: checking for TDA7432 @ 0x8a... not found
bttv3: i2c: checking for TDA9887 @ 0x86... not found
bttv3: registered device video3
bttv3: registered device vbi3
bttv3: PLL: 28636363 => 35468950 . ok
irq 74: nobody cared (try booting with the "irqpoll" option)
 [<c044d1ea>] __report_bad_irq+0x2b/0x69
 [<c044d3d7>] note_interrupt+0x1af/0x1e8
 [<c044c9e5>] handle_IRQ_event+0x45/0x8c
 [<c044cac7>] __do_IRQ+0x9b/0xd6
 [<c044ca2c>] __do_IRQ+0x0/0xd6
 [<c040749e>] do_IRQ+0x99/0xc3
 [<c0405946>] common_interrupt+0x1a/0x20
 [<c0403ce7>] mwait_idle+0x25/0x38
 [<c0403ca8>] cpu_idle+0x9f/0xb9
 =======================
handlers:
[<f8b67bfa>] (bttv_irq+0x0/0x74a [bttv])
Disabling IRQ #74
So I added the irqpoll option to the kernel line of my /boot/grub/grub.conf and rebooted.

We'll see if that fixes the problem...

Scott
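
The "Disabling IRQ #74" line in the dmesg output above is the message that gives the failure away, so a watchdog could look for it directly instead of probing the devices. A minimal sketch, assuming the kernel keeps that exact wording; the function name is hypothetical:

```shell
#!/bin/bash
# disabled_irqs: read kernel log text on stdin and print each IRQ number
# the kernel reported disabling ("Disabling IRQ #74" -> 74), once each.
disabled_irqs() {
	sed -n 's/^.*Disabling IRQ #\([0-9][0-9]*\).*$/\1/p' | sort -un
}
```

Usage would be `dmesg | disabled_irqs`; each number printed can then be matched against the first column of /proc/interrupts to see which bttv device sits on that IRQ.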

Posted: Wed Jan 20, 2010 11:34 pm
by sck_nogas
Well, it's been several weeks now, and it seems to have fixed the issue.

So, if you are having this issue, add the irqpoll kernel parameter, and the problem should go away once you reboot with it in place.

Scott
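
For anyone trying the same fix, it may be worth confirming after the reboot that the option actually reached the running kernel. A small sketch that checks the kernel command line; the function name is hypothetical, and the optional file argument exists only to make it easy to try against a saved copy:

```shell
#!/bin/bash
# has_irqpoll [FILE]: succeed if the kernel command line contains the
# irqpoll option as a whole word. FILE defaults to /proc/cmdline.
has_irqpoll() {
	grep -qw 'irqpoll' "${1:-/proc/cmdline}"
}
```

For example: `has_irqpoll && echo "irqpoll is active" || echo "irqpoll is missing"`.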