The Picolo Tetra is a PCI-X four-chip card. It's not cheap, but it sounds like you might have the resources.
What you are currently doing, assuming 3 bytes per pixel (maybe it's four?), is 4*640*480*25*3 = 92MB/s. The old PCI standard has only 133MB/s of theoretical bandwidth, and in the real world nothing like that would be available to your capture card. I imagine that if the PCI bus is overwhelmed, various processes introduce delays as they wait for access to it.
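If you want to play with the numbers, here is a rough sketch of that arithmetic in Python. I'm assuming the figures mean 4 streams of 640x480 at 25 frames per second, and the function and constant names are just made up for illustration. The 133MB/s is the theoretical peak of plain 32-bit/33MHz PCI, not what you will actually get.

```python
def capture_bandwidth(streams, width, height, fps, bytes_per_pixel):
    """Raw, uncompressed video bandwidth in bytes per second."""
    return streams * width * height * fps * bytes_per_pixel

# Theoretical peak of 32-bit/33 MHz PCI in bytes per second;
# real-world throughput is lower and shared with other devices.
PCI_PEAK = 133_000_000

for bpp in (3, 4):
    rate = capture_bandwidth(4, 640, 480, 25, bpp)
    print(f"{bpp} bytes/pixel: {rate / 1e6:.1f} MB/s, "
          f"{rate / PCI_PEAK:.0%} of PCI peak")

# 3 bytes/pixel: 92.2 MB/s, 69% of PCI peak
# 4 bytes/pixel: 122.9 MB/s, 92% of PCI peak
```

So even at 3 bytes per pixel you are asking for about two thirds of a figure the bus never really delivers, and at 4 bytes per pixel you are almost at the theoretical limit.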