Um, actually, there was pretty good speaker recognition (is that what you mean by "voice recognition"?) working at Bell Labs in the late 1970s.
Both speech recognition and speaker recognition are very difficult problems in the general case, but word spotting, as well as spotting certain kinds of phonemes, is much more advanced, and is fairly easy to do when its only job is to occasionally bring a human into the loop.
Speaker verification is fairly straightforward, but speaker identification is another matter entirely. Again, while it works well in controlled conditions, it does not work well in the uncontrolled conditions of telephone calls.
(Not to mention that you need a pre-existing voice map to do it.)
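To make the verification/identification distinction concrete, here is a rough sketch. It assumes each voice has already been boiled down to a small feature vector (the "voice map"); the vectors, the names, the cosine-similarity scoring, and the 0.8 threshold are all made up for illustration and are not how any particular real system works.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(sample, claimed_template, threshold=0.8):
    """Verification: one comparison against the template of a claimed identity."""
    return cosine(sample, claimed_template) >= threshold

def identify(sample, enrolled, threshold=0.8):
    """Identification: search every enrolled template; None if nobody matches."""
    best_name, best_score = None, threshold
    for name, template in enrolled.items():
        score = cosine(sample, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical enrolled "voice maps" and incoming samples.
enrolled = {"alice": [1.0, 0.0, 0.2], "bob": [0.0, 1.0, 0.1]}
sample = [0.9, 0.1, 0.2]
stranger = [0.0, 0.0, 1.0]

print(verify(sample, enrolled["alice"]))  # one comparison
print(identify(sample, enrolled))         # one comparison per enrolled speaker
print(identify(stranger, enrolled))       # no template on file, so no match
```

Verification is a single yes/no comparison, while identification has to search every enrolled template, and someone who was never enrolled can only come back as "no match" (or, with noisier data, as the wrong person).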
Seriously... a human is a vastly superior speaker identifier and speech identifier, yet how often do you get a bad connection where you can't fully understand what is being said on the phone?
Ambient noise is a major hurdle for any sort of voice/speech monitoring software. As more and more phone calls are made on cellphones, the problem only becomes more severe.
The NSA probably could do the sorts of things being talked about here. It's theoretically possible. But the resources and personnel required would be hugely beyond what the NSA has, and would be impossible to hide due to the sheer scale (think power consumption, personnel, bandwidth, etc.).
People think you can get around this with computers, but these still need people to maintain and service them, and eventually you need humans to come in and filter the enormous number of false positives (because believe me, there would be millions of them).
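The false-positive point is just base-rate arithmetic, and a back-of-the-envelope sketch shows it. Every number below is an assumption picked for illustration (call volume, how many calls are actually of interest, and the detector's hit and false-alarm rates); none of them are real figures.

```python
def flagged_calls(total_calls, target_calls, hit_rate, false_alarm_rate):
    """Return (true hits, false alarms, fraction of flags that are real)."""
    true_hits = target_calls * hit_rate
    false_alarms = (total_calls - target_calls) * false_alarm_rate
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision

# Assumed numbers, purely hypothetical:
#   100 million calls monitored per day, 100 of them actually of interest,
#   a detector that catches 99% of those and misfires on only 1% of the rest.
hits, alarms, precision = flagged_calls(
    total_calls=100_000_000,
    target_calls=100,
    hit_rate=0.99,
    false_alarm_rate=0.01,
)

print(f"true hits per day:    {hits:,.0f}")
print(f"false alarms per day: {alarms:,.0f}")
print(f"share of flags that are real: {precision:.4%}")
```

Even with a detector far better than anything that works on noisy phone audio, these assumed numbers give roughly a million false alarms a day against about a hundred real hits, and every one of those flags needs a human to look at it.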