OK. People here know this stuff better than I do, so here I turn.
I was watching some stories on the upcoming anniversary of the murder of JonBenét Ramsey. It's been almost 20 years. The issue I am wondering about is this. Apparently there was DNA from an unknown male that may belong to the killer. Identifying that male would probably help in solving the case or excluding the suspect. Now if he was in a database he would probably have been matched by now. I'd imagine the DNA databases that are out there are huge. If there is no exact match, could the DNA be matched to a closest match? Or am I way off base here? I'm operating on the assumption that at least everyone has a 5th cousin who's in the database. I am just wondering if you could find a closest match to narrow your search and from there probably follow family lines to keep narrowing the search, until inevitably you find the source of the DNA.
I have little idea if this is at all possible, so I defer to others much smarter than I.
This requires a bit of explanation of what these DNA databases contain in the first place. The sequences you look for are deliberately chosen to meet several criteria, one of which is that they can't be used to identify phenotypic traits, such as skin color.
The kits used actually determine the number of repeats of four-nucleotide units at sites scattered all over the genome. It turns out the number of repeats doesn't tell you anything particular about the individual (in most cases; repeat counts are linked to various diseases in some cases, but that's beside the point and the loci chosen aren't linked to those diseases anyway), but there is sufficient variability between individuals and the sequences are short enough to be used for forensics (100-300 bp) and have other desirable properties. You inherit one copy of each repeat from each parent, so you generate two signals from each locus, one from the mother and one from the father. Which variant you get from each parent is random: if the mother has 7 and 9 repeats on a locus and the father has 8 and 9 on that locus, the possibilities for the kids are 7,8; 7,9; 8,9; 9,9.
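To make the inheritance rule concrete, here is a minimal Python sketch that lists every genotype a child could have at one locus. The function name `child_genotypes` is made up for illustration; the repeat counts are the ones from the example above, and the order of the two numbers in a genotype is treated as irrelevant.

```python
from itertools import product

def child_genotypes(mother, father):
    """Return the distinct genotypes a child can have at one locus.

    Each parent is a pair of repeat counts; the child receives one
    count from each parent. Genotypes are stored as sorted tuples,
    because the order of the two signals does not matter.
    """
    return {tuple(sorted(pair)) for pair in product(mother, father)}

# Mother has 7 and 9 repeats, father has 8 and 9 repeats.
kids = child_genotypes((7, 9), (8, 9))
print(sorted(kids))
```

Each of the four combinations is equally likely for any one child, which is why siblings can easily end up with different signals at a locus.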
The consequence of this is that if a couple has four kids, they can in theory each have a different set of signals on a particular locus and won't even know it unless they test it. For one locus that's not even particularly unlikely. Two kids of any couple do not necessarily share any signal on any locus or combination of loci, including all of them. Forensic kits use 12-18 loci, which is enough to identify an individual with effectively absolute certainty (i.e. you'd expect 1 in 10^13 or more individuals to have that particular combination, which makes it all but impossible that someone else with the exact same profile exists within the human population), but they're insufficient to establish blood relations, except sometimes to exclude them. You can't even tell if DNA came from someone's kid to a degree that would be useful in court. Except in special cases, you can't even say two random individuals aren't related with normal kits. Y-chromosome and mitochondrial profiles fare somewhat better there, but not enough to say two people are related; you can just exclude it with adequate certainty in most cases. Showing they are related is an order of magnitude more difficult.
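A rough sketch of where a figure like 1 in 10^13 comes from, assuming the loci are independent and that a random person matches at any single locus with probability 0.1. That 0.1 is an illustrative assumption of mine, not a measured value or a number from any particular kit; real per-locus figures depend on the population's allele frequencies.

```python
def random_match_probability(per_locus_probability, loci):
    """Chance that a random person matches a profile at every locus.

    Assumes the loci are inherited independently, so the per-locus
    match probabilities simply multiply.
    """
    return per_locus_probability ** loci

# 0.1 per locus is an assumed, illustrative value.
for loci in (12, 13, 18):
    p = random_match_probability(0.1, loci)
    print(f"{loci} loci: roughly 1 in {1 / p:.0e}")
```

The same multiplication is why a full profile is so good at identifying one person and so poor at ranking relatives: a relative matches at some loci and not others, and so do many unrelated people by chance.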
Parent determination uses the exact same principle, but it uses a much larger number of loci - 50 or more. This is possible because forensic kits will necessarily have to work with old, degraded DNA much of the time and with limited samples, whereas parent determination works with fresh samples of virtually unlimited quantity, so they have much greater flexibility in which loci to choose from and can also work with much longer fragments. However, that would still be grossly insufficient to determine a 5th cousin, barring severe inbreeding for at least five generations.

Genetics is fun.
For your purpose, one possibility is to profile the Y chromosome and mitochondrial DNA and then find the individual's parents: the first narrows down the father to some marginal likelihood, the second the mother to about 1 in 1000, and then you look for a couple that could match. This would require profiling the entire population first, though, so you'd be better off profiling everyone with the forensic kit from the start. The costs of running one sample are in the hundreds of dollars, so I wouldn't hold my breath on that.
McHrozni