
The extent of government data mining of communications

Skeptigirl,

How does the software determine if the group is anti-social? Does it have the potential of labeling non-antisocial groups as antisocial?
The key is pattern recognition, and the developers of these data mining programs have become very creative in determining patterns. And absolutely, the programs have the potential to sweep in all sorts of "social groups" along with the antisocial groups.

The following is only one example: a PowerPoint sales presentation by a software company. The patterns it mines for, given as examples, reveal the kind of pattern profiling we might expect from these government data mining activities. You might make better sense of it if you see the diagrams that accompany the text below.

ISS Webinar – 13th May 2008 - Identification of Nomadic Targets
Profiling
Compares Telecomms behaviour of a Target or device with stored historical profile
Searches for deviations from normal behaviour
Uses ThorpeGlen patented processes - statistical engine data analysis
Examples
Target using multiple SIMs, or changed handset
Target has passed his phone onto someone else.
= Target’s behaviour has changed.
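
As a rough illustration of "searches for deviations from normal behaviour": the slides do not describe ThorpeGlen's patented process, so the sketch below swaps in a plain z-score test, which is my assumption and not their method. The function name `deviates`, the daily call counts, and the threshold of 3 standard deviations are all made up for the example.

```python
from statistics import mean, stdev

def deviates(history, current, threshold=3.0):
    """Flag `current` if it lies far outside the stored historical profile.

    `history` is a list of past daily values (e.g. calls per day) and needs
    at least two entries. This is a simple z-score check, used here only as
    a stand-in for whatever statistical engine the vendor actually uses.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold

# Hypothetical profile: calls per day over the past week.
past_week = [10, 12, 11, 9, 10, 11, 12]
print(deviates(past_week, 11))  # an ordinary day
print(deviates(past_week, 40))  # a sudden jump in activity
```

A real system would presumably profile many features at once (numbers dialled, times of day, locations), not a single daily count.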

Social Profiling
Our system will extract the 2 numbers of interest and examine for commonality (profile)
- classic illustration of same target
- different SIM and Handset
By charting CDRs on 1 known number our system will automatically display the relationship of an identified IMEI (handset) and the multiple SIMs used with that handset (in this example 3 SIMs) – which is one of the attributes of a Target Call Profile – then we can look for other IMEIs used with any of those SIMs
– extracting the multiple device relationship/pattern
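
The handset/SIM linking described above can be sketched in a few lines. This is only my guess at the idea, assuming each call record has been reduced to a (SIM, IMEI) pair; the function name `linked_devices` and the sample identifiers are hypothetical.

```python
from collections import defaultdict

def linked_devices(records, known_sim):
    """Starting from one known SIM, collect every handset (IMEI) it was
    used in, every other SIM used in those handsets, and so on."""
    sims_by_handset = defaultdict(set)
    handsets_by_sim = defaultdict(set)
    for sim, imei in records:
        sims_by_handset[imei].add(sim)
        handsets_by_sim[sim].add(imei)

    sims, handsets = {known_sim}, set()
    frontier = [known_sim]
    while frontier:
        sim = frontier.pop()
        for imei in handsets_by_sim[sim]:
            if imei in handsets:
                continue
            handsets.add(imei)
            # Every SIM ever seen in this handset joins the profile.
            for other in sims_by_handset[imei]:
                if other not in sims:
                    sims.add(other)
                    frontier.append(other)
    return sims, handsets

# Hypothetical records: one handset with 3 SIMs, one SIM moved to a second handset.
records = [("sim1", "h1"), ("sim2", "h1"), ("sim3", "h1"),
           ("sim3", "h2"), ("sim9", "h9")]
print(linked_devices(records, "sim1"))
```

Note how quickly this spreads: a single shared or second-hand handset is enough to tie unrelated people into one "target profile".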

Geographic profile
Profile of location updates
– clusters show target is stationary in Base Station area during time of forced “Mobile Location update”

New Developments -
Finding Cliques (people Cells) within mass data
Identifying Cliques (groups that only talk amongst themselves)
• Everyone on a Telephone network is part of a group
• Most groups talk to other groups/individuals/nodes
• Example we have already researched
– We processed all the CDRs from all subscribers (1 week) in a Mobile network we have access to
– Over 1 billion per day x 7 days = 8 billion+ events
– 1 operator 50 m subscribers
– 48 m – 1 large group
– 400k – large nodes (services, shops, info numbers, etc)
– Remainder - we graded the remaining groups into size (largest to smallest) they ranged in size from 2 to 142 subscribers
– Identified a number of groups that
• only call each other – never to other numbers
• No-one ever calls their numbers
• WHY??
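
To show how little it takes to find such "cliques", here is a minimal sketch. It assumes the call records are reduced to (caller, callee) pairs and treats a group as a connected component of the call graph, which is my reading of "only call each other" and not necessarily how the vendor does it. The name `closed_groups` is hypothetical; the default `max_size` of 142 is just the largest small group mentioned in the slides.

```python
from collections import defaultdict

def closed_groups(cdrs, max_size=142):
    """Return small groups of numbers that only ever call each other.

    Numbers are linked if either one called the other. Components larger
    than `max_size` (such as the one huge group most subscribers fall
    into) are dropped, as are single numbers.
    """
    neighbours = defaultdict(set)
    for caller, callee in cdrs:
        neighbours[caller].add(callee)
        neighbours[callee].add(caller)

    seen = set()
    groups = []
    for start in neighbours:
        if start in seen:
            continue
        # Iterative depth-first search collects one connected component.
        component = set()
        stack = [start]
        while stack:
            number = stack.pop()
            if number in component:
                continue
            component.add(number)
            stack.extend(neighbours[number] - component)
        seen |= component
        if 2 <= len(component) <= max_size:
            groups.append(component)
    return sorted(groups, key=len, reverse=True)

# Hypothetical week of calls: two closed groups and one larger chain.
calls = [("a", "b"), ("b", "c"), ("x", "y"),
         ("p", "q"), ("q", "r"), ("r", "s"), ("s", "t")]
print(closed_groups(calls, max_size=3))
```

Nothing in this says anything about why a group is closed. A family plan, a small business, or a set of alarm systems would be flagged exactly like the groups the presentation is hinting at.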
 
It's called speech recognition, not voice recognition (voice recognition is infinitely less accurate, and functionally speaking is science fiction). Speech recognition has impressive real-time levels of accuracy - in optimal controlled conditions. But we're talking about monitoring calls that are about 99% not optimal controlled conditions.....

From a link in the OP:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 13, NO. 5, SEPTEMBER 2005
The technology for monitoring conversational speech to discover patterns, capture useful trends, and generate alarms is essential for intelligence and law enforcement organizations as well as for enhancing call center operation....

It is also an essential tool in media content management for searching through large volumes of audio warehouses to find information, documents, and news. For this special issue, we solicited original theoretical and practical work offering new and broad view of data mining research for speech, audio, and spoken dialog. In particular, we encouraged submissions in the following areas: data mining theory, algorithms, and methods; core machine-learning algorithms for data mining; topic spotting and classification; pattern discovery and prediction from data; performance analysis and evaluation; information retrieval; tools and solutions for mining data; applications in marketing, business, and security; trend analysis and visualization; and surveillance, authentication, and customer service.

The reaction to our special issue has been excellent. We received 20 submissions spanning a variety of topics in speech data mining. Of the ten papers that were accepted, nine appear in this issue. These papers deal with the challenge of applying speech and language processing algorithms, as well as machine learning methods, to extract business intelligence and trends, retrieve documents and information from a large pool of audio data and conversations, manipulate and process speech and dialog to extract semantic information and reduce analysts’ and development efforts, and indexing and tracking changes in the statistics of multimedia data, including audio and music for segmentation and clustering. A number of interesting applications are also addressed in this issue, including customer care, auto attendant, and audio mining, retrieval, and classification.
(emphasis mine)

This would seem to contradict your estimate of the state of this art. And while this article talks about proposals, advances in computer technology are extremely rapid.

I imagine company proprietary materials would show that the actual state of the art is capable of data mining audio recordings of phone calls.

I put two and two together differently than you do. But your ad hom, that my conclusions stem from paranoia, is unfounded. Spying on anti-social groups in the US goes back beyond the McCarthy era and increased again during Nixon's Presidency. Cheney clearly brought it to a peak again. That is documented history, not paranoid CTism.

I'm not worried they are coming for anti-war protesters anytime soon. But I have grave concerns when a President spies on the news media to suppress unwanted political criticism. It takes citizen vigilance, not complacency, to keep a society free.
 
She's mixing links and not really paying attention to what she's talking about. ...
Maybe you should take those mind reading skills of yours and apply for the MDC?

Then again, if this is an example of your skills, maybe not.



The "anti-social groups" thing monitors call records after the fact and uses them to scan for specific call behaviour - such as a small group of people only ever calling each other and no one else.
Once one identifies the pattern, the implication here is that the specific phones would then be tapped.
 