Improving the Search for Intelligence

From New Scientist Magazine Print Edition, 10th February 2007, by Paul Marks

DETECTIVE Gary Williams was investigating a rape last year when his leads dried up. From the victim's statement, he knew there had been a witness to the crime, and he even knew the witness's name, but the man had not come forward and could not be found at his registered address.

So Williams, of South Yorkshire Police in the UK, turned to a smart search engine the force had begun trialling only 2 hours earlier. With just the name of the witness, the Intelligent Data Operating Layer (IDOL) trawled the force's database. It turned out that the witness had been mixed up in previous, unrelated police investigations, and the search generated a list of alternative addresses he had been spotted at, mentioned in witness statements.

From this list, Williams was able to track him down. "The trail on this person had gone cold, but we found him within a few hours using this search engine," he says. As a result, police were able to get a statement from the witness and subsequently arrested a suspect.

Developed by software firm Autonomy of Cambridge, UK, IDOL sits on police computer networks and allows police officers to scour archives of witness statements and intelligence and surveillance logs far more thoroughly than before. The system is part of a general push to improve police and intelligence investigations by finding better ways to access and use information. In the US, for example, a number of police forces, including the Los Angeles Police Department, now use a system called Coplink, which connects all their databases together. The system, developed at the Artificial Intelligence Lab at the University of Arizona in Tucson, allows investigators to find out whether people have been arrested or interviewed in other areas, information that was not available to them before.

Such systems, though, are effectively simple keyword search tools, which scan police databases looking for, say, a name or vehicle licence plate number. This is fine if you know exactly who or what you are looking for, but at the start of an enquiry, that is not always the case, says Mike Lynch of Autonomy. "In investigations the police don't often know what they need to know at the outset, they don't know what questions they need to ask."

The problem with witness statements is that they can't be searched in this way. They are written in the exact words of the witness and so may contain slang words and expressions peculiar to the witness. Also, people often describe the same event using very different language. For instance, one witness to a car-related crime may describe a "black VW Golf with alloy wheels", while another would call the same vehicle a "dark hatchback".

This makes keyword searches of their statements virtually worthless. "There is no standard keyword model you could possibly apply to guarantee you would get all the data out on a search," says Lynch.

Plugging a more sophisticated Google-style search engine into a police database would not work either, as most internet search engines simply rank the hits by their popularity, as measured by the number of other sites that link to them.

In contrast, IDOL uses statistical techniques to return information and phrases related to, but not necessarily including, the search terms entered. So a search for "drug dealing in supermarket car parks" will generate the expected returns, as well as related hits that mention none of those terms; IDOL knows that Wal-Mart and Sainsbury's are supermarkets, for example, and that cannabis and crack are drugs.
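To make the contrast with keyword search concrete, here is a minimal Python sketch of matching by concept rather than by exact term. It is not Autonomy's method: the `CONCEPTS` table and the `matched_concepts` helper are hypothetical and hand-written for illustration, whereas the article describes IDOL as learning such associations statistically.

```python
# Hypothetical, hand-written concept map. The system described in the
# article is said to learn associations like these from text instead.
CONCEPTS = {
    "drug": {"drug", "cannabis", "crack"},
    "supermarket": {"supermarket", "wal-mart", "sainsbury's"},
    "car park": {"car park", "parking lot"},
}


def matched_concepts(statement, query_concepts):
    """Return the query concepts for which any related term occurs in the statement."""
    text = statement.lower()
    # Naive substring matching: good enough for a sketch, but it would
    # also match "crack" inside "cracked".
    return {
        concept
        for concept in query_concepts
        if any(term in text for term in CONCEPTS[concept])
    }


statements = [
    "He was selling crack in the parking lot behind Sainsbury's.",
    "A dog was barking outside the supermarket.",
]
query = ["drug", "supermarket", "car park"]

for statement in statements:
    hits = matched_concepts(statement, query)
    print(len(hits), sorted(hits), statement)
```

The first statement matches all three concepts even though it contains none of the literal query words, which is the behaviour a plain keyword search cannot provide.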

The system harnesses a probability theory developed by Thomas Bayes, an English cleric and mathematician, in the 18th century. Bayes worked out how to calculate the probabilistic relationships between different variables. In IDOL, this means working out the probability of words appearing near each other in a statement. "Words don't appear randomly in text but in clusters. The meaning of one set of words affects the probability of seeing another set of words nearby. We measure those probabilities," says Lynch.
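The idea of measuring how one word changes the chance of seeing another can be sketched with simple counts. The Python below estimates the conditional probability that a word appears in a statement given that another word does. It is an illustration only: the function name, the toy statements and the choice of "same statement" as the meaning of "nearby" are all assumptions, and nothing here is taken from IDOL itself.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_probabilities(statements):
    """Build an estimator of P(other word in statement | given word in statement)."""
    word_counts = Counter()   # number of statements containing each word
    pair_counts = Counter()   # number of statements containing both words
    for text in statements:
        words = set(text.lower().split())
        word_counts.update(words)
        for a, b in combinations(sorted(words), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1

    def prob(other, given):
        # Relative frequency: statements with both words / statements with `given`.
        if word_counts[given] == 0:
            return 0.0
        return pair_counts[(given, other)] / word_counts[given]

    return prob


# Toy corpus, invented for this example.
statements = [
    "the dog chased a ball",
    "she walked the dog",
    "the dog has brown fur",
    "the car was a black hatchback",
]
prob = cooccurrence_probabilities(statements)

print(prob("ball", "dog"))       # "ball" appears in 1 of the 3 "dog" statements
print(prob("hatchback", "dog"))  # never seen together in this corpus
```

With only four statements the estimates are crude; the point is that they sharpen as more text is counted, in line with the claim that the system improves as it sees more text.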

For instance, the word "dog" has a series of effects on the words that might appear around it: dogs have fur, they get walked, they chase balls. The system learns the prevalence of words in everyday life and other words that are likely to be associated with them: the more text IDOL sees, the smarter it gets. So IDOL will be able to learn that a VW Golf and a hatchback are both associated with the idea of cars. If a witness statement reads: "I bought the stuff from a guy at the Red Lion," IDOL will look for suspicious deals done in pubs and bars. "It'll work out that 'red lion' is not a crimson carnivore," says Lynch. "And it will know a cat from a cat burglar."

It may sound obvious, but this kind of information would otherwise have lain unseen in the police database, simply because it didn't contain the right search terms. "We call it looking for the unknown," says Steve Crowley at software firm Unisys, based in Blue Bell, Pennsylvania, which has incorporated IDOL into its crime investigation system, Holmes2. Investigators can now perform much more powerful searches, linking more people, addresses, cellphone numbers or cars. To further improve the search, IDOL also indicates how certain it is that the results are relevant by assigning a percentage to each return.

What this means is that in the early stage of an enquiry, when there might be little evidence to go on, officers can start looking for common ground in witness statements immediately, in the hope that this will give them at least some leads to follow up. And because the results are weighted, officers can prioritise those that seem the most likely. "By reading and analysing witness statements the software pulls out previously unseen relationships between them," says Lynch.

IDOL is now used by several UK police forces. At South Yorkshire Police, David Rock-Evans, head of information systems, has calculated that, in its first seven months of use, the technology has cut average search times from 15 to 2 minutes (see Graphic). The system is also used by the Australian intelligence service and the US Department of Homeland Security, which plans to make searchable all the data in the 21 domestic security agencies it is bringing under its wing.

Autonomy is now extending the system to telephone calls. It has combined IDOL with speech-recognition software to create a voice-to-text system that translates calls made to the police into searchable statements. That would also be useful for witness statements, says Nick Ross of the Jill Dando Institute of Crime Science in London. "The way the police take statements and then have them typed up elsewhere is incredibly old-fashioned, error-prone and subject to the interpretations of the typist. What's needed is a way for the officer to use speech recognition to record and transcribe a statement and then send it wirelessly." Blackberry-style hand-helds, allowing police patrols access to Bayesian search on the streets, are also being tested, including one trial in Cambridge, UK.

However, not everyone is convinced that such "smart" search engines will be useful. Fabio Corradi at the University of Florence in Italy and Gianluca Baio from University College London point out in a paper soon to be published in Forensic Science International that criminals and their associates could seriously skew a Bayesian-based investigation by supplying fake witness evidence. This would distort search results and throw officers off the scent. "It makes a lot of sense to think about such a possibility. It is important to account for uncertainty in the observations," Baio says.

That's nothing new, Williams says. Police have always had to deal with people providing false information. "People lie in statements already, giving fake names and addresses for instance. And you can skew an investigation with a phone call."

Lynch says the system could be proofed against this by programming it to assume all information is possibly tainted and to flag isolated facts not corroborated by other witnesses.
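One simple reading of that safeguard is to count how many distinct witnesses support each fact and flag those with too few. The sketch below assumes statements have already been reduced to (witness, fact) pairs, which is itself a hard problem; the function name, the threshold and the sample data are hypothetical.

```python
from collections import defaultdict


def flag_uncorroborated(claims, min_witnesses=2):
    """Return facts reported by fewer than `min_witnesses` distinct witnesses."""
    witnesses_by_fact = defaultdict(set)
    for witness, fact in claims:
        witnesses_by_fact[fact].add(witness)
    return sorted(
        fact
        for fact, witnesses in witnesses_by_fact.items()
        if len(witnesses) < min_witnesses
    )


# Invented sample data: (witness, fact) pairs.
claims = [
    ("witness_a", "dark hatchback seen at 22:00"),
    ("witness_b", "dark hatchback seen at 22:00"),
    ("witness_c", "suspect left on foot"),
    ("witness_c", "suspect left on foot"),  # repeated by the same witness
]

print(flag_uncorroborated(claims))
```

Counting distinct witnesses rather than raw mentions means a single source repeating a claim does not make it look corroborated.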

While it won't solve all cases, IDOL should speed investigations by giving officers leads they might otherwise have missed. Once they are on the right track, though, the age-old methods take over, says Lynch. "A hell of a lot of crime is still solved because crooks make stupid mistakes."
