Maltese engineer leads science project to predict Afghan insurgent attacks
A Maltese informatics scientist has developed a model that predicts insurgent attacks in Afghanistan, using a mathematical algorithm based on Wikileaks data.
A Maltese informatics scientist and his team have developed a model that can predict insurgent attacks in Afghanistan, a tool that the United States Pentagon has so far failed to develop accurately.
Andrew Zammit Mangion, 27, an engineering graduate of the University of Malta with a PhD from Sheffield University, led a team of researchers at the University of Edinburgh that developed a mathematical algorithm for the task. Working from some 77,000 military logs dated between 2004 and 2009 and released by Wikileaks, the algorithm accurately predicted which provinces in Afghanistan would experience more violence in 2010, and by how much the level of violence would increase or decrease.
Insurgencies are difficult to predict as they are often loosely organised, split into factions and strike out of nowhere.
"The whole problem boils down to finding which clusters of violence are significant and which are not. If there is some activity in a certain region how do you know whether this is sporadic, what we call noise, or indicative of a changing trend?
"If it is indicative of a changing trend then we assume that the future will follow this trend, if not then we assume that on average things will remain roughly the same as they are now," Zammit Mangion explains.
The researchers sought to find a general pattern in how the violence occurred in Afghanistan from the Wikileaks logs, and from that pattern to predict how violence would change in 2010.
"Given the overall trend and volatility then predictions can be made - but by predictions I do not mean that one can know exactly where and when something will happen, rather one can give very accurate probabilities, and probabilities is a key word - of how many conflict events are likely to happen within a certain region and a certain time span," Zammit Mangion said.
The results, published online on Monday by the Proceedings of the National Academy of Sciences, show that in provinces that were less volatile - where the conflict did not tend to whip back and forth between extremes of war and peace - estimated results were very similar to the actual number of incidents recorded.
In Baghlan province, the researchers predicted a 128% increase in violent incidents, from 100 in 2009 to 228 in 2010. Figures from the Afghanistan NGO Safety Office were remarkably close: 222 incidents were reported, an increase of 122%.
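The percentage changes for Baghlan can be checked directly from the raw incident counts quoted above (100 in 2009, 228 predicted and 222 reported for 2010). The short Python sketch below does just that; the function name `percent_change` is ours, chosen for illustration, and is not taken from the researchers' work.

```python
def percent_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    if before == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return (after - before) / before * 100.0


# Incident counts for Baghlan province as quoted in the article.
incidents_2009 = 100
predicted_2010 = 228
reported_2010 = 222

predicted_rise = percent_change(incidents_2009, predicted_2010)
reported_rise = percent_change(incidents_2009, reported_2010)

print(f"Predicted change: {predicted_rise:.0f}%")
print(f"Reported change:  {reported_rise:.0f}%")
```

Working from the counts rather than from the quoted percentages avoids any rounding introduced along the way.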
The researchers were surprised to find that the model was accurate even in the more peaceful provinces, where there were fewer data points available for analysis, suggesting that the model "isn't attributed to the noise in the data," Zammit Mangion said.
A strong correlation between predicted and recorded figures was found in all 32 provinces of Afghanistan, and even where the real results differed from the predicted results, they still fell within the expected range of outcomes, which makes statistical sense.
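To illustrate what an "expected range of outcomes" means for a count of incidents, here is a minimal Python sketch. It rests on a loud assumption: that the yearly count follows a simple Poisson distribution whose mean is the predicted figure. That is our simplification for illustration only, not the researchers' model, and a fuller model that also carries uncertainty about the trend would typically give a wider range. The function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events under a Poisson distribution."""
    # Computed in log space so large counts do not overflow.
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def poisson_interval(mean: float, coverage: float = 0.95) -> tuple[int, int]:
    """Central range of counts holding at least `coverage` of the probability."""
    tail = (1.0 - coverage) / 2.0
    cumulative = 0.0
    lower = None
    k = 0
    while True:
        cumulative += poisson_pmf(k, mean)
        if lower is None and cumulative >= tail:
            lower = k  # first count at which the lower tail is used up
        if cumulative >= 1.0 - tail:
            return lower, k  # first count at which the upper tail begins
        k += 1


# ASSUMPTION: treat the predicted Baghlan figure (228) as a Poisson mean.
low, high = poisson_interval(228.0)
reported = 222  # figure quoted from the Afghanistan NGO Safety Office
print(f"95% range: {low} to {high}; reported count inside: {low <= reported <= high}")
```

Under this assumption, a reported count that lands inside the range is consistent with the prediction even when the two numbers do not match exactly.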
"The model itself is only a part of the whole approach and can be changed within the whole framework. The plan is obviously to make it more accurate for significant practical use. However we have several ideas on how this may be done - the paper is a first step in the accurate prediction of conflict.
"The primary aim of the work was to provide a method through which one can give an objective account, supplanted with rigorous statistical results, of the current, and near future, situation in Afghanistan," Zammit Mangion said.
Zammit Mangion said that the results from the model will mainly be of use to NGOs that wish to ensure the safety of their employees, as the military more than likely has its own prediction tools.
However, Zammit Mangion also said that the reliability of the model depends on an accurate data set. Relying on data reported in the media is simply not enough to produce reliable predictions.
"The problem with news items is that they are relatively unreliable as a data sample - although they give an idea of what is going on they may be biased, subjective and uneven in coverage. For instance one might find many more reporters in Kabul than in the quiet areas such as Nimroz or Sar-e-Pul.
"The Wikileaks dataset provides a ground-level view of the fighting and is, arguably, a much more detailed account of the war in Afghanistan. It is, in many ways, a reliable source of information," Zammit Mangion said.