As big data gained more widespread attention in the mid-2000s, Naren Ramakrishnan and his research team moved to the forefront of innovation with an automated system that ran continuously, 24/7, and used open source indicators to forecast population-scale events across countries in Latin America. They named it EMBERS, an acronym for Early Model-based Event Recognition Using Surrogates.
These events included disease outbreaks, civil unrest, elections, and domestic political crises – essentially events with precursor signals that might manifest in mass media.
In 2014, Ramakrishnan and his team presented their results on one type of event, civil unrest, in “‘Beating the News’ with EMBERS: Forecasting Civil Unrest Using Open Source Indicators” at the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery in Data Conference. The paper explained how the researchers had designed, implemented, and evaluated EMBERS using open source indicators such as tweets, news sources, blogs, economic indicators, and other data sources.
At that time, EMBERS, sponsored by a contract from the Intelligence Advanced Research Projects Activity Open Source Indicators Program, had successfully forecast events such as the Brazilian Spring in 2013, hantavirus outbreaks in Argentina and Chile in 2013, and student-led protests in Venezuela in 2014.
EMBERS subsequently predicted protests stemming from the kidnappings and killings of student-teachers in Mexico in 2014. The following year, it forecast domestic political crises in Bahrain and Egypt, as well as protests in Paraguay against a new public-private partnership law.
Earlier this month, Ramakrishnan, the Thomas L. Phillips Professor of Engineering and director of the Sanghani Center for Artificial Intelligence and Data Analytics, and Chang-Tien Lu, professor of computer science and the center’s associate director, attended the conference to accept its Applied Data Science Test of Time Award on behalf of the team of 30 data scientists from across academia and industry who co-authored the 2014 paper. The award recognizes research published a decade earlier that has had a lasting impact on knowledge discovery and data mining.
“We thank the award committee for recognizing our work and are honored to accept the award on behalf of all who were involved in this project,” said Ramakrishnan.
Of the co-authors, Ramakrishnan, Lu, and research associate Nathan Self have remained at Virginia Tech and continue work on event forecasting. Today, Ramakrishnan and Lu are both faculty members with the university’s new Institute for Advanced Computing in Alexandria.
Over the past 10 years, EMBERS has expanded its application to forecast significant societal events such as mass migrations, cyberattacks, and traffic incidents, and more than 80 people have contributed to the project.
“Designing EMBERS was about how to transform raw data into actionable knowledge or intelligence at scale,” Ramakrishnan said. “There isn’t one specific, magic algorithm or strategy in EMBERS. Instead, data is ingested at scale, moves through the system and is transduced by a range of algorithms which are trained to map specific precursors of events into ‘alerts.’ These initial alerts are then fused to generate a final set of forecasts. Once a forecast is generated it can be traced back to the raw data via ‘audit trails,’ which serve as an explanation for the forecast.”
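The flow Ramakrishnan describes (ingest raw items, let several models turn precursors into alerts, fuse the alerts into forecasts, and keep an audit trail back to the raw data) can be illustrated with a small sketch. This is not the EMBERS code: the class names, the keyword-matching "models," and the noisy-OR fusion rule are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RawItem:
    """One ingested record (hypothetical schema)."""
    item_id: str
    source: str   # e.g. "twitter", "news", "blog"
    city: str     # assumed to be geocoded upstream
    text: str


@dataclass(frozen=True)
class Alert:
    """Output of a single model, with pointers to its evidence."""
    model: str
    city: str
    score: float               # model confidence in [0, 1]
    evidence: tuple[str, ...]  # ids of the raw items that triggered it


@dataclass(frozen=True)
class Forecast:
    """Fused result; audit_trail lists the raw items behind it."""
    city: str
    probability: float
    audit_trail: tuple[str, ...]


Model = Callable[[list[RawItem]], list[Alert]]


def keyword_model(name: str, keyword: str, score: float) -> Model:
    """Toy stand-in for a trained model: one alert per item mentioning keyword."""
    def run(items: list[RawItem]) -> list[Alert]:
        return [
            Alert(name, item.city, score, (item.item_id,))
            for item in items
            if keyword in item.text.lower()
        ]
    return run


def fuse(alerts: list[Alert]) -> list[Forecast]:
    """Group alerts by city and combine scores with a noisy-OR (an assumption)."""
    by_city: dict[str, list[Alert]] = {}
    for alert in alerts:
        by_city.setdefault(alert.city, []).append(alert)

    forecasts = []
    for city, group in sorted(by_city.items()):
        miss = 1.0  # probability that every alert in the group is wrong
        for alert in group:
            miss *= 1.0 - alert.score
        trail = tuple(sorted({eid for a in group for eid in a.evidence}))
        forecasts.append(Forecast(city, 1.0 - miss, trail))
    return forecasts


def run_pipeline(items: list[RawItem], models: list[Model]) -> list[Forecast]:
    """Run every model over the ingested items, then fuse the alerts."""
    alerts = [alert for model in models for alert in model(items)]
    return fuse(alerts)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    items = [
        RawItem("t1", "twitter", "Caracas", "Students plan a protest on Friday"),
        RawItem("n1", "news", "Caracas", "Unions announce a strike and protest march"),
        RawItem("b1", "blog", "Lima", "Weekend recipe roundup"),
    ]
    models = [
        keyword_model("protest-kw", "protest", 0.5),
        keyword_model("strike-kw", "strike", 0.4),
    ]
    for forecast in run_pipeline(items, models):
        print(forecast)
```

Because each alert carries the ids of the items that produced it, the fused forecast can always be traced back to its inputs, which is the role the quote assigns to audit trails.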
In its heyday, EMBERS ingested half a terabyte of information a day per world region and distilled it into 45 to 50 forecasts per day across multiple countries of Latin America and the Middle East. Forecasts are highly structured, capturing when an event, like a protest, is forecast to happen; where, at city-level granularity; which subgroup of the population is associated; why they will be protesting; and a probability associated with the forecast.
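One way to picture such a structured forecast is as a small record with one field per question: when, where, who, why, and how likely. The sketch below is an assumption about how that might look, not the actual EMBERS schema; the field names and the sample values are invented for illustration.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class ProtestForecast:
    """Hypothetical record mirroring the fields described in the article."""
    event_date: date    # when the event is forecast to happen
    country: str        # where (country)
    city: str           # where (city-level granularity)
    population: str     # which subgroup of the population is associated
    reason: str         # why they will be protesting
    probability: float  # confidence attached to the forecast

    def __post_init__(self) -> None:
        # Reject values that cannot be probabilities.
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")

    def to_json(self) -> str:
        """Serialize to JSON, writing the date in ISO 8601 form."""
        record = asdict(self)
        record["event_date"] = self.event_date.isoformat()
        return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    # Invented values, for illustration only.
    forecast = ProtestForecast(
        event_date=date(2014, 1, 1),
        country="Venezuela",
        city="Caracas",
        population="students",
        reason="government policies",
        probability=0.7,
    )
    print(forecast.to_json())
```

Keeping every forecast in a fixed shape like this is what makes it possible to score forecasts field by field against events that later occur.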
With such detailed information, the EMBERS forecasting system contributes to anticipatory intelligence.
“It was among the first to demonstrate the potential of open-source indicators for intelligence, preceding many similar services,” said Lu. “It also pioneered population-scale social media analytics. Today, numerous companies offer services to gather and process cultural data and human intelligence at scale.”
Ramakrishnan and Lu were joined at the conference by former students who worked on the project while earning their Ph.D.s:
- Rupinder Paul Khandpur, now at Meta
- Feng Chen, now at the University of Texas at Dallas
- Parang Saraf, now at Apple
- Liang Zhao, now at Emory University
- Ting Hua, now at the University of Notre Dame