
My research interests are in two major areas of artificial intelligence, viz., multi-agent systems and machine learning. The main research objective in multi-agent systems is to design autonomous, intelligent problem-solving agents (physical robots or virtual/software agents), as well as the frameworks and rules of encounter through which they interact and coordinate in a shared environment for distributed problem solving. In machine learning, the main objective is to enable an intelligent agent to improve its performance at a given task with experience. My doctoral dissertation was at the intersection of these broad areas, where multiple reinforcement learners not only achieve game-theoretically desirable solutions through concurrent exploration of the shared environment, but also offer theoretical guarantees on their performance in different kinds of tasks (viz., collaborative, competitive and others).

My current focus is on understanding the activities of multiple agents from observations, particularly inferring which agents are working in teams, what these teams are trying to accomplish, and how these teams change over time. Besides applications in security, intelligence analysis, opponent modeling, human-computer collaboration and other areas of current interest, this capability is also central to the philosophy of autonomy of an individual agent in a multi-agent environment. For instance, to form an ad-hoc team with other agents that were not designed for compatibility, an agent first needs to understand the status quo from mere observations (possibly incomplete and noisy). I believe this capability will also prove to be a game-changer in multi-agent learning, where agents have hitherto failed to accommodate dynamic team structures among the other agents.

Current Projects


Multi-agent Plan Recognition: We have made fundamental contributions to the theory of multi-agent plan recognition -- inferring dynamic teams and their plans from observations. Here are some papers:

* AAAI-10 paper: Introduces an initial definition and establishes hardness results.

* AAAI-11 paper: Advances the problem definition from the AAAI-10 paper in several ways, and develops and tests an Operations Research based solution technique.

We have also created a data generator (called TraceGen) to help with the evaluation of plan recognition algorithms designed for one or more agents. The code is available publicly here. For other similar resources, check out planrec.org.

This project is partly funded by NASA.




Multi-agent Control Learning: We are developing reinforcement learning algorithms for distributed learning of multi-agent control in cooperative sequential decision tasks, modeled by decentralized partially observable Markov decision processes (Dec-POMDPs). Current approaches mostly solve Dec-POMDPs via centralized, model-based techniques. Our goal is to apply model-free distributed reinforcement learning techniques instead, with a focus on sample complexity and the quality of the policies learned. Here are some papers:

* AAAI-12 paper: Introduces the MCQ-Alt algorithm for sample-bounded distributed RL in Dec-POMDPs.

* AAAI-13 paper: Improves the MCQ-Alt algorithm with pruning for lower sample complexity, while preserving the performance guarantees.

* AAMAS-13 short paper: Introduces a new framework for RL in Dec-POMDPs, rehearsal-based RL, that can greatly reduce sample complexity. Further investigation and evaluation of RL in this framework are under way.

This project is funded by the US Army Research Office.



Past Projects

Crowd Simulation: We developed a crowd simulation system for evacuation simulation in a spectator sports stadium. Here are some of the major research contributions of this work:

* A principled method for the validation of simulated crowd egress behavior, and the formal comparison of competing crowd egress simulation systems. Here is a paper published on this topic. Also check out a demo of the simulation.

* An efficient layered method to simulate varying goal selection, and dynamic obstacle avoidance during evacuation. Here is a paper (and an extended version that appeared in the SIMULATION journal) published on this topic. Also check out the demo.

* An efficient solution to the problem of dynamic congestion rerouting in crowds. Check out the demo.

This work was funded by the Department of Homeland Security (DHS) through the Southeast Region Research Initiative (SERRI) program.


Anomaly Detection: We developed a technique based on sequential Monte Carlo for detecting anomalies in a sensor network. It has been tested and validated on data collected from rocket engine test stands at NASA Stennis Space Center. We have further extended this work to perform prognosis of the diagnosed anomalies, predicting which other sensors might show sufficient error to abort the test automatically, and when. This may prevent costly failures in the future. The latest results were recently presented at the Infotech conference organized by the American Institute of Aeronautics and Astronautics (AIAA). The paper is available here.

This work was funded by NASA, both directly and through a subcontract from the University of Mississippi.




Cloud Computing Security: We have investigated the security of cloud computing environments in the face of distributed SQL injection attacks, and found that existing intrusion detection systems (IDSs) are inadequate. We have applied our work in multi-agent plan recognition to successfully detect such attacks. A paper based on our findings was presented at the SAM-2013 conference.

This work was supported by NASA.

Last Updated on Thursday, 22 August 2013 13:37
Copyright © 2017 Bikramjit Banerjee. All Rights Reserved.