Research

My research interests span two major areas of artificial intelligence, viz., multi-agent systems and machine learning. The overarching objective in multi-agent systems is to design autonomous, intelligent problem-solving agents (physical robots or virtual/software agents), as well as the frameworks and rules of encounter that let them interact and coordinate in a shared environment for distributed problem solving. In machine learning, the main objective is to enable an intelligent agent to improve its performance at a given task with experience. My doctoral dissertation was at the intersection of these broad areas: multiple reinforcement learners that not only achieve game-theoretically desirable solutions through concurrent exploration of a shared environment, but also offer theoretical guarantees on their performance in different kinds of tasks (viz., collaborative, competitive, and others).

My current focus is on developing and analyzing new variants of reinforcement learning for agents, particularly robots, to autonomously acquire their controllers for a given collaborative task through repeated interactions. While reinforcement learning is well suited to this objective, it typically has a high sample complexity in multi-agent decision problems. My goal is to reduce this sample complexity.

Current Projects

Reinforcement Learning of Robot Controllers: We have developed a new framework called reinforcement learning as a rehearsal (RLaR) to enable autonomous agents to learn their own control policies in collaborative tasks. Agents are allowed to perceive hidden features in a controlled (lab/simulator) setting, as if in a rehearsal, but must then learn a control policy that does not depend on these hidden features, for execution outside the lab/simulator. Here are some papers:

* Neurocomputing-16 paper: Develops, analyzes, and evaluates RLaR.

* AAMAS-16 paper: Polynomially bounded learning with Monte Carlo exploring starts.


Work is currently underway to leverage this framework for behavior-based robot control.
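
The core idea can be illustrated with a highly simplified, tabular sketch (the toy single-step task, the class name, and the hyperparameters below are my own assumptions for illustration, not the published RLaR algorithm): during rehearsal the learner sees a hidden feature h alongside its own observation o, learning both a Q-function over (o, h) and an empirical belief P(h | o); at execution time, with h unavailable, it acts on the belief-weighted average of Q.

```python
import random
from collections import defaultdict

ACTIONS = [0, 1]

class RehearsalLearner:
    """Tabular sketch: full-state learning in rehearsal, belief-based acting after."""

    def __init__(self, alpha=0.1, gamma=0.95):
        self.q = defaultdict(float)     # Q[(o, h, a)] -- uses the hidden feature h
        self.counts = defaultdict(int)  # co-occurrence counts for estimating P(h|o)
        self.alpha, self.gamma = alpha, gamma

    def rehearse_step(self, o, h, a, r, o2=None, h2=None):
        """One Q-learning update with the hidden feature visible (o2=None: terminal)."""
        self.counts[(o, h)] += 1
        best_next = 0.0 if o2 is None else max(self.q[(o2, h2, b)] for b in ACTIONS)
        self.q[(o, h, a)] += self.alpha * (r + self.gamma * best_next - self.q[(o, h, a)])

    def belief(self, o):
        """Empirical P(h | o) gathered during rehearsal."""
        total = sum(c for (oo, _), c in self.counts.items() if oo == o)
        return {h: c / total for (oo, h), c in self.counts.items() if oo == o}

    def act(self, o):
        """Execution-time policy: h is hidden, so marginalize Q over P(h | o)."""
        b = self.belief(o)
        return max(ACTIONS, key=lambda a: sum(p * self.q[(o, h, a)] for h, p in b.items()))

# Toy single-step task: the observation is a noisy copy of the hidden feature,
# and the reward is 1 exactly when the action matches the hidden feature.
random.seed(0)
learner = RehearsalLearner()
for _ in range(5000):
    h = random.choice([0, 1])
    o = h if random.random() < 0.9 else 1 - h
    a = random.choice(ACTIONS)          # uniform exploration during rehearsal
    learner.rehearse_step(o, h, a, 1.0 if a == h else 0.0)
```

In this toy run the marginalized policy recovers the right action for each observation despite never consulting h at execution time.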


This project is funded by NSF.

Past Projects


Multi-agent Plan Recognition: We have made fundamental contributions to the theory of multi-agent plan recognition, i.e., inferring dynamic teams and their plans from observations. Here are some papers:

* AAAI-10 paper: Introduces an initial definition and establishes hardness results.

* AAAI-11 paper: Advances the problem definition from the AAAI-10 paper in several ways, and develops and tests an Operations Research-based solution technique.

* JAAMAS-15 paper: Establishes the complexity of multi-agent plan recognition and associated problems.

We have also created a data generator (called TraceGen) to help with the evaluation of plan recognition algorithms designed for one or more agents. The code is publicly available here. For other similar resources, check out planrec.org.
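
To make the problem concrete, here is a toy brute-force sketch (the tiny plan library, activity names, and function names are invented for illustration; they are not from the papers or TraceGen): given a library of team plans, find a grouping of agents into teams whose plans jointly explain an observed activity trace. Real instances are hard, which is what motivates the hardness results and Operations Research formulations above; brute force suffices only at this scale.

```python
from itertools import combinations

# Each plan is a sequence of per-step activity tuples; tuple length fixes team size.
LIBRARY = {
    "patrol": [("move", "move"), ("scan", "scan")],   # 2-agent plan
    "fetch":  [("move",), ("grab",)],                 # 1-agent plan
}

def plan_explains(plan_steps, trace_rows, members):
    """Does the plan's step t match what the team members did at time t?"""
    for step, row in zip(plan_steps, trace_rows):
        if sorted(step) != sorted(row[i] for i in members):
            return False
    return True

def recognize(agents, trace):
    """Search all ways to split the agents into teams executing library plans."""
    if not agents:
        return []                         # every agent's behavior is explained
    first, rest = agents[0], agents[1:]
    for k in range(len(rest) + 1):
        for others in combinations(rest, k):
            team = (first,) + others
            remaining = [a for a in rest if a not in others]
            for name, steps in LIBRARY.items():
                if len(steps) == len(trace) and plan_explains(steps, trace, team):
                    tail = recognize(remaining, trace)
                    if tail is not None:
                        return [(team, name)] + tail
    return None                           # no consistent team/plan assignment

# Agents 0 and 1 patrol together while agent 2 fetches on its own.
trace = [
    ("move", "move", "move"),
    ("scan", "scan", "grab"),
]
teams = recognize([0, 1, 2], trace)       # [((0, 1), 'patrol'), ((2,), 'fetch')]
```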

This project was partly funded by NASA.
Multi-agent Control Learning: We are developing reinforcement learning algorithms for the distributed learning of multi-agent control in cooperative sequential decision tasks, modeled as decentralized partially observable Markov decision processes (Dec-POMDPs). Current approaches solve Dec-POMDPs mostly via centralized, model-based techniques. Our goal is to apply model-free, distributed reinforcement learning techniques instead, with a focus on sample complexity and the quality of the learned policies. Here are some papers:

* AAAI-12 paper: Introduces the MCQ-Alt algorithm for sample-bounded distributed RL in Dec-POMDPs.

* AAAI-13 paper: Improves the MCQ-Alt algorithm with pruning for lower sample complexity, while preserving its performance guarantees.

* AAMAS-13 short paper: Introduces the reinforcement learning as a rehearsal (RLaR) framework.
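
As a minimal illustration of the model-free, distributed flavor (the stateless coordination game, class name, and hyperparameters below are illustrative assumptions; MCQ-Alt itself operates on observation histories in full Dec-POMDPs), two independent Q-learners can coordinate using only their own actions and a shared team reward, never observing each other's choices:

```python
import random

class IndependentQ:
    """One agent's Q-learner over its own actions only."""

    def __init__(self, n_actions, alpha=0.1, epsilon=0.1):
        self.q = [0.0] * n_actions
        self.alpha, self.epsilon = alpha, epsilon

    def act(self):
        if random.random() < self.epsilon:            # epsilon-greedy exploration
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, a, r):
        self.q[a] += self.alpha * (r - self.q[a])     # track the mean team reward

random.seed(1)
agents = [IndependentQ(2), IndependentQ(2)]
for _ in range(3000):
    a0, a1 = agents[0].act(), agents[1].act()
    r = 1.0 if a0 == a1 else 0.0                      # shared reward: coordinate to score
    agents[0].update(a0, r)
    agents[1].update(a1, r)

greedy = [max(range(2), key=lambda a: ag.q[a]) for ag in agents]
# The two learners settle on matching greedy actions.
```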

This project was funded by the US Army Research Office.


Crowd Simulation: We developed a system for simulating crowd evacuation of a spectator sports stadium. Here are some of the major research contributions of this work:

* A principled method for validating simulated crowd egress behavior and formally comparing competing crowd egress simulation systems. Here is a paper published on this topic. Also check out a demo of the simulation.

* An efficient layered method to simulate varying goal selection and dynamic obstacle avoidance during evacuation. Here is a paper (and an extended version that appeared in the SIMULATION journal) published on this topic. Also check out the demo.

* An efficient solution to the problem of dynamic congestion rerouting in crowds. Check out the demo.
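
The congestion-rerouting idea can be sketched schematically (the graph, cost model, and names below are my own illustration, not the system's implementation): route evacuees with Dijkstra's algorithm over a graph whose traversal costs grow with current occupancy, so crowded corridors become expensive and later evacuees divert around them.

```python
import heapq

def shortest_path(adj, occupancy, start, goal, congestion_weight=1.0):
    """Dijkstra where entering node v costs 1 + congestion_weight * occupancy[v]."""
    dist, prev = {start: 0.0}, {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue                      # stale queue entry
        for v in adj[u]:
            nd = d + 1.0 + congestion_weight * occupancy.get(v, 0)
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [goal], goal             # walk predecessors back to the start
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]

adj = {"lobby": ["hall", "stairs"], "hall": ["exit"], "stairs": ["exit"], "exit": []}
occupancy = {"hall": 5}                   # the hall is currently congested
path = shortest_path(adj, occupancy, "lobby", "exit")
# The route avoids the congested hall: ['lobby', 'stairs', 'exit']
```

Recomputing routes as occupancy changes is what makes the rerouting dynamic: the same call with an updated occupancy map yields a different path.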

This work was funded by the Department of Homeland Security (DHS) through the Southeast Region Research Initiative (SERRI) program.


Anomaly Detection: We developed a technique based on sequential Monte Carlo methods for detecting anomalies in a sensor network. It has been tested and validated on data collected from rocket engine test stands at NASA Stennis Space Center. We have further extended this work to perform prognosis on the diagnosed anomalies: predicting which other sensors might show enough error to automatically abort a test, and when. This may prevent costly failures in the future. The latest results were recently presented at the Infotech conference organized by the American Institute of Aeronautics and Astronautics (AIAA). The paper is available here.
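
The flavor of the approach can be sketched with a bare-bones particle filter (the random-walk model, noise levels, and threshold are invented for this sketch, not taken from the NASA work): the filter tracks the sensor's latent value, and a reading is flagged as anomalous when its likelihood under the particle cloud collapses.

```python
import math
import random

def step_filter(particles, obs, process_noise=0.1, obs_noise=0.2):
    """One predict/weight/resample step; returns (particles, avg observation likelihood)."""
    # Predict: propagate each particle through a random-walk process model.
    moved = [p + random.gauss(0, process_noise) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-((obs - p) ** 2) / (2 * obs_noise ** 2)) for p in moved]
    total = sum(weights)
    if total == 0:                        # observation far outside the cloud
        return moved, 0.0
    # Resample in proportion to weight (multinomial resampling kept simple).
    particles = random.choices(moved, weights=weights, k=len(moved))
    return particles, total / len(moved)

random.seed(0)
particles = [random.gauss(0, 0.5) for _ in range(500)]
readings = [0.0, 0.05, -0.1, 0.02, 4.0]  # the last reading jumps anomalously
flags = []
for obs in readings:
    particles, lik = step_filter(particles, obs)
    flags.append(lik < 1e-3)             # threshold is an arbitrary choice here
```

Only the final, jumped reading falls far enough outside the tracked distribution to be flagged.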

This work was funded by NASA, both directly and through a subcontract from the University of Mississippi.

Cloud Computing Security: We have investigated the security of cloud computing environments in the face of distributed SQL injection attacks, and found that existing intrusion detection systems (IDSs) are inadequate. We have applied our work on multi-agent plan recognition to successfully detect such attacks. A paper based on our findings was presented at the SAM-2013 conference.

This work was supported by NASA.

Last Updated on Thursday, 06 July 2017 09:42

Copyright © 2017 Bikramjit Banerjee. All Rights Reserved.