Collaborative Deep Reinforcement Learning for Joint Object Search (CVPR 2017)

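The paper proposes a collaborative multi-agent deep RL algorithm that learns an optimal policy for joint object localization. The proposal follows the existing RL framework but allows multiple agents to collaborate. Two open problems remain in this area:
　　1. how to make communication between different agents effective;
　　2. how to jointly learn good policies for all agents.

　　The paper proposes learning inter-agent communication through gated cross connections between the Q-networks.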

Method:

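Multiple agents jointly search for different objects.
The message channel between agents is implemented by exchanging network layers; creating a new virtual agent makes training relatively flexible. The communication channel is controlled by gated cross connections, which choose between the action from the agent's own Q-network and the action from the Q-network of the virtual agent that interacts with the other object.
For the experiments, the authors built subsets of the dataset containing related objects. These verify that relationships exist between certain objects and that the method is faster than the single-agent approach. Not all of the results are strong, however.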
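
To make the idea of a gated cross connection concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the single linear Q-layer, the sigmoid gate on a scalar logit, and the names `gated_q_values` and `greedy_action` are assumptions chosen for illustration only. It shows how a gate can scale the message received from the other agent before that message influences an agent's Q-values.

```python
import math


def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, inputs):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def gated_q_values(own_features, other_message, gate_logit, weights):
    """Q-values of one agent from its own features plus a gated message.

    Hypothetical layout: the other agent's message is scaled by
    sigmoid(gate_logit) and concatenated with the agent's own features,
    then passed through one linear layer (one row of weights per action).
    A gate near 0 means the agent effectively ignores its partner.
    """
    gate = sigmoid(gate_logit)
    gated_message = [gate * m for m in other_message]
    return linear(weights, own_features + gated_message)


def greedy_action(q_values):
    """Index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Toy numbers (assumed): 2 actions, 2 own features, 1 message value.
weights = [[1.0, 0.0, 0.5],
           [0.0, 1.0, -0.5]]
own = [1.0, 2.0]
message = [4.0]

q_closed = gated_q_values(own, message, gate_logit=-100.0, weights=weights)
q_open = gated_q_values(own, message, gate_logit=100.0, weights=weights)
print("gate closed:", q_closed, "-> action", greedy_action(q_closed))
print("gate open:  ", q_open, "-> action", greedy_action(q_open))
```

In this toy setup the preferred action changes once the gate opens, which is the effect a learnable gate is meant to make available: the agent uses the partner's message only when doing so helps.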

Proposed contributions:

　　1. The first collaborative deep RL algorithm in the field of object detection;
　　2. A novel multi-agent Q-learning solution that facilitates learnable inter-agent communication with gated cross connections between the Q-networks;
　　3. The method effectively explores useful contextual information between related objects and further improves detection performance.

Multiple agents moving