|
A Combined Linear/Integer Programming Heuristic to Control a Network of semi-Markov Decision Processes
March 2010
David W. J. Stein, The MITRE Corporation
Steven F. Baker, The MITRE Corporation
ABSTRACT
For even moderately sized networks and a small number of actions finding an optimal policy, i.e., determining which actions to apply to which
nodes of the network given the states of the nodes, may be computationally
infeasible. Heuristics have been developed that are applicable to the case of
multiple homogeneous actions such that each action affects exactly one entity.
For many applications, e.g., remote sensing with localization and identification sensors, multiple disparate actions may be available, an action may aect
multiple entities, and actions may require different amounts of time to complete. The present work develops a two-step heuristic to generate action plans
under these more general circumstances. Linear programming applied either
to the individual nodes of the network or to all nodes collectively but with
an average aggregate utilization constraint is used to obtain optimal policies
and reduced cost coefficients for each node. Integer programming, using the
reduced cost coefficients, is then used at each time-step to assign resources
to projects. These methodologies are applied to a simulated remote sensing
problem, and the performance of the current methods is compared with an
optimal solution where its computation is feasible and with a greedy solution.
Results show minimal degradation in comparison with an optimal policy and
substantial improvement over the greedy approach. Computation time studies
show that the method is practical for large scale real time applications.

Additional Search Keywords
Markov decision process, dynamic programming, resource allocation, stochastic scheduling
|