Goal-Oriented Action Planning (GOAP)

Most of the AI we met on the Game AI page is reactive: steering behaviours respond to what is around the agent right now, and pathfinding answers "how do I get there?". Goal-Oriented Action Planning answers a more interesting question: "what should I do next, and in what order?" Instead of hand-authoring every behaviour sequence, you give the agent a set of actions and a goal, and let a planner work out the sequence for itself [1].

From Reactions to Intentions

Finite state machines and behaviour trees are wonderful until the design grows. Every new behaviour means new transitions or new branches, and the combinations multiply: a guard who can fight with a rifle, a pistol or fists, take cover, call for help and retreat needs a lot of hand-wired logic. GOAP, introduced to games by Jeff Orkin for F.E.A.R. (2005), flips the problem around [1] [2]:

  • Actions declare themselves — each action states what must be true before it can run (preconditions) and what becomes true afterwards (effects).
  • Goals are just desired facts — "the threat is removed", "I am safe", "the alarm is raised".
  • A search algorithm finds the sequence — the same A* family you already use for pathfinding, but searching through world states instead of nav polygons.

The famous result: F.E.A.R.'s soldiers ran on a state machine with only three states (goto, animate, use smart object), because the interesting decisions had moved into the planner [1].

The Core Idea

A GOAP system needs four ingredients:

  • World state — a set of symbols the agent believes to be true or false: armed, has_ammo, in_range.
  • Goal — the subset of symbols we want to make true.
  • Actions — operators with preconditions, effects and a cost. Cost is your tuning dial: raising the cost of FindAmmo makes melee more attractive.
  • Planner — a search from the current state towards the goal (or, in mature implementations, backwards from the goal).
graph LR S["weapon_holstered
enemy_visible"] -->|DrawWeapon| A["armed"] A -->|FindAmmo| B["armed
has_ammo"] B -->|LoadWeapon| C["loaded"] C -->|Approach| D["in_range"] D -->|Attack| G(("Goal:
threat_removed")) style G fill:#2ECC71,color:#fff

A Worked Example

Give a guard five actions and the goal threat_removed:

ActionPreconditionsEffectsCost
DrawWeaponweapon_holsteredarmed, ¬weapon_holstered1
FindAmmohas_ammo3
LoadWeaponarmed, has_ammoloaded2
Approachenemy_visiblein_range2
Attackarmed, loaded, in_rangethreat_removed1

No sequence is authored anywhere, yet the planner discovers DrawWeapon → Approach → FindAmmo → LoadWeapon → Attack. Delete FindAmmo from the action set (the level has no ammo pickups) and the same code simply finds a different plan — or reports that no plan exists, which is your cue to fall back to a flee goal.

A Minimal Planner

The heart of GOAP is small. This is a forward uniform-cost search — production planners add a heuristic (making it A*), regressive search from the goal, and per-agent action sets:

import heapq
from itertools import count

class Action:
    def __init__(self, name, cost, preconditions, effects):
        self.name = name
        self.cost = cost
        self.preconditions = preconditions  # dict of symbol -> bool
        self.effects = effects              # dict of symbol -> bool

    def applicable(self, state):
        return all(state.get(k, False) == v for k, v in self.preconditions.items())

    def apply(self, state):
        new_state = dict(state)
        new_state.update(self.effects)
        return new_state

def plan(start, goal, actions):
    """Cheapest action sequence that satisfies the goal (uniform-cost search)."""
    tie = count()  # tie-breaker so dict states never get compared
    frontier = [(0.0, next(tie), start, [])]
    seen = set()
    while frontier:
        cost, _, state, path = heapq.heappop(frontier)
        if all(state.get(k, False) == v for k, v in goal.items()):
            return path
        key = frozenset(state.items())
        if key in seen:
            continue
        seen.add(key)
        for action in actions:
            if action.applicable(state):
                heapq.heappush(frontier, (cost + action.cost, next(tie),
                                          action.apply(state), path + [action.name]))
    return None  # No plan satisfies the goal

actions = [
    Action("DrawWeapon", 1, {"weapon_holstered": True},
                            {"weapon_holstered": False, "armed": True}),
    Action("FindAmmo",   3, {}, {"has_ammo": True}),
    Action("LoadWeapon", 2, {"armed": True, "has_ammo": True}, {"loaded": True}),
    Action("Approach",   2, {"enemy_visible": True}, {"in_range": True}),
    Action("Attack",     1, {"armed": True, "loaded": True, "in_range": True},
                            {"threat_removed": True}),
]

start = {"weapon_holstered": True, "enemy_visible": True}
goal = {"threat_removed": True}
print(plan(start, goal, actions))
# ['DrawWeapon', 'Approach', 'FindAmmo', 'LoadWeapon', 'Attack']
#include <algorithm>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

using WorldState = std::map<std::string, bool>;

static bool satisfies(const WorldState& state, const WorldState& conditions) {
    return std::all_of(conditions.begin(), conditions.end(), [&](const auto& kv) {
        auto it = state.find(kv.first);
        return (it != state.end() ? it->second : false) == kv.second;
    });
}

struct Action {
    std::string name;
    float cost;
    WorldState preconditions;
    WorldState effects;

    bool applicable(const WorldState& state) const {
        return satisfies(state, preconditions);
    }

    WorldState apply(WorldState state) const {
        for (const auto& [k, v] : effects) state[k] = v;
        return state;
    }
};

struct PlanNode {
    float cost;
    WorldState state;
    std::vector<std::string> path;
    bool operator>(const PlanNode& o) const { return cost > o.cost; }
};

// Cheapest action sequence that satisfies the goal (uniform-cost search)
std::vector<std::string> plan(const WorldState& start, const WorldState& goal,
                              const std::vector<Action>& actions) {
    std::priority_queue<PlanNode, std::vector<PlanNode>, std::greater<>> frontier;
    std::set<WorldState> seen;
    frontier.push({0, start, {}});

    while (!frontier.empty()) {
        PlanNode current = frontier.top();
        frontier.pop();
        if (satisfies(current.state, goal)) return current.path;
        if (!seen.insert(current.state).second) continue;

        for (const Action& action : actions) {
            if (!action.applicable(current.state)) continue;
            auto path = current.path;
            path.push_back(action.name);
            frontier.push({current.cost + action.cost,
                           action.apply(current.state), std::move(path)});
        }
    }
    return {};  // No plan satisfies the goal
}
import java.util.*;

public record Action(String name, double cost,
                     Map<String, Boolean> preconditions,
                     Map<String, Boolean> effects) {

    public boolean applicable(Map<String, Boolean> state) {
        return preconditions.entrySet().stream()
            .allMatch(e -> state.getOrDefault(e.getKey(), false).equals(e.getValue()));
    }

    public Map<String, Boolean> apply(Map<String, Boolean> state) {
        var next = new HashMap<>(state);
        next.putAll(effects);
        return next;
    }
}

public class Planner {
    private record Node(double cost, Map<String, Boolean> state, List<String> path) {}

    /** Cheapest action sequence that satisfies the goal (uniform-cost search). */
    public static List<String> plan(Map<String, Boolean> start, Map<String, Boolean> goal,
                                    List<Action> actions) {
        var frontier = new PriorityQueue<Node>(Comparator.comparingDouble(Node::cost));
        var seen = new HashSet<Map<String, Boolean>>();
        frontier.add(new Node(0, start, List.of()));

        while (!frontier.isEmpty()) {
            Node current = frontier.poll();
            boolean satisfied = goal.entrySet().stream()
                .allMatch(e -> current.state().getOrDefault(e.getKey(), false)
                                              .equals(e.getValue()));
            if (satisfied) return current.path();
            if (!seen.add(current.state())) continue;

            for (Action action : actions) {
                if (!action.applicable(current.state())) continue;
                var path = new ArrayList<>(current.path());
                path.add(action.name());
                frontier.add(new Node(current.cost() + action.cost(),
                                      action.apply(current.state()), path));
            }
        }
        return null;  // No plan satisfies the goal
    }
}
using System.Collections.Generic;
using System.Linq;

public record GoapAction(string Name, float Cost,
                         Dictionary<string, bool> Preconditions,
                         Dictionary<string, bool> Effects)
{
    public bool Applicable(IReadOnlyDictionary<string, bool> state) =>
        Preconditions.All(kv => state.GetValueOrDefault(kv.Key) == kv.Value);

    public Dictionary<string, bool> Apply(IReadOnlyDictionary<string, bool> state)
    {
        var next = new Dictionary<string, bool>(state);
        foreach (var (k, v) in Effects) next[k] = v;
        return next;
    }
}

public static class Planner
{
    private sealed record Node(float Cost, Dictionary<string, bool> State, List<string> Path);

    // Cheapest action sequence that satisfies the goal (uniform-cost search)
    public static List<string>? Plan(Dictionary<string, bool> start,
                                     Dictionary<string, bool> goal,
                                     IReadOnlyList<GoapAction> actions)
    {
        var frontier = new PriorityQueue<Node, float>();
        var seen = new HashSet<string>();
        frontier.Enqueue(new Node(0, start, new List<string>()), 0);

        while (frontier.Count > 0)
        {
            var current = frontier.Dequeue();
            if (goal.All(kv => current.State.GetValueOrDefault(kv.Key) == kv.Value))
                return current.Path;

            var key = string.Join(",", current.State.OrderBy(kv => kv.Key));
            if (!seen.Add(key)) continue;

            foreach (var action in actions)
            {
                if (!action.Applicable(current.State)) continue;
                var path = new List<string>(current.Path) { action.Name };
                frontier.Enqueue(new Node(current.Cost + action.Cost,
                                          action.Apply(current.State), path),
                                 current.Cost + action.Cost);
            }
        }
        return null;  // No plan satisfies the goal
    }
}
Action = Struct.new(:name, :cost, :preconditions, :effects) do
  def applicable?(state)
    preconditions.all? { |k, v| state.fetch(k, false) == v }
  end

  def apply(state)
    state.merge(effects)
  end
end

# Cheapest action sequence that satisfies the goal (uniform-cost search)
def plan(start, goal, actions)
  frontier = [[0.0, start, []]]
  seen = []
  until frontier.empty?
    frontier.sort_by!(&:first)
    cost, state, path = frontier.shift
    return path if goal.all? { |k, v| state.fetch(k, false) == v }
    next if seen.include?(state)

    seen << state
    actions.each do |action|
      next unless action.applicable?(state)

      frontier << [cost + action.cost, action.apply(state), path + [action.name]]
    end
  end
  nil # No plan satisfies the goal
end

actions = [
  Action.new("DrawWeapon", 1, { weapon_holstered: true },
             { weapon_holstered: false, armed: true }),
  Action.new("FindAmmo",   3, {}, { has_ammo: true }),
  Action.new("LoadWeapon", 2, { armed: true, has_ammo: true }, { loaded: true }),
  Action.new("Approach",   2, { enemy_visible: true }, { in_range: true }),
  Action.new("Attack",     1, { armed: true, loaded: true, in_range: true },
             { threat_removed: true })
]

start = { weapon_holstered: true, enemy_visible: true }
goal = { threat_removed: true }
p plan(start, goal, actions)
# => ["DrawWeapon", "Approach", "FindAmmo", "LoadWeapon", "Attack"]

Making It Production-Ready

  • Replan on surprise — plans are cheap. When a precondition breaks mid-plan (the ammo crate is empty), throw the plan away and search again from the new state.
  • Sensors feed the world state — perception systems write symbols (enemy_visible), the planner only reads them. This separation keeps both sides testable.
  • Costs are behaviour — a cowardly enemy is one whose TakeCover is cheap; an aggressive one discounts Attack. Same actions, different personalities.
  • Prefer regressive search — searching backwards from the goal usually visits far fewer states, because most of the world is irrelevant to any one goal [2].
  • Watch the state space — boolean symbols keep the search tractable; if you need numeric state (health, distance), quantise it or handle it in preconditions.

Planning Beyond Games

GOAP is a games-friendly repackaging of STRIPS-style planning from 1970s AI research [3] — a reminder that good ideas travel between fields. And the travel needn't stop at the games border: a project plan is also "a sequence of actions, each with preconditions, effects and costs, searching for a goal state". We explore that thought — using GOAP to plan software projects, and the wider habit of borrowing ideas across domains — on Cross-Pollination of Ideas in the Software Engineering track.

References

  1. Orkin, J. (2006). "Three States and a Plan: The A.I. of F.E.A.R." Game Developers Conference 2006. https://www.gamedevs.org/uploads/three-states-plan-ai-of-fear.pdf
  2. Millington, I. (2019). AI for Games (3rd ed.). CRC Press. Chapter 5: Decision Making. https://www.routledge.com/AI-for-Games/Millington/p/book/9781138483972
  3. Fikes, R.E. & Nilsson, N.J. (1971). "STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving." Artificial Intelligence, 2(3–4), 189–208. https://doi.org/10.1016/0004-3702(71)90010-5