-
This seems closely related to the process interaction formalisms for discrete event simulation. See e.g. Jacobs & Verbraeck (2004).
-
@projectmesa/maintainers, I would love to hear everybody's thoughts! @harshmahesheka, you spent quite a bit of time on Reinforcement Learning with Mesa. Would you view this as a useful construct to build learning models on?
-
Tasks turned out to not really be useful on their own. Tasks become really powerful when an Agent can initiate (schedule) them based on its current state (and maybe based on requests from other agents). Therefore, you need a few additional components.
Ideally you want this modular and extendable, so that you can use RL to tune loss functions and/or decision logic. Scope creep is hitting hard. Going to spend some more brain cycles on how to make this minimal. If anyone knows of something that already does this, recommendations are highly welcome (again @harshmahesheka).
-
While this might be useful to Mesa, I wonder if this would be best external to Mesa, to start as an add-on? It conceptually structures the tasks around the current time construct, where a task takes X units of time to complete, with all sorts of different ways to structure that.
-
Introduction
ABM started quite simply, with Agents doing (or not doing) things in a fixed timestep. Then we evolved it to allow agents to do things at any time, first with the DiscreteEventScheduler and then even better with full support for discrete event scheduling. However, Mesa doesn't really have a concept of a time duration to do something. You can emulate it by scheduling the next task for an agent some duration in advance, but this has several limitations.
To solve all of this, I would like to propose the Task.
The Task
A Task is an object that represents a specific activity an Agent is performing over a period of time, with a defined duration, priority, and completion state. It encapsulates all the information needed to track what an Agent is doing, when they started, when they'll finish, what happens upon completion, and how it can be interrupted.
In my vision, it has a few handy properties, including a duration, a priority, a completion state, and hooks for completion and interruption; a rough sketch follows below.
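To make this concrete, here is a minimal sketch of what such an object could look like. All names (`Task`, `TaskState`, `advance`, `interrupt`) are hypothetical illustrations, not existing Mesa API:

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Optional


class TaskState(Enum):
    ACTIVE = auto()
    INTERRUPTED = auto()
    COMPLETED = auto()


@dataclass
class Task:
    name: str
    duration: float                      # time units needed to complete
    priority: int = 0                    # higher values win when tasks compete
    elapsed: float = 0.0                 # progress so far
    state: TaskState = TaskState.ACTIVE
    on_complete: Optional[Callable[[], None]] = None

    def advance(self, dt: float) -> None:
        """Progress the task by dt time units; fire the callback when done."""
        if self.state is not TaskState.ACTIVE:
            return
        self.elapsed += dt
        if self.elapsed >= self.duration:
            self.state = TaskState.COMPLETED
            if self.on_complete is not None:
                self.on_complete()

    def interrupt(self) -> None:
        """Pause the task, keeping its progress so it can be resumed later."""
        if self.state is TaskState.ACTIVE:
            self.state = TaskState.INTERRUPTED

    def resume(self) -> None:
        if self.state is TaskState.INTERRUPTED:
            self.state = TaskState.ACTIVE
```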
Examples
Let's talk through a few examples.
Wolf-Sheep
Imagine you are a sheep. Most of the day you're searching for vegetation and eating it. This could be modelled as a relatively long task; you might do this for hours a day. However, you're also always spending some attention on spotting wolves. Once you spot one, you break off your food-search task and start a high-priority fleeing task.
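A rough sketch of that interruption logic, reusing the hypothetical `Task` class from above; `spot_wolf()` and all numbers are made up for illustration:

```python
import random


def spot_wolf() -> bool:
    # Stand-in for a real perception check against nearby wolves.
    return random.random() < 0.05


def sheep_step(current: Task) -> Task:
    """One tick of sheep behaviour: fleeing pre-empts grazing."""
    if spot_wolf():
        flee = Task("flee", duration=3, priority=10)
        if flee.priority > current.priority:
            current.interrupt()  # grazing progress is kept for later
            return flee
    current.advance(dt=1)
    return current


grazing = Task("graze", duration=240, priority=1)  # hours-long, low priority
```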
Ant Colony
Consider ants building a new nest chamber. Individual ants perform a "Dig Soil" task with progress determined by soil hardness. When they've gathered enough soil, they start a "Transport Material" task with duration based on distance and load. However, if they encounter a colony member with a stronger urgent pheromone signal, they can interrupt their current task to assist - perhaps there's a cave-in that requires immediate attention. The reward function for construction tasks is non-linear: partial nest chambers provide little benefit, but completing rooms gives large colony-wide bonuses. This creates emergent cooperation where ants dynamically adjust their tasks based on colony needs and local information.
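The non-linear reward mentioned here could be as simple as a step over task progress; a small sketch with arbitrary numbers:

```python
def chamber_reward(progress: float) -> float:
    """Partial chambers are nearly worthless; completed ones pay off big."""
    if progress >= 1.0:
        return 100.0          # full chamber: large colony-wide bonus
    return 2.0 * progress     # partial work: marginal benefit
```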
Medieval Market
Picture merchants in a medieval marketplace. Each has a primary "Sell Goods" task that's continuously active but with variable reward rates based on time of day and location. They can temporarily interrupt selling to perform a "Haggle" task when high-value customers appear - this has an uncertain duration and could result in either high rewards or wasted time. Meanwhile, they're also managing a "Watch for Thieves" task that runs in parallel at low efficiency. If they spot suspicious behavior, they can initiate an immediate "Protect Goods" task that prevents theft but halts all trading. The priority of these tasks shifts based on crowd density, reported thefts, and remaining inventory - creating dynamic market behaviors.
Desire-based modelling
One of the things that this might allow is what I call desire-based modelling. Agents have internal desires they want to fulfil, and they have states tracking how much these desires are fulfilled. These desires can decline based on time, the environment or the tasks involved. For example, agents might have desires to stay warm, be fed and have sex. The larger the gap between an agent's state and their desired state, the higher a task can be prioritized.
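A minimal sketch of that priority rule, assuming each desire has a current and a desired fulfilment level (all names and weights are made up):

```python
def task_priority(current: dict[str, float], desired: dict[str, float],
                  desire: str, weight: float = 1.0) -> float:
    """Priority grows with the gap between desired and current fulfilment."""
    gap = desired[desire] - current[desire]
    return weight * max(gap, 0.0)


# A cold but well-fed agent prioritizes warming up over eating:
state = {"warmth": 0.2, "food": 0.9}
goal = {"warmth": 1.0, "food": 1.0}
print(task_priority(state, goal, "warmth"))  # 0.8
print(task_priority(state, goal, "food"))    # ~0.1
```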
(Future) ideas to further develop
Priorities and scheduling
It seems obvious that once you have tasks, you would like a system for agents to automatically order and schedule them. Some mechanisms to do that would be useful, based on priorities, urgency, desires and/or expected rewards.
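One straightforward mechanism would be a priority queue per agent; a sketch using Python's heapq (a min-heap, hence the negated priority):

```python
import heapq
from typing import Any


class TaskQueue:
    """Orders pending tasks so the highest-priority one is picked first."""

    def __init__(self) -> None:
        self._heap: list[tuple[float, int, Any]] = []
        self._counter = 0  # tie-breaker: insertion order among equal priorities

    def push(self, task: Any, priority: float) -> None:
        heapq.heappush(self._heap, (-priority, self._counter, task))
        self._counter += 1

    def pop(self) -> Any:
        return heapq.heappop(self._heap)[2]
```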
Time-sensitive rewards
Generally there is a benefit of doing something, which diminishes over time. Sometimes this is a hard cut-off, where the full reward is available up to some point in time and then drops to 0. Other times it might just diminish gradually over time.
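Both variants could be captured in one small function; the exponential decay and all names are just one possible choice, not part of the proposal itself:

```python
import math
from typing import Optional


def timed_reward(base: float, t: float, deadline: Optional[float] = None,
                 half_life: Optional[float] = None) -> float:
    """Hard cut-off at a deadline, or smooth exponential decay, or constant."""
    if deadline is not None:
        return base if t <= deadline else 0.0   # full reward, then nothing
    if half_life is not None:
        return base * math.exp(-math.log(2) * t / half_life)  # gradual decay
    return base
```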
Nestedness
It seems logical that tasks could be nested. There is the big task "Feed yourself", which might consist of searching for food and eating it. Searching for food in turn consists of looking for potential directions and moving in a promising direction.
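Nesting could be expressed as a composite task whose completion derives from its subtasks; a hedged sketch building on the `Task` class above:

```python
class CompositeTask:
    """A task made of ordered subtasks; it completes when all of them do."""

    def __init__(self, name: str, subtasks: list[Task]) -> None:
        self.name = name
        self.subtasks = subtasks

    def advance(self, dt: float) -> None:
        # Work on the first unfinished subtask; a scheduler could be smarter.
        for sub in self.subtasks:
            if sub.state is not TaskState.COMPLETED:
                sub.advance(dt)
                return

    @property
    def completed(self) -> bool:
        return all(s.state is TaskState.COMPLETED for s in self.subtasks)


feed_yourself = CompositeTask("feed yourself", [
    Task("search for food", duration=5),
    Task("eat", duration=2),
])
```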
Permission management
Who can interrupt which tasks? Who can claim which resources? Some mechanisms to let agents fight over that might be nice.
Multiple simultaneous tasks
The simplest implementation here would say an Agent has a capacity to handle X units of tasks, and if a task takes up some part of X, the remainder is available for other tasks. However, that might also need to include a function that scales a task's efficiency with the portion of capacity it receives (this might not always be linear).
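A sketch of such a scaling function; the sub-linear exponent is purely illustrative:

```python
def effective_progress(dt: float, share: float) -> float:
    """Scale task progress by the share of capacity the task receives.

    share ** 1.5 models the idea that splitting attention costs more
    than the raw split suggests; the exponent is an arbitrary choice.
    """
    return dt * share ** 1.5


# An agent splitting capacity 70/30 over two tasks for one time unit:
print(effective_progress(1.0, 0.7))  # ~0.59, less than 0.7
print(effective_progress(1.0, 0.3))  # ~0.16, less than 0.3
```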
Task States and Transitions
Tasks would benefit from more granular states beyond just active/interrupted/completed, such as "preparing", "waiting", or "paused". This needs clear rules about valid transitions between states and should track transition history for analysis.
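The valid transitions could live in a simple lookup table that guards every state change; all state names beyond active/interrupted/completed are the suggestions from above, nothing more:

```python
# Allowed transitions per state; anything else is rejected.
TRANSITIONS: dict[str, set[str]] = {
    "preparing":   {"active", "failed"},
    "active":      {"waiting", "paused", "interrupted", "completed", "failed"},
    "waiting":     {"active", "failed"},
    "paused":      {"active", "failed"},
    "interrupted": {"active", "failed"},
    "completed":   set(),
    "failed":      set(),
}


def transition(state: str, new_state: str, history: list[str]) -> str:
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    history.append(new_state)  # keep the trail for later analysis
    return new_state
```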
Failing
Tasks need robust failure handling - what happens when requirements become false during execution, how many retry attempts are allowed, and how to handle timeouts for stuck tasks. This connects to both state management and multi-agent coordination.
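A sketch of that failure loop, reusing the `Task`/`TaskState` sketch from above; `requirements_hold` and all limits are illustrative:

```python
def run_with_retries(task: Task, requirements_hold, max_retries: int = 3,
                     timeout: float = 100.0) -> bool:
    """Advance a task until completion, a failed requirement, or a timeout."""
    for _attempt in range(max_retries):
        task.elapsed, task.state = 0.0, TaskState.ACTIVE  # fresh attempt
        t = 0.0
        while task.state is TaskState.ACTIVE and t < timeout:
            if not requirements_hold():
                break  # requirement became false mid-execution: retry
            task.advance(dt=1.0)
            t += 1.0
        if task.state is TaskState.COMPLETED:
            return True
    return False  # give up after max_retries attempts
```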
Multi-Agent Coordination
Some tasks require multiple agents working together, raising questions about delegation, sharing, communication and conflict resolution. This could include task trading, bidding systems, and coordinated resource usage.
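A bidding system could start as simply as a sealed-bid auction; a toy sketch with invented agent names:

```python
def assign_by_auction(bids: dict[str, float]) -> str:
    """Sealed-bid auction: the highest bidder wins the task."""
    return max(bids, key=bids.__getitem__)


# Three agents bid their expected reward for a shared hauling task:
print(assign_by_auction({"ant_1": 4.0, "ant_2": 7.5, "ant_3": 6.1}))  # ant_2
```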
Learning and Adaptation
Tasks could improve in efficiency with repetition, agents could learn optimal task sequences, and historical performance could affect future decisions. This allows for dynamic adaptation to changing conditions.
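One common way to model improvement with repetition is a power-law learning curve; the exponent here is a free parameter, not anything Mesa prescribes:

```python
def learned_duration(base_duration: float, repetitions: int,
                     learning_exponent: float = 0.2) -> float:
    """Duration shrinks with experience, with diminishing returns."""
    return base_duration * (repetitions + 1) ** -learning_exponent


for n in (0, 1, 9, 99):
    print(n, round(learned_duration(10.0, n), 2))  # 10.0, 8.71, 6.31, 3.98
```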
Let's discuss
Very curious what everybody thinks about this. Give any thoughts and feedback you like, but I also have a few specific points.