{% extends "../../../layouts/post.html" %}
{% block article %}
I've been using ECS on and off for a couple years now. I definitely haven't fully committed to it and I still have a lot to learn, but I'm really enjoying it. Learning it was a struggle though and it was hard to wrap my head around what *exactly* it was. I want to share why I decided to learn ECS, what I've learned, and what I feel it's good for.
<div class="callout warning">
<div class="header">Note</div>
I am *not* an expert with ECS. I wouldn't even call myself good. I have a very fundamental understanding that's enough to allow me to make small games. Please don't take anything here as objective. Go do your own research and learning, it's worth it.
</div>
## Theseus' Shield
In the context of game design and development, object oriented data modeling can feel intuitive. Players, NPCs, and Enemies can all derive from a common "Creature" class which handles things like health and damage. Weapons, armor, and consumables can derive from a common Item class which handles inventory management.
Thinking exclusively in terms of hierarchical, nested patterns can come at a cost: software brittleness and inflexible design. Let's use an `Item` class as an example.
```cs
abstract class Item {
    public string Name { get; protected set; }
}
abstract class Weapon: Item {
    public float Damage { get; protected set; }
    public float Range { get; protected set; }
}
abstract class Armor: Item {
    public float Defense { get; protected set; }
}
class Sword: Weapon {}
class Bow: Weapon {}
class Helmet: Armor {}
class Shield: Armor {}
```
In the above scenario, we have [three layers of inheritance](https://wiki.c2.com/?MaxThreeLayersOfInheritance). We can imagine that `Item` handles behaviors like item management such as being added or removed from an inventory and highly generic item behavior. `Weapon` would then be responsible for an item which is capable of dealing damage at any range. We can expect that the specifics of its attack behavior would be implemented by a subclass. Similarly for `Armor`, this would handle generic damage interactions such as common mitigation or avoidance calculations while leaving specific implementation details to its subclasses.
Seasoned game devs, disciplined object-oriented programmers, and existing ECS developers can likely already see what comes next: a new requirement.
How do we add a Spiked Shield item -- one which derives the behaviors of `Weapon` and `Armor`? If we were using a language which supports multiple inheritance this could potentially be a nonissue: except that multiple inheritance is often [purposefully missing](https://en.wikipedia.org/wiki/Multiple_inheritance#The_diamond_problem) in many languages. We could decide that the item belongs more to one class than the other and just duplicate the missing behavior but these types of decisions often introduce unforeseen complexity: will the damage algorithm need to do a specific type check for `SpikedShield`? What about the equipment screen? And of course, what happens when we need to implement a damaging potion?
Perhaps the most appropriate solution for this case would be to forgo the inheritance pattern in favor of [composition](https://en.wikipedia.org/wiki/Composition_over_inheritance). In C#, this could be achieved with interfaces and default implementations.
```cs
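interface ICollectable {
    void Take();
    void Remove();
}
interface IDamaging {
    // by default, we'll just pass the damage directly to
    // the receiving damage handler
    void ApplyDamage(IDamageable damageable, float value) {
        damageable.ReceiveDamage(this, value);
    }
}
interface IDamageable {
    void ReceiveDamage(IDamaging source, float value);
}
// members without a default implementation still have to be
// implemented by the class; the bodies are left empty for brevity
class SpikedShield: ICollectable, IDamaging, IDamageable {
    public void Take() {}
    public void Remove() {}
    public void ReceiveDamage(IDamaging source, float value) {}
}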
```
We can use this approach to refactor our existing items and systems to derive/override only the behaviors they use. The respective systems for these interfaces now only need to check for the existence of these interfaces in order to act on them.
Engines like Unity and Godot use variations of the [Entity-Component (EC)](https://gameprogrammingpatterns.com/component.html) pattern (similar to but distinct from Entity-Component-System (ECS)). These patterns favor composition over inheritance by allowing developers to isolate behaviors into discrete components that can be applied to entities. In the Spiked Shield example, a developer could make a "Damage Source" and "Damage Target" component and add both to the item. In essence, this is the same as the interface-based approach.
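To make the EC idea a little more concrete, here's a minimal, engine-agnostic sketch. None of this is Unity's or Godot's actual API: `GameObject`, `Component`, `DamageSource`, and `DamageTarget` are hypothetical names I'm using purely for illustration.

```cs
using System.Collections.Generic;
using System.Linq;

// hypothetical base type for anything attachable to a game object
abstract class Component {}

class DamageSource : Component {
    public float Damage;
}

class DamageTarget : Component {
    public float Defense;
}

// a game object is little more than a bag of components
class GameObject {
    private readonly List<Component> components = new();

    public T Add<T>(T component) where T : Component {
        components.Add(component);
        return component;
    }

    // returns null when the object has no component of type T
    public T Get<T>() where T : Component {
        return components.OfType<T>().FirstOrDefault();
    }
}

static class Items {
    // the Spiked Shield is just an object with both components attached
    public static GameObject SpikedShield() {
        var shield = new GameObject();
        shield.Add(new DamageSource { Damage = 4f });
        shield.Add(new DamageTarget { Defense = 10f });
        return shield;
    }
}
```

A damage system would call `Get<DamageSource>()` and only act when the result isn't null, which is the same existence check as the interface version.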
Unfortunately, these patterns don't alleviate other issues that are much more difficult to solve.
## In the event of my demise
Due to the nature of video games, important events may need to be handled at any moment. For example, a player may've dealt a fatal blow to a boss enemy on the same frame that they received fatal damage. In which order should this damage be processed? Depending on the handling order, this is likely the difference between clearing a potentially difficult boss battle and needing to do it again.
In my experience, this would likely be handled by a traditional event system where the order is difficult to predict. This isn't to make the claim that synchronous event systems are *unpredictable*, but ensuring that a given event will be handled in a way that is predictable to the designer/developer without additional abstractions is difficult.
It's often useful to explicitly define the processing order of certain interactions and events. To solve this problem, we could implement a priority [event queue](https://gameprogrammingpatterns.com/event-queue.html).
```cs
public class DamageEventArgs : EventArgs {
    public readonly IDamaging Source;
    public readonly IDamageable Target;
    public readonly float Value;
    public DamageEventArgs(IDamaging source, IDamageable target, float value) {
        Source = source;
        Target = target;
        Value = value;
    }
}
// this is more of a dispatcher, but that's
// an unnecessary implementation detail.
// in a real scenario where other systems would be reading
// these events too, this would subscribe to a Mediator object
class DamageEvent {
    // only one of these should exist and should be globally
    // accessible so we make it a singleton
    private static DamageEvent instance = new DamageEvent();
    public static DamageEvent Instance => instance;
    private PriorityQueue<DamageEventArgs, int> queue = new();
    public void Raise(DamageEventArgs args) {
        // before enqueuing, determine the priority
        // based on the object dealing damage.
        // PriorityQueue dequeues the *lowest* value first,
        // so the player gets the smallest number
        var priority = args.Source switch {
            Player => 0,
            Enemy => 1,
            _ => 2
        };
        queue.Enqueue(args, priority);
    }
    // we'll dispatch events with the tick rate
    // so they're all handled at the same time
    private void OnTick(float deltaTime) {
        // for simplicity we'll dequeue everything each frame
        // and pass it to the damage handler
        while(queue.TryDequeue(out var e, out _)) {
            e.Source.ApplyDamage(e.Target, e.Value);
        }
    }
}
// bossEnemy's damage should be processed after player's damage
// if they're raised during the same frame
DamageEvent.Instance.Raise(new DamageEventArgs(bossEnemy, player, 10));
DamageEvent.Instance.Raise(new DamageEventArgs(player, bossEnemy, 10));
```
There's a lot of issues with this code that I'm going to pretend were deliberate decisions for brevity. The point of this example is to show that we can ingest events and sort them arbitrarily based on the requirements of the game. We've made a step in the right direction by identifying a need to explicitly order these events. Even if this implementation isn't ideal, it properly encodes the requirements of the design (ie. player damage should be processed before all other types).
There is still a timing issue with this approach however. Events can be raised at any point: before, during, or after the `DamageEvent` queue has already done its work for the frame. If `bossEnemy` raises its damage event before `DamageEvent` processes its queue but `player` raises their event after, we still have the original issue.
Depending on the engine and implementation, there may be a few options for solving this. Rather than using `OnTick`, the damage can be handled in `OnLateTick` which runs after all `OnTick` systems have been processed (Unity's versions of these methods are `Update` and `LateUpdate`). In Unity the `DamageEvent` singleton script could have its order explicitly modified in the settings or with the `DefaultExecutionOrder` attribute. In Godot, this would likely be resolved by moving the `DamageEvent` node lower in the tree since nodes are processed from top to bottom while resolving children first.
An alternative solution would be to identify the behavior which raises events and isolate it into its own singleton. Doing this would allow it to easily be ordered before the damage handling system. We'll revisit this idea later.
## The Offspring would be proud
Managing a game state as it grows in size and complexity is *difficult*. It's not uncommon for a game to have many thousands of active entities at a time. In many cases, it makes sense to decouple separate-but-related behaviors into their own systems to make them easier to manage.
One such separation would be game logic and UI. Rather than directly coupling the player's health to the UI representation of the player's health, it makes sense to have them communicate via some type of messaging system. An event seems like a natural fit.
```cs
public class PlayerSpawnedArgs : EventArgs {
    public readonly Player Player;
}
public class HealthChangedArgs : EventArgs {
    public readonly double PreviousValue;
    public readonly double CurrentValue;
}
class Healthbar : UIElement {
    // double, to match the values carried by HealthChangedArgs
    private double _currentHealth;
    // use getter/setter to automatically redraw
    // the UI on state changes
    private double currentHealth {
        get => _currentHealth;
        set {
            if (value != _currentHealth) {
                _currentHealth = value;
                Redraw();
            }
        }
    }
    protected void OnInitialize() {
        // register to the player spawn event
        PlayerSpawnEvent.Register(OnPlayerSpawned);
    }
    private void OnPlayerSpawned(object sender, PlayerSpawnedArgs args) {
        // subscribe specifically to that player's events
        args.Player.health.Register(OnHealthChanged);
    }
    private void OnHealthChanged(object sender, HealthChangedArgs args) {
        currentHealth = args.CurrentValue;
    }
}
```
This will work for many cases but it's unfortunate it needs to manage subscriptions to two events in order to get the updates it needs. If we ever needed to add more than one player that renders a healthbar, this solution would no longer be sufficient. We could circumvent this by making health updates dispatched globally but that would come at the cost of checking *every* health update just to find a player. This solution would be much more appealing, however, if every entity with health had a healthbar. One happy medium would be to create a more specific global event for player health updates. `Healthbar` can then simply register to that event without needing to concern itself with the particulars of spawning.
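Here's a rough sketch of that happy medium using plain C# events. `PlayerHealthEvent` is a hypothetical static class I made up for this example, and I've stripped `Healthbar` down to the bare minimum (no `UIElement`, no redraw) so the snippet stands on its own.

```cs
using System;

public class HealthChangedArgs : EventArgs {
    public readonly double PreviousValue;
    public readonly double CurrentValue;
    public HealthChangedArgs(double previous, double current) {
        PreviousValue = previous;
        CurrentValue = current;
    }
}

// hypothetical global event that only ever carries *player* health updates
static class PlayerHealthEvent {
    public static event EventHandler<HealthChangedArgs> Changed;

    public static void Raise(object sender, double previous, double current) {
        Changed?.Invoke(sender, new HealthChangedArgs(previous, current));
    }
}

class Healthbar {
    public double CurrentHealth { get; private set; }

    public Healthbar() {
        // a single subscription, no knowledge of player spawning required
        PlayerHealthEvent.Changed += OnHealthChanged;
    }

    private void OnHealthChanged(object sender, HealthChangedArgs args) {
        CurrentHealth = args.CurrentValue;
    }
}
```

Whatever owns the player's health calls `PlayerHealthEvent.Raise(player, 100, 80)` and every healthbar picks it up. A real version would also need to unsubscribe when the healthbar is destroyed.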
With all this event-driven programming comes another downside: events are opaque and difficult to debug. When an architecture relies heavily on events, it becomes increasingly important to have outstanding documentation. It's an unfortunate sacrifice to be made in exchange for the highly decoupled nature of events. Some game engines have made attempts to offer better insight into event connections but it's often only a mild remedy.
In order to better query the state of an event-driven application, [Martin Fowler's Event Sourcing post](https://martinfowler.com/eaaDev/EventSourcing.html) provides some additional respite. It's worth reading if you're working with many events in your game. Essentially, tracking the source of an event *in* the newly raised event allows subscribers to walk the chain of source events in reverse to determine additional context.
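As a small sketch of that "walk the chain" idea (this is my own simplification, not code from Fowler's post, and `GameEvent` and the event names are hypothetical):

```cs
using System.Collections.Generic;
using System.Linq;

class GameEvent {
    public readonly string Name;
    // the event that caused this one, or null if nothing did
    public readonly GameEvent Cause;

    public GameEvent(string name, GameEvent cause = null) {
        Name = name;
        Cause = cause;
    }

    // walks from this event back through every event that led to it
    public IEnumerable<GameEvent> Chain() {
        for (var current = this; current != null; current = current.Cause) {
            yield return current;
        }
    }
}

static class EventChainDemo {
    public static string Describe() {
        var swing = new GameEvent("WeaponSwung");
        var hit = new GameEvent("DamageDealt", swing);
        var changed = new GameEvent("HealthChanged", hit);
        // "HealthChanged <- DamageDealt <- WeaponSwung"
        return string.Join(" <- ", changed.Chain().Select(e => e.Name));
    }
}
```

A subscriber that only receives `HealthChanged` can still find out that a weapon swing started it all.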
## Spreadsheet-oriented programming
These were just a few pain points of game architecture that I've come across when making games. I've tried different solutions each time to varying levels of success. While I don't think there exists a one-size-fits-all solution to every architectural decision in video games, I *do* believe that reframing how we think about our architectural goals can make some problems diminish or even disappear. This is especially helpful if the affected problems are persistent in a given domain.
Entity-Component-System (ECS) is a data-oriented approach that has resolved many of the above issues for me. It comes with its own architectural challenges, especially since the pattern has been rapidly evolving due to its recent explosion of popularity.
There are many guides attempting to explain ECS in simple terms. This can be a bit challenging since the approach may run counter to the fundamental understanding of game architecture for many uninitiated readers. Additionally, there are different types and implementations of ECS which sometimes pollute the overall message. [Sander Mertens](https://ajmmertens.medium.com/), the author of [Flecs](https://github.com/SanderMertens/flecs), has contributed a substantial amount to the development and education of ECS. Their [FAQ](https://github.com/SanderMertens/ecs-faq) is a valuable resource to have. I'm going to try to provide a high level explanation of ECS but I recommend looking to other resources if this doesn't make sense.
### Entities
An entity is an identifier for an object. It has no data, properties, or behaviors by itself. It is simply a marker for something that exists. In some implementations of ECS, this could be as simple as an unsigned integer. These identifiers are the primary way of fetching *Component* data.
### Components
Components hold data and belong to an entity. Some implementations have limitations on what kind of data can be housed but conceptually it can be anything. A player's level, their position in the world, their current health, and the current input state of a gamepad would all be stored in a component.
Components are often stored contiguously in memory (such as in an array). The entity is used to fetch data from that container.
### An example
Let's pause for a moment to consider the relationship between entities and components. If we were to focus purely on simplicity, we could implement these concepts in the following way.
```cs
// this is our component definition
struct Health {
    public double Current;
    public double Max;
}
Health[] health = new Health[100]; // this is our component container
int player = 0; // this is our entity
double currentHealth = health[player].Current; // accessing our component data
Console.WriteLine($"The player's current health is {currentHealth}");
```
<div class="callout info">
<p class="header">Note</p>
I would like to restate that the actual interface will depend on the library and the decisions its developers have made. Different implementations have different tradeoffs and limitations that may change the internal representation of an entity or component.
</div>
If we extend our example just a little bit to include multiple components, we would end up with multiple containers (arrays) too. This has an interesting implication in that it allows us to visualize our data in a more intuitive way: a table.
| Entity | Name | Health.Current | Health.Max |
| ------ | -------- | -------------- | ---------- |
| 0 | "Kain" | 100 | 100 |
| 1 | "Raziel" | 5 | 75 |
| 2 | "Janos" | 0 | 1000 |
Our entity is the row ID and each successive column is its associated component data. In cases where an entity does not have a component, we can think of its value as `NULL`.
| Entity | Name | Health.Current | Health.Max | Weapon.Name |
| ------ | -------- | -------------- | ---------- | ------------- |
| 0 | "Kain" | 100 | 100 | "Soul Reaver (Physical)" |
| 1 | "Raziel" | 5 | 75 | "Soul Reaver (Spectral)" |
| 2 | "Janos" | 0 | 1000 | `NULL` |
Adding a new row is as simple as making a new entity. All of the columns except for the entity ID would be `NULL` because we haven't added any components.
| Entity | Name | Health.Current | Health.Max | Weapon.Name |
| ------ | -------- | -------------- | ---------- | ------------- |
| 0 | "Kain" | 100 | 100 | "Soul Reaver (Physical)" |
| 1 | "Raziel" | 5 | 75 | "Soul Reaver (Spectral)" |
| 2 | "Janos" | 0 | 1000 | `NULL` |
| 3 | `NULL` | `NULL` | `NULL` | `NULL` |
So to sum up, an entity is a row, a component is a column. Data lives in the cells where entities and components intersect.
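If it helps, the table above could be written out as code like this. It's a deliberately naive sketch: `TableWorld` and `TableDemo` are hypothetical names, and real ECS libraries store things far more cleverly than "one nullable array per column".

```cs
struct Health {
    public double Current;
    public double Max;
}

class TableWorld {
    // one array per column; a null cell means "entity doesn't have this component"
    public readonly string[] Names;
    public readonly Health?[] Healths;
    public readonly string[] WeaponNames;
    private int nextEntity = 0;

    public TableWorld(int capacity) {
        Names = new string[capacity];
        Healths = new Health?[capacity];
        WeaponNames = new string[capacity];
    }

    // a new entity is just the next row ID; every column starts out empty
    public int CreateEntity() {
        return nextEntity++;
    }
}

static class TableDemo {
    public static TableWorld Build() {
        var world = new TableWorld(16);

        int kain = world.CreateEntity();
        world.Names[kain] = "Kain";
        world.Healths[kain] = new Health { Current = 100, Max = 100 };
        world.WeaponNames[kain] = "Soul Reaver (Physical)";

        int raziel = world.CreateEntity();
        world.Names[raziel] = "Raziel";
        world.Healths[raziel] = new Health { Current = 5, Max = 75 };
        world.WeaponNames[raziel] = "Soul Reaver (Spectral)";

        int janos = world.CreateEntity();
        world.Names[janos] = "Janos";
        world.Healths[janos] = new Health { Current = 0, Max = 1000 };
        // no weapon: that cell stays null

        world.CreateEntity(); // entity 3: a row with nothing in it yet
        return world;
    }
}
```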
### Systems
A system represents a behavior. This is where game and application logic lives. Systems receive a list of entities and iterate over that list to perform work on their component data. They may also create and destroy entities or attach and remove components. If we want to write a system which applies movement to an entity, we could check for the existence of both `Position` and `Velocity` components.
```cs
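// `position` and `velocity` stand in for the component containers,
// each holding one array per field, indexed by entity
private void ApplyMovementVelocity(Entity[] entities) {
    foreach(var entity in entities) {
        if (entity.HasComponent(velocity) && entity.HasComponent(position)) {
            position.x[entity] += velocity.x[entity];
            position.z[entity] += velocity.z[entity];
        }
    }
}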
```
This system would likely be run every [physics tick](https://web.archive.org/web/20241219080529/https://www.gafferongames.com/post/fix_your_timestep/#:~:text=Free%20the%20physics) ([FixedUpdate](https://docs.unity3d.com/6000.0/Documentation/Manual/fixed-updates.html) in Unity or [_physics_process](https://docs.godotengine.org/en/stable/tutorials/scripting/idle_and_physics_processing.html) in Godot).
### Queries
Checking for entities with only certain components is such a common pattern that most ECS engines have a separate concept for doing exactly this: queries. In these frameworks we may ask to only receive a list of entities that meet certain conditions. These conditions are typically limited to a simple check of whether or not an entity has a component.
Since the syntax for querying is highly dependent on the ECS library, I'll use Unity's as an example and try to recreate the above system.
```cs
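public void ApplyMovementVelocity(ref SystemState state) {
    // `Velocity` is assumed to be a custom component with a float3 `Value` field
    foreach(var (transform, velocity) in SystemAPI.Query<RefRW<LocalTransform>, RefRO<Velocity>>()) {
        transform.ValueRW.Position += velocity.ValueRO.Value;
    }
}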
```
This seems like it has a lot going on but it's not that bad. `SystemAPI.Query` is doing most of the hard work. Using a few generics, we're able to declare which component types we'd like to query for: `LocalTransform` and `Velocity`. `RefRW` and `RefRO` describe how we'd like to access that data. `Ref` means it will be a reference (as opposed to a value) while `RW` means read-write and `RO` means read only. So a `RefRW<LocalTransform>` means we'll have read-write access to the `LocalTransform` reference.
One of the other nice things about `SystemAPI.Query` is that it returns an enumerable that we can use with `foreach`. Each enumerable item is a tuple containing whatever we queried. In the example, we use tuple destructuring to access those components (ie. `var (transform, velocity)`).
Other than being more convenient, there are significant benefits to querying entities this way that will be covered in just a moment.
## Clockwork
Given the range of implementations and libraries, there are also some other patterns that have emerged as somewhat standard. Queries are present in nearly every library I've used, but there's a couple other useful concepts.
### Schedules
Not only is it important to be able to request the exact entities which are relevant to a system, it's equally as important to ensure that systems execute at the proper time. Some ECS engines such as Bevy provide tools to order and schedule systems to run at specific times. It's possible to run them at different tick rates, when certain conditions are met, or before/after another system.
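To give a feel for what a schedule does, here's a toy version. This is *not* Bevy's API (or anyone else's): `Schedule`, `Add`, and `Tick` are names I've made up, ordering is simply insertion order, and a run condition is just a function returning a bool.

```cs
using System;
using System.Collections.Generic;

class Schedule {
    private readonly List<(string Name, Action Run, Func<bool> Condition)> systems = new();

    // systems run in the order they're added;
    // runIf is optional and defaults to "always run"
    public Schedule Add(string name, Action run, Func<bool> runIf = null) {
        systems.Add((name, run, runIf ?? (() => true)));
        return this;
    }

    public void Tick() {
        foreach (var system in systems) {
            if (system.Condition()) {
                system.Run();
            }
        }
    }
}

static class ScheduleDemo {
    public static List<string> Run() {
        var log = new List<string>();
        bool paused = false;

        var schedule = new Schedule()
            .Add("RaiseDamage", () => log.Add("raise"))
            .Add("ApplyDamage", () => log.Add("apply"))
            .Add("Animate", () => log.Add("animate"), () => !paused);

        schedule.Tick(); // raise, apply, animate
        paused = true;
        schedule.Tick(); // raise, apply (Animate is skipped while paused)
        return log;
    }
}
```

Because `RaiseDamage` is always registered before `ApplyDamage`, every damage event raised in a frame is guaranteed to exist by the time damage is applied.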
### Tag components
While these aren't *really* a separate concept from regular components, they're worth mentioning. Tag components have no data and exist only to tag an entity. "Player" would be a good example -- all their data would likely exist in other components: `Health`, `Transform`, etc -- but it would help with identifying the player in order to perform special tasks. For example, applying movement input from the controller to the player's velocity could be a case for tag components.
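A quick sketch of that input example, assuming a made-up storage layout where a tag is nothing more than a set of entity IDs (`TagWorld` and `PlayerInputSystem` are hypothetical):

```cs
using System.Collections.Generic;

struct Velocity {
    public float X;
    public float Z;
}

class TagWorld {
    public readonly Dictionary<int, Velocity> Velocities = new();
    // a tag carries no data, so knowing *which* entities have it is enough
    public readonly HashSet<int> PlayerTag = new();
}

static class PlayerInputSystem {
    // copies controller input onto the velocity of tagged entities only
    public static void Apply(TagWorld world, float inputX, float inputZ) {
        foreach (int entity in world.PlayerTag) {
            if (world.Velocities.ContainsKey(entity)) {
                world.Velocities[entity] = new Velocity { X = inputX, Z = inputZ };
            }
        }
    }
}
```

Every other entity with a `Velocity` is left alone because it doesn't carry the tag.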
### Performance
One thing that I've deliberately not mentioned until now: ECS is usually *very* fast. I often see this touted as one of the main selling points of ECS. It's true that the performance is important to an application which is running anywhere from 60-144 times per second but I think there are *many* other benefits worth talking about.
Many of the performance benefits of ECS are due to its overlap with data-oriented design concepts. One example of data-oriented design, parallel arrays, is a common implementation in many ECS frameworks. I'm *especially* not qualified to speak on this issue because I've very little exposure to the concept separate from using it via ECS in game development, so I'll let others explain it.
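For the curious, the basic shape of parallel arrays looks something like the sketch below. The type names are hypothetical and this says nothing about how any specific ECS library lays out its memory. The usual argument is that a loop touching only one or two fields reads tightly packed data instead of skipping over fields it doesn't need.

```cs
// "array of structures": each element holds every field of one object
struct ParticleAoS {
    public float X;
    public float VelocityX;
}

// "structure of arrays" (parallel arrays): one array per field,
// all indexed by the same entity ID
class ParticlesSoA {
    public readonly float[] X;
    public readonly float[] VelocityX;

    public ParticlesSoA(int capacity) {
        X = new float[capacity];
        VelocityX = new float[capacity];
    }

    // only the two arrays this loop cares about are ever read
    public void Integrate(float deltaTime) {
        for (int i = 0; i < X.Length; i++) {
            X[i] += VelocityX[i] * deltaTime;
        }
    }
}
```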
## Revisiting our design problems
### Spiked Shield
### Damage ordering
### Health and UI
Did I cherry pick these examples to highlight the strengths of ECS? Kind of, but it's not like there are any
## Weaknesses
Okay, yeah. I've had problems using it too. There are two stand-out pain points that I have with ECS and I will readily acknowledge that they're my own failings.
### Modeling
Although conceptually simple, I find it difficult to model games in ECS. This could be due to my inexperience with it. I have *20 years* of experience with traditional object-oriented modeling and only a couple of years with data-oriented modeling.
I have trouble figuring out when to make new components, when to decouple data by making new entities, and how many things a system should be responsible for. It's *really* nice to have just a few extremely powerful tools rather than a seemingly infinite number of highly specific patterns, but it hasn't really ever been intuitive for me.
### Complex queries
This one ends up frustrating me a lot. Because queries are essentially a way of asking which entities have a specific set of components, it's often easy to end up with *very* complex systems that have to figure out relationships or context from the game state.
This example will probably be a little contrived, but let's take a damage system. We will assume three components: `Weapon`, `Damage`, and `Health`. In our system, when a weapon has a physical collision with an entity that has a `Health` component, it will create a new entity with a `Damage` component. For simplicity's sake this component will only have two properties: `target` and `value` which are the entity ID of the damaged entity and the amount of damage respectively.
What would the system for applying damage to health look like?
```cs
public void ApplyDamageSystem(ref SystemState state) {
    // this is an object which can alter the state of the game world
    // eg. adding and removing entities or components
    EntityCommandBuffer ecb = new EntityCommandBuffer(Allocator.TempJob);
    // this is effectively a query for every entity with a Health component
    var allHealth = SystemAPI.GetComponentLookup<Health>(false); // the bool argument is whether it's read-only
    foreach(var (damage, entity) in SystemAPI.Query<RefRO<Damage>>().WithEntityAccess()) {
        var target = damage.ValueRO.target;
        if(!allHealth.HasComponent(target)) {
            continue; // skip this damage if our target doesn't have health
        }
        // Health is a struct, so we modify a copy and write it back
        var targetHealth = allHealth[target];
        var newHealth = targetHealth.current - damage.ValueRO.value;
        targetHealth.current = Mathf.Max(newHealth, 0);
        allHealth[target] = targetHealth;
        // destroy the entity so it isn't processed on the next frame too
        ecb.DestroyEntity(entity);
    }
    ecb.Playback(state.EntityManager);
    ecb.Dispose();
}
```
We have to create a second query, check if our entity actually has that component, apply our damage, and then remove the damage entity (which is acting like an event). Let's illustrate this with tables again.
| Entity | Damage.Target | Damage.Value |
| ------ | ------------- | ------------ |
| 1 | 2 | 10 |
This is our entity with the `Damage` component. `Target` holds an entity ID so we know exactly what is taking damage.
| Entity | Health.Current | Health.Max |
| ------ | -------------- | ---------- |
| 0 | 70 | 100 |
| 2 | 50 | 50 |
These represent our entities with a `Health` component. We have two -- one that dealt damage and another that received damage. We're joining `Damage` and `Health` where `Entity HAS Health AND Damage.Target = Entity`.
Could we do this differently?
Instead of creating a separate entity for damage, we could add the component directly to the damaged entity. With this approach, other sources of damage will have to make sure to check for an existing instance of damage and modify that one if it exists, otherwise add a new one. This eliminates our join but increases the complexity of adding and removing multiple sources of damage to a single target.
This is actually a pretty simple query too. It can get complicated quickly, though I'm sure some of this is my own inexperience.
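The alternative of putting the damage directly on the damaged entity could be sketched like this. It's engine-agnostic and entirely hypothetical (`DamageWorld` isn't from Unity or any library), with dictionaries standing in for component storage.

```cs
using System;
using System.Collections.Generic;

class DamageWorld {
    public readonly Dictionary<int, float> Health = new();
    // pending damage lives on the damaged entity itself
    public readonly Dictionary<int, float> PendingDamage = new();

    // every damage source has to merge into an existing entry if there is one
    public void AddDamage(int target, float value) {
        PendingDamage.TryGetValue(target, out float existing); // 0 when absent
        PendingDamage[target] = existing + value;
    }

    // no join needed: the key of the damage entry *is* the damaged entity
    public void ApplyDamage() {
        foreach (var (target, value) in PendingDamage) {
            if (Health.TryGetValue(target, out float current)) {
                Health[target] = Math.Max(current - value, 0f);
            }
        }
        // remove the "component" from everything so it isn't applied twice
        PendingDamage.Clear();
    }
}
```

The lookup in `ApplyDamage` gets simpler, but the merging logic in `AddDamage` is exactly the extra complexity mentioned above, and it only grows if individual sources of damage need to be tracked or removed separately.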
## Anyway
This post is getting way too long and I've already cut it in half. There's going to be a part two at some point.
ECS is conceptually easy but really difficult to learn and master. Building an intuition for it takes time. As far as I'm concerned it's been worth it -- it's another tool to solve a diverse range of problems. If you're interested in learning and using ECS, especially for game development, I would highly recommend going out and using it in order to start building an intuition for it.
{% endblock article %}