{% extends "../../../layouts/post.html" %}
{% block article %}
I've been using ECS on and off for a couple years now. I definitely haven't fully committed to it and I still have a lot to learn, but I'm really enjoying it. Learning it was a struggle though and it was hard to wrap my head around what *exactly* it was. I want to share why I decided to learn ECS, what I've learned, and what I feel it's good for.
<div class="callout warning">
<div class="header">Note</div>
I am *not* an expert with ECS. I wouldn't even call myself good. I have a very fundamental understanding that's enough to allow me to make small games. Please don't take anything here as objective. Go do your own research and learning, it's worth it.
</div>
## Spreadsheet-oriented programming
There are some recurring problems that I encounter when developing a game. I've tried different solutions each time to varying levels of success. While I don't think there exists a one-size-fits-all solution to every architectural decision in video games, I *do* believe that reframing how we think about our architectural goals can make some problems diminish or even disappear. This is especially helpful if the affected problems are persistent in a given domain.
Entity-Component-System (ECS) is a data-oriented approach that has resolved many of these problems for me. It comes with its own architectural challenges, especially since the pattern has been rapidly evolving due to its recent explosion of popularity.
There are many guides attempting to explain ECS in simple terms. This can be a bit challenging since the approach may run counter to the fundamental understanding of game architecture for many uninitiated readers. Additionally, there are different types and implementations of ECS which sometimes pollute the overall message. [Sander Mertens](https://ajmmertens.medium.com/), the author of [FLECS](https://github.com/SanderMertens/flecs), has contributed a substantial amount to the development and education of ECS. Their [FAQ](https://github.com/SanderMertens/ecs-faq) is a valuable resource to have. I'm going to try to provide a high level explanation of ECS but I recommend looking to other resources if this doesn't make sense.
### Entities
An entity is an identifier for an object. It has no data, properties, or behaviors by itself. It is simply a marker for something that exists. In some implementations of ECS, this could be as simple as an unsigned integer. These identifiers are the primary way of fetching *Component* data.
### Components
Components hold data and belong to an entity. Some implementations have limitations on what kind of data can be housed but conceptually it can be anything. A player's level, their position in the world, their current health, and the current input state of a gamepad would all be stored in a component.
Components are often stored contiguously in memory (such as in an array). The entity is used to fetch data from that container.
### An example
Let's pause for a moment to consider the relationship between entities and components. If we were to focus purely on simplicity, we could implement these concepts in the following way.
```cs
// this is our component definition
struct Health {
    public double Current;
    public double Max;
}
Health[] health = new Health[100]; // this is our component container
int player = 0; // this is our entity
double currentHealth = health[player].Current; // accessing our component data
Console.WriteLine($"The player's current health is {currentHealth}");
```
<div class="callout info">
<p class="header">Note</p>
I would like to restate that the actual interface will depend on the library and the decisions its developers have made. Different implementations have different tradeoffs and limitations that may change the internal representation of an entity or component.
</div>
If we extend our example just a little bit to include multiple components, we would end up with multiple containers (arrays) too. This has an interesting implication in that it allows us to visualize our data in a more intuitive way: a table.
| Entity | Name | Health.Current | Health.Max |
| ------ | -------- | -------------- | ---------- |
| 0 | "Kain" | 100 | 100 |
| 1 | "Raziel" | 5 | 75 |
| 2 | "Janos" | 0 | 1000 |
Our entity is the row ID and each successive column is its associated component data. In cases where an entity does not have a component, we can think of its value as `NULL`.
| Entity | Name | Health.Current | Health.Max | Weapon.Name |
| ------ | -------- | -------------- | ---------- | ------------- |
| 0 | "Kain" | 100 | 100 | "Soul Reaver (Physical)" |
| 1 | "Raziel" | 5 | 75 | "Soul Reaver (Spectral)" |
| 2 | "Janos" | 0 | 1000 | `NULL` |
Adding a new entity is as simple as making a new row. All of the columns except for the entity ID would be `NULL` because we haven't added any components.
| Entity | Name | Health.Current | Health.Max | Weapon.Name |
| ------ | -------- | -------------- | ---------- | ------------- |
| 0 | "Kain" | 100 | 100 | "Soul Reaver (Physical)" |
| 1 | "Raziel" | 5 | 75 | "Soul Reaver (Spectral)" |
| 2 | "Janos" | 0 | 1000 | `NULL` |
| 3 | `NULL` | `NULL` | `NULL` | `NULL` |
So to sum up, an entity is a row, a component is a column. Data lives in the cells where entities and components intersect.
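Here's a minimal sketch of that table in code, assuming every column is a plain array indexed by entity ID and a missing component is stored as `null`. The array names and the fixed size of four rows are made up for illustration; real libraries use much smarter storage than this.
```cs
using System;

// each column of the table is an array indexed by entity ID;
// a null cell means the entity doesn't have that component
string[] names = new string[4];
double?[] currentHealth = new double?[4];
string[] weaponNames = new string[4];

names[0] = "Kain";   currentHealth[0] = 100; weaponNames[0] = "Soul Reaver (Physical)";
names[1] = "Raziel"; currentHealth[1] = 5;   weaponNames[1] = "Soul Reaver (Spectral)";
names[2] = "Janos";  currentHealth[2] = 0;   // no weapon, so that cell stays null

// entity 3 is a brand new row: every cell is still null

for (int entity = 0; entity < names.Length; entity++) {
    string name = names[entity] ?? "NULL";
    string health = currentHealth[entity]?.ToString() ?? "NULL";
    string weapon = weaponNames[entity] ?? "NULL";
    Console.WriteLine($"{entity} | {name} | {health} | {weapon}");
}
```
Reading a row means reading the same index from every column, which is all an entity ID really is in this sketch.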
### Systems
A system represents a behavior. This is where game and application logic lives. Systems receive a list of entities and iterate over that list to perform work on their component data. They may also create and destroy entities or attach and remove components. If we want to write a system which applies movement to an entity, we could check for the existence of `Position` and `Velocity` components.
```cs
private void ApplyMovementVelocity(Entity[] entities) {
    foreach (var entity in entities) {
        if (entity.HasComponent(velocity) && entity.HasComponent(position)) {
            position.x[entity] += velocity.x[entity];
            position.z[entity] += velocity.z[entity];
        }
    }
}
```
This system would likely be run every [physics tick](https://web.archive.org/web/20241219080529/https://www.gafferongames.com/post/fix_your_timestep/#:~:text=Free%20the%20physics) ([FixedUpdate](https://docs.unity3d.com/6000.0/Documentation/Manual/fixed-updates.html) in Unity or [_physics_process](https://docs.godotengine.org/en/stable/tutorials/scripting/idle_and_physics_processing.html) in Godot).
### Queries
Checking for entities with only certain components is such a common pattern that most ECS engines have a separate concept for doing exactly this: queries. In these frameworks we may ask to only receive a list of entities that meet certain conditions. These conditions are typically limited to a simple check of whether or not an entity has a component.
```cs
public void ApplyMovementVelocity(SystemState state) {
    foreach (var (position, velocity) in Query<Position, Velocity>(state)) {
        position += velocity;
    }
}
```
Other than being more convenient, there are significant benefits to querying entities this way that will be covered in just a moment.
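To make the idea a bit more concrete, here's a rough sketch of how a query *might* decide which entities match. It assumes each entity keeps a bitmask of the components it owns. The flag names and the `Query` helper are invented for this example and don't belong to any particular library.
```cs
using System;
using System.Collections.Generic;

// hypothetical component flags; one bit per component type
const int Position = 1 << 0;
const int Velocity = 1 << 1;
const int Health   = 1 << 2;

// masks[entity] records which components that entity currently has
int[] masks = {
    Position | Velocity,          // entity 0 can move
    Position | Health,            // entity 1 has no velocity
    Position | Velocity | Health  // entity 2 has everything
};

// a query yields every entity whose mask contains all of the required bits
IEnumerable<int> Query(int required) {
    for (int entity = 0; entity < masks.Length; entity++) {
        if ((masks[entity] & required) == required) {
            yield return entity;
        }
    }
}

List<int> movers = new List<int>(Query(Position | Velocity));
Console.WriteLine(string.Join(", ", movers)); // prints "0, 2"
```
The movement system never sees entity 1 at all, so it no longer needs its own `HasComponent` checks.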
## Inexorable
Given the range of implementations and libraries, there are also some other patterns that have emerged as somewhat standard. Queries are present in nearly every library I've used, but there are a couple of other useful concepts.
### Schedules
Not only is it important to be able to request the exact entities which are relevant to a system, it's equally important to ensure that systems execute at the proper time. Some ECS engines such as Bevy provide tools to order and schedule systems to run at specific times. It's possible to run them at different tick rates, when certain conditions are met, or before/after another system.
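At its simplest, a schedule can be thought of as an ordered list of systems that runs once per tick. The sketch below assumes systems are plain functions. The system names are hypothetical, and real schedulers add run conditions, tick rates, and before/after constraints on top of this.
```cs
using System;
using System.Collections.Generic;

// a log so we can see the order in which systems actually ran
var log = new List<string>();

// hypothetical systems; each one is just a function
void ReadInput()     => log.Add("ReadInput");
void ApplyMovement() => log.Add("ApplyMovement");
void Render()        => log.Add("Render");

// the schedule is an ordered list of systems, registered once at startup
var schedule = new List<Action> { ReadInput, ApplyMovement, Render };

// every tick runs the systems in exactly that order
void Tick() {
    foreach (var system in schedule) {
        system();
    }
}

Tick();
Console.WriteLine(string.Join(" -> ", log)); // prints "ReadInput -> ApplyMovement -> Render"
```
Because the order lives in one list, changing when a system runs is a matter of moving one entry.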
### Tag components
While these aren't *really* a separate concept from regular components, they're worth mentioning. Tag components have no data and exist only to tag an entity. "Player" would be a good example — all their data would likely exist in other components: `Health`, `Transform`, etc — but it would help with identifying the player in order to perform special tasks. For example, applying movement input from the controller to the player's velocity could be a case for tag components.
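Since a tag carries no data, one simple way to picture it is as a set of entity IDs. This is only a sketch under that assumption. The `playerTag` set, the `velocityX` column, and the speed multiplier are all made up for the example.
```cs
using System;
using System.Collections.Generic;

// a tag component stores no data, so a set of entity IDs is enough
var playerTag = new HashSet<int> { 0 };

// regular component data, indexed by entity ID
double[] velocityX = new double[3];

// hypothetical input value read from the controller this frame
double inputX = 1.0;

// only entities tagged as Player receive controller input
for (int entity = 0; entity < velocityX.Length; entity++) {
    if (playerTag.Contains(entity)) {
        velocityX[entity] = inputX * 5.0;
    }
}

Console.WriteLine(velocityX[0]); // prints "5"
Console.WriteLine(velocityX[1]); // prints "0"
```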
### Performance
One thing that I've deliberately not mentioned until now: ECS is generally fast. I often see this touted as one of the main selling points of ECS. It's true that the performance is important to an application which is running anywhere from 60-144 times per second but I think there are *many* other benefits worth talking about.
Most of the performance benefits of ECS are due to its overlap with data-oriented design concepts. One example of data-oriented design, parallel arrays, is a common implementation in many frameworks. I'm *especially* not qualified to speak on this issue because I've very little exposure to the concept separate from using it in game development, so I'll let someone else explain it.
> # Is ECS fast?
>
> Generally yes, though this of course depends on what is being measured, and the ECS implementation. Different implementations make different tradeoffs, and as such an operation that is really fast in one framework is quite slow in another.
>
> Things that ECS implementations are generally good at are querying and iterating sets of entities linearly, or dynamically changing components at runtime. Things that ECS implementations are generally not good at are queries or operations that require highly specialized data structures, such as binary trees or spatial structures. Knowing the tradeoffs of an implementation and levering its design ensure you get the most performance out of an ECS.
>
> -- <cite>[Sander Mertens](https://github.com/SanderMertens/ecs-faq?tab=readme-ov-file#is-ecs-fast)</cite>
If you want to know more about *why* it's fast, read Sander's article about [how to build ECS](https://ajmmertens.medium.com/building-an-ecs-2-archetypes-and-vectorization-fe21690805f9).
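For a rough picture of what "parallel arrays" means in practice, here's a tiny sketch. It assumes one array per field with the entity ID as the shared index. It isn't a benchmark and the values are invented. It only shows the shape of the data.
```cs
using System;

// parallel arrays: one array per field, all indexed by entity ID
double[] positionX = { 0, 10, 20 };
double[] velocityX = { 1, -2, 3 };

// the system walks both arrays front to back in a single linear pass
for (int entity = 0; entity < positionX.Length; entity++) {
    positionX[entity] += velocityX[entity];
}

Console.WriteLine(string.Join(", ", positionX)); // prints "1, 8, 23"
```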
## Theseus' Shield
With all of that out of the way, I want to talk a bit about the issues I've encountered in my own design. I primarily use Unity for developing games so that may influence some of the issues I encounter, but I've also run into this in other engines and environments. I *feel* like these issues are general enough to not be engine-specific.
### Using object-oriented modeling
In the context of game design and development, object oriented data modeling can feel intuitive (at least for me). Thinking of data as things which do stuff and interact with each other makes sense.
One of the more common tools available in OOP is inheritance: the ability to inherit the behaviors of one class by extending it. Players, NPCs, and Enemies can all derive from a common "Creature" class which handles things like health and damage. Weapons, armor, and consumables can derive from a common Item class which handles inventory management.
Thinking exclusively in terms of hierarchical, nested patterns can come at a cost: software brittleness and inflexible design. Let's use an `Item` class as an example.
```cs
abstract class Item {
    public string Name { get; protected set; }
}
abstract class Weapon: Item {
    public float Damage { get; protected set; }
    public float Range { get; protected set; }
}
abstract class Armor: Item {
    public float Defense { get; protected set; }
}
class Sword: Weapon {}
class Bow: Weapon {}
class Helmet: Armor {}
class Shield: Armor {}
```
In the above scenario, we have [three layers of inheritance](https://wiki.c2.com/?MaxThreeLayersOfInheritance). We can imagine that `Item` handles highly generic item behavior, such as being added to or removed from an inventory. `Weapon` would then be responsible for an item which is capable of dealing damage at any range. We can expect that the specifics of its attack behavior would be implemented by a subclass. Similarly, `Armor` would handle generic damage interactions such as common mitigation or avoidance calculations while leaving specific implementation details to its subclasses.
Seasoned game devs, disciplined object-oriented programmers, and existing ECS developers can likely already see what comes next: a new requirement.
How do we add a Spiked Shield item, one which derives the behaviors of both `Weapon` and `Armor`? If we were using a language which supports multiple inheritance this could potentially be a nonissue, but multiple inheritance is [purposefully missing](https://en.wikipedia.org/wiki/Multiple_inheritance#The_diamond_problem) in many languages. We could decide that the item belongs more to one class than the other and just duplicate the missing behavior, but these types of decisions often introduce unforeseen complexity. Will the damage algorithm need to do a specific type check for `SpikedShield`? What about the equipment screen? And of course, what happens when we need to implement a damaging potion?
### Using composition
Perhaps the most appropriate solution for this case would be to forgo the inheritance pattern in favor of [composition](https://en.wikipedia.org/wiki/Composition_over_inheritance). In C#, this could be achieved with interfaces and default implementations.
```cs
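interface ICollectable {
    void Take();
    void Remove();
}
interface IDamaging {
    // by default, we'll just pass the damage directly to
    // the receiving damage handler
    void ApplyDamage(IDamageable damageable, float value) {
        damageable.ReceiveDamage(this, value);
    }
}
interface IDamageable {
    void ReceiveDamage(IDamaging source, float value);
}
class SpikedShield: ICollectable, IDamaging, IDamageable {
    // only the members without a default implementation need bodies
    public void Take() { /* add to an inventory */ }
    public void Remove() { /* remove from an inventory */ }
    public void ReceiveDamage(IDamaging source, float value) { /* mitigate, then apply */ }
}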
```
We can use this approach to refactor our existing items and systems to derive/override only the behaviors they use. The respective systems for these interfaces now only need to check for the existence of these interfaces in order to act on them.
Engines like Unity and Godot use variations of the [Entity-Component (EC)](https://gameprogrammingpatterns.com/component.html) pattern (similar to but distinct from Entity-Component-System (ECS)). These patterns favor composition over inheritance by allowing developers to isolate behaviors into discrete components that can be applied to entities. In the Spiked Shield example, a developer could make a "Damage Source" and "Damage Target" component and add both to the item. In essence, this is the same as the interface-based approach.
### Using ECS
Rather than modeling a hierarchy of inheritance, let's try handling our items with components instead. We can do something similar to the interface-based approach since that's almost exactly how ECS works anyway. I'll go back to using a fake ECS syntax here for simplicity.
```cs
struct Damage {
    public float Value;
}
struct Defense {
    public float Value;
}
var sword = Entity.create()
    .withComponents(new Damage { Value = 25 });
var helmet = Entity.create()
    .withComponents(new Defense { Value = 5 });
var spikedShield = Entity.create()
    .withComponents(new Damage { Value = 10 }, new Defense { Value = 10 });
```
All we have to do is determine required components for handling damage and add those. The identity of the spiked shield is determined by its component composition.
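To show what "identity by composition" buys us, here's a small sketch of two systems that never mention spiked shields. It assumes one dictionary per component keyed by entity ID. The storage layout, the `TotalDamage` and `TotalDefense` helpers, and the loadout are all hypothetical.
```cs
using System;
using System.Collections.Generic;

// hypothetical storage: one dictionary per component, keyed by entity ID
var damage  = new Dictionary<int, float>();
var defense = new Dictionary<int, float>();

int sword = 0, helmet = 1, spikedShield = 2;
damage[sword] = 25;
defense[helmet] = 5;
damage[spikedShield] = 10;
defense[spikedShield] = 10;

// neither system knows what a "spiked shield" is;
// each one only looks at the component it cares about
float TotalDamage(IEnumerable<int> equipped) {
    float total = 0;
    foreach (int item in equipped) {
        if (damage.TryGetValue(item, out float value)) total += value;
    }
    return total;
}

float TotalDefense(IEnumerable<int> equipped) {
    float total = 0;
    foreach (int item in equipped) {
        if (defense.TryGetValue(item, out float value)) total += value;
    }
    return total;
}

int[] loadout = { sword, spikedShield };
Console.WriteLine(TotalDamage(loadout));  // prints "35"
Console.WriteLine(TotalDefense(loadout)); // prints "10"
```
A damaging potion would slot in the same way: give it a damage value and the existing system picks it up without any type checks.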
## In the event of my demise
Due to the nature of video games, important events may need to be handled at any moment. For example, a player may've dealt a fatal blow to a boss enemy on the same frame that they received fatal damage. In which order should this damage be processed? Depending on the handling order, this is likely the difference between clearing a potentially difficult boss battle and needing to do it again.
### Using events
In my experience, this would likely be handled by a traditional event system where the order is difficult to predict. This isn't to make the claim that synchronous event systems are *unpredictable*, but ensuring that a given event will be handled in a way that is predictable to the designer/developer without additional abstractions is difficult.
It's often useful to explicitly define the processing order of certain interactions and events. To solve this problem, we could implement a priority [event queue](https://gameprogrammingpatterns.com/event-queue.html).
```cs
public class DamageEventArgs : EventArgs {
    public readonly IDamaging Source;
    public readonly IDamageable Target;
    public readonly float Value;
    public DamageEventArgs(IDamaging source, IDamageable target, float value) {
        Source = source;
        Target = target;
        Value = value;
    }
}
// this is more of a dispatcher, but that's
// an unnecessary implementation detail.
// in a real scenario where other systems would be reading
// these events too, this would subscribe to a Mediator object
class DamageEvent {
    // only one of these should exist and should be globally
    // accessible so we make it a singleton
    private static DamageEvent instance = new DamageEvent();
    public static DamageEvent Instance => instance;
    private PriorityQueue<DamageEventArgs, int> queue = new();
    public void Raise(DamageEventArgs args) {
        // before enqueuing, determine the priority
        // based on the object dealing damage.
        // PriorityQueue dequeues the lowest value first,
        // so the player gets the smallest number
        var priority = args.Source switch {
            Player => 0,
            Enemy => 1,
            _ => 2
        };
        queue.Enqueue(args, priority);
    }
    // we'll dispatch events with the tick rate
    // so they're all handled at the same time
    private void OnTick(float deltaTime) {
        // for simplicity we'll dequeue everything each frame
        // and pass it to the damage handler
        while (queue.TryDequeue(out var e, out _)) {
            e.Source.ApplyDamage(e.Target, e.Value);
        }
    }
}
// bossEnemy's damage should be processed after player's damage
// if they're raised during the same frame
DamageEvent.Instance.Raise(new DamageEventArgs(bossEnemy, player, 10));
DamageEvent.Instance.Raise(new DamageEventArgs(player, bossEnemy, 10));
```
There's a lot of issues with this code that I'm going to pretend were deliberate decisions for brevity. The point of this example is to show that we can ingest events and sort them arbitrarily based on the requirements of the game. We've made a step in the right direction by identifying a need to explicitly order these events. Even if this implementation isn't ideal, it properly encodes the requirements of the design (i.e. player damage should be processed before all other types).
There is still a timing issue with this approach however. Events can be raised at any point: before, during, or after the `DamageEvent` queue has already done its work for the frame. If `bossEnemy` raises its damage event before `DamageEvent` processes its queue but `player` raises their event after, we still have the original issue.
Depending on the engine and implementation, there may be a few options for solving this. Rather than using `OnTick`, the damage can be handled in `OnLateTick` which runs after all `OnTick` systems have been processed (Unity's versions of these methods are `Update` and `LateUpdate`). In Unity the `DamageEvent` singleton script could have its order explicitly modified in the settings or with the `DefaultExecutionOrder` attribute. In Godot, this would likely be resolved by moving the `DamageEvent` node lower in the tree since nodes are processed from top to bottom while resolving children first.
An alternative solution would be to identify the behavior which raises events and isolate it into its own singleton. Doing this would allow it to easily be ordered before the damage handling system. We'll revisit this idea later.
### Using ECS
Since our systems are handled in one place in a well-defined order, this shouldn't be an issue unless we go out of our way to make it one.
I'll use Bevy as an example for system ordering.
```rs
App::new()
    .add_plugins(DefaultPlugins)
    .add_systems(Update, (
        detect_damage_collisions,
        apply_damage.after(detect_damage_collisions)
    ))
    .run();
```
Not all libraries allow ordering in this way, but they should *all* have ways to establish some order between systems.
## A Joke about The Offspring
Managing a game state as it grows in size and complexity is *difficult*. It's not uncommon for a game to have many thousands of active entities at a time. In many cases, it makes sense to decouple separate-but-related behaviors into their own systems to make them easier to manage.
### Using events
One such separation would be game logic and UI. Rather than directly coupling the player's health to the UI representation of the player's health, it makes sense to have them communicate via some type of messaging system. An event seems like a natural fit.
```cs
public class PlayerSpawnedArgs : EventArgs {
    public readonly Player Player;
}
public class HealthChangedArgs : EventArgs {
    public readonly double PreviousValue;
    public readonly double CurrentValue;
}
class Healthbar : UIElement {
    // same type as the property below so no narrowing cast is needed
    private double _currentHealth;
    // use getter/setter to automatically redraw
    // the UI on state changes
    private double currentHealth {
        get => _currentHealth;
        set {
            if (value != _currentHealth) {
                _currentHealth = value;
                Redraw();
            }
        }
    }
    protected void OnInitialize() {
        // register to the player spawn event
        PlayerSpawnEvent.Register(OnPlayerSpawned);
    }
    private void OnPlayerSpawned(object sender, PlayerSpawnedArgs args) {
        // subscribe specifically to that player's events
        args.Player.health.Register(OnHealthChanged);
    }
    private void OnHealthChanged(object sender, HealthChangedArgs args) {
        currentHealth = args.CurrentValue;
    }
}
```
This will work for many cases but it's unfortunate that it needs to manage subscriptions to two events in order to get the updates it needs. If we ever needed to add more than one player that renders a health bar, this solution would no longer be sufficient. We could circumvent this by making health updates dispatched globally but that would come at the cost of checking *every* health update just to find a player. This solution would be much more appealing, however, if every entity with health had a health bar. One happy medium would be to create a more specific global event for player health updates. `Healthbar` can then simply register to that event without needing to concern itself with the particulars of spawning.
With all this event-driven programming comes another downside: events are opaque and difficult to debug. When an architecture relies heavily on events, it becomes increasingly important to have outstanding documentation. It's an unfortunate sacrifice to be made in exchange for the highly decoupled nature of events. Some game engines have made attempts to offer better insight into event connections but it's often only a mild remedy.
In order to better query the state of an event-driven application, [Martin Fowler's Event Sourcing post](https://martinfowler.com/eaaDev/EventSourcing.html) provides some additional respite. It's worth reading if you're working with many events in your game. Essentially, tracking the source of an event *in* the newly raised event allows subscribers to walk the chain of source events in reverse to determine additional context.
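The part about tracking the source of an event inside the newly raised event could look something like the sketch below. It assumes events live in an append-only log and refer to their cause by index. The event names and the `Raise` and `Trace` helpers are hypothetical, and this isn't a full event sourcing implementation.
```cs
using System;
using System.Collections.Generic;

// hypothetical event log: each entry stores a name and the ID of the
// event that caused it (null when nothing caused it)
var events = new List<(string Name, int? Cause)>();

int Raise(string name, int? cause = null) {
    events.Add((name, cause));
    return events.Count - 1; // the event's ID is its index in the log
}

int swing  = Raise("SwordSwung");
int hit    = Raise("DamageDealt", cause: swing);
int change = Raise("HealthChanged", cause: hit);

// a subscriber can walk backwards from any event to find out why it happened
List<string> Trace(int id) {
    var chain = new List<string>();
    int? current = id;
    while (current is int index) {
        chain.Add(events[index].Name);
        current = events[index].Cause;
    }
    return chain;
}

Console.WriteLine(string.Join(" <- ", Trace(change)));
// prints "HealthChanged <- DamageDealt <- SwordSwung"
```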
### Using ECS
To accomplish the logical separation between game state and its representation, all that really needs to happen is for our health UI system to query for the entities that it cares about and then render those elements.
```cs
// we're going back to our imaginary C# ECS library
public void UpdateHealthbarSystem() {
    // let Player be a tag component
    var playerHealthQuery = Query<Health, Player>();
    var healthbarsQuery = Query<Target, Healthbar>();
    foreach (var (target, healthbar) in healthbarsQuery) {
        if (playerHealthQuery.HasComponent(target.entity)) {
            // throw away the Player tag component
            var (health, _) = playerHealthQuery[target.entity];
            healthbar.value = health.current;
        }
    }
}
```
Did I cherry pick these examples to highlight the strengths of ECS? Kind of, but it's not like there are any
## Weaknesses
Okay, yeah. I've had problems using it too. There are two stand-out pain points that I have with ECS and I will readily acknowledge that they're my own failings.
### Modeling
Although conceptually simple, I find it difficult to model games in ECS. This could be due to my inexperience with it. I have *20 years* of experience with traditional object-oriented modeling and only a couple of years with data-oriented modeling.
I have trouble figuring out when to make new components, when to decouple data by making new entities, and how many things a system should be responsible for. It's *really* nice to have just a few extremely powerful tools rather than a seemingly infinite number of highly specific patterns, but it hasn't really ever been intuitive for me.
### Complex queries
Every existing query system I've encountered with ECS is limited to "does this entity have this component or not?" This is functional for many simple queries as well as performant. Unfortunately it means that anything more complicated than querying for components will have to be handled by the system logic.
If I want to query for any entity with a `Health` component, that's easy: `Query<Health>`. What if I want to refine that query to any entity with a `Health.current` value of less than 50? We can certainly check for that in our system but it would be nice to know that our given entity array *only* contains those entities which meet that condition.
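In practice that means the value check ends up inside the system. Here's a sketch of the split, assuming a nullable array stands in for the `Health` column. The data and variable names are invented for the example.
```cs
using System;
using System.Linq;

// hypothetical Health column, indexed by entity ID
// (null means the entity has no Health component)
double?[] currentHealth = { 100, 5, null, 49 };

// the "query" can only answer: does this entity have Health?
var withHealth = Enumerable.Range(0, currentHealth.Length)
    .Where(entity => currentHealth[entity].HasValue);

// the value check has to live in the system itself
var wounded = withHealth
    .Where(entity => currentHealth[entity] < 50)
    .ToList();

Console.WriteLine(string.Join(", ", wounded)); // prints "1, 3"
```
The second `Where` is the part I'd love to push into the query itself.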
What about a series of complex joins? What if we want to query all the NPCs that the player has met which are carrying a specific type of item? Perhaps we're working on a deduction game where it would be useful to query relationships between the player, NPCs, locations, and items all at once? It may seem a bit contrived but I feel that having a language which can accurately capture and represent what we want is incredibly valuable.
## Anyway
This post is getting way too long and I've already cut it in half. There's going to be a part two at some point.
ECS is conceptually easy but really difficult to learn and master. Building an intuition for it takes time. As far as I'm concerned it's been worth it — it's another tool to solve a diverse range of problems. If you're interested in learning and using ECS, especially for game development, I would highly recommend going out and using it in order to start building an intuition for it.
{% endblock article %}