An Introduction to ECS
Data Oriented Design is pretty simple at its core: all we really have is data.
By way of contrast, consider Object Oriented Programming, where a common approach is
to model objects in the problem domain rather than the programming domain.
There are two major problems with this approach. Firstly, inheritance comes with a runtime cost, as each virtual function call must be
resolved to its concrete implementation. In Python, where every attribute and method lookup is resolved dynamically, the OOP overhead is
higher still. Secondly, although grouping object data together is great in the problem domain, it's not good for performance. When we step
through an array of monsters and grab a monster to work with, some number of nearby bytes are loaded into the CPU cache, meaning those bytes
will be very fast to access. The problem, then, is that if the stride (the number of bytes between one element and the next) is needlessly
large, less of the cached data is actually useful.
The solution to these two problems (and many more) is to split apart an object's data into various arrays.
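To put rough numbers on the stride argument, here is a minimal sketch that only counts bytes. The monster layout, the field names, and the 64-byte cache line are all assumptions made for illustration, and Python's struct module is used purely as a byte calculator (real Python objects are not laid out this way).

```python
import struct

# Hypothetical "fat" monster record, packed with no padding ("<"):
# position (3 floats), rotation (3 floats), health (int),
# a 32-byte name, and an AI state (int).
MONSTER_RECORD = struct.Struct("<3f3fi32si")

# Position-only record, as it would sit in its own array after the split.
POSITION_RECORD = struct.Struct("<3f")

# Assumed cache line size; 64 bytes is common but not universal.
CACHE_LINE_BYTES = 64


def whole_elements_per_cache_line(stride):
    """How many complete elements of the given stride fit in one cache line."""
    return CACHE_LINE_BYTES // stride


monster_stride = MONSTER_RECORD.size    # bytes between consecutive monsters
position_stride = POSITION_RECORD.size  # bytes between consecutive positions

# Stepping through whole monsters, each cache line holds one position.
positions_when_grouped = whole_elements_per_cache_line(monster_stride)

# Stepping through a packed position array, each line holds several.
positions_when_split = whole_elements_per_cache_line(position_stride)
```

Under these assumptions a loop that only needs positions gets one useful position per cache line from the grouped layout and five from the split layout.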
Entity
An entity plays the role of a primary key in a database. It's an integer, nothing more.
Component
A component is a small chunk of related data. For instance, in the above diagram, a transform component holds just a position and rotation.
In C or C++ these would be structs, but in Python Arts et Metiers uses NumPy arrays (for more details, see the tutorial on packed arrays).
Every component has an associated entity, so that different sets of components can work together to describe objects. Say, for instance, we
want entity 7 to represent a cube. The cube will have a Transform component and a Velocity component (and other things, but this is keeping
things simple). When the cube is created, an insert operation is performed on the transform and velocity arrays. To simplify this, each of
those arrays is managed by a "ComponentSet". The ComponentSet also has a simple array of integers tracking the associated entity for each
component, so in our example, both the Transform and Velocity ComponentSets would also insert the integer 7 into their entity arrays.
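The entity 7 example can be sketched roughly as follows. This is not the actual Arts et Metiers ComponentSet: the method names (insert, get), the constructor argument, and the choice of three Euler angles for rotation are hypothetical, and plain array.array buffers from the standard library stand in for NumPy arrays to keep the sketch dependency-free.

```python
from array import array


class ComponentSet:
    """Packed storage for one component type (hypothetical API)."""

    def __init__(self, fields_per_component):
        self.width = fields_per_component
        self.data = array("f")      # component fields, packed back to back
        self.entities = array("i")  # entities[i] owns component i

    def insert(self, entity, values):
        """Append one component and record which entity it belongs to."""
        if len(values) != self.width:
            raise ValueError("expected %d fields" % self.width)
        self.entities.append(entity)
        self.data.extend(values)
        return len(self.entities) - 1  # index of the new component

    def get(self, index):
        """Return the fields of the component stored at the given index."""
        start = index * self.width
        return tuple(self.data[start:start + self.width])


# Assumed layouts: a transform is x, y, z plus three rotation angles,
# a velocity is dx, dy, dz.
transforms = ComponentSet(6)
velocities = ComponentSet(3)

cube = 7  # the entity is just an integer
transforms.insert(cube, (0.0, 1.0, 0.0, 0.0, 0.0, 0.0))
velocities.insert(cube, (1.0, 0.0, 0.0))
```

After the two inserts, both entity arrays contain the single integer 7, which is the only thing tying the cube's transform to its velocity.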
System
A system is a small piece of logic: a function that takes in some sets of components and uses them in some simple way. By splitting
large game logic operations into pieces, we keep the cache hit rate high, because each system steps through only the arrays it needs
rather than handling all sorts of different objects. The transform system sees an array of positions, contiguous in memory with no gaps.
To better illustrate, here are some examples:
- Velocity System: takes in Transform and Velocity components, uses velocity to update positions.
- Input System: manages mouse and keyboard callback functions/polling, updates the camera's orientation data, uploads to uniforms.
- Render System: takes in Render Batch descriptions and Model matrices, manages meshes and materials, draws everything.
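As an illustration of the first of these, a velocity system might look something like the sketch below. The function name, its parameters, and the dictionary used to match entities are assumptions. To stay short it handles positions only (no rotation) and uses standard library arrays rather than NumPy.

```python
from array import array


def velocity_system(positions, position_entities,
                    velocities, velocity_entities, dt):
    """Advance every position whose entity also has a velocity."""
    # Map each entity to the index of its velocity component.
    velocity_index = {e: i for i, e in enumerate(velocity_entities)}

    for i, entity in enumerate(position_entities):
        j = velocity_index.get(entity)
        if j is None:
            continue  # this entity has no velocity, so it stays put
        for axis in range(3):
            positions[3 * i + axis] += velocities[3 * j + axis] * dt


# Two entities have positions, but only entity 7 has a velocity.
position_entities = array("i", [3, 7])
positions = array("f", [0.0, 0.0, 0.0,
                        1.0, 2.0, 3.0])
velocity_entities = array("i", [7])
velocities = array("f", [2.0, 0.0, -4.0])

velocity_system(positions, position_entities,
                velocities, velocity_entities, dt=0.5)
```

The system never sees a cube or a monster, only flat arrays of numbers and the integers that say who owns them. With NumPy arrays the inner loop over axes could be replaced by a single vectorized update.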
That's the basic philosophy of Data Oriented Design. Of course, just because a system is data oriented doesn't mean
it can't have object oriented parts. Hopefully an awareness of the issues Data Oriented Design addresses will help you strike
the right balance in your own coding.