/ GDC, TOOLS_TUTORIAL_DAY, DATA

Notes for "A Tale of Three Data Schemas"

GDC 2018, Tools Tutorial Day, Ludovic Chabant, Senior Software Engineer @ Electronic Arts Vancouver shared the know-how of the Frostbite engine team when it comes to designing the game’s data.

These are my notes that I wrote down from watching Ludovic's presentation. If you like what you read I hight recommend getting a GDC Vault subscription and watching the full talk link to talk.

Main Takeaway

The same data is used in different ways. Optimizing the data’s logical representation for different usage scenarios is key in achieving great performance, efficient use of storage and a sublime user experience. The way that an artist thinks of the data that he or she sees in the editor window might not be the best way to represent the data to load it as fast as possible into the game or might not be the best way to store it on disk. Different data consumption scenarios demand different optimization. Off course maintaining more than one data layout isn’t free but it is worth the investment.

different data schemas to solve different problems

a picture of a side from Ludovic Chabant's GDC 2018 presentation

Data Schema

Definition: a data schema is the formal description of the structures a system is working with.

Examples of data schemas

  • Public properties of script components in Unity
  • Decorated public properties of a UClass in Unreal Engine
  • Blind Data in Maya
  • Table columns in a database management system
  • Frostbite DDF (Data Definition Format)

Basic idea

A data schema is a for communicating about the existence of a type of something which has:

  • properties like
    • ints
    • floats
    • etc.
  • references to other types

Usage scenarios

A data schema is used to:

  • display a type’s properties in an editor context.
  • store the type on disk.
  • load the type at runtime.

Designing a data schema

Designing data schemas requires consideration

a picture of a side from Ludovic Chabant's GDC 2018 presentation


If you design a data schema with a programmer bias you will get a data schema that will be great for loading at runtime.

It will be:

  • flat and packed
  • optimized for loading
  • unintelligible for designers and artists

If you design a data schema with a content creator bias you will get a data schema that will be great for human understanding and iteration.

It will be:

  • easy to use and edit by multiple people
  • not cache friendly
  • not efficiently laid out in memory

Three different data schemas

Not just one data schema

a picture of a side from Ludovic Chabant's GDC 2018 presentation

Runtime data schema

used by

  • the game for loading the data into memory
  • the programmers in the code

purpose

  • performance
  • patching
  • loading
  • nicely packed in memory

optimized for

  • reading

Storage data schema

the tools backend a.k.a disk version

Tools usually use this to save what the user is working on (the logical peace of data). Not to be confused with a data format like XML, JSON or YAML.

used by

  • tools to save users work on some form of persistent storage

purpose

  • can be versioned and easily merged in revision control
  • can be used by tools for automation
  • supports multi-user editing (splitting into different parts)
    • for example, the runtime data schema doesn’t need this

optimized for

  • writing to disk

Tool data schema

the tools frontend

How the content creators see it in their minds. How the game data is displayed in an editor as a graph, property grid or a gizmo. You can think of this as a subset of UX. Usually, very Object Oriented.

used by

  • content creators for content creation
  • engineers for analysis and decision making
  • tools developers for creating a great editing experience

purpose

  • better understanding by humans
  • better UX
  • workflow and iteration oriented

optimized for

  • editing
  • iteration

Frostbite Engine tails

Ludovic brings up 4 tails about how the 3 data schemas are used in the Frostbite Engine.

The main points of the tails:

  1. Use a more user-friendly data schema for user editing, but keep an eye out for where you do the conversion from one data schema to another.
  2. Use different data schemas to solve different problems, and optimize for different use cases.
  3. UX and Data Schemas have a strong relationship. Changing one might affect the other.
  4. Use conditional compilation to include tools/storage data schemas in the runtime data schemas. Conditionally include appropriate pipeline code in the runtime to transform data on the fly during live-edit.

The most important Data Schema

The Storage Data Schema is the most important Data Schema. It is only Data Schema that is persistent.

it the most expensive to change, because it is persistent

Designing Data Schemas

  1. Start with designing the Storage Data Schema
  2. Build your Tools around the Storage Data Schema
  3. Implement your Runtime Code around the Storage Data Schema
    • if it is obvious that the Storage Schema is not performant enough go to the next subsection to find out how to deal with it

Dealing with performance issues

If the Runtime Data Schema is not performant enough

  1. Adjust the Runtime Data Schema and the code
  2. Write the pipeline code to convert from the Storage Schema to the Runtime Schema

Mixing Data Schemas

If your engine does not formally support different types of data schemas, use conditional compilation to have ways to remove the Tools/Storage Data Schemas from the production build.

What we have learned

  1. There are different ways to organize the same logical data (Tools, Storage, Runtime)
  2. Use more user-friendly data organization for the users of your Tools and a more performant way of data organization for your Runtime
  3. With this way of organizing data, you can solve different problems without a lot of sacrifices
  4. Always remember that a change in a Data Schema can lead to a change in the UX of a tool and vice versa
  5. Have a pipeline in place to convert from the Storage Schema to the Runtime Schema

Personal notes

This reminded me of Mike Acton’s Data-Oriented Design talk where every problem in programming is a data transformation problem.

Mindmap / Mental Representation

Click to see it in fullscreen

Link to mindmap source

You can open the file with XMind

The Toolsmiths

The Toolsmiths logo Ludovic Chabant is a member of the Toolsmiths community. The Toolsmiths are a community of Game Tool Developers that are passionate about improving the way people make games.

Join us on Slack.

Join us on Twitter.