@Drullkus
Last active September 29, 2024 23:16

Mojang Codecs

Mojang's DataFixerUpper library ("DFU") includes a flexible system that eases encoding and decoding JSON files into game objects for Minecraft to use. Its main feature is Codecs, which let you focus on the file schema and worry little about the methods of (de)serialization.

A lot of simplifications are assumed in this guide, for the convenience of explaining the fundamentals.

Getting Started

The included class CodecExample (referred to as Foobar in the snippets below) is provided as an example. Note that it contains the fields boolean "foo", List<Integer> "bar", and BlockState "blockState", as well as a constructor whose parameters match those fields in the same order. There is also a JSON file to match:

{
  "foo": false,
  "bar": [ 0, 1, 2 ],
  "blockstate_example": {
    "Name": "minecraft:dark_oak_log",
    "Properties": {
      "axis": "y"
    }
  }
}

For supporting this schema, the Codec looks like this:

Codec<Foobar> CODEC = RecordCodecBuilder.create(
    instance -> instance.group(
        Codec.BOOL.fieldOf("foo").forGetter((Foobar o) -> o.foo), // Basic boolean Codec
        // Codec for building a list
        Codec.INT.listOf().fieldOf("bar").forGetter((Foobar o) -> o.bar),
        // Example usage of using a different class's Codec
        BlockState.CODEC.fieldOf("blockstate_example").forGetter((Foobar o) -> o.blockState)
    ).apply(instance, (fooC, barC, blockStateC) -> new Foobar(fooC, barC, blockStateC))
);

Note the similarities between the Java class code and the JSON file contents on each line.

Quick Crash Course with Codecs

Working with Primitive Codecs

Codec.BOOL.fieldOf("foo").forGetter((Foobar o) -> o.foo) // Basic boolean Codec

A Codec can be made of other Codecs, the same way a JSON object is made of JSON elements, whether it's an integer, String, or another JSON object. Our first line inside instance.group defines a primitive Codec: Codec.BOOL.fieldOf("foo").forGetter((Foobar o) -> o.foo). fieldOf sets the schema key for the data pair, and forGetter accepts a Function<Foobar, Boolean> to extract the value of Foobar.foo to serialize. This entire line is a Codec by itself. Not only that, this line also sets the Boolean type for the first parameter of our initializer. Pretty wack!

Lists using Codecs

// Codec for building a list
Codec.INT.listOf().fieldOf("bar").forGetter((Foobar o) -> o.bar),

To process our array of integers we use the exact same process as before, but we now have .listOf() in the method chain to wrap our Codec in a list Codec. The rest is easy, with fieldOf and forGetter accepting a Function<Foobar, List<Integer>>. The existence of this Codec alone also sets the type for the second parameter of our object initializer to List<Integer>.

Using other classes as Codecs

BlockState.CODEC.fieldOf("blockstate_example").forGetter((Foobar o) -> o.blockState)

The BlockState part of the schema hopefully paints a clearer picture of the fact that Codecs can be made from other Codecs. In fact, we're not doing anything much fancier than before. All you need is to reference net.minecraft.block.BlockState.CODEC and then assign the schema key with fieldOf and value-getter with forGetter containing a Function<Foobar, BlockState>.

Constructing it

.apply(instance, (fooC, barC, blockStateC) -> new Foobar(fooC, barC, blockStateC))

Inside .apply() is where we put our object initializer. Since it's just 3 parameters with 3 values, our initialization lambda could be an easy (fooC, barC, blockStateC) -> new Foobar(fooC, barC, blockStateC). However, since all of the parameters are in the same order as the constructor's, we can do away with the lambda and slap in a method reference, Foobar::new.

.apply(instance, Foobar::new)

And that's it, actually! It's not as simple as ABC, but it's as easy as ABC.
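The equivalence between the lambda and the constructor reference can be checked in plain Java. The sketch below does not use DFU: Function3 is a hypothetical stand-in for com.mojang.datafixers.util.Function3, and Foobar is cut down so that a String takes the place of BlockState.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for DFU's Function3, declared here so the sketch runs without the library.
interface Function3<A, B, C, R> {
    R apply(A a, B b, C c);
}

// Simplified Foobar: a String replaces BlockState so that no Minecraft classes are needed.
class Foobar {
    final boolean foo;
    final List<Integer> bar;
    final String blockState;

    Foobar(boolean foo, List<Integer> bar, String blockState) {
        this.foo = foo;
        this.bar = bar;
        this.blockState = blockState;
    }
}

class ConstructorReferenceSketch {
    // Explicit lambda: the parameters are passed along in declaration order.
    static final Function3<Boolean, List<Integer>, String, Foobar> LAMBDA =
            (fooC, barC, blockStateC) -> new Foobar(fooC, barC, blockStateC);

    // Constructor reference: works because the constructor takes the same types in the same order.
    static final Function3<Boolean, List<Integer>, String, Foobar> REFERENCE = Foobar::new;

    public static void main(String[] args) {
        Foobar a = LAMBDA.apply(false, Arrays.asList(0, 1, 2), "minecraft:dark_oak_log");
        Foobar b = REFERENCE.apply(false, Arrays.asList(0, 1, 2), "minecraft:dark_oak_log");
        // Both objects carry identical field values.
        System.out.println(a.foo == b.foo && a.bar.equals(b.bar) && a.blockState.equals(b.blockState));
    }
}
```

If the constructor's parameter order ever differs from the order of the fields in the group, the method reference stops compiling (or silently swaps same-typed values), and the explicit lambda is the safer choice.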

The Foundations

To be expanded

Codecs

On the simplest level, a Codec represents the specification of an object. Its generic parameter sets the object's type, and it can encode (serialize) the object into data and decode (deserialize) the data to make the object.
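As a mental model only, a Codec can be pictured as a pair of functions that undo each other. The sketch below is plain Java and is not DFU's actual Codec interface: TinyCodec, its encode and decode methods, and the use of String as the "data" format are all assumptions made for illustration.

```java
import java.util.function.Function;

// Hypothetical, heavily simplified model of a codec. NOT the real com.mojang.serialization.Codec.
final class TinyCodec<T> {
    private final Function<T, String> encoder; // object -> data (serialize)
    private final Function<String, T> decoder; // data -> object (deserialize)

    TinyCodec(Function<T, String> encoder, Function<String, T> decoder) {
        this.encoder = encoder;
        this.decoder = decoder;
    }

    String encode(T value) {
        return encoder.apply(value);
    }

    T decode(String data) {
        return decoder.apply(data);
    }

    // The generic parameter fixes the type on both sides, in the spirit of Codec.BOOL and Codec.INT.
    static final TinyCodec<Boolean> BOOL =
            new TinyCodec<Boolean>(value -> Boolean.toString(value), Boolean::parseBoolean);
    static final TinyCodec<Integer> INT =
            new TinyCodec<Integer>(value -> Integer.toString(value), Integer::parseInt);

    public static void main(String[] args) {
        String data = INT.encode(42); // serialize
        int back = INT.decode(data);  // deserialize
        System.out.println(data + " -> " + back);
        System.out.println(BOOL.decode(BOOL.encode(false)));
    }
}
```

The real Codec works against a DynamicOps format such as JSON and reports failures through DataResult instead of throwing, but the encode/decode pairing is the same idea.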

Mapping Codecs

As you've seen in the crash course above, invoking .fieldOf on a Codec assigns it a field name. It then becomes a MapCodec, a key-value pair where the key is the field name and the value is whatever the original Codec represents.

Default Values

Relatively straightforward: add .orElse into your method chain anywhere before the .forGetter.

Codec.BOOL.orElse(false).fieldOf("foo").forGetter((Foobar o) -> o.foo)

If you wish to use a supplier object, .orElseGet lets you do that.

If you wish to log an error for a missing entry during deserialization, you can pass a Consumer<String> or UnaryOperator<String> as the first argument. Be warned that your IDE may report an ambiguous method call, so unfortunately you may have to cast the lambda or method reference.

BlockState.CODEC.fieldOf("blockstate_example").orElseGet((Consumer<String>) System.out::println, Blocks.ACACIA_WOOD::getDefaultState).forGetter(o -> o.blockState)

Optional

Using optionalFieldOf instead of fieldOf wraps your Codec's type in an Optional, transforming your BlockState into Optional<BlockState>. Using orElse from the topic above will cover most use cases if you're actually looking to supply a default value, but in any case where you want to represent the actual non-existence of an object, this will be perfect.
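The difference between a default value and true absence can be illustrated without DFU. The sketch below uses a plain Map<String, String> as a hypothetical stand-in for a parsed JSON object; the class and method names are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class DefaultVersusOptionalSketch {
    // Mirrors the idea of orElse: a missing key silently becomes the fallback value.
    static String readWithDefault(Map<String, String> json, String key, String fallback) {
        return json.getOrDefault(key, fallback);
    }

    // Mirrors the idea of optionalFieldOf: a missing key stays visible as Optional.empty().
    static Optional<String> readOptional(Map<String, String> json, String key) {
        return Optional.ofNullable(json.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> json = new HashMap<>();
        json.put("foo", "false");

        // The caller cannot tell whether the fallback was used.
        System.out.println(readWithDefault(json, "blockstate_example", "minecraft:acacia_wood"));
        // The caller can tell that the entry was absent.
        System.out.println(readOptional(json, "blockstate_example").isPresent());
        System.out.println(readOptional(json, "foo").isPresent());
    }
}
```

With a default, the object's field can stay a plain BlockState; with an Optional, both the field type and the constructor parameter have to become Optional<BlockState>.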

Getting Advanced

To be expanded

Array Building from IntStream and intro to Codec type mapping

This is actually a bit of an outtake from an earlier revision, but it still has a few good talking points I'd like to cover. The big idea is that we're mapping IntStream into int[] and back with xmap, since Codecs are all about writing serialization and deserialization at the same time. This method allows us to pre-process a Codec into a different, preferred type. (The example assumes bar is declared as int[] instead of List<Integer>.)

Codec.INT_STREAM.xmap( // IntStream Codec into array
    IntStream::toArray, // Deserializing
    Arrays::stream // Serializing
).fieldOf("bar").forGetter((Foobar o) -> o.bar)
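The two functions handed to xmap have to undo each other. A minimal plain-Java check of that round trip, with no DFU involved (the class and constant names are hypothetical), might look like this:

```java
import java.util.Arrays;
import java.util.function.Function;
import java.util.stream.IntStream;

class IntStreamArraySketch {
    // Deserializing direction: the stream read from the data becomes the array the object stores.
    static final Function<IntStream, int[]> TO_ARRAY = IntStream::toArray;

    // Serializing direction: the stored array is turned back into a stream to be written out.
    static final Function<int[], IntStream> TO_STREAM = Arrays::stream;

    public static void main(String[] args) {
        int[] decoded = TO_ARRAY.apply(IntStream.of(0, 1, 2));
        // Serialize and deserialize again: the contents must survive unchanged.
        int[] roundTripped = TO_ARRAY.apply(TO_STREAM.apply(decoded));
        System.out.println(Arrays.toString(roundTripped));
    }
}
```

Note the argument order in xmap: the deserializing function comes first and the serializing function second.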

Data Results and digging deeper into Codec type mapping

If you wish to have finer control over type-mapping, DataResults exist. They are a bit fancier than Optionals: a DataResult is either a successful wrapper for an object or carries an error message up to the top. Mapping exists in two directions: comapping applies to serialization and turns an object of the new type into an object of the underlying Codec's type, while mapping applies to deserialization and turns the object from the underlying Codec into an object of the new type.

These are the alternative methods to Codec#xmap (comap/map):

  • Codec#comapFlatMap (comap/flat map): Deserialize into a DataResult instead.
  • Codec#flatComapMap (flat comap/map): Serialize into a DataResult instead.
  • Codec#flatXmap (flat comap/flat map): Two-way flatmapping with DataResults.

This works great if you wish to have finer control over possible exceptions, especially during deserialization. Examples include parsing an Instant in time from a String, or deserializing an item with NBT from a stringified format:

Codec<Instant> FORMATTED_INSTANT = Codec.STRING.comapFlatMap(this::parseInstant, DateTimeFormatter.ISO_INSTANT::format);

DataResult<Instant> parseInstant(String instantString) {
    try {
        return DataResult.success(Instant.from(DateTimeFormatter.ISO_INSTANT.parse(instantString)));
    } catch (DateTimeParseException e) {
        return DataResult.error(e.getMessage());
    }
}
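The same "decoding may fail, encoding may not" shape can be tried out without DFU on the classpath. In the sketch below, ParseResult is a hypothetical stand-in for DataResult that holds either a value or an error message; only the java.time calls are taken from the example above.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical stand-in for DataResult: holds either a value or an error message.
final class ParseResult<T> {
    final T value;      // null on failure
    final String error; // null on success

    private ParseResult(T value, String error) {
        this.value = value;
        this.error = error;
    }

    static <T> ParseResult<T> success(T value) {
        return new ParseResult<T>(value, null);
    }

    static <T> ParseResult<T> error(String message) {
        return new ParseResult<T>(null, message);
    }

    boolean isSuccess() {
        return error == null;
    }
}

class InstantParsingSketch {
    // Decoding can fail, so it returns a result wrapper instead of throwing.
    static ParseResult<Instant> parseInstant(String text) {
        try {
            return ParseResult.success(Instant.from(DateTimeFormatter.ISO_INSTANT.parse(text)));
        } catch (DateTimeParseException e) {
            return ParseResult.error(e.getMessage());
        }
    }

    // Encoding is not expected to fail here, so it returns the String directly.
    static String formatInstant(Instant instant) {
        return DateTimeFormatter.ISO_INSTANT.format(instant);
    }

    public static void main(String[] args) {
        ParseResult<Instant> ok = parseInstant("2020-11-03T00:00:00Z");
        ParseResult<Instant> bad = parseInstant("not a timestamp");
        System.out.println(ok.isSuccess() + " " + formatInstant(ok.value));
        System.out.println(bad.isSuccess() + " " + bad.error);
    }
}
```

This is why comapFlatMap is the natural fit for the Instant Codec: only the String-to-Instant direction needs to report errors.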

Merging Codecs

You can combine two unique sets of schemas with .and. For example, say you're working with world generation and want to add some extra fields on top of an existing schema such as AbstractTrunkPlacer's. You only need to invoke AbstractTrunkPlacer's Codec builder method with your instance to receive its group of fields. You can then either merge it with a single field inside the .and() or pass a new instance.group(...) as the .and() parameter instead. (func_236915_a_ is method_28904 in Yarn)

RecordCodecBuilder.create((instance) ->
    AbstractTrunkPlacer.func_236915_a_(instance).and(instance.group(
        Codec.BOOL.fieldOf("foo").forGetter((Foobar o) -> o.foo),
        BlockState.CODEC.fieldOf("blockstate_example").forGetter((Foobar o) -> o.blockState)
    )).apply(instance, ImaginaryConstructor::new)
);

The resulting initializer Function now has 5 total data pairs: the base_height, height_rand_a and height_rand_b integers mapped from AbstractTrunkPlacer.func_236915_a_, and then our foo boolean and blockstate_example BlockState. It can method-reference a constructor that has the signature ImaginaryConstructor(int, int, int, boolean, BlockState).

Mapped Codec Pairs

PairMapCodecs are interesting and stand out from the others, as a single MapCodec of this kind actually represents two data values once encoded. The application I see for it is representing an object that can be described with only two data values, without creating a new object-element to hold them.

For example, if you wanted to store a UUID, a 128-bit object, as two longs instead of as a String, you easily can with:

MapCodec<UUID> UUID_MAPPED = Codec.mapPair(Codec.LONG.fieldOf("most_sig_bits"), Codec.LONG.fieldOf("least_sig_bits")).xmap(
    pair -> new UUID(pair.getFirst(), pair.getSecond()),
    uuid -> new Pair<>(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits())
);

If we test that code out with:

JsonOps.INSTANCE
        .<UUID>withEncoder(RecordCodecBuilder.create(inst -> inst
                .group(this.UUID_MAPPED.forGetter(o -> o))
                .apply(inst, u -> u)
        ))                               // Function<UUID, DataResult<JsonElement>>
        .apply(UUID.randomUUID())        // DataResult<JsonElement>
        .result()                        // Optional<JsonElement>
        .map(JsonElement::toString)      // Optional<String>
        .ifPresent(System.out::println);

The random UUID outcome in the console is

{"least_sig_bits":-7906069608818915914,"most_sig_bits":-3340978081725788219}
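The two lambdas passed to xmap above are ordinary java.util.UUID operations, so the round trip can be verified on its own. This is a plain-Java sketch with hypothetical helper names; a long[] takes the place of DFU's Pair.

```java
import java.util.UUID;

class UuidBitsSketch {
    // Serializing direction: split the 128-bit UUID into its two 64-bit halves.
    static long[] toBits(UUID uuid) {
        return new long[] { uuid.getMostSignificantBits(), uuid.getLeastSignificantBits() };
    }

    // Deserializing direction: rebuild the UUID from the two halves, most significant first.
    static UUID fromBits(long mostSigBits, long leastSigBits) {
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        UUID original = UUID.randomUUID();
        long[] bits = toBits(original);
        UUID rebuilt = fromBits(bits[0], bits[1]);
        // The rebuilt UUID must equal the one we started with.
        System.out.println(original.equals(rebuilt));
    }
}
```

Mixing up the order of the two halves still produces a valid UUID, just a different one, so it is worth keeping the field names explicit as the Codec above does.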

Compound List Codecs

If you have a Map of values and wish to serialize it, you don't have to serialize it into an array of holder objects that each contain both values. You can instead use Codec.compoundList, which stores the two values as a proper key-val pair.

Storing a group of key-val pairs in objects with a regular list:

[{
  "k": <key1>,
  "v": <value1>
}, {
  "k": <key2>,
  "v": <value2>
}]

Storing a group of key-val pairs as literal key-val pairs with a compound list:

{
  <key1>: <value1>,
  <key2>: <value2>
}

For example, let's say we want to store a list of UUIDs in a data structure other than an array of Strings. Unfortunately, in regular Java, and especially with Java's Streams, this turns into messy code disappointingly quickly.

So first, we must create an alternative long Codec, since the key Codec must ultimately encode to a String. This leads us to the LONG_S Codec:

Codec<Long> LONG_S = Codec.STRING.comapFlatMap(string -> {
    try {
        return DataResult.success(Long.parseLong(string));
    } catch (NumberFormatException e) {
        return DataResult.error(e.getMessage());
    }
}, String::valueOf);

With that Codec out of the way:

Codec<List<UUID>> UUID_MULTIMAPPED = Codec.compoundList(this.LONG_S, Codec.LONG.listOf()).xmap(
    pairList -> pairList
        .stream()
        // Flattening the Multimap structure is thankfully super simple
        .flatMap(pair -> pair
            .getSecond()
            .stream()
            .map(leastSigBits -> new UUID(pair.getFirst(), leastSigBits))
        )
        .collect(ImmutableList.toImmutableList()), 
    uuidList -> uuidList
        .stream()
        // ... Unfortunately, it is not as simple at all to construct the Multimap. Ugly code ahead
        .<HashMap<Long, List<Long>>>collect(
            HashMap::new, 
            (multiMap, uuid) -> multiMap.compute(uuid.getMostSignificantBits(), (mostSigBits, longList) -> {
                // Store the least significant bits under the most-significant-bits key
                (longList == null ? (longList = new LongArrayList()) : longList).add(uuid.getLeastSignificantBits());
                return longList;
            }),
            (multiMap, multiMap2) -> multiMap2.forEach((keyFrom2, listToDump) -> {
                // You normally don't even need this combiner function unless you're parallel-streaming but I wrote it for you anyway
                multiMap.compute(keyFrom2, (v, listReceiving) -> { 
                    (listReceiving == null ? (listReceiving = new LongArrayList()) : listReceiving).addAll(listToDump);
                    return listReceiving; 
                }); 
            })
        )
        .entrySet()
        .stream() // And here we go again
        .map(e -> Pair.of(e.getKey(), e.getValue()))
        .collect(ImmutableList.toImmutableList())
);

Yikes. I'm sorry I wrote that!

I actually wrote the code example initially using [ProjectReactor](https://projectreactor.io/docs/core/release/reference/), as I had gotten too comfortable with it, and then had to write the horrifying abomination of code seen above without it. Here is the original code example using ProjectReactor:

Codec<Long> LONG_S = Codec.STRING.comapFlatMap(string -> {
    try {
        return DataResult.success(Long.parseLong(string));
    } catch (NumberFormatException e) {
        return DataResult.error(e.getMessage());
    }
}, String::valueOf);

Codec<List<UUID>> UUID_MULTIMAPPED = Codec.compoundList(this.LONG_S, Codec.LONG.listOf()).xmap(
        pairList -> Flux.fromIterable(pairList)
                .flatMap(pair -> Flux.fromIterable(pair.getSecond())
                        .map(leastSigBits -> new UUID(pair.getFirst(), leastSigBits))
                )
                .collect(ImmutableList.toImmutableList())
                .block(),
        uuidList -> Flux.fromIterable(uuidList)
                .groupBy(UUID::getMostSignificantBits, UUID::getLeastSignificantBits)
                .flatMap(bitFlux -> bitFlux.collectList()
                        .map(leastSigBits -> Pair.of(bitFlux.key(), leastSigBits))
                )
                .collect(ImmutableList.toImmutableList())
                .block()
);

Significantly easier to read than the first mess of an example!

And here's an alternate implementation using Guava's multimaps instead:

Codec<List<UUID>> UUID_MULTIMAPPED = Codec.compoundList(this.LONG_S, Codec.LONG.listOf()).xmap(
    pairList -> pairList
        .stream()
        // Flattening the Multimap structure is thankfully super simple
        .flatMap(pair -> pair
            .getSecond()
            .stream()
            .map(leastSigBits -> new UUID(pair.getFirst(), leastSigBits))
        )
        .collect(ImmutableList.toImmutableList()),
    uuidList -> uuidList
        .stream()
        .collect(Multimaps.toMultimap(UUID::getMostSignificantBits, UUID::getLeastSignificantBits, HashMultimap::create))
        .asMap()
        .entrySet()
        .stream() // And here we go again
        .<Pair<Long, List<Long>>>map(e -> Pair.of(e.getKey(), ImmutableList.copyOf(e.getValue())))
        .collect(ImmutableList.toImmutableList())
);
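For readers who would rather stay within the JDK, the grouping step can also be written with Collectors.groupingBy. The sketch below is not part of the original examples and leaves DFU out entirely: a Map<String, List<Long>> stands in for the list of Pairs, and the String keys play the role that LONG_S plays for compoundList. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.stream.Collectors;

class UuidGroupingSketch {
    // Serializing direction: group least-significant halves under their most-significant half.
    // Keys are Strings because JSON object keys must be Strings.
    static Map<String, List<Long>> group(List<UUID> uuids) {
        return uuids.stream().collect(Collectors.groupingBy(
                uuid -> Long.toString(uuid.getMostSignificantBits()),
                LinkedHashMap::new, // keeps keys in first-seen order
                Collectors.mapping(UUID::getLeastSignificantBits, Collectors.toList())));
    }

    // Deserializing direction: flatten every (key, value) combination back into a UUID.
    static List<UUID> flatten(Map<String, List<Long>> grouped) {
        List<UUID> result = new ArrayList<>();
        for (Map.Entry<String, List<Long>> entry : grouped.entrySet()) {
            long mostSigBits = Long.parseLong(entry.getKey());
            for (long leastSigBits : entry.getValue()) {
                result.add(new UUID(mostSigBits, leastSigBits));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<UUID> uuids = Arrays.asList(new UUID(1L, 10L), new UUID(1L, 11L), new UUID(2L, 20L));
        Map<String, List<Long>> grouped = group(uuids);
        System.out.println(grouped);
        System.out.println(flatten(grouped).equals(uuids));
    }
}
```

Inside the real xmap, the map's entries would still need to be converted to Pair objects, as the Guava version above does. Note also that grouping reorders the list whenever UUIDs sharing the same most significant bits are not adjacent in the input.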

Other words

Serializing to different file formats

Up to this point, Codecs have been talked about only from the JSON standpoint, using JsonOps, the DynamicOps implementation that the DFU library itself provides. However, Minecraft also provides an implementation called NBTDynamicOps that creates an INBT instead of JsonOps's JsonElement.

Limitations

Unfortunately, RecordCodecBuilder only allows a maximum of 16 fields for a given Codec. If you hit this many, you may want to break things down into smaller nested Codecs, because readability may suffer otherwise. If you are using .and, the limit is unfortunately 8. If the need arises anyway, you can mash together custom-purpose interfaces to serve your needs.

Javadocs

You can find the Javadocs here if you wish to get into the meat of Mojang's DataFixerUpper library.

import com.mojang.datafixers.util.Function3;
import com.mojang.serialization.Codec;
import com.mojang.serialization.codecs.RecordCodecBuilder;
import net.minecraft.block.BlockState;

import java.util.List;

public class CodecExample {
    public static final Codec<CodecExample> CODEC = RecordCodecBuilder.create(
        instance -> instance.group(
            Codec.BOOL.fieldOf("foo").forGetter((CodecExample o) -> o.foo), // Basic boolean Codec
            // Codec for building a list
            Codec.INT.listOf().fieldOf("bar").forGetter((CodecExample o) -> o.bar),
            // Example usage of using a different class's Codec
            BlockState.CODEC.fieldOf("blockstate_example").forGetter((CodecExample o) -> o.blockState)
        ).apply(instance, new Function3<Boolean, List<Integer>, BlockState, CodecExample>() {
            @Override
            public CodecExample apply(Boolean fooC, List<Integer> barC, BlockState blockStateC) {
                return new CodecExample(fooC, barC, blockStateC);
            }
        })
    );

    public boolean foo;
    public BlockState blockState;
    public List<Integer> bar;

    public CodecExample(boolean foo, List<Integer> bar, BlockState blockState) {
        this.foo = foo;
        this.bar = bar;
        this.blockState = blockState;
    }
}
@TheCurle

TheCurle commented Nov 3, 2020

This is some good stuff!
I don't think a more thorough explanation of dfu Codecs currently exists.

@Drullkus
Author

Drullkus commented Nov 7, 2020

@TheCurle Thank you! I wrote this hoping it would help others. If you have any suggestions for improvements I'm all ears!

@br90218

br90218 commented Jul 30, 2021

Hi! I somehow stumbled upon your tutorial through Googling. Thank you very much for this helpful tutorial!

I have a question:
Line 19 -- should it be returning type CodecExample instead of Foobar?

@ByteZ1337

Just a quick tip for anyone stumbling upon this: When deserializing Minecraft related JSON data like data packs, make sure to use the RegistryOps DynamicOps instead of just JsonOps or deserializing most registry related data won't work.

RegistryOps.create(JsonOps.INSTANCE, minecraftServer.registryAccess())

@rikka0w0

When deserializing Minecraft related JSON data like data packs, make sure to use the RegistryOps

That saved my life! I was implementing a datapack-based mob spawn adder in Fabric and struggled with parsing the "biomes" field. I use RegistryOps.create(JsonOps.INSTANCE, BuiltinRegistries.ACCESS); to create my RegistryOps and it seems to work with vanilla biomes. Not sure if it will work with modded biomes.

In my code, I registered my own reload listener (ResourceManagerHelper.get(PackType.SERVER_DATA).registerReloadListener(new SimpleSynchronousResourceReloadListener() {})). It will be fired when the datapack is loaded (e.g. just before the logic server starts up). I parse the datapack files there using my own codec, and after I deserialize each one, I call BiomeModifications.addSpawn so that the Fabric API takes care of the rest of things.

@ByteZ1337

ByteZ1337 commented Oct 19, 2022

Not sure if it will work with modded biomes.

Pretty sure it won't since BuiltinRegistries.ACCESS isn't the RegistryAccess actively used by Minecraft. It's more of a template for new RegistryAccess instances. But as I said in my original comment, you can get it via minecraftServer.registryAccess(). Not sure how you can get the MinecraftServer instance in fabric though.

@rikka0w0

I'm working on a Mod that supports both Forge and Fabric, it mainly adds vanilla-style mobs. I'm trying to reuse Forge's forge/biome_modifier data files in Fabric. Those files contain a "biome" field, which definitely needs registryAccess to parse.

Not sure how you can get the MinecraftServer instance in fabric though.

In Fabric, events happen in the following order:

  1. Load datapack (listeners registered via ResourceManagerHelper.get(PackType.SERVER_DATA).registerReloadListener())
  2. Constructor of MinecraftServer class gets called. At the very beginning, the registryAccess is saved inside the MinecraftServer instance.
  3. Fabric API applies cached biome modifications (e.g. add mob spawns)

The problem is that, when the datapack reload listener is called, the MinecraftServer instance is not created yet.

Now I have to cache all JsonObjects in the datapack reload listener, then insert my own mixin (to the beginning of net.fabricmc.fabric.impl.biome.modification.BiomeModificationImpl.finalizeWorldGen(RegistryAccess, PrimaryLevelData)), which allows me to access the proper registryAccess (the one from MinecraftServer instance) and parse the jsons before Fabric API actually applies them. I somehow feel that Forge is better than Fabric in terms of the quality of APIs.

Thanks for your help again!
