Mojang's DataFixerUpper library ("DFU") includes a flexible system for encoding and decoding JSON files into game objects for Minecraft to use. Its main feature is the Codec, which lets you focus on the file schema and worry little about the methods of (de)serialization.
A lot of simplifications are assumed in this guide, for the convenience of explaining the fundamentals.
The included class `CodecExample` is provided as an example. Note that it contains the fields `boolean` "`foo`", `List<Integer>` "`bar`", and `BlockState` "`blockState`", as well as a class constructor containing parameters for those fields in the same order. There is also a JSON file to match:
{
  "foo": false,
  "bar": [ 0, 1, 2 ],
  "blockstate_example": {
    "Name": "minecraft:dark_oak_log",
    "Properties": {
      "axis": "y"
    }
  }
}
For supporting this schema, the Codec looks like this:
Codec<Foobar> CODEC = RecordCodecBuilder.create(
    instance -> instance.group(
        Codec.BOOL.fieldOf("foo").forGetter((Foobar o) -> o.foo), // Basic boolean Codec
        // Codec for building a list
        Codec.INT.listOf().fieldOf("bar").forGetter((Foobar o) -> o.bar),
        // Example usage of using a different class's Codec
        BlockState.CODEC.fieldOf("blockstate_example").forGetter((Foobar o) -> o.blockState)
    ).apply(instance, (fooC, barC, blockStateC) -> new Foobar(fooC, barC, blockStateC))
);
Note the similarities between the Java Class code and JSON file contents with each line:
Codec.BOOL.fieldOf("foo").forGetter((Foobar o) -> o.foo) // Basic boolean Codec
A Codec can be made of other Codecs, the same way a JSON object is made of JSON elements, whether it's an integer, String, or another JSON Object. Our first line inside `instance.group` defines a primitive Codec `Codec.BOOL.fieldOf("foo").forGetter((Foobar o) -> o.foo)`. `fieldOf` sets the schema key for the data pair, and `forGetter` accepts a `Function<Foobar, Boolean>` to extract the value of `Foobar.foo` to serialize. This entire line is a Codec by itself. Not only that, this line also sets the `Boolean` type for our first parameter in the initializer. Pretty wack!
// Codec for building a list
Codec.INT.listOf().fieldOf("bar").forGetter((Foobar o) -> o.bar),
To process our array of integers we use the exact same process as before, but we now have `.listOf()` in the method chain to wrap our Codec in a list Codec. The rest is easy, with `fieldOf` and `forGetter` accepting a `Function<Foobar, List<Integer>>`. The existence of this Codec alone also sets the type for the second parameter of our object initializer to `List<Integer>`.
BlockState.CODEC.fieldOf("blockstate_example").forGetter((Foobar o) -> o.blockState)
The `BlockState` part of the schema hopefully paints a clearer picture of the fact that Codecs can be made from other Codecs. In fact, we're not doing anything much fancier than before. All you need is to reference `net.minecraft.block.BlockState.CODEC`, then assign the schema key with `fieldOf` and the value getter with `forGetter`, which takes a `Function<Foobar, BlockState>`.
.apply(instance, (fooC, barC, blockStateC) -> new Foobar(fooC, barC, blockStateC))
Inside `.apply()` is where we put our object initializer. Since it's just 3 parameters with 3 values, our initialization lambda could be an easy `(fooC, barC, blockStateC) -> new Foobar(fooC, barC, blockStateC)`. However, since all of the parameters are in the same order, we can do away with the lambda and slap in a method reference, `Foobar::new`.
.apply(instance, Foobar::new)
And that's it, actually! It's not as simple as ABC, but it's as easy as ABC.
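If the lambda-to-method-reference swap feels like magic, here is a minimal JDK-only sketch of why it works. Everything in it is a stand-in made up for illustration: `TriFunction` plays the role of the three-argument function that `.apply()` expects, and `FoobarSketch` is a simplified `Foobar` whose `blockState` is a plain String so no Minecraft classes are needed.

```java
import java.util.List;

// Hypothetical three-argument function type, standing in for what apply(...) expects.
@FunctionalInterface
interface TriFunction<A, B, C, R> {
    R apply(A a, B b, C c);
}

// Simplified stand-in for the guide's Foobar (blockState is a String here, an assumption).
final class FoobarSketch {
    final boolean foo;
    final List<Integer> bar;
    final String blockState;

    FoobarSketch(boolean foo, List<Integer> bar, String blockState) {
        this.foo = foo;
        this.bar = bar;
        this.blockState = blockState;
    }
}

class ConstructorReferenceDemo {
    // Explicit lambda: forwards the three decoded values in order.
    static final TriFunction<Boolean, List<Integer>, String, FoobarSketch> WITH_LAMBDA =
        (fooC, barC, blockStateC) -> new FoobarSketch(fooC, barC, blockStateC);

    // Constructor reference: same behaviour, because the parameter order already matches.
    static final TriFunction<Boolean, List<Integer>, String, FoobarSketch> WITH_REFERENCE =
        FoobarSketch::new;

    public static void main(String[] args) {
        FoobarSketch a = WITH_LAMBDA.apply(false, List.of(0, 1, 2), "minecraft:dark_oak_log");
        FoobarSketch b = WITH_REFERENCE.apply(false, List.of(0, 1, 2), "minecraft:dark_oak_log");
        // Both objects carry the same field values.
        System.out.println(a.foo == b.foo && a.bar.equals(b.bar) && a.blockState.equals(b.blockState));
    }
}
```

The method reference only works because the constructor takes its parameters in the same order as the fields in the group; if the orders differ, you are back to writing the lambda.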
To be expanded
At the simplest level, a `Codec` represents the specification of an object. Its generic parameter sets the type, and it can encode (serialize) the object into data and decode (deserialize) the data to make the object.
From what you've seen in the crash course above, invoking `.fieldOf` on a Codec assigns it a field, and it now becomes a `MapCodec`: a key-value pair where the key is the field itself and the value is the represented Codec.
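To make that idea a bit more concrete, here is a toy model written against the plain JDK. It is not DFU's real interface: `ToyCodec`, its methods, and the use of a plain `Object`/`Map` as the "data" are all assumptions made for this sketch (the real library returns `DataResult`s, works through a `DynamicOps`, and has `fieldOf` return a `MapCodec`). It only shows the shape: one object that knows both directions, and a `fieldOf` that wraps the value under a key.

```java
import java.util.Map;
import java.util.Optional;

// Toy model only, NOT DFU's Codec. "Data" is a plain Object (a Boolean, a Map, ...).
interface ToyCodec<A> {
    Object encode(A value);          // serialize: object -> data
    Optional<A> decode(Object data); // deserialize: data -> object (empty on failure)

    // Loosely mirrors the idea of fieldOf: store the value under a key in a map.
    default ToyCodec<A> fieldOf(String key) {
        ToyCodec<A> self = this;
        return new ToyCodec<A>() {
            @Override
            public Object encode(A value) {
                return Map.of(key, self.encode(value));
            }

            @Override
            public Optional<A> decode(Object data) {
                if (!(data instanceof Map)) {
                    return Optional.empty();
                }
                Object inner = ((Map<?, ?>) data).get(key);
                return inner == null ? Optional.empty() : self.decode(inner);
            }
        };
    }
}

class ToyCodecDemo {
    // A primitive codec: the data form of a Boolean is the Boolean itself.
    static final ToyCodec<Boolean> BOOL = new ToyCodec<Boolean>() {
        @Override
        public Object encode(Boolean value) {
            return value;
        }

        @Override
        public Optional<Boolean> decode(Object data) {
            return data instanceof Boolean ? Optional.of((Boolean) data) : Optional.empty();
        }
    };

    public static void main(String[] args) {
        ToyCodec<Boolean> fooField = BOOL.fieldOf("foo");
        Object data = fooField.encode(false);
        System.out.println(data);                  // the key-value form
        System.out.println(fooField.decode(data)); // decoded back into the boolean
    }
}
```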
Relatively straightforward: to give a field a default value, add `.orElse` into your method chain anywhere before the `.forGetter`.
Codec.BOOL.orElse(false).fieldOf("foo").forGetter((Foobar o) -> o.foo)
If you wish to use a supplier object, `.orElseGet` lets you do that.
If you wish to log an error for a missing entry during deserialization, you can pass a `Consumer<String>` or `UnaryOperator<String>` as the first argument. Be warned that your IDE may complain about an ambiguous method call, so unfortunately you may have to cast the lambda or method reference.
BlockState.CODEC.fieldOf("blockstate_example").orElseGet((Consumer<String>) System.out::println, Blocks.ACACIA_WOOD::getDefaultState).forGetter(o -> o.blockState)
Using `optionalFieldOf` instead of `fieldOf` wraps your Codec's type in an Optional, transforming your `BlockState` into `Optional<BlockState>`. Using `orElse` from the topic above will cover most use cases if you're just looking to supply a default value, but if you want to represent the actual non-existence of an object, then this will be perfect.
To be expanded
This is actually a bit of an outtake from an earlier revision, but it still has a few good talking points I'd like to cover. The big idea is that we're mapping `IntStream` into `int[]` and back, since Codecs are all about writing serialization and deserialization at the same time. This method allows us to pre-process a Codec into a different, preferred type.
Codec.INT_STREAM.xmap( // IntStream Codec into array
    IntStream::toArray, // Deserializing
    Arrays::stream // Serializing
).fieldOf("bar").forGetter((Foobar o) -> o.bar)
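Since the two functions handed to `xmap` are meant to undo each other, it can be worth sanity-checking the pair on its own. The following is a small JDK-only sketch of that check, using the same two method references; the class and field names are made up for illustration and nothing here touches DFU.

```java
import java.util.Arrays;
import java.util.function.Function;
import java.util.stream.IntStream;

class XmapFunctionsDemo {
    // Deserializing direction: underlying IntStream -> preferred int[]
    static final Function<IntStream, int[]> TO_ARRAY = IntStream::toArray;
    // Serializing direction: preferred int[] -> underlying IntStream
    static final Function<int[], IntStream> TO_STREAM = Arrays::stream;

    // Serialize, then deserialize; a well-behaved pair hands back an equal array.
    static int[] roundTrip(int[] values) {
        return TO_ARRAY.apply(TO_STREAM.apply(values));
    }

    public static void main(String[] args) {
        int[] bar = {0, 1, 2};
        System.out.println(Arrays.equals(bar, roundTrip(bar)));
    }
}
```

If a round trip like this loses information, the Codec built from the pair will not write back the data it was given.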
If you wish to have finer control over type-mapping, `DataResult`s exist. They are a bit fancier than Optionals in that they are either a successful wrapper for an object or otherwise carry an error message to the top. Mapping exists in two directions: mapping turns an object of the underlying Codec's type into an object of the new type (deserializing), and comapping turns an object of the new type into an object of the underlying Codec's type (serializing).
These are the alternative methods to `Codec#xmap` (comap/map):
- `Codec#comapFlatMap` (comap/flat map): Deserialize into a `DataResult` instead.
- `Codec#flatComapMap` (flat comap/map): Serialize into a `DataResult` instead.
- `Codec#flatXmap` (flat comap/flat map): Two-way flatmapping with `DataResult`s.
This works great if you wish to have finer control over possible exceptions, especially during deserialization, such as parsing an `Instant` in time from a String or deserializing an item with NBT from a stringified format:
Codec<Instant> FORMATTED_INSTANT = Codec.STRING.comapFlatMap(this::parseInstant, DateTimeFormatter.ISO_INSTANT::format);
DataResult<Instant> parseInstant(String instantString) {
    try {
        return DataResult.success(Instant.from(DateTimeFormatter.ISO_INSTANT.parse(instantString)));
    } catch (DateTimeParseException e) {
        return DataResult.error(e.getMessage());
    }
}
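To poke at the parse/format pair without a Minecraft environment, here is a JDK-only sketch of the same logic. `ParseResult` is a hypothetical, much-simplified stand-in for `DataResult` (either a value or an error message), invented for this example; the parsing and formatting calls are the same `java.time` calls used above.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical stand-in for DataResult<T>: holds either a value or an error message.
final class ParseResult<T> {
    final T value;      // set on success
    final String error; // set on failure

    private ParseResult(T value, String error) {
        this.value = value;
        this.error = error;
    }

    static <T> ParseResult<T> success(T value) {
        return new ParseResult<>(value, null);
    }

    static <T> ParseResult<T> error(String message) {
        return new ParseResult<>(null, message == null ? "unknown error" : message);
    }

    boolean isSuccess() {
        return error == null;
    }
}

class InstantParsingDemo {
    // Deserializing direction: may fail, so it reports the failure instead of throwing.
    static ParseResult<Instant> parseInstant(String text) {
        try {
            return ParseResult.success(Instant.from(DateTimeFormatter.ISO_INSTANT.parse(text)));
        } catch (DateTimeParseException e) {
            return ParseResult.error(e.getMessage());
        }
    }

    // Serializing direction: formatting an Instant needs no error channel.
    static String formatInstant(Instant instant) {
        return DateTimeFormatter.ISO_INSTANT.format(instant);
    }

    public static void main(String[] args) {
        ParseResult<Instant> good = parseInstant("2020-01-01T00:00:00Z");
        ParseResult<Instant> bad = parseInstant("not an instant");
        System.out.println(good.isSuccess() + " " + formatInstant(good.value));
        System.out.println(bad.isSuccess() + " " + bad.error);
    }
}
```

Only the deserializing side gets the result wrapper, which is the same asymmetry `comapFlatMap` expresses.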
You can combine two unique sets of schemas with `.and`. For example, say you're working with world generation and want to add some extra fields on top of an existing schema such as `AbstractTrunkPlacer`'s. You only need to invoke `AbstractTrunkPlacer`'s Codec builder to receive its `Instance`. You can either merge it with a single Codec inside the `.and()` or pass a new instance group as the `.and()` parameter instead. (`func_236915_a_` is `method_28904` in Yarn.)
RecordCodecBuilder.create((instance) ->
    AbstractTrunkPlacer.func_236915_a_(instance).and(instance.group(
        Codec.BOOL.fieldOf("foo").forGetter((Foobar o) -> o.foo),
        BlockState.CODEC.fieldOf("blockstate_example").forGetter((Foobar o) -> o.blockState)
    )).apply(instance, ImaginaryConstructor::new)
);
The resulting initializer Function now has 5 total data pairs: the `base_height`, `height_rand_a`, and `height_rand_b` integers mapped from `AbstractTrunkPlacer.func_236915_a_`, and then our `foo` boolean and `blockstate_example` BlockState. It can method-reference a constructor that has the signature `ImaginaryConstructor(int, int, int, boolean, BlockState)`.
`PairMapCodec`s are interesting and stand out from the others, as this single `MapCodec` actually represents two data values once encoded. The application I see for it is to present an object that can be represented with only two data values without creating a new object element.
For example, if you wanted to store a UUID, a 128-bit object, as two `long`s instead of as a String, you easily can with:
// Named UUID_MAPPED to match the test snippet that references it
MapCodec<UUID> UUID_MAPPED = Codec.mapPair(Codec.LONG.fieldOf("most_sig_bits"), Codec.LONG.fieldOf("least_sig_bits")).xmap(
    pair -> new UUID(pair.getFirst(), pair.getSecond()),
    uuid -> new Pair<>(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits())
);
If we test that code out with:
JsonOps.INSTANCE
    .<UUID>withEncoder(RecordCodecBuilder.create(inst -> inst
        .group(this.UUID_MAPPED.forGetter(o -> o))
        .apply(inst, u -> u)
    )) // Function<UUID, DataResult<JsonElement>>
    .apply(UUID.randomUUID()) // DataResult<JsonElement>
    .result() // Optional<JsonElement>
    .map(JsonElement::toString) // Optional<String>
    .ifPresent(System.out::println);
The random UUID outcome in the console is
{"least_sig_bits":-7906069608818915914,"most_sig_bits":-3340978081725788219}
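The split into two longs loses nothing, which is easy to confirm with the JDK alone. This is a minimal sketch of the two directions of that mapping, with made-up class and method names and no DFU involved; the sample values are the ones printed in the console output above.

```java
import java.util.UUID;

class UuidBitsDemo {
    // Serializing direction: split the 128-bit UUID into its two 64-bit halves.
    static long[] toBits(UUID uuid) {
        return new long[] { uuid.getMostSignificantBits(), uuid.getLeastSignificantBits() };
    }

    // Deserializing direction: rebuild the UUID from the two halves.
    static UUID fromBits(long mostSigBits, long leastSigBits) {
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        // The two values from the console output above.
        UUID uuid = fromBits(-3340978081725788219L, -7906069608818915914L);
        long[] bits = toBits(uuid);
        System.out.println(uuid.equals(fromBits(bits[0], bits[1])));
    }
}
```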
If you have a Map of values and wish to serialize it, then instead of serializing it into an array of holder objects containing both values, you can serialize it with a `Codec.compoundList`, which stores the two values as a proper key-val pair.
Storing a group of key-val pairs in objects with a regular list:
[{
  "k": <key1>,
  "v": <value1>
}, {
  "k": <key2>,
  "v": <value2>
}]
Storing a group of key-val pairs as literal key-val pairs with a compound list:
{
  <key1>: <value1>,
  <key2>: <value2>
}
For example, let's say we want to store a list of UUIDs in a data structure other than an array of Strings. Unfortunately, in regular Java, especially with Java's Streams, this turns into messy code disappointingly quickly.
So first, we actually must create an alternative `long` Codec, as the key Codec must ultimately encode a `String`, leading us to this `LONG_S` Codec:
Codec<Long> LONG_S = Codec.STRING.comapFlatMap(string -> {
    try {
        return DataResult.success(Long.parseLong(string));
    } catch (NumberFormatException e) {
        return DataResult.error(e.getMessage());
    }
}, String::valueOf);
With that Codec out of the way:
Codec<List<UUID>> UUID_MULTIMAPPED = Codec.compoundList(this.LONG_S, Codec.LONG.listOf()).xmap(
    pairList -> pairList
        .stream()
        // Flattening the Multimap structure is thankfully super simple
        .flatMap(pair -> pair
            .getSecond()
            .stream()
            .map(leastSigBits -> new UUID(pair.getFirst(), leastSigBits))
        )
        .collect(ImmutableList.toImmutableList()),
    uuidList -> uuidList
        .stream()
        // ... Unfortunately, it is not as simple at all to construct the Multimap. Ugly code ahead
        .<HashMap<Long, List<Long>>>collect(
            HashMap::new,
            (multiMap, uuid) -> multiMap.compute(uuid.getMostSignificantBits(), (mostSigBits, longList) -> {
                // Store the least significant bits under the most significant bits key
                (longList == null ? (longList = new LongArrayList()) : longList).add(uuid.getLeastSignificantBits());
                return longList;
            }),
            (multiMap, multiMap2) -> multiMap2.forEach((keyFrom2, listToDump) -> {
                // You normally don't even need this combiner function unless you're parallel-streaming but I wrote it for you anyway
                multiMap.compute(keyFrom2, (v, listReceiving) -> {
                    (listReceiving == null ? (listReceiving = new LongArrayList()) : listReceiving).addAll(listToDump);
                    return listReceiving;
                });
            })
        )
        .entrySet()
        .stream() // And here we go again
        .map(e -> Pair.of(e.getKey(), e.getValue()))
        .collect(ImmutableList.toImmutableList())
);
Yikes. I'm sorry I wrote that!
I actually wrote the code example initially using [ProjectReactor](https://projectreactor.io/docs/core/release/reference/), as I had gotten too comfortable with it, forgot, and then had to write the horrifying abomination of code seen above. Here is the original code example using ProjectReactor.
Codec<Long> LONG_S = Codec.STRING.comapFlatMap(string -> {
    try {
        return DataResult.success(Long.parseLong(string));
    } catch (NumberFormatException e) {
        return DataResult.error(e.getMessage());
    }
}, String::valueOf);
Codec<List<UUID>> UUID_MULTIMAPPED = Codec.compoundList(this.LONG_S, Codec.LONG.listOf()).xmap(
    pairList -> Flux.fromIterable(pairList)
        .flatMap(pair -> Flux.fromIterable(pair.getSecond())
            .map(leastSigBits -> new UUID(pair.getFirst(), leastSigBits))
        )
        .collect(ImmutableList.toImmutableList())
        .block(),
    uuidList -> Flux.fromIterable(uuidList)
        .groupBy(UUID::getMostSignificantBits, UUID::getLeastSignificantBits)
        .flatMap(bitFlux -> bitFlux.collectList()
            .map(leastSigBits -> Pair.of(bitFlux.key(), leastSigBits))
        )
        .collect(ImmutableList.toImmutableList())
        .block()
);
Significantly easier to read than the first mess of an example!
And here's an alternate implementation using Guava's multimaps instead.
Codec<List<UUID>> UUID_MULTIMAPPED = Codec.compoundList(this.LONG_S, Codec.LONG.listOf()).xmap(
    pairList -> pairList
        .stream()
        // Flattening the Multimap structure is thankfully super simple
        .flatMap(pair -> pair
            .getSecond()
            .stream()
            .map(leastSigBits -> new UUID(pair.getFirst(), leastSigBits))
        )
        .collect(ImmutableList.toImmutableList()),
    uuidList -> uuidList
        .stream()
        .collect(Multimaps.toMultimap(UUID::getMostSignificantBits, UUID::getLeastSignificantBits, HashMultimap::create))
        .asMap()
        .entrySet()
        .stream() // And here we go again
        .<Pair<Long, List<Long>>>map(e -> Pair.of(e.getKey(), ImmutableList.copyOf(e.getValue())))
        .collect(ImmutableList.toImmutableList())
);
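For comparison, the grouping step can probably also be done with nothing but the JDK's `Collectors.groupingBy`. The sketch below is not wired into a Codec: it only shows the two mapping functions, uses `Map.Entry` where the examples above use DFU's `Pair`, and its class and method names are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.stream.Collectors;

class UuidGroupingDemo {
    // Serializing direction: collect the least significant halves under their
    // shared most significant half.
    static List<Map.Entry<Long, List<Long>>> group(List<UUID> uuids) {
        return uuids.stream()
            .collect(Collectors.groupingBy(
                UUID::getMostSignificantBits,
                Collectors.mapping(UUID::getLeastSignificantBits, Collectors.toList())))
            .entrySet()
            .stream()
            .<Map.Entry<Long, List<Long>>>map(e -> Map.entry(e.getKey(), e.getValue()))
            .collect(Collectors.toList());
    }

    // Deserializing direction: flatten each (key, values) pair back into UUIDs.
    static List<UUID> flatten(List<Map.Entry<Long, List<Long>>> pairs) {
        return pairs.stream()
            .flatMap(pair -> pair.getValue().stream()
                .map(leastSigBits -> new UUID(pair.getKey(), leastSigBits)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<UUID> uuids = List.of(new UUID(1L, 10L), new UUID(1L, 11L), new UUID(2L, 20L));
        List<Map.Entry<Long, List<Long>>> grouped = group(uuids);
        System.out.println(grouped);
        System.out.println(flatten(grouped));
    }
}
```

Because the grouping goes through a hash map, the UUIDs may come back in a different order than they went in, so treat the result as a collection rather than an ordered list.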
Up to this point, Codecs have been talked about only from the JSON standpoint. The DFU library provides `JsonOps` as its implementation of the `DynamicOps` interface for this. However, Minecraft itself also provides an implementation called `NBTDynamicOps` that creates an `INBT` instead of `JsonOps`'s `JsonElement`.
Unfortunately, `RecordCodecBuilder` only allows a maximum of 16 fields for a given Codec. If you hit this many, you may want to break things down into lower-level Codecs, because readability may suffer otherwise. However, if you are using `.and`, the limit is unfortunately 8. If the need arises anyway, you can mash together custom-purpose interfaces to serve your needs.
You can find the Javadocs here if you wish to get into the meat of Mojang's DataFixerUpper library.
Just a quick tip for anyone stumbling upon this: when deserializing Minecraft-related JSON data like data packs, make sure to use the `RegistryOps` DynamicOps instead of just `JsonOps`, or deserializing most registry-related data won't work.