FSE sprite compression / GioCities

This was originally published 2020-07-07 as a reward for sponsors of Befriendus

A Domain-Specific Compression Algorithm — as I later found out this is called — is a compression algorithm that uses the specific nature of the target data as a way to efficiently compress it. The more you know about the structure of the data you’re compressing and what tools you have to reconstruct data, the more efficient the system can be.

I wrote a script for the Fansim Engine that does this with character sprites. It takes character poses, identifies the parts that have changed and the parts that stay the same, and creates identical Ren’py displayables that take up dramatically less room.

Why?

For a little background on why I wanted to do this at all, let me show you how Hiveswap Friendsim handles character sprites.

Hiveswap Friendsim is big.

Given that it’s a fairly simple visual novel with static sprites and a few music tracks, it’s huge. The final version on Steam (with all the DLC and updates) clocks in at 1.3 GB. Pesterquest is also stupid big at 1.43 GB.

Cursory inspection shows that the bulk of this space is taken up by RPA (Ren’Py Archive) files, as is to be expected. You can unpack those, though, which shows what’s taking up all this space:

There are two culprits here: inexplicably uncompressed WAV files (50 MB for a 5 minute loop? what on earth) and character sprites. Both of these stand out as obvious mistakes, because they’re both easily compressible.

I’m not an audio engineer, so I won’t try to explain the WAV thing in-depth, but those files are enormous and could have been compressed down to less than a tenth of their size without any noticeable loss in quality. So, uh, that really should have been done.

What I can speak to personally is the character sprites. Character poses in many visual novels — but particularly Hiveswap — tend to look like this:

The keen-eyed may have already noticed that there is some redundant data being stored here. This is the same character, in the same pose, with a different facial expression. Now, my complaint isn’t that this is somehow “lazy art”; this looks fine in gameplay. The problem is when you’re storing an identical “body” section for every pose, you’re going to end up wasting a ton of space. Which they do.

We can do better.

Sprite compression

If you’ve ever gone on spriters-resource.com (which you probably have done) or opened up a chunk of game memory in yy-chr (which you probably have not done), you’ve probably noticed that sprites aren’t stored as large, copied images, but small pieces that the game puts together at runtime. Like this, or this.

This is especially true for “big” sprites that are significantly larger than the tile size. (Although in ren’py, there isn’t a “tile size”, the principle still stands.) When your space is limited, you want to avoid storing the same thing multiple times if at all possible. Space is always limited, but bad programmers like to pretend it’s not on modern PCs, grumble.

I wanted to create something that took and compressed sprites that were already drawn, in a way that wouldn’t impose extra limits on the artists or really require any additional work.

How it works

To explain the technical details of how I implemented this, I’ll walk through running the program on Mituna’s sprites from Befriendus. This is — mostly — a recreation of how I actually handled Mituna’s sprites for the game proper.

If you’re interested in the nitty-gritty, the source code is available on the Fansim Engine github repo, although that project is no longer in active development.

Basic principles

First, I build a tree of poses. Each pose can have “children”, which are the original pose with a small “patch” somewhere to create a second pose. Ren’py will happily treat a logical “displayable” defined in code like this as an image, and even let you continue making edits recursively, so this design pattern matches perfectly.

Put simply, we can make a full sprite of Mituna frowning by putting a “frowning” segment over the standard idle pose. We don’t need to keep duplicating the whole body pose.

We can simply take the “frown” part and composite it over its base, the idle pose:

Ren’py will do this composite operation at runtime very efficiently if we just tell it what images to composite and where. This requires storing far fewer “pixels” on disk, at the cost of some slight runtime processing.

For convenience, I want all the child poses of a parent to have the same image dimensions. First, I remove any extra whitespace from the sides of the poses, and then I add whitespace until each pose is the same size, centered at the middle.

There’s another gotcha here, too: I can’t just composite the new bit on top of the old sprite, because it’s possible the new pose is transparent in a place the old one wasn’t, and image composition gets that wrong. Instead, I have to “cut a hole” in the base pose by cropping it into 5 pieces, removing the “middle piece”, and replacing it with the patch.

Step by step

I start with a big folder of sprites. Mituna’s sprites look like this:

By default, the “tree” is built based on the filenames of the sprites, although there’s optionally a way to specify a tree manually. After some renaming based on the actual logic of the poses and expressions, the program can interpret the files and build a tree fairly easily:

$ python3 makechars.py --make-demo --character-dirs mituna
INFO: Loaded 30 poses from directory 'mituna'
INFO: Remove old rpydiff_out/characterp/mituna
INFO: Remove old rpydiff_out_demo/characterp/mituna
INFO: Building pose hierarchy...
INFO: Global parent <Pose mituna idle v1* of None>
INFO:  - <Pose mituna idle v1* of None>
INFO:    - <Pose mituna idle frown v1* of !mituna idle>
INFO:      - <Pose mituna idle frown talk v1* of !mituna idle frown>
INFO:    - <Pose mituna idle smile v1* of !mituna idle>
INFO:      - <Pose mituna idle smile talk v1* of !mituna idle smile>
INFO:    - <Pose mituna idle spark v1* of !mituna idle>
INFO:      - <Pose mituna idle spark talk v1* of !mituna idle spark>
INFO:    - <Pose mituna idle talk v1* of !mituna idle>
INFO:    - <Pose mituna angry v1* of !mituna idle>
INFO:      - <Pose mituna angry spark v1* of !mituna angry>
INFO:        - <Pose mituna angry spark talk v1* of !mituna angry spark>
INFO:      - <Pose mituna angry talk v1* of !mituna angry>
INFO:    - <Pose mituna confused v1* of !mituna idle>
INFO:      - <Pose mituna confused talk v1* of !mituna confused>
INFO:    - <Pose mituna cry v1* of !mituna idle>
INFO:    - <Pose mituna defensive v1* of !mituna idle>
INFO:      - <Pose mituna defensive blush v1* of !mituna defensive>
INFO:    - <Pose mituna explode v1* of !mituna idle>
INFO:    - <Pose mituna frustrated v1* of !mituna idle>
INFO:      - <Pose mituna frustrated spark v1* of !mituna frustrated>
INFO:        - <Pose mituna frustrated spark talk v1* of !mituna frustrated spark>
INFO:      - <Pose mituna frustrated talk v1* of !mituna frustrated>
INFO:    - <Pose mituna grin v1* of !mituna idle>
INFO:    - <Pose mituna nervous v1* of !mituna idle>
INFO:      - <Pose mituna nervous alt v1* of !mituna nervous>
INFO:        - <Pose mituna nervous alt talk v1* of !mituna nervous alt>
INFO:      - <Pose mituna nervous spark v1* of !mituna nervous>
INFO:        - <Pose mituna nervous spark alt v1* of !mituna nervous spark>
INFO:          - <Pose mituna nervous spark alt talk v1* of !mituna nervous spark alt>
INFO:        - <Pose mituna nervous spark talk v1* of !mituna nervous spark>

Here, idle frown is a subpose of idle. angry is also a subpose of idle, because here idle is the global root. There needs to be a global root for the tree, which by default is idle or neutral or a variation thereof.

We then iterate through the tree, starting at the leaves, and create patches.

INFO: Expanding canvases...
INFO: Generating patches...
INFO: Creating patch of <Pose mituna idle frown talk v1* of !mituna idle frown>
INFO: Creating patch of <Pose mituna idle smile talk v1* of !mituna idle smile>
INFO: Creating patch of <Pose mituna idle spark talk v1* of !mituna idle spark>
INFO: Creating patch of <Pose mituna angry spark talk v1* of !mituna angry spark>
INFO: Creating patch of <Pose mituna angry spark v1* of !mituna angry>
INFO: Creating patch of <Pose mituna angry talk v1* of !mituna angry>
INFO: Creating patch of <Pose mituna confused talk v1* of !mituna confused>
INFO: Creating patch of <Pose mituna defensive blush v1* of !mituna defensive>
INFO: Creating patch of <Pose mituna frustrated spark talk v1* of !mituna frustrated spark>
INFO: Creating patch of <Pose mituna frustrated spark v1* of !mituna frustrated>
INFO: Creating patch of <Pose mituna frustrated talk v1* of !mituna frustrated>
INFO: Creating patch of <Pose mituna nervous alt talk v1* of !mituna nervous alt>
INFO: Creating patch of <Pose mituna nervous spark alt talk v1* of !mituna nervous spark alt>
INFO: Creating patch of <Pose mituna nervous spark alt v1* of !mituna nervous spark>
INFO: Creating patch of <Pose mituna nervous spark talk v1* of !mituna nervous spark>
INFO: Creating patch of <Pose mituna nervous alt v1* of !mituna nervous>
INFO: Creating patch of <Pose mituna nervous spark v1* of !mituna nervous>
INFO: Creating patch of <Pose mituna idle frown v1* of !mituna idle>
INFO: Creating patch of <Pose mituna idle smile v1* of !mituna idle>
INFO: Creating patch of <Pose mituna idle spark v1* of !mituna idle>
INFO: Creating patch of <Pose mituna idle talk v1* of !mituna idle>
INFO: Creating patch of <Pose mituna angry v1* of !mituna idle>
INFO: Creating patch of <Pose mituna confused v1* of !mituna idle>
INFO: Creating patch of <Pose mituna cry v1* of !mituna idle>
INFO: Creating patch of <Pose mituna defensive v1* of !mituna idle>
INFO: Creating patch of <Pose mituna explode v1* of !mituna idle>
INFO: Creating patch of <Pose mituna frustrated v1* of !mituna idle>
INFO: Creating patch of <Pose mituna grin v1* of !mituna idle>
INFO: Creating patch of <Pose mituna nervous v1* of !mituna idle>

For each patch, the algorithm takes a “base” image and an “updated” image, and tries to find the smallest rectangle of “difference” between the images. The patch is then a note of the rectangle from the updated image that needs to placed over the base image to recreate the updated image.

There are fuzzing options, of course, so very small compression differences go ignored. There are also cases where the “different” area is so large that it makes more sense to just treat the updated pose as a new pose, rather than an awkward edit of the parent. The result should be intuitive as well as efficient.

We then get a new tree made of patches and normal poses:

INFO:  - <Pose mituna idle v1* of None>
INFO:    - <Pose mituna angry v1* of !mituna idle>
INFO:      - <Pose mituna angry spark v1* of !mituna angry>
INFO:        - <PatchPose mituna angry spark talk v1* of !mituna angry spark@314,195>
INFO:      - <PatchPose mituna angry talk v1* of !mituna angry@314,195>
INFO:    - <Pose mituna cry v1* of !mituna idle>
INFO:    - <Pose mituna explode v1* of !mituna idle>
INFO:    - <Pose mituna frustrated v1* of !mituna idle>
INFO:      - <PatchPose mituna frustrated spark v1* of !mituna frustrated@155,169>
INFO:        - <PatchPose mituna frustrated spark talk v1* of !mituna frustrated spark@327,219>
INFO:      - <PatchPose mituna frustrated talk v1* of !mituna frustrated@327,219>
INFO:    - <Pose mituna grin v1* of !mituna idle>
INFO:    - <Pose mituna nervous v1* of !mituna idle>
INFO:      - <Pose mituna nervous spark v1* of !mituna nervous>
INFO:        - <PatchPose mituna nervous spark alt v1* of !mituna nervous spark@310,112>
INFO:          - <PatchPose mituna nervous spark alt talk v1* of !mituna nervous spark alt@324,224>
INFO:        - <PatchPose mituna nervous spark talk v1* of !mituna nervous spark@314,195>
INFO:      - <PatchPose mituna nervous alt v1* of !mituna nervous@310,112>
INFO:        - <PatchPose mituna nervous alt talk v1* of !mituna nervous alt@324,224>
INFO:    - <PatchPose mituna idle frown v1* of !mituna idle@248,45>
INFO:      - <PatchPose mituna idle frown talk v1* of !mituna idle frown@323,221>
INFO:    - <PatchPose mituna idle smile v1* of !mituna idle@311,148>
INFO:      - <PatchPose mituna idle smile talk v1* of !mituna idle smile@324,224>
INFO:    - <PatchPose mituna idle spark v1* of !mituna idle@159,148>
INFO:      - <PatchPose mituna idle spark talk v1* of !mituna idle spark@324,224>
INFO:    - <PatchPose mituna idle talk v1* of !mituna idle@324,224>
INFO:    - <PatchPose mituna confused v1* of !mituna idle@214,45>
INFO:      - <PatchPose mituna confused talk v1* of !mituna confused@312,236>
INFO:    - <PatchPose mituna defensive v1* of !mituna idle@228,45>
INFO:      - <PatchPose mituna defensive blush v1* of !mituna defensive@343,272>

You can see here, for instance, that “angry” was so different from “idle” that it was treated as a separate pose. Looking at the poses together, this makes perfect sense:

The program also (optionally) creates visualizations of the process so you can review the decisions and tweak the settings as needed.

Here, the areas that are the same are in black (or pink), and the differences are highlighted. There are so many spaces that the new patch would have to be almost as large as the original image, and so there is not a significant performance benefit in creating a patch for these two.

Between angry and angry talk, though, there is only a slight difference in the mouth area. This is a prime candidate to compress.

We then get two pieces of output. First, a new set of image files, but only the pieces used by the revised pose tree:

You can see that these have a dramatically reduced size. All we have here are the pieces we need: in Mituna’s case, some base poses that include the whole body, and subsections of his face for expressions.

We also get this middle layer; a Ren’py file that defines the displayable image files:

image !mituna idle = Image("{{assets}}/characterp/mituna/idle.png", yanchor=0.95)
image !mituna angry = Image("{{assets}}/characterp/mituna/angry.png", yanchor=0.95)
image !mituna angry talk = Composite(
    (780, 780),
    (0, 0), Crop((0, 0, 780, 195), "__p__mituna angry"),
    (0, 195), Crop((0, 195, 314, 309), "__p__mituna angry"),
    (314, 195), "{{assets}}/characterp/mituna/angry_talk 314,195.png",
    (462, 195), Crop((462, 195, 780, 309), "__p__mituna angry"),
    (0, 309), Crop((0, 309, 780, 780), "__p__mituna angry"), yanchor=0.95
)
...

mituna idle and mituna angry are fairly standard definitions (“Just show this image”). mituna angry talk is a complex object made up of sliced sprites. This is the “cut a hole” system mentioned earlier. The top, bottom, left, and right are just subsections of the original mituna angry pose, but in the middle, there’s a patch inserted. You can see how this would be completely impractical to do at hand, at scale.

The results

This method frees up a significant amount of space. Using the Mituna example, his original sprites were about 4.3M all together:

$ du -sh mituna/
4.3M    mituna/

After compression, his sprite folder was almost half that, at 2.2M:

$ du -sh rpydiff_out/characterp/mituna/
2.2M    rpydiff_out/characterp/mituna/

(This doesn’t include the mituna.rpy character definition file, which is a negligible 9.41 kb.)

That’s a 50% compression compression ratio that losslessly encodes a character worth of sprites in just a few seconds. And, once the compression is done, the performance benefit for all the users is free.²

At first I thought that the RPA format Ren’py uses to store assets might be smart enough to do some of this compression for me, but a quick test revealed that… not so much.

$ du -sh *
2.1M    mituna_compressed.rpa
4.2M    mituna_uncompressed.rpa

I think there is a much better reason to use RPA archives than just naive copy-protection though, which is the game delivery pipeline. If your renpy build is properly configured, different archives can contain data for different portions of the game. That structure, combined with a smart content updating system like Steam, can let developers push incremental updates that only have to change a few files, instead of having to redownload the whole game from scratch.

Conclusion

This is… really good? Astonishingly good, considering that the program can do the work without intervention. This all took a few days and about 800 SLOC of python that’s available as a script in my Fansim Engine github repo. Hopefully it makes sense; the application logic (especially the recursion) can get very complex in code. This also isn’t optimized at all; it’s all done single-threaded and synchronously, the code that finds the smallest bounding box is native python that looks like this:

    for x in range(w):    # for every col:
        for y in range(h):    # For every row
            op = originalp[x, y]
            up = updatep[x, y]
            same = (op == up) or (op[3] == 0 and up[3] == 0)
            if not same and distance(op, up) > fuzz:
                x1 = min(x1, x)
                y1 = min(y1, y)
                x2 = max(x2, x + 1)
                y2 = max(y2, y + 1)

I guess my takeaways here are:

Think about sensible ways to do things
Choosing not to optimize your game files is rude
Befriendus is good, go play it.

Caveats

There are a few things I have to mention here:

Performance cost

The tradeoff here is between the space savings and a slight performance cost. When Ren’py starts up, it caches the images at runtime, which involves doing image processing essentially on-the-fly. As far as I can tell, the performance cost here is negligable on modern systems, although I haven’t found a reliable way of actually testing that.

Side effects

I have had a single report of a case where Ren’py didn’t stitch the textures together properly, pictured here:

As far as I can tell, what’s happening here is this: When Ren’py starts in full screen (or any nonstandard resolution), it resizes the textures internally and caches them at the desired resolution. What’s happening here, I think, is that the resize is happening before the texture stitching, leading to visual artifacts around the edges. This would be similar behavior to something I saw with the very first version of Befriendus, where the frame background was scaled and aliased before the texture was tiled, leading to subpixel artifacting:

The slightly hacky workaround for the subpixel scaling issue was to simply add a 1-pixel border around the edge of the tiled area, so when it aliased, it would blend to black instead of yellow. This isn’t the best fix, but it works, and rendering is so low-level in Ren’py that it’s probably more stable to leave it alone when possible.

These are made using Wiztree, which is a very good way of looking at disk space consumption. ↩↩
See Performance cost ↩

GioCities

blogs by Gio

FSE sprite compression