trying to 4× upscale a 2016 jrpg's textures
a multi-day rabbit hole of reverse engineering EXG textures, YPAC archives, layout files, and an x86-64 sprite engine - to make Stranger of Sword City Revisited look less blurry on a 4K monitor.
stranger of sword city revisited is a jrpg i was playing on a 4k monitor. the original assets are at a lower resolution and the engine renders them without any high-DPI scaling, so the textures end up looking soft. i wanted to try replacing them with 4× ai-upscaled versions.
what follows is three evenings (so far - this is a story still in progress) of figuring out the file formats, the engine's rendering paths, and where the limits are.
the file format problem
textures live in data/graphic/*.exg. the format starts with EXGr magic,
a 40-byte header, and a zstd-compressed payload. easy enough - write a
python tool that decompresses, hands the bytes to PIL or imagemagick, runs
real-esrgan-ncnn-vulkan on it, and re-compresses.
the header has fields like width, height, width2, height2, and a
mysterious unk20 byte. early on i learned:
- unk20 = 0x201 or 0x200 means raw RGBA8 pixels inside
- unk20 = 0x203 means a DXT5 DDS file inside
- unk20 = 0x3 means... uncompressed DXT5 DDS (no zstd at all). found this one accidentally when 145 effect files all errored at once
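the tag list above boils down to a small lookup table. this is a minimal sketch of that dispatch, not the real exg_tool.py: the names PayloadKind, UNK20_KINDS and classify_payload are made up for illustration, and only the four tag values come from the files.

```python
from typing import NamedTuple


class PayloadKind(NamedTuple):
    pixel_format: str   # what the payload holds once unwrapped
    zstd_wrapped: bool  # True if the payload needs zstd decompression first


# tag values as observed above; the labels on the right are my own
UNK20_KINDS = {
    0x200: PayloadKind("rgba8", True),
    0x201: PayloadKind("rgba8", True),
    0x203: PayloadKind("dxt5_dds", True),
    0x003: PayloadKind("dxt5_dds", False),  # raw DDS, no zstd at all
}


def classify_payload(unk20: int) -> PayloadKind:
    """Map an unk20 tag to its payload kind, failing loudly on unknown tags."""
    try:
        return UNK20_KINDS[unk20]
    except KeyError:
        raise ValueError(f"unknown unk20 tag {unk20:#x}") from None
```

failing loudly on an unknown tag is the point: a silent default is how you end up writing back files the engine refuses to load.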
width and width2 are usually the same. usually. but in atlas-style
files (portraits with cropped visible regions) they differ - the "stored"
size is bigger than the visible one. DXT5 alignment slack adds its own
off-by-a-few cases, like width=1008, width2=1007. you don't notice any of
this until you write back a file that the engine then refuses to load.
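the 1008 vs 1007 pair is what you'd expect from block alignment: DXT5 compresses 4×4 pixel blocks, so a stored dimension gets rounded up to a multiple of 4. a small sketch of that rounding, assuming width is the stored size and width2 the visible one (that reading is mine, and align_up is a made-up helper name):

```python
def align_up(value: int, block: int = 4) -> int:
    """Round value up to the next multiple of block (4 for DXT5 blocks)."""
    return (value + block - 1) // block * block


visible_width = 1007                    # the width2 value from the example above
stored_width = align_up(visible_width)  # what a block-aligned store would need
print(visible_width, "->", stored_width)
```

when writing a file back, both values have to be preserved (or recomputed consistently) rather than collapsed into one.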
so the first wall was just understanding the format well enough to round-trip a file byte-identically. it took embarrassingly many attempts.
the texture-renders-at-4× problem
ok, format figured out. swap a bg_001.exg for a 4× upscaled version.
launch the game. the background is now four screens wide and four screens
tall - only its top-left corner is visible. the engine reads the texture's width/height from the
header, computes screen positions as if it's working with 1× textures,
and now everything is 4× too big.
i tried the obvious stuff first:
- modified the shaders. they're literal passthrough - gl_Position = u_projMtx * u_viewMtx * u_modelMtx * vec4(a_position, 1.0). the positions are pre-computed on the CPU and uploaded as vertex data. scaling in the shader collapses everything toward NDC origin.
- changed display.dat. ignored.
- wrote an LD_PRELOAD opengl wrapper to intercept glTexImage2D and rescale. wine routes texture uploads through internal paths that bypass procaddr interception, so the hook caught 2 calls one run and 3479 the next. unusable.
what i needed was the function that takes a texture's dimensions and computes the quad's screen coordinates. binary patch territory.
ghidra, finally
i'd been avoiding it. firing up ghidra felt like committing. but there was no path forward without it.
apitrace came first - built from source as a mingw cross-compile (just the
wgltrace target because the full build hit gcc 15 strictness issues in
dxerr.cpp). dropped the resulting opengl32.dll into the game folder
with WINEDLLOVERRIDES=opengl32=n,b. now i had ~200mb traces of every
opengl call the engine made per session.
with the trace + ghidra i found FUN_140194a70: a sprite quad builder
that reads width and height from [rdi+0x24] and [rdi+0x26] of a
sprite descriptor struct, converts them to floats, and uses them in the
quad math. the relevant bytes:
0x140194bfa: 66 0f 6e e0 movd xmm4, eax ; eax = sprite_w
0x140194bfe: 0f 5b e4 cvtdq2ps xmm4, xmm4
7 bytes total. i replaced it with:
0x140194bfa: c1 e8 02 shr eax, 2 ; sprite_w /= 4
0x140194bfd: c5 ea 2a e0 vcvtsi2ss xmm4, xmm2, eax
also 7 bytes. the VEX vcvtsi2ss is a 4-byte instruction that copies the
upper bits of its result from a second source register, so xmm2 needed to
be known-zero. it took me a while to realize that a non-VEX cvtsi2ss was
leaving garbage in xmm4's upper bits and poisoning the downstream movaps.
same trick for sprite_h.
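a same-length byte swap like this is worth guarding with a check that the bytes being overwritten are the ones you expect. a hedged sketch of such a patcher: the two byte strings are the ones shown above, but apply_patch is a made-up helper and the file offset is whatever the virtual address maps to in your copy of the exe (not worked out here).

```python
ORIGINAL = bytes.fromhex("66 0f 6e e0 0f 5b e4")  # movd xmm4, eax ; cvtdq2ps xmm4, xmm4
PATCHED = bytes.fromhex("c1 e8 02 c5 ea 2a e0")   # shr eax, 2 ; vcvtsi2ss xmm4, xmm2, eax


def apply_patch(image: bytearray, file_offset: int, old: bytes, new: bytes) -> None:
    """Overwrite old with new in place, refusing if sizes or contents mismatch."""
    if len(old) != len(new):
        raise ValueError("patch must be the same length as the code it replaces")
    found = bytes(image[file_offset:file_offset + len(old)])
    if found != old:
        raise ValueError(f"unexpected bytes at {file_offset:#x}: {found.hex(' ')}")
    image[file_offset:file_offset + len(new)] = new
```

the content check also makes the patch safe to re-run: a second attempt on an already patched file fails instead of silently corrupting it.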
backgrounds went from 4× oversized to 1× screen size, sampling from 4× texture data - supersampling.
the bg-is-black aha moment
then i tried the same patch on the in-game scene background (a different
function, FUN_140193fe0). black screen.
the symptom: the bg quad was being rendered far below the visible NDC range. y from -6.5 to -4.6 - both off-screen bottom.
the patch was naively dividing sprite_w by 4, but the background's
clip-space coordinates are computed as (pos_pixel - origin) / scaler
where the scaler depends on the texture size. with the texture 4× and the
divide making the scaler 1/4 of expected, the result was a 4× clip-space
quad, perfectly positioned to render off-screen.
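a toy version of that formula shows why dividing the scaler is the wrong direction. the numbers are invented for the example (480 as origin and scaler is an assumption, not something read out of the binary); only the shape of the formula comes from the decompilation.

```python
def clip_coord(pos_pixel: float, origin: float, scaler: float) -> float:
    """The background path's clip-space formula: (pos_pixel - origin) / scaler."""
    return (pos_pixel - origin) / scaler


# hypothetical values, chosen so the arithmetic is exact
pos_pixel, origin, scaler = 960.0, 480.0, 480.0

expected = clip_coord(pos_pixel, origin, scaler)      # 1.0, the edge of the screen
broken = clip_coord(pos_pixel, origin, scaler / 4.0)  # 4.0, far outside [-1, 1]
print(expected, broken)
```

shrinking the divisor by 4 grows every coordinate by 4, which is how a quad ends up entirely outside the visible range.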
the fix needed to be different. instead of patching the sprite dimensions
before the math, i needed to scale the clip coordinates after. a
5-byte JMP at the original divss xmm9, xmm1 site redirected execution
into a code cave at the end of the .text section (~220 bytes of zero
padding, lovely free real estate). the cave:
divss xmm9, xmm1 ; the displaced original
mulss xmm6, [rip+0.25f] ; right clip-X
mulss xmm7, [rip+0.25f] ; left clip-X
mulss xmm8, [rip+0.25f] ; top clip-Y
mulss xmm9, [rip+0.25f] ; bottom clip-Y
jmp <return address>
after this, every bg rendered at 1× size with 4× detail.
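the 5-byte detour is an E9 opcode followed by a signed 32-bit displacement measured from the end of the jump instruction. a sketch of computing those bytes, with made-up addresses (the real patch site and cave addresses aren't reproduced here, and encode_jmp_rel32 is a hypothetical helper):

```python
import struct


def encode_jmp_rel32(site: int, target: int) -> bytes:
    """Encode `jmp target` as E9 rel32, placed at address `site`."""
    rel = target - (site + 5)  # displacement is relative to the next instruction
    return b"\xe9" + struct.pack("<i", rel)  # raises struct.error if out of range


# hypothetical addresses: a patch site and a code cave 0x1000 bytes later
site, cave = 0x140001000, 0x140002000
jump_in = encode_jmp_rel32(site, cave)       # site -> cave
jump_back = encode_jmp_rel32(cave, site + 5) # cave -> instruction after the detour
print(jump_in.hex(" "), "/", jump_back.hex(" "))
```

the jump back out of the cave uses the same encoding, and its displacement is measured from wherever that final jmp actually sits inside the cave.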
the multi-frame format
with bg + sprite paths working, i started a full upscale of every .exg in the game. 540 files later, 32 of them failed to round-trip. they were "non-version-1" files i'd been skipping.
turns out the version field at offset 0x04 of the header isn't a version
at all - it's the frame count. these 32 files are multi-frame asset
bundles. trophy.exg has 49 frames of a 240×240 trophy animation.
title.exg has 6 frames of varying dimensions (the title bg, the logo
text, the "REVISITED" subtitle, etc).
each non-first frame is preceded by a 32-byte sub-header that's exactly the main header's bytes 0x08–0x27 (the header without magic + frame_count). each sub-header contains that frame's dimensions, pitch, and format tag. some files even mix format tags across frames (raw RGBA8 frame 0 followed by DXT5 frame 1!).
extending exg_tool.py to read/write multi-frame files came next.
byte-identical round-trip on the 49-frame trophy confirmed the format
parser was correct.
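the header relationship described above can be sketched as a split of the 40-byte main header into magic, frame count, and a 32-byte frame-0 descriptor with the same layout as the later sub-headers. this is not the real parser: i'm assuming the frame count is a little-endian u32, and split_main_header is a made-up name. the fields inside the 32-byte block are left opaque here.

```python
import struct

HEADER_SIZE = 40     # full main header
SUBHEADER_SIZE = 32  # per-frame descriptor, same layout as header bytes 0x08-0x27
MAGIC = b"EXGr"


def split_main_header(header: bytes) -> tuple[int, bytes]:
    """Return (frame_count, frame0_descriptor) from a 40-byte EXG main header."""
    if len(header) != HEADER_SIZE:
        raise ValueError(f"expected {HEADER_SIZE} header bytes, got {len(header)}")
    if header[:4] != MAGIC:
        raise ValueError("not an EXG file (bad magic)")
    (frame_count,) = struct.unpack_from("<I", header, 0x04)  # assumed u32 LE
    frame0_descriptor = header[0x08:0x28]
    return frame_count, frame0_descriptor
```

treating frame 0's descriptor and the later sub-headers as the same 32-byte shape means one descriptor parser can serve every frame.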
YPAC archives, uncompressed DDS, and 580 portraits
pc/portrait.dat is 178 mb. inside: 580 character portrait .exg files
plus a _packList.txt. wrapped in a YPAC archive format that turned out
to be straightforward - 16-byte header, 72-byte entry records (filename
padded to 32 bytes, then 32 bytes of padding, then u32 size + u32 offset
into the .dat). wrote ypac_tool.py, byte-identity round-trip on the
first try.
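the 72-byte entry record maps neatly onto a single struct format string. a sketch under a few assumptions that go beyond what's described above: little-endian integers, NUL-padded names, and ASCII filenames. YpacEntry and parse_entry are names invented for the example.

```python
import struct
from typing import NamedTuple

# 32-byte name, 32 bytes of padding, u32 size, u32 offset = 72 bytes
ENTRY = struct.Struct("<32s32xII")


class YpacEntry(NamedTuple):
    name: str
    size: int
    offset: int


def parse_entry(record: bytes) -> YpacEntry:
    """Decode one 72-byte YPAC entry record."""
    raw_name, size, offset = ENTRY.unpack(record)
    name = raw_name.split(b"\x00", 1)[0].decode("ascii")  # assumed NUL padding
    return YpacEntry(name, size, offset)


record = ENTRY.pack(b"_packList.txt", 123, 4096)  # synthetic record for the demo
print(ENTRY.size, parse_entry(record))
```

using the same Struct for packing and unpacking is what makes a byte-identical round-trip cheap to verify, as long as the padding really is all zeros.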
efc/effect.dat was 250 mb of effect sprites in the unk20 = 0x3 raw
DDS format i mentioned earlier. extending exg_tool.py to NOT zstd-wrap
those payloads took 20 minutes.
then i upscaled everything: 580 portraits, 145 effects, 54 encounter sprites, plus all the standalone files. 14 minutes for the portrait archive alone (real-esrgan + ncnn-vulkan + python orchestration through multiprocessing). the upscaled portrait.dat is 2.5 gb.
the portraits-look-wrong wall
launched. the portrait textures were now 4× resolution. but they were all crammed together in the center-right of the bottom HUD, overlapping. all six characters compressed into one spot.
the apitrace told me each portrait's vertex buffer had different positions - spaced ~0.06 NDC apart instead of the ~0.33 you'd want to spread 6 across the screen. exactly 4× too tight. so the engine had some position math that also read sprite dimensions, but my FUN_140194a70 patch only covered the quad-size path, not the layout path.
i spawned a ghidra agent (multi-step task, claude code's Agent tool +
specialized prompt with the full investigation brief) to find the
function. it found FUN_140195fd0 - a sibling of the quad builder in the
same vtable. patched it. zero visible effect. the agent had been wrong -
turned out FUN_140195fd0 was CWipeSpriteRender, used for screen
transitions, not portraits at all.
a second agent run, with better targeting ("look at the 26 callers of the
generic sprite submitter FUN_1400b2040, find one that loops over 6 party
members"), found FUN_140057b00. inside it, the per-portrait positions
weren't computed - they were looked up from data/table/layout.lod by
ID.
the data-driven layout
layout.lod is a text-record file. 1196 entries. each has an ID, a
name like ShowCockpitPortraitRank3 or MemberCardPartyOrder1, and X/Y/Z
coordinates. the engine reads these to position UI elements.
the positions were hardcoded. my texture upscale didn't affect them. but they were apparently being interpreted by an engine pipeline that applied some 4×-related compression on the way to clip space.
wrote scale_layout.py. multiplied portrait entry X coordinates by 4
around design center 480. relaunched. portraits now positioned correctly
across the screen - 3 left of center, 3 right, with the HUD menu in the
gap.
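"multiplied by 4 around design center 480" means scaling the distance from the center, not the raw coordinate. a minimal sketch of that transform (scale_about_center is a made-up name, and the sample X values are invented, not real layout.lod entries):

```python
DESIGN_CENTER_X = 480
SCALE = 4


def scale_about_center(x: int, center: int = DESIGN_CENTER_X, factor: int = SCALE) -> int:
    """Scale the offset from the center by factor, keeping the center fixed."""
    return center + (x - center) * factor


# hypothetical portrait X positions clustered around the center
for x in (460, 480, 500):
    print(x, "->", scale_about_center(x))
```

a point at the center stays put and everything else moves outward symmetrically, which is why this only suits layouts that are centered to begin with.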
the limit i can't get past
then i did the same for HP bars, level labels, action icons (72 HUD entries
total). some of them moved to plausible positions. others ended up
half-off-screen. the menu items (Party / Hide / Map / Info / Option / Order) crashed the game when scaled.
different UI elements anchor differently. some are top-left, some are center-pivoted, some are offsets-from-parent-container. a single global "scale 4× from center 480" works for the portrait row because it's a horizontal centered layout. it breaks for vertical menus and corner-anchored widgets.
i tried to find a single point in the engine where coordinates are
"compressed by 4× for upscaled textures". an agent confirmed there's an
add cx, cx; add cx, cx (×4) in FUN_140146b70 that scales all layout
lookups, gated by a per-entry "no-scale" flag. but the agent never found a
separate compression that scales-with-texture-size. portraits got
compressed by 4× because that's how the engine internally positions them
relative to the (4× upscaled) sprite atlas they belong to.
i tried scaling texparts.lod (the file that maps named UI parts to atlas
sub-regions, with LEFT/TOP/WIDTH/HEIGHT/RIGTH/BOTTOM fields). black
screen, then noticed in the trace that menu glyphs were being rendered as
adjacent quads of width 0.2 NDC each - 9 of them extending to x=1.75,
mostly off-screen.
that was the killer insight. WIDTH in texparts.lod serves a dual
purpose: it's both the UV sample width in the atlas AND the
cursor-advance between glyphs in horizontal text rendering. scaling
WIDTH 4× gives 4× bigger glyphs but also 4× spacing between them, so the
text strip runs 4× off the right edge of the screen. there's no way to
decouple these in a static file edit.
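a toy model of the coupling: if one width value drives both the quad size and the cursor advance, the text strip's total length scales with it. the start position below is back-computed from the trace numbers (9 quads of 0.2 ending at x=1.75), the 0.05 width is just 0.2 / 4, and glyph_quads is an invented helper, not engine code.

```python
def glyph_quads(start_x: float, glyph_width: float, count: int) -> list[tuple[float, float]]:
    """Lay out count glyphs left to right; width doubles as the cursor advance."""
    return [
        (start_x + i * glyph_width, start_x + (i + 1) * glyph_width)
        for i in range(count)
    ]


start_x = -0.05                             # derived: 1.75 - 9 * 0.2
unscaled = glyph_quads(start_x, 0.05, 9)    # assumed 1x glyph width
scaled = glyph_quads(start_x, 0.2, 9)       # width after the 4x texparts edit

print("1x strip ends at", round(unscaled[-1][1], 3))
print("4x strip ends at", round(scaled[-1][1], 3))
```

fixing this needs two separate values in the renderer, one for UV sampling and one for the advance, which a single WIDTH field can't express.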
where i am right now
three evenings in:
what's at 4×: backgrounds, enemies, NPC sprites, character portraits, 138 visual effects, encounter sprites, box UI graphics, single-frame comtex elements, the picturegate CG scenes.
what's still at 1×: title screen, trophy icons, the menu text
atlases (comtex_en/jp, com_05_en/jp). these are multi-frame
glyph-atlas files where the engine's glyph renderer can't be tricked by
file-level edits.
binary patches alive: just the bg code-cave (4 mulss instructions
on clip-space coordinates and a JMP detour). every sprite_w /= 4 patch
got removed once i realized they conflicted with the texture upscale.
tooling i now have: exg_tool.py (handles 4 different EXG variants,
multi-frame round-trip), ypac_tool.py (extract/repack YPAC archives),
upscale_all_4x.py (batch driver), scale_layout.py,
scale_texparts.py, a built-from-source apitrace wrapper, a ghidra
project full of decompiled functions, and a small graveyard of code-cave
designs that didn't pan out.
most of the in-game art is now being rendered from 4× source textures instead of the original 1×. the HUD layout has bugs i haven't solved.
what's next
next session is frida. it's installed, ready to attach to the wine process. the plan is to hook the function that produces those wrong HUD positions while the game is running on the broken screen, watch what values flow through, and patch the specific transform that needs patching. no more guessing from traces.
if that works, the HP bars + level labels + action icons could finally join the portraits at 4×. and then the only thing left at 1× would be the text glyph atlas - which i suspect needs its own engine-level patch to decouple "UV sample width" from "cursor advance". one wall at a time.
what i'd do differently if i started over
i kept reaching for static file edits because they felt safer than binary patches. but every "let me just change this data file" attempt ran into the same wall: the engine's renderer applies transformations after reading the file, and i don't control those transformations from the file side.
the right tool from evening 2 onward would have been runtime instrumentation. traces tell you the result of the math, not the math itself. you can decode every vertex blob and still not know which engine function produced it. that's the gap frida closes - and i wish i'd reached for it on day one.
also: when a multi-step task is bounded but big (decompile 26 callers,
analyze each), claude code's Agent tool is genuinely useful. the brief
matters a lot - vague briefs return vague reports. but the second agent
run (focused, with concrete hypotheses to test) actually unblocked me.
more updates to come. i'm not done yet.