trying to 4× upscale a 2016 jrpg's textures

stranger of sword city revisited is a jrpg i was playing on a 4k monitor. the original assets are at a lower resolution and the engine renders them without any high-DPI scaling, so the textures end up looking soft. i wanted to try replacing them with 4× ai-upscaled versions.

what follows is three evenings (so far - this is a story still in progress) of figuring out the file formats, the engine's rendering paths, and where the limits are.

the file format problem

textures live in data/graphic/*.exg. each file has a 40-byte header that starts with the EXGr magic, followed by a zstd-compressed payload. easy enough - write a python tool that decompresses, hands the bytes to PIL or imagemagick, runs real-esrgan-ncnn-vulkan on it, and re-compresses.

the header has fields like width, height, width2, height2, and a mysterious unk20 field. early on i learned:

unk20 = 0x201 or 0x200 means raw RGBA8 pixels inside
unk20 = 0x203 means a DXT5 DDS file inside
unk20 = 0x3 means... a DXT5 DDS file stored as-is, with no zstd wrapper at all. found this one accidentally when 145 effect files all errored at once
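the three cases above boil down to a small lookup. this is a sketch of how a tool could dispatch on the tag, not the actual exg_tool.py code; the names PAYLOAD_KINDS and describe_payload are made up for illustration, and the table only covers the tags i've seen so far.

```python
# payload kinds keyed by the unk20 tag, as observed so far (not an official spec).
# each value is (pixel format, is the payload zstd-wrapped?)
PAYLOAD_KINDS = {
    0x200: ("raw RGBA8", True),
    0x201: ("raw RGBA8", True),
    0x203: ("DXT5 DDS", True),
    0x003: ("DXT5 DDS", False),  # stored as-is, no zstd wrapper
}


def describe_payload(unk20: int) -> tuple[str, bool]:
    """Return (pixel format, is_zstd_wrapped) for a known unk20 tag."""
    try:
        return PAYLOAD_KINDS[unk20]
    except KeyError:
        # fail loudly on tags nobody has looked at yet
        raise ValueError(f"unknown unk20 tag: {unk20:#x}") from None
```

failing loudly on unknown tags is deliberate: the 0x3 variant only showed up because a batch of files errored instead of being silently mangled.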

width and width2 are usually the same. usually. but in atlas-style files (portraits with cropped visible regions) they differ - the "stored" size is bigger than the visible one. DXT5's 4×4 block alignment adds slack too, e.g. width=1008 with width2=1007. you don't notice any of this until you write back a file that the engine then refuses to load.
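the 1008-vs-1007 case is what you'd expect from DXT5 storing pixels in 4×4 blocks: the stored dimension gets rounded up to a multiple of 4. a minimal sketch of that rounding (dxt_stored_size is a hypothetical helper name, and i'm assuming the slack in these files comes from block alignment and nothing else):

```python
def dxt_stored_size(visible: int, block: int = 4) -> int:
    """Round a visible dimension up to the next multiple of the DXT block size."""
    return (visible + block - 1) // block * block


# the example from the text: 1007 visible pixels need 1008 stored pixels
stored = dxt_stored_size(1007)
```

a dimension that is already a multiple of 4 comes back unchanged, which matches the "usually the same" behaviour.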

so the first wall was just understanding the format well enough to round-trip a file byte-identically. it took embarrassingly many attempts.

the texture-renders-at-4× problem

ok, format figured out. swap a bg_001.exg for a 4× upscaled version. launch the game. the background is now 4× too big in each direction - only its top-left corner is on screen. the engine reads the texture's width/height from the header, computes screen positions as if it's working with 1× textures, and now everything is 4× too big.

i tried the obvious stuff first:

modified the shaders. they're literal passthrough - gl_Position = u_projMtx * u_viewMtx * u_modelMtx * vec4(a_position, 1.0). the positions are pre-computed on the CPU and uploaded as vertex data. scaling in the shader collapses everything toward NDC origin.
changed display.dat. ignored.
wrote an LD_PRELOAD opengl wrapper to intercept glTexImage2D and rescale. wine routes texture uploads through internal paths that bypass procaddr interception, so the hook caught 2 calls one run and 3479 the next. unusable.

what i needed was the function that takes a texture's dimensions and computes the quad's screen coordinates. binary patch territory.

ghidra, finally

i'd been avoiding it. firing up ghidra felt like committing. but there was no path forward without it.

apitrace came first - built from source as a mingw cross-compile (just the wgltrace target because the full build hit gcc 15 strictness issues in dxerr.cpp). dropped the resulting opengl32.dll into the game folder with WINEDLLOVERRIDES=opengl32=n,b. now i had ~200mb traces of every opengl call the engine made per session.

with the trace + ghidra i found FUN_140194a70: a sprite quad builder that reads width and height from [rdi+0x24] and [rdi+0x26] of a sprite descriptor struct, converts them to floats, and uses them in the quad math. the relevant bytes:

0x140194bfa:  66 0f 6e e0  movd     xmm4, eax       ; eax = sprite_w
0x140194bfe:  0f 5b e4     cvtdq2ps xmm4, xmm4

7 bytes total. i replaced it with:

0x140194bfa:  c1 e8 02              shr eax, 2          ; sprite_w /= 4
0x140194bfd:  c5 ea 2a e0           vcvtsi2ss xmm4, xmm2, eax

also 7 bytes. the VEX-encoded vcvtsi2ss is a 4-byte instruction that copies the destination's upper bits from its first source operand, so xmm2 had to be known-zero at that point. it took me a while to realize that a non-VEX cvtsi2ss leaves xmm4's old upper bits in place, and that garbage was poisoning a downstream movaps. same trick for sprite_h.
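a same-length patch like this is easy to apply safely from a script: check that the expected bytes are at the site, then overwrite them. this is a sketch, not my actual patcher; apply_patch is a hypothetical helper, and translating the virtual address 0x140194bfa into a file offset depends on the PE section table, which is left out here.

```python
# original and replacement byte sequences from the listing above
OLD = bytes.fromhex("66 0f 6e e0 0f 5b e4")  # movd xmm4, eax ; cvtdq2ps xmm4, xmm4
NEW = bytes.fromhex("c1 e8 02 c5 ea 2a e0")  # shr eax, 2 ; vcvtsi2ss xmm4, xmm2, eax


def apply_patch(image: bytearray, offset: int, old: bytes, new: bytes) -> None:
    """Overwrite `old` with `new` at `offset`, refusing to touch unexpected bytes."""
    if len(old) != len(new):
        raise ValueError("patch must not change the instruction stream length")
    if image[offset:offset + len(old)] != old:
        raise ValueError("unexpected bytes at patch site")
    image[offset:offset + len(new)] = new
```

the length check matters because anything other than a byte-for-byte replacement would shift every instruction after the patch site.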

backgrounds went from 4× oversized to 1× screen size, sampling from 4× texture data - supersampling.

the bg-is-black aha moment

then i tried the same patch on the in-game scene background (a different function, FUN_140193fe0). black screen.

the symptom: the bg quad was being rendered far below the visible NDC range. y from -6.5 to -4.6 - both off-screen bottom.

the patch was naively dividing sprite_w by 4, but the background's clip-space coordinates are computed as (pos_pixel - origin) / scaler where the scaler depends on the texture size. with the texture 4× and the divide making the scaler 1/4 of expected, the result was a 4× clip-space quad, perfectly positioned to render off-screen.

the fix needed to be different. instead of patching the sprite dimensions before the math, i needed to scale the clip coordinates after. a 5-byte JMP at the original divss xmm9, xmm1 site redirected execution into a code cave at the end of the .text section (~220 bytes of zero padding, lovely free real estate). the cave:

divss   xmm9, xmm1         ; the displaced original
mulss   xmm6, [rip+0.25f]  ; right clip-X
mulss   xmm7, [rip+0.25f]  ; left clip-X
mulss   xmm8, [rip+0.25f]  ; top clip-Y
mulss   xmm9, [rip+0.25f]  ; bottom clip-Y
jmp     <return address>

after this, every bg rendered at 1× size with 4× detail.
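for reference, the 5-byte detour is an E9 opcode followed by a signed 32-bit displacement measured from the end of the jump. a small sketch of the encoding (encode_jmp_rel32 is a hypothetical helper, and the addresses in the usage lines are made-up placeholders, not the real patch site or cave address):

```python
import struct


def encode_jmp_rel32(src: int, dst: int) -> bytes:
    """Encode `jmp dst` as E9 rel32 for an instruction placed at address `src`."""
    rel = dst - (src + 5)  # displacement is relative to the end of the 5-byte jmp
    if not -2**31 <= rel < 2**31:
        raise ValueError("target out of rel32 range")
    return b"\xe9" + struct.pack("<i", rel)


# placeholder addresses: jump from a patch site into a cave, then back again
into_cave = encode_jmp_rel32(0x1000, 0x2000)
back_out = encode_jmp_rel32(0x2000, 0x1005)
```

the return jump at the end of the cave uses the same encoding, targeting the instruction right after the overwritten 5 bytes.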

the multi-frame format

with bg + sprite paths working, i started a full upscale of every .exg in the game. 540 files later, 32 of them failed to round-trip. they were "non-version-1" files i'd been skipping.

turns out the version field at offset 0x04 of the header isn't a version at all - it's the frame count. these 32 files are multi-frame asset bundles. trophy.exg has 49 frames of a 240×240 trophy animation. title.exg has 6 frames of varying dimensions (the title bg, the logo text, the "REVISITED" subtitle, etc).

each non-first frame is preceded by a 32-byte sub-header that's exactly the main header's bytes 0x08–0x27 (without magic+frame_count). they contain per-frame dimensions, pitch, and format tag. some files even mix format tags across frames (raw RGBA8 frame 0 followed by DXT5 frame 1!).

extending exg_tool.py to read/write multi-frame files came next. byte-identical round-trip on the 49-frame trophy confirmed the format parser was correct.
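the layout described above can be sketched as a split of the 40-byte main header into its three parts. this is an illustration, not exg_tool.py itself: split_main_header is a hypothetical name, and i'm assuming the frame count is a little-endian u32 filling bytes 0x04-0x07.

```python
import struct

HEADER_SIZE = 40       # magic (4) + frame count (4) + 32-byte frame block
FRAME_BLOCK_SIZE = 32  # same layout as the sub-header before each later frame


def split_main_header(header: bytes) -> tuple[int, bytes]:
    """Return (frame_count, frame-0 block) from a 40-byte EXG main header."""
    if len(header) != HEADER_SIZE:
        raise ValueError("main header must be exactly 40 bytes")
    # assumption: frame count is a little-endian u32 at offset 0x04
    magic, frame_count = struct.unpack_from("<4sI", header, 0)
    if magic != b"EXGr":
        raise ValueError("not an EXG file")
    return frame_count, header[8:8 + FRAME_BLOCK_SIZE]
```

treating the frame-0 block and the later sub-headers as the same 32-byte structure means one parser can handle every frame, including files that mix format tags.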

YPAC archives, uncompressed DDS, and 580 portraits

pc/portrait.dat is 178 mb. inside: 580 character portrait .exg files plus a _packList.txt. wrapped in a YPAC archive format that turned out to be straightforward - 16-byte header, 72-byte entry records (filename padded to 32 bytes, then 32 bytes of padding, then u32 size + u32 offset into the .dat). wrote ypac_tool.py, byte-identity round-trip on the first try.
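the 72-byte entry record maps directly onto a struct format string. a sketch under assumptions: little-endian integers, NUL-padded ASCII filenames, and a hypothetical parse_entry helper (the 16-byte archive header isn't covered here).

```python
import struct

# 32-byte name, 32 bytes of padding, u32 size, u32 offset = 72 bytes
ENTRY = struct.Struct("<32s32xII")


def parse_entry(record: bytes) -> tuple[str, int, int]:
    """Return (filename, size, offset) from one 72-byte YPAC entry record."""
    name, size, offset = ENTRY.unpack(record)
    # assumption: names are ASCII and padded with NUL bytes
    return name.rstrip(b"\0").decode("ascii"), size, offset
```

struct.Struct.unpack raises struct.error if the record isn't exactly 72 bytes, which is a cheap sanity check while walking the entry table.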

efc/effect.dat was 250 mb of effect sprites in the unk20 = 0x3 raw DDS format i mentioned earlier. extending exg_tool.py to NOT zstd-wrap those payloads took 20 minutes.

then i upscaled everything: 580 portraits, 145 effects, 54 encounter sprites, plus all the standalone files. 14 minutes for the portrait archive alone (real-esrgan + ncnn-vulkan + python orchestration through multiprocessing). the upscaled portrait.dat is 2.5 gb.

the portraits-look-wrong wall

launched. the portrait textures were now 4× resolution. but they were all crammed together in the center-right of the bottom HUD, overlapping. all six characters compressed into one spot.

the apitrace told me each portrait's vertex buffer had different positions, spaced ~0.06 NDC apart instead of the ~0.33 you'd want to spread 6 across the screen. exactly 4× too tight. so the engine had some position math that also read sprite dimensions, but my FUN_140194a70 patch only covered the quad-size path, not the layout path.

i spawned a ghidra agent (multi-step task, claude code's Agent tool + specialized prompt with the full investigation brief) to find the function. it found FUN_140195fd0 - a sibling of the quad builder in the same vtable. patched it. zero visible effect. the agent had been wrong - turned out FUN_140195fd0 was CWipeSpriteRender, used for screen transitions, not portraits at all.

a second agent run, with better targeting ("look at the 26 callers of the generic sprite submitter FUN_1400b2040, find one that loops over 6 party members"), found FUN_140057b00. inside it, the per-portrait positions weren't computed - they were looked up from data/table/layout.lod by ID.

the data-driven layout

layout.lod is a text-record file. 1196 entries. each has an ID, a name like ShowCockpitPortraitRank3 or MemberCardPartyOrder1, and X/Y/Z coordinates. the engine reads these to position UI elements.

the positions were hardcoded. my texture upscale didn't affect them. but they were apparently being interpreted by an engine pipeline that applied some 4×-related compression on the way to clip space.

wrote scale_layout.py. multiplied portrait entry X coordinates by 4 around design center 480. relaunched. portraits now positioned correctly across the screen - 3 left of center, 3 right, with the HUD menu in the gap.
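"multiply by 4 around center 480" means scaling each coordinate's distance from the center, not the coordinate itself. a minimal sketch of that transform (scale_about_center is a hypothetical name, not necessarily what scale_layout.py calls it, and the sample X values are made up):

```python
def scale_about_center(x: float, factor: float = 4.0, center: float = 480.0) -> float:
    """Scale x's distance from `center` by `factor`, leaving the center fixed."""
    return center + (x - center) * factor


# made-up sample X values: one left of center, the center itself, one right
scaled = [scale_about_center(x) for x in (460.0, 480.0, 500.0)]
```

the center stays put and everything else spreads out symmetrically, which is why this suits a horizontally centered row.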

the limit i can't get past

then i did the same for HP bars, level labels, action icons (72 HUD entries total). some of them moved to plausible positions. others ended up half-off-screen. the menu items (Party / Hide / Map / Info / Option / Order) crashed the game when scaled.

different UI elements anchor differently. some are top-left, some are center-pivoted, some are offsets-from-parent-container. a single global "scale 4× from center 480" works for the portrait row because it's a horizontal centered layout. it breaks for vertical menus and corner-anchored widgets.

i tried to find a single point in the engine where coordinates are "compressed by 4× for upscaled textures". an agent confirmed there's an add cx, cx; add cx, cx (×4) in FUN_140146b70 that scales all layout lookups, gated by a per-entry "no-scale" flag. but the agent never found a separate compression that scales-with-texture-size. portraits got compressed by 4× because that's how the engine internally positions them relative to the (4× upscaled) sprite atlas they belong to.

i tried scaling texparts.lod (the file that maps named UI parts to atlas sub-regions, with LEFT/TOP/WIDTH/HEIGHT/RIGTH/BOTTOM fields). black screen. then i noticed in the trace that menu glyphs were being rendered as adjacent quads of width 0.2 NDC each - 9 of them extending to x=1.75, mostly off-screen.

that was the killer insight. WIDTH in texparts.lod serves a dual purpose: it's both the UV sample width in the atlas AND the cursor-advance between glyphs in horizontal text rendering. scaling WIDTH 4× gives 4× bigger glyphs but also 4× spacing between them, so the text strip runs 4× off the right edge of the screen. there's no way to decouple these in a static file edit.
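the coupling is easy to see in a toy model of a horizontal glyph loop where one width value drives both the quad size and the cursor advance. this is my guess at the shape of the engine's logic, not decompiled code; layout_glyphs is a hypothetical function, and the start position of -0.05 is just what makes 9 quads of width 0.2 end at x=1.75.

```python
def layout_glyphs(start_x: float, width: float, count: int) -> list[tuple[float, float]]:
    """Place `count` glyph quads side by side; `width` is also the cursor advance."""
    quads = []
    cursor = start_x
    for _ in range(count):
        quads.append((cursor, cursor + width))
        cursor += width  # the same WIDTH value moves the cursor
    return quads


normal = layout_glyphs(-0.05, 0.05, 9)  # 1x widths: strip stays on screen
scaled = layout_glyphs(-0.05, 0.20, 9)  # 4x widths: strip runs past NDC x = 1.0
```

with a single width parameter there is nothing to tune: any value that fixes the UV sampling also stretches the strip by the same factor.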

where i am right now

three evenings in:

what's at 4×: backgrounds, enemies, NPC sprites, character portraits, 138 visual effects, encounter sprites, box UI graphics, single-frame comtex elements, the picturegate CG scenes.

what's still at 1×: title screen, trophy icons, the menu text atlases (comtex_en/jp, com_05_en/jp). these are multi-frame glyph-atlas files where the engine's glyph renderer can't be tricked by file-level edits.

binary patches alive: just the bg code-cave (4 mulss instructions on clip-space coordinates and a JMP detour). every sprite_w /= 4 patch got removed once i realized they conflicted with the texture upscale.

tooling i now have: exg_tool.py (handles 4 different EXG variants, multi-frame round-trip), ypac_tool.py (extract/repack YPAC archives), upscale_all_4x.py (batch driver), scale_layout.py, scale_texparts.py, a built-from-source apitrace wrapper, a ghidra project full of decompiled functions, and a small graveyard of code-cave designs that didn't pan out.

most of the in-game art is now being rendered from 4× source textures instead of the original 1×. the HUD layout has bugs i haven't solved.

what's next

next session is frida. it's installed, ready to attach to the wine process. the plan is to hook the function that produces those wrong HUD positions while the game is running on the broken screen, watch what values flow through, and patch the specific transform that needs patching. no more guessing from traces.

if that works, the HP bars + level labels + action icons could finally join the portraits at 4×. and then the only thing left at 1× would be the text glyph atlas - which i suspect needs its own engine-level patch to decouple "UV sample width" from "cursor advance". one wall at a time.

what i'd do differently if i started over

i kept reaching for static file edits because they felt safer than binary patches. but every "let me just change this data file" attempt ran into the same wall: the engine's renderer applies transformations after reading the file, and i don't control those transformations from the file side.

the right tool from evening 2 onward would have been runtime instrumentation. traces tell you the result of the math, not the math itself. you can decode every vertex blob and still not know which engine function produced it. that's the gap frida closes - and i wish i'd reached for it on day one.

also: when a multi-step task is bounded but big (decompile 26 callers, analyze each), claude code's Agent tool is genuinely useful. the brief matters a lot - vague briefs return vague reports. but the second agent run (focused, with concrete hypotheses to test) actually unblocked me.

more updates to come. i'm not done yet.