[Dev Log] Scrapping 4 Months of Code: The Failure of Photon Fusion

Author: Logan Kim

Cost, Control, and the Limits of Solo Development

With the pride of a 15-year server architect on the line, I designed this game’s infrastructure to run on Kubernetes on GCP (Google Cloud Platform). With load balancing and auto-scaling factored in, it was, on paper, a logically flawless, smart configuration.

However, as soon as I deployed that heavy Advanced KCC module I mentioned in the previous post onto the dedicated server and began stress testing, ominous logs started appearing on my monitor.

1. Screaming CPUs and Brutal Calculations

My local development server is a 13th-gen i5 machine. As I incrementally increased the simulated concurrent users to apply load, the local server's coolers started screaming. I immediately sensed this was more than just a lack of optimization.

To get objective metrics, I cross-checked the numbers with several AI assistants (Gemini, GPT, Claude). The results were brutal. On a medium GCP Kubernetes node (4 vCPUs), the math showed that sustaining 200 concurrent users while keeping these heavy physics calculations and network synchronization running was next to impossible.
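A rough back-of-the-envelope version of that math, as a sketch only. It assumes the 60 Hz server tick used in this project and a perfectly even spread of work across all four vCPUs, which is optimistic. The function name is illustrative and not part of any library.

```python
def cpu_budget_per_player_ms(vcpus: int, tick_rate_hz: float, players: int) -> float:
    """Upper bound on CPU time (ms) available per player per server tick.

    Assumes work spreads perfectly across all vCPUs and ignores OS,
    Kubernetes, and networking overhead, so the real budget is smaller.
    """
    tick_interval_ms = 1000.0 / tick_rate_hz          # wall-clock time per tick
    total_cpu_ms_per_tick = vcpus * tick_interval_ms  # CPU time across all cores
    return total_cpu_ms_per_tick / players


budget = cpu_budget_per_player_ms(vcpus=4, tick_rate_hz=60, players=200)
print(f"{budget:.3f} ms per player per tick")  # prints "0.333 ms per player per tick"
```

Under these assumptions, each player gets about a third of a millisecond per tick for physics, state updates, and serialization combined. That leaves very little headroom for a heavy character controller.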

2. The Uncontrollable Tick Rate

One virtue of an engineer is knowing when to admit, quickly, that something doesn't work.

Digging into the root cause, the core issue lay in how Photon Fusion handled tick rates. I had set the network synchronization cycle to 30 (Client) / 60 (Server) / 30 (Client). This meant the client sent data 30 times a second, while the server simulated at 60 ticks a second. Lowering the server tick immediately caused severe jitter, with characters stuttering across the screen.

The foundation of MMORPG architecture is dynamic resource allocation. To protect server CPUs, you must lower the tick rate in crowded towns or peaceful scenes, and increase it during combat where precise hit registration is required.
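The idea of a per-scene tick rate can be sketched as a fixed-timestep loop whose rate is looked up from the current scene. This is a generic illustration, not Photon Fusion's API. The scene names, the rates for "town" and "field", and the class name are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-scene tick rates in Hz; values are illustrative only.
SCENE_TICK_RATES = {"town": 15, "field": 30, "combat": 60}


@dataclass
class ZoneSimulation:
    scene: str
    pending_ticks: float = 0.0  # elapsed time measured in ticks of the current scene
    ticks_run: int = 0

    def advance(self, dt_seconds: float) -> None:
        """Fixed-timestep loop: run every whole tick that fits in dt_seconds."""
        self.pending_ticks += dt_seconds * SCENE_TICK_RATES[self.scene]
        while self.pending_ticks >= 1.0:
            self.pending_ticks -= 1.0
            self.ticks_run += 1  # a real server would step physics here

    def change_scene(self, scene: str) -> None:
        # Later calls to advance() use the new scene's tick rate.
        self.scene = scene


zone = ZoneSimulation(scene="town")
zone.advance(1.0)                      # one second in a crowded town
town_ticks = zone.ticks_run
zone.change_scene("combat")
zone.advance(1.0)                      # one second of combat
combat_ticks = zone.ticks_run - town_ticks
print(town_ticks, combat_ticks)        # prints "15 60"
```

In this sketch the town costs a quarter of the simulation steps that combat does, which is the kind of saving the dynamic approach is after.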

So, I started refactoring to dynamically adjust the tick rate per scene. But no matter what I did, it wouldn't adjust. The AIs just spat out hallucinated garbage, combining APIs that didn't even exist. After countless rounds of decompiling and trial and error, the truth I faced was devastating: "Photon Fusion cannot dynamically change its tick rate at runtime after the server has started."

3. Meaningless Struggles

It pained me to lose the 4 months of code and time I had invested in Photon Fusion. I wanted to salvage it somehow.

Since I couldn't touch the server's tick, I tried forcibly reducing the rate at which the client sent packets to the server. But structurally, this was a pointless endeavor.

The server engine was still spinning relentlessly at 60Hz, waiting for data and calculating. When a town is packed with users, the broadcasting load grows roughly with the square of the player count, because every player's state has to reach every other player. In that situation, forcibly throttling the client's send rate did nothing for the server; it merely added dirty exception-handling code to Unity's Update() loop. My attempt to defend the server CPU was a complete failure.
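A toy model makes the point concrete. It assumes a naive all-to-all broadcast in one crowded area and treats the second "30 (Client)" value as the rate at which the server sends state to clients, which is my reading of the setting. The function and its field names are hypothetical.

```python
def server_load_per_second(players: int, server_tick_hz: int,
                           client_send_hz: int, snapshot_hz: int) -> dict:
    """Naive all-to-all load model for one crowded area.

    Assumes every player is simulated on every server tick and every
    snapshot carries each other player's state to each client.
    """
    return {
        "inbound_packets": players * client_send_hz,
        "simulation_steps": players * server_tick_hz,
        "outbound_state_updates": players * (players - 1) * snapshot_hz,
    }


before = server_load_per_second(200, server_tick_hz=60, client_send_hz=30, snapshot_hz=30)
after = server_load_per_second(200, server_tick_hz=60, client_send_hz=10, snapshot_hz=30)

for key in before:
    print(f"{key}: {before[key]} -> {after[key]}")
```

Cutting the client send rate from 30 to 10 only shrinks the inbound packet count. The simulation steps and the outbound updates, which dominate in this model, stay exactly where they were.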

4. The Decision to Scrap Everything: 4 Realistic Grounds

Ultimately, I decided to scrap all the network code I had written over 4 months while filling the gaps in the official documentation. The sunk cost was massive, but it was a diseased limb that had to be amputated if I was going to 'complete' and 'sustain' this project.

The reasoning was as follows:

  1. Financial Unsustainability (Opex Limit): The revenue structure of an indie MMORPG is obvious. I can't charge a monthly subscription fee like Final Fantasy or WoW. I have to maintain servers solely on initial package sales revenue. If I have to pay expensive GCP infrastructure costs on top of Fusion's SaaS fees, turning a profit is structurally impossible.

  2. Clash of Business Model Philosophy: I did not want to introduce gacha (loot boxes) or predatory microtransactions just to cover server maintenance fees. It goes against my engineering philosophy, and honestly, modern gamers won't even look at an indie game that does that.

  3. Vendor Lock-in: Letting the fate of my project become dependent on a third-party cloud SaaS where I cannot control the costs is an unacceptable risk for an architect.

  4. Accumulation of Technical Debt: Unfriendly documentation and a black-boxed framework will inevitably return as fatal technical debt when trying to maintain a long-term live service.

Full Circle

I threw four months into the void. It was a time that made me realize, down to my bones, why you rarely see "Solo-Developed MMORPGs" in the market.

But I have absolutely zero intention of giving up. If the system doesn't fit me, there is only one option left: design an architecture from scratch that fits my tastes and my wallet.

Back to a blank slate. The next goal has become crystal clear.