I've had a small hobby project that I've been working on now and then over the last few weeks. I was trying to improve a part of the engine that has been largely unchanged for 15 years, and some people thought I was crazy for even caring about it. Lucky for me they put up with my obsessions around here, and the results of my work are coming soon to a build near you.
TLDR: the FPS counter in game is now more accurate and if you use the Max Framerate setting instead of Vertical Sync you may notice a smoother experience when you're running fast enough to hit the cap.
Let me start by explaining why I even care about this, and to do that I need to tell you about my craptop. My play time is split between my high-end PC and super low-end laptops that don't even have discrete graphics processors. I've been living in this dual universe for about 5 years now; it has been extremely helpful for testing graphics scaling options and has motivated plenty of optimizations.
My Original 2013 MacBook Air (Bootcamp)
One of the features that came to Warframe last year was Dynamic Resolution and it's made it possible for me to enjoy over 30 FPS in places that would have been crushed without it. However, if you remember when we launched this feature, it was a bit of a mess: despite all our testing ahead of time we found that it malfunctioned on a lot of people's PCs and made the game blurry for no good reason.
It turns out that there's a key element of the system's graphics driver that we couldn't seem to depend on: GPU time measurements were often unreliable and inaccurate, especially when running Windowed or Borderless Fullscreen. For Dynamic Resolution to work we need to be able to measure these times on the GPU and use that information to scale the detail to just high enough to match the speed of your CPU; if those numbers were wrong, we might be tricked into making your game blurry.
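To make that failure mode concrete, here's a minimal sketch of the kind of feedback loop described above, written in Python for readability. It is not Warframe's actual controller: the name `next_render_scale`, the clamp limits, and the assumption that GPU time grows roughly with the number of pixels rendered are all invented for illustration.

```python
import math

# Hypothetical limits for the render scale (1.0 = native resolution).
MIN_SCALE = 0.5
MAX_SCALE = 1.0


def next_render_scale(current_scale, gpu_ms, target_ms):
    """Pick the render scale for the next frame from a measured GPU time.

    Assumption: GPU cost is roughly proportional to pixel count, and pixel
    count is proportional to scale squared, so the scale is adjusted by the
    square root of the time ratio.
    """
    if gpu_ms <= 0.0:
        # No usable measurement: leave the scale alone.
        return current_scale
    wanted = current_scale * math.sqrt(target_ms / gpu_ms)
    return max(MIN_SCALE, min(MAX_SCALE, wanted))


if __name__ == "__main__":
    # An honest reading under the budget keeps full resolution.
    print(next_render_scale(1.0, gpu_ms=12.0, target_ms=16.7))
    # A bogus, inflated reading drags the scale down: a blurry game.
    print(next_render_scale(1.0, gpu_ms=40.0, target_ms=16.7))
```

The second call is the problem in a nutshell: the controller can only be as good as the GPU timing it is fed.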
In our struggle to find a solution to this problem we found several other game developers with the same issue including one that claimed to be working with the major graphics card vendors to solve it. I was, however, bound and determined to get Dynamic Resolution to work properly with my laptop and continued to poke at it.
Disclaimer: this worked for me, on the Intel IGPs I've tested, with Windows 7 and Windows 10, and may not work for you (but please let me know if it does!).
You might have guessed by now that my trick to get accurate GPU time measurements was to disable Vertical Sync. If you're running Windowed or Borderless Fullscreen this probably won't cause any visible tearing (or at least none that I've noticed) because the Desktop Window Manager (DWM) will composite the game window and present it to your display on a Vertical Sync anyway.
This brings me back to where we started: Max Framerate. If you want to run without Vertical Sync you should set a limit, because the faster you run the hotter your system will get. Not only can this make your PC's fans annoyingly loud or even damage systems with improper cooling, it is especially important if you're playing on a laptop, because running hot can trigger thermal throttling and that slows your system down again.
In my case, I set my Max Framerate to my desktop Refresh Rate and have enjoyed the performance boost from Dynamic Resolution for almost a year now. However, as we continued to squeeze out more and more performance I found myself inching closer to that 60 FPS cap I had set and that's when I found a whole new set of problems.
On my PC I play with a mouse and keyboard but on my laptop I play with a controller because I find it more comfortable when I have my feet up in my recliner. One of the things that's easier to appreciate when you're playing with a controller is the consistency of your framerate: if the frame-rate isn't smooth and steady it's harder to aim because your crosshairs jump unpredictably. Paradoxically I found the closer I got to 60 FPS the less consistent my aim became!
As I dug deeper I found these inconsistencies showed up even on my high-end PC; here's a graph of Frame Time taken on my i7-5960X / GTX 1080 Ti standing in the orbiter:
Frame Time is the reciprocal of Frame Rate, and a time spike is often referred to as a micro-stutter or a dropped frame. In that image those spikes are 10ms or more. If you want to hit 60 FPS that gives you only 16.7ms per frame, so a sudden jump of 10ms is a big deal; if you're trying to cap for a 144 Hz monitor your budget is just under 7ms, which is even more sensitive to a jump of that size.
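If you want to check the arithmetic yourself, here's a tiny sketch (the function names are just for illustration) that turns a frame-rate cap into a per-frame budget and shows what pace a single frame drops to when a spike lands on top of it:

```python
def frame_budget_ms(fps):
    """Frame time is the reciprocal of frame rate, expressed in milliseconds."""
    return 1000.0 / fps


def effective_fps_after_spike(fps, spike_ms):
    """Frame rate that a single frame corresponds to when a spike is added."""
    return 1000.0 / (frame_budget_ms(fps) + spike_ms)


if __name__ == "__main__":
    for cap in (60, 144):
        print(
            f"{cap} FPS cap: budget {frame_budget_ms(cap):.2f} ms, "
            f"a 10 ms spike paces that frame at "
            f"{effective_fps_after_spike(cap, 10.0):.1f} FPS"
        )
```

At a 60 FPS cap a 10ms spike stretches that frame to about 26.7ms, the pace of 37.5 FPS. At a 144 FPS cap the same spike more than doubles the frame time.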
The first thing I did was improve the accuracy of some of the arithmetic used to enforce the limit; this helped but there was still a lot of variance (note that because the really big spikes are gone the graph automatically zoomed in closer).
As I dug deeper I found a fatal flaw in the original code: it naively expected a simple Windows API to be much more accurate than it was in practice. When we needed to wait for our cap we would ask the operating system to put us to sleep and wake us up after a few milliseconds. However, because of the way Windows schedules threads, instead of waking us up promptly it would often wait for our next scheduling quantum. Since a scheduling quantum might be 10-15ms, that would often overshoot our requested delay by a wide margin!
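You can see this kind of overshoot for yourself with a few lines of code. The sketch below uses Python's `time.sleep` as a stand-in; it is not the Windows call the engine used (the text above doesn't name it), and the numbers you get will depend heavily on your operating system, Python version and system timer settings.

```python
import time


def measure_sleep_overshoot(requested_s, samples=20):
    """Return how much longer each sleep took than requested, in seconds."""
    overshoots = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_s)
        elapsed = time.perf_counter() - start
        overshoots.append(elapsed - requested_s)
    return overshoots


if __name__ == "__main__":
    results = measure_sleep_overshoot(0.002)
    worst_ms = max(results) * 1000.0
    mean_ms = sum(results) / len(results) * 1000.0
    print(f"asked for 2.000 ms, mean overshoot {mean_ms:.3f} ms, worst {worst_ms:.3f} ms")
```

If the worst overshoot is a large fraction of your frame budget, a limiter built on a plain sleep will stutter in exactly the way described above.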
It took a few experiments but I found a more sophisticated way to work with Windows to get woken up at the right time; it looks a lot better but it's still not quite perfect:
You can't tell from the graph because there's no scale marked, but in this case the oscillation is usually about 1ms! That's good, but not great: remember that if you want to sync to a 144 Hz monitor your target frame time is about 7ms, so a variance of one or two milliseconds is still a substantial amount.
After a few more failed experiments I made some more progress: the variance in this final version is only around 20 to 30 microseconds!
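I haven't described the exact mechanism here, so don't read the following as the engine's implementation. It's a sketch of one common way to build a tight frame limiter: sleep for most of the wait, then spin on a high-resolution clock for the last stretch, and schedule each deadline from the previous deadline (in integer nanoseconds) so rounding errors don't accumulate. The `FrameLimiter` class and its 2ms `spin_margin_s` default are hypothetical and would need tuning against the sleep overshoot measured on a given system.

```python
import time


class FrameLimiter:
    """Hybrid sleep-then-spin frame cap (illustrative, not Warframe's code)."""

    def __init__(self, max_fps, spin_margin_s=0.002):
        # Integer nanoseconds avoid accumulating floating-point error.
        self.period_ns = round(1_000_000_000 / max_fps)
        self.spin_margin_ns = int(spin_margin_s * 1_000_000_000)
        self.deadline_ns = time.perf_counter_ns() + self.period_ns

    def wait(self):
        """Block until the current frame's deadline; return the wake time (ns)."""
        # Coarse phase: let the OS sleep us, but stop short of the deadline
        # because the wake-up may arrive late.
        while True:
            remaining = self.deadline_ns - time.perf_counter_ns()
            if remaining <= self.spin_margin_ns:
                break
            time.sleep((remaining - self.spin_margin_ns) / 1_000_000_000)
        # Fine phase: busy-wait on the high-resolution clock.
        while time.perf_counter_ns() < self.deadline_ns:
            pass
        now = time.perf_counter_ns()
        # Step from the old deadline, not from "now", to avoid drift.
        self.deadline_ns += self.period_ns
        if self.deadline_ns < now:
            # More than a whole frame behind: start over instead of
            # rushing through frames to catch up.
            self.deadline_ns = now + self.period_ns
        return now


if __name__ == "__main__":
    limiter = FrameLimiter(max_fps=144)
    stamps = [limiter.wait() for _ in range(30)]
    deltas_ms = [(b - a) / 1e6 for a, b in zip(stamps, stamps[1:])]
    print(f"frame time min {min(deltas_ms):.3f} ms, max {max(deltas_ms):.3f} ms")
```

The trade-off is that the spin phase burns CPU time on one core, so the margin should be as small as the measured sleep overshoot allows.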
After all this I was pretty excited but there was one last thing that bothered me: the graph was telling me that I was within spitting distance of a perfectly smooth 144 FPS but the in-game FPS display was showing me a few frames less!
Again, I went digging and found some code written many years ago that was just a little bit off. We smooth the FPS display value slightly (we take the harmonic mean of the last 64 frames to make it a bit easier to read) but this code was not as precise as it should be -- it was skewing the number slightly lower! One last easy fix and my pride was satisfied:
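For the curious, the harmonic mean of a set of FPS readings works out to the number of frames divided by the total time they took, which is why it's the right average for a frame-rate display. Here's a minimal sketch of a 64-frame counter built that way; the `FpsCounter` class is made up for illustration and says nothing about what the old code's imprecision actually was.

```python
import math
from collections import deque


class FpsCounter:
    """Smoothed FPS over the last `window` frames (illustrative sketch)."""

    def __init__(self, window=64):
        # Oldest frame times fall off automatically once the window is full.
        self.frame_times_s = deque(maxlen=window)

    def add_frame(self, frame_time_s):
        self.frame_times_s.append(frame_time_s)

    def fps(self):
        """Harmonic mean of per-frame FPS: frames divided by total time."""
        if not self.frame_times_s:
            return 0.0
        return len(self.frame_times_s) / math.fsum(self.frame_times_s)


if __name__ == "__main__":
    counter = FpsCounter()
    # Half the frames take 10 ms (100 FPS), half take 30 ms (33.3 FPS).
    for i in range(64):
        counter.add_frame(0.010 if i % 2 == 0 else 0.030)
    print(f"harmonic mean: {counter.fps():.1f} FPS")
    print(f"arithmetic mean of the per-frame FPS: {(100.0 + 1000.0 / 30.0) / 2:.1f} FPS")
```

In that example 64 frames take 1.28 seconds in total, so the honest answer is 50 FPS; a plain arithmetic average of the per-frame rates would flatter you with about 66.7.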
So I now get a much more consistent frame-rate when I'm using Max Framerate instead of Vertical Sync, and the FPS counter in game is more accurate no matter what you use!
Bonus Notes:
- 32-bit Warframe does not benefit because the timers are much less accurate
- The consoles were fine (they use Auto Vertical Sync and Dynamic Resolution)
- Dynamic Resolution is off by default on PC (we will revisit after Windows XP is retired and we are able to optimize for more modern operating systems)
- Also coming in this build: NTSC 59.94 Hz frame cap option
I'm super excited to see this change go out today. If you're like me and are using Max Framerate I'd love to hear if you noticed an improvement with this!