NVIDIA DLSS 3’s Frame-Generating Magic Explored In Early Performance Testing
NVIDIA has made no bones about its embrace of AI to continue the march of 3D graphics innovation. The company has developed and refined its Deep Learning Super Sampling (DLSS) technology significantly since it was introduced way back in 2018. The latest incarnation, introduced with the GeForce 40-Series cards, has been put to an early test by the image quality connoisseurs over at Digital Foundry.
NVIDIA DLSS 3 Tech Recap And Potential Pitfalls
DLSS 3 is a major revision to the technology and introduces a new feature that can generate extra frames in-between rendered frames, seemingly out of thin air. It leverages a significantly faster optical flow accelerator to accomplish this, so don't expect to see this technology extend to prior 30-Series and 20-Series GeForce cards.
In a nutshell, the GPU analyzes sequential rendered frames to identify corresponding pixels, and then calculates an optical flow map to describe how they change from one frame to the next. This map is then combined with the game engine’s geometric motion vectors to create an in-between frame that smooths apparent motion. The approach does have some limitations, however.
For starters, there is latency added because the generated frame appears between the two source frames used. This necessarily means the second rendered frame is effectively delayed slightly to maintain consistent frame pacing, though it should not be noticeable in general. In theory, this is taking a rendered frame that might be displayed for 6ms on-screen, but not showing it until the last 3ms of the window while the “free” generated frame is shown for the first 3ms. This may not be ideal for latency-sensitive games like competitive shooters, but could be a welcome trade-off for immersive visuals-first titles.
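The frame-pacing arithmetic above can be sketched in a few lines of Python. This is a simplified model, not NVIDIA's actual scheduler: it assumes the display window is split evenly between the generated and the rendered frame, and the function name `split_display_window` is made up for illustration.

```python
def split_display_window(window_ms: float) -> dict:
    """Split one display window between a generated and a rendered frame.

    Assumption (for illustration only): the generated frame takes the first
    half of the window and the rendered frame takes the second half.
    """
    half = window_ms / 2
    return {
        "generated_shown_ms": (0.0, half),        # "free" frame goes first
        "rendered_shown_ms": (half, window_ms),   # real frame is held back
        "rendered_delay_ms": half,                # latency added to the real frame
    }


# The 6ms window from the example in the text
pacing = split_display_window(6.0)
print(pacing)
```

Under this model the added delay is always half the original frame window, so it shrinks as the underlying framerate rises.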
In addition, objects that suddenly appear in-frame either from the edges or by appearing from behind something else, a process called de-occlusion, do not have any reference data to pull from. This could also result in visual artifacts which may detract from the viewing experience. If you want more information about how NVIDIA says it addresses these problems, be sure to check out our Ada architecture deep dive.
DLSS Performance Testing
What we (and you) are interested in, of course, is how the technology fares in practice while gaming. Digital Foundry’s breakdown examines its performance in Spider-Man Remastered, Cyberpunk 2077, and Portal RTX, using special builds with these features enabled. Many of the comparisons are shown at reduced speed, because YouTube’s 60 fps limitation would otherwise hide the AI-inserted frames in the 120 fps source footage. Also keep in mind that our screenshots are after YouTube’s compression and are not necessarily an indicator of true image quality.
The initial DLSS 3 frame generation demo using Spider-Man shows just how much smoother gameplay can appear. Granted, this is a scene with very little de-occlusion as the camera is moving forwards and on-screen subjects are both small and fairly slow moving. This also gives us no indication of input latency, though perhaps we here at HH will explore that metric later on.
Cyberpunk 2077 confirms a massive performance uplift delivered by DLSS 3. Digital Foundry compares scenes running on the GeForce RTX 4090 with native 4K rendering, 4K DLSS 2 Performance mode, and 4K DLSS 3 with frame generation. Due to current embargo restrictions, Digital Foundry is not able to provide exact framerates, but relatively speaking, DLSS 2 Performance mode alone provides around a 2.5x uplift over native rendering. Frame generation in DLSS 3 adds roughly another 1.6x on top of that, for about a 4x total improvement in framerate.
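To separate the contribution of frame generation from that of upscaling, the relative multipliers can be divided out. The short Python sketch below uses only the approximate Cyberpunk 2077 figures quoted above (2.5x and 4x), and the helper name `frame_generation_factor` is a hypothetical one chosen for illustration.

```python
def frame_generation_factor(total_uplift: float, upscaling_uplift: float) -> float:
    """Return the extra multiplier contributed by frame generation alone.

    Both arguments are relative to native rendering (native = 1.0).
    """
    if upscaling_uplift <= 0:
        raise ValueError("upscaling_uplift must be positive")
    return total_uplift / upscaling_uplift


# Approximate relative figures from the text, not measured framerates
cyberpunk_factor = frame_generation_factor(total_uplift=4.0, upscaling_uplift=2.5)
print(f"Frame generation adds about {cyberpunk_factor:.1f}x on top of DLSS 2 Performance")
```

Because the inputs are rounded relative figures, the result is only a rough guide to how much of the gain comes from generated frames.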
This early testing revealed less of a performance gain in Spider-Man, which tends to be more CPU-limited. Using a very repeatable quick-time-event segment, the tests show an uplift of only a few percentage points with DLSS 2 Performance mode alone. This restricts the potential of DLSS 3 frame generation to around a 2x total gain compared to traditional rendering. Any increase in framerate is welcome, of course, but this shows some titles will benefit much more than others.
On the opposite end of the performance spectrum, Portal RTX testing shows total DLSS 3 frame generation improvements of over 5x. The game’s simpler geometry allows DLSS 2 Performance to operate more efficiently, delivering 3x the performance on its own before frame generation even comes into play. Digital Foundry did note that it struggled to find a scene that performed within its 120-fps performance window cap, so while true framerates are not yet disclosed, we can expect them to be very high in this title, even at 4K.
DLSS 3 Frame Generation Latency Impact
Jumping back to the idea of frame latency, Digital Foundry notes that DLSS 3 forces NVIDIA’s Reflex technology to be enabled when frame generation is turned on. As with performance, the latency “cost” varies per game. In Portal RTX, performance is pushed so high by DLSS that the added latency of frame generation is negligible next to the latency already incurred by rendering and path-traced lighting: just an extra 3ms over DLSS 2 Performance. This amounts to about 1 frame if the game were running at over 300 fps.
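The conversion between a latency figure and a number of frames is simple arithmetic: one frame lasts 1000 / fps milliseconds. The Python sketch below illustrates this for the 3ms figure above; the helper name `latency_in_frames` and the sample framerates are assumptions for illustration, since true framerates have not been disclosed.

```python
def latency_in_frames(latency_ms: float, fps: float) -> float:
    """Express a latency in milliseconds as a count of frames at a given framerate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    frame_time_ms = 1000.0 / fps  # duration of a single frame
    return latency_ms / frame_time_ms


# Hypothetical framerates, chosen only to show how the ratio scales
for fps in (60, 120, 1000 / 3):
    frames = latency_in_frames(3.0, fps)
    print(f"3ms at {fps:.0f} fps is about {frames:.2f} frames")
```

At roughly 333 fps a frame lasts 3ms, which is where the "about 1 frame" comparison comes from. At lower framerates the same 3ms is a smaller fraction of a frame.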
Cyberpunk 2077 similarly sees lower latency for DLSS 3 with frame generation than for native rendering, regardless of the latter’s Reflex setting. Most of this is due to the general performance uplift provided by DLSS upscaling in the first place. Compared with DLSS 2 Performance mode, however, frame generation incurs more of a latency penalty here. We cannot say how much real-world impact this has, though, without knowing the game’s true framerates.
Rounding out the Reflex comparisons, Spider-Man exhibits both the best and worst showing for latency. On the plus side, DLSS 3 frame generation is the fastest of the three titles at just 38ms on average. However, the game just generally runs fast, and this result is hardly any different from native rendering. DLSS 2 Performance mode again offers lower latency, but all these results are within a perfectly acceptable realm to feel fluid.
DLSS 3 Image Quality Comparison
DLSS 3 frame generation is imperfect, but fares far better than the frame interpolation solutions from both Adobe and Topaz. This is particularly impressive as DLSS 3 is performed in real time while the other two act on recorded footage at their own pace. Topaz artifacts tend to manifest as blurs while Adobe injects odd ghosting around areas of motion.
DLSS 3, by contrast, does not struggle with motion. Generated frames remain sharp and ghosting is minimal. The exception is around de-occlusion. In this example, we can see both a blurry ghosting of the post and some unsightly distortions around Spider-Man’s feet. It all happens so fast, though, that it is imperceptible even when watching the half-speed footage. We can’t say the same of the Adobe and Topaz generated and processed footage.
We will be sure to conduct our own analysis of the DLSS 3 tech now that we have cards in-hand for testing. Nevertheless, this is a promising start for NVIDIA's fledgling neural network-driven upscaling technology. NVIDIA’s Reflex technology is really an unsung hero in the frame generation pipeline and completes the “magic” of helping DLSS 3 generate a frame out of thin air while minimizing latency in the experience.