Thread: Unreal Engine 5 Demo - Dissection

VFX_Veteran

Industry Professional (Verified)
 
All:

As I promised, I will continue to make my analysis threads here for all the enthusiasts and those who want to learn a bit more about 3D graphics. Hopefully @brainchild will help me with some of this thread.

First up is taking a look at the editor mode and getting some stats on how the rendering is doing and how Nanite is performing. I don't know Lumen or Nanite in depth, but I will learn as this thread unfolds. Hopefully I can explain things as simply as possible while still being a little technical.

Here are my computer's specs:
AMD Ryzen 9 3950X
RTX 3090 w/ 24 GB VRAM
64 GB DDR4 RAM
Samsung 970 EVO 2 TB SSD

Our first video:



A few things to note here; take a look at the stats on the right:

1) Total RAM usage to run this demo is around 9-10 GB. Very light.

2) Total VRAM usage to run this demo on the graphics card is between 14 and 15 GB.

3) I wanted to get some metrics on Nanite streaming and, sure enough, they had it in their profiler. Check out the number of stream requests from Nanite: between 9k and 25k streaming requests. Nanite works by streaming in higher-res geometry depending on camera distance. In this demo, we can see that pointing the camera at various instances of the geometry in the scene generates more or fewer requests. Looking straight up at the sky yields nearly 40 requests (which is expected). There's a rough sketch of this idea right after the list below.

4) Overall performance: 60 FPS.
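
To give a feel for where those request numbers come from, here's a minimal C++ sketch of distance-driven streaming. This is my own illustration (the structs and thresholds are made up), not Epic's code: a cluster only generates a request when the detail currently resident in memory would look too coarse at its distance from the camera, so pointing at dense geometry generates many requests and pointing at the sky generates almost none.

Code (C++):
// Hypothetical sketch of distance-driven streaming requests -- not Epic's actual code.
#include <cmath>
#include <cstdio>
#include <vector>

struct Cluster {
    float distance;       // distance from the camera (world units)
    float residentError;  // geometric error of the LOD currently in memory (world units)
    bool  inView;         // survived frustum culling this frame
};

// Project a world-space error onto the screen in pixels (simple pinhole camera model).
static float ProjectedErrorPixels(float worldError, float distance, float screenHeight, float fovY) {
    return worldError * screenHeight / (2.0f * distance * std::tan(fovY * 0.5f));
}

static int CountStreamRequests(const std::vector<Cluster>& clusters,
                               float screenHeight, float fovY, float maxErrorPixels) {
    int requests = 0;
    for (const Cluster& c : clusters) {
        if (!c.inView) continue;  // looking at the sky: almost nothing is in view
        // If the resident detail's error spans more than ~1 pixel, ask for a finer level.
        if (ProjectedErrorPixels(c.residentError, c.distance, screenHeight, fovY) > maxErrorPixels)
            ++requests;
    }
    return requests;
}

int main() {
    std::vector<Cluster> clusters = {
        {5.0f,   0.02f, true},  // close statue: resident data looks blocky -> request finer data
        {200.0f, 0.02f, true},  // distant statue: same data is already sub-pixel -> no request
    };
    std::printf("requests this frame: %d\n",
                CountStreamRequests(clusters, 2160.0f, 1.0f, 1.0f));
}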

This video is just to get a feel for how the demo is performing. My intention is to critique it: I want to know what makes it tick, understand what it's doing, and figure out how to push the graphics card to its limits. 😂

NOTE:
* The desktop resolution is native 4k
* At first glance, I thought the fidelity of the PS5 demo was sharper, but after looking at the video closely, I can verify that the detail is the same.
* I have not shown a video of actually "playing" the demo yet. We will get to that next.

Questions, comments?
 
I have a question regarding the VRAM usage. For me, the VRAM usage often isn't representative of the real needs of the game, like a game showing, for example, 10 GB usage, but then an 8 GB card runs those settings without any issues.

Do we have any info on how a 3080 with 10 GB deals with this? 14-15 GB usage, if real, should cause issues, theoretically.

I can't verify it because I won't be able to use my PC till tomorrow night.
 
Nice write-up @VFX_Veteran

Yes, it's clear that UE5 on PC is going to handle data management a bit differently than it's done on PS5 and XSX. On PC, it can rely more on file caches and pre-fetched data to mitigate the need for constant high IO bandwidth, whereas on consoles the environment is more restricted and more data will have to be streamed directly from the SSD in order to maintain fidelity. If you really want to push your system, you'll have to restrict how much RAM is used. With little RAM to cache into, the engine will be forced to constantly fetch more data from the disk, which will reveal the limits of your IO bandwidth.
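
To picture that tradeoff, here's a toy C++ sketch (the class and numbers are made up for illustration; this isn't how UE5's streaming is actually implemented): assets are served from a RAM cache when possible and only hit the disk on a miss, so shrinking the cache budget directly turns into more IO traffic.

Code (C++):
// Toy illustration of the RAM-cache-vs-disk tradeoff (all names and sizes are hypothetical).
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct Asset { std::vector<std::uint8_t> bytes; };

class StreamingCache {
public:
    explicit StreamingCache(std::size_t budgetBytes) : budget_(budgetBytes) {}

    // Returns the asset, preferring RAM; falls back to the (much slower) disk path on a miss.
    const Asset& Fetch(const std::string& path) {
        auto it = cache_.find(path);
        if (it != cache_.end()) return it->second;  // fast path: already cached in RAM
        Asset loaded = ReadFromDisk(path);          // slow path: hits the SSD
        used_ += loaded.bytes.size();
        EvictUntilWithinBudget();                   // small budget => constant disk traffic
        return cache_.emplace(path, std::move(loaded)).first->second;
    }

private:
    static Asset ReadFromDisk(const std::string& /*path*/) {
        return Asset{std::vector<std::uint8_t>(1 << 20)};  // pretend every asset is 1 MiB
    }
    void EvictUntilWithinBudget() {
        while (used_ > budget_ && !cache_.empty()) {
            auto victim = cache_.begin();           // naive eviction; a real engine would use LRU
            used_ -= victim->second.bytes.size();
            cache_.erase(victim);
        }
    }
    std::size_t budget_;
    std::size_t used_ = 0;
    std::unordered_map<std::string, Asset> cache_;
};

int main() {
    StreamingCache bigCache(512u << 20);   // plenty of RAM: assets mostly served from cache
    StreamingCache smallCache(4u << 20);   // tiny budget: the same accesses thrash the disk
    for (int frame = 0; frame < 100; ++frame) {
        bigCache.Fetch("statue_hall/cluster_" + std::to_string(frame % 8));
        smallCache.Fetch("statue_hall/cluster_" + std::to_string(frame % 8));
    }
}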

@VFX_Veteran how are the draw distance and LoD? This is a huge issue for me in many games. Open-world games are full of pop-in and bad LoD transitions. This is an area that really has to improve dramatically this gen, I think.

So in this huge open area, how does this demo handle it?

As long as you have the RAM or IO bandwidth to handle all the data requests, there should be no noticeable pop-in. It's really going to depend on the setup.

Streaming buffer size will also affect the streamed asset quality, so it'll need to meet a minimum threshold in order to prevent poor LOD transitions.
 
As long as you have the RAM or IO bandwidth to handle all the data requests, there should be no noticeable pop-in. It's really going to depend on the setup.

Streaming buffer size will also affect the streamed asset quality, so it'll need to meet a minimum threshold in order to prevent poor LOD transitions.

Thanks for the answer. How is it in the current demo with a hardware setup like VFX_Veteran's? Is it enough to avoid ugly pop-in? Are the LoD transitions noticeable?
 
Awesome write-up as usual! Running this on my PC gave me goosebumps, for real! The fidelity and smoothness were a godsend, as I thought performance would be finicky. I even ran it on an HDD for shits and giggles. Imma export it this afternoon and test that out.
 
I have a question regarding the VRAM usage. For me, the VRAM usage often isn't representative of the real needs of the game, like a game showing, for example, 10 GB usage, but then an 8 GB card runs those settings without any issues.

Do we have any info on how a 3080 with 10 GB deals with this? 14-15 GB usage, if real, should cause issues, theoretically.

I can't verify it because I won't be able to use my PC till tomorrow night.
Well, to be definitive we have to compile the demo out to an executable and then run it.

You are right that the VRAM usage doesn't necessarily mean they are using that exact amount, because of Windows, etc. However, it can give us a reference metric for the action on the screen. I can rotate the camera around to denser meshes and you'd see the VRAM move, so it's good for something. Also, when VRAM is starved, the driver will fall back to creating virtual VRAM in main RAM, so the reading can go well over your board's spec and still run without crashing.
 
Thanks for the answer. How is it in the current demo with a hardware setup like VFX_Veteran's? Is it enough to avoid ugly pop-in? Are the LoD transitions noticeable?
The LOD transitions are noticeable if you look really hard. I have to keep zooming in by one click and then back out by another. It's pretty damn fast, and you really won't notice it while playing. I've got a vid coming up today for us to analyze. Stay tuned!
 
The LOD transitions are noticeable if you look really hard. I have to keep zooming in by one click and then back out by another. It's pretty damn fast, and you really won't notice it while playing. I've got a vid coming up today for us to analyze. Stay tuned!

This is true. I should have mentioned I meant in the context of normally going through the demo. If you really go out of your way to see the transitions you'll be able to.
 
Nice! Runs surprisingly (to me) well.

Side note: has there been any talk about how UE5 stacks up to the likes of Frostbite, REDengine, CryEngine, etc.? Like, is it looking to be a "I can do anything you can do better" situation here or is it too early?
 
Nice! Runs surprisingly (to me) well.

Side note: has there been any talk about how UE5 stacks up to the likes of Frostbite, REDengine, CryEngine, etc.? Like, is it looking to be a "I can do anything you can do better" situation here or is it too early?
Very good question. For some reason we aren't seeing highlights of these other graphics engines like we used to back in the day. BF6 seems to be more of the same and so does Far Cry. I believe Epic is probably way ahead of other studios (with the exception of 4A and Asobo).
 
Very good question. For some reason we aren't seeing highlights of these other graphics engines like we used to back in the day. BF6 seems to be more of the same and so does Far Cry. I believe Epic is probably way ahead of other studios (with the exception of 4A and Asobo).
Ignore if already answered.

How do you rate the Decima engine that PS4 first parties now use, regarding features and visual quality compared to other engines?
 
Great posts, guys. It's a little above my head, so sorry if this question is "wrong". If VRAM runs out but normal RAM is a substitute, does the amount of VRAM even matter at all? I have a 3080 10 GB with 32 GB 3200 RAM. Resident Evil 8 reported something like 13 GB used against my 10 GB total. Never noticed any problems running everything maxed in 4K.

Also, I've heard VRAM usage is different from what is allocated, i.e., it will allocate whatever VRAM it has available, but what it actually uses is a lot smaller. Is this true?

Sorry for the very novice post.
 
If VRAM runs out but normal RAM is a substitute, does the amount of VRAM even matter at all?
Yes. Your system RAM will be slower and might become a bottleneck if a game or application that's looking for VRAM has to pull from system RAM instead.
Also, I've heard VRAM usage is different from what is allocated, i.e., it will allocate whatever VRAM it has available, but what it actually uses is a lot smaller. Is this true?

There are cases where this can be true, but it depends on how the game is optimized. Not every game will allocate all available VRAM, but some will, and if they do, it doesn't necessarily mean that the total allocated amount is being used.
 
UPDATE: One of the biggest drawbacks that I see with Lumen is that it uses baked-out signed distance field geometry and a scene-based representation of a static proxy world. Here is a link from the UE5 docs on how the two scenes differ. You don't need to know the details about SDFs; just realize that it is a crude representation of the world.

[Image: DistanceFields_Global.webp (the global distance field visualization from the UE5 docs)]


The problem with this approach is that it doesn't include moving objects in Lumen's rendering pass. Therefore you get the same limitation as other GI light-probe algorithms that don't test for occlusion against the static ground. The character is completely left out of the original scene's lighting, and it looks like they are pasted onto the static scene.
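
To make the limitation concrete, here's a tiny sphere-tracing sketch of my own (not Lumen's actual code; Lumen is far more involved than this): the visibility trace only "sees" whatever is baked into the distance field, so a character that isn't part of the field can't block any light.

Code (C++):
// Minimal sphere-tracing sketch showing why geometry that isn't baked into the
// global distance field can't occlude indirect light. Purely illustrative.
#include <algorithm>
#include <cstdio>

struct Vec3 { float x, y, z; };
static Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 mul(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }

// The "global SDF": only the static ground plane at y = 0 is baked in.
// A moving character standing on that ground contributes nothing to this field.
static float GlobalSDF(Vec3 p) {
    return p.y;  // signed distance to the ground plane
}

// Returns true if the ray from `origin` along `dir` hits something in the SDF.
static bool Occluded(Vec3 origin, Vec3 dir, float maxDist) {
    float t = 0.01f;  // small bias to step off the surface
    for (int i = 0; i < 64 && t < maxDist; ++i) {
        float d = GlobalSDF(add(origin, mul(dir, t)));
        if (d < 0.001f) return true;  // hit static geometry
        t += std::max(d, 0.01f);
    }
    return false;  // nothing in the field blocked the ray
}

int main() {
    // Point on the ground directly under the character's feet, ray toward the sky.
    Vec3 underFeet = {0.0f, 0.0f, 0.0f};
    Vec3 up = {0.0f, 1.0f, 0.0f};
    // The character's body is above this point, but since it isn't in the SDF,
    // the trace reports the point as fully lit -> no contact occlusion or shadow.
    std::printf("occluded: %s\n", Occluded(underFeet, up, 10.0f) ? "yes" : "no");
}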

Here is a video I took yesterday showing this:

You can see the character's feet are touching the ground in shadow but there is no occlusion (or shadow) projected onto the ground:



Discuss.
 
UPDATE: One of the biggest drawbacks that I see with Lumen is that it uses baked-out signed distance field geometry and a scene-based representation of a static proxy world. Here is a link from the UE5 docs on how the two scenes differ. You don't need to know the details about SDFs; just realize that it is a crude representation of the world.

[Image: DistanceFields_Global.webp (the global distance field visualization from the UE5 docs)]


The problem with this approach is that it doesn't include moving objects in Lumen's rendering pass. Therefore you get the same limitation as other GI light-probe algorithms that don't test for occlusion against the static ground. The character is completely left out of the original scene's lighting, and it looks like they are pasted onto the static scene.

Here is a video I took yesterday showing this:

You can see the character's feet are touching the ground in shadow but there is no occlusion (or shadow) projected onto the ground:



Discuss.


Yeah, aside from the 'infinite bounce' aspect of Lumen (which is something 4A has accomplished with their engine as well), I'm seeing the same problems I typically see with other GI solutions: no or poor self-occlusion under indirect light. Same deal with the lack of noticeable dynamic ambient occlusion. Hopefully this will improve with time, given that it's an early access build.
 
The problem with this approach is that it doesn't include moving objects in Lumen's rendering pass. Therefore you get the same limitation as other GI light-probe algorithms that don't test for occlusion against the static ground. The character is completely left out of the original scene's lighting, and it looks like they are pasted onto the static scene.
Yep. Since the first showcase, the character has looked disconnected from what's going on around her, and this is coming from someone not in the industry.

I think it actually reduces the impact of the visuals a little bit. You have near photorealism in some of these scenes, but a character that doesn't feel particularly grounded in that world, and comes off a bit cartoony.

Of course, this is nit-picking on my part, as I'm blown away by UE5 in general.
 
I feel like now's the time to ask, when we have two people deeply versed in this stuff.

How can hardware capable of displaying only a certain number of polygons go far, far above that, like we see in Mira's post? It almost seems a bit like magic: software irrespective of hardware capabilities. How are we displaying Hollywood-quality assets on home-console hardware and budgets?
 
I feel like now's the time to ask, when we have two people deeply versed in this stuff.

How can hardware capable of displaying only a certain number of polygons go far, far above that, like we see in Mira's post? It almost seems a bit like magic: software irrespective of hardware capabilities. How are we displaying Hollywood-quality assets on home-console hardware and budgets?
Easy. It's called instancing.

Take all the geometry data from one object and use a matrix to transform that object to other locations, then draw the same object again. It takes almost zero extra memory to do this. If this test were done with *unique* geometry, it wouldn't fare so well (a la FS2020); it would probably choke in the thousands very quickly. You'd also have to store different materials in memory, which would crash the system. It's literally a smoke screen.
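
Here's a back-of-the-envelope C++ sketch (the numbers are made up, not measured from the demo) of why instancing is so cheap: the vertex data lives in memory once, and each extra copy only costs a 4x4 transform matrix.

Code (C++):
// Back-of-the-envelope illustration of instancing with hypothetical numbers.
#include <array>
#include <cstddef>
#include <cstdio>
#include <vector>

struct Vertex { float position[3], normal[3], uv[2]; };  // 32 bytes per vertex
using Matrix4 = std::array<float, 16>;                    // 64 bytes per instance transform

int main() {
    const std::size_t vertsPerStatue = 1'000'000;          // one high-detail source mesh
    const std::size_t instances = 500;                      // how many copies appear on screen

    std::vector<Vertex>  mesh(vertsPerStatue);              // stored once, ~30 MB
    std::vector<Matrix4> transforms(instances);              // one matrix per copy, ~31 KB

    // A renderer submits the same vertex data with a different transform per instance,
    // e.g. one instanced draw call instead of 500 separately stored meshes.
    std::printf("mesh data:       %zu MB\n", mesh.size() * sizeof(Vertex) / (1024 * 1024));
    std::printf("instance data:   %zu KB\n", transforms.size() * sizeof(Matrix4) / 1024);
    std::printf("naive duplicate copies would need: %zu MB\n",
                instances * mesh.size() * sizeof(Vertex) / (1024 * 1024));
}

Obviously a real engine carries more per-instance data than a single matrix, but the ratio is the point: copies are nearly free, unique meshes are not.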
 
I feel like now's the time to ask, when we have two people deeply versed in this stuff.

How can hardware capable of displaying only a certain number of polygons go far, far above that, like we see in Mira's post? It almost seems a bit like magic: software irrespective of hardware capabilities. How are we displaying Hollywood-quality assets on home-console hardware and budgets?

If your question is about how it can render so many copies of a single object, @VFX_Veteran's explanation of instancing pretty much covers it.

If your question is about how it can render so much detail in general (that would appear to rival CGI assets), the simplest way to put it is that it's merely an optical illusion.

The rendering engineers know that once triangles shrink down to the pixel level, your eyes cannot resolve more detail than that per pixel, so they make sure that the engine only renders just enough to process no more than one triangle per pixel (at least, this is the goal). Traditional rasterization would simply render what's in view of the camera, including any subpixel detail. That method isn't too problematic for game quality meshes, but for CG models it would be too much for the GPU to handle. Nanite gets around this problem by not rendering any geometry at the subpixel level, essentially only rendering what your eyes will be able to perceive, saving a ton of geometry.

As for the models themselves, yes, the source can be CG quality, but what's actually being rendered is only a fraction of that; you just won't be able to tell the difference, and that's why Nanite works so well (and also scales well with resolution). Also, the assets are compressed to save on disk space.
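
If it helps, here's a simplified sketch of the "roughly one triangle per pixel" idea (my own illustration, not Nanite's actual algorithm): the triangle budget for an object is capped at about the number of pixels it covers, so a 33-million-triangle statue only "costs" a few hundred triangles when it's far away.

Code (C++):
// Simplified sketch of the "about one triangle per pixel" idea -- not Nanite's real algorithm.
#include <algorithm>
#include <cmath>
#include <cstdio>

// Approximate how many screen pixels an object covers from its bounding-sphere radius,
// its distance to the camera, and the vertical resolution (simple pinhole projection).
static double CoveredPixels(double radius, double distance, double screenHeight, double fovY) {
    double projectedRadius = radius * screenHeight / (2.0 * distance * std::tan(fovY * 0.5));
    return 3.14159265 * projectedRadius * projectedRadius;
}

// Pick a triangle budget: never render meaningfully more triangles than covered pixels,
// and never more than the source mesh actually has.
static double TriangleBudget(double sourceTriangles, double coveredPixels) {
    return std::min(sourceTriangles, coveredPixels);
}

int main() {
    const double sourceTriangles = 33'000'000.0;  // a film-quality statue, for illustration
    for (double distance : {2.0, 20.0, 200.0}) {
        double px = CoveredPixels(1.0 /*radius*/, distance, 2160.0 /*4K*/, 1.0 /*fov rad*/);
        std::printf("distance %6.0f: covers %10.0f px -> render ~%10.0f triangles\n",
                    distance, px, TriangleBudget(sourceTriangles, px));
    }
}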
 
If your question is about how it can render so many copies of a single object, @VFX_Veteran's explanation of instancing pretty much covers it.

If your question is about how it can render so much detail in general (that would appear to rival CGI assets), the simplest way to put it is that it's merely an optical illusion.

The rendering engineers know that once triangles shrink down to the pixel level, your eyes cannot resolve more detail than that per pixel, so they make sure that the engine only renders just enough to process no more than one triangle per pixel (at least, this is the goal). Traditional rasterization would simply render what's in view of the camera, including any subpixel detail. That method isn't too problematic for game quality meshes, but for CG models it would be too much for the GPU to handle. Nanite gets around this problem by not rendering any geometry at the subpixel level, essentially only rendering what your eyes will be able to perceive, saving a ton of geometry.

As for the models themselves, yes, the source can be CG quality, but what's actually being rendered is only a fraction of that; you just won't be able to tell the difference, and that's why Nanite works so well (and also scales well with resolution). Also, the assets are compressed to save on disk space.

When I read the NVIDIA GeForce guide for MGS V: The Phantom Pain, I remember them talking about how the Fox Engine uses "sample points" to decide whether an object should be rendered or not, leading to much larger differences in detail than you would normally see when increasing the resolution in most other game engines, with entire objects being culled if more than a certain number of pixels of the object fall outside the sample-point grid. Leading to differences like this:


I'm guessing this is part of the reason why the Fox Engine was so performant. Is what Nanite is doing like a much more complicated version of that or totally unrelated?

edit - This is the quote from the GeForce article:

"Ground Zeroes and The Phantom Pain benefit greatly from higher rendering resolutions, more so than any other games in recent memory, as the rendering of many game elements is directly tied to the number of sample points: if an element isn't within a sufficient number of sample points it simply isn't displayed. By raising the resolution the number of sample points is increased, additional game elements are rendered, and overall fidelity greatly improved.

If this is the first you're hearing of sample points imagine your screen with a grid overlaid: each pixel of a game element, be that a blade of grass, a character, an object, or a building will fall within one or more of the grid's squares, and if half or more of a pixel is inside a square it is correctly rendered. In most instances fine detail is only affected to any great extent by this, with several pixels falling outside of squares, preventing them from being rendered. This leads to grass with visible gaps along the blades, and leaves with missing branches, to name but two examples. To rectify the problem the screen resolution can be increased, adding more squares to the grid, each smaller in size, giving fine detail a greater chance to fall sufficiently within a square and be rendered.

In the case of the latest Metal Gear games we see the expected increase in detail as the resolution increases, but as it does we also see entirely new items rendered, most typically in the background where even guard towers and trees are comparatively small. The cause: these game elements are required to be included in a specific number of sample points before being displayed. This makes sense at sub-HD resolutions as sub-HD players a) wouldn't be able to see the detail, and b) would have their performance improved by the removal of the extra polygons. However, even at 1920x1080 when running max settings at a locked 60 frames per second we still aren't seeing everything Ground Zeroes and The Phantom Pain have to offer, with new detail being introduced even at 3840x2160."
 
When I read the NVIDIA GeForce guide for MGS V: The Phantom Pain, I remember them talking about how the Fox Engine uses "sample points" to decide whether an object should be rendered or not, leading to much larger differences in detail than you would normally see when increasing the resolution in most other game engines, with entire objects being culled if more than a certain number of pixels of the object fall outside the sample-point grid. Leading to differences like this:


I'm guessing this is part of the reason why the Fox Engine was so performant. Is what Nanite is doing like a much more complicated version of that or totally unrelated?

edit - This is the quote from the GeForce article:

"Ground Zeroes and The Phantom Pain benefit greatly from higher rendering resolutions, more so than any other games in recent memory, as the rendering of many game elements is directly tied to the number of sample points: if an element isn't within a sufficient number of sample points it simply isn't displayed. By raising the resolution the number of sample points is increased, additional game elements are rendered, and overall fidelity greatly improved.

If this is the first you're hearing of sample points imagine your screen with a grid overlaid: each pixel of a game element, be that a blade of grass, a character, an object, or a building will fall within one or more of the grid's squares, and if half or more of a pixel is inside a square it is correctly rendered. In most instances fine detail is only affected to any great extent by this, with several pixels falling outside of squares, preventing them from being rendered. This leads to grass with visible gaps along the blades, and leaves with missing branches, to name but two examples. To rectify the problem the screen resolution can be increased, adding more squares to the grid, each smaller in size, giving fine detail a greater chance to fall sufficiently within a square and be rendered.

In the case of the latest Metal Gear games we see the expected increase in detail as the resolution increases, but as it does we also see entirely new items rendered, most typically in the background where even guard towers and trees are comparatively small. The cause: these game elements are required to be included in a specific number of sample points before being displayed. This makes sense at sub-HD resolutions as sub-HD players a) wouldn't be able to see the detail, and b) would have their performance improved by the removal of the extra polygons. However, even at 1920x1080 when running max settings at a locked 60 frames per second we still aren't seeing everything Ground Zeroes and The Phantom Pain have to offer, with new detail being introduced even at 3840x2160."

It's a different approach to solving a similar problem: avoid rendering too much geometry. The way the Fox Engine handles it is to decide which elements of each asset to render depending on whether or not they fall within a sample point, which is something that will have obvious visual artifacts at low resolutions since there will be fewer sample points. In Nanite's case, even at low resolutions, CG models will still look like CG models because the 'one triangle per pixel' rule still applies; it will simply have the effect of looking at CG in low resolution, but the perception of the geometry detail on display will not change.

Basically, Nanite rendering is all about not rendering more than your eyes can resolve, at any resolution. The Fox Engine's approach is just a different way to handle LOD management (it's really just using rendering resolution, e.g. via NVIDIA DSR, as a way to scale LOD) and still renders subpixel detail of any asset being rendered, so CG models wouldn't work using that method.
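
Here's a contrast sketch of the two behaviours (hypothetical code, not from either engine): the sample-point approach as described in the article drops whole elements below a coverage threshold, so raising the resolution makes things pop into existence, while a density clamp always draws the element and just varies its triangle count.

Code (C++):
// Two ways of coping with tiny on-screen detail. The first drops whole elements below a
// coverage threshold (content appears/disappears with resolution); the second always draws
// the element but caps its triangle count at roughly its pixel coverage. Illustrative only.
#include <algorithm>
#include <cstdio>

// Fox-Engine-style behaviour as described in the article: cull the element outright
// if it doesn't land on enough sample points.
static bool DrawElement_SamplePointCull(double coveredPixels, double minSamplePoints) {
    return coveredPixels >= minSamplePoints;
}

// Nanite-style behaviour: the element is always drawn, just with fewer triangles.
static double Triangles_DensityClamp(double sourceTriangles, double coveredPixels) {
    return std::min(sourceTriangles, std::max(coveredPixels, 1.0));
}

int main() {
    const double guardTowerTriangles = 250'000.0;   // made-up background asset
    for (double coverage : {0.4, 4.0, 400.0}) {     // pixels covered at rising resolution
        std::printf("coverage %6.1f px: sample-point cull draws it? %s | density clamp draws ~%.0f tris\n",
                    coverage,
                    DrawElement_SamplePointCull(coverage, 2.0) ? "yes" : "no",
                    Triangles_DensityClamp(guardTowerTriangles, coverage));
    }
}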
 
Is the PS5 demo available on PC?

I haven't heard anything about this. I could ask and see what Epic tells me.

It's not available, though it should be noted that there really isn't anything it was doing that the current iteration can't do (you just know PS5 fanboys are gonna act like it's only possible on PS5). The demo should be reproducible in the early access build, should one be so inclined.
 
@VFX_Veteran Hey bro, I was wondering if you ever got that video put together? Interested to hear your thoughts so far and how you think this will affect games in the near future. Specifically wondering how any of this new tech will affect physics-based foliage, if you've messed around with that, of course.
 
Easy. It's called instancing.

Take all the geometry data from one object and use a matrix to transform that object to other locations, then draw the same object again. It takes almost zero extra memory to do this. If this test were done with *unique* geometry, it wouldn't fare so well (a la FS2020); it would probably choke in the thousands very quickly. You'd also have to store different materials in memory, which would crash the system. It's literally a smoke screen.


How about Atomontage, Euclideon, and Atom View? (Atom View being notably the most impressive of the three.) These seem to be able to take even non-instanced photogrammetry from the real world and run it at high resolution and high framerate.
 
One of the UE5 devs was asked about unique geometry and suggested there is no performance cost for millions of unique meshes. He said they haven't tried it at scale yet, so there might be bugs, but apart from memory consumption the rendering performance shouldn't be affected. So far no one at Epic has had any issues using lots of unique geometry.

He said what could cause issues is having too many unique materials, but using the same material across distinct meshes isn't affected.
 
A few other comments from the Nanite breakdown:

They expect to someday be able to handle skeletal meshes. They said translucency is a bit of a challenge. They also said the toughest nut to crack is on-the-fly, real-time procedural geometry; they have no idea how to start tackling that.

What is interesting is that the software-based handling of small triangles is 3x faster than hardware rasterization, while for big triangles hardware is faster than software.

I wonder if mesh and primitive shading hardware is significantly accelerating this aspect, or if further gains could be had with specialized hardware for small triangles.
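
One commonly cited reason (this is my own rough arithmetic, not a figure from Epic's talk) is that hardware rasterizers shade fragments in 2x2 quads, so a triangle that covers a single pixel still pays for four shader invocations:

Code (C++):
// Rough arithmetic for why pixel-sized triangles hurt hardware rasterizers: fragments are
// shaded in 2x2 quads, so a one-pixel triangle still spawns a full quad of invocations.
#include <cstdio>

int main() {
    const double pixelsCovered = 1.0;    // a Nanite-sized triangle
    const double quadInvocations = 4.0;  // hardware shades a full 2x2 quad
    std::printf("useful work: %.0f%%, wasted helper lanes: %.0f%%\n",
                100.0 * pixelsCovered / quadInvocations,
                100.0 * (1.0 - pixelsCovered / quadInvocations));
}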
 
@VFX_Veteran Hey bro, I was wondering if you ever got that video put together? Interested to hear your thoughts so far and how you think this will affect games in the near future. Specifically wondering how any of this new tech will affect physics-based foliage, if you've messed around with that, of course.
Sorry man, I've been really occupied. Had a close friend die on me and trying to establish some normalcy in my life. I can't promise I will make a video of it since I'm pretty wiped out at the moment.
 
Sorry man, I've been really occupied. Had a close friend die on me and trying to establish some normalcy in my life. I can't promise I will make a video of it since I'm pretty wiped out at the moment.
I know that feeling, man. Someone came into the gas station where my friend worked, on Thanksgiving Day no less, and shot him right in the face with a sawed-off shotgun for a pack of cigarettes. We think it was some sort of gang initiation, though. Anyway, I know how much it can change your world. My whole fam is praying for you and yours and your friends as well.

Just know that if/when, after you've taken the time to process, you are ready to get back at it, most here appreciate you and your video game analysis.