
PS5 Pro devkits arrive at third-party studios, Sony expects Pro specs to leak

From one extreme to another... yes, the PS5 GPU is no 3080, but I have gamed on a 3080, and the way you are presenting it as some unreachable monster for the current timeline is weird, to say the least. The 3080 is not some ludicrous expectation and would be the bare minimum requirement to justify the Pro console.

That being said, yes, the Pro is never matching the 4080, magic optimization or not, no way in hell.
The guy above also didn't understand what I said. No one said the PS5 is equal to a 3080; raw hardware grunt is different from performance, especially in only 5% of titles.
 
Nope. Did you see the size of the OG PS5? I am beginning to think that a lot of people here just talk without really knowing what they are talking about.

The OG 2020 PS5, along with its size, drew ~220W on game load. That dropped to 202W with the 6nm revision, which is the same chip the PS5 slim uses. That GPU you are talking about on the PC doesn't even run at 2.7GHz; it runs at 2.5GHz and draws over 260W. And that is just the GPU alone. There is no way, not a chance in hell, that the PS5pro has a GPU clocked that high when there is also a CPU to contend with and it needs to fit into a console chassis. It just doesn't make sense.

Further, NEVER has a console variant of a PC GPU/CPU matched the PC equivalent in clocks/power consumption. They are always downclocked, and for good reason. If the PS5pro is on 5nm, do not expect its GPU to be clocked anywhere above 2.4GHz. If anything, 2.35GHz is more likely.

I am sorry, but I don't see how what you just posted here proves they are actually being used. I am not saying they aren't real, just that we aren't seeing them anywhere. And you shouldn't even be using a "benchmarking tool" for this argument, as the argument here is whether it's even being used in games.

Here's the problem. Let's take 3 GPUs from Nvidia, for instance: the 2080, 3080 and 4080. Take a game like RE4 running on all 3 at 1080p with no RT, so that none of those GPUs can be bottlenecked by either RT performance or VRAM. RE4 at 1080p peaks at 9.4GB of VRAM utilization, so only the 2080 may suffer since it has only 8GB, which, mind you, should work in favor of the other GPUs.

2080: 10TF - 98fps
3080: 14.8TF (+48% vs 2080), or +197% if using the claimed 29.7TF - yet only 130fps, +32% vs 2080
4080: 24.4TF (+144% vs 2080), or +387% if using the claimed 48.7TF - yet only 200fps, +104% vs 2080

See what's happening there? The resulting performance is much more in line with the actual TF differences and nowhere near the claimed TF differences. For giggles, the 4090 runs at 228fps. And remember, the 2080 is the only GPU here that is even VRAM bottlenecked. See why I am not buying it?
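Quick sanity check in Python, using only the fps and TF figures above (nothing here beyond those, just the ratios spelled out):

    # Do the fps gains track the classic TF numbers or the claimed dual-issue ones?
    gpus = {
        # name: (classic_tf, dual_issue_tf, re4_1080p_fps) -- figures quoted above;
        # the 2080 has no doubled-FP32 figure, so its classic TF is reused
        "2080": (10.0, 10.0, 98),
        "3080": (14.8, 29.7, 130),
        "4080": (24.4, 48.7, 200),
    }
    base_tf, _, base_fps = gpus["2080"]
    for name, (tf, tf_dual, fps) in gpus.items():
        print(f"{name}: +{(tf / base_tf - 1) * 100:.0f}% classic TF, "
              f"+{(tf_dual / base_tf - 1) * 100:.0f}% dual-issue TF, "
              f"+{(fps / base_fps - 1) * 100:.0f}% actual fps")

The fps column tracks the classic TF column, not the dual-issue one.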
The expectation is it will be on at least 4nm not 5
 
Yeah, a 3080 for a Pro console is reasonable. Though personally I think it would be just below this. But someone was comparing the base PS5 to a 3080, and others were saying the Pro will be 4080 class.
You are aware performance fluctuates quite radically between games, right? For example, the PS5 performs mostly like a 2080, but there are games where it's only a 2070 and others where it's above a 2080 Super or even Ti. If the Pro is anything like that, I expect it to perform mostly like a 4070 Ti but have very, very fringe examples (like 2% of games) where it actually does close in on a 4080 in pure raster.
 
I swear you gotta be a troll. The 6800 outperforms the PS5 by 24% in TLOU. Where are you even getting your numbers from?

There isn’t a single game where the PS5 comes within 2% of the 3080 or beats it. At worst, the 3080 is over 25% faster.
The game favors amd cards so the 6800 outperforms the 3080 there…
 
I’m
The 4080 is 50% faster than the 3080 which in turn is over 60% faster than the PS5’s GPU. It’s over 2.2x faster than the PS5’s GPU. How the fuck is it unrealistic to expect the PS5 Pro to land within the ballpark of the 3080 in both raster and RT but fall short of the 4080 by a sizable margin?

At this point, I’ll assume you have a mental handicap.
RDNA is far better at raster than RT. I think it's far more likely that the Pro can equal the 4080 in raster in select games than it ever could match, let alone exceed, a 3080 in RT. It's just insane that you call everyone names and call people trolls for disagreeing with you, yet have opinions like this.
 

ChiefDada

Member
Based on leaks it will be ~7800XT, so a few % stronger than a 4070. It could also be weaker than the 7800XT thanks to console power limits (if they just don't go for a 300W TDP). The 4070ti is ~20% better than that, and the 4080 is ~30% better than the 4070ti. People expecting 4080 performance are going to be disappointed...

Shouldn't we expect PS5 Pro to have higher clocks/power efficiency than the 7800xt since it is still monolithic?

Exactly! PS5 Pro will be close to a 3080 until raytracing, then more than likely a 3070ti, which is great 😀 don't get me wrong.

Based on rumors with focus on RT and supposed 2x RT performance, PS5 Pro is likely better than 3000 series and more comparable to 4070 RT capabilities.
 

HeisenbergFX4

Gold Member
1200 + replies for a device that doesn't actually exist?
The Sopranos Wink GIF
 

winjer

Gold Member
Shouldn't we expect PS5 Pro to have higher clocks/power efficiency than the 7800xt since it is still monolithic?

Depends on how it's done.
The 7800XT has 64MB of L3 made on N6, connected via Infinity Fabric. This saves a good amount of power, since every time there is a cache hit, there is no need to go to memory.
If the PS5 Pro has no L3, then it will require more accesses to memory, and therefore greater memory bandwidth. That could mean higher memory speed or more channels. Either of these costs more power.

But with cache, it's necessary to power the IF. Going through the IF is cheaper than going to memory, but it's still more expensive than going directly through the main chip, both in terms of latency and power usage.
Another option would be to connect that L3 cache with a 3D stack, using TSVs. This would mean lower power usage and lower latency, but it could increase temperatures and limit clocks for the SoC.
It would be funny to see a Zen2 SoC with 3D V-Cache.
 
Depends on how it's done.
The 7800XT has 64MB of L3 made on N6, connected via Infinity Fabric. This saves a good amount of power, since every time there is a cache hit, there is no need to go to memory.
If the PS5 Pro has no L3, then it will require more accesses to memory, and therefore greater memory bandwidth. That could mean higher memory speed or more channels. Either of these costs more power.

But with cache, it's necessary to power the IF. Going through the IF is cheaper than going to memory, but it's still more expensive than going directly through the main chip, both in terms of latency and power usage.
Another option would be to connect that L3 cache with a 3D stack, using TSVs. This would mean lower power usage and lower latency, but it could increase temperatures and limit clocks for the SoC.
It would be funny to see a Zen2 SoC with 3D V-Cache.
I still don’t believe the zen 2 leaks but we will see
 

Rudius

Member
which hopefully will allow for 1440p(ish?)/60
also, I am looking forward to getting improvements for PS VR2 games. GT7, mainly. Get it closer to 90/120 fps and increased IQ, that would be sweet.
The PS4 Pro was essential for a good PSVR experience, upgrading many games from blurry to nice looking.

With PSVR2 we don't have the same resolution problem with heavy games like GT7 and Resident Evil, thanks to the eye-tracked foveated rendering. No Man's Sky became nice and sharp once they implemented the technique.

The one big improvement a PS5 Pro could bring is taking games from a 60fps reprojected to 120 up to a native 90fps.
 

THE:MILKMAN

Member
I still don’t believe the zen 2 leaks but we will see

Not sure why. It is probably the least surprising spec for a number of reasons already mentioned. Rich at DF did question Cerny about the CPU being x86 and "just working" in his "inside PS4 Pro" interview, where Cerny explains why that isn't the case for a fixed console.

I can't see how going from Zen 2 to Zen 4/4c/5 would be any different, and therefore I don't see it happening.

But of course PS5 Pro doesn't exist so all this is moot!
 

Mr.Phoenix

Member
So 200mm2 = 54 CUs, but the PS5 Pro will need some extra ones for yields, right? At 2.4 GHz, that gets us to 16.5 TF. 2.3 GHz would be 15.9 TF. I think that's what we get.

Of course, that means the RT and IPC gains better be substantial, otherwise this will be a rather meek upgrade.
Exactly. That's why I have been saying it will be between 16-17TF. 200mm2 would be able to fit 60 RDNA3 CUs. Hell, it likely would be able to fit 64 CUs because, unlike the PC part, the PS5pro would lack any of the hardware needed to link the GCD to the MCDs.

And we could also expect that the PS5pro would likely be using the RTUs from RDNA4, which, if they accelerate the BVH tree, would mean they are going to be bigger than the RTU found in RDNA3. So the same way the CUs in the PS4pro were "bigger" due to the CBR ID stuff, the CUs in the PS5pro would likely be bigger too. And RDNA3 already has 2x AI units per CU, so I expect no changes there. They likely increase the GPU L2 cache from 4MB (as it is in the PS5 and even the 7800xt) to 6MB, since the PS5 doesn't have an L3 cache.

All said, the actual PS5pro GPU stands to take up anywhere between 200mm2 and 220mm2. Then the only things left would be the CPU, CPU cache, memory PHY controllers, IO stack... etc. 100mm2 should be enough for those, I am guessing. And we end up with a 300-320mm2 APU.

The only question now is whether it's going to be 54CU active, 56CU active or 60CU active. I think the last one is least likely because that would mean there are more than 60CUs in the PS5pro GPU. And the 7800xt is a 60CU full chip; the 7700xt is basically a 7800xt with 6 disabled CUs and only 3 of 4 MCDs. It even has the same GCD size as the 7800xt.

My money is on us getting 16.2TF (2.35GHz), which would be marketed as a 32TF console thanks to the whole dual-issue thing. If AMD and Nvidia can do it, so will Sony.
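For reference, the napkin math behind those TF figures (a rough sketch; the CU count and clocks are the guesses above, not confirmed specs):

    # RDNA TFLOPS = CUs * 64 ALUs * 2 ops/clock (FMA) * clock in GHz / 1000
    def tflops(cus, ghz, dual_issue=False):
        return cus * 64 * 2 * ghz * (2 if dual_issue else 1) / 1000

    for ghz in (2.3, 2.35, 2.4):
        print(f"54 CUs @ {ghz} GHz: {tflops(54, ghz):.1f} TF "
              f"(marketed: {tflops(54, ghz, dual_issue=True):.1f} TF with dual issue)")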
 

PeteBull

Member

ChiefDada

Member
The midgen upgrade hype is real.
If the device launches holidays 2024, its devkits gotta exist by now for sure. ps4pr0 launched Nov 2016, first official info came Sept 7, so 3 months earlier. Now let's see when we got the first true ps4pr0 leaks (back then it was called ps4k: https://www.neogaf.com/threads/ps4k...pu-price-tent-q1-2017.1202462/#post-199646466) ;D

Wow, just skimming the first few pages really proves history does repeat itself.

A few good ones:

This gen has barely started though

What sort of MESS

This is really sounding like a PS5 with those massive hardware upgrades.

PS4 owners are gonna be pissed. Glad I waited to buy mine!

This thing better downscale 4K to 1080p.

Because I don't know anyone with a 4K TV. Is there even widespread 4K content at all?
 

Mr.Phoenix

Member
I’m

RDNA is far better at raster than RT. I think it's far more likely that the Pro can equal the 4080 in raster in select games than it ever could match, let alone exceed, a 3080 in RT. It's just insane that you call everyone names and call people trolls for disagreeing with you, yet have opinions like this.
No. I really don't know where you are getting this from or what kind of math you are doing. You do not have to guess what the PS5pro GPU will be. It's going to be based on the 7800xt GCD. That is a 60CU GPU. An argument can be made that the PS5pro has 60CUs active, which would mean the PS5pro has a custom 64-66CU raw GPU, or that the PS5pro just takes that 60CU 7800xt and disables 4 or 6 CUs.

Either way, whatever the case, you are going to end up with something that's between a 7700xt and a 7800xt in raster performance. There is just no way around this.

Lords of the Fallen @1440p, no RT
- 4080 (73fps)
- 4070 (47.6fps)
- 7800xt (43fps)
- 7700xt (37fps)

Avatar @1440p
- 4080 (85fps)
- 4070 (52fps)
- 7800xt (44fps)
- 7700xt (40fps)

AC: Mirage @1440p
- 4080 (128fps)
- 4070 (97fps)
- 7800xt (91fps)
- 7700xt (79fps)

I don't understand why you choose to continue ignoring the data. It's right there: there is no way that anything based on the 7800xt (and it could even be the 7700xt) performs like a 4080. It can come close to a 4070, but the 4080 is not even in the same weight class.

What you are hoping for is some sort of optimization + secret sauce, but that's not something we can quantify. It would be great if it does/has that... but it would make no sense to be here saying that it will. And the only thing that makes a 7700/7800xt perform like a 4070ti... will make a 4080 perform like a 4080ti/super. Dual-issue compute. Both AMD and Nvidia have it.
The expectation is it will be on at least 4nm not 5
I don't see it being 4nm. 5nm seems more likely, in the same way that Sony stuck with 6nm for the PS5 refresh. It's clear that they are not trying to use the latest and greatest nodes. It's cheaper for Sony to use 5nm. My guess is that the same APU would be around 320mm2 on 5nm and around 280mm2 on 4nm. But that 4nm APU would cost more than its 5nm counterpart.

Even if it only cost $10 more, that would be enough for Sony to not use it. This is just what I think, though; even if it's a 4nm process, the specs won't change.
 

buenoblue

Member

Exactly. Finally some common sense lol
 

Gaiff

SBI’s Resident Gaslighter
The game favors amd cards so the 6800 outperforms the 3080 there…
Oh, really?



97 average vs 84 in favor of the 3080 at 1440p. 52 vs 47 average in favor of the 3080 at 4K. The 3080 wins by 10 to 15% over the 6800. We've been through this already. Stop lying and claiming falsehoods. The absolute best the PS5's GPU can do is get within 20-25% of the 3080, not 2%, let alone outperform it. And AMD does run better than NVIDIA in this game, which is why the 6800 XT beats the 3080, y'know, the actual card it's supposed to rival, not the lower tier 6800.

No. I really dont know wherte you are getting this from or what kinda math you are doing. You do not have to guess what the PS5pro GPU would be. Its going to be based off the 7800xt GCD. That is a 60CU GPU. An argument can be made that the PS5pro has 60CU active which would mean that the PS5pro would have a custom 64-66CU raw GPU or that the PS5pro just takes that 60CU 7800xt and disables 4 or 6 CUs.

Either way, whatever the case, you are going to end up with something that's between a 7700xt and 7800xt in raster performance. There is just nothing else around this.

Lords of the Fallen@1440p NO RT
- 4080 (73fps)
- 4070 (47.6fps)
- 7800xt (43fps)
- 7700xt (37fps)

Avatar @1440p
- 4080 (85fps)
- 4070 (52fps)
- 7800xt (44fps)
- 7700xt (40fps)

AC: Mirage @1440p
- 4080 (128fps)
- 4070 (97fps)
- 7800xt (91fps)
- 7700xt (79fps)

I don't understand why you choose to continue ignoring the data. Its right there, there is no way that anything made based on the 7800xt and that could even be the 7700xt performs like a 4080. Can come close to a 4070, but then the 4080 is not even in the same weight class.

What you are hoping for is some sort of optimization + secret sauce, but that's not something we can quantify. It would be great if it does/has that... but it would make no sense to be here saying that it will. And the only thing that makes a 7700/7800xt perform like a 4070... will make a 4080 perform like a 4080ti/super. Dual issue compute. Both AMD and Nvidia has it.

I don't see it being 4nm. 5nm seems more likely, in the same way, that sony stuck with 6nm with the PS5 refresh. Its clear that they are never trying to be using the latest and greatest nodes. Cheaper for Sony to use 5nm. My guess is that you have the same APU be around 320mm2 on 5nm and be around 280mm2 on 4nm. But that 4nm APU would cost more than its 5nm counterpart.

Even if it only cost $10 more, that would be enough for Sony to not use it. This is what I think though, but even if its a 4nm process, the specs won't change.
Right? Like, how the fuck is it complicated to understand? The PS5 Pro will very likely be in the ballpark of a 4070/3080/7800 XT in rasterization performance in general. The 4080 is still about 50% faster than those guys. Assuming a very well-optimized title on the Pro that doesn't run all that well on PC, then the Pro could reach 4070 Ti level of performance, which is generally around 25% faster than its class of GPU.

The PS5 Pro would need to be as fast as the 4070 Ti out of the gate and then hope for another 25% boost in optimized games to get on the level of a 4080. What are the odds? And that's in raster alone.
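The napkin math, with the round numbers above (illustrative, not benchmarks):

    gap_4080_vs_class = 1.50   # 4080 ~50% faster than the 4070/3080/7800 XT class
    optimization_bonus = 1.25  # generous best-case console optimization uplift assumed above
    needed_out_of_the_gate = gap_4080_vs_class / optimization_bonus
    print(f"The Pro would need to start ~{(needed_out_of_the_gate - 1) * 100:.0f}% "
          f"above its class (roughly 4070 Ti territory) before optimization even kicks in")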

How is it so hard for this kid to understand simple maths and scaling? Jesus Christ.
 

Mr.Phoenix

Member

At this point, I think he just has confirmation bias. Wants what he thinks to be true so much that he is now resorting to bending facts.
 

Dorfdad

Gold Member
Man just slap a 4090 in it and call it a day! We won’t need to upgrade for at least 2 years!
So it will cost you $25.00 more, big deal, inflation happens.









/s
 

omegasc

Member
The PS4 Pro was essential for a good PSVR experience, upgrading many games from blurry to nice looking.

With PSVR2 we don't have the same resolution problem with heavy games like GT7 and Resident Evil, thanks to the eye-tracked foveated rendering. No Man's Sky became nice and sharp once they implemented the technique.

The one big improvement a PS5 Pro could bring is taking games from a 60fps reprojected to 120 up to a native 90fps.
I believe GT7 renders at 1080p? I'm basing off the 120fps mode... Since it has to render twice (one for each eye) on the VR I thought that it would be too much to get 4k 60. But getting the current output to 90 would already be a nice overall improvement!
 

Rudius

Member
I believe GT7 renders at 1080p? I'm basing off the 120fps mode... Since it has to render twice (one for each eye) on the VR I thought that it would be too much to get 4k 60. But getting the current output to 90 would already be a nice overall improvement!
GT7 runs at a high resolution on PSVR2 at 60fps reprojected to 120. I don't know the number of pixels, but it looks native or close to native in the headset.

Keep in mind that the game in VR only has to display a high resolution in the area you are looking at, decreasing it outside, but you can't notice it because it follows your gaze with the eye-tracking.
 

sncvsrtoip

Member
I believe GT7 renders at 1080p? I'm basing off the 120fps mode... Since it has to render twice (one for each eye) on the VR I thought that it would be too much to get 4k 60. But getting the current output to 90 would already be a nice overall improvement!
GT7 renders around 90% of the full 2000 x 2040 per eye at 60fps reprojected to 120fps in the area of eye focus (and closer to 240p outside of it ;d)
 

mckmas8808

Mckmaster uses MasterCard to buy Slave drives
Creative choice, the only real reason most developers target 30 FPS is because of hardware limitations.

A combat heavy game like Wolverine will need to rely on the fluidity and responsiveness of 60 FPS, anything lower would be immersion breaking. It also brings a sense of realism as well since the motion is so lifelike.

But Insomniac has never thought like this before. They've never done this before either. So the change is interesting. Maybe you're right though.
 

omegasc

Member
GT7 runs at a high resolution on PSVR2 at 60fps reprojected to 120. I don't know the number of pixels, but it looks native or close to native in the headset.

Keep in mind that the game in VR only has to display a high resolution in the area you are looking at, decreasing it outside, but you can't notice it because it follows your gaze with the eye-tracking.

GT7 renders around 90% of the full 2000 x 2040 per eye at 60fps reprojected to 120fps in the area of eye focus (and closer to 240p outside of it ;d)
I guess it's the reprojection from 60 that tricks me into thinking it's lower res. I'm aware of the foveated rendering and it works amazingly well! GT7 really changes with the VR2. Can't recommend it enough.
 

rnlval

Member
I don't know, I can't help but feel that the AMD and Nvidia listed FP32 TF numbers are scams, as we are yet to see any game take advantage of this whole dual-issue compute thing. And the way both companies go about achieving these doubled TF numbers is weird. E.g. Nvidia didn't really double anything, they just made their already existing 64 INT32 cores also capable of "acting" like 64 FP32 cores simultaneously. AMD, on the other hand, just allows its own 64 FP32 cores to work on two instructions simultaneously.

Either way, we are yet to see any game use this feature, evident in the fact that, compared to their GPUs using the older TF measurements, we are not seeing anywhere near the kind of performance boosts that these inflated TF numbers would suggest.
Per CU or SM, there's no TMU increase with RDNA3 and Ampere/Ada.

The Compute Units (CUs) of the RDNA 3 architecture have been significantly redesigned. Since the first GCN GPUs nearly 11 years ago, the GPU block has always had 64 shaders (“ALUs”). RDNA 3’s new thing is that it gives the SIMD units that provide those “shaders” the ability to process two instructions per cycle. As with RDNA 2, two 32-wide SIMD units are used for this, but with the newly added ability to “dual issue”, i.e. process two instructions simultaneously. They thus have a theoretical compute performance of the equivalent of up to 128 shaders instead of the current 64.

However, this duplication of “ALUs” or shaders is done within the CU structure based on previous generations, so that 64 of these dual-issue shaders, which could theoretically do the work of 128, share some of the control and compute structures that were serving 64 shaders in RDNA 2. Also important is that this dual-issue capability is still quite inflexible and has various complicated constraints and requirements on the instructions for them to be executed simultaneously. Therefore, in practice, 64 of these RDNA 3 dual-issue shaders have a much smaller resulting performance than you would get from 128 RDNA 2 shaders. So while this doubles the theoretical TFLOPS, in practice the performance yield extracted will be much smaller than 2x – usually.

The above-average performance gains with the RDNA3 architecture in our tests are probably evidence that, at least in some cases, dual-issue is already being used in compute applications. There is one thing that supports this idea. An analogous situation occurred with Nvidia’s Ampere architecture. It also doubled the number of FP32 shaders, though the technical details were different there. But even then, the real benefit in games was quite limited and far from seeing doubled performance. And perhaps similar to today’s situation, the benefit of the FP32 doubling was much more pronounced in some compute applications than in games. You may still remember the big performance gains in various Cuda and OpenCL applications that were shown in 2020 reviews. So this points to the nature of compute applications being somewhat different compared to games, which may be behind the atypical performance gains in compute applications on the RDNA3 architecture in the Radeon RX 7600 and Navi 33 chip.


--------------------------

In compute workloads, RDNA 3 and Ampere/Ada GPUs show their strength. Lowly RX 7600 is still useful for compute workloads.

The PC is a multipurpose device despite being used in PC-based game consoles or gaming PC roles. The goal of combining both market segments is for economies of scale.
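To put rough numbers on the theoretical-vs-effective gap described in the quoted article (the 1.15x "typical" factor below is just an assumed illustration, not a measured figure):

    alus_per_cu, clock_ghz = 64, 2.4
    rdna2_gflops_per_cu = alus_per_cu * 2 * clock_ghz   # FMA = 2 ops/clock
    rdna3_paper = rdna2_gflops_per_cu * 2               # dual issue doubles the paper number
    rdna3_typical = rdna2_gflops_per_cu * 1.15          # assumed effective uplift in games
    print(f"per CU: RDNA 2 {rdna2_gflops_per_cu:.0f} GFLOPS, "
          f"RDNA 3 paper {rdna3_paper:.0f}, RDNA 3 'typical' ~{rdna3_typical:.0f}")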
 

ChiefDada

Member
Avatar @1440p
- 4080 (85fps)
- 4070 (52fps)
- 7800xt (44fps)

- 7700xt (40fps)

I was surprised to see the 7800xt this close to 4070 at 1440p in an RT based game. This is why I believe PS5 Pro would handily edge 4070 with few exceptions for future games that will be more RT focused and by extension VRAM hungry. 12gb will be an issue for this card especially for first party ports that flex i/o even further than we've seen. 4070S 16gb will be where this card lands when factoring RT/raster in conjunction.
 

Bojji

Member
Shouldn't we expect PS5 Pro to have higher clocks/power efficiency than the 7800xt since it is still monolithic?



Based on rumors with focus on RT and supposed 2x RT performance, PS5 Pro is likely better than 3000 series and more comparable to 4070 RT capabilities.

Rumors point to it not having higher clocks than PC part. Power consumption is the biggest factor here, consoles never really exceeded 200W that much (as a whole) and 7800XT eats over 250W ALONE. It has to be less performant than PC GPU unless Sony increase console TDP drastically compared to PS5.

With (much) better RT performance and some sort of DLSS competitor, it will be an interesting piece of tech.

I was surprised to see the 7800xt this close to 4070 at 1440p in an RT based game. This is why I believe PS5 Pro would handily edge 4070 with few exceptions for future games that will be more RT focused and by extension VRAM hungry. 12gb will be an issue for this card especially for first party ports that flex i/o even further than we've seen. 4070S 16gb will be where this card lands when factoring RT/raster in conjunction.

The console will still have 16GB of memory, so I doubt VRAM requirements will increase. Devs have 13GB (?) of usable memory, and some of it has to go to CPU tasks.
 

Mr.Phoenix

Member
Per CU or SM, there's no TMU increase with RDNA3 and Ampere/Ada.
I believe you mean no ALU increase, and yes, I am aware of that. It's what I have been saying: Nvidia just lets its INT shaders also act like FP shaders, and AMD allows its FP shaders to do two instructions at once.
In compute workloads, RDNA 3 and Ampere/Ada GPUs show their strength. The PC is a multipurpose device despite being used in PC-based game consoles or gaming PC roles. The goal of combining both market segments is for economies of scale.
Again, also making my point. These compute workloads that take advantage of the whole dual-issue thing so far have not been games. It's not like we are running Blender on the PS5/XSX. And the reason for that is that games are yet to be written to specifically take advantage of it. At the end of the day, FP32 shaders pretty much do the bulk of the work in any gaming environment, so being able to double that performance would be good. However, devs have to code for it, at least in a way that allows the drivers to pick up the slack.

That is yet to happen. And I have my suspicions about how effective it is for gaming applications, given how little it has actually been used. That's a LOT of potential performance being left on the table. There has to be a reason for that.
 
Rumors point to it not having higher clocks than PC part. Power consumption is the biggest factor here, consoles never really exceeded 200W that much (as a whole) and 7800XT eats over 250W ALONE. It has to be less performant than PC GPU unless Sony increase console TDP drastically compared to PS5.

Power will drop on the new node
 
All I want is Mark Cerny doing the presentation. But this time with actual footage and not just some slides please!!!!:messenger_pensive:
My tub is ready
I was surprised to see the 7800xt this close to 4070 at 1440p in an RT based game. This is why I believe PS5 Pro would handily edge 4070 with few exceptions for future games that will be more RT focused and by extension VRAM hungry. 12gb will be an issue for this card especially for first party ports that flex i/o even further than we've seen. 4070S 16gb will be where this card lands when factoring RT/raster in conjunction.
I agree. We could see PS5 Pro close to 4070ti with RT in some games.
 

Mr.Phoenix

Member
Power will drop on the new node
Two things...

  1. This is based on the assumption that the PS5pro uses 4nm. I believe it will use 5nm, the same as the 7800xt.
  2. Even if it uses 4nm, how much of a power drop are we looking at? The OG PS5 went from 220W on 7nm to 200W on 6nm. That's about a 10% power drop. So even if we gave it a 20% drop compared to the 7800xt on the hypothetical new node, we are still looking at around 200W. And that's just for the GPU.
No matter what, I can't see the PS5pro pulling anything more than 220-240W. And the only way they achieve that is not just by using a 4/5nm node, but by undervolting the APU, which in turn limits how high it can be clocked.

Long story short, I don't see it being clocked as high as the 7800xt. And all this is worse if it's not even on 4nm.
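The napkin math version (every figure here is an assumption pulled from the posts above, not a measurement):

    gpu_7800xt_w = 260        # rough 7800 XT-class board power at PC clocks
    node_drop = 0.20          # generous assumed saving from a node shrink + undervolt
    cpu_io_w = 40             # assumed budget for the Zen 2 CPU cluster + IO
    gpu_w = gpu_7800xt_w * (1 - node_drop)
    print(f"GPU ~{gpu_w:.0f}W + CPU/IO ~{cpu_io_w}W = ~{gpu_w + cpu_io_w:.0f}W total")

Even with those generous assumptions you land around 250W at full PC clocks.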
 

Gaiff

SBI’s Resident Gaslighter
I was surprised to see the 7800xt this close to 4070 at 1440p in an RT based game. This is why I believe PS5 Pro would handily edge 4070 with few exceptions for future games that will be more RT focused and by extension VRAM hungry. 12gb will be an issue for this card especially for first party ports that flex i/o even further than we've seen. 4070S 16gb will be where this card lands when factoring RT/raster in conjunction.
That's actually a pretty poor showing from the 7800 XT and I most certainly wouldn't use that in favor of the Pro considering it would need to outperform the 7800 XT by a good 30% in this benchmark to "easily" edge the 4070. The 4070 is also heavily gimped with an awful 192-bit bus that makes it choke at higher resolutions. VRAM isn't even the problem with that card.

Don't forget that the vast majority of games are still hybrid and not full ray tracing. In Avatar's case, we'd have to force RT off using NVIDIA Inspector or AMD GPU profiler to see how close they are in raster in that game since auto-detection turns on RT automatically for supported cards. That the 7800 XT gets beaten by 18% against a card it should beat in rasterization doesn't look good. This likely means that toggling ray tracing alone in this game gives the 4070 a pretty huge lead, assuming that the 7800 XT narrowly beats it without it.

I think you're grossly overestimating NVIDIA's advantage in usual RT workloads. In monster RT games like in Cyberpunk or Alan Wake, yeah, AMD GPUs can get mollywhopped by over 50% by NVIDIA, but these are the exceptions, not the rule. In most games, it's more along the lines of 10-20% because the vast majority don't use that many RT effects besides the usual suspects.
 

PaintTinJr

Member
I don't know, I can't help but feel that the AMD and Nvidia listed FP32 TF numbers are scams, as we are yet to see any game take advantage of this whole dual-issue compute thing. And the way both companies go about achieving these doubled TF numbers is weird. E.g. Nvidia didn't really double anything, they just made their already existing 64 INT32 cores also capable of "acting" like 64 FP32 cores simultaneously. AMD, on the other hand, just allows its own 64 FP32 cores to work on two instructions simultaneously.

Either way, we are yet to see any game use this feature, evident in the fact that, compared to their GPUs using the older TF measurements, we are not seeing anywhere near the kind of performance boosts that these inflated TF numbers would suggest.
We are seeing it on both IMO, but the differing strategies make it difficult to see it so clearly and obviously.

If you take a less common game engine scenario utilising RPM FP16 or FP32 for a twitch shooter, just doing rasterization - without RT - you typically see AMD hardware do well at native, and Nvidia hardware feel like it underperforms at native. The AMD card is getting a big uplift from its use of RPM, and because RT isn't used it doesn't have to take 10 or more half-precision teraflops and VRAM bandwidth away from the comparative compute metric, and instead saturates its hardware heavily.

On the Nvidia card in the same scenario, the benefit of FP16 over FP32 is essentially just a memory bandwidth saving at the expense of accuracy, as FP16 on Nvidia typically still occupies FP32-capable units, just at half the data size. Engine performance then comes down to the standard headline teraflop figure for the Nvidia GPU, which will be less than the FP16 headline figure of the AMD GPU. And on the Nvidia GPU in this scenario, native rendering means the RT hardware sits idle.

In the more typical scenario, where the AMD PC cards seemingly fold in benchmarks even against lower Nvidia GPUs and with inferior RT, what is typically happening is that the AMD teraflop number is being reduced by the BVH RT hardware blocking CUs from doing rasterization/compute while doing RT BVH work - although that issue isn't present in the PS5 APU - and because RT on AMD is done in compute, even more teraflops are taken up computing the higher-level RT algorithm on the CUs. So even if FP16 and async are being exploited well, the Nvidia card typically has a memory bandwidth advantage from memory type and interface width, plus dedicated RT hardware that in reality is worth an awful lot of AMD FP16 or FP32 teraflops. The Nvidia GPU gets to use all its FP32 or FP16 compute for rasterization-level rendering and then adds the RT rendering from the RT hardware as a freebie on top.
 

rnlval

Member
I believe you mean no ALU increase, and yes, I am aware of that. It's what I have been saying: Nvidia just lets its INT shaders also act like FP shaders, and AMD allows its FP shaders to do two instructions at once.

Again, also making my point. These compute workloads that take advantage of the whole dual-issue thing so far have not been games. It's not like we are running Blender on the PS5/XSX. And the reason for that is that games are yet to be written to specifically take advantage of it. At the end of the day, FP32 shaders pretty much do the bulk of the work in any gaming environment, so being able to double that performance would be good. However, devs have to code for it, at least in a way that allows the drivers to pick up the slack.

That is yet to happen. And I have my suspicions about how effective it is for gaming applications, given how little it has actually been used. That's a LOT of potential performance being left on the table. There has to be a reason for that.
For 3D graphics, ALU is nothing without load-store units like TMUs and ROPS. Compute workload doesn't need graphics hardware like TMU and ROPS.

ROPS has a Z-buffer (depth) and color pixel read/write operations. ROPS has other fixed-function hardware e.g. MSAA.

TMU has texture read/write operations. TMU has other fixed-function hardware. e.g. texture filters. Despite unified shaders, graphics I/O units are not unified.

RDNA 3's dual-issue mode doesn't support GCN's legacy wave64 instructions.

GpGPU is not a DSP.
 

rnlval

Member
We are seeing it on both IMO, but the differing strategies make it difficult to see it so clearly and obviously.

If you take a less common game engine scenario utilising RPM FP16 or FP32 for a twitch shooter, just doing rasterization - without RT - you typically see AMD hardware do well at native, and Nvidia hardware feel like it underperforms at native. The AMD card is getting a big uplift from its use of RPM, and because RT isn't used it doesn't have to take 10 or more half-precision teraflops and VRAM bandwidth away from the comparative compute metric, and instead saturates its hardware heavily.

On the Nvidia card in the same scenario, the benefit of FP16 over FP32 is essentially just a memory bandwidth saving at the expense of accuracy, as FP16 on Nvidia typically still occupies FP32-capable units, just at half the data size. Engine performance then comes down to the standard headline teraflop figure for the Nvidia GPU, which will be less than the FP16 headline figure of the AMD GPU. And on the Nvidia GPU in this scenario, native rendering means the RT hardware sits idle.

In the more typical scenario, where the AMD PC cards seemingly fold in benchmarks even against lower Nvidia GPUs and with inferior RT, what is typically happening is that the AMD teraflop number is being reduced by the BVH RT hardware blocking CUs from doing rasterization/compute while doing RT BVH work - although that issue isn't present in the PS5 APU - and because RT on AMD is done in compute, even more teraflops are taken up computing the higher-level RT algorithm on the CUs. So even if FP16 and async are being exploited well, the Nvidia card typically has a memory bandwidth advantage from memory type and interface width, plus dedicated RT hardware that in reality is worth an awful lot of AMD FP16 or FP32 teraflops. The Nvidia GPU gets to use all its FP32 or FP16 compute for rasterization-level rendering and then adds the RT rendering from the RT hardware as a freebie on top.
PS5's GPU RT cores have similar issues to the rest of the RDNA 2 RT cores. PS5 Pro's GPU RT has hardware BVH traversal. The BVH traversal workload has blocking behavior on AMD's current hardware.

GPU's TFLOPS resource remained the same, but the BVH and RT denoise workloads can consume the available shader resources for traditional raster workloads. Both AMD and NVIDIA have thrown 2X FLOPS per CU/SM at the problem.

RDNA 2 RT hardware accelerates ray/box/triangle intersections.

RDNA 3 RT hardware accelerates the DXR flag feature for early subtree culling (reduces traversal iterations), 1.5x more rays in flight (per RT core), and box sorting (reduces traversal iterations).

NVIDIA's RTX RT cores are a feature-complete implementation, with BVH traversal and the DXR flag feature from the start. NVIDIA's Ampere RT cores add a triangle interpolation feature. https://videocardz.com/newz/nvidia-details-geforce-rtx-30-ampere-architecture

Sony and MS wanted NVIDIA's RT innovations but were unwilling to pay for them, hence pushing AMD to be a lower-cost alternative.
 

SmokSmog

Member
BTW guys, there is no 4nm, it's N4, it's just N5++

N7 = proper node
N6 = 7nm++ with some EUV steps that yielded a slightly smaller chip and is faster to make cuz of fewer steps.
N5 = proper node
N4 = 5nm++
N3 = proper node

And I think RDNA4/PS5 PRO and real PS5 Slim will be made on N4P

Comparing PS5 PRO to 7800XT is silly, Navi32 was ready in late 2022, PS5 PRO will release in late 2024 while using RDNA3.5 or RDNA4.
 

rnlval

Member
Oh, really?



97 average vs 84 in favor of the 3080 at 1440p. 52 vs 47 average in favor of the 3080 at 4K. The 3080 wins by 10 to 15% over the 6800. We've been through this already. Stop lying and claiming falsehoods. The absolute best the PS5's GPU can do is getting within 20-25% of the 3080, not 2%, let alone outperform it. And AMD does run better than NVIDIA in this game, which is why the 6800 XT beats the 3080, y'know, the actual card it's supposed to rival, not the lower tier 6800.


Right? Like, how the fuck is it complicated to understand? The PS5 Pro will very likely be in the ballpark of a 4070/3080/7800 XT in rasterization performance in general. The 4080 is still about 50% faster than those guys. Assuming a very well-optimized title on the Pro that doesn't run all that well on PC, then the Pro could reach 4070 Ti level of performance, which is generally around 25% faster than its class of GPU.

The PS5 Pro would need to be as fast as the 4070 Ti out of the gate and then hope for another 25% boost in optimized games to get on the level of a 4080. What are the odds? And that's in raster alone.

How is it so hard for this kid to understand simple maths and scaling? Jesus Christ.


6800 XT (72 CU RDNA 2) is not RDNA 3. RX 7900 XT has 84 CU RDNA 3 which is the closest 1:1 replacement for RX 6900 XT's 80 CU RDNA 2 configuration.

RX 7800 XT has 60 CU RDNA 3.



There's a good RT performance jump between the RX 7900 XT's 84 CU RDNA 3 and the RX 6900 XT's 80 CU RDNA 2.

RTX 3090 Ti FE has 84 SM with clock speeds of about 1.9 GHz. https://www.techpowerup.com/review/nvidia-geforce-rtx-3090-ti-founders-edition/38.html

RTX 4080 FE has 76 SM with clock speeds above 2.7 GHz. https://www.techpowerup.com/review/nvidia-geforce-rtx-4080-founders-edition/40.html My Gigabyte RTX 4080 Gaming OC's one-click overclock hits 2.9 GHz. I do have a TUF RTX 4090 OC, but that's another category on its own, i.e. 128 SM scale.

Based on CU/SM scaling, RX 7900 XTX (96 CU) should land slightly above RTX 4080, but RDNA 3 RT hardware is missing a feature.

For $$$$ and VRAM, RDNA 3 GPUs are value for money. NVIDIA's Ampere SKUs below RTX 3080 Ti (with 12 GB VRAM) have VRAM issues.
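Napkin version of that CU/SM scaling claim (both RDNA 3 CUs and Ada SMs expose 128 FP32 lanes on paper; the boost clocks below are rough assumptions):

    # paper TF = units * 128 lanes * 2 ops/clock (FMA) * GHz / 1000
    xtx_tf = 96 * 128 * 2 * 2.5 / 1000      # RX 7900 XTX: 96 CU @ ~2.5 GHz boost
    rtx4080_tf = 76 * 128 * 2 * 2.75 / 1000 # RTX 4080: 76 SM @ ~2.75 GHz boost
    print(f"7900 XTX ~{xtx_tf:.0f} TF vs 4080 ~{rtx4080_tf:.0f} TF "
          f"-> {xtx_tf / rtx4080_tf:.2f}x on paper")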
 

Mr.Phoenix

Member
For 3D graphics, ALU is nothing without load-store units like TMUs and ROPS. Compute workload doesn't need graphics hardware like TMU and ROPS.

ROPS has a Z-buffer (depth) and color pixel read/write operations. ROPS has other fixed-function hardware e.g. MSAA.

TMU has texture read/write operations. TMU has other fixed-function hardware. e.g. texture filters. Despite unified shaders, graphics I/O units are not unified.

RDNA 3's dual-issue mode doesn't support GCN's legacy wave64 instructions.

GpGPU is not a DSP.
Maybe it's that I am not understanding you, or maybe it's that I just don't agree with you. Either way, I am not getting the point you are making.

Are you trying to say that for graphical workloads, the reason we aren't seeing the benefits of dual-issue compute is that there isn't a relative increase in TMUs and ROPs? Because if that's what you are saying, then I don't agree.

Take the non-dual-issue-capable 2080Ti vs the 3080Ti, for instance. Regardless of the increase in TMUs and ROPs on the 3080Ti, it's still only about 50% better than the 2080Ti, which tracks with the actual non-dual-issue TF difference between the two cards, rather than being over 3x better as the claimed TF numbers would suggest.

And I am fully aware of what TMUs and ROPs do, and I am just not buying that they are somehow the bottleneck preventing us from seeing the claimed FP32 performance gains. If what is being said is that in gaming applications dual-issue compute adds only around 10% of relative performance, then fine, I can't argue with that, mainly because I can't prove it wrong. But that thing is marketed as a literal doubling of GPU compute. That's disingenuous.
 

PaintTinJr

Member
PS5 Pro will use the same Zen2 as PS5 cuz they will strike two birds with one stone (porting Zen2 to N4P for PS5 slim will cost so they will also use it for PS5 Pro)

And that would be another reason why I think the Pro would use two enhanced PS5 APUs: Sony can lock out the enhancements in the new base model via firmware, for performance parity with the original PS5 APU, while remaining flexible for high demand of Pros. One chip design for two products, and the Pro gets all the scalable BOM benefits from the base model, even if the Pro sells as modestly as the previous Pro.
 

PaintTinJr

Member
PS5's GPU RT cores have similar issues to the rest of the RDNA 2 RT cores. PS5 Pro's GPU RT has hardware BVH traversal. The BVH traversal workload has blocking behavior on AMD's current hardware.

GPU's TFLOPS resource remained the same, but the BVH and RT denoise workloads can consume the available shader resources for traditional raster workloads. Both AMD and NVIDIA have thrown 2X FLOPS per CU/SM at the problem.

RDNA 2 RT hardware accelerates ray/box/triangle intersections.

RDNA 3 RT hardware accelerates the DXR flag feature for early subtree culling (reduces traversal iterations), 1.5x more rays in flight (per RT core), and box sorting (reduces traversal iterations).

NVIDIA's RTX RT cores are a feature-complete implementation, with BVH traversal and the DXR flag feature from the start. NVIDIA's Ampere RT cores add a triangle interpolation feature. https://videocardz.com/newz/nvidia-details-geforce-rtx-30-ampere-architecture

Sony and MS wanted NVIDIA's RT innovations but were unwilling to pay for them, hence pushing AMD to be a lower-cost alternative.
Mark Cerny in the Road to PS5 explicitly says otherwise about the PS5 custom geometry engine BVH tests blocking, and as he is the system architect, I'll take that as a solid source on the matter. He also talked about us being in the early days of RT with many changes/evolutions in techniques, and he has patents on the subject himself, so I'd argue he was never going the Nvidia route of fixed turnkey solutions for PlayStation, as it is the exact opposite of everything he has talked about, and AMD's engineering choices - compared to Intel and Nvidia - contradict your point too.

Everyone has chosen the solution they wanted for RT and AI, and everyone is happy with their choices (AMD being happy with the FP64 double-flops setup, probably).

As for the double compute on both AMD and Nvidia, where on these specs is the 4090 getting twice as many (theoretical) FP16 (half) flops per clock as FP32? Unless I'm reading it wrong, the tech specs show no gain on the Nvidia cards.

Comparing and contrasting with the AMD RX 7900XTX, you can clearly see you get twice as many (theoretical) half flops per clock as flops, yes?
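Working the spec-sheet numbers through (a rough sketch; ~2.5 GHz boost clocks assumed for both cards):

    def tf(alus, ghz, packed=1):
        # paper TFLOPS = ALUs * 2 ops/clock (FMA) * GHz * packed-math factor / 1000
        return alus * 2 * ghz * packed / 1000

    # RTX 4090: 16384 FP32 lanes; FP16 runs at the same rate on Ada (no packed-math gain)
    print(f"4090: FP32 ~{tf(16384, 2.5):.0f} TF, FP16 ~{tf(16384, 2.5):.0f} TF (1:1)")
    # RX 7900 XTX: 6144 ALUs, dual issue counted for FP32, packed FP16 doubles it again
    print(f"7900 XTX: FP32 ~{tf(6144 * 2, 2.5):.0f} TF, FP16 ~{tf(6144 * 2, 2.5, packed=2):.0f} TF (2:1)")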
 

Doczu

Member
Man, those third parties are sure keeping their lips tight on this one.
I wonder if there aren't a lot of those kits in the wild, and Sony would have it easy to NDA them to hell.
 

PeteBull

Member
About the 4090 or 7900xtx or any other new/older card, its FP16 flops, FP32 flops: at the end of the day it really doesn't matter. What matters is actual average performance across a plethora of games, or, if you really like some particular game and know you will play it for long months or even years, performance in that specific game.
I'm a tech-savvy dude/nerd/graphics whore and even to me all the new archi/process node stuff doesn't matter for shit, as long as the results are there, aka the gpu/cpu/whole system (be it PC or console) performs at the level I want and am happy with. Then I can pay a proper price for it, hell even a solid premium is perfectly fine, it's worth it :)
 
Two things...

  1. This is based on the assumption that the PS5pro uses 4nm. I believe it will use 5nm, the same as the 7800xt.
  2. Even if it uses 4nm, how much of a power drop are we looking at? The OG PS5 went from 220W on 7nm to 200W on 6nm. That's about a 10% power drop. So even if we gave it a 20% drop compared to the 7800xt on the hypothetical new node, we are still looking at around 200W. And that's just for the GPU.
No matter what, I can't see the PS5pro pulling anything more than 220-240W. And the only way they achieve that is not just by using a 4/5nm node, but by undervolting the APU, which in turn limits how high it can be clocked.

Long story short, I don't see it being clocked as high as the 7800xt. And all this is worse if it's not even on 4nm.
Fat PS5 consumes 230W. With what they learned with their cooling solution for the Slim, I think Sony could make a 250W Pro model and cool it with the same size as the launch model.

I still don't see why they wouldn't use 4nm in 2024. Maybe 5nm will be used on the devkits and 4nm on the consumer model.
 
Fat PS5 consumes 230W. With what they learned with their cooling solution for the Slim, I think Sony could make a 250W Pro model and cool it with the same size as the launch model.

I still don't see why they wouldn't use 4nm in 2024. Maybe 5nm will be used on the devkits and 4nm on the consumer model.
Didn't TH allude to the Pro being the same size as the Vanilla PS5? Might give us some insight into TDP and such.
 
Well, it looks as if Sony might have learned a lesson or two with the ps4pro, because it was the Minecraft studio that spilled the beans before and allowed Microsoft to counter with the One X. I wouldn't be surprised if Sony only sent kits to a handful of studios to control leaks. I don't expect a ton of pro patches this time around because, for the most part, the games are ready to scale to stronger hardware. GDC could be the earliest we hear anything about the pro, and by then Sony would've probably shipped dev kits to everyone. I don't for a second believe that the pro doesn't exist; I just think they learned from the last rollout.
 