
Switch dataminer from Famiboards suggests the Switch 2's portable GPU clocks will be 561MHz. He also said 1.8GHz for the CPU is "hopium"

Bojji

Member
How far ahead is the Nvidia architecture of this machine vs. the PS4's GCN architecture (from what… 2011-2012)?

I’d say the CPU is probably leagues better than AMD Jaguar cores from the pre-Ryzen era.

Add DLSS… yeah I think we can safely assume this will blow a standard PS4 out of the water without issue.

Ampere is from 2020 so much more modern. It supports mesh shaders, VRS, RT, ML and some other stuff that PS4 does not.

With 12GB of RAM, ports from PS5 will be more common than PS4 ports to Switch (at least that's what I suspect).
 

FireFly

Member
Thanks, I guess, for retelling what I already said? Did you somehow miss the part where I said "dual issue compatibility may increase the real world performance roughly 20-30%"?
No, I saw that but I thought you meant GCN at 860 Gigaflops + 20-30%, so a 1.1 TF PS4/Xbox One. If you meant 1.1 TF RDNA 2/1.375 TF GCN, then there is no real disagreement about the performance.
 

Zacfoldor

Member
IF TRUE, that would be a huge win.

Think about PS4 and what the games looked like on it. Think of Arkham Knight. Now add DLSS.

If you think about it compared to the current Switch, it is beyond a generational leap.

You just know Nintendo is going to cook on this thing. Plus it's BC too.
 
Seems right on par with what I was thinking. To be fair, this is fantastic for Nintendo; we are going to see some amazing first party games that push this console pretty far. As for third party games, we are going to see a lot of re-releases and definitive editions as well as a lot of work from indie devs, but still not much in the sense of new games.
 

Zathalus

Member
That's 1.72 dual issue "fake" teraflops; for comparison the ROG Ally and all its Z1E competitors are 8.6TF by that metric. The single issue metric all the other consoles and Switch 1 use would place it around 860 gigaflops; the dual issue compatibility may increase the real world performance roughly 20-30%. 561MHz in portable is questionable for Samsung 8nm with the tiny battery compartment we've seen from the CAD files.



Steam Deck's 1.6TF is single issue, so Switch 2 portable at these clocks is roughly half the power, 0.86 single issue teraflops. The OP is quoting "fake" double counted flops, simultaneous FP16 and FP32 support. In practice dual issue only adds 15-30%, varying wildly by game: some games basically nothing, some games 30%. The ROG Ally is nearly 9TF by this metric, or faster than the Switch 3 in 2033. This is their most generationally outdated and underpowered console in raw processing since the OG Wii, and is genuinely the old Switch Pro they rebranded as Switch 2 due to ballooning wafer shortages and prices. 1GHz docked mode would be 1.5TF by all the other consoles' single issue TF ratings, roughly between base Xbox One and base PS4. Supporting dual issue and the "newer" circa 2019/2020 Nvidia cores should push docked mode closer to base PS4, with DLSS likely upscaling a native 900p-1080p render resolution to 1440p.

Not quite the same between AMD and Nvidia. Nvidia has hardware for dual FP32 pipelines in the CUDA cores. These are not “fake” FP32 metrics at all, but since not everything in each CUDA core was doubled, effective performance does not double but is around ~30% better than the previous single FP32 metric. This approach has a clear benefit: no game needs to be programmed to take advantage of this.

AMD on the other hand uses VOPD, which requires that two separate, specific vector instructions are paired by the compiler. Even AMD's own optimisations show a very meagre boost in performance. Games have to be specifically coded to take advantage of VOPD. None do as far as I know.

For the ROG Ally specifically that metric is a mere fairy tale. Even at 25W the console realistically does around 3.7 teraflops assuming decent CPU load.

Assuming the leaked clocks are correct, for portable mode the GPU would be similar to the Xbox One S in performance, but with all the other advantages that the new architecture brings. Docked mode is a different story: if the leak of 1007MHz is true that certainly puts it above a PS4, but the PS4 Pro still has considerably more power on tap.

Of course if the clocks are lower than what is speculated by this leak then that all goes out the window.
 

bender

What time is it?
So like a Steam Deck except not terrible.

 

Zacfoldor

Member
So around 2.5 TF docked, damn that’s disappointing as fuck.
Wait a sec, first page said the PS4 only had, checks notes, 1.84 TFLOPS.

Are you telling me that Nintendo Switch 2 will have MORE power than PS4 and DLSS too? Can we get Uncharted 4, God of War 2, and Elden Ring? All those games will play better on Switch 2 than PS4 based on the numbers above + DLSS, no? I mean, this is huge. There won't be hardly any games that can't be ported easily.

 

Xdrive05

Member
You can tell Nintendo definitely learned from the Switch's hardware weak points, which is very... uncharacteristic for them? 12GB of fast RAM is at least as important as the good SoC this time around. Switch 1 was crippled by RAM bandwidth first and foremost. That's why TotK and BotW slow down when they do.

Hoping that Nintendo prioritizes FPS this gen. I would rather the next Zelda game be 1080p 60fps than 1200-1440p 30fps just so they can say they "support 4K when docked" or whatever. No doubt DLSS will go a long way to make either approach viable, importantly.
 

Three

Member
IF TRUE, that would be a huge win.

Think about PS4 and what the games looked like on it. Think of Arkham Knight. Now add DLSS.

If you think about it compared to the current Switch, it is beyond a generational leap.

You just know Nintendo is going to cook on this thing. Plus it's BC too.
It's for sure going to be a big leap from the Switch. It will make PS4 ports much easier, and more likely too.
 

Haint

Member
Between 3.1-3.5TF in docked mode according to the leak… How’s that disappointing?

Cause it's not, you're quoting fake double counted teraflops and comparing them to consoles with single counted teraflops. Double counted flops do not translate to 100% gains, they translate to up to 30% gains only in select titles, and some basically none at all. Real world docked mode is basically going to be a base PS4, with DLSS doing a higher quality upscale to 1440p. The PS4 is 12 years old.
 

kevboard

Member
Cause it's not, you're quoting fake double counted teraflops and comparing them to consoles with single counted teraflops. Double counted flops do not translate to 100% gains, they translate to 30% gains only in select titles, and some basically none at all. Real world docked mode is basically going to be a base PS4, with DLSS doing a higher quality upscale to 1440p. The PS4 is 12 years old.

which double counted?

at 1GHz the GPU is a real 3.1 TFLOPS, no dual issue FP32.
with dual issue it's 6 TFLOPS.

the T239 has 1536 CUDA cores. each CUDA core can do 2 operations per clock

so 1536 x 2 x 1000MHz = 3,072,000 MFLOPS = 3.07 TFLOPS
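
The arithmetic in this post can be sketched as a small helper. This is only the theoretical peak formula used in the thread (cores x operations per clock x clock speed), not a measure of real game performance; the function name `peak_fp32_tflops` is made up for illustration, and the 561MHz and 1000MHz clocks are the leaked figures discussed above, not confirmed specs.

```python
def peak_fp32_tflops(cores: int, clock_mhz: float, ops_per_clock: int = 2) -> float:
    """Theoretical peak: cores x ops per clock x clock in MHz gives MFLOPS; divide by 1e6 for TFLOPS."""
    return cores * ops_per_clock * clock_mhz / 1_000_000


# Leaked (unconfirmed) T239 figures from the thread: 1536 CUDA cores.
docked = peak_fp32_tflops(1536, 1000)    # 1GHz docked clock as rounded in the post
portable = peak_fp32_tflops(1536, 561)   # 561MHz portable clock from the thread title

print(f"docked:   {docked:.2f} TFLOPS")
print(f"portable: {portable:.2f} TFLOPS")
```

With these inputs the portable figure comes out at about 1.72, which is the number the opening post calls "dual issue" and this post calls plain FP32; the code only reproduces the multiplication, it does not settle which label is right.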
 

S0ULZB0URNE

Gold Member
Cause it's not, you're quoting fake double counted teraflops and comparing them to consoles with single counted teraflops. Double counted flops do not translate to 100% gains, they translate to up to 30% gains only in select titles, and some basically none at all. Real world docked mode is basically going to be a base PS4, with DLSS doing a higher quality upscale to 1440p. The PS4 is 12 years old.
The speed will go up in docked mode.
 

AGRacing

Member
Ampere is from 2020 so much more modern. It supports mesh shaders, VRS, RT, ML and some other stuff that PS4 does not.

With 12GB of RAM, ports from PS5 will be more common than PS4 ports to Switch (at least that's what I suspect).
Agree. My only point is that comparing Teraflops is even more ridiculous in this particular case than it usually has been to this point.
 

Haint

Member
which double counted?

at 1GHz the GPU is a real 3.1 TFLOPS, no dual issue FP32.
with dual issue it's 6 TFLOPS.

the T239 has 1536 CUDA cores. each CUDA core can do 2 operations per clock (this isn't dual issue btw)

so 1536 x 2 x 1000MHz = 3,072,000 MFLOPS = 3.07 TFLOPS single issue FP32

Your calculation is already counting simultaneous FP16 and FP32, you have to divide by 2, so ~1.5TF. Dual mode and architectural improvements will push real world performance closer to base PS4, but I'm being generous here cause portables always underperform their theoretical paper specs and Switch 2 memory bandwidth is well below base PS4.

The speed will go up in docked mode.

Yes, to 1GHz assuming this information is true. 1GHz docked mode would be 1.5TF if not calculating it by the "fake" simultaneous FP16 and FP32 method. Accounting for that improvement and a newer Nvidia architecture, you're looking at around 2TF real world performance in docked mode, with significant caution directed towards the memory bandwidth, and the reality that mobile SoCs always significantly underperform their theoretical paper specs.
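
The estimate in this post can be reproduced as a rough sketch. It follows this poster's counting (one operation per core per clock, then a percentage uplift), which other posters in the thread dispute; the 15-30% range is the one quoted earlier in the thread, and all variable names are illustrative.

```python
CORES = 1536        # leaked T239 CUDA core count from the thread
DOCKED_MHZ = 1000   # leaked docked clock, rounded to 1GHz as in the post

# This poster's baseline: one operation per core per clock (disputed in the thread).
paper_tflops = CORES * 1 * DOCKED_MHZ / 1_000_000

# Uplift range claimed in the thread for the second FP32 path (15% to 30%).
uplift_range = (0.15, 0.30)
low_estimate, high_estimate = (paper_tflops * (1 + u) for u in uplift_range)

print(f"baseline:     {paper_tflops:.2f} TF")
print(f"with uplift:  {low_estimate:.2f} to {high_estimate:.2f} TF")
```

The upper end lands at roughly 2TF, matching the figure in the post; the result depends entirely on accepting the one-operation-per-clock baseline.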
 

kevboard

Member
Your calculation is already counting simultaneous FP16 and FP32, you have to divide by 2, so ~1.5TF. Dual mode and architectural improvements will push real world performance closer to base PS4, but I'm being generous here cause portables always underperform their theoretical paper specs and Switch 2 memory bandwidth is well below base PS4.

it's not. Nvidia's shaders, like AMD's, can do 2 instructions per clock.
this isn't dual issue, this is how Nvidia's and AMD's FP32 TFLOPs are calculated without dual issue.

just an example, the Xbox One:
768 shaders x 853MHz x 2 instructions per clock = 1,310,208 MFLOPS = 1.31 TFLOPS

the shaders run 2 operations per cycle, even with old architectures like AMD's GCN
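
To put the Xbox One example next to the other machines in this thread using the same formula, here is a minimal sketch. The Xbox One numbers are from the post; the PS4 shader count and clock (1152 at 800MHz) are the commonly published specs and are an assumption here since the thread only gives the 1.84 TFLOPS result; the Switch 2 entry uses the leaked, unconfirmed 1007MHz docked clock.

```python
def peak_tflops(shaders: int, clock_mhz: float, ops_per_clock: int = 2) -> float:
    """Paper-spec FP32 peak in TFLOPS: shaders x ops per clock x MHz / 1e6."""
    return shaders * ops_per_clock * clock_mhz / 1_000_000


# (shader count, clock in MHz)
gpus = {
    "Xbox One": (768, 853),                  # figures from the post above
    "PS4": (1152, 800),                      # assumption: commonly published spec
    "Switch 2 docked (leak)": (1536, 1007),  # leaked, unconfirmed
}

results = {name: peak_tflops(shaders, mhz) for name, (shaders, mhz) in gpus.items()}

# Print from lowest to highest paper figure.
for name, tflops in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:<24} {tflops:.2f} TFLOPS")
```

All three rows use 2 operations per clock, so the comparison is like for like on paper; it says nothing about memory bandwidth or how much of the peak a game actually reaches.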
 

Kataploom

Gold Member
I don't care about non-Nintendo games. 1440p upscaled to 4K seems doable docked?
Give me 1080p upscaled to 1440p and we're gold. TBH that's how I play some heavy games on PC in order to secure a minimum of 60 fps and it's all good; I barely see any problem in IQ (and the ones I do see are mostly because of TAA-driven effects).
 

blacktout

Member
How much available to games? 10? Less? More?

I don't think anyone knows, because nothing has leaked about the OS/firmware. The Switch's OS is super barebones to minimize its resource use, but we don't know if Nintendo will opt for the same path with the Switch 2.

I've seen people estimate 1.5 GB for the OS (so 10.5 for games), but as far as I know they were just blindly speculating.
 

FireFly

Member
which double counted?

at 1GHz the GPU is a real 3.1 TFLOPS, no dual issue FP32.
with dual issue it's 6 TFLOPS.

the T239 has 1536 CUDA cores. each CUDA core can do 2 operations per clock

so 1536 x 2 x 1000MHz = 3,072,000 MFLOPS = 3.07 TFLOPS
The teraflop figure is real for compute applications, but real games will see a smaller boost due to needing integer operations, which are shared with one of the FP32 units.
 

Wolzard

Member
It looks very similar to the MX570.

Ampere Architecture
2048 CUDA cores
FP32 - 4.731 TFLOPS
Memory Bus - 64 bit
Bandwidth - 96.00 GB/s
Base Clock - 832 MHz
Boost Clock - 1155 MHz


It obviously suffers more due to the lack of VRAM, but the performance is somewhat similar to a PS4 Pro.
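
As a quick sanity check on the listed MX570 figures, the sketch below recomputes the FP32 number from the core count and boost clock, and works backwards from the bandwidth and bus width to the per-pin memory data rate. Every input is a number from the list above; the variable names are illustrative and the per-pin rate is a derived value, not something stated in the post.

```python
cuda_cores = 2048
boost_mhz = 1155
bus_width_bits = 64
bandwidth_gb_s = 96.0

# Peak FP32: cores x 2 ops per clock x boost clock (MHz), converted to TFLOPS.
fp32_tflops = cuda_cores * 2 * boost_mhz / 1_000_000

# Bandwidth (GB/s) = bus width (bits) / 8 x per-pin data rate (Gbps),
# so the data rate implied by the listed figures is:
per_pin_gbps = bandwidth_gb_s * 8 / bus_width_bits

print(f"FP32 at boost clock: {fp32_tflops:.3f} TFLOPS")
print(f"implied memory data rate: {per_pin_gbps:.0f} Gbps per pin")
```

The recomputed FP32 value matches the listed 4.731 TFLOPS, which confirms that figure is quoted at the boost clock rather than the base clock.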

 

kevboard

Member
The teraflop figure is real for compute applications, but real games will see a smaller boost due to needing integer operations, which are shared with one of the FP32 units.

well, if you go by that logic then all our TFLOPS numbers are inflated and the PS5 is only a 5.1 TFLOPS system, the Series X 6 TFLOPS, the PS4 900 GFLOPS and so on.

all of these systems' performance is calculated with 2 instructions per clock per shader unit.

either way, these leaked clocks would mean 3.1 TFLOPS or 1.55 TFLOPS... but if you go by the latter you also have to cut the numbers of all other current and last gen consoles in half to compare
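
The point about halving everything can be illustrated with a short sketch: if every system's figure is cut in half, the ratios between systems do not change. The PS4 and Switch 2 numbers are from the thread; the PS5 (10.28) and Series X (12.15) figures are the commonly cited paper specs and are assumptions here, since the post only gives their halved values.

```python
import math

# Paper TFLOPS at 2 ops per clock. PS5 and Series X values are assumed
# (commonly cited specs); PS4 and Switch 2 docked come from the thread.
quoted = {
    "PS5": 10.28,
    "Series X": 12.15,
    "PS4": 1.84,
    "Switch 2 docked (leak)": 3.07,
}

# Counting only 1 op per clock halves every figure.
halved = {name: tflops / 2 for name, tflops in quoted.items()}

for name in quoted:
    print(f"{name:<24} {quoted[name]:>6.2f} -> {halved[name]:.2f}")

# The relative gap between any two systems is the same either way.
ratio_full = quoted["PS5"] / quoted["Switch 2 docked (leak)"]
ratio_half = halved["PS5"] / halved["Switch 2 docked (leak)"]
print(f"PS5 vs Switch 2 docked: {ratio_full:.2f}x (full), {ratio_half:.2f}x (halved)")
```

This only holds if the halving is applied to every system; the disagreement in the thread is over whether the Ampere figure alone should be halved.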
 

Haint

Member
it's not. Nvidia's shaders, like AMD's, can do 2 instructions per clock.
this isn't dual issue, this is how Nvidia's and AMD's FP32 TFLOPs are calculated without dual issue.

just an example, the Xbox One:
768 shaders x 853MHz x 2 instructions per clock = 1,310,208 MFLOPS = 1.31 TFLOPS

the shaders run 2 operations per cycle, even with old architectures like AMD's GCN

The double count is baked into the shader core count with Nvidia starting with Ampere. See: RTX 2080, 2944 cores x 1700MHz x 2, ~10TF vs. the first double counted generation, RTX 3080, 8704 cores x 1700MHz x 2, ~30TF.
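
The two GPU figures in this post can be checked with the same formula. The sketch below also shows what the RTX 3080 number would be if its marketed core count were halved, which is this poster's framing of the "double count"; an earlier post in the thread describes the extra FP32 units as real hardware with a roughly 30% effective gain, so the halved row is one interpretation, not an official figure. The 1700MHz clock is the round number used in the post.

```python
def peak_tflops(cores: int, clock_mhz: float, ops_per_clock: int = 2) -> float:
    """Paper-spec FP32 peak in TFLOPS: cores x ops per clock x MHz / 1e6."""
    return cores * ops_per_clock * clock_mhz / 1_000_000


CLOCK_MHZ = 1700  # round figure used in the post for both cards

rtx_2080 = peak_tflops(2944, CLOCK_MHZ)
rtx_3080 = peak_tflops(8704, CLOCK_MHZ)

# This poster's framing: treat half of the Ampere core count as the
# comparable number of pre-Ampere style cores.
rtx_3080_halved = peak_tflops(8704 // 2, CLOCK_MHZ)

print(f"RTX 2080:                 {rtx_2080:.1f} TF")
print(f"RTX 3080 (as marketed):   {rtx_3080:.1f} TF ({rtx_3080 / rtx_2080:.2f}x)")
print(f"RTX 3080 (count halved):  {rtx_3080_halved:.1f} TF ({rtx_3080_halved / rtx_2080:.2f}x)")
```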
 