They're not fake teraflops. Ampere adds FP32 capability to the datapath that was integer-only in Turing, effectively doubling peak FP32 throughput when integer operations are not required.
"In the Turing generation, each of the four SM processing blocks (also called partitions) had two primary datapaths, but only one of the two could process FP32 operations. The other datapath was limited to integer operations.
GA10X includes FP32 processing on both datapaths, doubling the peak processing rate for FP32 operations. One datapath in each partition consists of 16 Ampere GPU Architecture FP32 CUDA Cores capable of executing 16 FP32 operations per clock. Another datapath consists of both 16 FP32 CUDA Cores and 16 INT32 Cores, and is capable of executing either 16 FP32 operations OR 16 INT32 operations per clock. As a result of this new design, each GA10x SM partition is capable of executing either 32 FP32 operations per clock, or 16 FP32 and 16 INT32 operations per clock.
All four SM partitions combined can execute 128 FP32 operations per clock, which is double the FP32 rate of the Turing SM, or 64 FP32 and 64 INT32 operations per clock."
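The per-clock figures in that quote are easy to check. Here is a minimal Python sketch of the arithmetic, assuming only the numbers stated in the passage (4 partitions per SM, 16 lanes per datapath). The function and constant names are made up for illustration and don't come from any NVIDIA tool.

```python
# Per-clock throughput of one GA10x SM, using only the figures in the quote.
PARTITIONS_PER_SM = 4      # "each of the four SM processing blocks"
LANES_PER_DATAPATH = 16    # 16 operations per clock per datapath


def sm_ops_per_clock(shared_datapath_runs_int):
    """Return FP32 and INT32 ops per clock for one SM.

    The first datapath always does FP32. The second (shared) datapath
    does either FP32 or INT32 in a given clock, never both.
    """
    fp32 = LANES_PER_DATAPATH  # dedicated FP32 datapath
    int32 = 0
    if shared_datapath_runs_int:
        int32 += LANES_PER_DATAPATH
    else:
        fp32 += LANES_PER_DATAPATH
    return {
        "fp32": fp32 * PARTITIONS_PER_SM,
        "int32": int32 * PARTITIONS_PER_SM,
    }


pure_fp32 = sm_ops_per_clock(shared_datapath_runs_int=False)
mixed = sm_ops_per_clock(shared_datapath_runs_int=True)
print(pure_fp32)  # 128 FP32, 0 INT32
print(mixed)      # 64 FP32, 64 INT32
```

The second case is the same FP32 rate the quote gives for the Turing SM (half of 128), which is why the doubling only shows up when the shared datapath isn't busy with integer work.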
So if your application only has FP32 operations, peak throughput doubles. The issue is that games require integer operations as well, so the real-world compute boost in games is smaller. Nevertheless, we can see Ampere parts competing with RDNA 2 parts that have ~0.7x the Ampere TF, and we know that RDNA 2 has a 1.25x IPC boost over GCN. So a 1.72 TF Ampere part should be roughly equivalent to a 1.2 TF RDNA 2 part (1.72 x 0.7) or a 1.5 TF GCN part (1.2 x 1.25). In other words, below the PS4 (1.84 TF GCN), but probably roughly in line with the Steam Deck's real-world performance.
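The conversion above, as a quick Python sketch. The two ratios are the rough rules of thumb from this post (~0.7x Ampere-to-RDNA 2, 1.25x RDNA 2-over-GCN), not measured constants, and the function name is just for illustration.

```python
# Rule-of-thumb ratios taken from the post; treat them as assumptions.
AMPERE_TO_RDNA2 = 0.7    # RDNA 2 parts compete with ~0.7x the Ampere TF
RDNA2_OVER_GCN = 1.25    # claimed RDNA 2 IPC boost over GCN


def ampere_equivalents(ampere_tf):
    """Convert an Ampere TF figure to rough RDNA 2 and GCN equivalents."""
    rdna2_tf = ampere_tf * AMPERE_TO_RDNA2
    gcn_tf = rdna2_tf * RDNA2_OVER_GCN
    return rdna2_tf, gcn_tf


rdna2_tf, gcn_tf = ampere_equivalents(1.72)
print(f"RDNA 2 equivalent: {rdna2_tf:.2f} TF")  # about 1.2
print(f"GCN equivalent:    {gcn_tf:.2f} TF")    # about 1.5
```

Swap in a different Ampere TF figure or different ratios to see how sensitive the comparison is to those two assumptions.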