My 4K 100 frames! NVIDIA GeForce RTX 4090 Graphics Card Test Report

NVIDIA GeForce RTX 4090

Creative powerhouse and gaming magic: this is the NVIDIA GeForce RTX 4090, the first flagship card officially launched on NVIDIA’s new Ada Lovelace GPU architecture. Beyond the process upgrade and soaring clocks, it brings upgraded Tensor Cores and RT Cores, and the AI frame generation of DLSS 3 opens up a new era of GPU acceleration. It also has dual AV1 encoding engines, so it not only fulfils gamers’ dream of 4K at 100 frames but is also a specialist in 3D rendering, creation, simulation and computing. Let us examine this new generation of cards from the perspective of architecture, specifications, performance and the upgraded experience.

The new flagship of the Ada Lovelace generation: NVIDIA GeForce RTX 4090

The GeForce RTX 4090, the first flagship of the RTX 40 series built on NVIDIA’s new Ada Lovelace microarchitecture, officially goes on sale tomorrow, 10/12, priced at $1599. Two more cards, the GeForce RTX 4080 16GB and 12GB, will launch in November, priced at $1199 and $899.

Ada Lovelace – A Quantum Leap.

This generation, NVIDIA switched to the TSMC 4N custom process and the Ada Lovelace microarchitecture, giving the GPU more SM units and further increasing the number of CUDA cores, Tensor Cores and RT Cores, along with units such as TMUs and ROPs. Coupled with an ultra-high Boost clock of 2.5GHz, this allows the RTX 4090 to be 2-4x faster than the RTX 3090 Ti.

The GeForce RTX 4090 has 16,384 CUDA cores, 512 4th Gen Tensor Cores and 128 3rd Gen RT Cores, a 2.52 GHz Boost clock and 24GB of GDDR6X memory. With super crazy specifications that meet the needs of professional creation, rendering and flagship gaming at the same time, this graphics card is priced at NT$ 56,990.

The GeForce RTX 4080 comes in two specifications: the RTX 4080 16GB has 9728 CUDA cores, 304 Tensor Cores, 76 RT Cores and a 2.51GHz Boost clock, starting from NT$ 42,990; the RTX 4080 12GB has 7680 CUDA cores, 240 Tensor Cores, 60 RT Cores and a 2.61GHz Boost clock, starting from NT$ 31,990.

RTX 4090, RTX 4080 spec sheet.
RTX 4090 will replace RTX 3090 / Ti products, while RTX 4080 will replace RTX 3080 12GB / Ti products.

In terms of US dollar pricing, this generation of RTX 4080 is indeed more expensive. However, according to the performance figures NVIDIA has given so far, the RTX 4090 is 2-4x faster than the RTX 3090 Ti, and the RTX 4080 is 2-4x faster than the RTX 3080 Ti. In terms of performance for the price, the RTX 4090 is therefore definitely better value than the RTX 3090 Ti. And since even the RTX 4080 12GB can compete with the RTX 3090 Ti, the price of this generation of RTX 4080 has risen accordingly.

Finally, how can NVIDIA raise performance by a crazy 2-4x in one generation? In addition to the process improvement and the soaring clocks, the more important factors are the new technologies brought by the Ada Lovelace architecture: SER, the Displaced Micro-Mesh Engine and the Opacity Micromap Engine, as well as the Optical Flow Accelerator and the AI frame generation of DLSS 3. This is NVIDIA’s technical strength at its wildest.

Ada Lovelace Key Technology.

Ada Lovelace: the full AD102 GPU core with process and clock upgrades, advanced ray tracing and dual AV1 encoders

The full AD102 GPU of the Ada Lovelace microarchitecture reaches 76.3 billion transistors, with a total of 12 GPCs, 72 TPCs and 144 SM units, which means a total of 18432 CUDA cores, 576 Tensor Cores and 144 RT Cores. The current flagship of this generation, the RTX 4090, enables only 128 SM units, which leaves room for a future RTX 4090 Ti to surpass it.
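
As a quick sanity check on these figures, the sketch below derives per-SM unit counts from the full-die totals (18432 / 144, 576 / 144, 144 / 144) and applies them to other SM counts. The per-SM constants and the function name are inferred here for illustration, not quoted from an NVIDIA specification.

```python
# Per-SM unit counts inferred from the full AD102 totals quoted in the text
# (18432 CUDA, 576 Tensor, 144 RT over 144 SMs). Assumption: every SM is identical.
CUDA_PER_SM = 18432 // 144    # 128
TENSOR_PER_SM = 576 // 144    # 4
RT_PER_SM = 144 // 144        # 1


def unit_counts(sm_count):
    """Return the CUDA / Tensor / RT core totals for a given number of enabled SMs."""
    return {
        "cuda": sm_count * CUDA_PER_SM,
        "tensor": sm_count * TENSOR_PER_SM,
        "rt": sm_count * RT_PER_SM,
    }


full_ad102 = unit_counts(144)
rtx_4090 = unit_counts(128)
print(full_ad102)
print(rtx_4090)
```

With 128 SMs this reproduces the 16,384 CUDA cores, 512 Tensor Cores and 128 RT Cores listed for the RTX 4090; the same ratios also match the RTX 4080 figures (76 and 60 SMs).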

In addition to the improved SM units, Ada also has high-speed GDDR6X memory, 4th generation Tensor Cores that improve AI inference performance, 3rd generation RT Cores that improve ray tracing, and an 8th generation video encoder that supports AV1 hardware encoding, along with the 2-4x performance upgrade from DLSS 3.

AD102 full body GPU block diagram.

Ada 4th Gen Tensor Cores

The Tensor Core is a high-performance computing core that brings breakthrough performance for the matrix calculations required by deep learning training and inference. It is designed for matrix multiply-and-accumulate operations and plays a very important role in AI and HPC applications.

Compared with the Ampere architecture, Ada delivers 2x the Tensor TFLOPS for FP16, BF16, TF32, INT8 and INT4. It also adds the FP8 Transformer Engine from the Hopper architecture, which can provide 1.3 PetaFLOPS of Tensor Core performance.

Ada Lovelace.

Ada 3rd Gen RT Core and New Ray Tracing Technology

The 3rd Gen RT Core of the Ada architecture delivers 2x faster ray-triangle intersection throughput than the previous Ampere generation. The newly added “Opacity Micromap Engine” gives objects a virtual mesh of micro-triangles that records their opacity, with three states: opaque, transparent or unknown.

Ada’s Opacity Micromap Engine gives objects a virtual mesh that records their opacity state.

If a ray hits a micro-triangle marked opaque, the hit is recorded and returned; if it lands in a transparent area, the intersection is simply ignored; and if the state is unknown, the ray is handed to the SM, where a programmable shader resolves the intersection. The Opacity Micromap Engine evaluates opacity masks, i.e. meshes of uniform micro-triangles that are looked up using the barycentric coordinates of the ray/triangle intersection.

With the Opacity Micromap Engine, geometry alpha-testing can be performed directly in hardware, reducing the alpha-test workload on the shaders. This lets developers describe complex shapes and translucent objects, such as ferns and fences, and ray trace them efficiently with the Ada RT Core.
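
The three-state logic described above can be sketched as follows. This is only an illustrative model of the decision flow, not NVIDIA’s hardware or any real API: the `Opacity` enum, the function name and the `shader_says_hit` callback are all hypothetical.

```python
from enum import Enum


class Opacity(Enum):
    OPAQUE = 1
    TRANSPARENT = 2
    UNKNOWN = 3


def resolve_intersections(states, shader_says_hit):
    """Resolve a batch of ray/micro-triangle intersections.

    states: opacity state of the micro-triangle each ray landed on.
    shader_says_hit: hypothetical stand-in for the programmable shader,
    called only when the state is UNKNOWN.
    Returns (hits, shader_calls).
    """
    hits = 0
    shader_calls = 0
    for index, state in enumerate(states):
        if state is Opacity.OPAQUE:
            hits += 1                 # hit recorded directly, no shader needed
        elif state is Opacity.TRANSPARENT:
            continue                  # intersection ignored, no shader needed
        else:
            shader_calls += 1         # only unknown areas fall back to the shader
            if shader_says_hit(index):
                hits += 1
    return hits, shader_calls


states = [Opacity.OPAQUE, Opacity.TRANSPARENT, Opacity.UNKNOWN,
          Opacity.TRANSPARENT, Opacity.OPAQUE, Opacity.UNKNOWN]
hits, shader_calls = resolve_intersections(states, lambda i: i == 2)
print(hits, shader_calls, "of", len(states), "intersections needed the shader")
```

In this toy batch only 2 of the 6 intersections reach the shader, which is the kind of saving the alpha-test offload is meant to deliver.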

As with the smoke in this game scene, the Opacity Micromap Engine can reduce the complexity of ray tracing.
The left side of the picture above shows the original ray tracing workload: the darker the colour, the more rays are calculated. The smoke is clearly quite translucent, yet it drags down overall performance. The right side shows the workload after applying the Opacity Micromap Engine: the amount of ray tracing calculation is reduced, improving ray tracing performance on complex objects and transparent effects.

In addition, the Ada RT Core adds a “Displaced Micro-Mesh Engine” to cope with the ray tracing workload of increasingly complex geometry and to reduce the memory and storage needed for BVH data. Displaced Micro-Mesh exploits the spatial coherence of geometry to represent complex shapes as a base triangle plus displacement directions.

This avoids generating a large number of BVH structures during ray tracing and makes BVH traversal more efficient, while during rasterization the existing Micro-Mesh LOD can be used to render the original geometry. In other words, the Displaced Micro-Mesh Engine can ray trace highly detailed geometry using only a simple BVH, base triangles and displacement direction maps.

To ray trace the crab shell in detail, the surface would have to be subdivided into 1024 triangles and a BVH used to calculate the lighting at each position, which generates a very large amount of BVH data and costs performance.
Ada’s Displaced Micro-Mesh Engine instead represents the complex surface as 1 triangle plus a displacement direction map, so only a simple BVH is needed, and the displacement direction map is used to calculate the lighting at each location.
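
To make the “base triangle plus displacement” idea concrete, here is a minimal sketch of how one surface point could be reconstructed. The function names, the barycentric interpolation of directions and the scalar displacement amount are assumptions made for illustration; the text does not describe NVIDIA’s actual encoding.

```python
def lerp_barycentric(a, b, c, weights):
    """Interpolate three 3D vectors with barycentric weights (w0 + w1 + w2 = 1)."""
    w0, w1, w2 = weights
    return tuple(w0 * a[i] + w1 * b[i] + w2 * c[i] for i in range(3))


def displaced_point(base_triangle, directions, weights, amount):
    """Reconstruct a detailed surface point from a coarse base triangle.

    base_triangle: three vertex positions of the base triangle.
    directions: one displacement direction per vertex.
    weights: barycentric position of the micro-vertex inside the triangle.
    amount: scalar read from the (hypothetical) displacement map.
    """
    position = lerp_barycentric(*base_triangle, weights)
    direction = lerp_barycentric(*directions, weights)
    return tuple(position[i] + amount * direction[i] for i in range(3))


triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
up = ((0.0, 0.0, 1.0),) * 3   # every vertex displaces straight up
print(displaced_point(triangle, up, (0.25, 0.25, 0.5), 0.5))
```

Only the base triangle needs to live in the BVH in this scheme; the fine detail is recovered from the displacement values when needed.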

This generation of Ada adds a new “Shader Execution Reordering” (SER) function, which dynamically reorders shader work during ray tracing for better execution efficiency.

Put simply, when a scene is ray traced, the primary rays are cast first to find the objects they hit; the reflections and ambient diffusion produced by those hits then spawn secondary rays. The secondary rays, however, scatter in no particular order, and this disorder leads to poor shader performance when processing them.

In short, SER optimizes the execution of ray tracing shaders, making the work more efficient.

With Shader Execution Reordering added to the ray tracing pipeline, hits from secondary rays that invoke the same shader can be reordered and grouped together, so the ray tracing shaders run more efficiently. SER can provide a 2x improvement in RT shader performance, and Cyberpunk 2077 running in Overdrive Mode gains 44% in performance from SER.
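
The grouping idea can be illustrated with a toy model: sort the secondary-ray hits by the shader they need, then count how often the shader changes between consecutive hits. The hit records and both function names are hypothetical; real SER is a hardware scheduling feature and works differently in detail.

```python
def reorder_by_shader(hits):
    """Group hits that need the same shader next to each other (stable sort)."""
    return sorted(hits, key=lambda hit: hit["shader"])


def count_shader_switches(hits):
    """Count how many times consecutive hits need a different shader."""
    return sum(1 for a, b in zip(hits, hits[1:]) if a["shader"] != b["shader"])


# Secondary rays arrive in a scattered order, as described above.
secondary_hits = [
    {"ray": 0, "shader": "glass"},
    {"ray": 1, "shader": "metal"},
    {"ray": 2, "shader": "glass"},
    {"ray": 3, "shader": "skin"},
    {"ray": 4, "shader": "metal"},
    {"ray": 5, "shader": "glass"},
]

reordered = reorder_by_shader(secondary_hits)
print(count_shader_switches(secondary_hits), "switches before reordering")
print(count_shader_switches(reordered), "switches after reordering")
```

Fewer switches means neighbouring work items run the same shader code, which is the coherence that lets the GPU execute them efficiently.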

This diagram explains how SER works.
The primary rays are processed in an orderly sequence, but the secondary rays include reflection, refraction and diffusion, so their disorder prevents the shaders from reaching their best performance. SER optimizes this and greatly improves performance.

DLSS 3 and Optical Flow Accelerator in the Great Acceleration Era

As games gain richer objects, more complex geometry and beautiful worlds, and stack on technologies such as physically realistic ray tracing, traditional GPU rendering performance can no longer keep up with the needs of contemporary games. NVIDIA was the first to develop DLSS, an acceleration technology based on AI deep learning, which also prompted other GPU manufacturers to launch their own acceleration technologies and officially ushered in the era of GPU acceleration.

When “Battlefield V” introduced ray tracing in 2018, it performed only 39 ray tracing calculations per pixel; 4 years later, “Cyberpunk 2077” reaches 635 ray tracing calculations per pixel, a huge change.

“DLSS 3”, exclusive to the RTX 40 series, builds on DLSS 2 and adds the concept of AI frame generation, which relies on the Optical Flow Accelerator. Optical flow is a computer vision method that calculates the direction and amount of movement of each pixel across consecutive images.

DLSS 3 requires the game engine to provide lower-resolution rendered images and motion vectors. The DLSS deep learning network infers high-resolution images from them; the images are then passed to the Optical Flow Accelerator, which calculates the direction and amount of movement of each pixel; finally, Optical Multi Frame Generation produces the AI-generated frame.

DLSS 3 uses the Optical Flow Accelerator to calculate the optical flow direction and vectors of the pixels in the picture, and uses the AI of Optical Multi Frame Generation to infer the image of Frame 2, which in plain terms is an AI-generated frame.
Why does AI frame generation need the Optical Flow Accelerator?
The main reason is that the motion of the objects in the picture above is known from the game engine’s motion vectors, but the shadow on the ground is not an object, so its motion vector is missing, which causes problems when generating the frame.
Combining the engine’s motion vectors with the optical flow’s pixel vectors produces more stable AI-generated frames.

When a game enables DLSS 3 with Frame Generation and uses Reflex to reduce latency, 1/4 of the pixels of Frame 1 are rendered by the game and the remaining 3/4 are inferred by DLSS Super Resolution. The next picture, Frame 2, is generated entirely by DLSS Frame Generation, so in total 7/8 of the pixels of Frame 1 + Frame 2 are generated by the AI of DLSS 3.

For Frame 1, the engine renders a low-resolution image (1/4 of the pixels) and DLSS generates the remaining 3/4. Frame 2 is drawn entirely by DLSS Frame Generation, so in total 7/8 of the pixels of the 2 frames come from DLSS AI inference.
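
The 7/8 figure follows from simple arithmetic, which the short sketch below reproduces with exact fractions. The function name and parameters are illustrative only; the 1/4 rendered share is the figure used in the text.

```python
from fractions import Fraction


def ai_generated_share(rendered_share=Fraction(1, 4), generated_frames=1):
    """Share of all displayed pixels that come from AI inference.

    rendered_share: fraction of each rendered frame's pixels drawn by the game
    (the rest is inferred by Super Resolution).
    generated_frames: fully AI-generated frames inserted per rendered frame.
    """
    total_frames = 1 + generated_frames
    engine_pixels = rendered_share          # only part of Frame 1 is rendered
    return 1 - engine_pixels / total_frames


print(ai_generated_share())                       # Super Resolution + Frame Generation
print(ai_generated_share(generated_frames=0))     # Super Resolution only
```

With Frame Generation the result is 7/8; with Super Resolution alone it is the 3/4 quoted for Frame 1.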

Through AI frame generation, DLSS 3 can provide a 2-4x increase in game performance while maintaining image quality similar to native rendering, but it also increases overall game latency. NVIDIA therefore requires DLSS 3 to include Reflex, which eliminates the render queue so that the GPU takes over rendering as soon as the CPU has finished its work, achieving lower system latency.

DLSS 3 therefore combines AI Super Resolution, Frame Generation and Reflex, relying on the 4th generation Tensor Cores, the Optical Flow Accelerator and the supercomputer NVIDIA uses to train its AI, to deliver the ultimate 4K 100fps performance for next-generation gamers.

DLSS 3 Full Stack.

DLSS 3 requires Ada’s Optical Flow Accelerator hardware for Frame Generation to achieve the desired performance improvement, so DLSS 3 is currently exclusive to the RTX 40 series. Games that support DLSS 3 will also remain compatible with DLSS 2, i.e. DLSS Super Resolution, and NVIDIA Reflex is supported on GTX 900 series cards and above.

DLSS 3 = Super Resolution + Frame Generation + Reflex. 
The original DLSS 2 only needs Super Resolution.

Dual AV1 video encoders; Portal with RTX launches in November

In addition to the upgrades above, Ada Lovelace is equipped with dual 8th generation NVENC encoding engines, whose main addition is AV1 video encoding. For decoding, it uses the same 5th generation NVDEC as Ampere; after all, NVDEC already supports decoding of all kinds of video formats.

Ada Lovelace.

The RTX 40 series is equipped with dual 8th generation NVENC encoding engines, whose main addition is AV1 video encoding. AV1 is set to become the mainstream codec for video streaming and has a better signal-to-noise ratio than H.264: at the same bit rate, an AV1 picture looks better than an H.264 one.

AV1 encoding offers better picture quality and performance, and will be the main codec for future streaming.
AV1 on the left, H.264 on the right, at the same 8Mbps bit rate: the details of the floor are quite different (4K SBS comparison).
AV1 on the left, H.264 on the right, at the same 8Mbps bit rate: the difference in the texture of the road is clearly visible (4K SBS comparison).

To obtain the 2x increase in video export performance from the dual 8th generation NVENC encoding engines, the video editing software must support them. DaVinci Resolve, Voukoder and Jianying will be the first to support the RTX 40 dual encoding engines; Adobe Premiere Pro will have to wait for a future update.

For example, when recording 8K60 video, the two encoders can work together, each handling a 7680 x 2160 portion of the frame, for better performance.
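
The split described above amounts to dividing an 8K (7680 x 4320) frame into two 7680 x 2160 regions. The sketch below models it as equal horizontal strips; that layout and the function name are assumptions for illustration, since the text only gives the per-encoder resolution.

```python
def split_frame(width, height, encoders=2):
    """Divide a frame into equal horizontal strips, one per encoder.

    Returns a list of (width, strip_height, y_offset) tuples.
    Assumes the height divides evenly across the encoders.
    """
    if height % encoders != 0:
        raise ValueError("frame height must divide evenly across encoders")
    strip_height = height // encoders
    return [(width, strip_height, index * strip_height) for index in range(encoders)]


for width, strip_height, y_offset in split_frame(7680, 4320):
    print(f"encoder strip: {width} x {strip_height} starting at row {y_offset}")
```

Each encoder then handles half the pixels of every frame, which is where the roughly 2x export speed-up comes from when the software supports it.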

Dual encoding engine.

In addition to AV1 encoding and the dual encoding engines, NVIDIA Omniverse also supports the new DLSS 3 technology, and brings RTX Remix, a god-tier tool for making game mods.

For players, the classic game Portal is getting Portal with RTX as a free DLC in November. It supports GPUs compatible with Vulkan ray tracing, though of course the best experience comes from an RTX 40 card with DLSS 3.

Ada’s creative focus.
Portal With RTX will release free DLC in November; RTX Remix will be released soon.

NVIDIA GeForce RTX 4090 Founders Edition unboxing: the “back is the front” classic, refined again

The NVIDIA GeForce RTX 4090 Founders Edition, the top card of the Ada Lovelace generation, keeps the “back is the front” and “less but better” graphics card aesthetic that GeForce pioneered in the Ampere generation. The Founders Edition features a sturdy, durable aluminium alloy X-Frame with an anodized, premium, golden-toned metallic finish.

Media sample of the NVIDIA RTX 4090 Founders Edition.
Special design inside the box.

The inside of the frame is filled with cooling fins. A vapour chamber draws heat from the GPU and VRAM, and heat pipes then carry the waste heat to the fins. This generation of the RTX 4090 Founders Edition uses two larger 116mm, 7-blade FDB fans; the card’s thickness increases to 3 slots while its length is reduced to 30.48cm (12 inches).

The vapour chamber has also been optimized this generation, with dedicated cutouts for the memory so that the vapour chamber makes more even contact with the GPU, and the memory thermal pads have been thinned to 1.5mm for better heat conduction. This generation’s cooler supports up to 650W of Qmax cooling capacity.

The back is the classic front: the RTX 4090 with its flow-through cooling airflow at the front of the card.
The original front is likewise filled with the metal frame, cooling fins and the rear fan, for a unique aesthetic design.

The RTX 4090 switches entirely to the PCIe 12+4 pin (12VHPWR) power connector, which can deliver up to 600W through a single cable and makes the cabling tidier when the new card is installed. The Founders Edition also includes a 12VHPWR to 4x PCIe 6+2-pin adapter cable.

It is generally recommended to connect at least 3 PCIe 6+2-pin cables to the adapter. If you are buying a new power supply, choose one that conforms to the ATX12V v3.0 and EPS12V v2.92 specifications. That way you can put away the ugly adapter: a single 12VHPWR cable provides all the power the graphics card needs.

RTX 4090 uses PCIe 12+4 Pin (12VHPWR) for power supply.
The included 12VHPWR to 4x PCIe 6+2-pin adapter cable.
A new-specification power supply can meet the power needs of the RTX 4090 with only one cable.

The RTX 4090’s display outputs consist of 1 HDMI 2.1a port supporting VRR and 4K120Hz / 8K60Hz HDR, and 3 DisplayPort 1.4a ports with DSC supporting 12-bit 4K240Hz HDR / 12-bit 8K60Hz HDR, and up to 4 screens can be connected at the same time.

RTX 4090 display output.

NVIDIA GeForce RTX 4090 creative video output, GPU rendering performance test

This review includes creative tests such as Adobe Premiere Pro 2022, DaVinci Resolve 18 and Blender. Games are tested at 2160p and 1440p with all effects at maximum, covering e-sports titles, AAA games and ray tracing games, plus additional DLSS 3 preview testing, so that players can fully understand why the RTX 4090 is so powerful (and so expensive).

The comparison card is the NVIDIA GeForce RTX 3090 Founders Edition.

Test Platform
Processor: Intel Core i9-12900K
Motherboard: ASRock Z690 PG Velocita
Graphics Card: NVIDIA GeForce RTX 4090 Founders Edition, NVIDIA GeForce RTX 3090 Founders Edition
System Disk: Solidigm P41 Plus 1TB PCIe 4.0 SSD
Cooler: ASUS ROG STRIX LC II 280mm
Power Supply: Seasonic PRIME PX-1000
Operating System: Windows 11 Pro 21H2 64bit, Resizable BAR On
Driver Version: NVIDIA 521.90

GPU-Z shows the NVIDIA GeForce RTX 4090’s information: an AD102 GPU on a 4nm process, 16384 CUDA cores and 24576 MB of GDDR6X (Micron) memory, with a default GPU clock of 2235 MHz and a Boost clock of 2520 MHz.

DXVA Checker decoder test: all current video codecs are supported for decoding at various resolutions.

DaVinci Resolve 18 is a fully GPU-accelerated video editing program with powerful colour correction and special effects functions. It uses the CUDA cores directly for computation, so playback and export of video clips perform very well. The beta version includes support for NVIDIA AV1 encoding.

DaVinci Resolve 18.

This test is divided into two parts. The first test project uses 4K Blackmagic RAW footage with several timelines. Wedding_Heavy_Styles uses a lot of Resolve effects, such as OFX: Light Rays / Glow / Sketch, to output a heavily stylized film look.

Bride_FaceRefine_Selective_Color uses Face Refinement for face tracking and highlights the bride with colour; 50% Retime and Optical Flow – Enhanced Better both use Optical Flow technology to slow the footage down by 50%.

SuperScale2x 4K Source uses 4K ProRes source video to produce 4K output with a 2x zoom on the subject; SuperScale4x HD_Source uses HD H.264 source video and Resolve Super Scale to output 4K video.

The RTX 4090 performs outstandingly in this part, especially in the Optical Flow tests, where it is nearly 2x faster, giving creators much quicker exports.

DaVinci Resolve 18, the less time the better.

The second test is an AV1 and HEVC encoding test using dual NVENC encoding. The test project is a 44-second clip from the Blender Open Movie Project “Tears of Steel”, provided as 8K ProRes 422 HQ 30FPS and 4K ProRes 422 HQ 30FPS video, and is used to test export performance for HEVC and AV1 encoding.

The output settings use the NVIDIA encoder with Quality: Restrict to 80000 Kb/s, Encoding Profile: Main, Rate Control: Constant Bitrate, Preset: Faster, Tuning: High Quality and Two Pass: Disable.

When exporting 4K30, the RTX 4090 performs much the same as the RTX 3090, but for 8K export the RTX 4090’s dual encoding engines make HEVC export 2x faster, and AV1 encoding is also quite fast. As long as the video editing software supports the RTX 40 dual encoding engines, encoding performance improves substantially.

DaVinci Resolve 18 dual NVENC encoding test, the shorter the better.

Adobe Premiere Pro 2022 uses Adobe’s own Mercury Playback Engine GPU acceleration and can use the GPU’s encoding engine to speed up export. Test project 1 is our own 1080p60 unboxing video; the BigMix4K project combines 3 FinalAdjusted_MPE 1920×1080 clips into a 4K timeline for H.264 and HEVC export.

(The tested Premiere Pro 2022 does not yet support the RTX 4090 dual-encoding feature.)

The RTX 4090 still exports faster than the RTX 3090, but without the dramatic reduction in export time seen in DaVinci Resolve, which supports dual encoding. This test is therefore provided mainly for reference.

Adobe Premiere Pro 2022.
Adobe Premiere Pro 2022 output, the shorter the better.

Blender is a cross-platform, open-source 3D authoring tool that supports various 3D tasks: Modeling, Rigging, Animation, Simulation, Rendering, Compositing, and Motion Tracking. For this test, Blender Benchmark 3.3.0 is used to render the demo projects.

In the Blender Benchmark 3.3.0 test, the RTX 4090’s samples per minute in all 3 scenes are about twice those of the RTX 3090, showing Ada Lovelace’s strength in 3D creation.

Blender, the higher the performance, the better.

V-Ray Benchmark is developed by Chaos Group. V-Ray is a physically based ray tracing renderer, and the benchmark can test ray traced rendering on the CPU and the GPU separately.

In V-Ray, whether using GPU RTX or CUDA, the RTX 4090 beats the RTX 3090 with about 1.9x the number of vpaths.

V-Ray Benchmark, higher performance is better.

SPECviewperf 2020 is a standard graphics performance benchmark built from professional applications. It tests rendering and engineering simulation workloads from professional graphics software such as 3ds Max, Catia, Creo, Energy, Maya, Medical, SNX and SolidWorks.

The test runs at 1920 x 1080 resolution and the results are in FPS. Performance depends on the application: the RTX 4090 is about 1x to 2.9x as fast as the RTX 3090, depending on the program and workload.

SPECviewperf 2020, a higher FPS is better.

NVIDIA GeForce RTX 4090 – 3DMark benchmark performance test

3DMark Fire Strike performance test is the mainstream DirectX 11 API test scenario, testing the performance of 1080p, Extreme 1440p and Ultra 2160p respectively.

The RTX 4090 scored 54174 in Fire Strike. Its Ultra graphics score is 2x that of the RTX 3090, with Extreme at 1.8x and FHD at 1.6x.

3DMark Fire Strike, the higher the score, the better.

3DMark Time Spy is a test scenario built on the DirectX 12 API, also targeting AAA game workloads, and tests performance at 1440p and Extreme 2160p respectively.

The RTX 4090 achieved a total score of 32638 points in Time Spy; in Time Spy and Time Spy Extreme it is 1.8x and 1.9x as fast as the RTX 3090 respectively.

3DMark Time Spy, the higher the score, the better.

For ray tracing tests, 3DMark Port Royal adds ray tracing to AAA-style game scenes to test how well a new generation of GPUs accelerates hardware ray tracing, while the DXR test is a feature test using the DirectX Raytracing API.

Even without DLSS, the RTX 4090 delivers quite amazing ray tracing performance: 119 FPS in Port Royal and 138 FPS in the DXR test, which is 1.95x and 2.44x the performance of the RTX 3090.

3DMark Port Royal, the higher the better.

The 3DMark DLSS Feature Test can measure both DLSS 3 and DLSS 2. At 3840 x 2160 in Performance mode, the RTX 4090 reaches 138 FPS, a 2.3x improvement, with DLSS 2. With DLSS 3 and its AI-generated frames, it reaches 193 FPS, a 3.3x improvement.

Later, we will use games to measure the performance of DLSS 3.

3DMark DLSS Feature Test, the higher the better.

NVIDIA GeForce RTX 4090 – 4 e-sports games performance test

The 4 e-sports games are “Rainbow Six: Siege”, “League of Legends”, “Apex Legends” and “CS:GO”: skill-heavy, team-based tactical shooters and a DOTA-style game. Such games are usually played with low picture quality and detail at an average of more than 100 frames. Here they are tested at 2160p and 1440p with effects at the highest settings.

For e-sports games, the performance gain from the RTX 4090 is not obvious; after all, the RTX 3090 already delivers very strong e-sports performance at 2160p. In the test, only “Rainbow Six: Siege” shows a clear improvement, while the other three are relatively close.

2160p e-sports game test, the higher the FPS, the better.
1440p e-sports game test, the higher the FPS, the better.

NVIDIA GeForce RTX 4090 – 11 games performance test

The 11 AAA games are likewise tested at 2160p and 1440p with all effects at maximum. Only F1 uses ray tracing in this test; the remaining games run without ray tracing or DLSS acceleration, to measure the GPU’s actual traditional rendering performance.

The game list includes the entry-level racing games “F1 2021” and “Forza Horizon 5”; “Shadow of the Tomb Raider”, the cinematic game “Death Stranding”, “Gears 5”, “The Division 2” and “Horizon Zero Dawn”; as well as performance-heavy titles such as “Borderlands 3”, “Assassin’s Creed Valhalla”, “Red Dead Redemption 2” and “God of War”.

At 2160p, the RTX 4090 shows a substantial improvement in AAA games, averaging 148 FPS across the 11 games, about 1.7x the RTX 3090’s average of 89 FPS.

At 1440p, the RTX 4090 averages 200 FPS and the RTX 3090 averages 143 FPS, about a 1.4x improvement.

At the main 2160p resolution, then, the RTX 4090 gives players about 1.7x the performance without relying on DLSS acceleration; expressed as a percentage, that is an average upgrade of 66%.
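
The relationship between the “x” multipliers and the percentage figures is easy to check from the average FPS numbers quoted above. The helper names below are made up for this sketch; the FPS values are the averages from this test.

```python
def speedup(new_fps, old_fps):
    """Performance multiplier of the new card over the old one."""
    return new_fps / old_fps


def percent_gain(new_fps, old_fps):
    """The same comparison expressed as a percentage increase."""
    return (new_fps / old_fps - 1) * 100


# Average FPS of the 11 AAA games: (RTX 4090, RTX 3090)
averages = {"2160p": (148, 89), "1440p": (200, 143)}

for resolution, (rtx_4090, rtx_3090) in averages.items():
    print(f"{resolution}: {speedup(rtx_4090, rtx_3090):.1f}x, "
          f"+{percent_gain(rtx_4090, rtx_3090):.0f}%")
```

For 2160p this gives 1.7x and +66%, matching the figures in the text; for 1440p it gives 1.4x, or roughly +40%.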

2160p AAA game test, higher FPS is better.
1440p AAA gaming test, higher FPS is better.

NVIDIA GeForce RTX 4090 – 9 ray tracing games tested

9 DXR ray tracing games were tested: the popular “Cyberpunk 2077”, “Control”, “Watch Dogs: Legion”, “Metro Exodus”, “Marvel’s Spider-Man Remastered”, “Marvel’s Guardians of the Galaxy”, “Ghostwire: Tokyo”, “Far Cry 6” and “Resident Evil Village”. They are tested at 2160p and 1440p with effects and ray tracing at the highest settings and DLSS acceleration enabled; please refer to the chart for detailed settings.

With DLSS 2, the RTX 4090 reaches an average of 110 FPS in “Cyberpunk 2077”, far above the RTX 3090’s average of 60 FPS; in ray tracing games such as “Control” and “Ghostwire: Tokyo” you can likewise feel the powerful ray tracing upgrade the RTX 4090 brings.

Across the 9 ray tracing games at 2160p, the RTX 4090 averages 132.3 FPS against the RTX 3090’s average of 82 FPS, about 1.6x the ray tracing game performance, or an average upgrade of 65%.

At 1440p, the RTX 4090 averages 169 FPS and the RTX 3090 averages 119.8 FPS, about 1.4x, or a 41% performance improvement.

2160p ray tracing game test, the higher the FPS, the better.
1440p ray tracing game test, the higher the FPS, the better.

NVIDIA GeForce RTX 4090 – DLSS 3 performance test

During the test period, NVIDIA provided pre-release builds, mainly to give the media a glimpse of the performance improvement DLSS 3 brings. The tested titles are Microsoft Flight Simulator, A Plague Tale: Requiem, Unreal Engine 5: Lyra, F1® 22, Unity Enemies, Cyberpunk 2077 and Justice Online, all at 2160p resolution with the highest ray tracing settings.

In the settings of DLSS 3 games there are separate options for “Super Resolution” and “Frame Generation”. Both must be enabled at the same time to use DLSS 3; players with RTX 30 / 20 series cards can enable only Super Resolution, not Frame Generation.

Microsoft Flight Simulator DLSS 3 settings.
The DLSS 3 settings of “Cyberpunk 2077”.

Accelerated by DLSS 3, the RTX 4090 averages 140 FPS in “Cyberpunk 2077”, about a 3.5x improvement; the Enemies cinematic demo from the Unity engine, rendered with real-time ray tracing, reaches 103 FPS with DLSS 3, about a 3.68x upgrade.

With DLSS 3 in Performance mode, the RTX 4090 achieves about 1.9x to 4.7x the performance, averaging about 2.95x, which is in line with the 2-4x NVIDIA quoted at launch.

However, RTX 40 series players only benefit once games support DLSS 3. The RTX 4090 is just the starting point for DLSS 3 support; when mid-range and entry-level cards such as the RTX 4060 come out in the future, players will get an even more meaningful performance boost.

DLSS 3 game performance test, the higher the better.

NVIDIA GeForce RTX 4090 power consumption and temperature measurement

The power consumption and temperature of the graphics card are tested with the Time Spy Stress Test and “Cyberpunk 2077”. Power is measured with the PCAT tool provided by NVIDIA, which monitors the wattage delivered through the PCIe slot and the 12V power connectors.

In terms of temperature, the RTX 4090 Founders Edition peaked at 67.8°C in the stress test and was slightly lower at 65°C while playing Cyberpunk 2077. The RTX 3090 Founders Edition used for comparison has had its thermal pads replaced, so its temperatures are comparable.

RTX 4090 Founders Edition GPU temperature.

In the TBP power test, the RTX 4090 drew an average of 390W and an instantaneous peak of 462W in the Time Spy Stress Test, and an average of 358W with a peak of 398W in Cyberpunk 2077.

Under Furmark Xtreme burn-in, the RTX 4090 reaches an average power draw of 458W, which is also the power limit preset by NVIDIA. Overclocked custom cards from board partners should land at around 500W as well, unless they use two 12VHPWR connectors.

RTX 4090 Founders Edition GPU power consumption.
Furmark Xtreme burn-in comes to 458W on average and 482W instantaneously.


The NVIDIA GeForce RTX 4090 once again surpasses its predecessor on sheer strength. Dual NVENC acceleration gives DaVinci Resolve 18 faster video export, and the new AV1 encoding support will become standard across the RTX 40 series. In 3D creation, Blender and V-Ray see nearly 2x performance upgrades. It is undoubtedly the strongest creative weapon.

In games, it once again dominates at 2160p 4K in both AAA and ray tracing titles: AAA games improve by an average of 1.7x and ray tracing games by an average of 1.6x. Once DLSS 3 AI frame generation becomes widespread, it can bring an average 2.95x acceleration, fulfilling players’ dream of 4K at 100fps. It is undoubtedly a cheat-level gaming weapon.

DLSS 3 is currently supported by up to 35 games (including applications), but each title has its own release and update schedule, so it will take a while for the new technology to become widespread.


This generation brings a nearly 2x performance upgrade, and the RTX 4080 can be expected to perform brilliantly as well. With that roughly 2x improvement, the RTX 4090 is better value than the RTX 3090, but the price of the RTX 4080 has risen accordingly. Under these circumstances, how much budget do players have to chase this ultimate performance? In particular, whether the future mainstream RTX 4060 can meet players’ expectations at an attractive price depends on how NVIDIA does its sums.

The RTX 4090 goes on sale at 9:00 p.m. tomorrow, 10/12, with a suggested price in Taiwan starting at NT$ 56,990. Whether the Founders Edition will be sold in Taiwan is left for NVIDIA Taiwan to announce; the RTX 4080 will follow in November. RTX 40 laptop GPUs can be expected next year, though the first flagship models should be quite expensive. Players interested in gaming laptops can watch for news from next year’s CES.

4K 100fps is not a dream, because I am an NVIDIA GeForce RTX 4090.
