At CES 2024, NVIDIA unveiled an array of hardware and software aimed at unlocking the full potential of generative AI on Windows 11 PCs.
Running generative AI locally on a PC is critical for privacy-, latency- and cost-sensitive applications. At CES, NVIDIA is bringing new innovations across the full technology stack to enable the generative AI era on PC. RTX GPUs can run the broadest range of AI applications with the highest performance, and Tensor Cores in these GPUs dramatically speed up AI performance across the most demanding applications for work and play.
NVIDIA introduced the GeForce RTX 40 SUPER Series family of GPUs. The GeForce RTX 4080 SUPER generates AI video over 1.5x faster and images over 1.7x faster than the GeForce RTX 3080 Ti. Tensor Cores on SUPER GPUs deliver up to 836 trillion AI operations per second (TOPS) — bringing transformative AI capabilities to gaming, creation and everyday productivity.
New laptops start shipping later this month from every top Original Equipment Manufacturer (OEM) – including Acer, ASUS, Dell, HP, Lenovo, MSI, Razer, Samsung and more. These systems bring a full set of generative AI capabilities out-of-the-box.
NVIDIA RTX desktops and mobile workstations, powered by the NVIDIA Ada Lovelace architecture, also deliver the performance necessary to meet the challenges of enterprise workflows.
Mobile workstations with RTX GPUs can run NVIDIA AI Enterprise software, such as TensorRT and NVIDIA RAPIDS for simplified, secure generative AI and data science development. A three-year license for NVIDIA AI Enterprise is included with every RTX A800 40 GB GPU, providing a workstation development platform for AI and data science.
AI Workbench, a unified toolkit that allows developers to quickly create, test and customize pretrained generative AI models and large language models (LLMs), will be released in beta later this month. It provides developers with the flexibility to collaborate on and migrate projects to any GPU-enabled environment. It also offers streamlined access to popular repositories like GitHub.
Once AI models are built for PC use cases, they can be optimized to take full advantage of Tensor Cores on RTX GPUs through NVIDIA TensorRT, a library for high-performance AI inference. NVIDIA recently extended TensorRT to text-based applications with TensorRT-LLM, an open-source library for accelerating large language models. The latest update to TensorRT-LLM, available now, adds Phi-2 to the growing list of pre-optimized models for PC, which run up to five times faster than other inference backends.
With these new tools and libraries, PC developers are primed to deliver even more generative AI applications on top of the over 500 AI-powered PC games and applications currently accelerated by RTX GPUs.
At CES, NVIDIA and its developer partners are releasing several new generative AI-powered applications and services, including NVIDIA RTX Remix, a platform for creating RTX remasters of classic games. It enters open beta later this month with generative AI texture tools that transform textures from classic games into modern 4K physically based rendering (PBR) materials.
NVIDIA ACE microservices, which include generative AI speech and animation models, are also being released, enabling developers to add intelligent, dynamic digital avatars to games. With Chat with RTX, an NVIDIA tech demo, AI enthusiasts can easily connect PC LLMs to their own data using a popular technique known as retrieval augmented generation (RAG). It’s accelerated by TensorRT-LLM, enabling users to interact with their notes, documents and other content. It’s also available as an open-source reference project so developers can easily implement the same capabilities into their own applications.
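To illustrate the idea behind RAG, here is a minimal toy sketch: relevant local documents are retrieved for a query, then injected into the prompt handed to the language model. This is not Chat with RTX's actual implementation — the function names (`embed`, `retrieve`, `build_prompt`) are illustrative, and the bag-of-words similarity below is a stand-in for the neural embeddings a real pipeline would use.

```python
# Toy retrieval-augmented generation (RAG) sketch: retrieve the most
# relevant local documents for a query, then prepend them as context to
# the prompt sent to an LLM. Bag-of-words cosine similarity stands in
# for a real embedding model; all names here are illustrative.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # "Embedding" as a word-count vector (stand-in for a neural encoder).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query; keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Inject retrieved context ahead of the user's question; this
    # assembled prompt is what would be passed to the local LLM.
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    notes = [
        "The quarterly report is due on March 15.",
        "Team offsite planning: venue options in Austin.",
    ]
    print(build_prompt("When is the quarterly report due?", notes))
```

The key design point is that the model never has to be retrained on the user's data: retrieval happens at query time, so the LLM simply answers from whatever context is placed in the prompt.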
Find out more about these and other announcements from NVIDIA at CES 2024.
Source: Windows Blog