Today, Windows developers can leverage PyTorch to run inference on the latest models across the breadth of GPUs in the Windows ecosystem, thanks to DirectML. We’ve updated Torch-DirectML to use DirectML 1.13 for acceleration and to support PyTorch 2.2. PyTorch with DirectML simplifies setup to a single package install, making it easy to try out AI-powered experiences and to scale AI to your customers across Windows.
To see these updates in action, check out our Build session, “Bring AI experiences to all your Windows Devices.”
Hear from our hardware vendor partners on how they’re making this experience great:
- AMD: AMD is glad PyTorch with DirectML is enabling even more developers to run LLMs locally. Learn more about where else AMD is investing with DirectML.
- Intel: Intel is excited to support Microsoft’s PyTorch with DirectML goals – see our blog to learn more about the full support that’s available today.
- NVIDIA: NVIDIA looks forward to developers using the torch-directml package accelerated by RTX GPUs. Check out all the NVIDIA-related Microsoft Build announcements around RTX AI PCs and their expanded collaboration with Microsoft.
PyTorch with DirectML is easy to use with the latest Generative AI models
PyTorch with DirectML provides an easy-to-use way for developers to try out the latest and greatest AI models on their Windows machine. This update builds on DirectML’s world-class inferencing platform, ensuring a scalable, performant experience across the latest Generative AI models. Our aim in this update is to ensure a seamless experience with relevant Gen AI models, such as Llama 2, Llama 3, Mistral, Phi 2, and Phi 3 Mini, and we’ll expand our coverage even more in the coming months!
The best part is using the latest Torch-DirectML package with your Windows GPU is as simple as running:
pip install torch-directml
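Once the package is installed, a quick way to confirm DirectML sees your GPU is to move a small tensor onto the DirectML device and run an op on it. Here is a minimal sketch using the torch_directml package:

import torch
import torch_directml

dml = torch_directml.device()  # default DirectML device, backed by your primary GPU
x = torch.ones(2, 2).to(dml)   # move a tensor onto the GPU via DirectML
print(x * 2)                   # the multiply executes on the DirectML device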
Once installed, check out our language model sample, which will get you running a language model locally in no time! Start by installing a few requirements and logging into the Hugging Face CLI:
pip install -r requirements.txt
huggingface-cli login
Next, run the following command, which downloads the specified Hugging Face model, optimizes it for DirectML, and runs the model in an interactive chat-based Gradio session!
python app.py --model_repo "microsoft/Phi-3-mini-4k-instruct"
Phi 3 Mini 4K running locally using DirectML through the Gradio Chatbot interface.
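For context, the sample follows a familiar Hugging Face workflow. The stripped-down, hypothetical sketch below illustrates that flow; the real app.py applies DirectML-specific model optimizations and a richer chat UI, so treat this only as an illustration:

import torch
import torch_directml
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer

dml = torch_directml.device()  # DirectML device backed by your Windows GPU

model_repo = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_repo)
# trust_remote_code=True may be required for Phi-3 on some transformers versions
model = AutoModelForCausalLM.from_pretrained(
    model_repo, torch_dtype=torch.float16, trust_remote_code=True
).to(dml)

def chat(message, history):
    # Tokenize the prompt on CPU, then move it to the DirectML device
    inputs = tokenizer(message, return_tensors="pt").to(dml)
    outputs = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, skipping the echoed prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

gr.ChatInterface(chat).launch()  # serves an interactive chat session in the browser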
These latest PyTorch with DirectML samples work across a range of machines and perform best on recent GPUs equipped with the newest drivers. Check out the Supported Models section of the sample for more info on the GPU memory requirements for each model.
This seamless inferencing experience is powered by our close co-engineering relationships with our hardware partners, ensuring you get the most out of your Windows GPU when leveraging DirectML.
Try out PyTorch with DirectML today
Trying out this update is truly as simple as running "pip install torch-directml" in your existing Python environment and following the instructions in one of our samples. For more guidance on getting set up, visit the Enable PyTorch with DirectML on Windows page on Microsoft Learn.
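If your machine has more than one GPU, you can list the adapters DirectML sees and target one explicitly. This short sketch assumes the enumeration helpers available in recent torch-directml releases:

import torch_directml

# Enumerate every DirectML-capable adapter on the machine
for i in range(torch_directml.device_count()):
    print(i, torch_directml.device_name(i))

# Select the default adapter (or pass a specific index instead)
dml = torch_directml.device(torch_directml.default_device())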
This is only the beginning of the next chapter with DirectML and PyTorch! Stay tuned for broader use case coverage, expansion to other local accelerators, like NPUs, and more. Our goal is to meet developers where they’re at, so they can use the right tools to build the next wave of AI innovation.
We’re excited for developers to continue innovating with cutting edge Generative AI on Windows and build the AI apps of the future!
Source: Windows Blog