These are notes for installing SageAttention 2 on a local Windows PC. To install it into an embedded Python instead, replace every <code>python</code> call below with the embedded Python executable, and run pip through it with <code>-m pip</code>. SageAttention is sensitive to version mismatches, so make sure the versions of all installed components match to avoid errors. References:<ul><li><a target="_blank" rel="ugc" href="https://www.reddit.com/r/StableDiffusion/comments/1iyt7d7/automatic_installation_of_triton_and/?tl=zh-hant">https://www.reddit.com/r/StableDiffusion/comments/1iyt7d7/automatic_installation_of_triton_and/?tl=zh-hant</a></li><li><a target="_blank" rel="ugc" href="https://github.com/wildminder/AI-windows-whl?tab=readme-ov-file">https://github.com/wildminder/AI-windows-whl?tab=readme-ov-file</a></li><li><a target="_blank" rel="ugc" href="https://github.com/woct0rdho/triton-windows">https://github.com/woct0rdho/triton-windows</a></li><li><a target="_blank" rel="ugc" href="https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/cli_args.py">https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/cli_args.py</a></li></ul>Prerequisites:<ul><li><a target="_blank" rel="ugc" href="https://www.python.org/downloads/release/python-3120/">Python 3.12</a></li><li><a target="_blank" rel="ugc" href="https://developer.nvidia.com/cuda-12-6-0-download-archive">CUDA Toolkit 12.6</a></li><li>PyTorch 2.6</li><li>Triton 3.2</li><li><a target="_blank" rel="ugc" href="https://visualstudio.microsoft.com/zh-hant/visual-cpp-build-tools/">MSVC</a></li><li><a target="_blank" rel="ugc" href="https://github.com/wildminder/AI-windows-whl?tab=readme-ov-file">sageattention-2.2.0-cp312-cp312-win_amd64.whl</a></li></ul><h2 id="1.install-python-3.12">1.Install Python 3.12</h2>Go to the official Python website to download and install Python 3.12 for your system. <a target="_blank" rel="ugc" href="https://www.python.org/downloads/release/python-3120/">https://www.python.org/downloads/release/python-3120/</a> Remember to add Python to your <code>PATH</code> environment variable so you 
can use it in CMD later.<h2 id="2.install-cuda-toolkit-12.6">2.Install CUDA Toolkit 12.6</h2>Go to NVIDIA’s official website to download and install CUDA Toolkit 12.6 for your system. <a target="_blank" rel="ugc" href="https://developer.nvidia.com/cuda-12-6-0-download-archive">https://developer.nvidia.com/cuda-12-6-0-download-archive</a><h2 id="3.install-msvc-build-tools">3.Install MSVC Build Tools</h2>Go to the official Visual Studio website to download and install MSVC Build Tools for your system. <a target="_blank" rel="ugc" href="https://visualstudio.microsoft.com/zh-hant/visual-cpp-build-tools/">https://visualstudio.microsoft.com/zh-hant/visual-cpp-build-tools/</a> In the installer, check the items shown in the picture:<ul><li>Desktop development with C++</li><li>MSVC v143 - VS 2022 C++ x64/x86 build tools</li><li>Windows 10/11 SDK</li><li>C++ CMake tools for Windows</li><li>Ninja (in some versions, listed under “Individual Components”)</li></ul><edge-media url="0754dee1-8768-4cdb-82cd-edad806a9014" type="image" filename="automatic-installation-of-triton-and-sageattention-into-v0-4mba5gfvkile1.webp"></edge-media>After installation, add the directory that contains <code>cl.exe</code> to the <code>PATH</code> environment variable. The version folder (<code>14.44.35207</code> here) depends on the MSVC version you installed.<pre><code>C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.44.35207\bin\Hostx86\x64</code></pre><h2 id="4.install-pytorch-2.6">4.Install PyTorch 2.6</h2>Run the following command in CMD.<pre><code>pip install torch==2.6.0+cu126 torchvision==0.21.0+cu126 torchaudio==2.6.0+cu126 --index-url https://download.pytorch.org/whl/cu126</code></pre>Verify the installed version.<pre><code>python -c "import torch; print(torch.__version__, torch.version.cuda)"</code></pre><h2 id="5.install-triton-3.2-wheel">5.Install Triton 3.2 Wheel</h2>Install the required packages first.<pre><code>pip install onnxruntime-gpu
pip install wheel
pip install setuptools
pip install packaging
pip install ninja
pip install "accelerate &gt;= 1.1.1"
pip install "diffusers &gt;= 0.31.0"
pip install "transformers &gt;= 4.39.3"
python -m ensurepip --upgrade
python -m pip install --upgrade setuptools</code></pre>Then install the Triton wheel.<pre><code>pip install https://github.com/woct0rdho/triton-windows/releases/download/v3.2.0-windows.post10/triton-3.2.0-cp312-cp312-win_amd64.whl</code></pre><h2 id="6.double-check-the-environment-before-installing-sageattention.">6.Double-check the environment before installing SageAttention.</h2>Run the following commands in CMD.<pre><code>python --version

where cl

nvcc --version

nvidia-smi

python -m torch.utils.collect_env

where cudart64_*.dll

pip show Triton

python -c "import torch; print(torch.__version__, torch.version.cuda)"</code></pre>When checking, make sure the CUDA version reported by PyTorch (<code>torch.version.cuda</code>) matches the installed CUDA Toolkit (<code>nvcc --version</code>). The ABI (Application Binary Interface) determines how compiled C++/CUDA extension modules interact with the PyTorch core, so an extension built against a different PyTorch or CUDA version may fail to load.<h2 id="7.install-sageattention-2.1.1">7.Install SageAttention 2.1.1</h2>Download the matching SageAttention <code>.whl</code> file from <a target="_blank" rel="ugc" href="https://github.com/wildminder/AI-windows-whl?tab=readme-ov-file">https://github.com/wildminder/AI-windows-whl?tab=readme-ov-file</a> After downloading, install the <code>.whl</code> file with pip. The path below is a placeholder; use the actual location and filename of the wheel you downloaded.<pre><code>python -m pip install D:\downloads\package-1.2.3-cp312-cp312-win_amd64.whl</code></pre>After installation, verify that it was installed successfully.<pre><code>pip show SageAttention</code></pre>If you plan to install SageAttention 2.2, note that it only supports CUDA ≥ 12.8, which in turn requires PyTorch ≥ 2.7. To enable SageAttention, either pass <code>--use-sage-attention</code> in ComfyUI’s command-line arguments or use the node in <a target="_blank" rel="ugc" href="https://github.com/kijai/ComfyUI-KJNodes">ComfyUI-KJNodes</a>; either method works.<h2 id="8.supplementary-materials">8.Supplementary materials</h2>A list of NVIDIA GPU architecture names for each generation and their corresponding architecture numbers:<edge-media url="b1858a25-e94a-474f-99d6-ff508adcecf3" type="image" filename="螢幕擷取畫面 2025-10-23 184700.png"></edge-media><ul><li>Fermi and Kepler were deprecated starting with CUDA 9 and CUDA 11, respectively</li><li>Maxwell was deprecated starting with CUDA 11.6</li></ul><h2 id="9.postscript">9.Postscript</h2>The official SageAttention documentation is really sparse, so many people run into installation issues. I was also frustrated by all the version compatibility problems until I found <a target="_blank" rel="ugc" href="https://github.com/wildminder/AI-windows-whl?tab=readme-ov-file#sageattention-22-sageattention2">AI-windows-whl</a> on 
GitHub, which finally solved it. Just install the prebuilt .whl directly. Hopefully the project will provide version-matched .whl files officially in the future.<hr />Update 2026/01/13: I updated my environment to Python 3.12, PyTorch 2.10.0, and CUDA 13.0, and reinstalled SageAttention 2.2. I re-downloaded and installed the corresponding .whl file, but SageAttention 2.2 failed with this error: “from . import fused ImportError: DLL load failed while importing fused: The specified procedure could not be found.” After repeated troubleshooting with ChatGPT, I found the issue was with the ABI: “This .pyd was built with a different PyTorch version/build options.” Switching to <a target="_blank" rel="ugc" href="https://github.com/woct0rdho/SageAttention/releases">woct0rdho’s Windows release</a> wheels (ABI3 + libtorch stable ABI) and reinstalling once fixed it. <a target="_blank" rel="ugc" href="https://github.com/woct0rdho/SageAttention/releases">https://github.com/woct0rdho/SageAttention/releases</a><edge-media url="39f29c38-c9fd-4203-9e4c-7b1b402449e7" type="image" filename="螢幕擷取畫面 2026-01-13 140351.png"></edge-media>Below are verification commands that can help with troubleshooting. Adjust the <code>site-packages</code> path to match your own installation. Find the _fused*.pyd files with a one-liner:<pre><code>python -c "import glob,os; sp=r'C:\Users\HomePC\AppData\Local\Programs\Python\Python312\Lib\site-packages'; print(glob.glob(os.path.join(sp,'sageattention','_fused*.pyd')))"</code></pre>List all files inside the sageattention package:<pre><code>dir /b "C:\Users\HomePC\AppData\Local\Programs\Python\Python312\Lib\site-packages\sageattention"</code></pre><edge-media url="faccb13d-71f8-4409-a36f-072ba21a12da" type="image" filename="螢幕擷取畫面 2026-01-13 132249.png"></edge-media>Check whether certain DLL dependencies 
exist<pre><code>where msvcp140.dll
where vcruntime140_1.dll
where cudart64_13.dll
where cublas64_13.dll</code></pre><edge-media url="91921bed-c326-4691-8874-d85894e1d6b6" type="image" filename="螢幕擷取畫面 2026-01-13 132150.png"></edge-media>Quickly rule out VC++ Runtime issues<pre><code>powershell -NoProfile -Command "(Get-Item $env:windir\System32\vcruntime140_1.dll).VersionInfo.FileVersion"
powershell -NoProfile -Command "(Get-Item $env:windir\System32\msvcp140.dll).VersionInfo.FileVersion"
</code></pre>Verify that “_fused” can be imported:<pre><code>python -c "from sageattention import _fused; print('_fused OK')"</code></pre><hr />Update 2026/01/13: Post-update notes. I updated my entire environment to install and enable <a target="_blank" rel="ugc" href="https://github.com/Comfy-Org/comfy-kitchen">Comfy Kitchen</a>, and I also updated ComfyUI to v0.8.2. In this environment everything runs very smoothly: SDXL, Z-Image, and Wan2.2 all work properly, with a slight performance improvement. While updating, I tried upgrading Python to 3.13, but compatibility with older plugin nodes was too poor and caused many bugs, so I decided to stick with Python 3.12. For now, I’ll use PyTorch 2.10.0, Python 3.12, and CUDA 13.0 as the base environment for running ComfyUI. If you’ve also installed Comfy Kitchen, you can use this command to verify that it was installed successfully.<pre><code>python -s -c "import comfy_kitchen as ck; print('comfy-kitchen', ck.__version__); print('backends', ck.list_backends())"</code></pre><ul><li>The output lists the available backends (e.g., cuda / triton / eager).</li><li>If it only shows eager (or there’s no cuda), it usually means the CUDA wheel didn’t load successfully. Comfy Kitchen can still run, just not at full performance.</li></ul><edge-media url="01414231-f81c-4bf4-9c8b-8be79c707547" type="image" filename="螢幕擷取畫面 2026-01-13 140142.png"></edge-media><edge-media url="6d7799cc-bb57-46aa-aedf-9e37d3532c51" type="image" filename="螢幕擷取畫面 2026-01-13 14075412313.png"></edge-media>
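The separate version and import checks used throughout this guide can also be collected into one small script. The following is only a minimal sketch, not part of the original notes: the helper names (<code>installed_version</code>, <code>try_import</code>, <code>report</code>) are made up for illustration, and the distribution names in <code>PACKAGES</code> are assumptions (in particular, newer Triton builds for Windows may be published as <code>triton-windows</code> rather than <code>triton</code>). It uses only the Python standard library and reports import failures such as the "DLL load failed" error instead of stopping at the first one.

```python
import importlib
import importlib.metadata
import platform

# Assumed distribution names; adjust them to match what `pip list` shows on your machine.
PACKAGES = ["torch", "torchvision", "torchaudio", "triton", "triton-windows", "sageattention"]

# Modules to try importing; these are the import names, not the pip names.
MODULES = ["torch", "triton", "sageattention"]


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return None


def try_import(module_name):
    """Import a module and return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
        return True, "OK"
    except Exception as exc:  # covers missing modules and DLL load failures
        return False, f"{type(exc).__name__}: {exc}"


def report():
    """Build a list of human-readable lines describing the environment."""
    lines = [f"python {platform.python_version()}"]
    for name in PACKAGES:
        lines.append(f"{name}: {installed_version(name) or 'not installed'}")
    for module in MODULES:
        ok, message = try_import(module)
        lines.append(f"import {module}: {message}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Save it under any filename and run it with the same Python executable that ComfyUI uses, so the report describes the environment ComfyUI actually sees. A line such as <code>import sageattention: ImportError: DLL load failed ...</code> points to the ABI mismatch described in the update above.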