Optimize Windows For Moonlight
This guide is divided into Universal Windows Steps and Vendor-Specific Steps (choose your card).
Phase 1: Universal Windows Foundation (All GPUs)
Do these steps first regardless of which brand of GPU you have. They prepare Windows to prioritize ML tasks.
1. Enable "Ultimate Performance" Power Plan
Windows hides this plan by default on most editions. It removes fine-grained power-management steps that can add micro-latencies during inference.
Open Command Prompt (Admin).
Paste this command:
powercfg -duplicatescheme e9a42b02-d5df-448d-aa00-03f14749eb61
Open Control Panel -> Power Options.
Expand "Show additional plans" and select Ultimate Performance.
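The duplicated plan receives its own GUID, so a script has to look the plan up by name before it can activate it. The sketch below shows one way to do that in Python. It assumes the English plan name "Ultimate Performance" and the usual `powercfg /list` output layout; the helper names `find_scheme_guid` and `activate_ultimate_performance` are illustrative only.

```python
import re
import subprocess
import sys

# Matches lines such as:
#   Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced) *
SCHEME_LINE = re.compile(r"GUID:\s*([0-9a-fA-F-]{36})\s+\((.+?)\)")


def find_scheme_guid(powercfg_list_output, plan_name):
    """Return the GUID of the first plan whose name matches, or None."""
    for line in powercfg_list_output.splitlines():
        match = SCHEME_LINE.search(line)
        if match and match.group(2).strip().lower() == plan_name.lower():
            return match.group(1)
    return None


def activate_ultimate_performance():
    """Look up the plan with `powercfg /list` and activate it (Windows only)."""
    listing = subprocess.run(
        ["powercfg", "/list"], capture_output=True, text=True, check=True
    ).stdout
    guid = find_scheme_guid(listing, "Ultimate Performance")
    if guid is None:
        raise RuntimeError("Plan not found; run the duplicatescheme command first.")
    subprocess.run(["powercfg", "/setactive", guid], check=True)
    return guid


if __name__ == "__main__" and sys.platform == "win32":
    print("Activated plan:", activate_ultimate_performance())
```

Run it from an elevated prompt after the duplicatescheme command; it has the same effect as selecting the plan in Power Options.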
2. Enable Hardware-Accelerated GPU Scheduling (HAGS)
This moves most GPU scheduling work from the CPU to a dedicated scheduling processor on the GPU, which reduces CPU-side overhead.
Press Win + I (Settings) -> System -> Display -> Graphics.
Click "Change default graphics settings" (blue link).
Turn Hardware-accelerated GPU scheduling to On.
Requires a Restart.
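To confirm the toggle after restarting, the setting can also be read from the registry. The sketch below assumes the commonly reported location: the `HwSchMode` DWORD under `HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`, with 2 meaning on and 1 meaning off. That mapping is not stated in this guide, so treat it as an assumption. The helper names are illustrative.

```python
import sys

# Assumption: commonly reported registry location and values for HAGS.
HAGS_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
HAGS_VALUE = "HwSchMode"


def describe_hags(value):
    """Translate the raw HwSchMode value into a readable state."""
    if value == 2:
        return "on"
    if value == 1:
        return "off"
    return "unknown"  # value missing or not recognized


def read_hags_value():
    """Read HwSchMode from the registry; returns None if it is absent."""
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, HAGS_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, HAGS_VALUE)
            return value
    except FileNotFoundError:
        return None


if __name__ == "__main__" and sys.platform == "win32":
    print("HAGS is", describe_hags(read_hags_value()))
```

The script only reads the value; changing the setting is still best done through the Settings toggle described above.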
3. Force "High Performance" for Your Apps
Windows loves to push background apps (like a Python terminal or Moonlight) to the integrated graphics to save battery. Stop this.
Go to System -> Display -> Graphics.
Under "Add an app", select Desktop app -> Browse.
Find Moonlight.exe and click Add.
Click the app in the list -> Options -> Select High Performance -> Save.
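The per-app choice made in this dialog is commonly reported to be stored per user in the registry under `HKCU\Software\Microsoft\DirectX\UserGpuPreferences`, with the executable path as the value name and a string such as `GpuPreference=2;` as the data. The sketch below builds and writes that entry. The registry layout and the level numbers are assumptions not covered by this guide, and the `Moonlight.exe` path is a placeholder.

```python
import sys

# Assumption: per-user location where Windows stores the Graphics settings choice.
GPU_PREF_KEY = r"Software\Microsoft\DirectX\UserGpuPreferences"

# Assumed meaning of the numeric levels.
LEVELS = {"default": 0, "power_saving": 1, "high_performance": 2}


def gpu_preference_data(level):
    """Build the string stored for an app, e.g. 'GpuPreference=2;'."""
    return f"GpuPreference={LEVELS[level]};"


def set_gpu_preference(exe_path, level="high_performance"):
    """Write the preference for one executable (Windows only)."""
    import winreg  # standard library, Windows only

    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, GPU_PREF_KEY) as key:
        winreg.SetValueEx(key, exe_path, 0, winreg.REG_SZ, gpu_preference_data(level))


if __name__ == "__main__" and sys.platform == "win32":
    # Placeholder path: replace with the real location of Moonlight.exe
    # and call set_gpu_preference(exe) once the path is correct.
    exe = r"C:\path\to\Moonlight.exe"
    print(exe, "->", gpu_preference_data("high_performance"))
```

As written, the script only prints what would be stored; the write happens only when `set_gpu_preference` is called with a real path.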
Phase 2: Choose Your GPU Path
🟢 Option A: NVIDIA
Drivers: Install the NVIDIA Studio Driver (more stable for compute) rather than Game Ready. If the Studio Driver gives you trouble, fall back to Game Ready.
Control Panel:
Right-click Desktop -> NVIDIA Control Panel.
Manage 3D Settings -> Power Management Mode -> Prefer Maximum Performance.
Low Latency Mode -> On (Reduces token delay).
CUDA - Sysmem Fallback Policy -> Prefer No Sysmem Fallback (Stops CUDA from spilling into slower system RAM when VRAM fills up; allocations fail with an out-of-memory error instead).
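With sysmem fallback disabled, a workload that outgrows VRAM errors out rather than slowing down, so it helps to check headroom before loading a model. Here is a minimal sketch using `nvidia-smi`, which is installed with the NVIDIA driver. The helper names and the 90% warning threshold are arbitrary choices for illustration.

```python
import shutil
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_vram(csv_output):
    """Parse lines like '6144, 8192' into (used_mib, total_mib) per GPU."""
    gpus = []
    for line in csv_output.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        gpus.append((used, total))
    return gpus


def vram_warnings(gpus, threshold=0.90):
    """Return indices of GPUs whose VRAM use is at or above the threshold."""
    return [i for i, (used, total) in enumerate(gpus) if used / total >= threshold]


if __name__ == "__main__" and shutil.which("nvidia-smi"):
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode == 0:
        gpus = parse_vram(result.stdout)
        for i, (used, total) in enumerate(gpus):
            print(f"GPU {i}: {used} / {total} MiB")
        print("Near full:", vram_warnings(gpus))
```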
🔴 Option B: AMD
Enable "Compute Mode" (Older cards like RX 580):
Open AMD Software: Adrenalin Edition.
Settings (Gear icon) -> Graphics.
Scroll to Advanced -> GPU Workload -> Set to Compute.
Note: Newer RDNA (6000/7000 series) cards handle this automatically. For these, select the "Standard" profile instead of "Gaming" to minimize post-processing interference.
Enable SAM (Smart Access Memory):
In Adrenalin -> Performance tab -> Tuning.
Ensure AMD Smart Access Memory is Enabled (requires Resizable BAR, often listed as "Re-Size BAR", to be enabled in your BIOS).
🔵 Option C: Intel
Intel Arc Control Settings:
Open Intel Arc Control (Alt+I).
Performance Tab -> Power Limit -> Increase to Max (slide to right).
Global Game Settings -> Turn OFF "Frame Smoothing" and "V-Sync" (both add latency).
Summary Table for Reference
| Feature | NVIDIA Settings | AMD Settings | Intel Settings |
|---|---|---|---|
| Driver Type | Studio Driver | Adrenalin Edition | Arc Control / WHQL |
| Power Mode | Prefer Max Performance | Standard Profile / Compute Mode | Power Limit Max |
| Key BIOS Feature | Re-Size BAR (Optional) | Smart Access Memory (Required) | Re-Size BAR (Required) |