
ComfyUI + Flux: Setup, Models, and First Workflow

Everything you need to install ComfyUI, download the right Flux model variant for your GPU, and run your first image generation workflow. Real numbers, no hype.

Published 2026-05-19
Tags: comfyui flux, comfyui flux setup, flux comfyui workflow

ComfyUI is a node-based workflow interface for running diffusion models locally or on a remote GPU. Flux is the current state-of-the-art open-weight text-to-image model family from Black Forest Labs. Together they are the standard toolchain for engineers building AI image pipelines in 2026. This guide covers installation, model selection, and your first working workflow - with the specific decisions you need to make for your hardware.

Before you install: GPU requirements by Flux model variant

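Flux comes in three tiers (Pro, Dev, and Schnell) and multiple quantization levels. Pro is served only through an API, so a local setup means choosing between Dev and Schnell. The variant you choose determines your VRAM requirement and your image quality. There is no single right answer - it depends on your GPU.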

Flux Model Variants: VRAM Requirements and Quality, May 2026

Variant           | Precision       | VRAM required | Quality                 | Speed
Flux Dev FP16     | Full            | 24 GB         | Best                    | Slowest
Flux Dev Q8       | 8-bit quant     | 16 GB         | Near-identical to FP16  | Moderate
Flux Dev NF4      | 4-bit quant     | 8 GB          | Good, minor detail loss | Faster
Flux Schnell FP16 | Full, distilled | 24 GB         | Good, 4-step            | Very fast
Flux Schnell NF4  | 4-bit, distilled | 8 GB         | Acceptable              | Fastest

Practical recommendation: if you have an 8 to 12 GB card (RTX 3080, RTX 4070), use Flux Dev NF4. If you have 16 GB (RTX 4080, RTX 4060 Ti 16 GB), use Flux Dev Q8. If you have 24 GB or more (RTX 3090, RTX 4090), use Flux Dev FP16 for production-quality output.
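The recommendation above is a simple threshold lookup. A minimal sketch, assuming the VRAM figures from the variant table; the function and constant names are hypothetical and only for illustration:

```python
from typing import Optional

# (minimum VRAM in GB, variant) pairs from the variant table, best quality first.
FLUX_DEV_VARIANTS = [
    (24, "Flux Dev FP16"),
    (16, "Flux Dev Q8"),
    (8, "Flux Dev NF4"),
]


def recommend_flux_dev_variant(vram_gb: float) -> Optional[str]:
    """Return the highest-quality Flux Dev variant that fits, or None if none does."""
    for min_vram, variant in FLUX_DEV_VARIANTS:
        if vram_gb >= min_vram:
            return variant
    return None


if __name__ == "__main__":
    for vram in (6, 8, 12, 16, 24):
        print(f"{vram} GB -> {recommend_flux_dev_variant(vram)}")
```

A 12 GB card lands on NF4 because it clears the 8 GB threshold but not the 16 GB one; anything under 8 GB returns None, which is the case the --lowvram and --novram flags are meant for.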

8 GB: minimum VRAM to run Flux Dev NF4 (4-bit quantized), the lowest entry point for production-quality Flux generation. Source: Black Forest Labs model documentation, Comfy model testing, May 2026.

Installing ComfyUI: the three-command path

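ComfyUI runs on Linux, macOS, and Windows. The commands below are the same across platforms. You need Python 3.10-3.12 and a CUDA-compatible GPU (NVIDIA) or Apple Silicon for MPS acceleration. On NVIDIA systems, check that the installed PyTorch build has CUDA support (the ComfyUI README lists the matching install command); a CPU-only build will not use the GPU.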

$bash
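# Clone the repository and enter it
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
# Install the Python dependencies (ideally inside a virtual environment)
pip install -r requirements.txt
# Start the server on localhost:8188
python main.py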

This starts the ComfyUI server on localhost:8188. Open http://localhost:8188 in your browser (add --auto-launch to the main.py command to have ComfyUI open it for you). You will see a canvas with the default workflow loaded - delete it (Ctrl+A, Delete) and load the Flux workflow below.

Two flags worth knowing on first launch: if you are on a machine with limited VRAM, add --lowvram or --novram to the main.py command. These tell ComfyUI to aggressively offload model components to system RAM between inference steps. Expect slower generation but successful runs on 8 GB cards that would otherwise run out of memory (OOM).

$bash
# For 8 GB VRAM cards
python main.py --lowvram

# For cards with less than 6 GB VRAM
python main.py --novram
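If you launch ComfyUI from a script, the same choice can be made programmatically. A minimal sketch; the thresholds are assumptions taken from the comments in the commands above, and the helper name is hypothetical:

```python
import sys
from typing import List


def comfyui_launch_command(vram_gb: float) -> List[str]:
    """Build the main.py command line for a given amount of VRAM.

    Thresholds are assumptions: under 6 GB uses --novram,
    up to 8 GB uses --lowvram, anything larger uses no extra flag.
    """
    command = [sys.executable, "main.py"]
    if vram_gb < 6:
        command.append("--novram")
    elif vram_gb <= 8:
        command.append("--lowvram")
    return command


if __name__ == "__main__":
    print(" ".join(comfyui_launch_command(8)))
```

Pass the returned list to subprocess.run with cwd set to your ComfyUI directory. If PyTorch is installed, torch.cuda.get_device_properties(0).total_memory reports the card's VRAM in bytes.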

Downloading the Flux model files

Flux Dev requires four files in three groups: the main transformer, the two text encoders (CLIP-L and T5-XXL), and the VAE. These go in specific directories inside your ComfyUI installation.

$bash
# From your ComfyUI directory
# FLUX.1-dev is a gated repository: export HF_TOKEN with your Hugging Face access token first

# Main model - choose ONE based on your VRAM
# FP16 (24 GB)
wget --header="Authorization: Bearer $HF_TOKEN" -P models/unet/ https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors

# FP8 quantized - community quantization by Kijai (this file is 8-bit FP8, not NF4)
wget -P models/unet/ https://huggingface.co/Kijai/flux-fp8/resolve/main/flux1-dev-fp8.safetensors

# Text encoders (same for all variants)
wget -P models/clip/ https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
wget -P models/clip/ https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors

# VAE (also served from the gated FLUX.1-dev repository)
wget --header="Authorization: Bearer $HF_TOKEN" -P models/vae/ https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/ae.safetensors

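Note: Flux Dev requires accepting the license agreement on Hugging Face before you can download. Log in to Hugging Face and accept the license on the FLUX.1-dev model page. Plain wget does not reuse a huggingface-cli login, so either pass an access token in an Authorization: Bearer header, or authenticate with huggingface-cli login and fetch the files with huggingface-cli download. Flux Schnell does not require this - it is Apache 2.0 licensed and freely downloadable.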
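A failed or partial download usually shows up later as a missing entry in a loader node, so it can help to check the files before launching. A minimal sketch, assuming the directory layout used by the wget commands above; the helper name is hypothetical:

```python
from pathlib import Path
from typing import List

# Relative paths match the wget targets above; swap the unet entry for the
# quantized file if that is the one you downloaded.
EXPECTED_FILES = [
    "models/unet/flux1-dev.safetensors",
    "models/clip/clip_l.safetensors",
    "models/clip/t5xxl_fp16.safetensors",
    "models/vae/ae.safetensors",
]


def missing_model_files(comfy_root: str, expected: List[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected files that are absent or empty under comfy_root."""
    root = Path(comfy_root)
    missing = []
    for relative in expected:
        path = root / relative
        # An empty file usually means the download was rejected or interrupted.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(relative)
    return missing


if __name__ == "__main__":
    for name in missing_model_files("."):
        print(f"missing: {name}")
```

Run it from the ComfyUI directory; no output means all four files are in place.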

Your first workflow: the minimal Flux node graph

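A working Flux text-to-image workflow in ComfyUI requires eight node types: DualCLIPLoader (loads both text encoders), UNETLoader (loads the main transformer; shown as Load Diffusion Model in the interface), VAELoader (loads the VAE), CLIPTextEncode (encodes your prompt), an empty latent node such as EmptySD3LatentImage (sets the output resolution), KSampler (runs the diffusion), VAEDecode (decodes the latent into an image), and SaveImage (writes the result to disk).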

The simplest way to get this workflow is to download the official Flux example from the ComfyUI GitHub repository examples folder, or load it from the ComfyUI Manager. Once loaded, you will see the node graph pre-connected. The only things you need to configure before your first run: select your downloaded model files in the loader nodes, and write your prompt in the CLIPTextEncode node.

Flux Dev sampler settings that produce reliable results: 20 steps, Euler sampler, Simple scheduler, CFG scale 1.0. Yes, CFG 1.0 - Flux Dev is guidance-distilled and does not benefit from classifier-free guidance the way SDXL does. Running it at CFG 3.5 or 7.0 (SDXL defaults people often copy over) will produce oversaturated, burned-looking results. Do not confuse CFG with the distilled guidance value set by the FluxGuidance node, which is a separate input commonly left around 3.5. Keep CFG at 1.0 unless you have a specific reason to change it.
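For reference, here is roughly what this graph looks like in ComfyUI's API-format JSON, written as a Python dictionary. This is a sketch, not the official example: the class and input names reflect stock ComfyUI as I understand it, the empty latent node (EmptySD3LatentImage) and the empty negative prompt are additions KSampler needs to run, and the file names assume the downloads from earlier in this guide. Export your own graph in API format to confirm the exact fields for your version.

```python
import json


def build_flux_workflow(prompt, unet_name="flux1-dev.safetensors",
                        seed=0, steps=20, width=1024, height=1024):
    """Return an API-format graph; links are [source_node_id, output_index]."""
    return {
        "1": {"class_type": "UNETLoader",
              "inputs": {"unet_name": unet_name, "weight_dtype": "default"}},
        "2": {"class_type": "DualCLIPLoader",
              "inputs": {"clip_name1": "t5xxl_fp16.safetensors",
                         "clip_name2": "clip_l.safetensors",
                         "type": "flux"}},
        "3": {"class_type": "VAELoader",
              "inputs": {"vae_name": "ae.safetensors"}},
        # Positive prompt.
        "4": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt, "clip": ["2", 0]}},
        # Negative prompt: left empty, it has no effect at CFG 1.0.
        "5": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "", "clip": ["2", 0]}},
        "6": {"class_type": "EmptySD3LatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        # Sampler settings recommended in this guide for Flux Dev.
        "7": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["4", 0],
                         "negative": ["5", 0], "latent_image": ["6", 0],
                         "seed": seed, "steps": steps, "cfg": 1.0,
                         "sampler_name": "euler", "scheduler": "simple",
                         "denoise": 1.0}},
        "8": {"class_type": "VAEDecode",
              "inputs": {"samples": ["7", 0], "vae": ["3", 0]}},
        "9": {"class_type": "SaveImage",
              "inputs": {"filename_prefix": "flux", "images": ["8", 0]}},
    }


if __name__ == "__main__":
    print(json.dumps(build_flux_workflow("a lighthouse at dusk"), indent=2))
```

For Flux Schnell, the same graph should work with steps=4 and the Schnell model file selected in the loader.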

ComfyUI Manager: the package manager for custom nodes

ComfyUI by itself has a limited node set. Most production workflows rely on custom nodes from the community - additional preprocessing, ControlNet support, LoRA chaining, upscaling, and more. ComfyUI Manager is the package manager that installs and updates these.

$bash
# Install ComfyUI Manager
cd ComfyUI/custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager
# Restart ComfyUI - the Manager button appears in the interface

With Manager installed, you can install any custom node pack from the interface without touching the command line again. The most useful packs for Flux workflows: ComfyUI-Impact-Pack (segmentation, detailing), ComfyUI_essentials (utility nodes), and was-node-suite-comfyui (image processing). Install only what your workflow needs - each custom node pack is an additional dependency that can break on ComfyUI updates.

Next steps: from local to production

Once you have a working local Flux workflow in ComfyUI, the next decision is whether to keep running it locally or move it to a production environment with an API layer. Local is fine for personal use and experimentation. For anything serving external traffic - a product feature, a batch job, a client project - you need GPU infrastructure that is reliable, scalable, and has no cold start penalty when you need it.

ComfyUI as a Production API covers the architecture of exposing your workflow as a REST endpoint. GPU Provider Cost Comparison 2026 has current pricing across the major GPU clouds. ComfyUI Hosting 2026 compares managed hosting options for teams that do not want to operate GPU infrastructure themselves.

ComfyUI Installation Options, May 2026

Method                         | OS                      | Setup time | Best for
Git clone + pip                | Linux / macOS / Windows | 15-30 min  | Developers who want full control
ComfyUI portable (Windows)     | Windows only            | 5-10 min   | Non-technical users on Windows
Docker image                   | Linux / macOS           | 30-60 min  | Reproducible environments, CI/CD
Managed (Runflow, ComfyDeploy) | Any (cloud)             | 1-2 days   | Teams skipping local GPU entirely

Flux sampler settings that actually work

The sampler settings in ComfyUI's stock default workflow (Euler sampler, Normal scheduler, 20 steps, CFG 8.0) are tuned for SD 1.5 and will produce bad results with Flux. Flux needs different settings because it uses a rectified flow formulation rather than DDPM. The correct Flux Dev settings: Euler sampler, Simple scheduler, 20 steps, CFG 1.0. For Flux Schnell: Euler, Simple, 4 steps, CFG 1.0.

Image resolution: Flux is trained on multiple resolutions and handles non-square images well. The base training resolution is 1024x1024. For portrait images use 832x1216. For landscape use 1216x832. Unlike SDXL, you do not need to stay close to a 1024x1024 pixel budget - Flux at 1280x1280 still produces coherent results without the tile-repeat artifacts that plagued SD 1.5 at high resolutions.
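The portrait and landscape sizes above follow from holding the pixel count near 1024x1024 while changing the aspect ratio. A minimal sketch of that arithmetic; snapping to multiples of 64 is an assumption borrowed from common resolution buckets, not a documented Flux requirement, and the function name is hypothetical:

```python
import math
from typing import Tuple


def flux_dimensions(aspect_w: int, aspect_h: int,
                    megapixels: float = 1.0, multiple: int = 64) -> Tuple[int, int]:
    """Return (width, height) near the pixel budget for the given aspect ratio."""
    target_pixels = megapixels * 1024 * 1024
    ratio = aspect_w / aspect_h
    # Solve width * height = target_pixels with width / height = ratio.
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for aspect in ((1, 1), (832, 1216), (16, 9)):
        print(aspect, "->", flux_dimensions(*aspect))
```

With the default budget, a 832:1216 aspect ratio comes back as 832x1216 and 16:9 comes back as 1344x768. Raising megapixels gives larger sizes at the same ratio.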

LoRAs and ControlNets for Flux in 2026

The Flux LoRA ecosystem is growing rapidly in 2026 but is still smaller than SDXL's. For fine-tuning style, the most common approach is rank-16 to rank-64 LoRA training using SimpleTuner or OneTrainer. Key difference from SDXL training: Flux LoRAs train on the transformer layers directly, not on a UNet. Training 1,000 steps on 30-50 reference images typically produces usable style LoRAs.

ControlNet for Flux: as of May 2026, several ControlNet implementations exist for Flux including Canny, Depth, and Pose variants. The most stable are the XLabs-AI ControlNet models available on Hugging Face. Install via ComfyUI Manager as part of the ComfyUI-FluxControlNet node pack. Performance is slightly slower than SDXL ControlNets due to the larger model size.

Troubleshooting the most common Flux setup errors

Three errors appear in almost every first Flux setup.

First: CUDA out of memory - usually caused by loading Flux Dev FP16 on a card with less than 24 GB VRAM. Fix: switch to the NF4 variant or add --lowvram to your launch command.

Second: NaNsException (tensor with all NaNs) - typically caused by running Flux with CFG above 2.0 or using a sampler incompatible with rectified flow. Fix: set CFG to 1.0 and use Euler with Simple scheduler.

Third: black or gray output images - usually caused by the VAE not loading correctly or a mismatch between the model and VAE versions. Fix: explicitly load the ae.safetensors VAE in a VAELoader node rather than relying on automatic detection.

If you are running on a machine without a GPU or with an unsupported GPU, ComfyUI can run inference on the CPU (launch with --cpu if it does not start otherwise). CPU inference for Flux Dev is extremely slow - expect 10-30 minutes per image. This is only useful for testing that the workflow configuration is correct, not for actual generation.

One final note on reproducibility: ComfyUI uses a seed value in the KSampler node to control randomness. Set a fixed seed during development to get consistent outputs while you tune other parameters. Switch to a random seed for production to generate variety across requests. Documenting your seed alongside the workflow JSON is good practice when sharing reproducible results.
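When a workflow is driven from code, the fixed-versus-random seed choice can live in one small helper. A minimal sketch, assuming an API-format workflow dictionary in which sampler nodes have class_type KSampler and a seed input; the helper name is hypothetical, and the 2**63 upper bound is a conservative assumption rather than a documented limit:

```python
import copy
import random
from typing import Optional


def with_seed(workflow: dict, seed: Optional[int] = None) -> dict:
    """Return a copy of the workflow with every KSampler seed set.

    A fixed seed reproduces a run; None draws a fresh random seed.
    """
    if seed is None:
        seed = random.randrange(2**63)
    updated = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    for node in updated.values():
        if node.get("class_type") == "KSampler":
            node["inputs"]["seed"] = seed
    return updated


if __name__ == "__main__":
    demo = {"7": {"class_type": "KSampler", "inputs": {"seed": 0}}}
    print(with_seed(demo, 42)["7"]["inputs"]["seed"])
```

Call with_seed(workflow, 42) while tuning parameters and with_seed(workflow) per request in production. Logging the seed that was used keeps a production image reproducible later.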

Frequently Asked Questions

How much VRAM do I need for Flux in ComfyUI?

Flux Dev NF4 (4-bit quantized) requires 8 GB VRAM minimum. Flux Dev Q8 requires 16 GB. Flux Dev FP16 (full precision, best quality) requires 24 GB. For production use, Q8 on a 16 GB card gives near-identical quality to FP16 at significantly lower hardware cost.

What CFG scale should I use for Flux in ComfyUI?

Use CFG 1.0 for Flux Dev. Flux does not benefit from classifier-free guidance the way SDXL or SD 1.5 do. Running Flux at CFG 3.5-7.0 (common SDXL defaults) produces oversaturated, burned-looking results. The recommended sampler settings are: Euler, Simple scheduler, 20 steps, CFG 1.0.

Do I need to accept a license to use Flux Dev?

Yes. Flux Dev requires accepting the license agreement on Hugging Face before downloading. Log in to Hugging Face, accept the license on the FLUX.1-dev model page, and authenticate with huggingface-cli login. Flux Schnell is Apache 2.0 licensed and requires no agreement.

Can I run ComfyUI on a Mac with Apple Silicon?

Yes. ComfyUI supports MPS (Metal Performance Shaders) acceleration on Apple Silicon (M1, M2, M3 series). Start with python main.py - no additional flags needed, MPS is detected automatically. Performance is slower than NVIDIA CUDA for Flux models. Flux Dev NF4 runs on 16 GB unified memory Macs. For production workloads, cloud GPU is significantly faster.

What is the difference between ComfyUI and Automatic1111?

Automatic1111 (A1111) is a form-based web UI designed for ease of use. ComfyUI is a node graph interface designed for building custom workflows. For production pipelines and programmatic use, ComfyUI is the standard choice because workflows can be exported as JSON and called via API. A1111 is easier for beginners but harder to automate.

How do I update ComfyUI and custom nodes?

Update ComfyUI: git pull in the ComfyUI directory, then pip install -r requirements.txt again. Update custom nodes via ComfyUI Manager - the Manager interface has an 'Update All' button that handles individual node packs. After any update, restart ComfyUI. If a workflow breaks after an update, check the ComfyUI GitHub issues for the specific error.

What is the fastest way to generate images with Flux?

Flux Schnell at 4 steps is the fastest option - roughly 5x faster than Flux Dev at 20 steps, since it runs one fifth as many sampling steps. On an RTX 4090, Flux Schnell NF4 generates a 1024x1024 image in 2-4 seconds. Additional speed optimizations: compile the model with torch.compile (for example through the TorchCompileModel node), and use Sage Attention (the --use-sage-attention launch flag) if your CUDA version supports it.

Can ComfyUI workflows be called as an API?

Yes. ComfyUI has a built-in HTTP API: POST a workflow to /prompt to queue it, and read /queue and /history to track it. The workflow must be exported in API format (a flat JSON map of node IDs to class_type and inputs), which differs from the JSON the editor saves by default. For production API use, managed services like Runflow wrap ComfyUI workflows in a proper REST API with authentication, queuing, and output storage. For self-hosted setups, you would build this wrapper yourself.
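A minimal sketch of submitting a workflow to the /prompt endpoint with only the Python standard library. It assumes a local server on the default port and an API-format workflow dictionary; the function names are hypothetical, and the prompt_id response field reflects my understanding of the current server rather than a guaranteed contract:

```python
import json
import urllib.request
import uuid


def build_prompt_request(workflow: dict,
                         base_url: str = "http://127.0.0.1:8188") -> urllib.request.Request:
    """Build the POST request that queues a workflow on a ComfyUI server."""
    payload = {"prompt": workflow, "client_id": str(uuid.uuid4())}
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_workflow(workflow: dict, base_url: str = "http://127.0.0.1:8188") -> str:
    """Queue the workflow and return the prompt_id reported by the server."""
    with urllib.request.urlopen(build_prompt_request(workflow, base_url)) as response:
        return json.loads(response.read())["prompt_id"]
```

With the server running, submit_workflow(workflow) queues the job and returns immediately; generation happens asynchronously. Authentication, retries, and output retrieval are deliberately left out, and those are the pieces a production wrapper has to add.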