Z-Image - 6-Billion-Parameter AI Image Generation | 吐司

Z-Image AI Image Generator

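The latest release from Alibaba's Tongyi Lab. Built on the S3-DiT (Scalable Single-Stream Diffusion Transformer) architecture, with native support for both Chinese and English, it uses 6 billion parameters to render finely detailed images. Try it free now!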

Prompt Gallery

A futuristic concept cover for a tech magazine. A young coder is typing on a transparent holographic keyboard in a dark server room. Blue neon lights illuminate his face. Large, glitch-art style English text reads: "CYBER REALITY". Subtext: "The Code That Changed Everything."

A vibrant shot inside a flower shop. A woman is burying her nose in a bouquet of roses. Surrounded by buckets of colorful flowers. The lighting is soft and flattering, highlighting the petals and the dew on the leaves.

A rugged portrait of an older Caucasian male rancher with a sun-weathered face and deep wrinkles, wearing a dusty cowboy hat. He is leaning against a fence post at sunset, looking towards a herd of cattle. Golden hour light rims his profile. The texture of dirt and leather is prominent.

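Core Capabilities

Fine Detail from 6B Parameters

No compromises. As the undistilled Base version, Z-Image keeps its full 6-billion-parameter scale, capturing fine textures, lighting transitions, and background details that the Turbo version may miss. Every image comes out with wallpaper-grade detail.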

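S3-DiT Single-Stream Architecture

Z-Image uses the Scalable Single-Stream Diffusion Transformer (S3-DiT) architecture, which processes text and image features in the same stream. This deep fusion helps the model "get" your prompt better than before, so complex logic no longer gets mixed up.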
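To make the single-stream idea concrete, here is a toy sketch. It is not Z-Image's actual implementation: the vectors, the `self_attention` helper, and the absence of learned weights are all illustrative assumptions. It only shows text tokens and image tokens being joined into one sequence so that a single attention pass lets every token attend to every other token.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Toy dot-product self-attention with no learned projections (illustration only)."""
    dim = len(tokens[0])
    mixed = []
    for query in tokens:
        # Score this token against every token in the shared sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        # Weighted average of all token vectors.
        mixed.append([sum(w * value[i] for w, value in zip(weights, tokens))
                      for i in range(dim)])
    return mixed

# Hypothetical 2-dimensional embeddings; real models use far larger vectors.
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_tokens = [[0.5, 0.5], [1.0, 1.0], [0.0, 0.0]]

# Single stream: one concatenated sequence, one attention pass over all of it.
stream = text_tokens + image_tokens
mixed = self_attention(stream)
```

Because the two modalities share one sequence, each output vector in `mixed` is a blend of both text and image tokens, which is the property the paragraph above describes.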

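Native Chinese-English Bilingual Control

Not just English. Z-Image was trained natively on a large Chinese dataset. Whether it is the mood of classical poetry or Chinese typography in a modern poster, it renders both accurately, with no extra ControlNet required.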

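An Ideal Base for Fine-Tuning

Want to train your own style? Z-Image Base is the best starting line. Compared with distilled models, the Base model has a more complete feature space, so your LoRA and fine-tuning runs converge faster and generalize better.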

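Train Your Own Z-Image LoRA

  • The S3-DiT architecture offers significantly better training potential
  • A full 6-billion-parameter feature space, for faster convergence
  • VRAM-friendly: runs on a consumer GPU with 16 GB of VRAM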
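As a rough sanity check on the 16 GB figure, the arithmetic below estimates how much memory the weights of a 6-billion-parameter model occupy at a few common precisions. The page does not say which precision is used, so the precisions listed are assumptions, and the helper `weight_memory_gib` is a hypothetical name. The estimate covers weights only and excludes activations, the text encoder, and any training state.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory taken by the model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

PARAMS = 6_000_000_000  # 6 billion parameters, as stated on the page

# Assumed storage precisions; the page does not specify one.
for name, nbytes in [("fp32", 4), ("bf16/fp16", 2), ("int8", 1)]:
    print(f"{name}: {weight_memory_gib(PARAMS, nbytes):.1f} GiB")
```

At 2 bytes per parameter the weights come to roughly 11.2 GiB, which is consistent with fitting on a 16 GB card, while 4-byte weights alone would not fit.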

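FAQ

Z-Image is the original base model with 6 billion parameters. It focuses on the highest image quality, the strongest semantic understanding, and the best fine-tuning potential, which suits professional creative work and model training. Turbo is a distilled version built on Base that trades a very small amount of detail for fast 8-step generation. If you want the highest image quality or need to train a model, choose Z-Image Base.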

Experience the Visual Impact of 6 Billion Parameters

No need to wait for a download. Run Z-Image in your browser.