PiD: Fast and High-Resolution Latent Decoding with Pixel Diffusion
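arXiv 2605.23902v1 proposes PiD, a new efficient decoding paradigm based on scalable pixel-space diffusion. Existing high-resolution text-to-image systems are limited by reconstruction-oriented latent decoders, which lack detail and whose computational cost rises sharply with resolution. PiD uses pixel diffusion to improve generation quality and efficiency, offering a more practical solution for megapixel-scale image synthesis.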
arXiv:2605.23902v1. Abstract: Most practical high-resolution text-to-image systems, including latent diffusion and autoregressive models, perform generation in a compact latent space, and a decoder maps the generated latents back to pixels. Yet the latent-to-pixel decoder is reconstruction-oriented, optimized to invert the encoder rather than synthesize more details, and becomes increasingly costly at megapixel scale. This drawback calls for a more expressive and efficient decoding paradigm. Motivated by recent progress in scalable pixel-space diffusion, we introduce PiD, a P