DreamFit
<h2 style="font-size: 20px;">DreamFit是什么</h2>
<p>DreamFit is a virtual try-on framework from ByteDance, developed jointly with Tsinghua Shenzhen International Graduate School and the Shenzhen campus of Sun Yat-sen University, designed for lightweight garment-centric human image generation. The framework substantially reduces model complexity and training cost, and improves the quality and consistency of generated images through optimized text prompts and feature fusion. DreamFit generalizes to a wide range of garments, styles, and prompt instructions to produce high-quality human images, and it integrates seamlessly with community control plugins, lowering the barrier to use.</p>
<p><img src="https://img.medsci.cn/aisite/img//LTxcrJNntWduEjo3Fz0efwOUAOWHyu13P41gtIeX.png" alt=""></p>
<h2 style="font-size: 20px;">DreamFit的主要功能</h2>
<ul>
<li>Plug and play: integrates easily with community control plugins, lowering the barrier to use.</li>
<li>High-quality generation: enriches prompts with large multimodal models to generate highly consistent images.</li>
<li>Pose control: supports specifying the subject's pose and generating images that follow it.</li>
<li>Multi-garment transfer: combines multiple garment elements into a single image, suited to scenarios such as e-commerce apparel display.</li>
</ul>
<h2 style="font-size: 20px;">DreamFit的技术原理</h2>
<ul>
<li>Lightweight encoder (Anything-Dressing Encoder): uses LoRA layers to extend a pretrained diffusion model (e.g., the Stable Diffusion UNet) into a lightweight garment feature extractor. Only the LoRA layers are trained, rather than the whole UNet, which greatly reduces model complexity and training cost.</li>
<li>Adaptive attention: introduces two trainable linear projection layers that align reference-image features with the latent noise, then injects the reference features into the UNet through an adaptive attention mechanism so that the generated image stays highly consistent with the reference (a minimal sketch of these two components follows this list).</li>
<li>Pretrained large multimodal models (LMMs): at inference time, an LMM rewrites the user's text prompt to add fine-grained descriptions of the reference image, reducing the prompt gap between training and inference.</li>
</ul>
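<p>To make the first two ideas above concrete, here is a minimal PyTorch sketch of a LoRA-wrapped linear layer and an adaptive cross-attention block with two trainable projections. All class names, shapes, and hyperparameters are illustrative assumptions, not code from the DreamFit repository.</p>
<pre><code>import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update.

    In DreamFit-style training only these low-rank weights are optimized,
    while the pretrained UNet weights stay frozen.
    """
    def __init__(self, base, rank=16, scale=1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)  # start as a no-op on top of the frozen layer
        self.scale = scale

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))


class AdaptiveAttention(nn.Module):
    """Injects reference (garment) features into UNet latent tokens.

    Two trainable projections map the reference features into the latent
    space; the result is fused by cross-attention and added residually.
    """
    def __init__(self, dim, ref_dim, heads=8):
        super().__init__()
        self.heads = heads
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.proj_k = nn.Linear(ref_dim, dim, bias=False)  # trainable projection 1
        self.proj_v = nn.Linear(ref_dim, dim, bias=False)  # trainable projection 2
        self.to_out = nn.Linear(dim, dim)

    def forward(self, latent_tokens, ref_tokens):
        b, n, d = latent_tokens.shape
        h = self.heads
        q = self.to_q(latent_tokens).view(b, n, h, d // h).transpose(1, 2)
        k = self.proj_k(ref_tokens).view(b, -1, h, d // h).transpose(1, 2)
        v = self.proj_v(ref_tokens).view(b, -1, h, d // h).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(b, n, d)
        return latent_tokens + self.to_out(out)  # residual injection into the UNet block


# Toy usage: wrap one projection of a frozen UNet, then fuse garment features.
lora_proj = LoRALinear(nn.Linear(320, 320))
attn = AdaptiveAttention(dim=320, ref_dim=320)
latents = torch.randn(1, 64 * 64, 320)   # UNet block tokens
garment = torch.randn(1, 64 * 64, 320)   # features from the LoRA-adapted encoder
print(attn(latents, garment).shape)      # torch.Size([1, 4096, 320])
</code></pre>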
<h2 style="font-size: 20px;">DreamFit的项目地址</h2>
<ul>
<li>GitHub repository: https://github.com/bytedance/DreamFit</li>
<li>arXiv paper: https://arxiv.org/pdf/2412.17644</li>
</ul>
<h2 style="font-size: 20px;">DreamFit的应用场景</h2>
<ul>
<li>Virtual try-on: shoppers try on clothing online, saving time and cost and improving the shopping experience.</li>
<li>Apparel design: designers quickly generate garment renderings, speeding up the design process and improving efficiency.</li>
<li>Personalized advertising: generates customized ads based on user preferences, improving appeal and conversion rates.</li>
<li>Virtual reality (VR) / augmented reality (AR): provides virtual try-on experiences that enhance immersion and interactivity.</li>
<li>Social media content creation: generates personalized images that attract attention and add variety and appeal to content.</li>
</ul>
<div class="markdown-heading" dir="auto">
<h2 class="heading-element" dir="auto" tabindex="-1">Installation Guide</h2>
<a id="user-content-installation-guide" class="anchor" href="https://github.com/bytedance/DreamFit#installation-guide" aria-label="Permalink: Installation Guide"></a></div>
<ol dir="auto">
<li>Clone our repo:</li>
</ol>
<div class="highlight highlight-source-shell notranslate position-relative overflow-auto" dir="auto">
<pre>git clone https://github.com/bytedance/DreamFit.git</pre>
<div class="zeroclipboard-container"> </div>
</div>
<ol dir="auto" start="2">
<li>Create a new virtual environment:</li>
</ol>
<div class="highlight highlight-source-shell notranslate position-relative overflow-auto" dir="auto">
<pre>conda create -n dreamfit python==3.10
conda activate dreamfit</pre>
<div class="zeroclipboard-container"> </div>
</div>
<ol dir="auto" start="3">
<li>Install our dependencies by running the following commands:</li>
</ol>
<div class="highlight highlight-source-shell notranslate position-relative overflow-auto" dir="auto">
<pre>pip install -r requirements.txt
pip install flash-attn --no-build-isolation --use-pep517 </pre>
<div class="zeroclipboard-container"> </div>
</div>
<div class="markdown-heading" dir="auto">
<h2 class="heading-element" dir="auto" tabindex="-1">Models</h2>
<a id="user-content-models" class="anchor" href="https://github.com/bytedance/DreamFit#models" aria-label="Permalink: Models"></a></div>
<ol dir="auto">
<li>Download the pretrained DreamFit models <a href="https://huggingface.co/bytedance-research/Dreamfit" rel="nofollow">here</a> and place the checkpoints in the <code>pretrained_models</code> folder (an optional download sketch follows this list).</li>
<li>If you want to run inference with the Stable Diffusion 1.5 version, download <a href="https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5" rel="nofollow">stable-diffusion-v1-5</a> and <a href="https://huggingface.co/stabilityai/sd-vae-ft-mse" rel="nofollow">sd-vae-ft-mse</a> to <code>pretrained_models</code>. If you want to generate images in different styles, you can download a corresponding stylized model, such as <a href="https://huggingface.co/SG161222/Realistic_Vision_V6.0_B1_noVAE" rel="nofollow">RealisticVision</a>, to <code>pretrained_models</code>.</li>
<li>If you want to run inference with the Flux version, download <a href="https://huggingface.co/black-forest-labs/FLUX.1-dev" rel="nofollow">flux-dev</a> to the <code>pretrained_models</code> folder.</li>
<li>If you want to run inference with pose control, download the <a href="https://huggingface.co/lllyasviel/Annotators" rel="nofollow">Annotators</a> checkpoints to the <code>pretrained_models</code> folder.</li>
</ol>
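<p>One optional way to fetch these checkpoints programmatically is with the <code>huggingface_hub</code> Python package. The sketch below is not part of the repo; note that FLUX.1-dev is a gated model, so you must accept its license on Hugging Face and authenticate with <code>huggingface-cli login</code> before downloading it.</p>
<pre><code>from huggingface_hub import snapshot_download

# DreamFit checkpoints (flux_i2i.bin, flux_i2i_with_pose.bin, flux_tryon.bin, sd15_i2i.ckpt).
snapshot_download(repo_id="bytedance-research/Dreamfit", local_dir="pretrained_models")

# Base models -- download only the ones you plan to use.
snapshot_download(repo_id="stable-diffusion-v1-5/stable-diffusion-v1-5",
                  local_dir="pretrained_models/stable-diffusion-v1-5")
snapshot_download(repo_id="stabilityai/sd-vae-ft-mse",
                  local_dir="pretrained_models/sd-vae-ft-mse")
snapshot_download(repo_id="lllyasviel/Annotators",
                  local_dir="pretrained_models/Annotators")

# Gated model: accept the FLUX.1-dev license and log in before running this line.
snapshot_download(repo_id="black-forest-labs/FLUX.1-dev",
                  local_dir="pretrained_models/FLUX.1-dev")
</code></pre>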
<p>The folder structure should look like this:</p>
<div class="snippet-clipboard-content notranslate position-relative overflow-auto">
<pre class="notranslate"><code>├── pretrained_models/
| ├── flux_i2i_with_pose.bin
│ ├── flux_i2i.bin
│ ├── flux_tryon.bin
│ ├── sd15_i2i.ckpt
| ├── stable-diffusion-v1-5/
| | ├── ...
| ├── sd-vae-ft-mse/
| | ├── diffusion_pytorch_model.bin
| | ├── ...
| ├── Realistic_Vision_V6.0_B1_noVAE(or other stylized model)/
| | ├── unet/
| | | ├── diffusion_pytorch_model.bin
| | | ├── ...
| | ├── ...
| ├── Annotators/
| | ├── body_pose_model.pth
| | ├── facenet.pth
| | ├── hand_pose_model.pth
| ├── FLUX.1-dev/
| | ├── flux1-dev.safetensors
| | ├── ae.safetensors
| | ├── tokenizer
| | ├── tokenizer_2
| | ├── text_encoder
| | ├── text_encoder_2
| | ├── ...
</code></pre>
<div class="zeroclipboard-container"> </div>
</div>
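<p>Before running the inference scripts, a quick pre-flight check (a hypothetical snippet, not part of the repo) can confirm the DreamFit checkpoints listed in the tree above are in place:</p>
<pre><code>import os

# File names are taken from the folder tree above.
required = [
    "pretrained_models/flux_i2i_with_pose.bin",
    "pretrained_models/flux_i2i.bin",
    "pretrained_models/flux_tryon.bin",
    "pretrained_models/sd15_i2i.ckpt",
]
missing = [p for p in required if not os.path.exists(p)]
print("All DreamFit checkpoints found." if not missing else f"Missing: {missing}")
</code></pre>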
<div class="markdown-heading" dir="auto">
<h2 class="heading-element" dir="auto" tabindex="-1">Inference</h2>
<a id="user-content-inference" class="anchor" href="https://github.com/bytedance/DreamFit#inference" aria-label="Permalink: Inference"></a></div>
<div class="markdown-heading" dir="auto">
<h3 class="heading-element" dir="auto" tabindex="-1">Garment-Centric Generation</h3>
<a id="user-content-garment-centric-generation" class="anchor" href="https://github.com/bytedance/DreamFit#garment-centric-generation" aria-label="Permalink: Garment-Centric Generation"></a></div>
<div class="highlight highlight-source-shell notranslate position-relative overflow-auto" dir="auto">
<pre># inference with FLUX version
bash run_inference_dreamfit_flux_i2i.sh \
--cloth_path example/cloth/cloth_1.png \
--image_text "A woman wearing a white Bape T-shirt with a colorful ape graphic and bold text." \
--save_dir "." \
--seed 164143088151
# inference with StableDiffusion1.5 version
bash run_inference_dreamfit_sd15_i2i.sh \
--cloth_path example/cloth/cloth_3.jpg \
--image_text "A woman with curly hair wears a pink t-shirt with a logo and white stripes on the sleeves, paired with white trousers, against a plain white background." \
--ref_scale 1.0 \
--base_model pretrained_models/Realistic_Vision_V6.0_B1_noVAE/unet/diffusion_pytorch_model.bin \
--base_model_load_method diffusers \
--save_dir "." \
--seed 28</pre>
<div class="zeroclipboard-container"> </div>
</div>
<p>Tips:</p>
<ol dir="auto">
<li>If you have multiple pieces of clothing, you can splice them onto one picture, as shown in the second row of the demo images; a minimal splicing sketch follows these tips.</li>
<li>Use <code>--help</code> to check the meaning of each argument.</li>
</ol>
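<p>For tip 1, here is a minimal Pillow sketch that splices several garment photos into one reference image. It is a hypothetical helper, not a script from this repo:</p>
<pre><code>from PIL import Image

def splice_garments(paths, out_path="spliced_cloth.png", height=768):
    """Resize garments to a common height and paste them side by side on a white canvas."""
    imgs = []
    for p in paths:
        img = Image.open(p).convert("RGB")
        width = max(1, round(img.width * height / img.height))  # keep aspect ratio
        imgs.append(img.resize((width, height)))
    canvas = Image.new("RGB", (sum(i.width for i in imgs), height), "white")
    x = 0
    for img in imgs:
        canvas.paste(img, (x, 0))
        x += img.width
    canvas.save(out_path)
    return out_path

# Example: combine two of the provided garments, then pass the result via --cloth_path.
splice_garments(["example/cloth/cloth_1.png", "example/cloth/cloth_3.jpg"])
</code></pre>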
<div class="highlight highlight-source-shell notranslate position-relative overflow-auto" dir="auto">
<pre>bash run_inference_dreamfit_flux_i2i_with_pose.sh \
--cloth_path example/cloth/cloth_1.png \
--pose_path example/pose/pose_1.jpg \
--image_text "A woman wearing a white Bape T-shirt with a colorful ape graphic and bold text." \
--save_dir "." \
--seed 16414308815</pre>
</div>
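<p><code>--pose_path</code> expects a pose image. If you only have a reference photo, one possible way to produce an OpenPose map is with the <code>controlnet_aux</code> package together with the Annotators weights downloaded earlier; this is an assumption about suitable preprocessing, not necessarily what the repo's scripts do internally.</p>
<pre><code>from PIL import Image
from controlnet_aux import OpenposeDetector

# Load the OpenPose detector from the locally downloaded Annotators weights.
detector = OpenposeDetector.from_pretrained("pretrained_models/Annotators")

person = Image.open("example/pose/reference_photo.jpg")  # hypothetical input photo
pose = detector(person)  # body-pose map by default; hand/face options vary by version
pose.save("example/pose/pose_custom.jpg")  # pass this file via --pose_path
</code></pre>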
<div class="markdown-heading" dir="auto">
<h3 class="heading-element" dir="auto" tabindex="-1">Tryon</h3>
<a id="user-content-tryon" class="anchor" href="https://github.com/bytedance/DreamFit#tryon" aria-label="Permalink: Tryon"></a></div>
<div class="highlight highlight-source-shell notranslate position-relative overflow-auto" dir="auto">
<pre>bash run_inference_dreamfit_flux_tryon.sh \
--cloth_path example/cloth/cloth_1.png \
--keep_image_path example/tryon/keep_image_4.png \
--image_text "A woman wearing a white Bape T-shirt with a colorful ape graphic and bold text and a blue jeans." \
--save_dir "." \
--seed 16414308815</pre>
<div class="zeroclipboard-container"> </div>
</div>
<p>Tips:</p>
<ol dir="auto">
<li>The keep image is obtained by drawing the OpenPose skeleton onto the garment-agnostic region.</li>
<li>The keep-image generation code cannot be open-sourced for the time being. As an alternative, we provide several keep images for testing.</li>
</ol>
<div class="markdown-heading" dir="auto">
<h2 class="heading-element" dir="auto" tabindex="-1">Disclaimer</h2>
</div>
<p>Most images used in this repository are sourced from the Internet. These images are solely intended to demonstrate the capabilities of our research. If you have any concerns, please contact us, and we will promptly remove any inappropriate content.</p>
<p>This project aims to make a positive impact on the field of AI-driven image generation. Users are free to create images using this tool, but they must comply with local laws and use it responsibly. The developers do not assume any responsibility for potential misuse by users.</p>
<div class="markdown-heading" dir="auto">
<h2 class="heading-element" dir="auto" tabindex="-1">Citation</h2>
<a id="user-content-citation" class="anchor" href="https://github.com/bytedance/DreamFit#citation" aria-label="Permalink: Citation"></a></div>
<div class="snippet-clipboard-content notranslate position-relative overflow-auto">
<pre class="notranslate"><code>@article{lin2024dreamfit,
title={DreamFit: Garment-Centric Human Generation via a Lightweight Anything-Dressing Encoder},
author={Lin, Ente and Zhang, Xujie and Zhao, Fuwei and Luo, Yuxuan and Dong, Xin and Zeng, Long and Liang, Xiaodan},
journal={arXiv preprint arXiv:2412.17644},
year={2024}
}
</code></pre>
<div class="zeroclipboard-container"> </div>
</div>
<div class="markdown-heading" dir="auto"> </div>