The dataset has not been released yet, but the release is coming soon. Until then, give us a star on GitHub so that we know we should hurry up!

MV-Fashion: Towards Enabling Virtual Try-On and Size Estimation with Multi-View Paired Data

CVPR 2026

Abstract

Existing 4D human datasets often fall short for fashion-specific research, lacking either realistic garment dynamics or task-specific annotations. To bridge this gap, we introduce MV-Fashion, a massive multi-view video dataset engineered for domain-specific fashion analysis.

MV-Fashion captures complex, real-world garment dynamics across 80 diverse subjects wearing multiple layered outfits. Crucially for Virtual Try-On (VTON) applications, it provides paired data: synchronized multi-view captures of worn garments alongside their corresponding flat, catalogue images.

MV-Fashion Annotations Overview

Dataset Highlights

Explore the diversity and quality of the captured data, designed to push the boundaries of human-centric rendering and virtual try-on.

Diverse Poses

Capturing a wide range of natural and complex human motions.

Layered Outfits

Intricate details of multi-layered clothing combinations.

Multi-view Consistency

Synchronized capture ensuring perfect alignment across all views.

Paired Data

Catalogue-domain images paired with the multi-view recordings for VTON.

Challenging Garments

Includes difficult items like loose dresses and transparent fabrics.

Robust Tracking

Accurate SMPL-X fitting and tracking.

80 Subjects

Diverse pool of participants (50.6% male, 45.7% female) across various BMI and age distributions.

754 Garments

Spanning 14 distinct fashion categories, comprising single, double, and triple-layered outfits.

Paired VTON Data

Unique paired data featuring synchronized multi-view captures of worn garments with corresponding flat catalogue images.

68 Synchronized Cameras

60 RGB global shutter and 8 Depth/4K cameras capturing real-world dynamic deformations.

Rich Annotations

Features precise SMPL-X fits, 3D point clouds, text descriptions, and segmentation masks.

3,273 Sequences

Extensive multi-view video database yielding over 72.5 million high-fidelity frames.
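
To put the headline numbers in perspective, the short Python sketch below derives rough per-sequence and per-camera averages from the figures quoted on this page. It assumes that all 68 cameras contribute equally to the frame total, which the page does not state, and it treats "over 72.5 million" as a lower bound. The function name `average_frames` is purely illustrative and not part of any released tooling.

```python
# Headline figures quoted on this page.
TOTAL_FRAMES = 72_500_000   # "over 72.5 million", so treat this as a lower bound
NUM_SEQUENCES = 3_273
NUM_CAMERAS = 68            # 60 RGB global shutter + 8 depth/4K


def average_frames(total_frames, num_sequences, num_cameras):
    """Return (frames per sequence, frames per camera per sequence).

    Assumption: every camera contributes the same number of frames.
    """
    per_sequence = total_frames / num_sequences
    per_camera = per_sequence / num_cameras
    return per_sequence, per_camera


per_sequence, per_camera = average_frames(TOTAL_FRAMES, NUM_SEQUENCES, NUM_CAMERAS)
print(f"about {per_sequence:,.0f} frames per sequence across all views")
print(f"about {per_camera:,.0f} frames per camera per sequence")
```

These are back-of-the-envelope averages only. The frame rate is not given here, so they cannot be converted into sequence durations without further information.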

Get Started

Clone the repository, install the dependencies, and download the sample dataset with just a few commands.

bash
# Clone the repository
git clone https://github.com/HunorLaczko/MV-Fashion.git
cd MV-Fashion

# Install dependencies
pip install -r requirements.txt

# Download the sample dataset
bash scripts/download_sample.sh

Citation

If you find our work useful in your research, please consider citing:

@misc{laczko2026mvfashionenablingvirtualtryon,
      title={MV-Fashion: Towards Enabling Virtual Try-On and Size Estimation with Multi-View Paired Data}, 
      author={Hunor Laczkó and Libang Jia and Loc-Phat Truong and Diego Hernández and Sergio Escalera and Jordi Gonzalez and Meysam Madadi},
      year={2026},
      eprint={2603.08147},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.08147},
}