Youngwoo Yoon*, Pieter Wolfert*, Taras Kucherenko*, Carla Viegas, Teodor Nikolov, Mihail Tsakov, Gustav Eje Henter
[Full challenge paper (ACM TOG)] [Initial publication (ICMI’22)]
Summary
This webpage contains data, code, and results from the second GENEA Challenge, intended as a benchmark of data-driven automatic co-speech gesture generation. In the challenge, participating teams used a common speech and motion dataset to build gesture-generation systems. Motion generated by all these systems was then rendered to video using a standardised visualisation and evaluated in several large, crowdsourced user studies. This year’s dataset was based on 18 hours of full-body motion capture, including fingers, of different persons engaging in dyadic conversation, taken from the Talking With Hands 16.2M dataset. Ten teams participated in the evaluation across two tiers: full-body and upper-body gesticulation. For each tier we evaluated both the human-likeness of the gesture motion and its appropriateness for the specific speech.
The evaluation results are a revolution, and a revelation: Some synthetic conditions are rated as significantly more human-like than human motion capture. At the same time, all synthetic motion is found to be vastly less appropriate for the speech than the original motion-capture recordings.
For more information, please see our paper, watch the challenge introduction video below, and follow the links below to the challenge data, code, and results.
Open-source materials
- Data
- Challenge dataset: DOI 10.5281/zenodo.6998230
- 3D coordinates of submitted motion: DOI 10.5281/zenodo.6973296
- Submitted BVH files: DOI 10.5281/zenodo.6976463
- User-study video stimuli: DOI 10.5281/zenodo.6997925
- Annotation manual: contact us if you need it
- Code
- Visualization code: github.com/TeoNikolov/genea_visualizer
- Objective evaluation code: github.com/genea-workshop/genea_numerical_evaluations
- Text-based baseline: Yoon et al. (ICRA 2019)
- Audio-based baseline: Kucherenko et al. (IVA 2019)
- Interface for subjective evaluations: HEMVIP
- Code for creating attention-check videos: create_attention_check
- Utility to trim BVH files: trim_bvh
- Modified PyMO for the challenge dataset: Modified PyMO (see the BVH-loading sketch at the end of this section)
- Results
- Subjective evaluation responses, analysis, and results: DOI 10.5281/zenodo.6939888
- Scripts to run Barnard’s test: genea-appropriateness (see the minimal usage sketch at the end of this section)
- Objective evaluation data: DOI 10.5281/zenodo.6979990 (FGD metric will be included later; see the FGD sketch at the end of this section)
- Papers and presentation videos
- The GENEA Challenge 2022: A large evaluation of data-driven co-speech gesture generation [ACM article] [Youtube]
- Exemplar-based Stylized Gesture Generation from Speech: An Entry to the GENEA Challenge 2022 [OpenReview] [ACM article] [Youtube]
- TransGesture: Autoregressive Gesture Generation with RNN-Transducer [OpenReview] [ACM article] [Youtube]
- The IVI Lab entry to the GENEA Challenge 2022 – A Tacotron2 Based Method for Co-Speech Gesture Generation With Locality-Constraint Attention Mechanism [OpenReview] [ACM article] [Youtube]
- Hybrid Seq2Seq Architecture for 3D Co-Speech Gesture Generation [OpenReview] [ACM article] [Youtube]
- The DeepMotion entry to the GENEA Challenge 2022 [OpenReview] [ACM article] [Youtube]
- The ReprGesture entry to the GENEA Challenge 2022 [OpenReview] [ACM article] [Youtube]
- UEA Digital Humans entry to the GENEA Challenge 2022 [OpenReview] [ACM article] [Youtube]
- GestureMaster: Graph-based Speech-driven Gesture Generation [OpenReview] [ACM article] [Youtube]
- ReCell: replicating recurrent cell for auto-regressive pose generation [OpenReview] [ACM article] [Youtube]
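The modified PyMO linked above exposes the standard PyMO interface. Purely as a rough illustration (not the challenge's official preprocessing), loading one of the challenge BVH files and converting it to joint positions might look like the sketch below; the file name is a placeholder.

# Minimal sketch of loading a challenge BVH file with the modified PyMO.
# The file name is a placeholder; adapt it to your copy of the dataset.
from pymo.parsers import BVHParser
from pymo.preprocessing import MocapParameterizer

parser = BVHParser()
mocap = parser.parse('some_challenge_take.bvh')  # hypothetical file name

# Convert joint rotations to global 3D joint positions.
positions = MocapParameterizer('position').fit_transform([mocap])[0]
print(positions.values.shape)  # roughly (number of frames, 3 * number of joints)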
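The appropriateness analysis compared conditions with Barnard's exact test; the official scripts are in the genea-appropriateness repository linked above. As an illustration of the test itself (with invented counts, not challenge data), SciPy's implementation can be used as follows.

# Illustration of Barnard's exact test on a 2x2 contingency table using SciPy.
# The counts are invented for illustration; use the genea-appropriateness
# scripts to reproduce the challenge analysis.
from scipy.stats import barnard_exact

# Rows: two conditions; columns: e.g. matched stimulus preferred vs. not preferred.
table = [[57, 43],
         [38, 62]]

result = barnard_exact(table, alternative='two-sided')
print(f'statistic = {result.statistic:.3f}, p-value = {result.pvalue:.4f}')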
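The objective-evaluation data above will later include the Fréchet gesture distance (FGD). FGD follows the Fréchet inception distance recipe: fit Gaussians to features of natural and synthetic motion and compute the Fréchet distance between them. The sketch below shows only that distance computation; it assumes you already have feature matrices from some motion feature extractor and is not the challenge's exact implementation.

# Frechet distance between Gaussians fitted to two feature sets, as used in
# FGD-style metrics. feats_a and feats_b have shape (num_samples, feature_dim).
# This is a generic sketch, not the challenge's official FGD code.
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):  # discard tiny imaginary parts from numerical noise
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))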
Citation
If you use materials from this challenge, please cite our paper about the challenge:
@article{kucherenko2024evaluating,
  author = {Kucherenko, Taras and Wolfert, Pieter and Yoon, Youngwoo and Viegas, Carla and Nikolov, Teodor and Tsakov, Mihail and Henter, Gustav Eje},
  title = {Evaluating Gesture Generation in a Large-scale Open Challenge: The GENEA Challenge 2022},
  journal = {ACM Transactions on Graphics},
  year = {2024},
  issue_date = {June 2024},
  publisher = {Association for Computing Machinery},
  volume = {43},
  number = {3},
  issn = {0730-0301},
  url = {https://doi.org/10.1145/3656374},
  doi = {10.1145/3656374},
  month = {jun},
  articleno = {32},
}
Also consider citing the original paper about the motion data from Meta Research:
@inproceedings{lee2019talking,
  title = {{T}alking {W}ith {H}ands 16.2{M}: {A} large-scale dataset of synchronized body-finger motion and audio for conversational motion analysis and synthesis},
  author = {Lee, Gilwoo and Deng, Zhiwei and Ma, Shugao and Shiratori, Takaaki and Srinivasa, Siddhartha S. and Sheikh, Yaser},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages = {763--772},
  doi = {10.1109/ICCV.2019.00085},
  series = {ICCV '19},
  publisher = {IEEE},
  year = {2019}
}
Contact
- You can e-mail the GENEA organisers at genea-contact@googlegroups.com.
- Also see the main GENEA website for additional information.