Dense Paraphrasing for Multimodal Dialogue Interpretation
Author Information
Author(s): Tu Jingxuan, Rim Kyeongmin, Ye Bingyang, Lai Kenneth, Pustejovsky James
Primary Institution: Brandeis University
Hypothesis
Can Dense Paraphrasing improve the interpretation of multimodal dialogues by translating nonverbal modalities into linguistic expressions?
Conclusion
The study demonstrates that augmenting dialogue context with Dense Paraphrasing significantly improves common ground reasoning in dialogue systems.
Supporting Evidence
- The dense paraphrasing technique improves the alignment of information from multiple modalities.
- Results indicate that the proposed method outperforms baseline models in common ground reasoning tasks.
- Using decontextualized utterances significantly enhances model performance.
Takeaway
This study shows that turning gestures and actions into words can help computers understand conversations better, especially when people are talking and using their hands at the same time.
Methodology
The study uses Dense Paraphrasing to convert multimodal dialogue data into a structured text format, which is then processed by Large Language Models to improve common ground tracking.
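The conversion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Event fields, the template wording, and the example participants and objects are all hypothetical, chosen only to show how time-stamped speech, gesture, and action annotations might be merged into one text transcript that a Large Language Model could read.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    """One annotated dialogue event (hypothetical schema, not from the paper)."""
    start: float                  # onset time in seconds
    speaker: str                  # participant label
    modality: str                 # "speech", "gesture", or "action"
    content: str                  # utterance text, or a verb phrase for nonverbal events
    target: Optional[str] = None  # object a gesture or action refers to


def paraphrase(event: Event) -> str:
    """Render a single event as a plain-text sentence."""
    if event.modality == "speech":
        return f'{event.speaker} says: "{event.content}"'
    # Gestures and actions are verbalized as "<speaker> <verb phrase> <target>".
    parts = [event.speaker, event.content]
    if event.target:
        parts.append(event.target)
    return " ".join(parts)


def dense_paraphrase(events: List[Event]) -> str:
    """Merge events from all modalities into one time-ordered transcript."""
    ordered = sorted(events, key=lambda e: e.start)
    return "\n".join(paraphrase(e) for e in ordered)


# Hypothetical example: the events are given out of order on purpose.
events = [
    Event(4.5, "P2", "action", "picks up", "the red block"),
    Event(2.0, "P1", "speech", "I think this one is heavier"),
    Event(2.1, "P1", "gesture", "points at", "the red block"),
]
transcript = dense_paraphrase(events)
print(transcript)
```

In this sketch the pointing gesture becomes its own sentence next to the utterance it accompanies, so a text-only model can resolve "this one" to the red block. A real pipeline would need a richer annotation scheme and more careful alignment of overlapping events.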
Potential Biases
The reliance on a specific dataset may introduce biases that affect the generalizability of the findings.
Limitations
The study is based on a small dataset of controlled dialogues, which may not represent the diversity of real-world interactions.
Participant Demographics
Participants were English-speaking university students aged 19 to 35.
Statistical Information
P-Value
p < 0.05
Statistical Significance
Results were reported as statistically significant at p < 0.05.