Abstract: Despite encouraging progress in 3D scene understanding, it remains challenging to develop an effective Large Multimodal Model (LMM) that is capable of understanding and reasoning in complex ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results