AI-based multi-modal 3D environment understanding and visualisation

University of Southampton

This project aims to develop a practical AI-based solution for understanding 3D environments from multi-modal (audio/visual) input data and reproducing them in a virtual or augmented reality space, allowing real-time 3D interaction with spatial audio adapted to the environment and the user's location.

Computer vision is one of the most active areas in which artificial intelligence (AI) is being used. The field is expanding rapidly and attracting a great deal of interest and investment. Active perception of a surrounding environment through AI relies heavily on the design of architectures and their extensive training to generate compact representations. Thanks to recent advances in AI technology, these representations have significantly improved how AI agents build new knowledge and acquire new skills, and have enabled practical applications in daily life.

Scene understanding is the task of representing a captured scene in a way that emulates human understanding of that space. Attaining this understanding is crucial for applications such as robotics, telecommunication, smart homes, healthcare and assisted living.

In this project, you will join a team working on a pipeline for modelling and rendering the full environment, including 3D geometry, semantic objects and material attributes, from multi-modal inputs such as video, audio and text. You will investigate topics in AI-based multi-modal 3D scene understanding and visualisation.

The project will be supervised by Dr Hansung Kim and Dr Rahman Attar.

https://www.southampton.ac.uk/people/5y65w6/doctor-hansung-kim

https://www.southampton.ac.uk/people/62b6zg/doctor-rahman-attar

There will be various opportunities to attend the British vision summer school or major international conferences such as the Conference on Computer Vision and Pattern Recognition (CVPR) and the International Conference on Computer Vision (ICCV).

Entry requirements

You must have a minimum of a UK 2:1 honours degree, or its international equivalent, in computer vision and machine learning, and be proficient in Python.

Experience with camera or VR systems and experience of publishing academic papers are desirable but not essential.

Applicants without an MSc or MEng in computer vision or machine learning will need to provide strong justification that they would be able to complete a PhD in this field.

How to apply

https://student-selfservice.soton.ac.uk/BNNRPROD/bzsksrch.P_Search

You need to:

  • choose programme type (Research), 2025/26, Faculty of Engineering and Physical Sciences
  • select Full time or Part time
  • choose the relevant PhD in Computer Science
  • add name of the supervisor in section 2

Applications should include:

  • personal statement
  • your CV (resumé)
  • 2 academic references
  • degree transcripts to date
