Assistant Professor in Visual Computing at Durham University. Stuart's research focus is on Visual Reasoning to understand the layout of visual content, from Iconography (e.g. Sketches) to 3D Scene understanding, and the implications for methods of interaction. He is currently a co-I on the RePAIR EU FET, DCitizens EU Twinning, and BoSS EU Lighthouse projects. He was a co-I on the MEMEX RIA EU H2020 project, coordinated at IIT, for increasing social inclusion with Cultural Heritage. Stuart has previously held Researcher and PostDoc positions at IIT, as well as PostDoc positions at University College London (UCL) and the University of Surrey. He was also awarded his PhD at the University of Surrey, on visual information retrieval for sketches. Stuart holds an External Scientist position at IIT and honorary roles at UCL and UCL Digital Humanities, and is an international collaborator of ITI/LARSyS. He also regularly organises the Vision for Art (VISART) workshop and Humanities-orientated tutorials, and was Program Chair at the British Machine Vision Conference (BMVC) 2021.

Stuart James




Research interests

My research activities fit broadly into Spatial Reasoning: how we can reason about the layout of objects in space, in both 2D and 3D, to provide insight or retrieve relevant information. I have a keen interest in varied data types, including those from the Humanities such as Art and Cultural Heritage.

Exploring using Depth and Knowledge to answer questions specifically related to the layout of a 3D scene from a 2D perspective.

Visual Question & Answering

Detection, Representation and Reasoning on simplified representations or symbols such as Sketch, Line, Hatching, Motifs or icons.

Abstract & Iconography Reasoning

Identifying and retrieving relevant knowledge held within Knowledge Graphs to support Computer Vision tasks such as Visual Question and Answering or reasoning on location.

Knowledge Retrieval & Reasoning

Reconstructing the semantic relational structure of the scene using geometry and knowledge. Providing advanced interaction for questioning and reasoning.

Scene Graph

Principally on the layout of content in 2D or 3D and how to make decisions about a path or option, linked with Visual Question and Answering.

Planning & Reasoning

We have explored using sketches to search collections of videos using Visual Storyboarding to express the sequence of events in the target clip.

Sketch based Retrieval

We are using sequences to retrieve information, providing a broader context than a one-off search. We have demonstrated this through free-hand storyboarding and story synthesis.

Visual Narratives and Stories

We explored the use of free-hand sketching in an immersive environment (VR) with multiple modalities for the task of retrieval.

Interaction in Virtual Reality

Providing storytelling experiences overlaying information of surrounding Cultural Heritage and the stories of the participants in the MEMEX Project.

Interaction in Augmented Reality

Cultural Heritage & Digital Humanities

Assistive Technologies


Research Group & Collaborators

Research Topic: Causality and Representation learning

Supervisor with Dr Alessio Del Bue (IIT)

Davide Talon

PhD Student

Research Topic: Optimising camera localisation in urban scenes

Collaborator with Dr Alessio Del Bue (IIT)

Dr Matteo Toso

PostDoc Collaborator

Research Topic: RePAIR Fresco 3D reconstruction and assembly

Collaborator with Dr Alessio Del Bue (IIT)

Dr Theodore Tsesmelis

PostDoc Collaborator


Mohamed Dahy Abdelaher Elkhouly

PhD Student

Dr Matteo Taiana

PostDoc Collaborator

Àlex Solé Gómez

Research Fellow

Dr Daniele Giunchi

External Collaborator


Looking to do a PhD?

Our group is always looking for good PhD candidates, so if you are interested in doing a PhD in Visual Reasoning please contact me to discuss the options. For more details, please review the research areas and publications before making an inquiry or application.

Current Funding options:

Call for Interest in MSCA Postdoctoral Fellowships

Open call for interest in co-writing an MSCA Postdoctoral Fellowship on Computer Vision applied to the Arts and Humanities at Durham University. There is a wide array of topics we can discuss, including everything from digitisation to understanding and reasoning about art and heritage. The MSCA is an international collaborative program, so a long-term secondment is required.

Project Duration: 1-2 Years

The EU provides support for the recruited researcher in the form of

  • a living allowance
  • a mobility allowance
  • if applicable, family, long-term leave and special needs allowances

In addition, funding is provided for

  • research, training and networking activities
  • management and indirect costs


  • PhD or 4 years of full-time research experience


  • Call opens 10 April 2024
  • Deadline 11 September 2024

Full details at

Feel free to contact me if you have any questions; please use "MSCA Postdoctoral Fellowships" in the subject line.

Latest Blog Post

18 Jul 2023 . research . New position at Durham University

As of 1st September 2023, I will be taking up a position as Assistant Professor in Visual Computing at Durham University working in the VIViD group. This marks a major transition for me, as I move from being a contract-based Assistant Professor (or Researcher RTDa in the Italian system) to a permanent member of staff (i.e. Lecturer).


Latest Publication

2024 IFFNeRF: Initialisation Free and Fast 6DoF pose estimation from a single image and a NeRF model

We introduce IFFNeRF to estimate the six degrees-of-freedom (6DoF) camera pose of a given image, building on the Neural Radiance Fields (NeRF) formulation. IFFNeRF is specifically designed to operate in real-time and eliminates the need for an initial pose guess that is proximate to the sought solution. IFFNeRF utilizes the Metropolis-Hastings algorithm to sample surface points from within the NeRF model. From these sampled points, we cast rays and deduce the color for each ray through pixel-level view synthesis. The camera pose can then be estimated as the solution to a Least Squares problem by selecting correspondences between the query image and the resulting bundle. We facilitate this process through a learned attention mechanism, bridging the query image embedding with the embedding of parameterized rays, thereby matching rays pertinent to the image. Through synthetic and real evaluation settings, we show that our method can improve the angular and translation error accuracy by 80.1% and 67.3%, respectively, compared to iNeRF while performing at 34fps on consumer hardware and not requiring the initial pose guess.

Accepted at the International Conference on Robotics and Automation (ICRA) in Yokohama, Japan.

See full publication index


To find out more about our research you can find me at...

Department of Computer Science

Durham University
Room MS2099, Mathematical Sciences and Computer Science Building, Durham University, Upper Mountjoy, Stockton Road, DURHAM, DH1 3LE