Aesthetic Camera Viewpoint Suggestion with 3D Aesthetic Field

¹University of Waterloo, ²City University of Hong Kong
CVPR 2026

We learn a 3D aesthetic field from sparse scene observations that captures spatially varying aesthetics, enabling efficient discovery of visually pleasing camera viewpoints.

Abstract

The aesthetic quality of a scene depends strongly on camera viewpoint. Existing approaches to aesthetic viewpoint suggestion fall into two groups: single-view methods, which predict limited camera adjustments from a single image without understanding scene geometry, and 3D exploration methods, which rely on dense captures or prebuilt 3D environments coupled with costly reinforcement learning (RL) searches. In this work, we introduce the notion of a 3D aesthetic field that enables geometry-grounded aesthetic reasoning in 3D from sparse captures, allowing efficient viewpoint suggestion in contrast to costly RL searches. We learn this 3D aesthetic field using a feedforward 3D Gaussian Splatting network that distills high-level aesthetic knowledge from a pretrained 2D aesthetic model into 3D space, enabling aesthetic prediction for novel viewpoints from only sparse input views. Building on this field, we propose a two-stage search pipeline that combines coarse viewpoint sampling with gradient-based refinement, efficiently identifying aesthetically appealing viewpoints without dense captures or RL exploration. Extensive experiments show that our method consistently suggests viewpoints with superior framing and composition compared to existing approaches, establishing a new direction toward 3D-aware aesthetic modeling.

Overview

We distill aesthetic features into a feedforward Gaussian Splatting network (top). At inference, we adopt a two-stage aesthetic viewpoint search pipeline: coarse sampling to find good candidates (bottom left) and local refinement by gradient ascent (bottom right).
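
The two-stage search described above can be illustrated with a minimal sketch. It is not the released implementation: `toy_aesthetic_score` is a hypothetical stand-in for the learned 3D aesthetic field, the viewpoint is reduced to an assumed `(x, y, yaw)` parameterization, and a central-difference gradient replaces the automatic differentiation a real pipeline would likely use. All function names, bounds, and hyperparameters are illustrative assumptions.

```python
import math
import random


def toy_aesthetic_score(view):
    """Hypothetical stand-in for the learned aesthetic field.

    `view` is an assumed (x, y, yaw) parameterization. The real system would
    score a viewpoint through the feedforward Gaussian Splatting network;
    here a smooth function with a single peak at (1.0, -0.5, 0.3) is used.
    """
    x, y, yaw = view
    return math.exp(-((x - 1.0) ** 2 + (y + 0.5) ** 2 + (yaw - 0.3) ** 2))


def numerical_gradient(score_fn, view, eps=1e-4):
    """Central-difference gradient of the score w.r.t. each view parameter."""
    grad = []
    for i in range(len(view)):
        plus, minus = list(view), list(view)
        plus[i] += eps
        minus[i] -= eps
        grad.append((score_fn(plus) - score_fn(minus)) / (2 * eps))
    return grad


def coarse_sampling(score_fn, bounds, num_samples, top_k, rng):
    """Stage 1: sample viewpoints uniformly and keep the top_k by score."""
    candidates = [
        [rng.uniform(low, high) for low, high in bounds]
        for _ in range(num_samples)
    ]
    candidates.sort(key=score_fn, reverse=True)
    return candidates[:top_k]


def refine(score_fn, view, step_size=0.5, num_steps=200):
    """Stage 2: local refinement of one candidate by gradient ascent."""
    view = list(view)
    for _ in range(num_steps):
        grad = numerical_gradient(score_fn, view)
        view = [v + step_size * g for v, g in zip(view, grad)]
    return view


def suggest_viewpoint(score_fn, bounds, num_samples=256, top_k=4, seed=0):
    """Run coarse sampling, refine each candidate, return the best one."""
    rng = random.Random(seed)
    seeds = coarse_sampling(score_fn, bounds, num_samples, top_k, rng)
    refined = [refine(score_fn, s) for s in seeds]
    return max(refined, key=score_fn)


if __name__ == "__main__":
    # Assumed search bounds for (x, y, yaw); purely illustrative.
    bounds = [(-2.0, 2.0), (-2.0, 2.0), (-1.5, 1.5)]
    best = suggest_viewpoint(toy_aesthetic_score, bounds)
    print("suggested view:", best, "score:", toy_aesthetic_score(best))
```

Keeping several coarse candidates rather than one reduces the chance that refinement stalls in a poor local optimum, which matters more for a real aesthetic field with multiple peaks than for this single-peak toy function.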

Viewpoint Suggestion Results

Given sparse scene captures, our framework suggests visually balanced and well-composed novel views that align with human aesthetic preferences.

Viewpoint Search Visualization

The results reveal how aesthetic quality varies across 3D space, and the corresponding images confirm strong alignment with human perceptual preferences.

Poster

BibTeX

@inproceedings{tang2026aesthetic,
    title={Aesthetic Camera Viewpoint Suggestion with 3D Aesthetic Field},
    author={Tang, Sheyang and Sarvestani, Armin Shafiee and Xu, Jialu and Xu, Xiaoyu and Wang, Zhou},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2026},
}