Enhancing Texture Generation with High Fidelity Using Advanced Texture Priors

Abstract

Recent advancements in 2D generation technology have sparked widespread discussion on using 2D priors for 3D shape and texture content generation. However, these methods often overlook subsequent user operations, such as structure simplification, which introduce texture aliasing and blurring once the user acquires the 3D model. Traditional graphics methods partially alleviate this issue, while recent texture synthesis techniques fail to ensure consistency with the original model's appearance and cannot achieve high-fidelity restoration. Moreover, background noise frequently arises in high-resolution texture synthesis, limiting the practical application of these generation technologies. In this work, we propose a high-resolution, high-fidelity texture restoration technique that uses the rough texture as the initial input to enhance the consistency between the synthesized texture and the initial texture, thereby overcoming the aliasing and blurring caused by the user's structure-simplification operations. Additionally, we introduce a background noise smoothing technique based on a self-supervised scheme to address the noise problem in current high-resolution texture synthesis schemes. Together, these components enable high-resolution texture synthesis and pave the way for high-definition, high-detail texture synthesis. Experiments demonstrate that our scheme outperforms currently known schemes in high-fidelity texture recovery under high-resolution conditions.

In the first stage, geometric restoration, we generate multi-view-consistent RGB images and corresponding normal maps from a single input image, incorporating depth supervision from the Depth Anything model. This stage primarily provides users with textured meshes for structure simplification and is not our focus. In the second stage, we first render the rough texture that results from the user's operations and use it as the initial input. We then generate an RGB image with the Depth2Img model and project it onto the texture map via gradient optimization. During this projection, we produce both a low-resolution and a high-resolution texture for self-supervision, which effectively eliminates noise and "point gaps" in the high-resolution texture map.
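To make the second stage concrete, the sketch below shows one way the projection and the low-/high-resolution self-supervision could be wired together in PyTorch. It assumes a differentiable renderer `render` (e.g., built on nvdiffrast or PyTorch3D) and uses Stable Diffusion 2's publicly available depth-to-image pipeline as a stand-in for the Depth2Img model; the 4x resolution ratio, loss weights, and helper names are illustrative, not the authors' implementation.

```python
# Minimal sketch of stage 2: project a Depth2Img result onto the texture map
# by gradient descent, with a low-/high-resolution self-supervision term.
# Assumptions (not from the paper): `render(texture)` is a differentiable
# renderer returning a (3, H, W) image in [0, 1] at the same resolution as
# the diffusion output, for a fixed camera.
import torch
import torch.nn.functional as F
from diffusers import StableDiffusionDepth2ImgPipeline
from torchvision.transforms.functional import to_pil_image, to_tensor

def refine_texture(render, rough_texture, prompt,
                   steps=500, lr=1e-2, w_self=0.1, device="cuda"):
    pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-depth", torch_dtype=torch.float16
    ).to(device)

    # Render the simplified mesh with its rough texture and use the result
    # as the init image, keeping the synthesized RGB close to the original
    # appearance.
    with torch.no_grad():
        init_image = render(rough_texture.to(device)).clamp(0, 1)
    target = pipe(prompt=prompt, image=to_pil_image(init_image.cpu()),
                  strength=0.4).images[0]  # low strength favors fidelity
    target = to_tensor(target).to(device)

    # High-resolution texture, initialized by upsampling the rough one (4x).
    tex_hi = torch.nn.Parameter(
        F.interpolate(rough_texture[None].to(device), scale_factor=4,
                      mode="bilinear", align_corners=False)[0]
    )
    opt = torch.optim.Adam([tex_hi], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Projection term: the rendering of the texture should reproduce the
        # Depth2Img image (gradients flow through the renderer into tex_hi).
        loss = F.l1_loss(render(tex_hi), target)
        # Self-supervision term: a low-resolution copy of the texture,
        # upsampled back, should agree with the high-resolution one; this
        # smooths background noise and fills "point gaps" left by sparse
        # UV coverage.
        tex_lo = F.avg_pool2d(tex_hi[None], kernel_size=4)
        loss = loss + w_self * F.l1_loss(
            F.interpolate(tex_lo, scale_factor=4, mode="bilinear",
                          align_corners=False)[0],
            tex_hi,
        )
        loss.backward()
        opt.step()
    return tex_hi.detach()
```

Initializing the diffusion pipeline from the rough rendering with a low `strength` anchors the output to the original appearance, while the low-/high-resolution consistency term plays the role of the self-supervised smoothing described above.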

Figure: Pipeline overview.