This project transforms 3D renders of office interiors into photorealistic images using a LoRA adapter for Stable Diffusion XL.
Frontend view:
Example transformations:
| Input Photo | Output Photo |
|---|---|
| ![]() | ![]() |
| ![]() | ![]() |
| ![]() | ![]() |
Images by PerOla Hammar and Gia Tu Tran, posted on Unsplash.
Create and activate a virtual environment, then install the requirements:

```bash
# Create and activate a virtual environment
python -m venv venv

# For Mac/Linux:
source venv/bin/activate
# For Windows:
venv\Scripts\activate

# Install the requirements
pip install -r requirements.txt
```
You can work with the script using one of three methods:

Option A: Batch Processing Script
- Place the input images directly in the `task-images` folder (images in subfolders will not be processed).
- Run `python scripts/enhance-image-lora_sdxl_render2photo_enhanced.py`.
- The results are written to the `processed-images` folder.
- A `settings.json` file will be generated for reproducibility.

Option B: API
- Run `python api.py`.
- Send requests using the `enhance_image.postman_collection.json` Postman collection (link); a hedged client sketch follows the options below.

Option C: Web-Interface
- Run `python api.py`.
- Start the frontend:

  ```bash
  cd website/render2image-processor/
  npm i
  npm run dev -- --host 0.0.0.0
  ```
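For quick testing of the API without Postman, here is a minimal client sketch. The host, port, route, and upload field name are assumptions, not taken from `api.py`; check `enhance_image.postman_collection.json` for the real values.

```python
# Hypothetical client sketch: the URL, "/enhance_image" route, and "file" field
# are assumptions; adjust them to match the Postman collection and api.py.
import requests

with open("task-images/render.png", "rb") as f:
    response = requests.post("http://localhost:8000/enhance_image", files={"file": f})

response.raise_for_status()
with open("processed-images/render.png", "wb") as out:
    out.write(response.content)  # assumes the API returns the enhanced image bytes
```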
Optional: after you have finished working, clear the Hugging Face cache to free up disk space (this removes the downloaded Stable Diffusion models, which can occupy significant space):

```bash
rm -rf ~/.cache/huggingface/hub/models--stabilityai--sdxl-vae
rm -rf ~/.cache/huggingface/hub/models--stabilityai--stable-diffusion-xl-base-1.0
```
To train your own LoRA adapter, do the following:

1. Collect image pairs (3D renders and the corresponding photos) and upload them to Hugging Face as a dataset with the following structure:

   ```
   {
     {
       id: 0,
       photo_path: "path",
       render_path: "path",
       ...
     },
     ...
   }
   ```
2. Add a Hugging Face token that has access to this dataset to the `.env` file as the `HUGGING_FACE_TOKEN` variable (a minimal upload/auth sketch follows these steps).
3. Run the script `scripts/final-fine-tune-lora-with-descaled-aspect-ratio.py`. This produces a LoRA adapter that you can use with the inference script.
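A minimal sketch of steps 1 and 2, assuming the dataset is built with the `datasets` library and the token is read from `.env` via `python-dotenv`; the repository name and file paths are placeholders, and the training script may expect a different column layout:

```python
# Minimal sketch (not the training script): build the paired dataset and push it
# to the Hugging Face Hub. "your-user/render2photo-pairs" and the paths are placeholders.
import os

from datasets import Dataset
from dotenv import load_dotenv

load_dotenv()  # reads HUGGING_FACE_TOKEN from the .env file

pairs = [
    {"id": 0, "photo_path": "photos/office_0.jpg", "render_path": "renders/office_0.png"},
    {"id": 1, "photo_path": "photos/office_1.jpg", "render_path": "renders/office_1.png"},
]

dataset = Dataset.from_list(pairs)
dataset.push_to_hub("your-user/render2photo-pairs", token=os.environ["HUGGING_FACE_TOKEN"])
```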
The following parameters can be customized:
| Parameter | Default | Description |
|---|---|---|
| `BASE_MODEL_PATH` | `stabilityai/stable-diffusion-xl-base-1.0` | Base model for image generation |
| `VAE_PATH` | `stabilityai/sdxl-vae` | Improved VAE for better color reproduction |
| `LORA_DIR` | `lora_sdxl_render2photo_enhanced/lora-weights-epoch-15` | Directory containing trained LoRA weights |
| `INPUT_DIR` | `task-images` | Directory for input images |
| `OUTPUT_DIR` | `processed-images` | Directory for processed images |
| `PROMPT` | high quality photograph, photorealistic, masterpiece, high quality, detailed, realistic, photorealistic, consistent shapes, consistent lighting, consistent shadows, preserve as many details from the original image as possible, 8k, 4k, sharp focus | General prompt for image enhancement |
| `FACE_PROMPT` | high quality photograph, photorealistic, masterpiece, perfect face details, realistic face features, high quality, detailed face, ultra realistic human face, perfect eyes, perfect skin texture, perfect facial proportions, clean render | Specialized prompt for face enhancement |
| `NEGATIVE_PROMPT` | low quality, bad anatomy, bad hands, text, error, blurry, out of focus, low resolution, cropped, worst quality, jpeg artifacts, signature, watermark, distorted | Characteristics to avoid in generation |
| `FACE_NEGATIVE_PROMPT` | low quality, bad anatomy, distorted face, deformed face, disfigured face, unrealistic face, bad eyes, crossed eyes, misaligned eyes, bad nose, bad mouth, bad teeth, bad skin | Face-specific characteristics to avoid |
| `STRENGTH` | 0.4 | General processing strength |
| `FACE_STRENGTH` | 0.35 | Face processing strength |
| `GUIDANCE_SCALE` | 6.0 | Strength of prompt adherence |
| `FACE_GUIDANCE_SCALE` | 8.0 | Face-specific prompt adherence |
| `RESIZE_LIMIT` | 2048 | Maximum dimension for image resizing |
| `SEED` | 42 | Random seed for reproducible results |
| `UNET_RANK` | 32 | UNet rank from training script |
| `TEXT_ENCODER_RANK` | 8 | Text encoder rank from training script |
| `LORA_SCALE` | 0.8 | LoRA influence for main image processing |
| `FACE_LORA_SCALE` | 0.3 | LoRA influence for face regions |
| `NUM_STEPS` | 400 | Number of diffusion steps for main image |
| `FACE_NUM_STEPS` | 200 | Number of steps for face regions |
| `USE_CUSTOM_NOISE` | `True` | Enable custom noise initialization |
| `MIXED_PRECISION` | `fp16` | Precision setting for inference |
| `GRADIENT_CHECKPOINTING` | `True` | Memory optimization technique |
| `POST_PROCESS` | `False` | Enable post-processing |
| `CONTRAST_FACTOR` | 1.2 | Contrast enhancement factor |
| `SHARPNESS_FACTOR` | 1.7 | Sharpness enhancement factor |
| `SATURATION_FACTOR` | 1.1 | Saturation enhancement factor |
| `FACE_DETECTION_CONFIDENCE` | 0.7 | Confidence threshold for detection |
| `FACE_PADDING_PERCENT` | 30 | Percentage to expand face crop area |
| `ENABLE_FACE_ENHANCEMENT` | `False` | Enable specialized face processing |
| `DEBUG_MODE` | `True` | Visualize face detection |
| `USE_DNN_FACE_DETECTOR` | `True` | Use robust DNN face detector |
| `FACE_DETECTOR_MODEL_PATH` | `models/opencv_face_detector_uint8.pb` | Path to detector model |
| `FACE_DETECTOR_CONFIG_PATH` | `models/opencv_face_detector.pbtxt` | Path to detector config |
| `MAX_IMG_SIZE` | 2048 | Maximum image size for resizing |
| `USE_EMA` | `True` | Enable Exponential Moving Average |
| `EMA_DECAY` | 0.9995 | EMA decay rate from training script |
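As a rough illustration of how the main parameters fit together, here is a minimal sketch built directly on the `diffusers` library. It is not the project's enhancement script (it omits face handling, custom noise initialization, and post-processing), and the exact call signature may vary across `diffusers` versions:

```python
# Minimal sketch, not the project's script: maps the main defaults above onto a
# plain diffusers SDXL img2img pipeline.
import torch
from PIL import Image
from diffusers import AutoencoderKL, StableDiffusionXLImg2ImgPipeline

vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae", torch_dtype=torch.float16)  # VAE_PATH
pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # BASE_MODEL_PATH
    vae=vae,
    torch_dtype=torch.float16,                   # MIXED_PRECISION = fp16
).to("cuda")
pipe.load_lora_weights("lora_sdxl_render2photo_enhanced/lora-weights-epoch-15")  # LORA_DIR

render = Image.open("task-images/render.png").convert("RGB")  # an image from INPUT_DIR
result = pipe(
    prompt="high quality photograph, photorealistic, masterpiece, ...",       # PROMPT (truncated)
    negative_prompt="low quality, bad anatomy, bad hands, text, error, ...",  # NEGATIVE_PROMPT (truncated)
    image=render,
    strength=0.4,                                        # STRENGTH
    guidance_scale=6.0,                                  # GUIDANCE_SCALE
    num_inference_steps=400,                             # NUM_STEPS
    cross_attention_kwargs={"scale": 0.8},               # LORA_SCALE
    generator=torch.Generator("cuda").manual_seed(42),   # SEED
).images[0]
result.save("processed-images/render.png")  # OUTPUT_DIR
```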