Render2photo Converter Tool

This project converts 3D renders of office interiors into photorealistic images using a LoRA adapter for SDXL and a Svelte frontend.

Frontend view: [image: Demo]

Demo: [image: Demo]

Example transformations:

[Image table: Input Photo | Output Photo]

Images by PerOla Hammar and Gia Tu Tran, posted on Unsplash.

Getting Started

  1. Create and activate a virtual environment, then install the requirements:

    # Create and activate virtual environment
    python -m venv venv
       
    # For Mac/Linux:
    source venv/bin/activate
    # For Windows:
    venv\Scripts\activate
    
    # Install requirements
    pip install -r requirements.txt
    
  2. You can run the tool in three ways:

    Option A: Batch Processing Script

    • Place images in the task-images folder (images in subfolders will not be processed)
    • Run the script:
      python scripts/enhance-image-lora_sdxl_render2photo_enhanced.py
      
    • Results will be saved to the processed-images folder
    • A settings.json file will be generated for reproducibility
    • To use different input and output directories, change INPUT_DIR and OUTPUT_DIR in the script

    Option B: API

    • Start the API server:
      python api.py
      
    • Open Postman and import the collection enhance_image.postman_collection.json
    • Run the 'Process image' request (adjust the URL if necessary)

    Option C: Web Interface

    • Start the API server:
      python api.py
      
    • Change to the web interface directory, install the dependencies, and start the dev server:
      cd website/render2image-processor/ 
      npm i
      npm run dev -- --host 0.0.0.0
      
    • Open the URL that the dev server prints on startup in your browser
  3. Optional: After you have finished working, clear the cache to free up disk space:

    rm -rf ~/.cache/huggingface/hub/models--stabilityai--sdxl-vae
    rm -rf ~/.cache/huggingface/hub/models--stabilityai--stable-diffusion-xl-base-1.0
    

    (This removes downloaded Stable Diffusion models that may occupy significant space)
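
Before deleting anything, it can help to see how much space those two folders actually use. The following is a minimal sketch, assuming the default Hugging Face hub cache location `~/.cache/huggingface/hub` (it may differ if `HF_HOME` or `HF_HUB_CACHE` is set). The names `dir_size_bytes` and `cache_report` are hypothetical helpers and are not part of this project.

```python
from pathlib import Path

# Default hub cache location; an assumption that holds only if the
# Hugging Face cache directory has not been overridden.
HUB_CACHE = Path.home() / ".cache" / "huggingface" / "hub"

# The two cache folders named in the cleanup step above.
MODEL_DIRS = [
    "models--stabilityai--sdxl-vae",
    "models--stabilityai--stable-diffusion-xl-base-1.0",
]


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of regular files below path; 0 if the folder is missing.

    Symlinks are skipped so that files linked from snapshot folders
    are not counted twice.
    """
    if not path.is_dir():
        return 0
    return sum(
        f.stat().st_size
        for f in path.rglob("*")
        if f.is_file() and not f.is_symlink()
    )


def cache_report(cache_root: Path = HUB_CACHE) -> dict[str, int]:
    """Map each model folder name to its size in bytes."""
    return {name: dir_size_bytes(cache_root / name) for name in MODEL_DIRS}
```

Calling `cache_report()` returns a dictionary of byte counts per folder; a folder that was never downloaded simply reports 0.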

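For Option A above, the note that images in subfolders are not processed suggests a non-recursive directory listing. The sketch below illustrates that behavior and the writing of a settings.json file. It is not the project's actual code: the accepted file extensions, the function names, and saving settings.json into the output folder are all assumptions made for illustration.

```python
import json
from pathlib import Path

# Assumed set of accepted extensions; the real script may differ.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def list_task_images(input_dir: Path) -> list[Path]:
    """Return image files directly inside input_dir, ignoring subfolders."""
    return sorted(
        p
        for p in input_dir.iterdir()  # iterdir() does not descend into subfolders
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def write_settings(output_dir: Path, settings: dict) -> Path:
    """Store the run settings as settings.json (location is an assumption)."""
    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / "settings.json"
    path.write_text(json.dumps(settings, indent=2))
    return path
```

For example, `list_task_images(Path("task-images"))` would return only the top-level images, and `write_settings(Path("processed-images"), {"SEED": 42})` would record the seed used for the run.
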
Training on your own data

To train your own LoRA adapter:

  1. Collect image pairs (3D renders and matching photos) and upload them to Hugging Face as a dataset with the following structure:

    • photos
      • 0.png
      • 1.png
      • ...
    • renders
      • 0.png
      • 1.png
      • ...
    • metadata.json
      [
        {
          "id": 0,
          "photo_path": "path",
          "render_path": "path",
          ...
        },
        ...
      ]
      
  2. Add a Hugging Face token that has access to this dataset to the .env file as the HUGGING_FACE_TOKEN variable

  3. Run the script scripts/final-fine-tune-lora-with-descaled-aspect-ratio.py

  4. This produces a LoRA adapter that you can use with the inference script
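
Before uploading the dataset from step 1, it may be worth checking that every metadata entry points at files that exist. The sketch below assumes that metadata.json parses as a JSON array of entries and that photo_path and render_path are relative to the dataset root. Neither assumption is confirmed by the training script, and `find_missing_pairs` is a hypothetical helper.

```python
import json
from pathlib import Path


def find_missing_pairs(dataset_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means every pair is complete."""
    # Assumption: metadata.json is a JSON array of objects.
    entries = json.loads((dataset_dir / "metadata.json").read_text())
    problems = []
    for entry in entries:
        for key in ("photo_path", "render_path"):
            rel = entry.get(key)
            if rel is None:
                problems.append(f"entry {entry.get('id')}: missing key {key}")
            elif not (dataset_dir / rel).is_file():
                # Assumption: paths are relative to the dataset root.
                problems.append(f"entry {entry.get('id')}: file not found: {rel}")
    return problems
```

An empty result from `find_missing_pairs(Path("my-dataset"))` means each entry has both a photo and a render on disk.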

Parameters

The following parameters can be customized:

| Parameter | Default | Description |
| --- | --- | --- |
| BASE_MODEL_PATH | stabilityai/stable-diffusion-xl-base-1.0 | Base model for image generation |
| VAE_PATH | stabilityai/sdxl-vae | Improved VAE for better color reproduction |
| LORA_DIR | lora_sdxl_render2photo_enhanced/lora-weights-epoch-15 | Directory containing trained LoRA weights |
| INPUT_DIR | task-images | Directory for input images |
| OUTPUT_DIR | processed-images | Directory for processed images |
| PROMPT | high quality photograph, photorealistic, masterpiece, high quality, detailed, realistic, photorealistic, consistent shapes, consistent lighting, consistent shadows, preserve as many details from the original image as possible, 8k, 4k, sharp focus | General prompt for image enhancement |
| FACE_PROMPT | high quality photograph, photorealistic, masterpiece, perfect face details, realistic face features, high quality, detailed face, ultra realistic human face, perfect eyes, perfect skin texture, perfect facial proportions, clean render | Specialized prompt for face enhancement |
| NEGATIVE_PROMPT | low quality, bad anatomy, bad hands, text, error, blurry, out of focus, low resolution, cropped, worst quality, jpeg artifacts, signature, watermark, distorted | Characteristics to avoid in generation |
| FACE_NEGATIVE_PROMPT | low quality, bad anatomy, distorted face, deformed face, disfigured face, unrealistic face, bad eyes, crossed eyes, misaligned eyes, bad nose, bad mouth, bad teeth, bad skin | Face-specific characteristics to avoid |
| STRENGTH | 0.4 | General processing strength |
| FACE_STRENGTH | 0.35 | Face processing strength |
| GUIDANCE_SCALE | 6.0 | Strength of prompt adherence |
| FACE_GUIDANCE_SCALE | 8.0 | Face-specific prompt adherence |
| RESIZE_LIMIT | 2048 | Maximum dimension for image resizing |
| SEED | 42 | Random seed for reproducible results |
| UNET_RANK | 32 | UNet rank from training script |
| TEXT_ENCODER_RANK | 8 | Text encoder rank from training script |
| LORA_SCALE | 0.8 | LoRA influence for main image processing |
| FACE_LORA_SCALE | 0.3 | LoRA influence for face regions |
| NUM_STEPS | 400 | Number of diffusion steps for main image |
| FACE_NUM_STEPS | 200 | Number of steps for face regions |
| USE_CUSTOM_NOISE | True | Enable custom noise initialization |
| MIXED_PRECISION | fp16 | Precision setting for inference |
| GRADIENT_CHECKPOINTING | True | Memory optimization technique |
| POST_PROCESS | False | Enable post-processing |
| CONTRAST_FACTOR | 1.2 | Contrast enhancement factor |
| SHARPNESS_FACTOR | 1.7 | Sharpness enhancement factor |
| SATURATION_FACTOR | 1.1 | Saturation enhancement factor |
| FACE_DETECTION_CONFIDENCE | 0.7 | Confidence threshold for detection |
| FACE_PADDING_PERCENT | 30 | Percentage to expand face crop area |
| ENABLE_FACE_ENHANCEMENT | False | Enable specialized face processing |
| DEBUG_MODE | True | Visualize face detection |
| USE_DNN_FACE_DETECTOR | True | Use robust DNN face detector |
| FACE_DETECTOR_MODEL_PATH | models/opencv_face_detector_uint8.pb | Path to detector model |
| FACE_DETECTOR_CONFIG_PATH | models/opencv_face_detector.pbtxt | Path to detector config |
| MAX_IMG_SIZE | 2048 | Maximum image size for resizing |
| USE_EMA | True | Enable Exponential Moving Average |
| EMA_DECAY | 0.9995 | EMA decay rate from training script |
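
The strength and step-count parameters interact. The sketch below assumes the inference script uses a diffusers SDXL image-to-image pipeline, which this README does not confirm. In such a pipeline only a fraction of the requested steps is actually run, roughly `int(num_steps * strength)`. `effective_steps` is a hypothetical helper written to illustrate that arithmetic.

```python
def effective_steps(num_steps: int, strength: float) -> int:
    """Denoising steps actually run by an img2img pipeline at a given strength.

    Mirrors the usual diffusers img2img rule: the step count is scaled by
    strength and capped at the requested number of steps.
    """
    return min(int(num_steps * strength), num_steps)


# Defaults from the parameter table above.
main_steps = effective_steps(400, 0.4)    # NUM_STEPS, STRENGTH
face_steps = effective_steps(200, 0.35)   # FACE_NUM_STEPS, FACE_STRENGTH
print(main_steps, face_steps)
```

Under that assumption, the default settings run 160 denoising steps for the main image and 70 for face regions, so lowering NUM_STEPS is likely the most direct way to shorten processing time.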
