Updated 2026-05-01

LoRA Training for Virtual Influencer Personas: The 4-Hour Walkthrough

Step-by-step walkthrough for training a custom LoRA on RunPod to produce character-consistent images for a fictional Instagram persona. Includes real cost breakdown, tooling list, and honest failure modes.

Who this guide is for

This guide is for operators who want to build a fictional persona for Instagram, TikTok, or similar platforms. If you are trying to build a talking-head marketing video, you are on the wrong page: see HeyGen at £23/mo or Synthesia at £50/mo.

The fictional-persona workflow is unglamorous. It requires 4-8 hours of setup, a willingness to run open-source tooling on a cloud GPU, and honest expectations about what character-consistent generation can and cannot do.

The tools most often labelled “AI influencer generators” (Synthesia, HeyGen, Descript) will not help you build a persona like Aitana López. They make talking-head video. You need consistent stills: same face, different outfits, different poses, different lighting, 200 posts deep, none of them visibly drifting. That requires a custom LoRA.

What you will build

By the end of this guide you will have:

  • A custom LoRA file for your fictional character (50-200MB)
  • A ComfyUI workflow that generates consistent-face images on demand
  • Realistic expectations: the LoRA will drift on extreme poses and unusual lighting; you will need 2-3 retakes per scene

What you will not have: any of the brand-deal infrastructure (DMs, contracts, FTC disclosures, payments). That is a separate problem.

Tools and costs

Tool | Cost | Purpose
Midjourney | £24/mo | Generate initial character reference images
RunPod | ~£5/training run | Cloud GPU for LoRA training
ComfyUI | Free | Node-based image generation workflow
Kohya_ss | Free | LoRA training software
ChatGPT | £16/mo | Caption generation

Total month 1: ~£45 (Midjourney £24 + ChatGPT £16 + one ~£5 training run)

Ongoing: ~£40/mo (Midjourney + ChatGPT, no retraining needed)
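The totals are simple sums of the table prices. A minimal Python sketch of that arithmetic follows; the function name month_cost is ours, and the training-run figure is the approximate per-run cost from the table, not a quoted RunPod price.

```python
# Monthly prices in GBP, taken from the table above.
SUBSCRIPTIONS = {"Midjourney": 24, "ChatGPT": 16}
TRAINING_RUN_GBP = 5  # approximate cost of one RunPod training run


def month_cost(training_runs: int) -> int:
    """Subscriptions plus however many training runs happen that month."""
    return sum(SUBSCRIPTIONS.values()) + training_runs * TRAINING_RUN_GBP


first_month = month_cost(training_runs=1)    # one initial training run
ongoing_month = month_cost(training_runs=0)  # no retraining
print(f"Month 1: ~£{first_month}, ongoing: ~£{ongoing_month}/mo")
```

If a failed run forces a retrain, each extra run adds roughly another £5 to that month.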

Step 1: Design your character in Midjourney (2-3 hours)

Generate 50-100 character variations using a consistent base prompt. Example prompt structure:

portrait photo of [physical description], [style], professional lighting, 
8k, photorealistic, centered composition
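To produce 50-100 variations without retyping the prompt, you can fill the structure above from short lists. This is a sketch only: the description, styles, and angles are hypothetical placeholder values, and the output is plain text to paste into Midjourney (nothing here calls a Midjourney API).

```python
from itertools import product

# Template mirrors the prompt structure above.
TEMPLATE = (
    "portrait photo of {description}, {style}, professional lighting, "
    "8k, photorealistic, centered composition"
)

# Hypothetical example values; replace with your own character design.
DESCRIPTION = "a woman in her late twenties with short auburn hair and green eyes"
STYLES = ["editorial fashion", "candid street photography", "studio headshot"]
ANGLES = ["front view", "three-quarter view", "side profile"]


def build_prompts(description, styles, angles):
    """Return one prompt per (style, angle) pair with a fixed description."""
    return [
        TEMPLATE.format(description=f"{description}, {angle}", style=style)
        for style, angle in product(styles, angles)
    ]


prompts = build_prompts(DESCRIPTION, STYLES, ANGLES)
for prompt in prompts:
    print(prompt)
```

Keeping the physical description fixed and varying only style and angle is what gives you a comparable set to pick your 20-30 training images from.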

Select your best 20-30 images as your training set. Criteria:

  • Different angles (front, 3/4, side)
  • Different expressions (neutral, smiling, serious)
  • Different lighting conditions
  • Same recognisable face structure across all

Step 2: Caption your training images (1 hour)

Each image needs a text caption. The caption teaches the LoRA what is unique about your character vs what is generic. Format:

[trigger_word], [physical description], [clothing], [setting]

Set a unique trigger word (e.g., ohwx_woman, mycharacter) — this is how you invoke the character in future prompts.

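Tools: Kohya_ss includes a BLIP-based captioning utility, or you can caption manually. Either way, make sure every caption starts with your trigger word; auto-generated captions will not contain it unless you add it.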
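If you caption manually, a small script keeps the format consistent. The sketch below writes one .txt file per image, assuming your trainer is configured to read captions from .txt files that share the image's base name (check the caption extension setting in your Kohya_ss configuration). File names and descriptions are hypothetical.

```python
import tempfile
from pathlib import Path

TRIGGER_WORD = "ohwx_woman"  # example trigger word from this guide


def build_caption(physical, clothing, setting):
    """Join fields in the order: trigger word, physical, clothing, setting."""
    return ", ".join([TRIGGER_WORD, physical, clothing, setting])


def write_captions(image_dir, details):
    """Write <image base name>.txt into image_dir for each entry in details."""
    written = []
    for image_name, fields in details.items():
        caption_path = (image_dir / image_name).with_suffix(".txt")
        caption_path.write_text(build_caption(*fields), encoding="utf-8")
        written.append(caption_path)
    return written


# Hypothetical example data: image name -> (physical, clothing, setting).
DETAILS = {
    "img_001.png": ("woman with short auburn hair", "white linen shirt", "sunlit kitchen"),
    "img_002.png": ("woman with short auburn hair", "black turtleneck", "city street at dusk"),
}

# A temporary directory stands in for your real training image folder.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_captions(Path(tmp), DETAILS)
    first_caption = paths[0].read_text(encoding="utf-8")

print(first_caption)
```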

Step 3: Train on RunPod (4 hours, ~£5)

  1. Create a RunPod account, provision a GPU pod (RTX 4090, ~£0.69/hr)
  2. Install Kohya_ss on the pod
  3. Upload your training images and captions
  4. Configure training: 1500-2000 steps, batch size 2, learning rate 1e-4
  5. Run training — takes 2-4 hours depending on step count
  6. Download the resulting .safetensors file (your LoRA)
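The step count in item 4 is not set directly in every configuration; it usually falls out of image count, per-image repeats, epochs, and batch size. The sketch below assumes the commonly used accounting of one step per batch, so total steps are roughly images × repeats × epochs ÷ batch size. The epoch count of 10 is an illustrative assumption, and real trainers may differ slightly (for example when images are bucketed by resolution).

```python
import math


def total_steps(num_images, repeats, epochs, batch_size):
    """Training steps under the assumed accounting: one step per batch."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


def repeats_for_target(num_images, epochs, batch_size, target_steps):
    """Smallest per-image repeat count that reaches target_steps."""
    repeats = 1
    while total_steps(num_images, repeats, epochs, batch_size) < target_steps:
        repeats += 1
    return repeats


# Guide values: batch size 2, 1500-2000 steps, 20-30 training images.
# The epoch count (10) is an illustrative assumption.
repeats = repeats_for_target(num_images=25, epochs=10, batch_size=2, target_steps=1500)
steps = total_steps(num_images=25, repeats=repeats, epochs=10, batch_size=2)
print(f"repeats={repeats}, total steps={steps}")
```

Run it with your own image count before launching the pod, so you know the job lands inside the 1500-2000 step range rather than finding out after it has billed.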

Terminate the pod when done — idle GPU time is charged.
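Because billing is per hour the pod exists, not per hour it trains, it helps to see how overhead changes the bill. A minimal sketch using the approximate hourly rate quoted above; the three hours of overhead in the second example are a hypothetical figure, not a measurement.

```python
HOURLY_RATE_GBP = 0.69  # approximate RTX 4090 rate quoted in this guide


def pod_cost(training_hours, overhead_hours=0.0):
    """Billed cost: the pod is charged for every hour it runs, busy or idle."""
    return round((training_hours + overhead_hours) * HOURLY_RATE_GBP, 2)


# Training time alone, at both ends of the 2-4 hour range.
print(pod_cost(2), pod_cost(4))

# Hypothetical: 4 h of training plus 3 h of setup, uploads, and idle time.
print(pod_cost(4, overhead_hours=3))
```

Training time alone comes in under £3 at this rate; the ~£5 per-run budget is mostly headroom for setup and for pods left running after the job finishes.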

Step 4: Test in ComfyUI (1 hour)

Run ComfyUI locally or on a RunPod instance. Place the .safetensors file in ComfyUI's LoRA models folder and add a Load LoRA node to your workflow. Generate 20 test images using your trigger word plus varied prompts. Check for:

  • A recognisably consistent face across prompts
  • Realistic skin and lighting
  • Appropriate response to pose/outfit changes
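A fixed, reproducible set of test prompts makes it easier to compare one training run against the next. The sketch below samples 20 distinct pose/outfit/lighting combinations; all list values are hypothetical, and the output is plain prompt text to paste into your ComfyUI workflow.

```python
import random

TRIGGER_WORD = "ohwx_woman"  # example trigger word from this guide

# Hypothetical test axes; pick values that match your planned content.
POSES = ["facing the camera", "three-quarter view", "looking over her shoulder", "walking"]
OUTFITS = ["white linen shirt", "black evening dress", "denim jacket", "running gear", "winter coat"]
LIGHTING = ["soft window light", "golden hour", "overcast daylight", "neon signs at night"]


def make_test_prompts(count=20, seed=0):
    """Sample `count` distinct pose/outfit/lighting combinations."""
    rng = random.Random(seed)  # fixed seed keeps the test set reproducible
    combos = [(p, o, l) for p in POSES for o in OUTFITS for l in LIGHTING]
    chosen = rng.sample(combos, count)
    return [f"photo of {TRIGGER_WORD}, {p}, wearing {o}, {l}" for p, o, l in chosen]


prompts = make_test_prompts()
for prompt in prompts:
    print(prompt)
```

Reusing the same seed after a retrain gives you the same 20 prompts, so differences in the output come from the LoRA rather than from the prompts.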

If the character drifts significantly, retrain with a different step count. More steps give a stronger character imprint, but too many cause overfitting.
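Judging drift by eye works, but a rough number can help when comparing runs. One possible approach, not part of this guide's tested workflow, is to compare face embeddings with cosine similarity. The sketch assumes you already have embedding vectors from a face-embedding model of your choice; the toy three-dimensional vectors and the 0.6 threshold are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def flag_drift(reference, candidates, threshold=0.6):
    """Names of images whose similarity to the reference is below threshold."""
    return [
        name
        for name, embedding in candidates.items()
        if cosine_similarity(reference, embedding) < threshold
    ]


# Toy vectors standing in for real face embeddings (an assumption).
reference = [1.0, 0.0, 0.0]
candidates = {
    "cafe_front.png": [0.9, 0.1, 0.0],
    "side_profile.png": [0.2, 0.9, 0.1],
}

drifted = flag_drift(reference, candidates)
print(drifted)
```

Any threshold needs calibrating against images you have already judged by eye; treat the score as a tiebreaker, not a verdict.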

Step 5: Production workflow

Once your LoRA is stable, your weekly generation workflow looks like this:

  1. Write a scene description (e.g., “at a café in Tokyo, autumn, holding a coffee”)
  2. Generate 10-20 variants with your LoRA loaded in ComfyUI
  3. Select the best 2-3 images
  4. Caption with ChatGPT in your character’s voice
  5. Post with synthetic-media disclosure (required under EU AI Act)

Typical output: 30-60 consistent-face images per month at an ongoing cost of ~£40 (the Midjourney and ChatGPT subscriptions).
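The figures above imply a workload you can estimate up front. A short sketch of the arithmetic, using only the numbers in this section (10-20 variants per scene, 2-3 keepers per scene, 30-60 images per month); the function name monthly_plan is ours.

```python
import math


def monthly_plan(target_images, keepers_per_scene, variants_per_scene):
    """Scenes and total generations needed to reach a monthly image target."""
    scenes = math.ceil(target_images / keepers_per_scene)
    return {"scenes": scenes, "generations": scenes * variants_per_scene}


# Light month: 30 images, 3 keepers from 10 variants per scene.
light = monthly_plan(30, keepers_per_scene=3, variants_per_scene=10)

# Heavy month: 60 images, 2 keepers from 20 variants per scene.
heavy = monthly_plan(60, keepers_per_scene=2, variants_per_scene=20)

print(light, heavy)
```

Even the light case works out to several scenes a week, so plan your scene descriptions in batches rather than one at a time.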

Honest failure modes

  • Face drift on extreme poses: side profiles and back-of-head shots often break the LoRA’s face model
  • Lighting extremes: very dark scenes or harsh directional light cause drift
  • Clothing-to-face linkage: if all training images used similar clothing, the LoRA sometimes links face identity to that clothing
  • Overfitting: too many training steps make the character look plastic and identical in every image
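Clothing-to-face linkage is the one failure mode you can screen for before training, by checking how varied the clothing field is across your captions. The sketch assumes every caption has exactly the four comma-separated fields from Step 2 (so no commas inside a field); the 40% threshold and the sample captions are hypothetical.

```python
from collections import Counter


def clothing_counts(captions):
    """Count the clothing field of '[trigger], [physical], [clothing], [setting]'."""
    counts = Counter()
    for caption in captions:
        fields = [field.strip() for field in caption.split(",")]
        if len(fields) == 4:  # skip captions that do not match the format
            counts[fields[2].lower()] += 1
    return counts


def dominant_clothing(captions, max_share=0.4):
    """Clothing values that appear in more than max_share of captions."""
    counts = clothing_counts(captions)
    total = sum(counts.values())
    return [item for item, n in counts.items() if total and n / total > max_share]


# Hypothetical captions: three of five use the same outfit.
captions = [
    "ohwx_woman, short auburn hair, white t-shirt, kitchen",
    "ohwx_woman, short auburn hair, white t-shirt, park",
    "ohwx_woman, short auburn hair, white t-shirt, office",
    "ohwx_woman, short auburn hair, red dress, rooftop bar",
    "ohwx_woman, short auburn hair, grey hoodie, train station",
]

overused = dominant_clothing(captions)
print(overused)
```

If an outfit is flagged, swap some of those training images for ones with different clothing before you pay for a training run.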
