NanoLive2D: AI-powered Live2D Avatar Customization

GitHub: https://github.com/GBSOSS/nano-live2d

What this is

NanoLive2D is an open-source Live2D avatar customization pipeline. Describe clothing in text → AI generates a texture → the avatar wears it in 3-5 seconds (using Gemini 2.0/2.5). It also supports real-time Q&A with natural expressions.

Why: Traditional solutions take 4-5 weeks and $50K-200K. With this, setup takes less than a day, clothing generation takes 3-5 seconds, the avatar runs at 60+ FPS on phones, and the whole thing costs 80-95% less.

Good for: Quick avatar customization
Not for: Cinematic 3D or photorealistic rendering

How it works

Real-time avatars usually cost weeks of dev + ongoing GPU hosting. Live2D gives you 90% of the look at a fraction of the cost.

The pipeline:

Describe clothing in text → AI generates matching texture (Nano Banana)
Optional: provide reference images for style analysis (gemini-2.0-flash-exp)
Connect to any knowledge base for real-time Q&A

Technical stuff

1. Avatar customization

The challenge: most image models hallucinate entire new characters and trash your texture atlas. We needed to generate specific clothing while keeping the Live2D rig intact.

Two-stage approach:

Stage 1: [Optional] Reference analysis (gemini-2.0-flash-exp)

If you have reference images, the system can analyze them for style inspiration:

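# Example: Analyze reference photo (optional)
import base64
import os

import requests

# Assumptions for this example: the API key is in the GEMINI_API_KEY
# environment variable and the reference photo is saved as reference.jpg.
api_key = os.environ["GEMINI_API_KEY"]
with open("reference.jpg", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("ascii")

prompt = """Please describe the clothing in detail, including:
1. Clothing type (T-shirt, hoodie, jacket, etc.)
2. Main colors and color scheme
3. Patterns or logos (if any)
4. Special design elements (pockets, zippers, hood, etc.)
5. Overall style (casual, sporty, formal, etc.)"""

# Send the prompt and the base64-encoded photo in a single request
response = requests.post(
    f"https://generativelanguage.googleapis.com/v1beta/models/"
    f"gemini-2.0-flash-exp:generateContent?key={api_key}",
    json={
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {"mime_type": "image/jpeg", "data": base64_image}}
            ]
        }]
    },
    timeout=60,
)
response.raise_for_status()  # fail loudly on HTTP errors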
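
The call above returns JSON, and the clothing description sits in the text parts of the first candidate. Here is a minimal sketch for pulling it out, assuming the standard generateContent response shape (candidates → content → parts → text). The helper name extract_text is made up for this example and is not from the repo.

```python
def extract_text(response_json: dict) -> str:
    """Join the text parts of the first candidate in a generateContent response."""
    candidates = response_json.get("candidates") or []
    if not candidates:
        # Blocked prompts come back without candidates; surface the feedback if present.
        feedback = response_json.get("promptFeedback", {})
        raise ValueError(f"no candidates in response: {feedback}")
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts).strip()
```

Typical use would be clothing_description = extract_text(response.json()), which can then feed the Stage 2 prompt.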

Stage 2: Clothing generation (Nano Banana)

Core feature: describe the clothing you want, AI generates it.

Example: "Blue hoodie with orange geometric patterns, streetwear style"

enhanced_prompt = f"""Generate a new version of this Live2D character texture sheet.

CRITICAL: Keep the EXACT SAME layout, positions, and style.
ONLY modify the TORSO clothing area (the shirt/hoodie in the middle).

Keep ALL other parts COMPLETELY UNCHANGED:
Hair (top row): KEEP EXACTLY THE SAME
Face and expressions: KEEP EXACTLY THE SAME
Arms and hands: KEEP EXACTLY THE SAME
Legs and shoes: KEEP EXACTLY THE SAME

ONLY CHANGE: The torso/clothing to: {clothing_description}

Output: 2D cartoon anime style"""
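
To make the round trip concrete, here is a hedged sketch of how the prompt and the current texture atlas could be packed into one request body, and how the generated image could be read back. It assumes the same generateContent endpoint as Stage 1, called with the image model's ID, and that the image comes back as a base64 inline part. Both function names are hypothetical; the repo may structure this differently.

```python
import base64


def build_texture_request(enhanced_prompt: str, atlas_png: bytes) -> dict:
    """Build a generateContent body: the prompt plus the current atlas as inline PNG."""
    return {
        "contents": [{
            "parts": [
                {"text": enhanced_prompt},
                {"inline_data": {
                    "mime_type": "image/png",
                    "data": base64.b64encode(atlas_png).decode("ascii"),
                }},
            ]
        }]
    }


def extract_image(response_json: dict) -> bytes:
    """Return the decoded bytes of the first inline image part in the response."""
    for candidate in response_json.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            # Accept both camelCase and snake_case spellings of the field.
            blob = part.get("inlineData") or part.get("inline_data")
            if blob and "data" in blob:
                return base64.b64decode(blob["data"])
    raise ValueError("response contained no image part")
```

Sending the original atlas along with the prompt is what gives the model a layout to preserve; a text-only request would have nothing to anchor the "keep everything else unchanged" instructions to.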

And that's it: you get a new texture that fits your existing rig in about 3-5 seconds.
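
For the "avatar wears it" step, one possible approach is to write the generated PNG next to the original and emit a variant model3.json that points at it. This is a sketch, not the repo's actual mechanism. It relies on the Cubism model3.json layout (FileReferences → Textures) and assumes the clothing lives in texture slot 0. The aspect-ratio check is a cheap guard of my own: a mismatch is a strong hint that the model changed the atlas layout. install_texture and png_size are hypothetical names.

```python
import json
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk of a PNG byte string."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def install_texture(model_json: Path, new_png: bytes, variant: str) -> Path:
    """Save new_png beside the original texture and write a model3.json variant using it."""
    model = json.loads(model_json.read_text(encoding="utf-8"))
    textures = model["FileReferences"]["Textures"]
    original = model_json.parent / textures[0]  # assumption: slot 0 holds the clothing

    old_w, old_h = png_size(original.read_bytes())
    new_w, new_h = png_size(new_png)
    if old_w * new_h != new_w * old_h:
        raise ValueError("generated texture has a different aspect ratio than the atlas")

    new_texture = original.with_name(f"{original.stem}_{variant}.png")
    new_texture.write_bytes(new_png)
    textures[0] = new_texture.relative_to(model_json.parent).as_posix()

    out = model_json.with_name(f"{variant}.model3.json")
    out.write_text(json.dumps(model, indent=2), encoding="utf-8")
    return out
```

Writing a variant file instead of overwriting the original keeps the base outfit available, so switching back is just a matter of loading the original model3.json again.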

2. Dialogue system

┌─────────────────┐       ┌──────────────────┐       ┌─────────────────┐
│ User Browser    │◄─────►│ Gemini 2.0/2.5   │       │ GBase Knowledge │
│                 │       │ REST APIs        │       │ Base            │
│ • PIXI.js v7    │       └──────────────────┘       └─────────────────┘
│ • Live2D SDK    │                │                          │
│ • 60 FPS        │                └──────────────────────────┘
│ • <10ms latency │                           │
└─────────────────┘                           ▼
                                    ┌─────────────────┐
                                    │ Avatar Logic    │
                                    │                 │
                                    │ 1. Speech→Text  │
                                    │ 2. KB Lookup    │
                                    │ 3. Text→Speech  │
                                    │ 4. Lip Sync     │
                                    └─────────────────┘
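
The four Avatar Logic steps in the diagram can be sketched as one turn of a pipeline. Each stage is injected as a plain callable, because the actual speech, GBase, and TTS clients are not shown in this post; everything named here (AvatarPipeline, answer, the field names) is a placeholder rather than the repo's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AvatarPipeline:
    """One Q&A turn, with each stage injected so backends can be swapped."""
    speech_to_text: Callable[[bytes], str]
    kb_lookup: Callable[[str], str]
    text_to_speech: Callable[[str], bytes]
    lip_sync: Callable[[bytes], list[float]]

    def answer(self, audio_in: bytes) -> tuple[str, bytes, list[float]]:
        question = self.speech_to_text(audio_in)   # 1. Speech→Text
        reply = self.kb_lookup(question)           # 2. KB lookup
        audio_out = self.text_to_speech(reply)     # 3. Text→Speech
        mouth_curve = self.lip_sync(audio_out)     # 4. Lip sync values for the avatar
        return reply, audio_out, mouth_curve
```

Keeping the stages as callables also makes the turn easy to test with stubs before any real service is wired in.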

Stack:

PIXI.js v7.x + Live2D Cubism SDK (60 FPS on mobile)
26 motion files for animations

Loading the model and hooking up the backend:

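// Assumption: the page contains a <canvas id="canvas"> element
const app = new PIXI.Application({ view: document.getElementById('canvas'), autoStart: true });

// Load the sample model and add it to the stage
const model = await PIXI.live2d.Live2DModel.from('haru_greeter_t05.model3.json');
app.stage.addChild(model);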

// Connect to GBase knowledge base for Q&A
// Avatar answers domain-specific questions via:
// 1. Speech-to-text (user query)
// 2. GBase API retrieval (knowledge base lookup)
// 3. Text-to-speech + Live2D lip sync (response animation)
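
For the lip sync in step 3, a common simple approach is to drive the mouth from audio loudness. The sketch below is an assumption about how that could work, not the repo's implementation: it takes 16-bit mono little-endian PCM from the TTS step and produces one mouth-open value in the 0..1 range per rendered frame. The gain default is an arbitrary starting point. In the browser, each value would be applied to the model's mouth parameter (ParamMouthOpenY in standard Cubism models).

```python
import math
import struct


def mouth_open_curve(pcm: bytes, sample_rate: int = 16000, fps: int = 60,
                     gain: float = 4.0) -> list[float]:
    """Map 16-bit mono little-endian PCM to one mouth-open value (0..1) per frame."""
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    window = max(1, sample_rate // fps)  # audio samples per rendered frame
    curve = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        # Root-mean-square loudness of this frame, normalized to 0..1
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk)) / 32768.0
        curve.append(min(1.0, rms * gain))
    return curve
```

Loudness-based lip sync ignores vowel shapes, so it looks rougher than viseme-based approaches, but it needs no phoneme data and is cheap enough to run per frame.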

Why this approach

Gemini Nano Banana: Understands texture layouts, 3-5 sec generation, no GPU hosting costs

Live2D: Cross-platform, runs on integrated GPUs, reusable motion files

Don't use if: You need full 3D navigation, cinematic camera work, or physics-driven gameplay

What's on GitHub

Complete pipeline for avatar clothing customization:

Live2D runtime (PIXI.js + Cubism SDK)
Gemini 2.0/2.5 integration code
Text-to-texture generation pipeline
26 motion files + animation system
Sample character model (haru_greeter_t05)

Get started: https://github.com/GBSOSS/nano-live2d

Example use:

Describe clothing → avatar wears it
Integrate with your knowledge base
Deploy cross-platform at 60 FPS
Customize everything to your needs

See it in action: We built a demo at https://avatar.gbase.ai/ that adds automatic style inference (you input name/company URL, it figures out the clothing style).

Video: https://www.youtube.com/watch?v=BF8UNGzbTE0

Happy to answer questions!
