Usage Notes for All Features

This document summarizes how to use the Composio and AI features that are currently active in the AMA-LTX project.

Purpose of this document:

give a quick overview of which features are actually live
explain the usage flow from the user's side
record key dependencies such as Supabase, Composio, Gemini, and AI Human
serve as a reference when debugging unexpected results

1. Architecture Overview

Main stack:

Next.js for the frontend and API routes
Supabase for auth, database, and storage
Composio for connecting users to Gemini and social platforms
Gemini for text, image, and video generation
AI Human for talking characters, face images, voice, and lipsync
a local Python backend for certain stitch/render jobs

General flow:

the user connects an account or uploads a reference
the frontend sends a request to an internal API
the API route calls Composio / Gemini / AI Human / the local backend
the result is saved to Supabase storage and the database
the UI reads the result from the permanent database/storage

2. Composio Features

2.1. What Composio Is Used For in This Project

Composio is used for:

per-user Gemini connections
social platform connections
YouTube automation / upload flow
some other external app integrations

Reference files:

2.2. Gemini via Composio

For user-facing features, Gemini is routed through Composio rather than called directly with GEMINI_API_KEY.

Main routes:

POST /api/ai/gemini/text
POST /api/ai/gemini/image
POST /api/ai/gemini/video
GET /api/ai/gemini/video/status

Tools used on the Composio/Gemini side:

GEMINI_GENERATE_CONTENT
GEMINI_GENERATE_IMAGE
GEMINI_GENERATE_VIDEOS
GEMINI_WAIT_FOR_VIDEO

Important notes:

the user's Gemini account must be connected
images/videos that first come back as temp URLs must be persisted to app storage right away
the default image model now points to gemini-2.5-flash-image
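
As a rough illustration of how a client might target the routes above, here is a minimal sketch. Only the route paths come from these notes; the `buildGeminiRequest` helper and the `{ prompt }` request body are hypothetical and may not match what the real API routes expect.

```typescript
// Hypothetical client helper for the internal Gemini routes.
type GeminiKind = "text" | "image" | "video";

interface GeminiRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the URL and fetch options for one generation request.
// Assumption: the routes accept a JSON body shaped like { prompt: string }.
function buildGeminiRequest(kind: GeminiKind, prompt: string): GeminiRequest {
  const trimmed = prompt.trim();
  if (trimmed.length === 0) {
    throw new Error("prompt must not be empty");
  }
  return {
    url: `/api/ai/gemini/${kind}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt: trimmed }),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)` from the frontend.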

2.3. Social Integration via Composio

The social integration service handles:

fetching platform connections
initiating a connection
syncing connections to the database
posting/uploading to the available platforms

Platforms that appear to be mapped already:

YouTube
Facebook
Instagram
TikTok
Twitter/X
LinkedIn

Reference:

src/services/socialIntegrationService.ts

2.4. YouTube Automation

A YouTube setup document already exists:

docs/COMPOSIO_YOUTUBE_SETUP.md

Use cases that appear to be active:

video upload
livestream setup
channel sync
dashboard / upload / manual upload / livestream pages

Related sidebar paths:

/youtube
/youtube/upload-automation-ai
/youtube/manual-upload
/youtube/livestream

3. Active AI Features

3.1. Gemini Image

Used for:

text to image
storyboard preview images
quick output images from a prompt

Route:

POST /api/ai/gemini/image

Notes:

temp URL results must be saved permanently to app storage
the app flow now saves the image binary, not just the temp URL
permanent images are proxied through the app domain at /storage/v1/object/public/...

Reference:

3.2. Gemini Video / Veo

Used for:

generating video from a prompt
polling the status of the video result

Routes:

POST /api/ai/gemini/video
GET /api/ai/gemini/video/status

Notes:

video jobs need polling
the final result's URL must be checked for validity
in the storyboard flow, the final clip can come from Gemini or AI Human depending on the mode
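
The polling and URL-check notes above could look roughly like the sketch below. The `VideoStatus` shape, the backoff numbers, and every function name here are assumptions for illustration; the real status route may return a different payload. The status check and the sleep function are injected so the loop stays independent of any particular HTTP client.

```typescript
// Hypothetical status shape; the real /api/ai/gemini/video/status payload may differ.
type VideoStatus =
  | { state: "pending" }
  | { state: "done"; url: string }
  | { state: "failed"; reason: string };

// Delay before the next poll: grows by 1.5x per attempt, capped at maxMs.
function nextPollDelayMs(attempt: number, baseMs = 2000, maxMs = 15000): number {
  return Math.min(maxMs, baseMs * Math.pow(1.5, attempt));
}

// Accept only absolute http(s) URLs as a usable final result.
function isValidVideoUrl(value: string): boolean {
  return /^https?:\/\/\S+$/.test(value);
}

// Polls until the job is done, fails, or runs out of attempts.
async function waitForVideo(
  checkStatus: () => Promise<VideoStatus>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 30,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "failed") {
      throw new Error(status.reason);
    }
    if (status.state === "done") {
      if (!isValidVideoUrl(status.url)) {
        throw new Error("provider returned an invalid video URL");
      }
      return status.url;
    }
    await sleep(nextPollDelayMs(attempt));
  }
  throw new Error("video job did not finish in time");
}
```

In a browser or Node environment, `sleep` can be a small wrapper around `setTimeout`, and `checkStatus` a call to the status route.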

3.3. AI Human

AI Human is used for:

character/avatar generation
face-based identity flow
speech / voice
lipsync
clip generation based on talking characters

Reference:

Key capabilities right now:

faceImage is used to preserve the character's identity
speechText is used for dialogue
voice is used to override the voice
there is a merge/lipsync fallback when the provider's polling result is invalid

3.4. AI Story / Magic Storyboard

Main page:

/magic-storyboard

Focus of this feature:

build an MV storyboard from a short prompt, lyrics, audio, a reference image, and an AI Human character
scene-by-scene output
edit prompt, timing, dialogue, and voice, then generate a preview/clip

Main reference files:

4. Magic Storyboard: How to Use

4.1. Composer / Input

The landing page now uses:

a full-width history/gallery in the main area
a floating composer in the footer

Minimum input:

a logline or short prompt

Optional input:

lyrics
audio
reference image
AI Human character
dialogue

Prompt options currently available:

Scene Planning
Jumlah Shot (shot count)
Style Target
Karakter Utama (main character)
Lokasi & Waktu (location & time)
Intensity
Camera Dynamism
Reference Strength
Face Lock Strength
Profile Shots
Output Format
Shot ID Prefix

4.2. Scene Planning Mode

Available modes:

Lyrics Driven
Fixed Shot Count

Behavior:

Lyrics Driven: the number of scenes follows the parsed lyrics
Fixed Shot Count: the number of shots follows a target of 8/10/12/16/20
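
A small sketch of how the two modes might resolve a scene count. The mode identifiers, the function names, and the "one scene per non-empty lyric line" parsing rule are assumptions; the actual lyric parser in the app may group lines differently.

```typescript
type PlanningMode = "lyrics_driven" | "fixed_shot_count";

// Targets listed in the notes for Fixed Shot Count.
const FIXED_SHOT_TARGETS = [8, 10, 12, 16, 20] as const;

// Assumption: each non-empty lyric line becomes one scene.
function parseLyricLines(lyrics: string): string[] {
  return lyrics
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Returns how many scenes/shots the storyboard should contain.
function resolveSceneCount(mode: PlanningMode, lyrics: string, target: number): number {
  if (mode === "lyrics_driven") {
    return parseLyricLines(lyrics).length;
  }
  if ((FIXED_SHOT_TARGETS as readonly number[]).indexOf(target) === -1) {
    throw new Error(`unsupported shot target: ${target}`);
  }
  return target;
}
```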

4.3. Uniform 8s Mode

For lyrics-only projects without audio:

the timeline is automatically normalized to 8 seconds per scene
the bottom strip uses uniform slot sizes

The goals:

a tidier timeline
a more stable editor
a good fit for quick storyboard drafting
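
The normalization described above amounts to giving every scene a back-to-back 8-second slot. A minimal sketch, where the `SceneSlot` shape and function name are hypothetical:

```typescript
interface SceneSlot {
  index: number;
  startSec: number;
  endSec: number;
}

const UNIFORM_SCENE_SECONDS = 8;

// Lays out sceneCount slots of equal length, one after another.
function buildUniformTimeline(sceneCount: number): SceneSlot[] {
  const slots: SceneSlot[] = [];
  for (let i = 0; i < sceneCount; i++) {
    slots.push({
      index: i,
      startSec: i * UNIFORM_SCENE_SECONDS,
      endSec: (i + 1) * UNIFORM_SCENE_SECONDS,
    });
  }
  return slots;
}
```

For three scenes this yields slots covering 0-8, 8-16, and 16-24 seconds.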

4.4. Prompt Engine

The user can keep the input simple; the engine expands the prompt.

The prompt compiler currently builds each scene from these blocks:

FACE LOCK (GLOBAL)
GLOBAL LOCK
STYLE PACK
SHOT ROLE
scene body
dialogue presence
NEGATIVE

This means:

the user does not need to write long technical prompts
the engine assembles a storyboard prompt that is more cinematic and more consistent
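
To make the block order concrete, here is a sketch of a compiler that joins the blocks in the sequence listed above. Only the block names and their order come from these notes; the `PromptBlocks` shape, the `LABEL: text` line format, and the `DIALOGUE` label are assumptions, not the app's actual compiler output.

```typescript
interface PromptBlocks {
  faceLock?: string; // only present when a face reference exists
  globalLock: string;
  stylePack: string;
  shotRole: string;
  sceneBody: string;
  dialoguePresence?: string; // only present when the scene has speech
  negative: string;
}

// Joins the blocks in the documented order, skipping optional blocks that are empty.
function compileScenePrompt(blocks: PromptBlocks): string {
  const parts: string[] = [];
  if (blocks.faceLock) {
    parts.push(`FACE LOCK (GLOBAL): ${blocks.faceLock}`);
  }
  parts.push(`GLOBAL LOCK: ${blocks.globalLock}`);
  parts.push(`STYLE PACK: ${blocks.stylePack}`);
  parts.push(`SHOT ROLE: ${blocks.shotRole}`);
  parts.push(blocks.sceneBody);
  if (blocks.dialoguePresence) {
    parts.push(`DIALOGUE: ${blocks.dialoguePresence}`);
  }
  parts.push(`NEGATIVE: ${blocks.negative}`);
  return parts.join("\n");
}
```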

5. Face Lock and Face Consistency

Main ways to keep the face consistent:

upload a reference image
or pick an AI Human character

When a reference is present, the scene prompt now starts with:

Using the provided face reference, keep exact same face identity and facial proportions...

In addition, there is an explicit block:

FACE LOCK (GLOBAL)

What it preserves:

eye shape
nose bridge
lip shape
facial proportions
age
ethnicity
skin tone
hairline
hairstyle

Default angle rule:

prefer a 3/4 or frontal view
profile only when it is really needed

Controls that affect the result:

Reference Strength
Face Lock Strength
Profile Shots

Suggested values:

Reference Strength: 85-95
Face Lock Strength: 90-100
Profile Shots: off when face likeness is the main goal
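
One way to surface the suggested values is a small check that flags settings outside those ranges. This is only a sketch: the `FaceSettings` shape and the function name are hypothetical, and it assumes both strengths are plain numbers on the same scale as the suggested ranges.

```typescript
interface FaceSettings {
  referenceStrength: number;
  faceLockStrength: number;
  profileShots: boolean;
}

// Returns human-readable warnings for settings outside the suggested ranges.
function faceSettingWarnings(settings: FaceSettings): string[] {
  const warnings: string[] = [];
  if (settings.referenceStrength < 85 || settings.referenceStrength > 95) {
    warnings.push("Reference Strength is outside the suggested 85-95 range");
  }
  if (settings.faceLockStrength < 90 || settings.faceLockStrength > 100) {
    warnings.push("Face Lock Strength is outside the suggested 90-100 range");
  }
  if (settings.profileShots) {
    warnings.push("Profile Shots is on; turn it off if face likeness is the main goal");
  }
  return warnings;
}
```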

6. Dialogue, Voice, Emotion, Speaking Intensity

Magic Storyboard now supports:

project-wide dialogue
per-scene dialogue
per-scene voice override
per-scene voice emotion
per-scene speaking intensity

In the scene editor:

the user can edit Dialogue
pick a Voice
pick a Voice Emotion
set Speaking Intensity

When generating clips:

the scene uses scene.dialog_text first
if it is empty, it falls back to project.dialog_text
the voice uses scene.voice_override first
then falls back to project.character_voice

This matters for the AI Human flow.
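
The fallback order above can be sketched as follows. The field names `dialog_text`, `voice_override`, and `character_voice` come from these notes; the row interfaces, the helper names, and the choice to treat whitespace-only strings as empty are assumptions.

```typescript
interface ProjectRow {
  dialog_text: string | null;
  character_voice: string | null;
}

interface SceneRow {
  dialog_text: string | null;
  voice_override: string | null;
}

// Returns the first value that is a non-blank string, or null if none is.
function firstNonEmpty(...values: Array<string | null | undefined>): string | null {
  for (const value of values) {
    if (typeof value === "string" && value.trim().length > 0) {
      return value;
    }
  }
  return null;
}

// Scene-level values win; project-level values are the fallback.
function resolveSpeech(scene: SceneRow, project: ProjectRow): { dialogue: string | null; voice: string | null } {
  return {
    dialogue: firstNonEmpty(scene.dialog_text, project.dialog_text),
    voice: firstNonEmpty(scene.voice_override, project.character_voice),
  };
}
```

If `dialogue` resolves to null, the scene has nothing to say, which is one likely cause of a silent AI Human clip.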

7. Scene Editor / Timeline

Active editor features:

transcript rail / lyric rail
active scene preview
compiled prompt panel
copy prompt
regenerate prompt per scene
regenerate all prompts
save scene
approve scene
drag / resize the timeline in non-uniform mode

Audit badges currently shown:

dialogue
voice
ref xx
face xx
profile ok

The goal:

the user can quickly see whether a scene uses voice, a reference, or a strong face lock
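
A sketch of how the badge list might be derived from scene state. The badge labels come from the list above; the `SceneAudit` shape, the function name, and the reading of "profile ok" as "profile shots are allowed" are assumptions.

```typescript
interface SceneAudit {
  hasDialogue: boolean;
  hasVoice: boolean;
  referenceStrength: number | null; // null when no reference is attached
  faceLockStrength: number | null; // null when face lock is not applied
  profileShotsAllowed: boolean;
}

// Builds the badge labels in the order they are listed in the notes.
function auditBadges(audit: SceneAudit): string[] {
  const badges: string[] = [];
  if (audit.hasDialogue) badges.push("dialogue");
  if (audit.hasVoice) badges.push("voice");
  if (audit.referenceStrength !== null) badges.push(`ref ${audit.referenceStrength}`);
  if (audit.faceLockStrength !== null) badges.push(`face ${audit.faceLockStrength}`);
  if (audit.profileShotsAllowed) badges.push("profile ok");
  return badges;
}
```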

8. Preview and Generate Clips

8.1. Generate Preview

A scene preview can be generated from the editor.

Important notes:

the preview now prioritizes permanently stored generated assets
Gemini temp signed URLs must no longer take priority when a permanent fallback exists
older scenes sometimes need one Regenerate Preview
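
The priority rule above can be sketched as a small selector. The `PreviewCandidates` shape and the `needsRegenerate` flag are hypothetical; the flag simply marks scenes that still rely on a temp URL (or have no preview at all) and would likely benefit from Regenerate Preview.

```typescript
interface PreviewCandidates {
  permanentUrl: string | null; // asset already persisted to app storage
  tempSignedUrl: string | null; // provider URL that may expire
}

interface PreviewChoice {
  url: string | null;
  needsRegenerate: boolean;
}

// Prefers the permanent asset; falls back to the temp URL only when nothing else exists.
function pickPreviewUrl(candidates: PreviewCandidates): PreviewChoice {
  if (candidates.permanentUrl) {
    return { url: candidates.permanentUrl, needsRegenerate: false };
  }
  return { url: candidates.tempSignedUrl, needsRegenerate: true };
}
```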

8.2. Generate Clips

Clip generation on the AI Human path uses:

start/reference frame
scene prompt
dialogue text
voice config

If the scene has speech:

framing is steered toward a mouth-ready beat
emotion and speaking intensity are carried along

9. History / Gallery

The magic-storyboard history is now:

a full-width gallery
more visual
closer to a feed/gallery style
paired with a composer that sits in a floating footer

History features:

open project
delete project
refresh

Final results can be synced to:

mv_projects.final_video_url
generated_videos for the general video gallery

10. Coin / Billing

Creating a storyboard currently costs coins.

Flow:

check that the user is logged in
check the balance
create the project
build the timeline
deduct coins
redirect to the editor

If the deduction fails:

the new project is cleaned up so it does not end up free
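
The flow and its cleanup step can be sketched as below. Everything here is illustrative: the `BillingDeps` interface, the result shape, and the redirect path are hypothetical, and the dependencies are kept synchronous for brevity, whereas real Supabase calls would be asynchronous.

```typescript
// Hypothetical dependencies; in the app these would wrap Supabase calls.
interface BillingDeps {
  getUserId(): string | null;
  getBalance(userId: string): number;
  createProject(userId: string): string;
  buildTimeline(projectId: string): void;
  deductCoins(userId: string, amount: number): boolean;
  deleteProject(projectId: string): void;
}

type CreateResult =
  | { ok: true; redirectTo: string }
  | { ok: false; reason: "not_logged_in" | "insufficient_balance" | "deduct_failed" };

// Follows the documented order and removes the project if the deduction fails.
function createStoryboard(deps: BillingDeps, cost: number): CreateResult {
  const userId = deps.getUserId();
  if (userId === null) {
    return { ok: false, reason: "not_logged_in" };
  }
  if (deps.getBalance(userId) < cost) {
    return { ok: false, reason: "insufficient_balance" };
  }
  const projectId = deps.createProject(userId);
  deps.buildTimeline(projectId);
  if (!deps.deductCoins(userId, cost)) {
    deps.deleteProject(projectId); // cleanup so the project is not free
    return { ok: false, reason: "deduct_failed" };
  }
  // Assumption: the editor is addressed by project id under /magic-storyboard.
  return { ok: true, redirectTo: `/magic-storyboard?project=${projectId}` };
}
```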

11. Permanent Storage

Permanent storage in use:

image: generated-images
audio: generated-audio
video/final outputs: depends on the render/export flow

Key principles:

do not store only the provider's temp URL
the binary file must be copied to app storage
the database must store the permanent app/storage URL
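
As an illustration of the "store the permanent URL" principle, the sketch below maps an asset kind to its bucket and builds the public path. The bucket names and the /storage/v1/object/public/ prefix appear in these notes; the helper names and the object path layout are assumptions.

```typescript
type AssetKind = "image" | "audio";

// Bucket names as listed in the notes.
const BUCKETS: Record<AssetKind, string> = {
  image: "generated-images",
  audio: "generated-audio",
};

const PUBLIC_PREFIX = "/storage/v1/object/public/";

// Builds the app-domain path for an object that has been copied into storage.
function permanentPublicPath(kind: AssetKind, objectPath: string): string {
  const clean = objectPath.replace(/^\/+/, "");
  return `${PUBLIC_PREFIX}${BUCKETS[kind]}/${clean}`;
}

// Rough check for whether a stored URL already points at permanent storage.
function isPermanentUrl(url: string): boolean {
  return url.indexOf(PUBLIC_PREFIX) !== -1;
}
```

A database write could then be guarded with `isPermanentUrl` so that a provider temp URL is never saved by accident.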

12. Other AI Features Visible in the Repo

Besides Magic Storyboard, this repo also has:

a general video generator
text to image
image to video
a motion control placeholder
lyric video / music tooling
YouTube translation tooling
character lab / character manager

Notes:

a feature being present in the repo does not mean it is production-ready
some flows still depend on env setup, providers, or the local backend

13. Key Dependencies and Environment

Relevant env variables:

COMPOSIO_API_KEY
LOCAL_BACKEND_URL
GEMINI_API_KEY for certain internal flows that are not user-facing
Supabase env (NEXT_PUBLIC_SUPABASE_URL, service/admin key, etc.)
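
A startup check for these variables might look like the sketch below. Which variables are strictly required is an assumption here (the notes only call them relevant), and the Supabase service/admin key is left out because its exact name is not given.

```typescript
// Assumption: these three are treated as required; adjust to the real setup.
const REQUIRED_ENV: string[] = [
  "COMPOSIO_API_KEY",
  "LOCAL_BACKEND_URL",
  "NEXT_PUBLIC_SUPABASE_URL",
];

// Returns the names of required variables that are unset or blank.
function missingEnv(
  env: Record<string, string | undefined>,
  required: string[] = REQUIRED_ENV,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

On the server this could be called as `missingEnv(process.env)` and logged when the result is non-empty.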

Related documents:

14. Quick Troubleshooting

Gemini image/video fails

check the Gemini connection via Composio
check the default model
check whether the provider output is only a temp URL

Preview does not appear

check whether the scene still stores an old signed temp URL
click Regenerate Preview
check whether the preview has been persisted to permanent storage

Character's face changes

raise Reference Strength
raise Face Lock Strength
turn off Profile Shots
make sure the reference image is clear and not taken at an extreme angle

Scene prompts all feel alike

use a new project created after the latest prompt engine update
check Scene Planning
check Jumlah Shot
look at Compiled Prompt to see whether the SHOT ROLE and STYLE PACK blocks actually differ

AI Human does not speak

check dialog_text
check voice
check the scene override
check that the generate clips route uses scene-level values

History exists but the final result is missing from the general gallery

check whether mv_projects.final_video_url is filled in
open the project's history/detail page so the sync to generated_videos runs

15. Recommended Workflow

For the most stable results:

pick a style
write a short but clear logline
upload one clean face reference
enable an AI Human character if the scene will have speech
set:
- Reference Strength high
- Face Lock Strength high
- Profile Shots off
create the project
audit the Compiled Prompt
edit dialogue / voice / emotion per scene
regenerate the prompt if needed
generate preview
generate clips
export the final

16. Closing Notes

To raise output quality, the main focus is not making users write ever longer prompts, but:

user input stays simple
the prompt engine gets smarter
reference handling gets stronger
face lock and continuity get tighter
storage is always permanent

That is the most realistic path to more stable storyboard/video results in this app.