Bria VGL Prompt Writing
Generate structured JSON prompts for Bria's FIBO models using Visual Generation Language (VGL).
Related Skill: Use bria-ai to execute these VGL prompts via the Bria API. VGL defines the structured prompt format; bria-ai handles generation, editing, and background removal.
Core Concept
VGL replaces ambiguous natural language prompts with deterministic JSON that explicitly declares every visual attribute: objects, lighting, camera settings, composition, and style. This ensures reproducible, controllable image generation.
Operation Modes
| Mode | Input | Output | Use Case |
|---|---|---|---|
| Generate | Text prompt | VGL JSON | Create new image from description |
| Edit | Image + instruction | VGL JSON | Modify reference image |
| Edit_with_Mask | Masked image + instruction | VGL JSON | Fill grey masked regions |
| Caption | Image only | VGL JSON | Describe existing image |
| Refine | Existing JSON + edit | Updated VGL JSON | Modify existing prompt |
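The table can be read as a small decision rule. The sketch below is one way to encode it; `select_mode` and its parameters are hypothetical names used for illustration and are not part of VGL or the bria-ai client.

```python
def select_mode(*, text=None, image=False, masked=False, existing_json=None):
    """Pick a VGL operation mode from the inputs that are available.

    Hypothetical helper: the mode names mirror the Operation Modes table,
    everything else is an assumption made for this sketch.
    """
    if existing_json is not None:
        # Refine: existing VGL JSON plus an edit request
        if not text:
            raise ValueError("Refine needs an edit instruction")
        return "Refine"
    if masked:
        # Edit_with_Mask: masked image plus an instruction
        if not text:
            raise ValueError("Edit_with_Mask needs an instruction")
        return "Edit_with_Mask"
    if image:
        # Image with an instruction is an edit; image alone is a caption
        return "Edit" if text else "Caption"
    if text:
        return "Generate"
    raise ValueError("no usable input")


print(select_mode(text="A cozy coffee shop interior"))
print(select_mode(text="Change the lighting to golden hour", image=True))
```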
JSON Schema
Output a single valid JSON object with these required keys:
1. short_description (String)
Concise summary of image content, max 200 words. Include key subjects, actions, setting, and mood.
2. objects (Array, max 5 items)
Each object requires:
{
  "description": "Detailed description, max 100 words",
  "location": "center | top-left | bottom-right foreground | etc.",
  "relative_size": "small | medium | large within frame",
  "shape_and_color": "Basic shape and dominant color",
  "texture": "smooth | rough | metallic | furry | fabric | etc.",
  "appearance_details": "Notable visual details",
  "relationship": "Relationship to other objects",
  "orientation": "upright | tilted 45 degrees | facing left | horizontal | etc."
}
Human subjects add:
{
  "pose": "Body position description",
  "expression": "winking | joyful | serious | surprised | calm",
  "clothing": "Attire description",
  "action": "What the person is doing",
  "gender": "Gender description",
  "skin_tone_and_texture": "Skin appearance"
}
Object clusters add:
{
  "number_of_objects": 3
}
Size guidance: If a person is the main subject, use "medium-to-large" or "large within frame".
3. background_setting (String)
Overall environment, setting, and background elements not in objects.
4. lighting (Object)
{
  "conditions": "bright daylight | dim indoor | studio lighting | golden hour | blue hour | overcast",
  "direction": "front-lit | backlit | side-lit from left | top-down",
  "shadows": "long, soft shadows | sharp, defined shadows | minimal shadows"
}
5. aesthetics (Object)
{
  "composition": "rule of thirds | symmetrical | centered | leading lines | medium shot | close-up",
  "color_scheme": "monochromatic blue | warm complementary | high contrast | pastel",
  "mood_atmosphere": "serene | energetic | mysterious | joyful | dramatic | peaceful"
}
For people as main subject, specify shot type in composition: "medium shot", "close-up", "portrait composition".
6. photographic_characteristics (Object)
{
  "depth_of_field": "shallow | deep | bokeh background",
  "focus": "sharp focus on subject | soft focus | motion blur",
  "camera_angle": "eye-level | low angle | high angle | dutch angle | bird's-eye",
  "lens_focal_length": "wide-angle | 50mm standard | 85mm portrait | telephoto | macro"
}
For people: Prefer "standard lens (35mm-50mm)" or "portrait lens (50mm-85mm)". Avoid wide-angle unless specified.
7. style_medium (String)
"photograph" | "oil painting" | "watercolor" | "3D render" | "digital illustration" | "pencil sketch"
Default to "photograph" unless explicitly requested otherwise.
8. artistic_style (String)
If the medium is not a photograph, describe the style in at most 3 words, for example "impressionistic, vibrant, textured".
For photographs, use "realistic" or similar.
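A minimal sketch of the two style rules above (default to "photograph", at most three style words otherwise). `style_fields` is a hypothetical helper written for this illustration, not a VGL or Bria function.

```python
def style_fields(requested_medium=None, traits=()):
    """Build style_medium and artistic_style following the defaults above."""
    # Default to "photograph" unless another medium was explicitly requested
    medium = requested_medium or "photograph"
    if medium == "photograph":
        return {"style_medium": medium, "artistic_style": "realistic"}
    if not traits:
        raise ValueError("non-photograph media need 1 to 3 style words")
    if len(traits) > 3:
        raise ValueError("artistic_style allows at most 3 words")
    return {"style_medium": medium, "artistic_style": ", ".join(traits)}


print(style_fields())
print(style_fields("oil painting", ("impressionistic", "vibrant", "textured")))
```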
9. context (String)
Describe the image type/purpose:
- "High-fashion editorial photograph for magazine spread"
- "Concept art for fantasy video game"
- "Commercial product photography for e-commerce"
10. text_render (Array)
Default: empty array []
Only populate if user explicitly provides exact text content:
{
  "text": "Exact text from user (never placeholder)",
  "location": "center | top-left | bottom",
  "size": "small | medium | large",
  "color": "white | red | blue",
  "font": "serif typeface | sans-serif | handwritten | bold impact",
  "appearance_details": "Metallic finish | 3D effect | etc."
}
Exception: Universal text integral to objects (e.g., "STOP" on stop sign).
11. edit_instruction (String)
Single imperative command describing the edit/generation.
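The structural rules in this section (eleven required keys, at most five objects, word limits, nested keys) can be checked before a prompt is sent. The following is a rough sketch under that reading of the schema; `validate_vgl` is a hypothetical helper, and the Schema Reference remains the authority on allowed values.

```python
REQUIRED_KEYS = (
    "short_description", "objects", "background_setting", "lighting",
    "aesthetics", "photographic_characteristics", "style_medium",
    "artistic_style", "context", "text_render", "edit_instruction",
)
OBJECT_KEYS = (
    "description", "location", "relative_size", "shape_and_color",
    "texture", "appearance_details", "relationship", "orientation",
)
NESTED_KEYS = {
    "lighting": ("conditions", "direction", "shadows"),
    "aesthetics": ("composition", "color_scheme", "mood_atmosphere"),
    "photographic_characteristics": (
        "depth_of_field", "focus", "camera_angle", "lens_focal_length",
    ),
}


def word_count(text):
    return len(str(text).split())


def validate_vgl(prompt):
    """Return a list of problems; an empty list means these checks passed."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in prompt]

    objects = prompt.get("objects", [])
    if not isinstance(objects, list):
        problems.append("objects must be an array")
        objects = []
    if len(objects) > 5:
        problems.append("objects allows at most 5 items")
    for i, obj in enumerate(objects):
        if not isinstance(obj, dict):
            problems.append(f"objects[{i}] must be an object")
            continue
        for key in OBJECT_KEYS:
            if key not in obj:
                problems.append(f"objects[{i}] missing key: {key}")
        if word_count(obj.get("description", "")) > 100:
            problems.append(f"objects[{i}].description exceeds 100 words")

    for section, keys in NESTED_KEYS.items():
        if section not in prompt:
            continue  # already reported as a missing key
        value = prompt[section]
        if not isinstance(value, dict):
            problems.append(f"{section} must be an object")
            continue
        problems.extend(f"{section} missing key: {k}" for k in keys if k not in value)

    if word_count(prompt.get("short_description", "")) > 200:
        problems.append("short_description exceeds 200 words")
    if not isinstance(prompt.get("text_render", []), list):
        problems.append("text_render must be an array")
    return problems
```

The function only checks structure. It does not judge whether the values are sensible, and it treats the human-specific and cluster fields as optional.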
Edit Instruction Formats
For Standard Edits (no mask)
Start with an action verb, describe the change, and never reference the "original image":
| Category | Rewritten Instruction |
|---|---|
| Style change | Turn the image into the cartoon style. |
| Object attribute | Change the dog's color to black and white. |
| Add element | Add a wide-brimmed felt hat to the subject. |
| Remove object | Remove the book from the subject's hands. |
| Replace object | Change the rose to a bright yellow sunflower. |
| Lighting | Change the lighting from dark and moody to bright and vibrant. |
| Composition | Change the perspective to a wider shot. |
| Text change | Change the text "Happy Anniversary" to "Hello". |
| Quality | Refine the image to obtain increased clarity and sharpness. |
For Masked Region Edits
Reference "masked regions" or "masked area" as the target:
| Intent | Rewritten Instruction |
|---|---|
| Object generation | Generate a white rose with a blue center in the masked region. |
| Extension | Extend the image into the masked region to create a scene featuring... |
| Background fill | Create the following background in the masked region: A vast ocean extending to horizon. |
| Atmospheric fill | Fill the background masked area with a clear, bright blue sky with wispy clouds. |
| Subject restoration | Restore the area in the mask with a young woman. |
| Environment infill | Create inside the masked area: a greenhouse with rows of plants under glass ceiling. |
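The two instruction formats differ in ways that are easy to check mechanically. The sketch below is a hypothetical lint helper (`check_edit_instruction` is not part of VGL or the Bria API): it flags standard edits that mention the "original image" and masked edits that never name the mask. It does not try to verify that the instruction starts with an action verb.

```python
import re


def check_edit_instruction(instruction, masked):
    """Return a list of format problems for an edit_instruction string."""
    text = instruction.strip()
    if not text:
        return ["instruction is empty"]

    problems = []
    lowered = text.lower()
    # Matches "mask" or "masked" as whole words, e.g. "masked region", "the mask"
    mentions_mask = re.search(r"\bmask(ed)?\b", lowered) is not None

    if masked and not mentions_mask:
        problems.append("masked edits should name the masked region or area")
    if not masked and "original image" in lowered:
        problems.append('standard edits should not reference the "original image"')
    return problems


print(check_edit_instruction("Change the rose to a bright yellow sunflower.", masked=False))
print(check_edit_instruction("Add a white rose.", masked=True))
```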
Fidelity Rules
Standard Edit Mode
Preserve ALL visual properties unless explicitly changed by instruction:
- Subject identity, pose, appearance
- Object existence, location, size, orientation
- Composition, camera angle, lens characteristics
- Style/medium
Only change what the edit strictly requires.
Masked Edit Mode
- Preserve all visible (non-masked) portions exactly
- Fill grey masked regions to blend seamlessly with unmasked areas
- Match existing style, lighting, and subject matter
- Never describe grey masks—describe content that fills them
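One way to audit the fidelity rules is to diff the VGL JSON before and after an edit and confirm that only the intended fields moved. The helper below is a sketch for that purpose; `changed_paths` is a hypothetical name, and it compares plain Python dicts and lists as produced by `json.loads`.

```python
import copy


def changed_paths(before, after, prefix=""):
    """Return the dotted paths whose values differ between two JSON-like values."""
    if isinstance(before, dict) and isinstance(after, dict):
        paths = []
        for key in sorted(set(before) | set(after)):
            path = f"{prefix}.{key}" if prefix else key
            if key not in before or key not in after:
                paths.append(path)  # key was added or removed
            else:
                paths.extend(changed_paths(before[key], after[key], path))
        return paths
    if isinstance(before, list) and isinstance(after, list) and len(before) == len(after):
        paths = []
        for i, (b, a) in enumerate(zip(before, after)):
            paths.extend(changed_paths(b, a, f"{prefix}[{i}]"))
        return paths
    # Scalars, mismatched types, or lists of different length
    return [] if before == after else [prefix or "<root>"]


before = {
    "lighting": {"conditions": "dim indoor", "direction": "front-lit"},
    "style_medium": "photograph",
    "objects": [{"location": "center"}],
}
after = copy.deepcopy(before)
after["lighting"]["conditions"] = "golden hour"
print(changed_paths(before, after))  # ['lighting.conditions']
```

For a lighting edit, a result that also lists paths under `objects` or `photographic_characteristics` would suggest the edit touched more than the instruction required.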
Example Output
{
  "short_description": "A professional businesswoman in a navy blazer stands confidently in a modern glass office, holding a tablet. Natural daylight streams through floor-to-ceiling windows, creating a warm, productive atmosphere.",
  "objects": [
    {
      "description": "A confident businesswoman in her 30s with shoulder-length dark hair, wearing a tailored navy blazer over a white blouse. She holds a tablet in her left hand while gesturing naturally with her right.",
      "location": "center-right",
      "relative_size": "large within frame",
      "shape_and_color": "Human figure, navy and white clothing",
      "texture": "smooth fabric, professional attire",
      "appearance_details": "Minimal jewelry, well-groomed professional appearance",
      "relationship": "Main subject, interacting with tablet",
      "orientation": "facing slightly left, three-quarter view",
      "pose": "Standing upright, relaxed professional stance",
      "expression": "confident, approachable smile",
      "clothing": "Tailored navy blazer, white silk blouse, dark trousers",
      "action": "Presenting or reviewing information on tablet",
      "gender": "female",
      "skin_tone_and_texture": "Medium warm skin tone, healthy smooth complexion"
    },
    {
      "description": "A modern tablet device with a bright display showing charts and graphs",
      "location": "center, held by subject",
      "relative_size": "small",
      "shape_and_color": "Rectangular, silver frame with illuminated screen",
      "texture": "smooth glass and metal",
      "appearance_details": "Thin profile, business application visible on screen",
      "relationship": "Held by businesswoman, focus of her attention",
      "orientation": "vertical, screen facing viewer at slight angle",
      "pose": null,
      "expression": null,
      "clothing": null,
      "action": null,
      "gender": null,
      "skin_tone_and_texture": null,
      "number_of_objects": null
    }
  ],
  "background_setting": "Modern corporate office interior with floor-to-ceiling windows overlooking a city skyline. Minimalist furniture in neutral tones, potted plants adding touches of green.",
  "lighting": {
    "conditions": "bright natural daylight",
    "direction": "side-lit from left through windows",
    "shadows": "soft, natural shadows"
  },
  "aesthetics": {
    "composition": "rule of thirds, medium shot",
    "color_scheme": "professional blues and neutral whites with warm accents",
    "mood_atmosphere": "confident, professional, welcoming"
  },
  "photographic_characteristics": {
    "depth_of_field": "shallow, background slightly soft",
    "focus": "sharp focus on subject's face and upper body",
    "camera_angle": "eye-level",
    "lens_focal_length": "portrait lens (85mm)"
  },
  "style_medium": "photograph",
  "artistic_style": "realistic",
  "context": "Corporate portrait photography for company website or LinkedIn professional profile.",
  "text_render": [],
  "edit_instruction": "Generate a professional businesswoman in a modern office environment holding a tablet."
}
Common Pitfalls
- Don't invent text - Keep text_render empty unless the user provides exact text
- Don't over-describe - Max 5 objects, prioritize the most important
- Match the mode - Use the correct edit_instruction format for masked vs standard edits
- Preserve fidelity - Only change what's explicitly requested
- Be specific - Use concrete values ("85mm portrait lens") not vague terms ("nice camera")
- Null for irrelevant - Human-specific fields should be null for non-human objects
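The "null for irrelevant" rule can be applied mechanically when objects are assembled in code. A minimal sketch, assuming a hypothetical helper `null_irrelevant_fields`; Python's `None` serializes to JSON `null`.

```python
import json

HUMAN_FIELDS = (
    "pose", "expression", "clothing", "action", "gender", "skin_tone_and_texture",
)


def null_irrelevant_fields(obj, is_human):
    """Return a copy of obj with fields that do not apply set to None."""
    result = dict(obj)
    if not is_human:
        for field in HUMAN_FIELDS:
            result[field] = None
        # Assumption: a single non-human object also carries
        # number_of_objects = null, as the tablet in the example output does.
        result.setdefault("number_of_objects", None)
    return result


tablet = null_irrelevant_fields({"description": "A modern tablet device"}, is_human=False)
print(json.dumps(tablet, indent=2))
```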
Using VGL with Bria API
Generate Image with Structured Prompt
Pass VGL JSON to the structured_prompt parameter:
import json

from bria_client import BriaClient

client = BriaClient()

vgl_prompt = {
    "short_description": "Professional businesswoman in modern office...",
    "objects": [...],
    # ... full VGL JSON
}

# Use structured_prompt for deterministic generation.
# Fill in the full VGL JSON first: json.dumps raises TypeError on the [...] placeholder.
result = client.refine(
    structured_prompt=json.dumps(vgl_prompt),
    instruction="Generate this scene",
    aspect_ratio="16:9",
)
print(result['result']['image_url'])
Refine Existing Generation
After generation, Bria returns a structured_prompt you can modify and regenerate:
# Initial generation
result = client.generate("A cozy coffee shop interior")
structured = result['result']['structured_prompt']

# Modify and regenerate
result = client.refine(
    structured_prompt=structured,
    instruction="Change the lighting to golden hour"
)
curl Example
curl -X POST "https://engine.prod.bria-api.com/v2/image/generate" \
  -H "api_token: $BRIA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "structured_prompt": "{\"short_description\": \"...\", ...}",
    "prompt": "Generate this scene",
    "aspect_ratio": "16:9"
  }'
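In the curl body, structured_prompt is a JSON string nested inside the JSON request, which is why its quotes are escaped. Hand-escaping is error-prone, so the sketch below builds the same body with two `json.dumps` calls. `build_generate_payload` is a hypothetical helper, the field names are taken from the curl example, and nothing is sent over the network.

```python
import json


def build_generate_payload(vgl_prompt, prompt, aspect_ratio):
    """Serialize the request body, encoding the VGL prompt as a nested JSON string."""
    body = {
        # Inner dumps: the VGL dict becomes a JSON string value
        "structured_prompt": json.dumps(vgl_prompt),
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
    }
    # Outer dumps: escapes the inner quotes for the request body
    return json.dumps(body)


payload = build_generate_payload(
    {"short_description": "A cozy coffee shop interior"},
    "Generate this scene",
    "16:9",
)
print(payload)
```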
References
- Schema Reference - Complete JSON schema with all parameter values
- bria-ai - API client and endpoint documentation for executing VGL prompts