A Next.js quickstart for creating and editing images and videos with Google's latest Gemini API models, including Veo 3, Imagen 4, and Gemini 2.5 Flash Image (aka nano banana).
Screenshots: Compose | Edit | Video
Note: If you want a full studio, consider Google's Flow (a professional environment for Veo/Imagen). Use this repo as a lightweight studio to learn how to build your own UI that generates content with Google's AI models via the Gemini API.
(This is not an official Google product.)
The quickstart provides a unified composer UI with different modes for content creation:
- Create Image: Generate images from text prompts using Imagen 4 or Gemini 2.5 Flash Image.
- Edit Image: Edit an image based on a text prompt using Gemini 2.5 Flash Image.
- Compose Image: Combine multiple images with a text prompt to create a new image using Gemini 2.5 Flash Image.
- Create Video: Generate videos from text prompts or an initial image using Veo 3.
- Seamless navigation between modes after generating content
- Download generated images & videos
- Cut videos directly in the browser to specific time ranges
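Trimming to a time range mostly comes down to keeping the selected start and end inside the video's duration. The helper below is a minimal sketch of that check, not code taken from this repo: `clampTrimRange` and its parameters are hypothetical names, and all times are assumed to be in seconds.

```typescript
// Hypothetical helper, not part of this repo: normalizes a trim selection.
interface TrimRange {
  start: number; // seconds
  end: number; // seconds
}

function clampTrimRange(start: number, end: number, duration: number): TrimRange {
  if (!isFinite(duration) || duration <= 0) {
    throw new Error("duration must be a positive number of seconds");
  }
  // Put the two handles in order, then keep both inside [0, duration].
  const lo = Math.min(start, end);
  const hi = Math.max(start, end);
  return {
    start: Math.min(Math.max(lo, 0), duration),
    end: Math.min(Math.max(hi, 0), duration),
  };
}
```

In the browser, a range like this could drive a `<video>` element by setting `currentTime` to `start` and pausing once playback reaches `end`. Producing an actual trimmed file needs extra work that this sketch does not cover.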
Follow these steps to get the application running locally for development and testing.
1. Prerequisites:
- Node.js and npm (or yarn/pnpm)
- GEMINI_API_KEY: The application requires a Gemini API key. Either create a .env file in the project root and add your API key (GEMINI_API_KEY="YOUR_API_KEY"), or set the environment variable in your system.
Warning: Veo 3, Imagen 4, and Gemini 2.5 Flash Image are part of the Gemini API paid tier. You need to be on the paid tier to use these models.
2. Install Dependencies:
npm install
3. Run Development Server:
npm run dev
Open your browser and navigate to http://localhost:3000 to see the application.
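A missing or placeholder `GEMINI_API_KEY` is an easy mistake to make during setup. As a hedged sketch, server-side code could fail fast with a readable message instead of passing an empty key along. The `requireApiKey` helper is hypothetical and not part of this repo. It takes the environment as a parameter so it can be exercised without touching the real process environment.

```typescript
// Hypothetical helper: fail fast when the key from step 1 is missing.
type Env = Record<string, string | undefined>;

function requireApiKey(env: Env): string {
  const key = (env["GEMINI_API_KEY"] ?? "").trim();
  // Treat the placeholder from the setup instructions as "not configured".
  if (key === "" || key === "YOUR_API_KEY") {
    throw new Error(
      "GEMINI_API_KEY is not set. Add it to .env in the project root or export it in your shell."
    );
  }
  return key;
}
```

In a route handler this would be called as `requireApiKey(process.env)`.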
The project is a standard Next.js application with the following key directories:
- app/: Contains the main application logic and pages
  - page.tsx: Main page with the unified composer UI.
  - api/: API routes for different operations
    - imagen/generate/: Image generation with Imagen 4
    - gemini/generate/: Image generation with Gemini 2.5 Flash Image
    - gemini/edit/: Image editing/composition with Gemini 2.5 Flash Image
    - veo/generate/: Video generation operations
    - veo/operation/: Check video generation status
    - veo/download/: Download generated videos
- components/: Reusable React components
  - ui/Composer.tsx: The main unified composer for all interactions.
  - ui/VideoPlayer.tsx: Video player with trimming
  - ui/ModelSelector.tsx: Model selection component
  - ui/dropzone.tsx: Drag-and-drop component for file uploads.
- lib/: Utility functions and schema definitions
- public/: Static assets
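Because Next.js maps each `route.ts` under `app/api/` to a URL path, the API directories listed above correspond to endpoints such as `/api/imagen/generate`. The sketch below shows one way a client could pick the endpoint for a composer mode. The `Mode` and `ImageModel` types and the `endpointFor` function are assumptions made for illustration; the repo's Composer may organize this differently.

```typescript
// Illustrative names only; the actual Composer.tsx may differ.
type Mode = "create-image" | "edit-image" | "compose-image" | "create-video";
type ImageModel = "imagen" | "gemini";

function endpointFor(mode: Mode, imageModel: ImageModel = "gemini"): string {
  switch (mode) {
    case "create-image":
      // Text-to-image can use either Imagen 4 or Gemini 2.5 Flash Image.
      return imageModel === "imagen" ? "/api/imagen/generate" : "/api/gemini/generate";
    case "edit-image":
    case "compose-image":
      // Editing and composition share the Gemini edit route.
      return "/api/gemini/edit";
    case "create-video":
      return "/api/veo/generate";
  }
}
```

Keeping the mapping in one function means a new mode only needs one extra `case`, and the exhaustive `switch` lets the compiler flag a mode that has no endpoint.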
- Gemini API docs: https://ai.google.dev/gemini-api/docs
- Veo 3 Guide: https://ai.google.dev/gemini-api/docs/video?example=dialogue
- Imagen 4 Guide: https://ai.google.dev/gemini-api/docs/imagen
The application uses the following API routes to interact with the Google models:
- app/api/imagen/generate/route.ts: Handles image generation requests with Imagen 4
- app/api/gemini/generate/route.ts: Handles image generation requests with Gemini 2.5 Flash Image
- app/api/gemini/edit/route.ts: Handles image editing and composition with Gemini 2.5 Flash Image (supports multiple images)
- app/api/veo/generate/route.ts: Handles video generation requests with Veo 3
- app/api/veo/operation/route.ts: Checks the status of video generation operations
- app/api/veo/download/route.ts: Downloads generated videos
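The three Veo routes suggest a start, check, download flow: a generation request is started, its status is checked until it finishes, and the result is then downloaded. A minimal polling loop for the middle step might look like the sketch below. The `OperationStatus` shape, the `pollUntilDone` name, and the interval and attempt limits are assumptions for illustration. The response format of `app/api/veo/operation/route.ts` is not described here, so the status check is passed in as a callback.

```typescript
// Assumed status shape; adapt it to whatever the operation route returns.
interface OperationStatus {
  done: boolean;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntilDone<T extends OperationStatus>(
  check: () => Promise<T>,
  intervalMs = 5000,
  maxAttempts = 120
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await check();
    if (status.done) return status;
    // Wait before asking again so the server is not flooded with requests.
    await sleep(intervalMs);
  }
  throw new Error("Video generation did not finish within " + maxAttempts + " checks");
}
```

In the app, `check` would wrap a request to the operation route. Once the returned promise resolves, the download route can be called for the finished video.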
- Next.js - React framework for building the user interface
- React - JavaScript library for building user interfaces
- Tailwind CSS - For styling
- Gemini API with:
  - Veo 3 - For video generation
  - Imagen 4 - For high-quality image generation
  - Gemini 2.5 Flash Image - For fast image generation, editing, and composition
- Want a feature? Please open an issue describing the use case and proposed behavior.
This project is licensed under the Apache License 2.0.