
Build Your Own Video Generation WebApp – TimeCapsule


We’ve all seen AI write essays, compose music, and even paint jaw-dropping portraits. But there is another frontier that is far more thrilling – AI-generated videos. Imagine stepping into a movie scene, sending an animated greeting, or witnessing a historical reenactment, all crafted by AI. Until now, most of us have been just curious spectators, giving instructions and hoping for the best output. But what if you could go beyond that and build your own video generation webapp?

That’s exactly what I did with Time Capsule. Here is how it works: you upload a photo, pick a time period, choose a profession, and just like that, you’re transported into the past with a personalized image and a short video. Simple, right? But the real magic happened when I took this idea to the DataHack Summit, the most futuristic AI conference in India.

We turned Time Capsule into a GenAI playground booth designed purely for fun and engagement. It became the favorite booth not only for attendees, but for speakers and GenAI leaders too. Watching people’s faces light up as they saw themselves as astronauts, kings, or Victorian-era scholars reminded me why building with AI is so exciting.

So I thought, why not share this idea with the wonderful audience of Analytics Vidhya. Buckle up, as I take you behind the scenes of how Time Capsule went from an idea to an interactive video generation webapp.

The Concept of a Video Generation WebApp (With the Time Capsule Example)

At its core, a video generation webapp is any application that takes user input and transforms it into a short, AI-created video. The input could be a selfie, some text, or a few simple choices. The AI then turns them into moving visuals that feel unique and personal.

Every video generation app works through three main blocks:

  • Input: What the user provides – this could be a photo, text, or selections.
  • Transformation: The AI interprets the input and creates visuals.
  • Output: The final result, delivered as a video (and sometimes an image too).

The real power lies in personalization: generic AI videos online can be fun, but videos starring you instantly become more engaging and memorable.

Concepts like Time Capsule thrive because they don’t just generate random clips; they generate your story, or in this case, your journey through time.
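
To make those three blocks concrete, here is a minimal sketch of the pipeline as plain Python. The function names are purely illustrative placeholders, not real APIs; the actual implementations are walked through later in this article:

# Conceptual skeleton of a video generation webapp (names are illustrative)
def run_video_webapp(user_photo, user_choices):
    # 1. Input: a personal photo plus a few structured selections
    prompt = build_prompt(user_choices)           # transformation begins with prompting
    image = generate_image(user_photo, prompt)    # 2. Transformation: AI creates the visuals
    video = animate_image(image, prompt)          # ...and turns them into motion
    return video                                  # 3. Output: a short, personal clip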

How Time Capsule Works

Here is the short and simple way in which Time Capsule, our video generation webapp, works:

  • Upload a photo of yourself.
  • Select ethnicity, time period, profession, and action.
  • AI generates a personalized portrait and short video.

Once done, you receive your own time-travel experience, whether as a Roman gladiator, a Renaissance artist, or even a futuristic explorer.

(Image: Video generation webapp steps)

Now that you’ve seen how the process works, it’s time to start building your own ‘Time Capsule’.

Technologies Used in TimeCapsule

Here are all the technologies used in building our very own video generation webapp – TimeCapsule.

Programming Language

  • Python: Core language for scripting the application and integrating AI services.

AI & Generative Models

  • OpenAI API: For enhancing prompts and generating text-based guidance for images and videos.
  • Google Gemini (genai): For image analysis (e.g., gender detection) and generative tasks.
  • RunwayML: AI image generation from prompts and reference images.
  • fal_client (FAL AI): Accessing the Seedance Pro model for video generation from a single image and an action prompt.

Computer Vision

  • OpenCV (cv2): Capturing images from a webcam and processing video frames.
  • Pillow (PIL): Handling images, overlays, and adding a logo to videos.
  • NumPy: Array manipulation for images and frames during video processing.

Email Integration

  • Yagmail: Sending emails with attachments (generated image and video).

Utilities & System

  • Requests: Downloading generated images and videos via HTTP requests.
  • uuid: Generating unique identifiers for files.
  • os: Directory creation, file management, and environment access.
  • dotenv: Loading API keys and credentials from .env files.
  • datetime: Timestamping generated files.
  • base64: Encoding images for API uploads.
  • enum: Defining structured options for ethnicity, time period, profession, and actions.
  • re: Sanitizing and cleaning text prompts for AI input.

How to Make Your Own Time Capsule

Now that you know all the elements that make the Time Capsule possible, here is the exact, step-by-step blueprint to make your own video generation webapp.

1. Import All Libraries

You’ll first need to import all the necessary libraries for the project.

import cv2
import os
import uuid
import base64
import requests
import yagmail
import fal_client
import numpy as np
from PIL import Image
import google.generativeai as genai
from enum import Enum
from dotenv import load_dotenv
from openai import OpenAI
from datetime import datetime
import time
import re
from runwayml import RunwayML


# Load environment variables
load_dotenv()
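
For reference, the .env file might look something like the sketch below. SENDER_EMAIL and SENDER_PASSWORD are read later by the email function; the API-key variable names are assumptions based on each SDK’s usual conventions, so check your providers’ documentation:

# .env – a minimal sketch; names other than SENDER_EMAIL/SENDER_PASSWORD are assumed
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
RUNWAYML_API_SECRET=...
FAL_KEY=...
SENDER_EMAIL=you@example.com
SENDER_PASSWORD=your-app-password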

2. Input from the User

The web app’s process begins with the user uploading a personal photo. This photo forms the foundation of the AI-generated character. Users then select ethnicity, time period, profession, and action, providing structured input that guides the AI. This ensures the generated image and video are personalized, contextually accurate, and visually engaging.

Capture the Image

The capture_image method uses OpenCV to take a photo from the user’s camera. Users can press SPACE to capture or ESC to cancel. It includes fallbacks for cases when the camera GUI isn’t available, automatically capturing an image if needed. Each photo is saved with a unique filename to avoid overwriting.

1. Initialize the Camera

Here are the steps to initialize the camera.

cap = cv2.VideoCapture(0)
  • Opens the default camera (device 0).
  • Checks if the camera is available; if not, prints an error and exits.

2. Start the Capture Loop

while True:
    ret, frame = cap.read()
  • Continuously reads frames from the camera.
  • ret is True if a frame is successfully captured.
  • frame contains the actual image data.

3. Display the Camera Feed

try:
    cv2.imshow('Camera - Press SPACE to capture, ESC to exit', frame)
    key = cv2.waitKey(1) & 0xFF
except cv2.error as e:
    # If GUI display fails, use automatic capture after a delay
    print("GUI display not available. Using automatic capture...")
    print("Capturing image in 3 seconds...")
    time.sleep(3)
    key = 32  # Simulate SPACE key press
  • Shows the live camera feed in a window.
  • Waits for user input:
    – SPACE (32) → Capture the image.
    – ESC (27) → Cancel capture.
  • Fallback: If the GUI display fails (e.g., running in a headless environment), the code waits 3 seconds and automatically captures the image.

4. Save the Image

unique_id = str(uuid.uuid4())
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"captured_{timestamp}_{unique_id}.jpg"
filepath = os.path.join('captured_images', filename)
# Save the image
cv2.imwrite(filepath, frame)

Here is the complete code to capture the image, in one piece:

def capture_image(self):
    """Capture image using OpenCV with fallback methods"""
    print("Initializing camera...")
    cap = cv2.VideoCapture(0)

    if not cap.isOpened():
        print("Error: Could not open camera")
        return None

    try:
        print("Camera ready! Press SPACE to capture image, ESC to exit")

        while True:
            ret, frame = cap.read()
            if not ret:
                print("Error: Could not read frame")
                break

            # Try to display the frame
            try:
                cv2.imshow('Camera - Press SPACE to capture, ESC to exit', frame)
                key = cv2.waitKey(1) & 0xFF
            except cv2.error as e:
                # If GUI display fails, use automatic capture after a delay
                print("GUI display not available. Using automatic capture...")
                print("Capturing image in 3 seconds...")
                time.sleep(3)
                key = 32  # Simulate SPACE key press

            if key == 32:  # SPACE key
                # Generate UUID for a unique filename
                unique_id = str(uuid.uuid4())
                timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                filename = f"captured_{timestamp}_{unique_id}.jpg"
                filepath = os.path.join('captured_images', filename)

                # Save the image
                cv2.imwrite(filepath, frame)
                print(f"Image captured and saved as: {filepath}")
                break
            elif key == 27:  # ESC key
                print("Capture cancelled")
                filepath = None
                break

    except Exception as e:
        print(f"Error during image capture: {e}")
        # Fallback: capture without GUI
        print("Attempting fallback capture...")
        try:
            ret, frame = cap.read()
            if ret:
                unique_id = str(uuid.uuid4())
                timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                filename = f"captured_{timestamp}_{unique_id}.jpg"
                filepath = os.path.join('captured_images', filename)
                cv2.imwrite(filepath, frame)
                print(f"Fallback image captured and saved as: {filepath}")
            else:
                filepath = None
        except Exception as e2:
            print(f"Fallback capture also failed: {e2}")
            filepath = None

    finally:
        cap.release()
        try:
            cv2.destroyAllWindows()
        except:
            pass  # Ignore if GUI cleanup fails

    return filepath
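
To try this step on its own, you could run something like the tiny harness below. The TimeCapsule class name and the directory creation are assumptions for illustration; the snippets in this article are methods of such a class:

# Hypothetical standalone test – assumes the methods above live on a TimeCapsule class
if __name__ == "__main__":
    os.makedirs('captured_images', exist_ok=True)  # the method writes into this folder
    app = TimeCapsule()
    saved_path = app.capture_image()
    print("Saved to:", saved_path)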

Output:

Choose Ethnicity, Time Period, Profession, and Action

The get_user_selections method lets users customize their character by choosing from the following options: ethnicity, time period, profession, and action. The options are displayed with numbers, and the user types in their choice. The selections are returned and used to create a personalized image and video.

Here are all the options available to choose from:

class EthnicityOptions(Enum):
    CAUCASIAN = "Caucasian"
    AFRICAN = "African"
    ASIAN = "Asian"
    HISPANIC = "Hispanic"
    MIDDLE_EASTERN = "Middle Eastern"
    MIXED = "Mixed Heritage"


class TimePeriodOptions(Enum):
    JURASSIC = "Jurassic Period (200-145 million years ago)"
    ANCIENT = "Ancient Times (Before 500 AD)"
    MEDIEVAL = "Medieval (500-1500 AD)"
    RENAISSANCE = "Renaissance (1400-1600)"
    COLONIAL = "Colonial Era (1600-1800)"
    VICTORIAN = "Victorian Era (1800-1900)"
    EARLY_20TH = "Early 20th Century (1900-1950)"
    MID_20TH = "Mid 20th Century (1950-1990)"
    MODERN = "Modern Era (1990-Present)"
    FUTURISTIC = "Futuristic (Near Future)"


class ProfessionOptions(Enum):
    WARRIOR = "Warrior/Soldier"
    SCHOLAR = "Scholar/Teacher"
    MERCHANT = "Merchant/Trader"
    ARTISAN = "Artisan/Craftsperson"
    FARMER = "Farmer/Agricultural Worker"
    HEALER = "Healer/Medical Professional"
    ENTERTAINER = "Entertainer/Performer"
    NOBLE = "Noble/Aristocrat"
    EXPLORER = "Explorer/Adventurer"
    SPIRITUAL = "Spiritual Leader/Clergy"


class ActionOptions(Enum):
    SELFIE = "Taking a selfie from camera view"
    DANCING = "Dancing to music"
    WORK_ACTION = "Performing work/professional action"
    WALKING = "Simple walking"
    COMBAT = "Combat/fighting action"
    CRAFTING = "Crafting/creating something"
    SPEAKING = "Speaking/giving a speech"
    CELEBRATION = "Celebrating/cheering"

Here is the code block to capture the selections (note: there are ten time periods, so the prompt range is 1-10):

def get_user_selections(self):
    """Get user selections for character customization"""
    print("\n=== Character Customization ===")

    # Ethnicity selection
    print("\nSelect Ethnicity:")
    for i, option in enumerate(EthnicityOptions, 1):
        print(f"{i}. {option.value}")
    ethnicity_choice = int(input("Enter choice (1-6): ")) - 1
    ethnicity = list(EthnicityOptions)[ethnicity_choice]

    # Time Period selection
    print("\nSelect Time Period:")
    for i, option in enumerate(TimePeriodOptions, 1):
        print(f"{i}. {option.value}")
    period_choice = int(input("Enter choice (1-10): ")) - 1
    time_period = list(TimePeriodOptions)[period_choice]

    # Profession selection
    print("\nSelect Profession:")
    for i, option in enumerate(ProfessionOptions, 1):
        print(f"{i}. {option.value}")
    profession_choice = int(input("Enter choice (1-10): ")) - 1
    profession = list(ProfessionOptions)[profession_choice]

    # Action selection
    print("\n=== Video Action Selection ===")
    for i, action in enumerate(ActionOptions, 1):
        print(f"{i}. {action.value}")

    action_choice = int(input("Select action (1-8): ")) - 1
    action_choice = list(ActionOptions)[action_choice]

    return ethnicity, time_period, profession, action_choice
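
One caveat: the method above assumes the user types a valid number, and an out-of-range or non-numeric entry will raise an exception. For a public booth you may want a small guard. Here is a minimal, hypothetical helper (not part of the original app) that retries until the input is valid:

def safe_choice(prompt_text, options):
    """Keep asking until the user enters a number within range (hypothetical helper)."""
    while True:
        try:
            idx = int(input(prompt_text)) - 1
            if 0 <= idx < len(options):
                return list(options)[idx]
        except ValueError:
            pass  # non-numeric input falls through to the retry message
        print(f"Please enter a number between 1 and {len(options)}.")

It could replace each int(input(...)) block, e.g., ethnicity = safe_choice("Enter choice (1-6): ", EthnicityOptions).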

Detect Gender from the Image

The detect_gender_from_image function uses Google Gemini 2.0 Flash to identify the gender from an uploaded image. It handles errors gracefully, returning ‘person’ if detection fails. This helps personalize the generated video, ensuring the model accurately represents the user and avoids generating a male image for a female or vice versa.

def detect_gender_from_image(self, image_path):
    """Detect gender from captured image using Google Gemini 2.0 Flash"""
    try:
        print("Analyzing image to detect gender...")
        # Upload image to Gemini
        uploaded_file = genai.upload_file(image_path)

        # Wait for the file to be processed
        while uploaded_file.state.name == "PROCESSING":
            print("Processing image...")
            time.sleep(2)
            uploaded_file = genai.get_file(uploaded_file.name)

        if uploaded_file.state.name == "FAILED":
            print("Failed to process image")
            return 'person'

        # Generate response
        response = self.gemini_model.generate_content([
            uploaded_file,
            "Look at this image and determine if the person appears to be male or female. Respond with only one word: 'male' or 'female'."
        ])

        # Clean up the uploaded file
        genai.delete_file(uploaded_file.name)

        gender = response.text.strip().lower()
        if gender in ['male', 'female']:
            return gender
        else:
            return 'person'  # fallback

    except Exception as e:
        print(f"Error detecting gender with Gemini: {e}")
        return 'person'  # fallback

3. Generate an Image from the Inputs

Now that we have the user’s input for all the parameters, we can proceed to creating an AI image from them. Here are the steps for that:

Generate a Prompt for Image Generation

After collecting the user’s selections, we use the enhance_image_prompt_with_openai function to create a detailed and engaging prompt for the image generation model. It transforms the basic inputs like gender, ethnicity, profession, time period, and action into a creative, professional, and age-appropriate prompt, ensuring the generated images are accurate, visually appealing, and personalized.

For this, we’re using the “gpt-4.1-mini” model with a temperature of 0.5 to introduce some randomness and creativity. If the OpenAI service encounters an error, the function falls back to a simple default prompt, keeping the video generation process smooth and uninterrupted.

def enhance_image_prompt_with_openai(self, ethnicity, time_period, profession, gender, action):
    """Use OpenAI to enhance the image prompt based on user selections"""
    base_prompt = f"""
    Create a simple, clean prompt for AI image generation:
    - Gender: {gender}
    - Ethnicity: {ethnicity.value}
    - Profession: {profession.value}
    - Time period: {time_period.value}
    - Performing Action: {action.value}
    - Show appropriate clothing and setting
    - Make the background a bit unique in the prompt
    - Keep it appropriate for all ages
    - Maximum 30 words
    """
    try:
        response = self.openai_client.chat.completions.create(
            model="gpt-4.1-mini",
            messages=[{"role": "user", "content": base_prompt}],
            max_tokens=80,
            temperature=0.5
        )
        enhanced_prompt = response.choices[0].message.content.strip()
        return enhanced_prompt
    except Exception as e:
        print(f"Error with OpenAI: {e}")
        # Fallback prompt
        return f"{gender} {ethnicity.value} {profession.value} from {time_period.value} performing {action.value}, professional portrait"

After generating a prompt, we need to clean and sanitize it for API compatibility. Here is the function for sanitizing the prompt.

def sanitize_prompt(self, prompt):
    """Sanitize and limit prompt for API compatibility"""
    # Remove extra whitespace and newlines
    prompt = re.sub(r'\s+', ' ', prompt.strip())

    # Remove special characters that might cause issues
    prompt = re.sub(r'[^\w\s,.-]', '', prompt)

    # Limit to 100 words maximum
    words = prompt.split()
    if len(words) > 100:
        prompt = " ".join(words[:100])

    # Ensure it is not empty
    if not prompt:
        prompt = "Professional portrait photograph"
    return prompt

Generate a Data URI for the Image

The image_to_data_uri function converts an image into a Data URI, allowing it to be sent directly in API requests or embedded in HTML. It encodes the file as Base64, detects its type (JPEG, PNG, or GIF), and creates a compact string format for seamless integration.

def image_to_data_uri(self, filepath):
    """Convert image file to data URI for API"""
    with open(filepath, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')

        mime_type = "image/jpeg"
        if filepath.lower().endswith(".png"):
            mime_type = "image/png"
        elif filepath.lower().endswith(".gif"):
            mime_type = "image/gif"

        return f"data:{mime_type};base64,{encoded_string}"

Generate the Image Using RunwayML

Once we have generated the prompt and the Data URI of the image, it’s time for the AI to do its magic. We will use RunwayML to generate the image, though you can use any other image generation tool available on the market.

The generate_image_with_runway function is responsible for generating an image using RunwayML.

Import and initialize RunwayML

from runwayml import RunwayML
runway_client = RunwayML()

It loads the RunwayML library and creates a client to interact with the API.

Prepare the prompt

print(f"Utilizing immediate: {immediate}")
print(f"Immediate size: {len(immediate)} characters")      
# Sanitize immediate yet another time
immediate = self.sanitize_prompt(immediate)

The prompt is printed and cleaned (sanitized) one more time to make sure it doesn’t break the model.

Convert the reference image

data_uri = self.image_to_data_uri(image_path)

The input image is converted into a Data URI (a Base64 string) so it can be passed to RunwayML as a reference.

Generate the image with RunwayML

task = runway_client.text_to_image.create(
    model="gen4_image",
    prompt_text=prompt,
    ratio='1360:768',
    reference_images=[{
        "uri": data_uri
    }]
).wait_for_task_output()

It sends the sanitized prompt plus the reference image to RunwayML’s gen4_image model to generate a new image.

Download and save the generated image

image_url = task.output[0]
response = requests.get(image_url)


unique_id = str(uuid.uuid4())
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"generated_{timestamp}_{unique_id}.png"
filepath = os.path.join('intermediate_images', filename)

Once RunwayML returns a URL, the function downloads the image. A unique filename (based on time and a UUID) is created, and the image is saved in the intermediate_images folder.

Error handling & fallback

  • If something goes wrong with the main prompt, the function retries with a simpler prompt (just ethnicity + profession).
  • If even that fails, it returns None.

Here is the complete code block for image generation:

def generate_image_with_runway(self, image_path, prompt, ethnicity, time_period, profession):
    """Generate image using RunwayML"""
    try:
        runway_client = RunwayML()

        print("Generating image with RunwayML...")
        print(f"Using prompt: {prompt}")
        print(f"Prompt length: {len(prompt)} characters")

        # Sanitize prompt one more time
        prompt = self.sanitize_prompt(prompt)
        print(f"Sanitized prompt: {prompt}")

        data_uri = self.image_to_data_uri(image_path)

        task = runway_client.text_to_image.create(
            model="gen4_image",
            prompt_text=prompt,
            ratio='1360:768',
            reference_images=[{
                "uri": data_uri
            }]
        ).wait_for_task_output()

        # Download the generated image
        image_url = task.output[0]
        response = requests.get(image_url)

        if response.status_code == 200:
            # Generate unique filename
            unique_id = str(uuid.uuid4())
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"generated_{timestamp}_{unique_id}.png"
            filepath = os.path.join('intermediate_images', filename)

            with open(filepath, 'wb') as f:
                f.write(response.content)

            print(f"Generated image saved as: {filepath}")
            return filepath
        else:
            print(f"Failed to download image. Status code: {response.status_code}")
            return None

    except Exception as e:
        print(f"Error generating image: {e}")
        print("Trying with a simpler prompt...")

        # Fallback with a very simple prompt
        simple_prompt = f"{ethnicity.value} {profession.value} portrait"
        try:
            runway_client = RunwayML()  # Re-create the client in case initialization failed above
            data_uri = self.image_to_data_uri(image_path)

            task = runway_client.text_to_image.create(
                model="gen4_image",
                prompt_text=simple_prompt,
                ratio='1360:768',
                reference_images=[{
                    "uri": data_uri
                }]
            ).wait_for_task_output()

            # Download the generated image
            image_url = task.output[0]
            response = requests.get(image_url)

            if response.status_code == 200:
                # Generate unique filename
                unique_id = str(uuid.uuid4())
                timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                filename = f"generated_{timestamp}_{unique_id}.png"
                filepath = os.path.join('intermediate_images', filename)

                with open(filepath, 'wb') as f:
                    f.write(response.content)

                print(f"Generated image saved as: {filepath}")
                return filepath
            else:
                print(f"Failed to download image. Status code: {response.status_code}")
                return None

        except Exception as e2:
            print(f"Fallback also failed: {e2}")
            return None

Output:

4. Generate a Video from the Image

Now that we have generated an image based exactly on the user input, here is the process to convert this image into a video.

Generate a Prompt for Video Generation

The enhance_video_prompt_with_openai function turns user choices into safe, creative video prompts using GPT-4.1-mini. It adapts the tone based on profession, i.e., serious for warriors, light and humorous for everyone else, while keeping the content family-friendly.

To maintain consistency, it also ensures the character’s face stays the same throughout the video. Along with the user selections, the image generation prompt is passed in too, so the video has the full context of the character and background. If OpenAI fails, a fallback prompt keeps things running smoothly.

def enhance_video_prompt_with_openai(self, action, image_prompt, ethnicity, time_period, profession, gender):
    """Enhanced video prompt generation - simplified and safe"""

    # Compare against the enum member's name so the warrior branch actually triggers
    if profession.name == "WARRIOR":
        video_prompt_base = f"""
        Context from image prompt : {image_prompt}
        Get context from the image prompt and generate a detailed and safe video prompt:
        - Character: A {gender} {ethnicity.value} {profession.value}
        - Action: {action.value}
        - Time Period: {time_period.value}
        - Focus entirely on the action.
        - Keep the language simple and appropriate.
        - Scene should be realistic.
        - Avoid controversial topics or violence.
        - Video should be appropriate for all ages.
        """
    else:
        video_prompt_base = f"""
        Context from image prompt : {image_prompt}
        Get context from the image prompt and generate a simple, safe and humorous video prompt:
        - Character: A {gender} {ethnicity.value} {profession.value}
        - Action: {action.value}
        - Time Period: {time_period.value}
        - Focus entirely on the action.
        - Keep the language simple and appropriate.
        - Make it a little bit humorous.
        - Scene should be realistic and funny.
        - Avoid controversial topics or violence.
        - Video should be appropriate for all ages
        """

    try:
        response = self.openai_client.chat.completions.create(
            model="gpt-4.1-mini",
            messages=[{"role": "user", "content": video_prompt_base}],
            max_tokens=60,
            temperature=0.5
        )
        enhanced_video_prompt = response.choices[0].message.content.strip()
        enhanced_video_prompt = enhanced_video_prompt + " Keep face of the @person consistent in whole video."
        return enhanced_video_prompt
    except Exception as e:
        print(f"Error enhancing video prompt: {e}")
        # Fallback prompt - very simple
        return f"{gender} {ethnicity.value} {profession.value} from {time_period.value} performing {action.value}, professional video"

Generate the Video with Seedance V1 Pro

For video generation, we’re using the Seedance V1 Pro model, accessed through fal.ai, which offers Seedance Pro at a cheaper price. I’ve tested many video generation models like Veo3, Kling AI, and Hailuo, and I find Seedance best for this purpose, since it has much better face consistency and is far cheaper. The only drawback is that it doesn’t provide audio/music in the video.

def generate_video_with_fal(self, image_path, video_prompt):
    """Generate video using fal_client API with error handling"""
    try:
        print("Generating video with fal_client...")
        print(f"Using video prompt: {video_prompt}")

        # Upload the generated image to fal_client
        print("Uploading image to fal_client...")
        image_url = fal_client.upload_file(image_path)
        print(f"Image uploaded successfully: {image_url}")

        # Call the model with the uploaded image URL
        print("Starting video generation...")
        result = fal_client.subscribe(
            "fal-ai/bytedance/seedance/v1/pro/image-to-video",
            arguments={
                "prompt": video_prompt,
                "image_url": image_url,
                "resolution": "720p",
                "duration": 10
            },
            with_logs=True,
            on_queue_update=self.on_queue_update,
        )

        print("Video generation completed!")
        print("Result:", result)

        # Extract video URL from result
        if result and 'video' in result and 'url' in result['video']:
            video_url = result['video']['url']
            print(f"Video URL: {video_url}")

            # Download the video
            print("Downloading generated video...")
            response = requests.get(video_url)

            if response.status_code == 200:
                # Generate unique filename for the video
                unique_id = str(uuid.uuid4())
                timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                video_filename = f"generated_video_{timestamp}_{unique_id}.mp4"
                video_filepath = os.path.join('final_videos', video_filename)

                # Save the video
                with open(video_filepath, "wb") as file:
                    file.write(response.content)

                print(f"Video generated successfully: {video_filepath}")
                return video_filepath
            else:
                print(f"Failed to download video. Status code: {response.status_code}")
                return None
        else:
            print("No video URL found in result")
            print("Full result structure:", result)
            return None

    except Exception as e:
        print(f"Error generating video with fal_client: {e}")
        if "sensitive" in str(e).lower():
            print("Content flagged as sensitive. Trying with a simpler prompt...")
            # Retry with a very basic prompt
            basic_prompt = "person moving"
            try:
                image_url = fal_client.upload_file(image_path)
                result = fal_client.subscribe(
                    "fal-ai/bytedance/seedance/v1/pro/image-to-video",
                    arguments={
                        "prompt": basic_prompt,
                        "image_url": image_url,
                        "resolution": "720p",
                        "duration": 10
                    },
                    with_logs=True,
                    on_queue_update=self.on_queue_update,
                )

                if result and 'video' in result and 'url' in result['video']:
                    video_url = result['video']['url']
                    response = requests.get(video_url)

                    if response.status_code == 200:
                        unique_id = str(uuid.uuid4())
                        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                        video_filename = f"generated_video_{timestamp}_{unique_id}.mp4"
                        video_filepath = os.path.join('final_videos', video_filename)

                        with open(video_filepath, "wb") as file:
                            file.write(response.content)

                        print(f"Video generated with basic prompt: {video_filepath}")
                        return video_filepath

            except Exception as e2:
                print(f"Even basic prompt failed: {e2}")
                return None
        return None

Output:

Add Your Logo to the Video (Optional)

If you’re creating a video for your organization, you can easily add a watermark to it. This helps protect your content by preventing others from using the video for commercial purposes.

The add_logo_to_video function adds a logo watermark to a video. It checks whether the logo exists, resizes it, and places it in the bottom-right corner of every frame. The processed frames are saved as a new video with a unique name. If something goes wrong, it skips the overlay and keeps the original video.

def add_logo_to_video(self, video_path, logo_width=200):
    """Add logo overlay to video before emailing"""
    try:
        print("Adding logo overlay to video...")

        # Check if the logo file exists
        if not os.path.exists(self.logo_path):
            print(f"Logo file not found at {self.logo_path}. Skipping logo overlay.")
            return video_path

        # Load logo with transparency using Pillow
        logo = Image.open(self.logo_path).convert("RGBA")

        # Open the input video
        cap = cv2.VideoCapture(video_path)

        # Get video properties
        width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = cap.get(cv2.CAP_PROP_FPS)
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')

        # Create output filename
        unique_id = str(uuid.uuid4())
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        output_filename = f"video_with_logo_{timestamp}_{unique_id}.mp4"
        output_path = os.path.join('final_videos', output_filename)

        out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))

        # Resize logo to the specified width, keeping its aspect ratio
        logo_ratio = logo_width / logo.width
        logo_height = int(logo.height * logo_ratio)
        logo = logo.resize((logo_width, logo_height))

        frame_count = 0
        total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

        while True:
            ret, frame = cap.read()
            if not ret:
                break

            # Show progress
            frame_count += 1
            if frame_count % 10 == 0:
                progress = (frame_count / total_frames) * 100
                print(f"Processing frame {frame_count}/{total_frames} ({progress:.1f}%)")

            # Convert frame to PIL Image
            frame_pil = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)).convert("RGBA")

            # Calculate position (bottom-right corner with padding)
            pos = (frame_pil.width - logo.width - 10, frame_pil.height - logo.height - 10)

            # Paste the logo onto the frame
            frame_pil.alpha_composite(logo, dest=pos)

            # Convert back to OpenCV BGR format
            frame_bgr = cv2.cvtColor(np.array(frame_pil.convert("RGB")), cv2.COLOR_RGB2BGR)

            out.write(frame_bgr)

        cap.release()
        out.release()

        print(f"Logo overlay completed: {output_path}")

        return output_path

    except Exception as e:
        print(f"Error adding logo to video: {e}")
        print("Continuing with original video...")
        return video_path

Send the Video via Email

Once the video is generated, users will want to view and download it. This function makes that possible by sending the video directly to the email address they provided.

def send_email_with_attachments(self, recipient_email, image_path, video_path):
    """Send email with generated content using yagmail"""
    try:
        # Get email credentials from environment variables
        sender_email = os.getenv('SENDER_EMAIL')
        sender_password = os.getenv('SENDER_PASSWORD')

        if not sender_email or not sender_password:
            print("Email credentials not found in environment variables")
            return False

        yag = yagmail.SMTP(sender_email, sender_password)

        subject = "Your AI Generated Character Image and Video"

        body = """
        Hello!

        Your AI-generated character content is ready!

        Attached you'll find:
        - Your generated character image
        - Your generated character video (with logo overlay)

        Thank you for using our AI Image-to-Video Generator!

        Best regards,
        AI Generator Team
        """

        attachments = []
        if image_path:
            attachments.append(image_path)
        if video_path:
            attachments.append(video_path)

        yag.send(
            to=recipient_email,
            subject=subject,
            contents=body,
            attachments=attachments
        )

        print(f"Email sent successfully to {recipient_email}")
        return True

    except Exception as e:
        print(f"Error sending email: {e}")
        return False

At this stage, you’ve built the core engine of your web app: generating an image, creating a video, adding a logo, and sending it to the user via email. The next step is to connect it all together by creating the frontend and backend for the web app.
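
As a bridge to that step, here is a minimal sketch of how the pieces above could be wired together end to end. The TimeCapsule class name and this exact orchestration are assumptions for illustration (the booth version also had a frontend on top), but the method names and signatures match the snippets in this article:

# Hypothetical end-to-end orchestration of the methods shown above
def run_pipeline(app, recipient_email):
    photo = app.capture_image()
    if not photo:
        return

    ethnicity, time_period, profession, action = app.get_user_selections()
    gender = app.detect_gender_from_image(photo)

    image_prompt = app.enhance_image_prompt_with_openai(ethnicity, time_period, profession, gender, action)
    generated_image = app.generate_image_with_runway(photo, image_prompt, ethnicity, time_period, profession)
    if not generated_image:
        return

    video_prompt = app.enhance_video_prompt_with_openai(action, image_prompt, ethnicity, time_period, profession, gender)
    video = app.generate_video_with_fal(generated_image, video_prompt)
    if video:
        branded = app.add_logo_to_video(video)
        app.send_email_with_attachments(recipient_email, generated_image, branded)

# Usage (assumed): run_pipeline(TimeCapsule(), "visitor@example.com")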

Challenges

Building a personalized video generation web app comes with a number of technical and operational challenges:

1. Handling AI Failures and API Errors

  • AI models for image and video generation can fail unexpectedly.
  • APIs may return errors or produce undesired outputs.
  • Fallback strategies were essential to ensure smooth operation, such as using simplified prompts or alternative generation methods.

2. Managing Sensitive Content

  • AI-generated content can inadvertently produce inappropriate results.
  • Implementing checks and safe prompts ensured that all outputs remained family-friendly.

3. User Expectations for Personalization

  • Users expect highly accurate and personalized results.
  • Ensuring gender, ethnicity, profession, and other details were correctly reflected required careful prompt design and validation.

4. Finalizing the Video Generation Model

  • Finding a model that maintained face consistency at a reasonable cost was difficult.
  • After testing several options, Seedance V1 Pro via Fal.ai offered the best balance of quality, consistency, and cost.

5. Fake or Unreliable Aggregators

  • Choosing a reliable model provider was tricky. Fal.ai worked well, but I had experimented with others earlier:
  • Replicate.com: Fast, but with limited customization options, and it later faced payment issues.
  • Pollo.ai: Good interface, but their API service was unreliable; it generated no videos.
  • The key takeaway: avoid fake or unreliable providers; always test thoroughly before committing.

6. Time Management and Performance

  • Video generation is time-consuming, especially for real-time demos.
  • Optimizations like LRU caching and multiple API instances helped reduce latency and improve performance during events (a small caching sketch follows below).
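
To illustrate the caching idea: repeated prompt-enhancement calls with identical selections can be memoized, so two booth visitors who pick the same combination don’t trigger two separate OpenAI requests. This is a minimal sketch using functools.lru_cache; build_image_prompt is a hypothetical stand-in for the enhancement logic above, not the exact optimization used at the summit:

from functools import lru_cache

@lru_cache(maxsize=256)
def cached_image_prompt(gender, ethnicity_value, profession_value, period_value, action_value):
    """Memoize prompt enhancement for repeated identical selections (hypothetical wrapper)."""
    # Plain-string arguments are hashable, which is what lru_cache requires;
    # delegate to the real enhancement call (stand-in name).
    return build_image_prompt(gender, ethnicity_value, profession_value, period_value, action_value)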

Beyond the Time Capsule: What Else Could You Build?

The Time Capsule is just one example of a personalized video generation app. The core engine can be adapted to create a variety of innovative applications:

  • Personalized Greetings: Generate birthday or festival videos featuring friends and family in historical or fantasy settings.
  • Marketing & Branding: Produce promotional videos for businesses, adding logos and customized characters to showcase products or services.
  • Educational Content: Bring historical figures, scientific concepts, or literature scenes to life in a visually engaging way.
  • Interactive Storytelling: Allow users to create mini-movies where characters evolve based on user input, choices, or actions.
  • Gaming Avatars & Animations: Generate personalized in-game characters, action sequences, or short cutscenes for game storytelling.

The possibilities are endless: in any scenario where you want personalized, visual, and interactive content, this engine can help bring ideas to life.

Conclusion

The Time Capsule web app shows just how far AI has come, from generating text and images to creating personalized videos that truly feel like your own. You start with a simple photo, pick a time period, profession, and action, and in moments, the AI brings your historical or fantasy self to life. Along the way, we handle challenges like AI errors, sensitive content, and time-consuming video generation with sensible fallbacks and optimizations. What makes this exciting isn’t just the technology; it’s the endless possibilities.

From fun personalized greetings to educational storytelling, marketing videos, or interactive mini-movies, this engine can be adapted to bring countless creative ideas to life. With a little imagination, your Time Capsule can be the start of something truly magical.
