🚀 How I Created 144 Fully Automated YouTube Videos Per Day Using ChatGPT & Leonardo.ai 🎥 (+Full Source Code)

With the rise of automated content creation tools like Dubdup and Canva, there’s been a lot of buzz about generating thousands of YouTube Shorts effortlessly. The pitch is nearly always some variation of “AI Side Hustle for $XXX per day” or a “Faceless” channel.

Fascinated by these claims, I decided to take on the challenge of creating fully automated YouTube videos using AI services like ChatGPT and Leonardo.ai. My goal was to produce 5–10 minute videos at scale, similar to this example, without relying on packaged third-party video tools. Here’s a detailed walkthrough of how I achieved this, including the insights and challenges I faced along the way.

AI-generated YouTube video

Overview of the Process

In this post, I will outline the steps I took to create a script that automates the production of YouTube videos. From generating video topics to creating thumbnails, this end-to-end solution demonstrates the power and potential of AI in content creation. Below is a summary of the key functions and their purposes:

  1. Generate Topic: Use ChatGPT to recommend engaging and relevant topics for a self-motivation YouTube channel.
  2. Generate YouTube Script: Create a detailed, engaging script for the video, ensuring it includes an introduction, key points, personal stories, and a motivational conclusion. This step also involves removing unnecessary characters to improve the quality of the text-to-speech output.
  3. Generate Leonardo Prompts: Develop prompts for Leonardo.ai to create images that reflect the video’s content. These images are then upscaled and animated using Leonardo.ai’s motion generation API, as outlined in their documentation.
  4. Text-to-Speech Conversion: Convert the YouTube script into a high-quality audio narration using Google Cloud’s Text-to-Speech API. This setup includes configuring the voice for a documentary-style narration.
  5. Search and Download Music: Initially, I attempted to use Pixabay’s API to search for music, but due to limitations, I decided to download royalty-free music locally. Alternatively, you can download music from YouTube Studio.
  6. Concatenate Video with Music and TTS: Combine the generated video clips, background music, and narration into a single cohesive video. This involves adding cross-dissolve transitions between video clips and overlaying subtitles.
  7. Create YouTube Thumbnail: Generate a visually appealing thumbnail for the video using a frame from the video and adding the video title text centrally.
  8. Save and Organize Content: Save all generated content, including the video, thumbnail, and script, in a structured folder system.
  9. Automate with Scheduling: Use the schedule library to run the script multiple times a day, enabling continuous video production.
My script produces a new HD video every 10 minutes.

However, uploading videos via the YouTube API requires additional verification, so I opted to upload the auto-created videos myself. With the help of ChatGPT, I managed to write the code for this project in just 2 hours. Here’s a detailed guide on how I created 144 videos per day, producing one video every 10 minutes.

Summary of Requests and Adjustments

  1. Initial Request: I aimed to create a YouTube channel focused on the mindsets and motivations of successful people.
  2. Script Generation: I needed a script generator that would produce engaging content with minimal awkward phrases. This involved removing unnecessary characters like “Host:” to ensure smooth text-to-speech conversion.
  3. Image and Video Generation: Using Leonardo.ai, I generated images, upscaled them, and converted them into videos. This process was guided by the Leonardo.ai documentation.
  4. Text-to-Speech Conversion: I configured a documentary-style narration using Google Cloud Text-to-Speech. This service offers a free quota, and new users receive $300 in free credits. You need to create a Google Cloud account and enable the API to use this service.
  5. Subtitle Generation: Subtitles were generated using OpenAI’s Whisper API.
  6. Background Music and Transitions: Initially, I attempted to use the Pixabay API for background music. However, since they do not provide an API for music search, I opted to download royalty-free music locally or from YouTube Studio.
  7. Thumbnail Creation: Thumbnails were created using the Python Imaging Library (PIL).
  8. Saving and Uploading Content: All generated content was saved in a structured folder and uploaded to YouTube.
  9. Upload Video to YouTube: Using the YouTube Data API, upload each created video with its generated title, description, and thumbnail.
  10. Scheduling the Script: I used the schedule library in Python to run the script multiple times a day.

Ideation with ChatGPT

The overall process with ChatGPT was straightforward yet powerful. Initially, I tasked ChatGPT with creating an idea-generating function for self-improvement and motivational content tailored for a YouTube channel. Leveraging its API, I then requested a function to generate a detailed script for the video content. With the generated script in hand, I asked ChatGPT to produce 5–10 prompts suitable for creating images and videos using Leonardo.ai, ensuring the prompts aligned with the script’s theme.

Following this, I sought functions to handle text-to-speech (TTS) and video editing through Python APIs. While ChatGPT initially recommended the simple gTTS library (gtts), I opted for the more advanced Google Cloud Text-to-Speech API, given my existing Google Cloud account. This API provides more natural and versatile speech synthesis.

Further, I requested functions to generate and upscale images using Leonardo.ai, convert these images into videos, and then apply cross-dissolve effects to match the voice-over’s duration. Additionally, I wanted to incorporate royalty-free background music and subtitles within the video. To streamline the output, I asked for a function that saves all generated content into a timestamped folder. Lastly, to automate this process, I decided to use Python’s Schedule library to run the script at regular intervals.

Overcoming Challenges and Fine-Tuning

Throughout the development process, a few fine-tuning steps were necessary. Although ChatGPT understood my requests and provided the essential functions, there were initial errors. For instance, the code initially provided did not align with the latest OpenAI Python library versions. However, using OpenAI’s suggestion to employ the openai migrate command resolved this issue.

Moreover, ChatGPT initially misunderstood the usage of Leonardo.ai, but by referring to Leonardo.ai’s API documentation and recipes, I guided ChatGPT to generate the appropriate code. Collaborating with ChatGPT significantly reduced the time required to complete this task, from an estimated 6 hours to about 2 hours.

Experimentation with Background Music

During the testing phase, I discovered that the Pixabay API did not support background music downloads. I then asked ChatGPT to use Selenium for direct downloads, but this approach proved cumbersome due to the need for ChromeDriver and a headless browser. Ultimately, I manually downloaded the top 40 motivational tracks, considering potential copyright issues with Pixabay content on YouTube.

Detailed Walkthrough

1. Generate Topic

The first step involves generating a compelling topic for the YouTube video. Using OpenAI’s ChatGPT, we create an engaging and inspirational topic relevant to an audience seeking personal growth and motivation. This ensures that our content remains fresh and appealing to viewers.

from openai import OpenAI
client = OpenAI()  # reads OPENAI_API_KEY from the environment
MODEL = "gpt-4o"  # assumption: the model line is missing from this listing; use your own

def generate_topic():
    prompt = "Recommend a compelling topic for a self-motivation YouTube channel. The topic should be engaging, inspirational, and relevant to an audience seeking personal growth and motivation."

    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )

    topic = response.choices[0].message.content.strip()
    return topic

2. Generate YouTube Script

Once we have a topic, we generate a detailed YouTube script. This script includes an engaging introduction, key points, personal stories, and a motivational conclusion. We then strip bracketed stage directions, numbering, speaker labels, and stray symbols from the text. This avoids awkward text-to-speech output, such as the narrator reading “Host:” aloud.

import re

def generate_youtube_script(topic):
    prompt = (
        f"Create a YouTube script for a self-motivation channel. The topic is '{topic}'. "
        "The script should include an engaging introduction, a main section with key points, "
        "personal stories or examples, and a conclusion with a motivational message. "
        "The script length should be at least 10-15 minutes in normal speech. "
        "The subtopics should be at least 10. "
        "Delete any text within brackets like [INTRO] in this text and all numbering, and remove 'Host: '. "
        "Rephrase the script to be more engaging and introduce the channel name 'Motivation Marvel'. "
        "Ask users to subscribe and like our channel and this video at the end of the script."
    )

    response = client.chat.completions.create(
        model=MODEL,  # chat model name; not shown in the original listing
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )

    script = response.choices[0].message.content.strip()
    # Remove text within brackets
    script = re.sub(r"\[.*?\]", "", script)
    # Remove numbering
    script = re.sub(r"^\d+\.\s*", "", script, flags=re.MULTILINE)
    # Remove 'Host: '
    script = re.sub(r"\bHost:\s*", "", script)
    # Remove all characters except alphanumeric, space, and specified punctuation
    script = re.sub(r"[^a-zA-Z0-9\s,?.!]", "", script)
    return script
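To make the cleanup pass concrete, here is a minimal standalone sketch that applies the same four regular expressions to a made-up sample. The helper name clean_script and the sample text are illustrative assumptions, not part of the original script.

```python
import re

def clean_script(raw: str) -> str:
    """Apply the same four cleanup passes used in generate_youtube_script."""
    text = re.sub(r"\[.*?\]", "", raw)                         # drop [INTRO]-style markers
    text = re.sub(r"^\d+\.\s*", "", text, flags=re.MULTILINE)  # drop leading "1. " numbering
    text = re.sub(r"\bHost:\s*", "", text)                     # drop the speaker label
    text = re.sub(r"[^a-zA-Z0-9\s,?.!]", "", text)             # keep letters, digits, whitespace and , ? . !
    return text.strip()

# Hypothetical sample of raw model output
sample = "[INTRO]\nHost: Welcome to Motivation Marvel!\n1. Don't quit."
cleaned = clean_script(sample)
print(cleaned)
```

One side effect worth knowing: the last pattern also removes apostrophes, so "Don't" comes out as "Dont". If the narration sounds off on contractions, adding the apostrophe to the allowed character class is one possible adjustment.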

3. Generate Leonardo Prompts

For the visual content, we utilize Leonardo.ai’s API. This step involves generating high-quality images, upscaling them, and creating motion videos. This process transforms static images into dynamic visual content, making the video more engaging. I referred to the documentation at Leonardo.ai’s API recipes to understand the process of generating images, upscaling them, and creating motion videos using their motion generation API.

def generate_leonardo_prompts(youtube_script):
    prompt = (
        "Based on the following YouTube script, generate 10 text prompts for Leonardo.ai. "
        "The prompts should feature successful rich people's motivation. Each prompt should describe a single person and no face in the picture. "
        "No text should be in the picture. Ensure the face does not appear in the picture. Please do not include numbering in the prompts. "
        "Additionally, do not include any text to the created image, and the person's face should not look at the front in the created photo. "
        f"Here is the YouTube script:\n\n{youtube_script}"
    )

    response = client.chat.completions.create(
        model=MODEL,  # chat model name; not shown in the original listing
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )

    prompts = response.choices[0].message.content.strip().split("\n")
    prompts = [
        line for line in prompts if line.strip()
    ]  # Remove any empty prompts

    # Add the additional requirements to each prompt
    prompts = [
        f"{line}. Additionally, do not include any text to the created image, and the person's face should not look at the front in the created photo."
        for line in prompts
    ]

    return prompts
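Even when asked not to, a chat model may still return numbered or bulleted lines. The following is a small defensive sketch, not part of the original script: the helper parse_prompt_lines, its limit parameter, and the sample reply are assumptions made for illustration.

```python
import re

def parse_prompt_lines(raw: str, limit: int = 10) -> list[str]:
    """Turn a raw multi-line model reply into a clean list of image prompts."""
    prompts = []
    for line in raw.splitlines():
        # Strip a leading bullet or "1." / "1)" numbering, then spaces and quotes
        line = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip().strip('"')
        if line:
            prompts.append(line)
    return prompts[:limit]

# Hypothetical model reply that ignored the "no numbering" instruction
raw = '1. A runner at sunrise, seen from behind\n\n- "Hands typing on a laptop at night"\n'
parsed = parse_prompt_lines(raw)
print(parsed)
```

Capping the list at a fixed limit also keeps the number of Leonardo.ai generations per video predictable, which matters when every image, upscale, and motion clip consumes API credits.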

4. Text-to-Speech Conversion

The script is then converted into speech using Google Cloud’s Text-to-Speech API. We configure the voice to have a documentary-style tone, ensuring it sounds professional and engaging. Google Cloud provides a free quota for this service, making it cost-effective. You need to create a Google Cloud account and enable the Text-to-Speech API to use this service, which comes with $300 in free credits.

import os
from google.cloud import texttospeech
from pydub import AudioSegment
from pydub.silence import split_on_silence

def convert_text_to_speech(text, filename):
    # Initialize the Google Cloud Text-to-Speech client
    client = texttospeech.TextToSpeechClient.from_service_account_json(
        SERVICE_ACCOUNT_JSON  # path to your key file; the original argument is not shown
    )

    # Set the text input to be synthesized
    synthesis_input = texttospeech.SynthesisInput(text=text)

    # Build the voice request
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US",
        name="en-US-Wavenet-B",  # Example of a WaveNet male voice
    )

    # Set the audio configuration
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3,
        pitch=-2.0,  # Slightly lower pitch for seriousness
        speaking_rate=0.9,  # Slightly slower speaking rate for calmness
        volume_gain_db=0.0,  # Default volume
    )

    # Perform the text-to-speech request
    response = client.synthesize_speech(
        input=synthesis_input, voice=voice, audio_config=audio_config
    )

    # Save the response to an audio file
    with open("temp.mp3", "wb") as out:
        out.write(response.audio_content)
    print('Audio content written to file "temp.mp3"')

    # Load the generated speech
    audio = AudioSegment.from_file("temp.mp3", format="mp3")

    # Split on silence, keeping 250 ms of padding around each chunk
    chunks = split_on_silence(
        audio, silence_thresh=-50, min_silence_len=500, keep_silence=250
    )

    # Concatenate chunks to form the final audio
    trimmed_audio = AudioSegment.empty()
    for chunk in chunks:
        trimmed_audio += chunk

    # Export the final audio with long silences trimmed
    trimmed_audio.export(filename, format="mp3")
    print(f"Saved TTS audio to {filename}")

    # Remove temporary file
    os.remove("temp.mp3")
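One practical limit to plan for: to the best of my knowledge, a single synthesize_speech request accepts only about 5,000 bytes of input, and a 10-15 minute script is well beyond that (check the current Google Cloud quota page before relying on the exact figure). The sketch below shows one way to split a long script at sentence boundaries. The helper name split_for_tts and the 4,500-byte budget are assumptions, not part of the original code.

```python
import re

def split_for_tts(text: str, max_bytes: int = 4500) -> list[str]:
    """Group whole sentences into chunks whose UTF-8 size stays within max_bytes."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate.encode("utf-8")) > max_bytes:
            chunks.append(current)   # current chunk is full, start a new one
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

# Dummy narration of roughly 12,000 bytes
script = "Keep going. " * 1000
parts = split_for_tts(script)
print(len(parts), max(len(part.encode("utf-8")) for part in parts))
```

Each chunk could then be synthesized separately and the resulting audio segments appended in order. A single sentence longer than the budget would still come out as one oversized chunk, so very long sentences need extra handling.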

5. Generate Subtitles

To enhance accessibility and viewer engagement, we generate subtitles for the video. Using OpenAI’s Whisper API, we transcribe the audio and create an SRT file. This file is then used to overlay subtitles onto the video.
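For reference, an SRT file is a sequence of numbered blocks, each with a time range and the text to show. The sketch below builds one from timed segments. The (start, end, text) segment shape and both helper names are assumptions for illustration; the transcription call itself is not shown.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)

# Hypothetical transcription segments
segments = [
    (0.0, 2.5, "Welcome to Motivation Marvel."),
    (2.5, 6.04, "Today we talk about habits."),
]
srt_text = segments_to_srt(segments)
print(srt_text)
```

If the transcription service can return SRT directly, this conversion step is unnecessary. The sketch is mainly useful for understanding or post-processing the file.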

6. Concatenate Video with Music and Voice Over

Initially, I attempted to use Pixabay’s API to search for motivational music. However, since they don’t provide an API for music search, I downloaded royalty-free music tracks to my local storage. These tracks are added to the video to enhance its emotional impact. Alternatively, you can download royalty-free music from YouTube Studio.

import os
import random
from moviepy.editor import (
    AudioFileClip, CompositeAudioClip, CompositeVideoClip,
    TextClip, VideoFileClip, afx, vfx,
)

# parse_srt and loop_audio_clips_sequentially are helpers from the full script (not shown here).
# The parameter list is reconstructed from the names used in the body.
def concatenate_video_with_music_and_tts(
    video_path, tts_path, music_folder, srt_path, output_path,
    target_resolution=(720, 1280),
):
    video_clip = VideoFileClip(video_path)
    tts_clip = AudioFileClip(tts_path)

    # Resize the video clip to 1920 px wide (target_resolution is not used in this excerpt)
    video_clip = video_clip.resize(width=1920)

    # Load and randomly select music files from the specified folder
    music_files = [
        os.path.join(music_folder, file)
        for file in os.listdir(music_folder)
        if file.endswith(".mp3")
    ]
    num_music_files = len(music_files)
    num_clips_to_select = min(
        num_music_files, 40
    )  # Adjust the number of music clips to select as needed
    music_paths = random.sample(music_files, num_clips_to_select)
    music_clips = [
        AudioFileClip(music_path).fx(afx.volumex, 0.1) for music_path in music_paths
    ]  # Reduce music volume to 10% so it sits under the narration

    # Repeat the video to match the length of the TTS clip
    video_clip = vfx.loop(video_clip, duration=tts_clip.duration)

    # Loop the audio clips sequentially to match the length of the TTS clip
    looped_music_clip = loop_audio_clips_sequentially(music_clips, tts_clip.duration)

    # Create a composite audio clip with TTS and music starting at the same time
    composite_audio = CompositeAudioClip([tts_clip, looped_music_clip]).set_duration(
        tts_clip.duration
    )

    # Create subtitle clips from SRT file (start and end are assumed to be in seconds)
    subtitles = parse_srt(srt_path)
    font_path = "NotoSans-Bold.ttf"
    subtitle_clips = [
        TextClip(
            text, font=font_path,
            fontsize=48, color="white",  # assumptions: the original TextClip arguments were lost
            # bg_color="rgba(0, 0, 0, 0.5)",  # semi-transparent background
        )
        .set_position(("center", "center"))
        .set_start(start)
        .set_duration(end - start)
        for (start, end, text) in subtitles
    ]

    # Overlay subtitles on the video clip
    video_with_subtitles = CompositeVideoClip([video_clip] + subtitle_clips)

    # Set the composite audio to the video clip with subtitles
    final_video = video_with_subtitles.set_audio(composite_audio)

    # Add fade-in and fade-out effects
    final_video = final_video.fadein(2).fadeout(2)

    # Write the final video to a file
    final_video.write_videofile(
        output_path, codec="libx264", audio_codec="aac", fps=30, preset="ultrafast"
    )
    print(f"Saved final video to {output_path}")
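The listing calls parse_srt(srt_path) and unpacks each entry as (start, end, text), but the helper itself is not shown. Below is one possible implementation, written as a hedged sketch: the return shape (times in seconds) is inferred from how the result is used, and everything else, including to_seconds, is an assumption.

```python
import re
import tempfile
from pathlib import Path

TIME_RE = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")

def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:02,500 into seconds."""
    hours, minutes, seconds, millis = map(int, TIME_RE.match(stamp.strip()).groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000

def parse_srt(srt_path):
    """Return a list of (start_seconds, end_seconds, text) tuples from an SRT file."""
    content = Path(srt_path).read_text(encoding="utf-8")
    entries = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip malformed blocks
        start, end = lines[1].split("-->")
        entries.append((to_seconds(start), to_seconds(end), " ".join(lines[2:])))
    return entries

# Round-trip a tiny hand-written SRT file
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.srt"
    path.write_text(
        "1\n00:00:00,000 --> 00:00:02,500\nHello there.\n\n"
        "2\n00:00:02,500 --> 00:00:05,000\nKeep\ngoing.\n",
        encoding="utf-8",
    )
    subtitles = parse_srt(path)
print(subtitles)
```

Multi-line subtitle text is joined with spaces here so that each entry maps to a single text clip.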

7. Create YouTube Thumbnail

A custom thumbnail is essential for attracting viewers. The script captures a frame from the video and overlays text to create a visually appealing thumbnail. This step ensures that the video stands out in YouTube search results and recommendations. The create_youtube_thumbnail function generates a high-quality thumbnail by capturing a frame from the video, resizing it to 1280×720, and overlaying text in the center with a professional font and style.

import os
import re

from moviepy.editor import VideoFileClip
from PIL import Image, ImageDraw, ImageFont

def create_youtube_thumbnail(video_path, text, output_path):
    # Remove 'Title: '
    text = re.sub(r"\bTitle:\s*", "", text)
    # Load the video clip
    video_clip = VideoFileClip(video_path)

    # Get a frame from one eighth of the way into the video
    frame = video_clip.get_frame(video_clip.duration / 8)

    # Convert the frame to an image
    image = Image.fromarray(frame)

    # Draw the text on the image
    draw = ImageDraw.Draw(image)
    font_path = "NotoSans-Bold.ttf"  # Provide the correct path to your font file

    # Check if the font file exists
    if not os.path.exists(font_path):
        raise FileNotFoundError(f"Font file not found: {font_path}")

    font = ImageFont.truetype(font_path, 50)

    # Function to wrap text into lines no wider than max_width
    def draw_text(draw, text, font, max_width):
        lines = []
        words = text.split(" ")
        line = []
        for word in words:
            test_line = " ".join(line + [word])
            bbox = draw.textbbox((0, 0), test_line, font=font)
            width = bbox[2] - bbox[0]
            if width <= max_width:
                line.append(word)
            else:
                lines.append(" ".join(line))
                line = [word]
        lines.append(" ".join(line))
        return lines

    max_width = image.width - 100  # Maximum width for the text
    lines = draw_text(draw, text, font, max_width)

    # Calculate the total height of the text block (10 px gap between lines)
    total_height = (
        sum(
            draw.textbbox((0, 0), line, font=font)[3]
            - draw.textbbox((0, 0), line, font=font)[1]
            for line in lines
        )
        + (len(lines) - 1) * 10
    )
    y_text = (image.height - total_height) // 2  # Start so the block is vertically centered

    for line in lines:
        bbox = draw.textbbox((0, 0), line, font=font)
        width = bbox[2] - bbox[0]
        height = bbox[3] - bbox[1]
        draw.text(
            ((image.width - width) / 2, y_text),
            line,
            font=font,
            fill="white",  # assumption: the original color arguments were lost
        )
        y_text += height + 10

    # Resize and crop the image to fill the entire target dimensions
    target_width, target_height = 1280, 720
    original_aspect = image.width / image.height
    target_aspect = target_width / target_height

    if original_aspect > target_aspect:
        # Crop the width
        new_height = target_height
        new_width = int(target_height * original_aspect)
        image = image.resize((new_width, new_height), Image.Resampling.LANCZOS)
        crop_x = (new_width - target_width) // 2
        image = image.crop((crop_x, 0, crop_x + target_width, new_height))
    else:
        # Crop the height
        new_width = target_width
        new_height = int(target_width / original_aspect)
        image = image.resize((new_width, new_height), Image.Resampling.LANCZOS)
        crop_y = (new_height - target_height) // 2
        image = image.crop((0, crop_y, new_width, crop_y + target_height))

    # Save the image as a thumbnail
    image.save(output_path)
    print(f"Saved YouTube thumbnail to {output_path}")
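The resize-and-crop step at the end of create_youtube_thumbnail is a "cover" fit: scale the frame until it fills 1280×720 in both directions, then crop the overflow from the center. Here is the same arithmetic as a standalone sketch with no image library involved. The helper name cover_crop_box is an assumption, and it rounds where the listing truncates with int(), so results can differ by a pixel.

```python
def cover_crop_box(width, height, target_width=1280, target_height=720):
    """Return (resized_size, crop_box) that fills the target without distortion."""
    # Use the larger scale factor so both dimensions reach the target
    scale = max(target_width / width, target_height / height)
    new_width, new_height = round(width * scale), round(height * scale)
    # Center the crop window inside the resized image
    left = (new_width - target_width) // 2
    top = (new_height - target_height) // 2
    return (new_width, new_height), (left, top, left + target_width, top + target_height)

landscape = cover_crop_box(1920, 1080)  # already 16:9, nothing is cropped
portrait = cover_crop_box(720, 1280)    # vertical frame, top and bottom are cropped
print(landscape)
print(portrait)
```

For a vertical source frame most of the height is discarded, so text drawn before this step can end up partly cropped. Drawing the title after cropping is one way to avoid that.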

8. Save and Organize Content

Once the narration, video, subtitles, and thumbnail exist, the script gathers them in one place. The save_to_folder function creates a folder named after the current timestamp, writes the topic, script, and title to a text file, and moves the generated audio, video, subtitle, and thumbnail files into that folder.

import os
from datetime import datetime

# The parameter list is reconstructed from the names used in the body.
def save_to_folder(
    topic, youtube_script, title,
    tts_filename, video_filename, srt_filename, thumbnail_filename,
):
    # Generate the current date and time in YYYYMMDD_HHMMSS format
    current_time = datetime.now().strftime("%Y%m%d_%H%M%S")
    # Create the folder
    folder_name = f"{current_time}"
    os.makedirs(folder_name, exist_ok=True)

    # Save the text content to a file
    text_filename = os.path.join(folder_name, f"{current_time}_content.txt")
    with open(text_filename, "w") as file:
        file.write(f"Topic: {topic}\n\n")
        file.write(f"YouTube Script:\n{youtube_script}\n\n")
        file.write(f"Title: {title}\n\n")

    # Move the generated files to the folder (file names are expected without directories)
    os.rename(tts_filename, os.path.join(folder_name, tts_filename))
    os.rename(video_filename, os.path.join(folder_name, video_filename))
    os.rename(srt_filename, os.path.join(folder_name, srt_filename))
    os.rename(thumbnail_filename, os.path.join(folder_name, thumbnail_filename))

    print(f"Saved all files to folder {folder_name}")
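The os.rename calls above work when the generated files sit in the working directory under plain file names. A slightly more forgiving variant, sketched here under that caveat, uses pathlib and shutil.move and keeps only each file's name when building the destination. The helper archive_outputs and its base_dir parameter are assumptions for illustration.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

def archive_outputs(files, base_dir="."):
    """Move generated files into a new timestamped folder and return its path."""
    folder = Path(base_dir) / datetime.now().strftime("%Y%m%d_%H%M%S")
    folder.mkdir(parents=True, exist_ok=True)
    for file in map(Path, files):
        # Keep only the file name so inputs may live in any directory
        shutil.move(str(file), str(folder / file.name))
    return folder

# Demonstration with a placeholder file in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    work.mkdir()
    video = work / "final_video.mp4"
    video.write_bytes(b"placeholder")
    folder = archive_outputs([video], base_dir=tmp)
    moved = sorted(p.name for p in folder.iterdir())
    source_still_exists = video.exists()
print(moved)
```

Because the folder name has one-second resolution, two runs within the same second would share a folder. At one video every 10 minutes that is not a concern.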

9. Upload Video to YouTube

The final step involves uploading the video to YouTube. While additional verification is required to use the YouTube Data API for automated uploads, the script provides a structured approach to manually upload the video if necessary. This function uses the YouTube Data API to upload the video, set the title and description, and add the custom thumbnail.

import os
import pickle

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

# Setup YouTube API credentials
CLIENT_SECRETS_FILE = "path/to/your/client_secret.json"
SCOPES = ["https://www.googleapis.com/auth/youtube.upload"]

def get_authenticated_service():
    credentials = None
    # Reuse cached credentials from a previous run if present
    if os.path.exists("token.pickle"):
        with open("token.pickle", "rb") as token:
            credentials = pickle.load(token)
    if not credentials or not credentials.valid:
        if credentials and credentials.expired and credentials.refresh_token:
            credentials.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(CLIENT_SECRETS_FILE, SCOPES)
            credentials = flow.run_local_server(port=0)
        with open("token.pickle", "wb") as token:
            pickle.dump(credentials, token)
    return build("youtube", "v3", credentials=credentials)

def upload_video_to_youtube(youtube, video_file, title, description, thumbnail_file):
    body = {
        "snippet": {
            "title": title,
            "description": description,
            "tags": ["motivation", "self-improvement", "success"],
            "categoryId": "22",  # Category: People & Blogs
        },
        "status": {
            "privacyStatus": "public",
            "selfDeclaredMadeForKids": False,
        },
    }

    # Call the API's videos.insert method to create and upload the video.
    media = MediaFileUpload(video_file, chunksize=-1, resumable=True)
    request = youtube.videos().insert(
        part="snippet,status", body=body, media_body=media
    )

    response = request.execute()
    print("Uploaded video with ID: " + response["id"])

    # Set the thumbnail for the uploaded video
    youtube.thumbnails().set(
        videoId=response["id"], media_body=MediaFileUpload(thumbnail_file)
    ).execute()
10. Automate with Scheduling

To automate the entire process, we use the schedule Python library. This allows the script to run multiple times a day, ensuring a steady stream of content is produced and ready for upload.
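The scheduling code is not shown above, so here is a minimal sketch of the idea. To stay runnable without extra installs it uses a plain standard-library loop rather than the schedule package. The helper names, the max_runs parameter, and the injectable sleep function are all assumptions for illustration.

```python
import time

def videos_per_day(interval_minutes: int) -> int:
    """How many runs fit into 24 hours at a fixed interval."""
    return (24 * 60) // interval_minutes

def run_every(interval_minutes, job, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, waiting interval_minutes between runs."""
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_minutes * 60)
    return runs

# Dry run: record the waits instead of actually sleeping
waits = []
completed = run_every(10, lambda: None, max_runs=3, sleep=waits.append)
print(videos_per_day(10), completed, waits)
```

With the schedule package, the equivalent setup is schedule.every(10).minutes.do(job) followed by a loop that calls schedule.run_pending() and sleeps briefly. Either way, one run every 10 minutes gives 24 × 60 / 10 = 144 runs per day, provided each run finishes in well under 10 minutes.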


Thoughts of Coding with ChatGPT

This project demonstrated the impressive capabilities of ChatGPT and Leonardo.ai in automating content creation workflows. However, it’s important to note that YouTube’s monetization policies work against this kind of content: mass-produced, repetitive videos are generally not eligible for monetization. This process, while technically feasible, can also incur substantial API costs. Moreover, tools like Canva and DubDup require investment of their own. Many videos promoting automated YouTube Shorts as a lucrative side hustle might be motivated by affiliate marketing. Therefore, I advise using ChatGPT to validate automation ideas at minimal or no cost before committing to more expensive tools.


While the process of automating YouTube video creation using AI tools like ChatGPT and Leonardo.ai is fascinating and technically impressive, it’s essential to consider the broader implications. YouTube’s policies on monetization do not favor AI-generated content, and creating such videos can be resource-intensive in terms of API costs and time.

Moreover, the allure of automated content as a side hustle often stems from affiliate marketing strategies, urging users to purchase tools and services. Instead, I recommend leveraging free trials and low-cost solutions to explore your automation ideas. Ultimately, the goal should be to create valuable content that genuinely engages and benefits your audience, rather than contributing to the proliferation of low-effort, high-volume videos.

Despite successfully automating the video creation process, I chose not to pursue this further to avoid creating wasteful content and to respect the valuable time of viewers. Consider carefully how you can use these powerful tools to make meaningful contributions rather than merely chasing monetization opportunities.

Full code: https://medium.com/@matthew.chang/how-i-created-144-fully-automated-youtube-videos-per-day-using-chatgpt-leonardo-ai-475deb4ab63e

Categories: Machine Learning

Written by: Matthew
