Steps for Making an AI Audiobook Using Enbee V2

The Complete AI Audiobook Production Process
STEP 1: Prepare Your Manuscript
Time Required: 30-60 minutes
What You Need:
- Final edited manuscript
- Consistent character name spellings
- Clean formatting (standard dialogue tags)
- File format: EPUB, PDF, or Word (DOC/DOCX)
Quick Checklist:
✓ Run spell check on character names and invented terms
✓ Verify dialogue uses standard quotation formatting
✓ Create pronunciation guide for unusual words
✓ Remove unnecessary formatting (excessive line breaks, special characters)
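Parts of this checklist can be pre-screened locally before upload. The sketch below is one possible approach, not part of Narration Box: the function names, the 0.8 similarity threshold, and the rule "a capitalized word of four or more letters is a candidate name" are all assumptions chosen for illustration. It collapses excessive blank lines, flags paragraphs whose quotation marks do not pair up, and lists capitalized words that look like spelling variants of each other.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def collapse_blank_lines(text: str) -> str:
    """Reduce runs of three or more newlines to a single paragraph break."""
    return re.sub(r"\n{3,}", "\n\n", text)


def paragraphs_with_unbalanced_quotes(text: str) -> list[int]:
    """Return 1-based paragraph numbers whose quotation marks do not pair up.

    Caveat: dialogue that legitimately spans several paragraphs leaves the
    opening quote unclosed, so treat results as items to inspect, not errors.
    """
    flagged = []
    for number, paragraph in enumerate(text.split("\n\n"), start=1):
        straight_ok = paragraph.count('"') % 2 == 0
        curly_ok = paragraph.count("\u201c") == paragraph.count("\u201d")
        if not (straight_ok and curly_ok):
            flagged.append(number)
    return flagged


def near_duplicate_names(text: str, threshold: float = 0.8) -> list[tuple[str, str]]:
    """Return pairs of capitalized words similar enough to be spelling variants."""
    candidates = sorted(set(re.findall(r"\b[A-Z][a-z]{3,}\b", text)))
    return [
        (a, b)
        for a, b in combinations(candidates, 2)
        if SequenceMatcher(None, a, b).ratio() >= threshold
    ]


if __name__ == "__main__":
    sample = 'Seraphine walked in.\n\n\n\nSeraphina said, "Hello.'
    cleaned = collapse_blank_lines(sample)
    print(near_duplicate_names(cleaned))
    print(paragraphs_with_unbalanced_quotes(cleaned))
```

Expect false positives (ordinary capitalized words at sentence starts, for example), so the output is a list of things to review by hand rather than an automatic fix.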
STEP 2: Upload to Narration Box
Time Required: 2-5 minutes
The Process:
- Log in to the Narration Box platform
- Open the audiobook creation tool
- Click upload and select your manuscript file
- The system automatically detects chapters and analyzes structure
What Happens Automatically:
- Chapter detection and separation
- Text structure analysis
- Language identification
- Content preparation for narration
STEP 3: Select Your Enbee V2 Voice
Time Required: 10-15 minutes
Available Enbee V2 Narrators:
Ivy → Warm, relatable delivery
- Best for: Contemporary fiction, memoir, personal development
- Tone: Conversational and emotionally expressive
Harvey → Authoritative and clear
- Best for: Business books, historical works, educational content
- Tone: Professional and measured
Harlan → Versatile and adaptive
- Best for: Multi-perspective fiction, thriller, mystery
- Tone: Dynamic with strong range
Lenora → Sophisticated and nuanced
- Best for: Literary fiction, upmarket commercial fiction
- Tone: Interpretive and elegant
Etta → Engaging and light
- Best for: Romance, cozy mystery, humorous non-fiction
- Tone: Warm with playful energy
Action Step: Listen to voice samples using your own content (generate a test chapter if needed)
STEP 4: Configure Style Prompting
Time Required: 5-10 minutes
Style Prompt Examples:
For Mystery/Thriller: "Speak in measured pacing with British accent, building tension naturally"
For Romance: "Use warm, emotionally expressive tone with natural conversational rhythm"
For Business Non-Fiction: "Deliver with authoritative clarity, slightly slower pacing for information retention"
For Memoir: "Speak in reflective, intimate tone as if sharing personal stories with a friend"
For Fantasy/Sci-Fi: "Use dynamic pacing with dramatic emphasis on action scenes"
Multilingual Option: Add a language instruction to the style prompt, such as "Speak in French with Canadian accent" or "Narrate in Spanish with authentic Castilian pronunciation"
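If you produce several books, it can help to keep these prompts in one place. The sketch below only assembles the prompt text you would paste into the style field. The dictionary keys, the function name, and the choice to put the language instruction first are assumptions for illustration; the article does not describe how the platform combines the two.

```python
from typing import Optional

# Prompt wording copied from the examples above; the genre keys are hypothetical.
STYLE_PROMPTS = {
    "mystery": "Speak in measured pacing with British accent, building tension naturally",
    "romance": "Use warm, emotionally expressive tone with natural conversational rhythm",
    "business": "Deliver with authoritative clarity, slightly slower pacing for information retention",
    "memoir": "Speak in reflective, intimate tone as if sharing personal stories with a friend",
    "fantasy": "Use dynamic pacing with dramatic emphasis on action scenes",
}


def build_style_prompt(genre: str, language_instruction: Optional[str] = None) -> str:
    """Return the genre prompt, optionally prefixed with a language instruction."""
    prompt = STYLE_PROMPTS[genre]
    if language_instruction:
        # Assumption: the language instruction is simply placed before the style text.
        prompt = f"{language_instruction}. {prompt}"
    return prompt


if __name__ == "__main__":
    print(build_style_prompt("romance", "Speak in French with Canadian accent"))
```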
STEP 5: Add Inline Emotion Tags (Optional)
Time Required: 30-90 minutes
When to Use: Insert emotion tags for precise control over dramatic moments that automatic detection might miss
Common Emotion Tags:
[whisper] → Intimate revelations, secrets, tension
- Example: "I know what you did last summer [whisper] and I have proof"
[excited] → Breakthrough moments, victories, realizations
- Example: "We found it [excited] the evidence was there all along!"
[somber] → Grief, loss, serious reflection
- Example: "She never came back [somber] and we never knew why"
[laughs] → Humor, levity, joy
- Example: "That's the worst plan I've ever heard [laughs] but let's do it anyway"
[shouting] → Conflict, urgency, alarm
- Example: "Get out of there [shouting] the building is coming down!"
[sarcastic] → Irony, biting humor, criticism
- Example: "Oh that's just perfect [sarcastic] exactly what we needed"
Best Practice: Use strategically at pivotal emotional beats, not throughout entire manuscript
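Before generating, a quick local audit can catch mistyped tags and show how densely a chapter is tagged. This is a minimal sketch under two assumptions: that tags are lowercase words in square brackets, as in the examples above, and that the six tags listed here are the only ones you intend to use (the platform may accept others). The function names are hypothetical.

```python
import re
from collections import Counter

# The six tags listed in this step; extend this set if you use others.
KNOWN_TAGS = {"whisper", "excited", "somber", "laughs", "shouting", "sarcastic"}
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def audit_emotion_tags(text: str) -> tuple[Counter, list[str]]:
    """Count every bracketed tag and list the ones not in KNOWN_TAGS."""
    found = Counter(TAG_PATTERN.findall(text))
    unknown = sorted(tag for tag in found if tag not in KNOWN_TAGS)
    return found, unknown


def tags_per_thousand_words(text: str) -> float:
    """Tag density, counting words after the tags themselves are removed."""
    word_count = len(TAG_PATTERN.sub("", text).split())
    tag_count = len(TAG_PATTERN.findall(text))
    return 1000 * tag_count / word_count if word_count else 0.0


if __name__ == "__main__":
    chapter = "We found it [excited] the evidence was there all along! [wisper] Quiet now."
    counts, unknown = audit_emotion_tags(chapter)
    print(dict(counts), unknown)
    print(round(tags_per_thousand_words(chapter), 1))
```

A density that climbs chapter after chapter is a sign the tags are being used throughout rather than at pivotal beats.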
STEP 6: Generate Your Audiobook
Time Required: 15-30 minutes (automatic processing)
What Happens During Generation:
Minutes 1-5:
- Text parsing and linguistic analysis
- Dialogue vs. narrative identification
- Emotional context detection
Minutes 5-15:
- Prosody generation (rhythm, stress, intonation)
- Character voice distinction application
- Pacing adjustment by scene type
Minutes 15-30:
- Audio synthesis and rendering
- Chapter file creation
- Quality verification
Processing Speed: Approximately 3,000-5,000 words per minute
A standard 80,000-word novel completes in roughly 16-27 minutes
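The time estimate follows directly from dividing word count by throughput. A small sketch of that arithmetic, using the 3,000-5,000 words per minute figure quoted above (the function name is hypothetical):

```python
def estimated_generation_minutes(word_count: int,
                                 words_per_minute: tuple[int, int] = (3000, 5000)) -> tuple[float, float]:
    """Return (fastest, slowest) generation time in minutes for a manuscript."""
    slow_rate, fast_rate = words_per_minute
    return word_count / fast_rate, word_count / slow_rate


if __name__ == "__main__":
    fastest, slowest = estimated_generation_minutes(80_000)
    print(f"{fastest:.1f} to {slowest:.1f} minutes")  # 16.0 to 26.7 minutes
```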
STEP 7: Review Complete Audiobook
Time Required: 8-10 hours (actual listening time)
Systematic Review Process:
Listen While Reading:
- Follow manuscript text while audio plays
- Mark pronunciation errors immediately
- Note pacing issues or emotional mismatches
- Identify awkward sentence flow
What to Check:
✓ Character name consistency and pronunciation
✓ Technical terms and invented words
✓ Emotional delivery at dramatic moments
✓ Pacing through action vs. reflective scenes
✓ Chapter transitions and breaks
✓ Overall tonal consistency
Documentation: Create a spreadsheet with these columns: Word/Phrase | Location (Chapter/Page) | Current Pronunciation | Desired Correction
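If you would rather keep the log as a CSV file that any spreadsheet program can open, the sketch below writes the four columns named above using Python's standard csv module. The function name and the sample row (including the "current pronunciation" spelling) are illustrative, not taken from a real review.

```python
import csv
import io

# Column names taken from the documentation step above.
FIELDS = [
    "Word/Phrase",
    "Location (Chapter/Page)",
    "Current Pronunciation",
    "Desired Correction",
]


def write_corrections(rows: list[dict], stream) -> None:
    """Write the correction log, header first, to any text stream or open file."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    log = [{
        "Word/Phrase": "Seraphine",
        "Location (Chapter/Page)": "Ch. 3 / p. 41",
        "Current Pronunciation": "SEH-ra-fine",  # illustrative value
        "Desired Correction": "sare-ah-FEEN",
    }]
    buffer = io.StringIO()
    write_corrections(log, buffer)
    print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open("corrections.csv", "w", newline="", encoding="utf-8").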
STEP 8: Make Corrections and Revisions
Time Required: 2-4 hours
Types of Corrections:
Pronunciation Dictionary: Add mispronounced words with phonetic guidance
- Example: Seraphine = "sare-ah-FEEN"
Enhanced Emotion Tags: Insert tags where automatic detection missed intent
- Add [sarcastic], [whisper], [excited] at specific moments
Style Prompt Adjustments: Refine overall delivery for specific chapters
- Example: "Chapter 12: speak in tense, urgent tone building to climax"
Regenerate Selectively: Only reprocess chapters where you made changes
- Saves time vs. full audiobook regeneration
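To know which chapters actually changed after a round of corrections, one option is to fingerprint each chapter's text before and after editing. This is local bookkeeping only, sketched under the assumption that you keep chapter text in a dictionary keyed by title; it says nothing about how Narration Box itself tracks changes, and the function names are hypothetical.

```python
import hashlib


def chapter_fingerprints(chapters: dict[str, str]) -> dict[str, str]:
    """Map each chapter title to a SHA-256 digest of its text."""
    return {
        title: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for title, text in chapters.items()
    }


def chapters_to_regenerate(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return titles that are new or whose digest differs from the previous run."""
    return [title for title, digest in current.items() if previous.get(title) != digest]


if __name__ == "__main__":
    before = {"Chapter 1": "It was quiet.", "Chapter 2": "She never came back."}
    after = {"Chapter 1": "It was quiet.", "Chapter 2": "She never came back [somber]."}
    print(chapters_to_regenerate(chapter_fingerprints(before), chapter_fingerprints(after)))
```

Any edit to a chapter, including an added emotion tag, changes its digest, so only edited or newly added chapters appear in the result.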
STEP 9: Export in Distribution Format
Time Required: 10-20 minutes
Platform-Specific Export Settings:
Findaway Voices:
- Format: MP3
- Bitrate: 192 kbps
- Chapter files: Individual MP3 per chapter
- Metadata: Embedded chapter titles
Google Play Books:
- Format: MP3 or M4B
- Bitrate: 128-192 kbps
- Single file with chapter markers
Apple Books:
- Format: M4B (preferred) or MP3
- Bitrate: 64-128 kbps
- Chapter markers embedded
Direct Sales (Your Website):
- Format: MP3 (universal compatibility)
- Bitrate: 192 kbps
- ZIP file of chapter MP3s or single file
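Bitrate largely determines file size, which matters for upload time and hosting. The sketch below records the settings listed above as a lookup table and estimates size from bitrate and running time. The table layout and function name are assumptions for illustration, and the estimate assumes constant bitrate and ignores metadata and container overhead.

```python
# Settings transcribed from the platform list above; bitrate is (min, max) in kbps.
EXPORT_SETTINGS = {
    "Findaway Voices": {"formats": ["MP3"], "bitrate_kbps": (192, 192)},
    "Google Play Books": {"formats": ["MP3", "M4B"], "bitrate_kbps": (128, 192)},
    "Apple Books": {"formats": ["M4B", "MP3"], "bitrate_kbps": (64, 128)},
    "Direct Sales": {"formats": ["MP3"], "bitrate_kbps": (192, 192)},
}


def estimated_size_mb(hours: float, bitrate_kbps: int) -> float:
    """Approximate audio size in megabytes (1 MB = 1,000,000 bytes)."""
    seconds = hours * 3600
    kilobytes = bitrate_kbps * seconds / 8  # 8 bits per byte
    return kilobytes / 1000


if __name__ == "__main__":
    # A 9-hour audiobook, the midpoint of the 8-10 hour listening time in Step 7.
    for platform, settings in EXPORT_SETTINGS.items():
        low, high = settings["bitrate_kbps"]
        print(f"{platform}: {estimated_size_mb(9, low):.0f}-{estimated_size_mb(9, high):.0f} MB")
```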
What You Receive:
- Finished audio files in selected format
- Chapter timing information
- Technical specifications report
- Metadata for distribution upload
STEP 10: Upload to Distribution Platforms
Time Required: 1-3 hours per platform
Major Distribution Channels:
Findaway Voices (Aggregator)
- Distributes to: Libraries, Spotify, Kobo, Scribd
- Upload: Audio files + cover image + metadata
- Disclosure: Mark as AI narration in settings
- Review time: 3-5 business days
Google Play Books
- Direct upload through Partner Center
- Narrator field: List as "AI Narration (Enbee V2)"
- Sample audio: Upload first chapter preview
- Review time: 1-3 business days
Apple Books
- Upload via Books Partner Portal
- AI disclosure: Include in description
- Audio sample: Required for listing
- Review time: 2-5 business days
Your Own Website
- Integration: PayPal, Stripe, Gumroad, or BookFunnel
- Delivery: Automated download links
- Pricing: You set (keep 90-95% after processing fees)
- Setup time: 2-4 hours initial configuration
Total Production Timeline
Day 1:
- Morning: Manuscript preparation and upload (1-2 hours)
- Afternoon: Voice selection and generation (1 hour including processing)
Days 2-3:
- Complete review listening (8-10 hours spread across 2 days)
- Note corrections and issues
Day 4:
- Make revisions (2-4 hours)
- Regenerate corrected sections (30 minutes)
- Final quality check (1 hour)
Day 5:
- Export files (20 minutes)
- Upload to distribution platforms (2-3 hours)
TOTAL: 5 days from manuscript to distribution (Traditional production: 6-8 weeks minimum)
Cost Comparison
Traditional Human Narration:
- Narrator fee: $1,200 - $3,200 (based on $200-400/finished hour)
- Studio rental: $300 - $900
- Audio editing: $400 - $800
- Mastering: $300 - $500
TOTAL: $3,000 - $15,000 per audiobook (the line items above sum to $2,200 - $5,400)
AI Narration with Enbee V2:
- Narration Box subscription: Under $100/month
- Unlimited audiobook production
- No studio fees
- No editing costs
- Automatic mastering
TOTAL: Under $100 per month for unlimited audiobooks
Break-even point:
- Traditional narration requires 600-1,000+ sales to recover costs
- AI narration recovers costs in first 20-30 sales
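Break-even is production cost divided by what you net per sale, rounded up. The article does not state a per-sale figure, so the $5 used below is an assumption; it happens to reproduce the low ends of both ranges quoted above. Substitute your own net earnings per copy.

```python
import math


def break_even_sales(production_cost: float, net_per_sale: float) -> int:
    """Smallest whole number of sales whose net earnings cover the production cost."""
    if net_per_sale <= 0:
        raise ValueError("net_per_sale must be positive")
    return math.ceil(production_cost / net_per_sale)


if __name__ == "__main__":
    NET_PER_SALE = 5.00  # assumed net earnings per copy, not a figure from the article
    print(break_even_sales(3000, NET_PER_SALE))  # traditional, low end of quoted total: 600
    print(break_even_sales(100, NET_PER_SALE))   # one month of subscription: 20
```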
Quality Assurance Checklist
Before finalizing your audiobook, verify:
☐ All character names pronounced consistently throughout
☐ Technical terms and invented words corrected
☐ Emotional delivery matches manuscript intent at key moments
☐ Chapter transitions feel natural
☐ Audio levels consistent across all chapters
☐ No long awkward silences or pacing issues
☐ File formats match distribution requirements
☐ Metadata includes proper AI narration disclosure
☐ Cover image meets platform specifications (square, 2400x2400px minimum)
☐ Sample chapter uploaded for listener preview
Pro Tips for Best Results
Manuscript Preparation: Read your entire manuscript aloud before upload to catch sentences that sound awkward when spoken
Voice Testing: Generate your three most diverse chapters with multiple voices before committing to full production
Emotion Tags: Use sparingly at pivotal moments only for maximum impact
Review Method: Listen at 1.25x speed first to catch major issues, then review problem sections at normal speed
Pronunciation: Create master pronunciation guide to reuse across all your audiobooks with recurring terms
Distribution Strategy: Start with platforms accepting AI narration (Findaway, Google, Apple) before attempting Audible workarounds
Marketing: Release first chapter free on SoundCloud or your website to let readers sample quality before purchase
Ready to create your audiobook? Start with Narration Box's audiobook platform
Upload your manuscript today and hear your story narrated by Enbee V2 voices with automatic emotion detection and multilingual capability.
