OCR-to-TMDB Movie/Show Detection System
good, now write out all the logic for OCR parsing to get the right title,
stars, year, and the right logic to query TMDB api and then further refine
the results. Use pseudocode so I can give this document to another developer
and they could implement it in another language.
⏺ OCR-to-TMDB Movie/Show Detection System
Overview
This system processes shared images containing movie/show information, extracts text via OCR,
parses relevant metadata, queries TMDB API, and intelligently presents the best match to the user.
Phase 1: Image Sharing & Storage
FUNCTION handleSharedImage(imageData):
    // Store image in shared container for main app access
    sharedContainer = getSharedContainer("group.app.identifier")
    sharedImagePath = sharedContainer + "/shared_image.jpg"
    WRITE imageData TO sharedImagePath
    // Store path in shared UserDefaults
    sharedDefaults = getUserDefaults("group.app.identifier")
    sharedDefaults.set("sharedImagePath", sharedImagePath)
    LOG "Image saved to shared container: " + sharedImagePath
END FUNCTION
Phase 2: OCR Text Extraction
FUNCTION performOCR(imagePath):
    image = loadImage(imagePath)
    IF image == NULL:
        LOG "Failed to load image"
        RETURN
    // Configure OCR for high accuracy
    ocrRequest = createTextRecognitionRequest()
    ocrRequest.recognitionLevel = ACCURATE
    ocrRequest.useLanguageCorrection = TRUE
    // Extract text with bounding box metadata
    results = performTextRecognition(image, ocrRequest)
    textWithMetadata = []
    FOR index, observation IN results:
        text = observation.topCandidate.string
        boundingBox = observation.boundingBox
        textWithMetadata.append({
            text: text,
            boundingBox: boundingBox,
            fontSize: boundingBox.height, // Relative font size
            index: index // Position in OCR output order; read later as element.index
        })
    LOG "OCR extracted " + textWithMetadata.length + " text elements"
    // Proceed to parsing
    parseMovieInfo(textWithMetadata)
END FUNCTION
Phase 3: Intelligent Text Parsing
Font Size Analysis
FUNCTION calculateFontMetrics(textElements):
    fontSizes = []
    FOR EACH element IN textElements:
        fontSizes.append(element.fontSize)
    avgFontSize = average(fontSizes)
    maxFontSize = maximum(fontSizes)
    RETURN { avgFontSize, maxFontSize }
END FUNCTION
Title Detection
FUNCTION parseMovieInfo(textWithMetadata):
    metrics = calculateFontMetrics(textWithMetadata)
    title = NULL
    year = NULL
    stars = []
    titleCandidates = []
    FOR EACH element IN textWithMetadata:
        text = element.text.trim()
        IF text.length < 2: CONTINUE
        // Extract year using regex (only the first year found is kept)
        yearMatch = REGEX_MATCH(text, "\b(19|20)\d{2}\b")
        IF yearMatch AND year == NULL:
            year = yearMatch
            LOG "Found year: " + year
            CONTINUE
        // Extract stars after "starring:" prefix
        IF text.toLowerCase().startsWith("starring:"):
            starText = text.substring(9).trim()
            starNames = starText.split(",")
            FOR EACH name IN starNames:
                cleanName = name.trim()
                IF cleanName.length > 2:
                    stars.append(cleanName)
            LOG "Found stars: " + stars.join(", ")
            CONTINUE
        // Score potential titles
        titleScore = scoreTitleCandidate(text, element.index, element.fontSize, metrics)
        IF titleScore > 0:
            titleCandidates.append({
                text: text,
                score: titleScore,
                index: element.index
            })
    // Detect additional star names
    additionalStars = detectStarNames(textWithMetadata)
    stars.addAll(additionalStars)
    stars = removeDuplicates(stars)
    // Select best title
    IF titleCandidates.length > 0:
        // Sort by score (desc), then by index (asc)
        titleCandidates.sortBy(candidate => [-candidate.score, candidate.index])
        validCandidates = titleCandidates.filter(c =>
            !c.text.toLowerCase().contains("untitled") AND
            !c.text.contains("#") AND
            c.score > 30
        )
        IF validCandidates.length > 0:
            bestCandidate = validCandidates[0]
            // NOTE: buildCompleteTitle is not defined in this document
            title = buildCompleteTitle(textWithMetadata, bestCandidate, metrics.avgFontSize)
    // Proceed to TMDB query
    queryTMDBAPI(title, year, stars)
END FUNCTION
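The year and "starring:" extraction steps above could be ported roughly as follows. This is a minimal Python sketch, not the original implementation; the function names `extract_year` and `extract_stars` are illustrative and do not appear in the pseudocode.

```python
import re

# Four-digit year from 1900 to 2099, bounded by word boundaries
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")
STARRING_PREFIX = "starring:"


def extract_year(text):
    """Return the first year-like token in text, or None."""
    match = YEAR_PATTERN.search(text)
    return match.group(0) if match else None


def extract_stars(text):
    """Return the names listed after a 'starring:' prefix, or an empty list."""
    stripped = text.strip()
    if not stripped.lower().startswith(STARRING_PREFIX):
        return []
    names = stripped[len(STARRING_PREFIX):].split(",")
    # Mirror the pseudocode: keep only names longer than two characters
    return [name.strip() for name in names if len(name.strip()) > 2]
```

The word boundaries keep longer digit runs such as "12345" from being read as a year, but a title that itself contains a year-like number would still be consumed as the year, exactly as in the pseudocode.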
Title Scoring Algorithm
FUNCTION scoreTitleCandidate(text, index, fontSize, metrics):
    score = 0
    lowerText = text.toLowerCase()
    // EXCLUSIONS (return 0 if any match)
    excludeKeywords = ["watch", "play", "trailer", "director:", "starring:",
                       "follows", "rating", "untitled", "mins", "hr"]
    FOR EACH keyword IN excludeKeywords:
        IF lowerText.contains(keyword):
            RETURN 0
    // Exclude UI elements and codes
    IF text.contains("#") OR text.contains("=") OR text.contains("*"):
        RETURN 0
    // Exclude purely numeric or short patterns
    IF text.length <= 2 OR REGEX_MATCH(text, "^[0-9]+$") OR REGEX_MATCH(text, "^[A-Z][0-9]+$"):
        RETURN 0
    // POSITIVE SCORING
    // Font size bonus (most important for titles)
    fontRatio = fontSize / metrics.avgFontSize
    IF fontRatio >= 1.5:
        score += 30 // Very large font
    ELSE IF fontRatio >= 1.2:
        score += 20 // Larger than average
    ELSE IF fontRatio >= 1.0:
        score += 10 // Average or above
    // Maximum font size bonus
    IF fontSize == metrics.maxFontSize AND fontSize > metrics.avgFontSize:
        score += 25
    // Position bonus (earlier text is more likely to be title)
    IF index <= 2: score += 20
    ELSE IF index <= 5: score += 10
    // Length scoring (sweet spot for titles)
    IF text.length >= 5 AND text.length <= 25: score += 15
    ELSE IF text.length >= 3 AND text.length <= 40: score += 10
    // All caps bonus (common for movie posters)
    IF text == text.toUpperCase() AND text.length >= 3: score += 10
    // Title case bonus
    IF REGEX_MATCH(text, "^[A-Z][a-z]*(\s+[A-Z][a-z]*)*$"): score += 8
    // Common title words: bonus if any word of the text is in the list
    titleWords = ["the", "a", "an", "of", "in", "on", "at", "to", "for", "with", "by"]
    IF ANY word IN lowerText.split(" ") IS IN titleWords: score += 15
    RETURN score
END FUNCTION
Star Name Detection
FUNCTION detectStarNames(textWithMetadata):
    detectedStars = []
    processedIndices = SET()
    FOR index, element IN textWithMetadata:
        IF processedIndices.contains(index): CONTINUE
        text = element.text.trim()
        IF !isPotentialNameWord(text): CONTINUE
        // Look for first+last name combinations
        fullName = findFullName(index, textWithMetadata, processedIndices)
        IF fullName != NULL:
            detectedStars.append(fullName)
    RETURN detectedStars
END FUNCTION
FUNCTION isPotentialNameWord(text):
    // Must be reasonable length and all caps (typical for posters)
    IF text.length < 3 OR text.length > 12: RETURN FALSE
    IF text != text.toUpperCase(): RETURN FALSE
    // Exclude title words and UI elements
    excludeWords = ["THE", "TRUTH", "ABOUT", "AND", "UNTITLED"]
    IF excludeWords.contains(text): RETURN FALSE
    // Exclude numbers and symbols
    IF text.contains("#") OR text.contains("=") OR REGEX_MATCH(text, "^[0-9]+$"):
        RETURN FALSE
    RETURN TRUE
END FUNCTION
FUNCTION findFullName(startIndex, textElements, processedIndices):
    startText = textElements[startIndex].text.trim()
    // Look for companion name within next 2 positions
    FOR offset = 1 TO 2:
        searchIndex = startIndex + offset
        IF searchIndex >= textElements.length: BREAK
        IF processedIndices.contains(searchIndex): CONTINUE
        candidateText = textElements[searchIndex].text.trim()
        // NOTE: isValidNameCombination is not defined in this document;
        // the implementer must supply it
        IF isPotentialNameWord(candidateText) AND
           isValidNameCombination(startText, candidateText):
            processedIndices.add(startIndex)
            processedIndices.add(searchIndex)
            RETURN startText + " " + candidateText
    RETURN NULL
END FUNCTION
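The pairing logic can be sketched in Python as below, folding `findFullName` into the main loop. It works on a plain list of strings instead of OCR elements. The pseudocode never defines `isValidNameCombination`, so the version here (reject identical words) is a hypothetical stand-in only.

```python
import re

EXCLUDE_WORDS = {"THE", "TRUTH", "ABOUT", "AND", "UNTITLED"}


def is_potential_name_word(text):
    """All-caps word of 3 to 12 characters that is not excluded or numeric."""
    if not 3 <= len(text) <= 12:
        return False
    if text != text.upper():
        return False
    if text in EXCLUDE_WORDS:
        return False
    if "#" in text or "=" in text or re.fullmatch(r"[0-9]+", text):
        return False
    return True


def is_valid_name_combination(first, last):
    # Hypothetical stand-in: the source does not define this check
    return first != last


def detect_star_names(texts):
    """Pair each candidate word with a companion in the next two positions."""
    stars = []
    used = set()
    for i, raw in enumerate(texts):
        if i in used:
            continue
        first = raw.strip()
        if not is_potential_name_word(first):
            continue
        for j in (i + 1, i + 2):
            if j >= len(texts):
                break
            if j in used:
                continue
            second = texts[j].strip()
            if is_potential_name_word(second) and is_valid_name_combination(first, second):
                used.update((i, j))
                stars.append(f"{first} {second}")
                break
    return stars
```

This only works when OCR returns first and last names as separate all-caps elements, as on a typical poster; mixed-case or single-line names are not detected by this path.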
Phase 4: TMDB API Query & Intelligent Filtering
Initial Search
FUNCTION queryTMDBAPI(title, year, stars):
    IF title == NULL OR title.isEmpty():
        LOG "No title found, cannot query TMDB"
        RETURN
    LOG "Querying TMDB for: " + title + ", year: " + year + ", stars: " + stars.join(", ")
    allResults = []
    // Search both movies and shows concurrently
    movieResults = searchMovies(title)
    showResults = searchShows(title)
    allResults.addAll(movieResults)
    allResults.addAll(showResults)
    // Apply intelligent filtering
    filterAndPresentResults(allResults, title, year, stars)
END FUNCTION
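The pseudocode does not say which endpoints `searchMovies`, `searchShows`, or the credits fetches call. Assuming the TMDB v3 REST API with an `api_key` query parameter, the request URLs could be built as in this sketch; the helper names and the `"KEY"` placeholder are illustrative, and no network call is made here.

```python
from urllib.parse import urlencode

TMDB_BASE = "https://api.themoviedb.org/3"


def build_search_urls(title, api_key):
    """Return the movie and TV search URLs for one title."""
    query = urlencode({"api_key": api_key, "query": title})
    return {
        "movie": f"{TMDB_BASE}/search/movie?{query}",
        "tv": f"{TMDB_BASE}/search/tv?{query}",
    }


def build_credits_url(media_type, tmdb_id, api_key):
    """Return the credits URL; media_type is 'movie' or 'tv'."""
    query = urlencode({"api_key": api_key})
    return f"{TMDB_BASE}/{media_type}/{tmdb_id}/credits?{query}"
```

For example, `build_search_urls("The Matrix", "KEY")["movie"]` yields a URL ending in `/search/movie?api_key=KEY&query=The+Matrix`. The two searches are independent, so they can be issued in parallel as the pseudocode comment suggests.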
Multi-Stage Filtering
FUNCTION filterAndPresentResults(allResults, searchTitle, searchYear, searchStars):
    LOG "Processing " + allResults.length + " total results"
    filteredResults = allResults
    // Stage 1: Filter by year if provided
    IF searchYear != NULL:
        yearInt = parseInt(searchYear)
        yearFilteredResults = filteredResults.filter(show => show.releaseYear == yearInt)
        LOG "After year filtering (" + searchYear + "): " + yearFilteredResults.length + " results"
        IF yearFilteredResults.length > 0:
            filteredResults = yearFilteredResults
        ELSE:
            LOG "No year matches, keeping all results"
    // Stage 2: Apply star filtering if multiple results and stars available
    IF searchStars.length > 0 AND filteredResults.length > 1:
        LOG "Multiple results found, checking cast info for star matches"
        checkCastAndPresentResults(filteredResults, searchTitle, searchStars)
    ELSE:
        presentFilteredResults(filteredResults, searchTitle)
END FUNCTION
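Stage 1 (year filtering with fallback) might be sketched in Python as follows. It assumes each result is a dict shaped like a TMDB search hit, with a `release_date` (movies) or `first_air_date` (shows) string in `YYYY-MM-DD` form that may be empty; the pseudocode's `releaseYear` field is derived from those here.

```python
def release_year(result):
    """Extract the year from a movie or show result, or None if absent."""
    date = result.get("release_date") or result.get("first_air_date") or ""
    return int(date[:4]) if date[:4].isdigit() else None


def filter_by_year(results, search_year):
    """Keep results from search_year; fall back to all results if none match."""
    if search_year is None:
        return results
    wanted = int(search_year)
    matches = [r for r in results if release_year(r) == wanted]
    return matches if matches else results
```

The fallback matters because OCR can pick up a year that is not the release year (for example a copyright line), and an empty list would otherwise hide a correct title match.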
Star-Based Cast Filtering
FUNCTION checkCastAndPresentResults(results, searchTitle, searchStars):
    resultsWithStarMatches = []
    // Fetch cast info for all results concurrently
    FOR EACH result IN results:
        IF result.mediaType == "movie":
            credits = fetchMovieCredits(result.id)
        ELSE:
            credits = fetchShowCredits(result.id)
        IF credits != NULL:
            matchCount = countStarMatches(credits.cast, searchStars)
            IF matchCount > 0:
                resultsWithStarMatches.append({
                    show: result,
                    matchCount: matchCount
                })
            LOG result.name + " (" + result.mediaType + "): " + matchCount + " star matches"
    // Analyze matches and decide presentation
    IF resultsWithStarMatches.length > 0:
        // Sort by match count (descending)
        resultsWithStarMatches.sortBy(item => -item.matchCount)
        bestMatch = resultsWithStarMatches[0]
        LOG "Best star match: " + bestMatch.show.name + " with " + bestMatch.matchCount + " matching stars"
        // Present directly if clear winner
        IF resultsWithStarMatches.length == 1 OR bestMatch.matchCount >= 2:
            LOG "Found clear winner with " + bestMatch.matchCount + " star matches"
            presentMovieDetail(bestMatch.show)
        ELSE:
            // Present filtered search results
            starFilteredResults = resultsWithStarMatches.map(item => item.show)
            LOG "Found " + starFilteredResults.length + " results with star matches"
            presentSearchResults(starFilteredResults, searchTitle)
    ELSE:
        // No star matches, present all original results
        LOG "No star matches found, presenting all results"
        presentFilteredResults(results, searchTitle)
END FUNCTION
Star Matching Algorithm
FUNCTION countStarMatches(castMembers, searchStars):
    matchCount = 0
    FOR EACH star IN searchStars:
        FOR EACH castMember IN castMembers:
            // Bidirectional fuzzy matching
            IF castMember.name.toLowerCase().contains(star.toLowerCase()) OR
               star.toLowerCase().contains(castMember.name.toLowerCase()):
                matchCount += 1
                LOG "Star match: '" + star + "' matches '" + castMember.name + "'"
                BREAK // Don't count same star multiple times
    RETURN matchCount
END FUNCTION
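A Python sketch of the bidirectional substring match is shown below. It takes cast names as plain strings rather than cast objects, and it skips empty names, which is an added guard: an empty string is a substring of everything and would otherwise count as a match.

```python
def count_star_matches(cast_names, search_stars):
    """Count how many searched stars match at least one cast name."""
    count = 0
    for star in search_stars:
        star_lower = star.lower().strip()
        if not star_lower:
            continue
        for name in cast_names:
            name_lower = name.lower().strip()
            # Match if either string contains the other, ignoring case
            if name_lower and (star_lower in name_lower or name_lower in star_lower):
                count += 1
                break  # count each searched star at most once
    return count
```

Substring matching tolerates OCR output that captured only a surname, but it can also produce false positives for short fragments, so very short star strings may deserve a minimum-length check.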
Final Presentation Logic
FUNCTION presentFilteredResults(results, searchTitle):
    IF results.length == 1:
        // Single result - present detail directly
        result = results[0]
        LOG "Found exactly 1 result: " + result.name
        presentMovieDetail(result)
    ELSE IF results.length > 1:
        // Multiple results - show search UI
        LOG "Found " + results.length + " results, presenting search UI"
        presentSearchResults(results, searchTitle)
    ELSE:
        LOG "No suitable results found for '" + searchTitle + "'"
END FUNCTION
FUNCTION presentMovieDetail(movie):
    // Present full detail view for the movie/show
    showDetailView(movie.id, movie.mediaType)
END FUNCTION
FUNCTION presentSearchResults(results, searchTitle):
    // Present list of results for user selection
    showSearchResultsList(results, searchTitle)
END FUNCTION
Key Algorithm Features
🎯 Intelligence Hierarchy
1. Font Size Analysis - Largest text likely to be title
2. Position Weighting - Earlier text prioritized
3. Pattern Recognition - Excludes UI elements, descriptions
4. Year Validation - Regex pattern matching
5. Star Name Detection - Combines individual names into full names
6. Multi-stage Filtering - Year → Stars → Presentation
🔍 Matching Strategies
- Exact Year Matching - Filters by release year when available
- Fuzzy Star Matching - Bidirectional substring matching for names
- Score-based Title Selection - Weighted scoring system
- Fallback Mechanisms - Graceful degradation when filtering fails
📊 Decision Matrix
- Single Result → Present directly
- Multiple with Clear Star Winner (≥2 matches) → Present directly
- Multiple with Star Matches → Show filtered list
- Multiple without Star Matches → Show all results
- No Results → Log error
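The decision matrix can be condensed into one pure function, which is easy to unit test apart from the UI. This Python sketch is an illustration: the function name, the `(result, match_count)` tuples, and the `"detail"`, `"list"`, and `"none"` labels are assumptions, not part of the pseudocode.

```python
def decide_presentation(star_matches, all_results):
    """Pick what to show.

    star_matches: list of (result, match_count) pairs with match_count > 0.
    all_results: the year-filtered results before star filtering.
    Returns a (mode, results) tuple.
    """
    if star_matches:
        ranked = sorted(star_matches, key=lambda item: item[1], reverse=True)
        best, best_count = ranked[0]
        # Clear winner: the only star match, or at least two matching stars
        if len(ranked) == 1 or best_count >= 2:
            return ("detail", [best])
        return ("list", [result for result, _ in ranked])
    if len(all_results) == 1:
        return ("detail", all_results)
    if len(all_results) > 1:
        return ("list", all_results)
    return ("none", [])
```

As in the pseudocode, a best match with two or more stars wins even when another result ties it; an implementer who wants stricter behavior could also require a gap over the runner-up.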
This system provides robust, intelligent parsing with multiple validation layers and smart
fallbacks for reliable movie/show identification from OCR text.