Overview: Turning Semantic Insights Into Video Search Advantage
As AI Search slowly becomes the default way that people search, Google increasingly blends multiple content types into its results. Video remains one of the most overlooked formats, even though it is growing in importance and is easier to produce than ever.
For SEOs, understanding how YouTube surfaces content is no longer optional. You’ll need to incorporate video production into a cohesive omnimedia content strategy to improve AI Search visibility.
My research indicates that YouTube’s ranking system relies heavily on semantic relevance across a video’s transcript, title, and description. I analyzed over 100,000 videos across more than 1,000 keywords using a custom tool that scraped, segmented, and compared each video to its target keyword.
The findings reveal a clear pattern: videos that contain highly relevant transcript segments, optimized titles, and descriptions are far more likely to rank higher.
The data backs this up. The strongest signal is how closely the most relevant transcript segment matches the searched keyword. That measure shows a Pearson correlation of 0.937 with position in YouTube search results (R² = .878), using the convention that the 100th position is the lowest. The next strongest signals are title relevance (R² = .824) and description relevance (R² = .765).
For brands and creators, the implication is simple: optimizing transcript segments, titles, and descriptions for topic relevance significantly increases the odds of ranking highly on YouTube and appearing prominently in Google’s blended search results.
Cosine Similarity, Semantic Relevance & Video Optimization
Video searches account for a significant share of total visibility for brands. Between YouTube’s own search engine, Google queries that return primarily YouTube or video content, and traditional Google search, video is now one of the most powerful channels for discovery.
According to the research presented by Phil Nottingham at his SEO Week talk, video cannot be ignored in your omnimedia search strategy.
YouTube alone handles roughly one billion searches a day, with Google adding another 2.5 billion queries that surface YouTube results. This scale makes video a critical part of any SEO strategy.
Both Google and YouTube function as search engines. They rely on vector embeddings, which are mathematical representations of content, to understand relevance. In their newer systems, like AI Overviews and AI Mode, as well as in traditional search, these embeddings are multimodal, covering text, images, and video. Google evaluates your transcripts, titles, and descriptions in the same way it evaluates text on a web page, comparing them to user queries in high-dimensional space.
This article asks a simple but important question: How does the semantic relevance of a video transcript relative to the searched keyword affect its ranking on a YouTube results page? The research shows that you can apply the same content relevance analysis used for web pages to YouTube results and that doing so can improve visibility across Google’s entire search ecosystem.
Along the way, we’ll draw on insights from iPullRank’s AI Search Manual and Mike King’s article on content relevance to explain concepts like cosine similarity and vector embeddings. We’ll also connect these ideas to Google’s own patents, such as “Method for Text Rankings with Pairwise Ranking Prompting,” to show how these technologies underpin the ranking of both web pages and videos.
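Cosine similarity, the relevance measure used throughout this article, can be illustrated with a minimal sketch. The three-dimensional vectors below are made-up stand-ins for real embeddings (which have hundreds or thousands of dimensions), and the function name is only illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy "embeddings" (invented values, for illustration only).
query = [1.0, 0.0, 1.0]
on_topic = [2.0, 0.0, 2.0]    # points in the same direction as the query
off_topic = [0.0, 3.0, 0.0]   # orthogonal to the query

print(cosine_similarity(query, on_topic))
print(cosine_similarity(query, off_topic))
```

Because the score depends only on direction, a long transcript chunk and a short keyword can still score close to 1 when they are about the same thing.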
Metrics
Semantic Similarity Metrics
- average_keyword_similarity – The average cosine similarity of all semantic chunks in a given transcript relative to the searched keyword.
- max_keyword_similarity – The cosine similarity of the most relevant chunk in a given transcript relative to the searched keyword.
- segments_count – Number of semantic chunks in the video as determined by FAISS.
- original_segments_count – Number of segments in the .srt file for the transcript.
On Page Metrics
- Keyword – The searched query used to find the video and its position
- Position – Ranking position in YouTube search results retrieved for any given keyword
- Title – Title of the video
- Duration – Duration of the video
- View Count – Views received by the video
- Published Date – When the video was uploaded
Channel Metrics
- Channel – Name of the channel posting the video being ranked
- Subscriber Count – Subscriber count of the channel
Calculated Metrics
- Monthly Velocity – Total view count divided by the number of months it has been available on YouTube
- KOB (Keyword Opposition to Benefit Ratio) – Median View Count of an aggregate of 100 videos per given keyword divided by the median subscriber count of the aggregate of the channels whose videos are being retrieved
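The two calculated metrics above can be sketched as follows. This is a rough illustration, not the tool used in the study: the function names, the whole-month rounding, and the one-month minimum (to avoid dividing by zero for brand-new videos) are all assumptions.

```python
from datetime import date
from statistics import median

def months_live(published, as_of):
    """Whole calendar months between publish date and measurement date (assumed minimum of 1)."""
    months = (as_of.year - published.year) * 12 + (as_of.month - published.month)
    return max(months, 1)

def monthly_velocity(view_count, published, as_of):
    """Total views divided by the number of months the video has been available."""
    return view_count / months_live(published, as_of)

def kob(view_counts, subscriber_counts):
    """Median view count of the retrieved videos divided by median channel subscriber count."""
    return median(view_counts) / median(subscriber_counts)

# Invented numbers for a single keyword, for illustration only.
print(monthly_velocity(12_000, date(2024, 1, 15), date(2025, 1, 15)))
print(kob([100, 300, 200], [50, 100, 150]))
```

In practice the two lists passed to `kob` would hold up to 100 values each, one per video retrieved for the keyword.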
Methodology
To uncover how semantic relevance affects video rankings, we built a large-scale dataset of YouTube content and analyzed it at the transcript, title, and description level. This process combined keyword sampling, automated video scraping, transcript extraction, vector embedding, and cosine similarity analysis to measure which factors correlate most strongly with ranking position.
Keyword Selection
- Built a list of 1,113 design-related keywords covering short-tail, medium-tail, and long-tail queries.
Video Collection
- Scraped the top 100 videos for each keyword using Playwright to capture both the video URLs and their ranking positions.
- Removed duplicate videos appearing for the same keyword.
- Filtered out videos longer than 1.5 hours to focus on content likely to be relevant to searchers.
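The cleanup steps in this list (deduplication per keyword and the 1.5-hour cutoff) might look something like the sketch below. The row fields `keyword`, `video_id`, `position`, and `duration_seconds` are hypothetical names for whatever the scraper actually recorded.

```python
MAX_DURATION_SECONDS = int(1.5 * 60 * 60)  # 1.5 hours

def clean_results(rows):
    """Keep the first occurrence of each (keyword, video_id) pair and drop overly long videos."""
    seen = set()
    cleaned = []
    for row in rows:
        key = (row["keyword"], row["video_id"])
        if key in seen:
            continue  # duplicate video for the same keyword
        seen.add(key)
        if row["duration_seconds"] > MAX_DURATION_SECONDS:
            continue  # longer than 1.5 hours
        cleaned.append(row)
    return cleaned

# Invented sample rows, for illustration only.
rows = [
    {"keyword": "logo design", "video_id": "a1", "position": 1, "duration_seconds": 600},
    {"keyword": "logo design", "video_id": "a1", "position": 4, "duration_seconds": 600},
    {"keyword": "logo design", "video_id": "b2", "position": 2, "duration_seconds": 7200},
    {"keyword": "color theory", "video_id": "a1", "position": 3, "duration_seconds": 600},
]
cleaned = clean_results(rows)
```

Note that the same video is kept when it ranks for a different keyword, since duplicates are only removed within a keyword.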
Transcript Extraction
- Pulled closed-caption (CC) transcripts from each video URL.
- Broke each transcript into individual timestamped segments using the YouTube transcript HTML structure.
- Segment text located in <yt-formatted-string class="segment-text style-scope ytd-transcript-segment-renderer">.
- Timestamps located in <div class="segment-timestamp style-scope ytd-transcript-segment-renderer">.
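One way to pull timestamped segments out of that markup is sketched below with Python's standard-library `html.parser`. The class and function names are illustrative, the sample HTML is a simplified assumption about how the two elements sit next to each other, and the parser assumes neither element contains nested tags of the same name.

```python
from html.parser import HTMLParser

def to_seconds(timestamp):
    """Convert 'm:ss' or 'h:mm:ss' into seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

class TranscriptParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self._field = None        # "timestamp", "text", or None when outside both
        self._field_tag = None
        self._buffer = []
        self._pending_ts = None
        self.segments = []        # list of (start_seconds, text)

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "segment-timestamp" in classes:
            self._field, self._field_tag, self._buffer = "timestamp", tag, []
        elif tag == "yt-formatted-string" and "segment-text" in classes:
            self._field, self._field_tag, self._buffer = "text", tag, []

    def handle_data(self, data):
        if self._field:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if not self._field or tag != self._field_tag:
            return
        value = " ".join("".join(self._buffer).split())  # collapse whitespace
        if self._field == "timestamp":
            self._pending_ts = to_seconds(value)
        else:
            self.segments.append((self._pending_ts, value))
            self._pending_ts = None
        self._field = None

def parse_transcript(html):
    parser = TranscriptParser()
    parser.feed(html)
    parser.close()
    return parser.segments

sample_html = """
<div class="segment-timestamp style-scope ytd-transcript-segment-renderer">
  0:05
</div>
<yt-formatted-string class="segment-text style-scope ytd-transcript-segment-renderer">how to pick a color palette</yt-formatted-string>
<div class="segment-timestamp style-scope ytd-transcript-segment-renderer">1:12</div>
<yt-formatted-string class="segment-text style-scope ytd-transcript-segment-renderer">start with one base hue</yt-formatted-string>
"""
segments = parse_transcript(sample_html)
```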
Embedding Creation
- Converted every timestamped segment into a vector embedding that captures its semantic meaning, using the Sentence Transformers model “mixedbread-ai/mxbai-embed-large-v1” (1024 dimensions).
Note: Mixedbread is optimized for semantic similarity, retrieval and RAG. It is trained to be multilingual and can interpret context in longer formats. Using other embedding models can affect results and regression models.
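A minimal sketch of the embedding step, assuming the `sentence-transformers` package is installed and that segments are stored as `(start_seconds, text)` tuples. `load_model` and `embed_segments` are hypothetical helper names; only `SentenceTransformer(...)` and its `encode()` method come from the library itself.

```python
def load_model(name="mixedbread-ai/mxbai-embed-large-v1"):
    """Load the model lazily, so the import and download happen only when needed."""
    from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
    return SentenceTransformer(name)

def embed_segments(segments, model):
    """Return one vector per segment, using any object that exposes encode(list_of_texts)."""
    texts = [text for _, text in segments]
    return model.encode(texts)

# Typical usage (downloads the model on first run):
#   model = load_model()
#   vectors = embed_segments([(5, "how to pick a color palette")], model)
#   # vectors has one row per segment; the article states 1024 dimensions for this model
```

Passing the model in as an argument keeps the function easy to exercise with a stand-in encoder and makes it simple to swap embedding models, which, as noted above, can change the results.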
Semantic Chunking
- Used FAISS to group segments into semantic chunks, with a minimum chunk size of 10 words and a maximum of 120 words.
- Normalized embeddings with faiss.normalize_L2() to prepare for cosine similarity calculations.
- Built a searchable FAISS index of all segment embeddings.
- For each segment, retrieved up to the 20 most similar segments to group semantically related content together.
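The reason for normalizing before searching is that, for unit-length vectors, the inner product equals cosine similarity. The sketch below shows that idea with a brute-force neighbor search in plain Python. It is a stand-in for the FAISS index rather than the study's actual code: the index type used is not stated in the article, and the function names here are illustrative.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (the same idea as faiss.normalize_L2)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def top_k_neighbors(vectors, k=20):
    """For each vector, return the indices of up to k most similar other vectors."""
    unit = [l2_normalize(v) for v in vectors]
    neighbors = []
    for i, query in enumerate(unit):
        scored = [
            (sum(a * b for a, b in zip(query, other)), j)  # inner product == cosine here
            for j, other in enumerate(unit)
            if j != i
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        neighbors.append([j for _, j in scored[:k]])
    return neighbors

# Toy 2-dimensional segment embeddings (invented values, for illustration only).
vectors = [[1.0, 0.0], [2.0, 0.1], [0.0, 1.0], [0.1, 3.0]]
nearest = top_k_neighbors(vectors, k=1)
```

With a real transcript, segments that share many of the same neighbors are natural candidates to merge into one semantic chunk, subject to the 10-word minimum and 120-word maximum described above.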
Data Compilation
- Compiled all embedded video data into spreadsheets for analysis.
Comparative Analysis
- Measured cosine similarity between the target keyword and the top semantic chunk of each transcript relative to its YouTube ranking position.
- Calculated the average cosine similarity of all chunks in a transcript relative to the keyword.
- Measured cosine similarity between the keyword and the video’s title.
- Measured cosine similarity between the keyword and the video’s description.
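The comparisons above reduce to two small calculations: summarizing per-chunk similarities for each transcript, then correlating a relevance measure with ranking position. The sketch below uses invented numbers and illustrative function names, and how positions were encoded or aggregated in the study is not something it tries to reproduce.

```python
import math

def summarize_similarities(chunk_similarities):
    """Max and average cosine similarity across a transcript's semantic chunks."""
    return {
        "max_keyword_similarity": max(chunk_similarities),
        "average_keyword_similarity": sum(chunk_similarities) / len(chunk_similarities),
    }

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented per-chunk similarities for one transcript.
summary = summarize_similarities([0.2, 0.8, 0.5])

# For a simple linear fit, R² is the square of Pearson's r.
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_squared = r ** 2
```

This squared relationship is why a Pearson correlation of 0.937 corresponds to an R² of roughly .878.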
Results
Strongest Signal: Transcript Relevance
We analyzed 110,000 videos and organized them by position in YouTube search results (YTRP). The most powerful signal was the relevance of the most relevant transcript segment (“semantic chunk”) to the searched keyword. Videos with higher cosine similarity between the keyword and their top transcript segment consistently ranked higher.

Key takeaway: The more relevant the top transcript segment is to the searched keyword, the higher the odds of ranking well.
Subscriber Count and Channel Authority
What about channel subscriber counts, you might ask? Don’t they influence the position of the retrieved videos?
Yes, absolutely.
Subscriber count acts like a “domain authority” for YouTube channels. Higher subscriber counts strongly correlated with higher rankings, following a logarithmic pattern. The same pattern appeared with monthly velocity (views divided by months published).

Key takeaway: Channels with larger audiences and faster-growing views have an advantage in ranking, even when transcript relevance is similar.
Keyword Opposition to Benefit Ratio (KOB)
Using Phil Nottingham’s research as a baseline, we created the KOB metric by comparing median subscriber counts to median monthly velocity across the top 100 videos for a keyword. This ratio identifies keywords where a smaller channel can realistically compete.

Key takeaway: Target keywords with higher KOB scores. They offer a better opportunity for visibility with optimized transcripts, titles, and descriptions.
Title and Description Relevance
Beyond transcripts, the semantic relevance of titles and descriptions also showed strong correlations with ranking position.


Key takeaway: Transcripts, titles, and descriptions are all fully controllable and together form the three strongest levers for improving YouTube visibility.
Timing of Key Segments
We tested whether the placement of the highest relevance segment within the video matters. There was a weak positive correlation (R² = .250) suggesting a small advantage when the most relevant segment occurs early, especially within the first 30 seconds.
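A check like this can be sketched in a few lines, assuming each segment is stored as a `(start_seconds, similarity)` pair. The function names are hypothetical; the default 30-second window mirrors the threshold mentioned above.

```python
def most_relevant_segment(segments):
    """Return the (start_seconds, similarity) pair with the highest similarity."""
    return max(segments, key=lambda segment: segment[1])

def peaks_early(segments, window_seconds=30):
    """True when the most relevant segment starts within the opening window."""
    start, _ = most_relevant_segment(segments)
    return start <= window_seconds

# Invented examples, for illustration only.
front_loaded = [(0, 0.41), (12, 0.83), (95, 0.62)]
late_payoff = [(0, 0.30), (200, 0.90)]
print(peaks_early(front_loaded), peaks_early(late_payoff))
```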

Key takeaway: Place your most relevant content early in the transcript. While the evidence is not definitive, it is a low-risk way to support rankings.

Key takeaway: A well-optimized transcript throughout still matters, even if one segment is highly relevant.
Positions 7 to 13 Disparity
Charts showed a consistent anomaly between positions 7 and 13. Videos in this band tended to be less semantically relevant but performed well on engagement metrics such as views and subscriber counts.


Key takeaway: Engagement metrics can offset weaker semantic relevance for mid-tier positions, showing that the algorithm balances relevance with performance signals.
Bottom Line: How To Get More Views with Semantic Relevance
The research shows that semantic relevance is not only a text-based SEO concept; it also drives visibility across YouTube and Google’s blended search ecosystem.
By analyzing over 110,000 videos across more than 1,000 keywords, we found three controllable factors: transcript content, title relevance, and description relevance. All three are consistently tied to higher rankings.
For in-house SEOs, this creates a clear action plan.
- Treat video content with the same rigor you apply to web pages.
- Develop scripts with at least one highly relevant segment early in the video.
- Ensure titles and descriptions closely match the intent of your target keywords while reflecting natural language.
- Evaluate subscriber count and monthly velocity across competing channels to identify realistic keyword opportunities using the Keyword Opposition to Benefit (KOB) metric.
This study also shows how Google and YouTube balance semantic relevance with authority and engagement signals. Channels with strong subscriber bases and high monthly velocity can rank even when their transcripts are less semantically aligned. SEOs need to integrate relevance optimization with audience growth, retention, and content promotion.
The findings highlight a larger shift in search behavior. As multimodal AI Search expands, every piece of content, including video, becomes part of a brand’s organic footprint.
The future of SEO is about shaping all content types so that algorithms understand and surface them. Brands that adopt this approach early gain a durable advantage in visibility and market share.