A simple trick lets you search for text in YouTube videos, so you can immediately find what interests you without having to watch the entire video.
You often don't have the time to watch a long presentation posted on YouTube. One of the most interesting and useful features of the Google service is its automatic subtitles.
Subtitles are a great way to make content more accessible: YouTube is capable of using Google’s speech recognition technology to automatically generate subtitles for videos, including those uploaded by users.
YouTube's closed captions are produced by machine learning algorithms that keep getting more effective thanks to continuous improvements. The speech-to-text conversion, however, can give variable results depending on the language used, the speaker's inflections, imperfect pronunciation of foreign terms, and the topic covered in the video.
Before looking at how to search for text in YouTube subtitles and export them to a text file, we suggest checking that the Subtitles button is actually present in the lower right corner of the YouTube video.
Clicking this button turns on the subtitles, which overlay on the video everything that is said.
By clicking the Settings button (it looks like a small gear), you can check whether the YouTube subtitles were generated automatically or uploaded by the author of the content.
Another extremely useful option is real-time translation of the speech: if a video was made in another language, you can click the Settings icon, choose Italian, and read the subtitles thanks to automatic translation.
How to extract YouTube video subtitles and search the text
To search for text in the subtitles of YouTube videos, you can follow a procedure that does not require any third-party tool:
1) Install and launch Google Chrome.
2) Open the YouTube video of interest.
3) Right-click in a free area of the page and choose Inspect.
4) In the panel that opens, choose the Network tab from the bar at the top.
5) Click once or twice on the Subtitles button: in the Network tab, you will see an entry whose name begins with timedtext.
6) Right-click on the timedtext entry and select Open in new tab.
7) You will get a file in JSON format containing the entire transcription of what is said in the YouTube video. By pressing the key combination CTRL + F you can start a search.
Note, however, that the JSON file stores the spoken text in utf8 objects, and each occurrence usually contains only one word, so a phrase cannot be found by searching the raw file directly.
8) To remove all superfluous elements and extract only the contents of the utf8 objects, we suggest using the Notepad++ text editor.
In the tab open in Chrome, press CTRL + A to select the entire contents of the JSON file containing the YouTube transcription, press CTRL + C to copy it, then paste it into Notepad++ by pressing CTRL + V.
9) In Notepad++, choose Search, Replace (or press CTRL + H), tick the Regular expression box, and paste .+?utf8":"(.+?)".+?\} into the Find what field. In the Replace with field, enter \1 so that each match is replaced by the captured text alone.
When you press the Replace all button, Notepad++ keeps only the content of the utf8 objects.
10) To remove the occurrences of \n that YouTube adds to the transcription, type \n in the Find what field and press the spacebar once in the Replace with field below it.
This time, however, set the Search Mode to Normal instead of Regular expression. Finally, click Replace all.
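Steps 8 to 10 can also be approximated with a short script instead of a text editor. The following is only a minimal sketch in Python, not part of the procedure above: it assumes the JSON from step 7 has been saved to a file, walks the whole structure looking for utf8 keys (so it does not depend on the exact layout of the file), and assumes each fragment carries its own leading space, as the one-word fragments usually appear to. The function names and the sample data are made up for illustration.

```python
import json
import re


def collect_utf8(node):
    """Recursively gather every string stored under a "utf8" key, in document order."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "utf8" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_utf8(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_utf8(item))
    return found


def transcript_from_json(raw_json):
    """Join the utf8 fragments and collapse newlines and repeated spaces.

    Assumption: fragments include their own leading spaces. If they do not,
    join with " " instead of "".
    """
    fragments = collect_utf8(json.loads(raw_json))
    return re.sub(r"\s+", " ", "".join(fragments)).strip()


def search(transcript, phrase, context=30):
    """Return a snippet of surrounding text for each case-insensitive match."""
    snippets = []
    for match in re.finditer(re.escape(phrase), transcript, flags=re.IGNORECASE):
        start = max(0, match.start() - context)
        end = min(len(transcript), match.end() + context)
        snippets.append(transcript[start:end])
    return snippets


# Hand-written sample standing in for the saved file (hypothetical data).
sample = (
    '{"events": [{"segs": [{"utf8": "hello"}, {"utf8": " world"}]},'
    ' {"segs": [{"utf8": "\\n"}]},'
    ' {"segs": [{"utf8": "second"}, {"utf8": " line"}]}]}'
)
text = transcript_from_json(sample)
print(text)
print(search(text, "world second", context=3))
```

To try it on a real transcription, replace the sample with the contents of the saved file, for example open("timedtext.json", encoding="utf-8").read(), where the file name is whatever you chose when saving. Because the fragments are joined before searching, a phrase that spans several one-word utf8 objects can be found in a single pass.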
Alternatively, you can obtain the subtitles of any YouTube video simply by pasting its URL into the appropriate field and choosing Select action.