Lawal and McCoy
Improving Subtitle Visibility in Low Budget Movies
Introduction
Subtitles are a common tool for helping audiences understand dialogue they would otherwise be unable to interpret. However, simply overlaying white text on the image risks making the subtitles themselves unreadable. The main goal of our project was to develop a MATLAB program that achieves visually pleasing viewing properties for a given subtitle and frame (picture) combination. Although the best solution is to have people go through a movie frame by frame and decide which viewing properties make the subtitles most visible, an automated solution may be the next best thing. Many films, particularly foreign films, may not have the budget to hand-tune their subtitles and would greatly benefit from such an algorithm.
When deciding how to create the optimal subtitles, we investigated a number of variables, including the location of the subtitles, their color, and any potential shadowing around them. When comparing colors, our program uses the CIELAB DeltaE metric as a proxy for the perceptual difference between the subtitles and the surrounding image. Color theory holds that the larger the DeltaE between two objects, the more discernible they are to the human eye. In our case we analyzed the average DeltaE between the text color and the other pixels in the textbox.
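As an illustration, the DeltaE comparison can be sketched as follows. This is a Python sketch rather than our MATLAB code, and the function names are ours for illustration only: it converts sRGB to CIELAB and takes the Euclidean distance in Lab space (the CIE76 form of DeltaE).

```python
import math

def srgb_to_lab(rgb):
    """Convert an sRGB triple (0-255) to CIELAB, D65 white point."""
    def lin(c):
        # Undo the sRGB gamma curve
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (lin(c) for c in rgb)
    # Linear RGB -> XYZ (standard sRGB matrix, D65)
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white
    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def delta_e(rgb1, rgb2):
    """CIE76 DeltaE: Euclidean distance between two colors in Lab."""
    l1, l2 = srgb_to_lab(rgb1), srgb_to_lab(rgb2)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(l1, l2)))
```

White against black yields a DeltaE of about 100, the full lightness range, which is why white text on dark frames scores so well.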
One of our major goals was to answer the following question: do larger DeltaE differences appear more visually appealing to the viewer or do more subjective factors (such as shadowing and location) dominate? Furthermore, is there even really an advantage to larger DeltaE differences (with all other factors held constant) in determining visually pleasing subtitles?
Method
Placing Subtitles on an Image
First we began by placing subtitles onto single images. We loaded the images into MATLAB and used the built-in "text" function to place the text onto the image. We found that 16 point is an appropriate text size for most images. We created a main wrapper function, Place_Captions.m.
image = Place_Captions(im, txt, position, fontsize, fontcolor, shadow_size)
This function takes in an image, a string to overlay onto that image, a position (the x and y coordinates of the top-left corner of the textbox), a fontsize (typically 16), a vector of RGB values for the font color, and the length of the shadow extending from the subtitles. We used this function to investigate how various text locations, font colors, and shadows affected how the text appeared on the image.
Subtitle Location
Based on our research into subtitling, we found that subtitles are typically placed near the bottom of the frame and aligned left. We mimicked this position without spending too much time on this aspect: the subtitle's top-left corner was placed, in most cases, 10% above the bottom edge and 25% in from the left edge of the image. This position, like the font size, sometimes had to be adjusted when the movies we read in were small. Rather than building a very general function to find the appropriate location, we made a fairly simple one and adjusted it as needed. Another issue was subtitles that were too long for one line. To solve this, we check whether the textbox surrounding the subtitle is wider than the image itself; if it is, we split the subtitle into two lines and place the second line directly beneath the first. We created a simple MATLAB function, pos_text.m, that takes in the size of the image and the width of the text and outputs the top-left corner of the box in which to place the text.
[x y] = pos_text(image_size, textwidth)
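The placement logic above can be sketched as follows. This is a Python illustration, not our actual pos_text.m; the 20-pixel default text height and the word-boundary split are assumptions made for the sketch.

```python
def pos_text(image_size, textwidth, textheight=20):
    """Top-left corner of the subtitle box: roughly 10% above the
    bottom edge and 25% in from the left edge of the frame."""
    height, width = image_size
    x = int(0.25 * width)                 # 25% in from the left
    y = int(0.90 * height) - textheight   # 10% above the bottom
    return x, y

def split_line(txt, textwidth, image_width):
    """If the rendered text would overrun the frame, split it near
    the middle at a word boundary and return the resulting lines."""
    if textwidth <= image_width:
        return [txt]
    words = txt.split()
    mid = len(words) // 2
    return [" ".join(words[:mid]), " ".join(words[mid:])]
```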
Shadow
The goal of shadowing is to have the text blend into the background as seamlessly as possible. Unlike caption boxes, which cover up much of the background image, our shadowing was designed to retain as much background information as possible. In order to place a shadow around the text, we first had to obtain a mask that would let us separate the text from the actual image. To achieve this, we placed the same text at the same position on a completely black image of the same size as the original. By calculating the distance from each pixel near the text to the text itself, we determined how much shadowing to apply to each pixel. For example, a pixel right next to the text is turned completely black, while a pixel three pixels away is only darkened. One of the inputs to our Place_Captions function is shadow_size, which determines how many pixels beyond the text the shadow extends. We found that a shadow size of 3-5 pixels was most effective. We created a Make_Shadow function to add the shadow after the text has been placed.
shadow_image = Make_Shadow(image, mask, shadow_size, shadow_factor)
This function takes in the image, a mask containing only the text on a black background, the number of pixels the shadow extends from the text, and a scale factor representing how quickly the shadow fades. The function weights the distances from each pixel to nearby text and sums these weights; once normalized, this sum is the factor used to darken the pixel. Thus a pixel very close to dense subtitles is darkened significantly. There are many potential shadowing algorithms, but this one worked well for our purposes.
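The distance-weighting idea can be sketched as follows. This is a simplified Python illustration on a grayscale image, not our actual Make_Shadow.m; the inverse-distance weighting and the cap at full black are assumptions standing in for the normalization described above.

```python
import math

def make_shadow(image, mask, shadow_size=4, shadow_factor=1.0):
    """Darken pixels near the text. `image` is a 2-D list of gray
    values (0-255); `mask` is a same-sized 2-D list that is 1 on text
    pixels. Each background pixel is scaled down by a weight summing
    the contributions of nearby text pixels, so pixels hugging dense
    text go nearly black while pixels shadow_size away are untouched."""
    h, w = len(image), len(image[0])
    text = [(i, j) for i in range(h) for j in range(w) if mask[i][j]]
    out = [row[:] for row in image]
    for i in range(h):
        for j in range(w):
            if mask[i][j]:
                continue  # leave the text itself alone
            weight = 0.0
            for ti, tj in text:
                d = math.hypot(i - ti, j - tj)
                if 0 < d <= shadow_size:
                    # Closer text pixels contribute larger weights
                    weight += shadow_factor * (1.0 - d / shadow_size)
            weight = min(weight, 1.0)  # cap total darkening at black
            out[i][j] = int(image[i][j] * (1.0 - weight))
    return out
```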
Subtitle Color
We started by looking at both white and yellow subtitles. We decided that in most cases white subtitles looked better and thus stuck with white and shades of gray (equal amounts of each RGB channel) as potential subtitle colors. We did this in spite of the fact that yellow subtitles had higher DeltaE values in general when compared to nearby pixels. We also decided that we would only use brighter shades of gray and not black subtitles. We felt that black would contrast too much with the film and would not be as visually appealing. Thus, we designed an algorithm that would select the optimal color based on comparing the average DeltaE difference between the text color and the other pixels in the text box surrounding the subtitles.
optimal_color = Optimize_DeltaE(im, txt, fontsize)
Because of the computation time required by the algorithm, we only optimized individual images. Finding the optimal color for a given frame took 3-5 minutes, depending on the size of the image and the amount of text, which clearly makes maximizing DeltaE impractical for clips of reasonable length using our program. However, for the vast majority of frames we tried, a completely white subtitle actually maximized DeltaE. This means we could assume a white subtitle and perform a quick check to see whether optimizing would help a given frame, allowing the algorithm to work on clips without an excessive computation time.
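The brute-force search over gray shades can be sketched as follows. Again this is a Python illustration, not our Optimize_DeltaE.m: the candidate range of grays and the compact CIE76 helper are assumptions, and a real frame would supply the textbox pixels.

```python
import math

def _lab(rgb):
    # Compact sRGB (0-255) -> CIELAB conversion, D65 white point
    lin = [((c / 255 + 0.055) / 1.055) ** 2.4 if c / 255 > 0.04045
           else c / 255 / 12.92 for c in rgb]
    r, g, b = lin
    xyz = (0.4124 * r + 0.3576 * g + 0.1805 * b,
           0.2126 * r + 0.7152 * g + 0.0722 * b,
           0.0193 * r + 0.1192 * g + 0.9505 * b)
    f = [(t / w) ** (1 / 3) if t / w > 0.008856 else 7.787 * t / w + 16 / 116
         for t, w in zip(xyz, (0.95047, 1.0, 1.08883))]
    return (116 * f[1] - 16, 500 * (f[0] - f[1]), 200 * (f[1] - f[2]))

def delta_e(c1, c2):
    return math.dist(_lab(c1), _lab(c2))  # CIE76

def optimize_gray(textbox_pixels, shades=range(128, 256)):
    """Pick the gray level (equal R=G=B) maximizing the mean CIE76
    DeltaE against the pixels inside the textbox. Only brighter grays
    are tried, mirroring our decision to avoid black subtitles."""
    best_shade, best_score = None, -1.0
    for g in shades:
        candidate = (g, g, g)
        score = sum(delta_e(candidate, p)
                    for p in textbox_pixels) / len(textbox_pixels)
        if score > best_score:
            best_shade, best_score = candidate, score
    return best_shade, best_score
```

On a dark textbox this search falls back to pure white, which matches our observation that white maximized DeltaE for most frames.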
After optimizing the color, the shadowing algorithm had to be adjusted to darken the pixels surrounding the text based on the text color. However, this algorithm for placing shadows on non-white subtitles could be significantly improved: we found that after optimizing the color, it was better to leave the shadow off. Below is an example of a frame whose optimal color was not white; it shows how the shadow can have a "blurring" effect rather than the desired effect of creating contrast.
White Text with Shadow
Optimized Gray Text with Shadow
Optimized Gray Text without Shadow
Text Luminance vs. Avg. DeltaE Difference (prior to adding shadows)
The three images above show a bright white subtitle (not DeltaE-maximized) overlaid on the image, the DeltaE-maximized shade of gray with a shadow, and the DeltaE-maximized shade of gray without a shadow.
Placing Subtitles on a Movie Clip
In order to place subtitles on actual movie clips, we first found clips online and converted them into MATLAB movie structs. We then manually transcribed the subtitles, along with the times at which they occurred, into a text file. Finally, we wrote a function, Movie_Captions.m, to add the subtitles to each frame that had dialogue.
movie_out = Movie_Captions( movie_in, text_file, fps)
The function takes in a MATLAB movie struct, the location of the text file, and the number of frames per second, which is used to line up the subtitles with the corresponding frames. We ran into problems by assuming our clips were 30 frames per second when some were really 23 or 29. After creating the movie struct, we ran the MATLAB movie2avi command and then compressed the movie further into .mp4 form.
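The frame/subtitle alignment can be sketched as follows. This is a Python illustration; our actual text-file format differed, and the (start, end, text) cue tuples here are assumed for the sketch.

```python
def subtitle_frames(cues, fps):
    """Map (start_sec, end_sec, text) cues to per-frame captions and
    return {frame_index: text}. Using the wrong fps (e.g. assuming 30
    when the clip runs at a different rate) shifts every cue, which is
    the synchronization bug we hit."""
    frames = {}
    for start, end, text in cues:
        first = round(start * fps)
        last = round(end * fps)
        for f in range(first, last + 1):
            frames[f] = text
    return frames
```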
Watch some clips that we created with audio included:
Clips without audio:
Challenges
Computation Time
To add subtitles to a clip of even roughly 30 seconds, our program typically took 4-10 minutes, depending on how many subtitles were present in the clip and the resolution of the images. This was simply producing white subtitles with a shadow behind them, like most commercial subtitles. Furthermore, when optimizing for color, each individual frame took approximately 4 minutes for the program to find the gray shade that maximized the DeltaE difference between the text color and the pixels within the textbox. This clearly makes maximizing DeltaE impractical for clips of reasonable length using our program, so we only optimized single frames. However, for the vast majority of frames we tried, a completely white subtitle actually maximized DeltaE. This means that in most cases our assumption of white subtitles is correct, and that we should develop a quick test of whether optimizing would help a frame before devoting the full computation to it.
Inconsistency Among Computers
When running these programs on different computers, we ran into several compatibility issues. First, not all versions of MATLAB process video equally well; we had to find computers with MATLAB R2011a to create the movie clips. Also, MATLAB on Apple computers rendered noticeably different-looking text than on Windows computers, despite using the same font, size, weight, etc. To avoid this issue, we stuck with Windows computers for developing our images and videos.
Results
Survey
We performed a survey that was divided into two parts.
In the first part of the survey, we showed participants five different video clips with bright white subtitles and asked them to rate, on an absolute scale of 1-5 (5 being the best), how visible the shadowed white subtitles were.
In the second part of the survey, we showed three different frames from a movie: one with a white subtitle with a shadow (not the optimal color), one with the optimal shade of gray with a shadow, and one with the optimal shade of gray without a shadow.
For the first part of the survey, the average rating was 3.85 with a standard deviation of 0.75. There was no statistically significant difference between the five videos used in the polling, including the black-and-white video. For the second part, people consistently considered the DeltaE-maximized subtitle with a shadow the least visually appealing. Generally, people found the DeltaE-maximized shade of gray without the shadow the best and the white subtitle with the shadow in the middle. While there was a distinct ranking between the three images, later subjective feedback led us to believe the gap between the DeltaE-maximized gray without the shadow and the white subtitle with the shadow was not large. This leads us to believe that the shadow works best with white subtitles and can perhaps compensate for a smaller DeltaE difference.
More Images
Below are more examples of images with optimal subtitles and their luminance vs. DeltaE plots. In the following images, if the optimal color was white, we kept the shadowing on the image. If the optimal color was not white, we removed the shadowing.
Scarface Luminance Plot
Scarface Optimal Subtitle
Mockingbird Luminance Plot
Mockingbird Optimal Subtitle
Pokeball Luminance Plot
Pokeball Optimal Subtitle
Conclusions
Both optimizing the color of subtitles and creating a shadowing effect around them can be effective ways to increase readability. These methods, however, seem to work better separately than together: both the white subtitles with shadowing and the optimized gray subtitles without shadowing ranked higher in our survey than the optimized gray subtitles with shadowing. This may be because our shadowing algorithm was developed for white subtitles and would need further adjustment to be effective on non-white subtitles.
This helps answer our original question of whether DeltaE is a valuable metric for how discernible subtitle text is to the viewer. Higher DeltaE values clearly help, and increased discriminability translates to increased clarity in subtitles; however, subjective factors such as font, font size, font weight, and shadowing are just as important and cannot be ignored.
Potential Extensions
- Optimize shadow in addition to subtitle color.
- Write program in faster language (such as C), instead of in MATLAB, to make it practical for longer clips.
- Split the subtitles into separate shades of gray, depending on the background behind each section.
- Take position into account when maximizing DeltaE.
References
http://www.dcmp.org/captioningkey/text.html
Appendix I - Code and Data
Appendix II - Work Partition
Logan - Developed most of algorithm, helped with slides and wiki
Tola - Helped with algorithm by making helper functions, found movie clips and frames online, made presentation slides
Together - Decided on the basic algorithm, ran the scripts to create movie clips with captions.