Video recording/processing procedure hints

Hi All!

The remote video attestation process requires synchronised quad-screen videos, meaning that teams will be dealing with a lot of video. Clearly, processing all of it with a traditional video editing suite is going to be prohibitively time-consuming.

The following is a process I’ve tried that is somewhat more streamlined than doing things manually, and only makes use of free software. You don’t need to do it this way but it might be a useful starting point to get these videos processed in a vaguely reasonable timeframe.

1: When you start a set of runs, start all 4 cameras (I’m counting your screen recorder as a camera if you’re using that) at roughly the same time. You do not need to start/stop the cameras between each run; it may be easier to just record one long video on each camera and cut it up later (see below). If your cameras will run for 4 or 8 hours on a card/battery (or perhaps plugged in), this will save you some logistics time.

2: When all the cameras are started but before you start the runs, turn on a light that is visible to all of the cameras (directly or via the robot). This marks the “start” of the final video. It’s easier to scan through the start of the video looking for the frame where the light turns on than to figure out when a clap happens in the soundtrack.

3: When the light turns on, start a normal (hour/minute/second/millisecond) stopwatch and keep it running for the length of the video.

4: At the start and end of each run, note the time on the stopwatch (but don’t start/stop it). It might be useful to record this in a spreadsheet, along with a filename that describes the run.

5: When you’ve done all the runs you want for this set of recordings (e.g. after a half-day or a day), stop all the cameras and download the files. If your camera splits long recordings across multiple files, you may need to combine them first. The “ffmpeg” program, which we also use below, can do this in batch without re-encoding if you don’t already have a program for it; see the example below.
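
For example, something like the following (a sketch - the clip names are hypothetical, substitute whatever your camera produces) uses ffmpeg’s concat demuxer to join the pieces without re-encoding:

printf "file '%s'\n" clip*.mp4 > filelist.txt
ffmpeg -f concat -safe 0 -i filelist.txt -c copy Combined.mp4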

6: For each of the 4 videos, look through the start of the video until you find the frame where the light turns on. Write this time down (hour:minute:second plus the frame number - you’ll convert the frame number to a fraction of a second in the next step).
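
If you don’t have a video player that can step frame by frame, the “ffplay” player that ships with ffmpeg can do it: open the video, then press “s” to pause and step forward one frame at a time.

ffplay DetailVideo.mp4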

7: Use the freely available, open source “ffmpeg” program (available for Windows, Mac, and Linux) to combine the videos into a quad video. Depending on your computer, this might process at close to realtime, but you can batch it up and run it overnight unattended without needing to actually watch the whole video. Here’s an example of a command line to do it. This should all be on one line; I’ve split it up to make it easier to read.

ffmpeg 
-ss 00:00:00.00 -i DetailVideo.mp4 
-ss 00:00:00.00 -i OverviewVideo.mp4 
-ss 00:00:00.00 -i OCUVideo.mp4 
-ss 00:00:00.00 -i OperatorVideo.mp4 
-filter_complex "
[0:v] scale=960x540 [upperleft];
[1:v] scale=960x540 [lowerleft];
[2:v] scale=960x540 [upperright];
[3:v] scale=960x540 [lowerright];
nullsrc=size=1920x1080 [base]; 
[base][upperleft] overlay=shortest=1 [tmp1];
[tmp1][upperright] overlay=shortest=1:x=960 [tmp2];
[tmp2][lowerleft] overlay=shortest=1:y=540 [tmp3];
[tmp3][lowerright] overlay=shortest=1:x=960:y=540
" -vcodec h264 -acodec aac Output.mp4

where:

ffmpeg : call to the ffmpeg program. Replace this with the full path to where your ffmpeg executable is if it isn’t in your path.
-ss 00:00:00.00 -i DetailVideo.mp4 : The input detail video. The “-ss” flag indicates where to start the video - in this case the frame where the light turns on. Note that ffmpeg expects the part after the seconds to be a decimal fraction of a second, not a frame number, so divide the frame number by the frame rate. If the light turns on at the 1 minute, 2 second, frame 5 mark and the camera records at 25 fps, this should be “-ss 00:01:02.20” (5/25 = 0.20 seconds). Placing “-ss” before the input it refers to makes ffmpeg seek by keyframe, so it doesn’t have to decode the whole video up to that point.
-ss 00:00:00.00 -i OverviewVideo.mp4 : As above for the overview video.
-ss 00:00:00.00 -i OCUVideo.mp4 : As above for the OCU video.
-ss 00:00:00.00 -i OperatorVideo.mp4 : As above for the operator interface/hands video.
-filter_complex "..." : defines a complex filtergraph; the part of the command within the quotes defines the filter.
[0:v] scale=960x540 [upperleft]; : Scales the video stream of the first input (ffmpeg counts from zero) to half of full HD (960x540) and labels it “upperleft”. Similar for the other 3 videos.
nullsrc=size=1920x1080 [base]; : Create a blank frame on which to paste all the others.
[base][upperleft] overlay=shortest=1 [tmp1]; : Add the first video to the blank frame and label the result “tmp1”. The option “shortest=1” means the output ends when the shorter input ends - important here because the blank “nullsrc” frame would otherwise run forever.
[tmp1][upperright] overlay=shortest=1:x=960 [tmp2]; : Take the video stream “tmp1” from above, overlay the “upperright” video on it at x=960, and label the result “tmp2”.
[tmp2][lowerleft] overlay=shortest=1:y=540 [tmp3]; : Similar to above.
[tmp3][lowerright] overlay=shortest=1:x=960:y=540" : Similar to above. Note the lack of a semicolon and the closing quote (matching the quote at “-filter_complex”). This last stream is not given a label, which means it becomes the output.
-vcodec h264 : Use the h264 codec to encode the resulting video.
-acodec aac : Use the AAC codec to encode the resulting audio. On some older ffmpeg builds the native AAC encoder was experimental, so you might need to add “-strict experimental” or use an external encoder (e.g. “-acodec libfdk_aac”) if your build includes one.
Output.mp4 : The desired output filename.
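
Putting it together with some made-up numbers: if the cameras record at 25 fps and the light comes on at 00:01:02 frame 5 in the detail video, 00:00:58 frame 20 in the overview video, 00:01:05 frame 0 in the OCU video, and 00:00:59 frame 12 in the operator video, the input section of the command would become (the rest of the command is unchanged):

ffmpeg
-ss 00:01:02.20 -i DetailVideo.mp4
-ss 00:00:58.80 -i OverviewVideo.mp4
-ss 00:01:05.00 -i OCUVideo.mp4
-ss 00:00:59.48 -i OperatorVideo.mp4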

8: You now have one long quad screen video. You can use ffmpeg to break this up using the times from the stopwatch that you recorded previously, without re-encoding. The command line is:

ffmpeg -ss 00:00:00.00 -i input.mp4 -t 00:10:00.00 -vcodec copy -acodec copy output.mp4

where:

ffmpeg : call to the ffmpeg program as before.
-ss 00:00:00.00 : the start time from the stopwatch. If this run is at the 2 hour, 13 minute, 3 second mark, this would be “-ss 02:13:03.00”.
-i input.mp4 : the long input file.
-t 00:10:00.00 : the length of the video (the end time minus the start time from the stopwatch). If the run was 13 minutes and 26 seconds, this would be “-t 00:13:26.00”.
-vcodec copy -acodec copy : Copies the raw data rather than re-encoding, so it’s a lot faster. Note that with stream copy ffmpeg can only cut at keyframes - it can’t reconstruct frames without re-encoding - so the actual cut points can shift by up to a keyframe interval. That’s usually close enough for this purpose.
output.mp4 : the output file.
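
If you do hit a case where the keyframe snapping matters (see the note on “-vcodec copy” above), a slower but frame-accurate alternative is to re-encode the clip instead of copying, e.g.:

ffmpeg -ss 00:00:00.00 -i input.mp4 -t 00:10:00.00 -vcodec h264 -acodec aac output.mp4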

You can write yourself a little batch script, or even do some spreadsheet text manipulation to automatically run this.
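
For instance (assuming the start time is in column A, the duration in column B, and the run name in column C of your spreadsheet), a formula along these lines would build the command for each row, ready to paste into a terminal or a script:

="ffmpeg -ss "&A2&" -i input.mp4 -t "&B2&" -vcodec copy -acodec copy "&C2&".mp4"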

Hopefully this gives you some ideas on how to streamline the processing of the videos you need to upload!

Cheers!

  • Raymond

Oh and as for chopping up the videos, here’s a script that might be useful as a starting point. It assumes that you have a text file that’s laid out “start time,duration,filename” with no spaces. For instance:

00:00:05.00,00:13:26.00,LinearInspect_30cm_white
00:00:22.00,00:11:08.00,LinearInspect_30cm_yellow
00:00:35.00,00:15:32.00,LinearInspect_30cm_orange

This text file is similar to what you would get if you saved the spreadsheet as a comma-separated file. It is very important that there are no spaces in each line, because the script below splits the file on whitespace!

You could then automatically generate the individual video files from the master quad-screen video that you created with ffmpeg in the previous post, using a script like the one below and passing the master video filename as the first parameter and the list text file as the second.

#!/bin/bash
# Split a master video into per-run clips using a "start,duration,name" list.
masterfile=$1
listfile=$2
for line in `cat "$listfile"`; do
    # Pull the three comma-separated fields out of the line.
    starttime=`echo $line | sed "s/,.*//"`
    duration=`echo $line | sed "s/^[^,]*,//" | sed "s/,.*//"`
    filename=`echo $line | sed "s/.*,//"`
    ffmpeg -ss "$starttime" -i "$masterfile" -t "$duration" -vcodec copy -acodec copy "$filename.mp4"
done

Note that this has no error checking, so please be careful! Also, of course, there are a bunch of ways of splitting up the line; that’s just the way that first came to mind.

Note also that depending on your text editor, system, etc. you might need to force the list file, or the script, to be saved with Unix line breaks - otherwise a stray carriage return at the end of each line will end up in the output filenames.
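
If you’re not sure, something like this (GNU sed; the “dos2unix” utility also works) will strip Windows carriage returns from the list file - “runlist.txt” here is a stand-in for whatever your list file is called:

sed -i 's/\r$//' runlist.txt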

Cheers!

  • Raymond