In the course of a typical project, a number of PoCs are done to determine the best route. While working on a project that involved capturing an image from a webcam, I found too much information to fit into a single post. This post covers the mid-level abstractions: capturing an image with command-line tools.
Just Take the Damn Picture
Taking pictures on Linux is largely handled by v4l2 (video4linux2), the kernel's video capture framework, which provides the API used to talk to the physical webcam hardware. FFmpeg is a powerful video tool with webcam capture built in, but the quality was not the best because the image lacked focus. Adding a delay before grabbing the frame helped with the overall saturation, but a fully autofocused image was still superior to the best I could get out of ffmpeg after tuning as many options as I could. Strangely, when I specified the size / resolution from the growing list of options, the image came out distorted horizontally by around 30%.
Standard Capture vs Advanced Capture
This is where things got weird. After evaluating all of the different images available through v4l2, I remembered that the webcam I was using for another project produced much higher quality images through the high-level web API in the browser. I used the Linux application Cheese to test my theory, was blown away by the difference in quality, and had to investigate further. Cheese uses gstreamer, which produces a much cleaner final image. Gstreamer is the fully loaded luxury vehicle to ffmpeg's base economy model, and the ffmpeg plugins that are available can also be used by gstreamer. Getting gstreamer on Ubuntu was janky: I was trying to install it locally, but zsh failed to pick up the new command (pro tip: open a new terminal before spending more than 5 minutes troubleshooting).
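For comparison, a gstreamer still capture can be sketched as a short pipeline. This is an assumed minimal pipeline, not the one Cheese builds internally; the function name and file names are hypothetical.

```python
import shlex


def gst_snapshot_cmd(device="/dev/video0", output="test.jpeg"):
    """Build a gst-launch-1.0 pipeline that writes one JPEG frame.

    Assumed pipeline for illustration: v4l2src -> videoconvert -> jpegenc -> filesink.
    """
    return [
        "gst-launch-1.0",
        "v4l2src", f"device={device}", "num-buffers=1",  # stop after one buffer
        "!", "videoconvert",                             # convert to a format jpegenc accepts
        "!", "jpegenc",                                  # encode the frame as JPEG
        "!", "filesink", f"location={output}",           # write it to disk
    ]


cmd = gst_snapshot_cmd()
print(shlex.join(cmd))
# To actually capture, pass the list to subprocess.run(cmd, check=True).
```

With `num-buffers=1` the pipeline keeps the very first frame the camera delivers, so exposure and focus may not have settled yet.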
Right Tool for the Job
This guy has the best explanation of how gstreamer does what it does: http://www.einarsundgren.se/gstreamer-basic-real-time-streaming-tutorial/
This one is older but more in-depth: http://wiki.oz9aec.net/index.php/Gstreamer_cheat_sheet
Wow, gstreamer is really cool, so it can totally fix my problem, right?!? Wrong. The output still looked horrible. After more research I went back to ffmpeg and decided to actually read the text it prints on the command line. It looks like the input from /dev/video0 is being captured at 640 by 480 with an insane bitrate, then scaled up to the requested resolution and optimized for video (since the single frame has now dropped from 147456 kb/s to 200 kb/s). Hmm, that does not seem right….
Input #0, video4linux2,v4l2, from '/dev/video0':
Duration: N/A, start: 6533.615178, bitrate: 147456 kb/s
Stream #0:0: Video: rawvideo (YUY2 / 0x32595559), yuyv422, 640x480, 147456 kb/s, 30 fps, 30 tbr, 1000k tbn, 1000k tbc
[swscaler @ 0x1ec6280] deprecated pixel format used, make sure you did set range correctly
Output #0, image2, to 'test.jpeg':
encoder : Lavf56.40.101
Stream #0:0: Video: mjpeg, yuvj422p(pc), 1920x1080, q=2-31, 200 kb/s, 30 fps, 30 tbn, 30 tbc
encoder : Lavc56.60.100 mjpeg
Stream #0:0 -> #0:0 (rawvideo (native) -> mjpeg (native))
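The numbers in that log can be checked with a little arithmetic. The sketch below pulls the two frame sizes out of the Stream lines, recomputes the raw input bitrate assuming YUYV 4:2:2 at 16 bits per pixel, and compares the aspect ratios. The helper names are made up for this example, and the regular expression is only meant for log lines shaped like the ones above.

```python
import re
from fractions import Fraction

# The two Stream lines from the ffmpeg log above.
LOG = """\
Stream #0:0: Video: rawvideo (YUY2 / 0x32595559), yuyv422, 640x480, 147456 kb/s, 30 fps, 30 tbr, 1000k tbn, 1000k tbc
Stream #0:0: Video: mjpeg, yuvj422p(pc), 1920x1080, q=2-31, 200 kb/s, 30 fps, 30 tbn, 30 tbc
"""

# Match "<width>x<height>" (or the typographic "×") on each video Stream line.
SIZE_RE = re.compile(r"Stream #\d+:\d+: Video:.*?\b(\d{2,5})[x×](\d{2,5})\b")


def stream_sizes(log):
    """Return (width, height) for every video Stream line, in order."""
    return [(int(w), int(h)) for w, h in SIZE_RE.findall(log)]


def raw_bitrate_kbps(width, height, fps, bits_per_pixel=16):
    """Bitrate of uncompressed video in kb/s (1 kb = 1000 bits)."""
    return width * height * bits_per_pixel * fps // 1000


(in_w, in_h), (out_w, out_h) = stream_sizes(LOG)

# 640 * 480 pixels * 16 bits * 30 fps = 147,456,000 bits/s
print(raw_bitrate_kbps(in_w, in_h, 30))   # 147456

# How much wider the output frame is, relative to the input aspect ratio.
stretch = Fraction(out_w, out_h) / Fraction(in_w, in_h)
print(stretch)                            # 4/3
```

So the "insane" input bitrate is simply what uncompressed 640x480 video at 30 fps costs, and scaling a 4:3 frame into a 16:9 one stretches it horizontally by a factor of 4/3, roughly 33%, which would be consistent with the distortion of around 30% seen earlier.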
That is when it clicked: ffmpeg is a cross-platform solution to record, convert and stream audio and video. It does not give a damn what sort of video you give it; it will just convert it and apply whatever transforms you request on the command line. Going back to v4l2-ctl was where the solution was waiting to be found. Setting the capture size there caused the image to be captured at the proper size.
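A rough sketch of setting the capture size on the device itself, again wrapped in Python. The post does not record the exact invocation, so the size, device path, and function names here are assumptions; the flags are standard `v4l2-ctl` options from v4l-utils.

```python
import shlex


def v4l2_list_formats_cmd(device="/dev/video0"):
    """Command that lists the pixel formats and frame sizes the camera offers."""
    return ["v4l2-ctl", f"--device={device}", "--list-formats-ext"]


def v4l2_set_format_cmd(device="/dev/video0", width=1920, height=1080,
                        pixelformat=None):
    """Command that asks the driver to capture at the given size.

    The default size is an illustrative assumption; pick one that the
    listing command reports as supported.
    """
    fmt = f"width={width},height={height}"
    if pixelformat:
        fmt += f",pixelformat={pixelformat}"   # e.g. "MJPG" or "YUYV"
    return ["v4l2-ctl", f"--device={device}", f"--set-fmt-video={fmt}"]


for cmd in (v4l2_list_formats_cmd(), v4l2_set_format_cmd()):
    print(shlex.join(cmd))
# To apply the settings, pass each list to subprocess.run(cmd, check=True).
```

Running the listing command first shows which sizes the camera can actually deliver; a driver asked for an unsupported size may pick a different one.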
Learning how to access a webcam from the command line was a lot of fun. The most surprising thing for me was that none of this is documented in one place.