OpenCV Camera Calibration from Google Colab

Mar 31, 2023 · 5 min read

Updated: Apr 12, 2023

Our previous post on camera calibration directly from a video stream using OpenCV is outdated. In a future post, we will develop a Colab notebook to calculate the distance between previously detected fiducial markers, and for that we first need to calibrate the camera performing the detection. This post provides a method for obtaining the camera matrix and distortion coefficients directly with OpenCV (the technique is in this link) from Google Colab.

The code for this demonstration is heavily adapted from our previous post on ArUco marker detection and is not necessarily optimal. We will need the following libraries:

import base64
import io
import time
import numpy as np
from PIL import Image
import cv2
from string import Template
import pandas as pd
import html
from IPython.display import display, Javascript, clear_output
from google.colab.output import eval_js

We are reusing the same section of JavaScript code (adapted from this external post), which streams our webcam over the internet to the Colab server:

# The start_input function code is adapted from:
# https://towardsdatascience.com/yolov3-pytorch-streaming-on-google-colab-16f4fe72e7b
def start_input(video_width=512, video_height=512):
  js_script = Template('''
    var video;
    var div = null;
    var stream;
    var captureCanvas;
    var imgElement;
    var labelElement;
    
    var pendingResolve = null;
    var shutdown = false;
    
    function removeDom() {
       stream.getVideoTracks()[0].stop();
       video.remove();
       div.remove();
       video = null;
       div = null;
       stream = null;
       imgElement = null;
       captureCanvas = null;
       labelElement = null;
    }
    
    function onAnimationFrame() {
      if (!shutdown) {
        window.requestAnimationFrame(onAnimationFrame);
      }
      if (pendingResolve) {
        var result = "";
        if (!shutdown) {
          captureCanvas.getContext('2d').drawImage(video, 0, 0, $video_width, $video_height);
          result = captureCanvas.toDataURL('image/jpeg', 0.8)
        }
        var lp = pendingResolve;
        pendingResolve = null;
        lp(result);
      }
    }
    
    async function createDom() {
      if (div !== null) {
        return stream;
      }
      div = document.createElement('div');
      div.style.border = '2px solid black';
      div.style.padding = '3px';
      div.style.width = '100%';
      div.style.maxWidth = '600px';
      document.body.appendChild(div);
      
      const modelOut = document.createElement('div');
      modelOut.innerHTML = "<span>Status:</span>";
      labelElement = document.createElement('span');
      labelElement.innerText = 'No data';
      labelElement.style.fontWeight = 'bold';
      modelOut.appendChild(labelElement);
      div.appendChild(modelOut);
           
      video = document.createElement('video');
      video.style.display = 'block';
      video.width = div.clientWidth - 6;
      video.setAttribute('playsinline', '');
      video.onclick = () => { shutdown = true; };
      stream = await navigator.mediaDevices.getUserMedia(
          {video: { facingMode: "environment"}});
      div.appendChild(video);
      imgElement = document.createElement('img');
      imgElement.style.position = 'absolute';
      imgElement.style.zIndex = 1;
      imgElement.onclick = () => { shutdown = true; };
      div.appendChild(imgElement);
      
      const instruction = document.createElement('div');
      instruction.innerHTML = 
          '<span style="color: red; font-weight: bold;">' +
          'Click here or on the video window to stop stream.</span>';
      div.appendChild(instruction);
      instruction.onclick = () => { shutdown = true; };
      
      video.srcObject = stream;
      await video.play();
      captureCanvas = document.createElement('canvas');
      captureCanvas.width = $video_width; //video.videoWidth;
      captureCanvas.height = $video_height; //video.videoHeight;
      window.requestAnimationFrame(onAnimationFrame);
      
      return stream;
    }

    async function takePhoto(label, imgData) {
      if (shutdown) {
        removeDom();
        shutdown = false;
        return '';
      }
      var preCreate = Date.now();
      stream = await createDom();
      
      var preShow = Date.now();
      if (label != "") {
        labelElement.innerHTML = label;
      }
            
      if (imgData != "") {
        var videoRect = video.getClientRects()[0];
        imgElement.style.top = videoRect.top + "px";
        imgElement.style.left = videoRect.left + "px";
        imgElement.style.width = videoRect.width + "px";
        imgElement.style.height = videoRect.height + "px";
        imgElement.src = imgData;
      }
      
      var preCapture = Date.now();
      var result = await new Promise(function(resolve, reject) {
        pendingResolve = resolve;
      });
      
      shutdown = false;
      
      return {'create': preShow - preCreate, 
              'show': preCapture - preShow, 
              'capture': Date.now() - preCapture,
              'img': result};
    }
    ''')
    
  js = Javascript(js_script.substitute(video_width=video_width,
                                     video_height=video_height))

  display(js)

We are also reusing the image capture, modification, and broadcasting code. The one change is in the get_drawing_array function, which now calls a calibration function instead of a marker detection function:

def take_photo(label, img_data):
  js_function = Template('takePhoto("$label", "$img_data")')
  data = eval_js(js_function.substitute(label=label, img_data=img_data))
  return data

def js_reply_to_image(js_reply):
  jpeg_bytes = base64.b64decode(js_reply['img'].split(',')[1])
  image_PIL = Image.open(io.BytesIO(jpeg_bytes))
  image_array = np.array(image_PIL)
  return image_array
  
def get_drawing_array(image_array, video_width=512, video_height=512):    
  # NumPy image arrays are indexed (rows, columns), i.e. (height, width):
  drawing_array = np.zeros([video_height, video_width, 4], dtype=np.uint8)
  # The drawing_array is assigned from calibrate now:
  drawing_array, results = calibrate(image_array, drawing_array)
  drawing_array[:, :, 3] = (drawing_array.max(axis=2) > 0 ).astype(int)*255
  return drawing_array, results

def drawing_array_to_bytes(drawing_array):
  drawing_PIL = Image.fromarray(drawing_array, 'RGBA')
  iobuf = io.BytesIO()
  drawing_PIL.save(iobuf, format='png')
  var_js = str(base64.b64encode(iobuf.getvalue()), 'utf-8')
  fixed_js = Template('data:image/png;base64, $var_js')
  drawing_bytes = fixed_js.substitute(var_js=var_js)
  return drawing_bytes
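
The alpha-channel line in get_drawing_array is what lets the overlay sit on top of the live video: every pixel that has something drawn on it becomes opaque, and everything else stays transparent. A minimal sketch of that step on a tiny array, assuming only NumPy (the 2x2 overlay and its values are made up for illustration, and the mask here looks at the colour channels only):

```python
import numpy as np

# Hypothetical 2x2 RGBA overlay, initially fully transparent black.
overlay = np.zeros((2, 2, 4), dtype=np.uint8)

# Pretend something green was drawn on the top-left pixel only.
overlay[0, 0, :3] = (0, 255, 0)

# Same idea as in get_drawing_array: a pixel is opaque (alpha = 255)
# if any of its colour channels is non-zero, otherwise transparent.
drawn = overlay[:, :, :3].max(axis=2) > 0
overlay[:, :, 3] = drawn.astype(np.uint8) * 255

alpha = overlay[:, :, 3]
```

Only the drawn pixel ends up with a non-zero alpha, so the browser shows the camera feed everywhere else.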

The constants and empty lists needed to perform the calibration are declared next. Lowering MIN_POINTS means fewer detections are required, so the calibration finishes sooner, but the results may be worse:

# Number of inner corners per row and per column of the checkerboard
# (a standard 8x8 chessboard has 7x7 inner corners):
CHECKERBOARD = (7, 7)
MIN_POINTS = 15

# Stop the iteration when specified
# accuracy, epsilon, is reached or
# specified number of iterations are completed.
criteria = (cv2.TERM_CRITERIA_EPS +
            cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# Vector for the 3D points:
threedpoints = []

# Vector for 2D points:
twodpoints = [] 

# 3D points real world coordinates:
objectp3d = np.zeros((1, CHECKERBOARD[0]
                      * CHECKERBOARD[1],
                      3), np.float32)

objectp3d[0, :, :2] = np.mgrid[0:CHECKERBOARD[0],
                              0:CHECKERBOARD[1]].T.reshape(-1, 2)
prev_img_shape = None
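
To make the layout of objectp3d easier to see, here is a small sketch that builds the same kind of grid for a tiny board. The helper name make_object_points and the square_size parameter are assumptions added for illustration; the code above effectively uses a square size of 1, which is enough for the camera matrix and distortion coefficients because they do not depend on the physical scale of the board.

```python
import numpy as np

def make_object_points(pattern_size, square_size=1.0):
    """Return the (1, N, 3) array of board corner coordinates, with Z = 0."""
    cols, rows = pattern_size
    points = np.zeros((1, cols * rows, 3), np.float32)
    # Same mgrid/transpose/reshape idiom as above: x varies fastest.
    grid = np.mgrid[0:cols, 0:rows].T.reshape(-1, 2)
    points[0, :, :2] = grid * square_size
    return points

# Hypothetical board with 3x2 inner corners and 25 mm squares.
demo = make_object_points((3, 2), square_size=25.0)
```

The rows of demo[0] run along the first board axis and then step to the next line of corners, all lying in the Z = 0 plane.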

Our calibration function takes each image that arrives from the camera over the internet, tries to detect the chessboard pattern in it (which can take a while), and, once enough detections have been collected, calculates the distortion the camera introduces in lines that should otherwise be straight:

def calibrate(image, output_image):

  matrix, distortion = None, None
  # The array comes from PIL, so its channels are in RGB order:
  grayColor = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

  progress_message = "Points: "+str(len(twodpoints))+" of "+ str(MIN_POINTS)
  cv2.putText(output_image, progress_message,
          (10, 30), cv2.FONT_HERSHEY_DUPLEX,
          1, (0, 255, 0), 2)

  # Find the chess board corners
  # if desired number of corners are
  # found in the image then ret = true:
  ret, corners = cv2.findChessboardCorners(
                  grayColor, CHECKERBOARD,
                  cv2.CALIB_CB_ADAPTIVE_THRESH
                  + cv2.CALIB_CB_FAST_CHECK +
                  cv2.CALIB_CB_NORMALIZE_IMAGE)

  # If desired number of corners can be detected then,
  # refine the pixel coordinates and display
  # them on the images of checker board
  if ret:
      threedpoints.append(objectp3d)

      # Refining pixel coordinates
      # for given 2d points.
      # winSize is half the side length of the search window,
      # not the board size:
      corners2 = cv2.cornerSubPix(
          grayColor, corners, (11, 11), (-1, -1), criteria)

      twodpoints.append(corners2)
      
      
      output_image = cv2.drawChessboardCorners(output_image, 
                                          CHECKERBOARD,
                                          corners2, ret)
      
      # When we have the minimum number of data points, calibrate:
      if len(twodpoints) >= MIN_POINTS:

        # Perform camera calibration by
        # passing the value of above found out 3D points (threedpoints)
        # and its corresponding pixel coordinates of the
        # detected corners (twodpoints):
        ret, matrix, distortion, r_vecs, t_vecs = cv2.calibrateCamera(
            threedpoints, twodpoints, grayColor.shape[::-1], None, None)       

  return output_image, (ret, matrix, distortion)
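
To give a feel for what the returned matrix and distortion values mean, the sketch below projects a 3D point in camera coordinates to pixel coordinates with the pinhole model plus the radial and tangential terms, assuming the five-coefficient order (k1, k2, p1, p2, k3) that cv2.calibrateCamera returns by default. The function name project_point and all numeric values are made up for illustration; this is a plain NumPy illustration of the model, not the OpenCV implementation.

```python
import numpy as np

def project_point(point_3d, camera_matrix, dist_coeffs):
    """Project a camera-frame 3D point to pixels with lens distortion."""
    k1, k2, p1, p2, k3 = dist_coeffs
    X, Y, Z = point_3d
    # Normalised image coordinates (pinhole projection).
    x, y = X / Z, Y / Z
    r2 = x * x + y * y
    # Radial term bends straight lines; tangential terms model lens tilt.
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    fx, fy = camera_matrix[0, 0], camera_matrix[1, 1]
    cx, cy = camera_matrix[0, 2], camera_matrix[1, 2]
    return np.array([fx * x_d + cx, fy * y_d + cy])

# Hypothetical intrinsics for a 512x512 image.
K = np.array([[500.0, 0.0, 256.0],
              [0.0, 500.0, 256.0],
              [0.0, 0.0, 1.0]])
point = (0.1, 0.0, 1.0)

ideal = project_point(point, K, (0, 0, 0, 0, 0))
barrel = project_point(point, K, (-0.2, 0, 0, 0, 0))
```

With a negative k1 the projected point is pulled slightly towards the image centre, which is the barrel distortion that makes straight edges look curved.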

Our main loop will stream the camera feed and stop when the minimum number of required points has been detected:

start_input()
label_html = 'Capturing Webcam Stream.'
img_data = ''

while True:
    js_reply = take_photo(label_html, img_data)
    if not js_reply:
        break

    image = js_reply_to_image(js_reply)
    drawing_array, results = get_drawing_array(image)

    # When the calibration is done, we gather the results and stop:
    if results[1] is not None and results[2] is not None:
      clear_output(wait=False)
      print("Camera Matrix:")
      # Turn the matrices into Pandas dataframes and then save as csv.
      # Of course these can be saved in any other format you prefer:
      cm = pd.DataFrame(results[1])
      print(cm)
      cm.to_csv(r'cm.csv')
      print("\nCamera Distortion:")
      cd = pd.DataFrame(results[2])
      cd.to_csv(r'cd.csv')
      print(cd)
      break 

    drawing_bytes = drawing_array_to_bytes(drawing_array)
    img_data = drawing_bytes
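
Because the results are written with pandas, each CSV file carries a header row and an index column. A minimal sketch of reading them back into plain NumPy arrays follows; the helper name load_pandas_matrix and the sample values are assumptions for illustration, and in-memory text stands in for the cm.csv and cd.csv files.

```python
import io
import numpy as np

def load_pandas_matrix(source):
    """Load a matrix saved with DataFrame.to_csv(), dropping header and index."""
    table = np.loadtxt(source, delimiter=",", skiprows=1, ndmin=2)
    return table[:, 1:]

# Made-up file contents in the layout pandas writes by default.
cm_text = ",0,1,2\n0,500.0,0.0,256.0\n1,0.0,500.0,256.0\n2,0.0,0.0,1.0\n"
cd_text = ",0,1,2,3,4\n0,-0.2,0.05,0.001,0.002,0.0\n"

camera_matrix = load_pandas_matrix(io.StringIO(cm_text))
dist_coeffs = load_pandas_matrix(io.StringIO(cd_text))
```

In the notebook the same helper could be pointed at 'cm.csv' and 'cd.csv', and the two arrays could then be passed to a call such as cv2.undistort(image, camera_matrix, dist_coeffs).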

A chessboard pattern is needed to perform the calibration. We made one using toy construction bricks, but its squares do not form perfectly aligned lines, so the resulting calibration may be of low quality. Alternatively, a flat image of a chessboard can be used; the flatter and straighter the pattern is, the better the camera calibration will be. Effectively, these chessboards act as our calibration tools. The accompanying videos show the calibration being performed with MIN_POINTS set to 15.
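
For the flat-image option, a chessboard can be generated rather than photographed. The sketch below builds one as a NumPy array; the function name make_chessboard and the pixel sizes are assumptions for illustration. An 8x8 board of squares has 7x7 inner corners, matching CHECKERBOARD above, and the white margin is there because the OpenCV detector expects some white space around the board.

```python
import numpy as np

def make_chessboard(squares=(8, 8), square_px=50, margin_px=50):
    """Return a grayscale chessboard image with a white margin."""
    cols, rows = squares
    # For every pixel, the index of the square it falls in.
    yy, xx = np.indices((rows * square_px, cols * square_px)) // square_px
    # Alternate black (0) and white (255) squares.
    board = np.where((xx + yy) % 2 == 0, 0, 255).astype(np.uint8)
    return np.pad(board, margin_px, constant_values=255)

board = make_chessboard()
inner_corners = (8 - 1, 8 - 1)
```

The array could be saved with, for example, cv2.imwrite and then printed or shown on a flat screen.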

We will put this calibration to the test by calculating the distance between two markers in a follow-up post. Do not hesitate to contact us if you require quantitative model development, deployment, verification, or validation. We will also be glad to help you with your machine learning or artificial intelligence challenges when applied to asset management, automation, or intelligence gathering from satellite, drone, or fixed-point imagery.

The notebook to fully replicate this demonstration is here.

OSTIRION
