
Advanced Computer Vision

Processing Each Image

In this chapter, the first thing you’ll do is compute the camera calibration matrix and distortion coefficients. You only need to compute these once, and then you can apply them to undistort each new frame. Next, you’ll apply thresholds to create a binary image and then apply a perspective transform.


You’ll want to try out various combinations of color and gradient thresholds to generate a binary image where the lane lines are clearly visible. There’s more than one way to achieve a good result, but for example, given the image above, the output you’re going for should look something like this:
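In code, one possible combination of a gradient threshold and a color threshold might be sketched as follows. This is only an illustration using NumPy: `gray` and `s_channel` are assumed to be a grayscale image and a saturation channel scaled to 0-255, the threshold ranges are hypothetical starting values, and `np.gradient` stands in for a Sobel filter.

```python
import numpy as np

def combined_binary(gray, s_channel, grad_thresh=(20, 100), s_thresh=(170, 255)):
    """Return a 0/1 image that is active where either threshold fires."""
    # Horizontal intensity change; near-vertical lane edges respond strongly.
    # np.gradient is a stand-in here for a Sobel filter.
    grad_x = np.abs(np.gradient(gray.astype(float), axis=1))
    # Rescale to 0-255 so the same thresholds can be reused across images
    max_grad = grad_x.max()
    if max_grad > 0:
        grad_x = 255 * grad_x / max_grad
    grad_binary = (grad_x >= grad_thresh[0]) & (grad_x <= grad_thresh[1])
    color_binary = (s_channel >= s_thresh[0]) & (s_channel <= s_thresh[1])
    # A pixel is activated if it passes the gradient OR the color test
    return (grad_binary | color_binary).astype(np.uint8)

# Tiny synthetic example: a bright, saturated vertical stripe in columns 4-5
gray = np.zeros((6, 10))
gray[:, 4:6] = 200
s_channel = np.zeros((6, 10))
s_channel[:, 4:6] = 220
binary = combined_binary(gray, s_channel)
```

Combining the two masks with OR is a design choice: the color threshold tends to catch solid painted lines, while the gradient threshold can pick up edges the color test misses. Other combinations (AND, or several channels) are just as valid to try.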

Perspective Transform

Next, you want to identify four source points for your perspective transform. In this case, you can assume the road is a flat plane. This isn’t strictly true, but it can serve as an approximation for this project. You would like to pick four points in a trapezoidal shape (similar to region masking) that would represent a rectangle when looking down on the road from above.

The easiest way to do this is to investigate an image where the lane lines are straight, and find four points lying along the lines that, after perspective transform, make the lines look straight and vertical from a bird’s eye view perspective.
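To make the role of the four points concrete, here is a minimal NumPy sketch of the math behind a four-point perspective transform. In a real pipeline you would more likely rely on OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective`; the helper names and the source and destination coordinates below are hypothetical values chosen for a 1280x720 frame.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 homography H that maps each src point to its dst point."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per point pair, with H[2, 2] fixed to 1
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, point):
    """Apply H to a single (x, y) point using homogeneous coordinates."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return x / w, y / w

# Hypothetical trapezoid on the road and the rectangle it should become
src = [(580, 460), (700, 460), (1040, 680), (260, 680)]
dst = [(260, 0), (1040, 0), (1040, 720), (260, 720)]
H = perspective_matrix(src, dst)
```

Calling `warp_point(H, src[0])` should return a point very close to `dst[0]`. Solving with the points swapped, `perspective_matrix(dst, src)`, gives the inverse transform, which is handy for projecting results back onto the original image.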

Here’s an example of the result you are going for with straight lane lines:

Now for curved lines

Those same four source points will now work to transform any image (again, under the assumption that the road is flat and the camera perspective hasn’t changed). When applying the transform to new images, the test of whether you got the transform correct is that the lane lines appear parallel in the warped images, whether they are straight or curved.

Here’s an example of applying a perspective transform to your thresholded binary image, using the same source and destination points as above, showing that the curved lines are (more or less) parallel in the transformed image:

Locate the Lane Lines

Thresholded and perspective transformed image

You now have a thresholded warped image and you’re ready to map out the lane lines! There are many ways you could go about this, but here’s one example of how you might do it:

Line Finding Method: Peaks in a Histogram

After applying calibration, thresholding, and a perspective transform to a road image, you should have a binary image where the lane lines stand out clearly. However, you still need to decide explicitly which pixels are part of the lines, and which of those belong to the left line versus the right line.

Plotting a histogram of where the binary activations occur across the image is one potential solution for this. In the quiz below, let’s take a couple quick steps to create our histogram!

import numpy as np
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

# Load our image
# `mpimg.imread` will load .jpg as 0-255, so normalize back to 0-1
img = mpimg.imread('warped_example.jpg')/255
print(img.shape)

def hist(img):
    # TO-DO: Grab only the bottom half of the image
    # Lane lines are likely to be mostly vertical nearest to the car
    bottom_half = img[img.shape[0]//2:,:]
    print(bottom_half)

    # TO-DO: Sum across image pixels vertically - make sure to set `axis`
    # i.e. the highest areas of vertical lines should be larger values
    histogram = np.sum(bottom_half, axis=0)

    return histogram

# Create histogram of image binary activations
histogram = hist(img)

# Visualize the resulting histogram
plt.plot(histogram)

Here’s the approach I took.

I take a histogram along all the columns in the lower half of the image like this:

import numpy as np
import matplotlib.pyplot as plt

histogram = np.sum(img[img.shape[0]//2:,:], axis=0)

The result looks like this:

Sliding Window

With this histogram we are adding up the pixel values along each column in the image. In our thresholded binary image, pixels are either 0 or 1, so the two most prominent peaks in this histogram will be good indicators of the x-position of the base of the lane lines. We can use that as a starting point for where to search for the lines. From that point, we can use a sliding window, placed around the line centers, to find and follow the lines up to the top of the frame.

Implement Sliding Windows and Fit a Polynomial

As shown in the previous animation, we can use the two highest peaks from our histogram as a starting point for determining where the lane lines are, and then use sliding windows moving upward in the image (further along the road) to determine where the lane lines go.

Split the histogram for the two lines

The first step we’ll take is to split the histogram into two sides, one for each lane line.

import numpy as np
import cv2
import matplotlib.pyplot as plt

# Assuming you have created a warped binary image called "binary_warped"
# Take a histogram of the bottom half of the image
histogram = np.sum(binary_warped[binary_warped.shape[0]//2:,:], axis=0)
# Create an output image to draw on and visualize the result
out_img = np.dstack((binary_warped, binary_warped, binary_warped))*255
# Find the peak of the left and right halves of the histogram
# These will be the starting point for the left and right lines
midpoint = int(histogram.shape[0]//2)
leftx_base = np.argmax(histogram[:midpoint])
rightx_base = np.argmax(histogram[midpoint:]) + midpoint

Note that in the above, we also create out_img to help with visualizing our output later on.

Set up windows and window hyperparameters

Our next step is to set a few hyperparameters related to our sliding windows, and set them up to iterate across the binary activations in the image. We have some base hyperparameters below, but don’t forget to try out different values in your own implementation to see what works best!

# Choose the number of sliding windows
nwindows = 9
# Set the width of the windows +/- margin
margin = 100
# Set minimum number of pixels found to recenter window
minpix = 50

# Set height of windows - based on nwindows above and image shape
window_height = int(binary_warped.shape[0]//nwindows)
# Identify the x and y positions of all nonzero (i.e. activated) pixels in the image
nonzero = binary_warped.nonzero()
nonzeroy = np.array(nonzero[0])
nonzerox = np.array(nonzero[1])
# Current positions to be updated later for each window in nwindows
leftx_current = leftx_base
rightx_current = rightx_base

# Create empty lists to receive left and right lane pixel indices
left_lane_inds = []
right_lane_inds = []

Iterate through nwindows to track curvature

Now that we’ve set up what the windows look like and have a starting point, we’ll want to loop for nwindows, with the given window sliding left or right if it finds the mean position of activated pixels within the window to have shifted.

You’ll implement this part in the quiz below, but here are a few steps to get you started:

  1. Loop through each window in nwindows
  2. Find the boundaries of our current window. This is based on a combination of the current window’s starting point (leftx_current and rightx_current), as well as the margin you set in the hyperparameters.
  3. Use cv2.rectangle to draw these window boundaries onto our visualization image out_img. This is required for the quiz, but you can skip this step in practice if you don’t need to visualize where the windows are.
  4. Now that we know the boundaries of our window, find out which activated pixels from nonzeroy and nonzerox above actually fall into the window.
  5. Append these to our lists left_lane_inds and right_lane_inds.
  6. If the number of pixels you found in Step 4 is greater than your hyperparameter minpix, re-center our window (i.e. leftx_current or rightx_current) based on the mean position of these pixels.
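The steps above can be sketched roughly as follows. This is one possible implementation, not the quiz solution: the function name `find_lane_pixels` is my own, the drawing in Step 3 is left out so that the sketch needs only NumPy, and the synthetic test image and the `margin` and `minpix` values used with it are made up for illustration.

```python
import numpy as np

def find_lane_pixels(binary_warped, nwindows=9, margin=100, minpix=50):
    """Collect the pixel coordinates of the left and right lane lines."""
    height = binary_warped.shape[0]
    # Starting x positions: histogram peaks in the bottom half of the image
    histogram = np.sum(binary_warped[height // 2:, :], axis=0)
    midpoint = histogram.shape[0] // 2
    leftx_current = int(np.argmax(histogram[:midpoint]))
    rightx_current = int(np.argmax(histogram[midpoint:])) + midpoint

    window_height = height // nwindows
    nonzeroy, nonzerox = binary_warped.nonzero()
    left_lane_inds, right_lane_inds = [], []

    for window in range(nwindows):  # Step 1
        # Step 2: window boundaries, starting at the bottom and moving up
        win_y_low = height - (window + 1) * window_height
        win_y_high = height - window * window_height
        in_rows = (nonzeroy >= win_y_low) & (nonzeroy < win_y_high)

        # Step 4: indices (into nonzerox/nonzeroy) of pixels inside each window
        good_left = np.flatnonzero(
            in_rows & (nonzerox >= leftx_current - margin)
            & (nonzerox < leftx_current + margin))
        good_right = np.flatnonzero(
            in_rows & (nonzerox >= rightx_current - margin)
            & (nonzerox < rightx_current + margin))

        # Step 5: remember the indices found in this window
        left_lane_inds.append(good_left)
        right_lane_inds.append(good_right)

        # Step 6: re-center the next window on the mean x position
        if len(good_left) > minpix:
            leftx_current = int(np.mean(nonzerox[good_left]))
        if len(good_right) > minpix:
            rightx_current = int(np.mean(nonzerox[good_right]))

    left_lane_inds = np.concatenate(left_lane_inds)
    right_lane_inds = np.concatenate(right_lane_inds)
    return (nonzerox[left_lane_inds], nonzeroy[left_lane_inds],
            nonzerox[right_lane_inds], nonzeroy[right_lane_inds])

# Synthetic 90x200 image: a left line drifting right as it goes up,
# and a straight right line
binary_warped = np.zeros((90, 200), dtype=np.uint8)
for y in range(90):
    x_left = 50 + (89 - y) // 10
    binary_warped[y, x_left:x_left + 3] = 1
    binary_warped[y, 150:153] = 1

leftx, lefty, rightx, righty = find_lane_pixels(binary_warped, margin=20, minpix=5)
```

Note that with integer division for `window_height`, any leftover rows at the very top of the image are not covered by a window; for a 720-pixel-high frame and 9 windows this does not arise.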

Fit a polynomial

Now that we have found all our pixels belonging to each line through the sliding window method, it’s time to fit a polynomial to the line. First, we have a couple small steps to ready our pixels.

# Concatenate the arrays of indices (previously was a list of lists of pixels)
left_lane_inds = np.concatenate(left_lane_inds)
right_lane_inds = np.concatenate(right_lane_inds)

# Extract left and right line pixel positions
leftx = nonzerox[left_lane_inds]
lefty = nonzeroy[left_lane_inds] 
rightx = nonzerox[right_lane_inds]
righty = nonzeroy[right_lane_inds]

We’ll let you implement the function for the polynomial in the quiz below using np.polyfit.
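As a hint of what that might look like, here is a minimal sketch. The function name `fit_polynomial` and the synthetic pixel coordinates are assumptions for illustration; the only library call is `np.polyfit`, which returns coefficients from the highest power down.

```python
import numpy as np

def fit_polynomial(leftx, lefty, rightx, righty):
    """Fit x = A*y**2 + B*y + C to each line; returns (left_fit, right_fit)."""
    # y is the independent variable here, so it goes first in np.polyfit
    left_fit = np.polyfit(lefty, leftx, 2)
    right_fit = np.polyfit(righty, rightx, 2)
    return left_fit, right_fit

# Synthetic pixels lying exactly on two known parabolas
y = np.arange(0, 720, 10, dtype=float)
left_fit, right_fit = fit_polynomial(
    2e-4 * y**2 - 0.3 * y + 400, y,
    2e-4 * y**2 - 0.3 * y + 1000, y,
)
```

On real data it is worth guarding against empty pixel arrays (for example, when one line is not detected in a frame), since `np.polyfit` cannot fit a curve to no points.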

# Assuming we have `left_fit` and `right_fit` from `np.polyfit` before
# Generate x and y values for plotting
ploty = np.linspace(0, binary_warped.shape[0]-1, binary_warped.shape[0])
left_fitx = left_fit[0]*ploty**2 + left_fit[1]*ploty + left_fit[2]
right_fitx = right_fit[0]*ploty**2 + right_fit[1]*ploty + right_fit[2]

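Take note of how we fit the lines above: while normally you calculate a y-value for a given x, here we do the opposite. Why? Because we expect our lane lines to be (mostly) vertically oriented, so a single x-value can correspond to several y-values, whereas each y-value has only one x-value per line.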

Figure: The ‘S’ channel, or Saturation, with binary activation
Figure: A few more thresholds (left) for activation, with the resulting perspective transformation
Figure: Sliding windows and a decent-looking result


My name is Truong Thanh. I graduated with a Master of Information Technology and Artificial Intelligence from Frankfurt University, Germany. I created this blog to share my experiences of life, study, and travel with friends who have the same hobbies.
