Posted in Self-Driving Car

## Canny Edge Detection in Action

Now that you have a conceptual grasp on how the Canny algorithm works, it’s time to use it to find the edges of the lane lines in an image of the road. So let’s give that a try.

First, we need to read in an image:

```import matplotlib.pyplot as plt
import matplotlib.image as mpimg
plt.imshow(image)

```

Here we have an image of the road, and it’s fairly obvious by eye where the lane lines are, but what about using computer vision?

Let’s go ahead and convert to grayscale.

```import cv2  #bringing in OpenCV libraries
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY) #grayscale conversion
plt.imshow(gray, cmap='gray')

```

Let’s try our Canny edge detector on this image. This is where OpenCV gets useful. First, we’ll have a look at the parameters for the OpenCV `Canny` function. You will call it like this:

```edges = cv2.Canny(gray, low_threshold, high_threshold)

```

In this case, you are applying `Canny` to the image `gray` and your output will be another image called `edges``low_threshold` and `high_threshold` are your thresholds for edge detection.

The algorithm will first detect strong edge (strong gradient) pixels above the `high_threshold`, and reject pixels below the `low_threshold`. Next, pixels with values between the `low_threshold` and `high_threshold` will be included as long as they are connected to strong edges. The output `edges` is a binary image with white pixels tracing out the detected edges and black everywhere else. See the OpenCV Canny Docs for more details.

What would make sense as a reasonable range for these parameters? In our case, converting to grayscale has left us with an 8-bit image, so each pixel can take 2^8 = 256 possible values. Hence, the pixel values range from 0 to 255.

This range implies that derivatives (essentially, the value differences from pixel to pixel) will be on the scale of tens or hundreds. So, a reasonable range for your threshold parameters would also be in the tens to hundreds.

As far as a ratio of `low_threshold` to `high_threshold`John Canny himself recommended a low to high ratio of 1:2 or 1:3.

We’ll also include Gaussian smoothing, before running `Canny`, which is essentially a way of suppressing noise and spurious gradients by averaging (check out the OpenCV docs for GaussianBlur). `cv2.Canny()` actually applies Gaussian smoothing internally, but we include it here because you can get a different result by applying further smoothing (and it’s not a changeable parameter within `cv2.Canny()`!).

You can choose the `kernel_size` for Gaussian smoothing to be any odd number. A larger `kernel_size` implies averaging, or smoothing, over a larger area. The example in the previous lesson was `kernel_size = 3`.

Note: If this is all sounding complicated and new to you, don’t worry! We’re moving pretty fast through the material here, because for now we just want you to be able to use these tools. If you would like to dive into the math underpinning these functions, please check out the free Udacity course, Intro to Computer Vision, where the third lesson covers Gaussian filters and the sixth and seventh lessons cover edge detection.

```#doing all the relevant imports
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import cv2

# Read in the image and convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# Define a kernel size for Gaussian smoothing / blurring
# Note: this step is optional as cv2.Canny() applies a 5x5 Gaussian internally
kernel_size = 3
blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size), 0)

# Define parameters for Canny and run it
# NOTE: if you try running this code you might want to change these!
low_threshold = 1
high_threshold = 10
edges = cv2.Canny(blur_gray, low_threshold, high_threshold)

# Display the image
plt.imshow(edges, cmap='Greys_r')

```

Here I’ve called the OpenCV function `Canny` on a Gaussian-smoothed grayscaled image called `blur_gray` and detected edges with thresholds on the gradient of `high_threshold`, and `low_threshold`.

In the next quiz you’ll get to try this on your own and mess around with the parameters for the Gaussian smoothing and Canny Edge Detection to optimize for detecting the lane lines and not a lot of other stuff.

Posted in Self-Driving Car

In the last exercise, we alredy know how select the color of the lane on highway, this is very important for the camera of self-driving car. In this case, I’ll assume that the front facing camera that took the image is mounted in a fixed position on the car, such that the lane lines will always appear in the same general region of the image. Next, I’ll take advantage of this by adding a criterion to only consider pixels for color selection in the region where we expect to find the lane lines.

Check out the code below. The variables `left_bottom``right_bottom`, and `apex` represent the vertices of a triangular region that I would like to retain for my color selection, while masking everything else out. Here I’m using a triangular mask to illustrate the simplest case, but later you’ll use a quadrilateral, and in principle, you could use any polygon.

```import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np

# Read in the image and print some stats
print('This image is: ', type(image),
'with dimensions:', image.shape)

# Pull out the x and y sizes and make a copy of the image
ysize = image.shape[0]
xsize = image.shape[1]
region_select = np.copy(image)

# Define a triangle region of interest
# Keep in mind the origin (x=0, y=0) is in the upper left in image processing
# Note: if you run this code, you'll find these are not sensible values!!
# But you'll get a chance to play with them soon in a quiz
left_bottom = [0, 539]
right_bottom = [900, 300]
apex = [400, 0]

# Fit lines (y=Ax+B) to identify the  3 sided region of interest
# np.polyfit() returns the coefficients [A, B] of the fit
fit_left = np.polyfit((left_bottom[0], apex[0]), (left_bottom[1], apex[1]), 1)
fit_right = np.polyfit((right_bottom[0], apex[0]), (right_bottom[1], apex[1]), 1)
fit_bottom = np.polyfit((left_bottom[0], right_bottom[0]), (left_bottom[1], right_bottom[1]), 1)

# Find the region inside the lines
XX, YY = np.meshgrid(np.arange(0, xsize), np.arange(0, ysize))
region_thresholds = (YY > (XX*fit_left[0] + fit_left[1])) & \
(YY > (XX*fit_right[0] + fit_right[1])) & \
(YY < (XX*fit_bottom[0] + fit_bottom[1]))

# Color pixels red which are inside the region of interest
region_select[region_thresholds] = [255, 0, 0]

# Display the image
plt.imshow(region_select)

# uncomment if plot does not display
# plt.show()
```

# Combining Color and Region Selections

Now you’ve seen how to mask out a region of interest in an image. Next, let’s combine the mask and color selection to pull only the lane lines out of the image.

Check out the code below. Here we’re doing both the color and region selection steps, requiring that a pixel meet both the mask and color selection requirements to be retained.

```import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np

# Grab the x and y sizes and make two copies of the image
# With one copy we'll extract only the pixels that meet our selection,
# then we'll paint those pixels red in the original image to see our selection
# overlaid on the original.
ysize = image.shape[0]
xsize = image.shape[1]
color_select= np.copy(image)
line_image = np.copy(image)

# Define our color criteria
red_threshold = 0
green_threshold = 0
blue_threshold = 0
rgb_threshold = [red_threshold, green_threshold, blue_threshold]

# Define a triangle region of interest (Note: if you run this code,
# Keep in mind the origin (x=0, y=0) is in the upper left in image processing
# you'll find these are not sensible values!!
# But you'll get a chance to play with them soon in a quiz 😉
left_bottom = [0, 539]
right_bottom = [900, 300]
apex = [400, 0]

fit_left = np.polyfit((left_bottom[0], apex[0]), (left_bottom[1], apex[1]), 1)
fit_right = np.polyfit((right_bottom[0], apex[0]), (right_bottom[1], apex[1]), 1)
fit_bottom = np.polyfit((left_bottom[0], right_bottom[0]), (left_bottom[1], right_bottom[1]), 1)

# Mask pixels below the threshold
color_thresholds = (image[:,:,0] < rgb_threshold[0]) | \
(image[:,:,1] < rgb_threshold[1]) | \
(image[:,:,2] < rgb_threshold[2])

# Find the region inside the lines
XX, YY = np.meshgrid(np.arange(0, xsize), np.arange(0, ysize))
region_thresholds = (YY > (XX*fit_left[0] + fit_left[1])) & \
(YY > (XX*fit_right[0] + fit_right[1])) & \
(YY < (XX*fit_bottom[0] + fit_bottom[1]))
color_select[color_thresholds] = [0,0,0]
# Find where image is both colored right and in the region
line_image[~color_thresholds & region_thresholds] = [255,0,0]

# Display our two output images
plt.imshow(color_select)
plt.imshow(line_image)

# uncomment if plot does not display
# plt.show()

```

In the next quiz, you can vary your color selection and the shape of your region mask (vertices of a triangle `left_bottom``right_bottom`, and `apex`), such that you pick out the lane lines and nothing else.

After combine region-making and color-classification:

```import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np

# Grab the x and y size and make a copy of the image
ysize = image.shape[0]
xsize = image.shape[1]
color_select = np.copy(image)
line_image = np.copy(image)

# Define color selection criteria
# MODIFY THESE VARIABLES TO MAKE YOUR COLOR SELECTION
red_threshold = 200
green_threshold = 200
blue_threshold = 200

rgb_threshold = [red_threshold, green_threshold, blue_threshold]

# Define the vertices of a triangular mask.
# Keep in mind the origin (x=0, y=0) is in the upper left
# MODIFY THESE VALUES TO ISOLATE THE REGION
# WHERE THE LANE LINES ARE IN THE IMAGE
left_bottom = [115, 540]
right_bottom = [800, 540]
apex = [455, 300]

# Perform a linear fit (y=Ax+B) to each of the three sides of the triangle
# np.polyfit returns the coefficients [A, B] of the fit
fit_left = np.polyfit((left_bottom[0], apex[0]), (left_bottom[1], apex[1]), 1)
fit_right = np.polyfit((right_bottom[0], apex[0]), (right_bottom[1], apex[1]), 1)
fit_bottom = np.polyfit((left_bottom[0], right_bottom[0]), (left_bottom[1], right_bottom[1]), 1)

# Mask pixels below the threshold
color_thresholds = (image[:,:,0] < rgb_threshold[0]) | \
(image[:,:,1] < rgb_threshold[1]) | \
(image[:,:,2] < rgb_threshold[2])

# Find the region inside the lines
XX, YY = np.meshgrid(np.arange(0, xsize), np.arange(0, ysize))
region_thresholds = (YY > (XX*fit_left[0] + fit_left[1])) & \
(YY > (XX*fit_right[0] + fit_right[1])) & \
(YY < (XX*fit_bottom[0] + fit_bottom[1]))

# Mask color and region selection
color_select[color_thresholds | ~region_thresholds] = [0, 0, 0]
# Color pixels red where both color and region selections met
line_image[~color_thresholds & region_thresholds] = [255, 0, 0]

# Display the image and show region and color selections
plt.imshow(image)
x = [left_bottom[0], right_bottom[0], apex[0], left_bottom[0]]
y = [left_bottom[1], right_bottom[1], apex[1], left_bottom[1]]
plt.plot(x, y, 'b--', lw=4)
plt.imshow(color_select)
plt.imshow(line_image)

```

Final result we have:

Posted in Self-Driving Car

## Color Selection

Finding Lane Lines on the Road

Which of the following features could be useful in the identification of lane lines on the road?

Answer : Color, shape, orientation, Position of the image.

# Coding up a Color Selection

Let’s code up a simple color selection in Python.

No need to download or install anything, you can just follow along in the browser for now.

We’ll be working with the same image you saw previously.

Check out the code below. First, I import `pyplot` and `image` from `matplotlib`. I also import `numpy` for operating on the image.

```import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np

```

I then read in an image and print out some stats. I’ll grab the x and y sizes and make a copy of the image to work with. NOTE: Always make a copy of arrays or other variables in Python. If instead, you say “a = b” then all changes you make to “a” will be reflected in “b” as well!

```# Read in the image and print out some stats
print('This image is: ',type(image),
'with dimensions:', image.shape)

# Grab the x and y size and make a copy of the image
ysize = image.shape[0]
xsize = image.shape[1]
# Note: always make a copy rather than simply using "="
color_select = np.copy(image)

```

Next I define a color threshold in the variables `red_threshold``green_threshold`, and `blue_threshold` and populate `rgb_threshold` with these values. This vector contains the minimum values for red, green, and blue (R,G,B) that I will allow in my selection.

```# Define our color selection criteria
# Note: if you run this code, you'll find these are not sensible values!!
# But you'll get a chance to play with them soon in a quiz
red_threshold = 0
green_threshold = 0
blue_threshold = 0
rgb_threshold = [red_threshold, green_threshold, blue_threshold]

```

Next, I’ll select any pixels below the threshold and set them to zero.

After that, all pixels that meet my color criterion (those above the threshold) will be retained, and those that do not (below the threshold) will be blacked out.

```# Identify pixels below the threshold
thresholds = (image[:,:,0] < rgb_threshold[0]) \
| (image[:,:,1] < rgb_threshold[1]) \
| (image[:,:,2] < rgb_threshold[2])
color_select[thresholds] = [0,0,0]

# Display the image
plt.imshow(color_select)
plt.show()

```

The result, `color_select`, is an image in which pixels that were above the threshold have been retained, and pixels below the threshold have been blacked out.

In the code snippet above, `red_threshold``green_threshold` and `blue_threshold` are all set to `0`, which implies all pixels will be included in the selection.

In the next quiz, you will modify the values of `red_threshold``green_threshold` and `blue_threshold` until you retain as much of the lane lines as possible while dropping everything else. Your output image should look like the one below.

Posted in Self-Driving Car

## Power of Camera

Let’s code up a simple color selection in Python.

No need to download or install anything, you can just follow along in the browser for now.

We’ll be working with the image below:

Check out the code below. First, I import `pyplot` and `image` from `matplotlib`. I also import `numpy` for operating on the image.

``````import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
``````

I then read in an image and print out some stats. I’ll grab the x and y sizes and make a copy of the image to work with. NOTE: Always make a copy of arrays or other variables in Python. If instead, you say “a = b” then all changes you make to “a” will be reflected in “b” as well!

``````# Read in the image and print out some stats
print('This image is: ',type(image),
'with dimensions:', image.shape)

# Grab the x and y size and make a copy of the image
ysize = image.shape[0]
xsize = image.shape[1]
# Note: always make a copy rather than simply using "="
color_select = np.copy(image)
``````

Next I define a color threshold in the variables `red_threshold``green_threshold`, and `blue_threshold` and populate `rgb_threshold` with these values. This vector contains the minimum values for red, green, and blue (R,G,B) that I will allow in my selection.

``````# Define our color selection criteria
# Note: if you run this code, you'll find these are not sensible values!!
# But you'll get a chance to play with them soon in a quiz
red_threshold = 0
green_threshold = 0
blue_threshold = 0
rgb_threshold = [red_threshold, green_threshold, blue_threshold]
``````

Next, I’ll select any pixels below the threshold and set them to zero.

After that, all pixels that meet my color criterion (those above the threshold) will be retained, and those that do not (below the threshold) will be blacked out.

``````# Identify pixels below the threshold
thresholds = (image[:,:,0] < rgb_threshold[0]) \
| (image[:,:,1] < rgb_threshold[1]) \
| (image[:,:,2] < rgb_threshold[2])
color_select[thresholds] = [0,0,0]

# Display the image
plt.imshow(color_select)
plt.show()
``````

The result, `color_select`, is an image in which pixels that were above the threshold have been retained, and pixels below the threshold have been blacked out.

In the code snippet above, `red_threshold``green_threshold` and `blue_threshold` are all set to `0`, which implies all pixels will be included in the selection.

In the next quiz, you will modify the values of `red_threshold``green_threshold` and `blue_threshold` until you retain as much of the lane lines as possible while dropping everything else. Your output image should look like the one below.

Image after color selection

Form

Posted in Self-Driving Car

## Localization, Path Planning, Control, and System Integration

In a self-driving car car, GPS (Global Positioning Systems) use trilateration to locate our position.

In these measurements, there may be an error from 1 to 10 meters. This error is too important and can potentially be fatal for the passengers or the environment of the autonomous vehicle. We therefore include a step called localization.

Localization is the implementation of algorithms to estimate where is our vehicle with an error of less than 10 cm.

This article follows articles AI … And the vehicle went autonomous and Sensor Fusion.

Localization is a step implemented in the majority of robots and vehicles to locate with a really small margin of error. If we want to make decisions like overtaking a vehicle or simply defining a route, we need to know what’s around us (sensor fusion) and where we are (localization). Only with this information we can define a trajectory.

# How to locate precisely?

There are many different techniques to help an autonomous vehicle locate itself.

• Odometry — This first technique, odometry, uses a starting position and a wheel displacement calculation to estimate a position at a time t. This technique is generally very inaccurate and leads to an accumulation of errors due to measurement inaccuracies, wheel slip, …
• Kalman filter — The previous article evoked this technique to estimate the state of the vehicles around us. We can also implement this to define the state of our own vehicle.
• Particle Filter — The Bayesian filters can also have a variant called particle filters. This technique compares the observations of our sensors with the environmental map. We then create particles around areas where the observations are similar to the map.
• SLAM — A very popular technique if we also want to estimate the map exists. It is called SLAM (Simultaneous Localization And Mapping). In this technique, we estimate our position but also the position of landmarks. A traffic light can be a landmark.

## Sensors

• Inertial Measurement Unit (IMU)is a sensor capable of defining the movement of the vehicle along the yaw, pitch, roll axis. This sensor calculates acceleration along the X, Y, Z axes, orientationinclination, and altitude.
• Global Positioning System (GPS) or NAVSTAR are the US system for positioning. In Europe, we talk about Galileo; in Russia, GLONASS. The term Global Navigation Satellite System (GNSS) is a very common satellite positioning system today that can use many of these subsystems to increase accuracy.

## Vocabulary

• Observation — An observation can be a measurement, an image, an angle …
• Control — This is our movements including our speeds and yaw, pitch, roll values retrieved by the IMU.
• The position of the vehicle — This vector includes the (x, y) coordinates and the orientation θ.
• The map — This is our landmarks, roads … There are several types of maps; companies like Here Technologies produce HD Maps, accurate maps centimeter by centimeter. These cards are produced according to the environment where the autonomous car will be able to drive.

# Kalman Filters

Explained in the previous article, a Kalman filter can estimate the state of a vehicle. As a reminder, this is the implementation of the Bayes Filter, with a prediction phase and an update phase.

Our first estimate is an equal distribution across the area.
We then have a measurement telling us that we are located next to a door.
Our distribution then changes to give a higher probability to areas located near doors.

We then perform a motion, our probabilities are shifted with greater uncertainties.

We take a new measurement, telling us that we are next to a door again.

The only possibility is to be located near the middle door.

In this example of a Kalman filter, we were able to locate ourselves using a few measurements and a comparison with the map. It is essential to know the map (including information only the 1st door has an adjacent door) to make deductions. This technique makes it possible not to use an initial position, which is preferable to the technique using odometry.

# Particle Filters

Particle Filter is another implementation of the Bayes Filter.

In a Particle Filter, we create particles throughout the area defined by the GPS and we assign a weight to each particle.

The weight of a particle represents the probability that our vehicle is at the location of the particle.

Unlike the Kalman filter, we have our probabilities are not continuous values but discrete values, we talk about weights.

## Algorithm

The implementation of the algorithm is according to the following scheme. We distinguish four stages (Initialization, Prediction, Update, Sampling) realized with the help of several data (GPS, IMU, speeds, measurements of the landmarks).

• Initialization — We use an initial estimate from the GPS and add noise (due to sensor inaccuracy) to initialize a chosen number of particlesEach particle has a position (x, y) and an orientation θ.
This gives us a particle distribution throughout the GPS area with equal weights.
• Prediction — Once our particles are initialized, we make a first prediction taking into account our speed and our rotations. In every prediction, our movements will be taken into account. We use equations describing x, y, θ (orientation) to describe the motion of a vehicle.
• Update — In our update phase, we first realize a match between our measurements n and the map m.

We use sensor fusion data to determine surrounding objects and then update our weights with the following equation :

In this equation, for each particle:
– σx and σy are our uncertainties
– x and y are the observations of the landmarks
– μx and μy are the ground truth coordinates of the landmarks coming from the map.

In the case where the error is strong, the exponential term is 0, the weight of our particle is 0 as well. In the case where it is very low, the weight of the particle is 1 standardized by the term 2π.σx.σy.

• Resampling — Finally, we have one last stage where we select the particles with the highest weights and destroy the least likely ones.
The higher the weight, the more likely the particle is to survive.

The cycle is then repeated with the most probable particles, we take into account our displacements since the last computation and realize a prediction then a correction according to our observations.

Particle filters are effective and can locate a vehicle very precisely. For each particle, we compare the measurements made by the particle with the measurements made by the vehicle and calculate a probability or weight. This calculation can makes the filter slow if we have a lot of particles. It also requires having the map of the environment where we drive permanently.

Results

My projects with Udacity taught me how to implement a Particle Filter in C ++. As in the algorithm described earlier, we implement localization by defining 100 particles and assigning a weight to each particle through measurements made by our sensors.

In the following video, we can see :

• green laser representing the measurements from the vehicle.
• blue laser representing the measurements from the nearest particle (blue circle).
• particle locating the vehicle (blue circle).
• The black circles are our landmarks (traffic lights, signs, bushes, …) coming from the map.

# SLAM (Simultaneous Localization And Mapping)

Another very popular method is called SLAM, this technique makes it possible to estimate the map (the coordinates of the landmarks) in addition to estimating the coordinates of our vehicle.

To work, we can with the Lidar find walls, sidewalks and thus build a map. SLAM’s algorithms need to know how to recognize landmarks, then position them and add elements to the map.

# Conclusion

Localization is an essential topic for any robot or autonomous vehicle. If we can locate our vehicle very precisely, we can drive independently. This subject is constantly evolving, the sensors are becoming more and more accurate and the algorithms are more and more efficient.

SLAM techniques are very popular for outdoor and indoor navigation where GPS are not very effective. Cartography also has a very important role because without a map, we cannot know where we are.
Today, research is exploring localization using deep learning algorithms and cameras.

Now that we are localized and know our environment, we can discuss algorithms for creating trajectories and making decisions!

Posted in Self-Driving Car

## Computervision, Deep Learning and Sensor

Computer vision (CV) is a process (and a branch of computer science) that involves capturing, processing and analyzing real-world images and video to allow machines to extract meaningful, contextual information from the physical world. Today, computer vision is the foundation and a key means of testing and exploiting deep-learning models that are propelling the evolution of artificial intelligence toward ubiquitous, useful and practical applications. A lot of advancements are expected to occur between 2018 and 2020.

But…what is computer vision?

Back in 1955, researchers assumed they could describe the processes that make up human intelligence and automate them, creating an artificial intelligence (AI). Despite being in a time before 1st demonstration of integrated circuits (IC) in 1958, or 1st commercially available microprocessor by Intel in 1971, or the term graphic processing units (GPU) popularized by Nvidia in 1999, serious researches began and one of the most notable “AI” researches started along three distinct lines: replicating the eye (to see); replicating the visual cortex (to describe); and replicating the rest of the brain (to understand). Along these three distinct lines, various degrees of progresses have been made:

• To See:
Reinventing the eye is the area with most success. Over the past few decades, sensors and image processors have been created to match or even exceed the human eye’s capabilities. With larger, more optically perfect lenses and nanometer-scaled image sensor and processor, the precision and sensitivity of modern cameras are incredible, especially compared to common human eyes. Cameras can also record thousands of images per second, detect distances and see better in dark environment. However, despite the high fidelity of the outputs, they merely record the distribution of photons coming in a given direction. The best camera sensor ever made couldn’t capture images in 3D until recent hardware breakthroughs (such as flood illuminator with NIR). Modern cameras also provide a much richer and more flexible platform for hardware to work with software.
• To Describe:
Seeing isn’t enough, but to describe is unfathomably complex. A computer can apply a series of transformations to an image, and therefore discover edges, the objects that these edges imply, and the perspective and movement when presented with multiple pictures, and so on. The processes involve a great deal of math and statistics, and wasn’t made possible until recent advances in parallel computing powered by GPU.
• To Understand:
Even achieving a toddler’s intelligence has been proven to be extremely complex. Researcher could build a system that recognizes every variety of apples, from every angle, in any situation, at rest or in motion, with bites taken out, anything — and it still wouldn’t be able to recognize an orange. For that matter, it couldn’t even tell you what an apple is, whether it’s edible, how big it is or what they’re used for. Why? Because we barely understand how our minds work: Short and long term memory, input from our other senses, attention and cognition, a billion lessons learned from a trillion interactions with the world, etc. This is not a dead end, but it’s definitely hard to pin down. The past efforts of building a know-it-all expert systems have been proven to be fruitless. A new AI architecture has emerged in the past 5 years or so.

As three key interlocking factors has begun to come together since 2012, the concepts of “context, attention, intention” are slowly evolving into computer vision, a new branch of AI:

Achieved by highly parallel GPU with the rise of foundry-fabless business model (such as TSMC and Nvidia). Liberating IC design and manufacturing from the proprietary-minded IDMs has installed more flexibility into hardware and thus allows software development to prosper. TSMC achieving 28nm mass production in 2012 has been the inflection point. Intel’s 10nm meltdown could further cement this trend.
• Much More Powerful Algorithms:
Unbinding software development from hardware manufacturing has invited software developers to join the revolution. With then pure software company like Microsoft bursting onto the scene in 1975, programmers have since invented many powerful tools to utilize radical new hardware, and one of the prime examples is deep neural networks (DNNs). We consider today’s DNNs to be smart because they can identify novel patterns in their input streams. Patterns their programmers did not anticipate. DNN performance on image recognition tests (ImageNet) exhibits lower error rates than humans performing the same tests.
• Huge Swatches of Data:
During the transition from centralized to decentralized architecture, internet was invented. With internet, collecting and integrating large amount of data becomes possible. From the internet, feeding DNNs with big data on powerful GPUs becomes a reality. With more application processors (AP) in personal devices adopting AI-enabled CV, CV applications are expanding along with more available frameworks and tools.

Let’s take a look at some of the currently notable/predictable CV applications on personal devices:

## Smartphone: Differentiation Opportunities

AI-driven capabilities enabled by CV have quickly become critical differentiation factors in the saturated smartphone market. These features attempt to transform smartphones from a passive utility tool to a more proactive personal assistant.

The emergence of CV in smartphones is driven by continued investments in AI techniques by major OEMs (Apple, Samsung, Huawei, and Google) and smartphone software, as well as the evolution of image sensors (Sony), image processing units (Sony and in-house ASICs) and modules’ miniaturization (Largan, etc.). For the past couple of years, new smartphones has been characterized by continued sophistication in cameras, with higher resolutions to capture more data to improve overall accuracy of visual recognition applications and integration of 3D depth-sensing technology to enhance the reliability of facial recognition. Google started it with its Tango-enabled phones, Lenovo Phab 2 and ASUS ZenFone AR, but failed to elaborate. Last year (2017), Apple introduced 3D sensing in the iPhone X, donned “TrueDepth” as part of the front-facing camera setup. Apple’s move has led to a rush in 3D sensing adoption. 3D-sensing technology is still far from mainstream, but increased availability and affordability of 3D sensors for phones is expected to continue and make it into more Android smartphones between 2018 and 2019.

If CV in smartphone follows mobile payment’s (by NFC) footstep, all premium smartphones would likely include CV capability by 2020 and 30% to 50% of non-premium smartphones would have the function before 2022. Facial or gesture recognition could become one of the standard authentication mechanisms and other CV apps would emerge as people get used to it. Here’s some directions for CV applications:

• Optimize Camera Settings:
Huawei uses the AI function on its Kirin 970 chip to recognize objects and scenarios to optimize the camera settings automatically. The AI-enabled camera can recognize more than 500 scenarios across 19 categories (food, group, sunset, greenery and night shot, etc.) and will adjust camera-setting features such as exposure, International Organization for Standardization (ISO) and color saturation or contrast, in real time. This enables users to get the best shot for each category. It is also able to perform object recognition linked to shopping applications and text translation based on an application developed with Microsoft Translator.
• Augmented Reality (AR):
Apple is already using the TrueDepth system in the iPhone X to produce Animoji, its animated emoji feature, for social networking. In the future, Apple will likely expand on AR applications. Apple acquired computer vision startup Ragaind, whose CV API can analyze photos and recognize in pictures faces, their gender, age and emotions. In 2016, Apple acquired the startup Emotient, which uses AI to recognize people’s emotions from facial expressions (the technology has probably been applied to Animoji already).
• Query and Assistant:
Google Lens, integrating Google’s expertise in CV and machine learning (ML), along with its extensive knowledge graph, can perform visual search. Using a smartphone camera, Lens detects an object, landmark or restaurant, recognizes what it sees, and offers information and specific actions about what it detects. At Google I/O 2018, Google announced enhancements to Lens, such as smart text selection and search. It also announced style match (if you see an item you like while shopping, Lens can show not only reviews but other similar shopping options or similar items to the one you like). However, Google Lens has received quite a few harsh reviews so far, likely due to the technology’s immaturity.
• Health and Record Book:
Samsung has been exploring CV with Bixby Vision. One of the use cases is food calories calculation. Ideally, Samsung’s Bixby Vision could calculate how much calories you consume by reviewing photos of your meal. For those who have been using MyFitnessPal with Asian dishes, trying to find matches and record calories is a PITA. Some other emerging well-being applications , applications such as Calorie Mama , AI has been employed to help manage and advise on diet and calorie intake, and monitor food composition, from food photos using deep learning and computer vision.

The advancements in computer vision and smartphones will likely have the most far-reaching impact. e-Commerce is also an area worth watching. CV could provide AR function for home décor/furnishing applications or clothes fitting. The biggest advantage of brick-and-mortar could erode fast.

## Head-Mounted Display (HMD): Immersive Experiences

CV can enhance immersive experiences via eye and position tracking, gesture recognition, and by mapping virtual environments. It will also help with realistic overlaying of virtual things in the real world in mixed reality, as well as enabling object or location recognition. However, HMD still only plays in a niche market with relatively few applications. To imagine how HMD could utilize computer vision to change our life, we have to look into the progresses of several major participants:

• Qualcomm: Turning Smartphone into HMD:
Qualcomm has Vision Intelligence Platform to support edge/on-device computing for camera processing and machine learning. With in-house CV software development kits, Qualcomm chips (currently on 10nm) can support VR cameras, robotics, and smartphone/wearable cameras. Qualcomm has also partnered with SenseTime(for face, image and object recognition, but as a Chinese AI startup, some privacy concern might emerge), Pilot.ai(for detection, classifications and tracking of objects/actions) and MM Solutions (for image-quality tuning services, acquired by ThinderSoft, another Chinese company which could bring up privacy concern).
• Facebook: Standalone HMD via Oculus Acquisition
Since Facebook acquired Oculus, it has been investing in CV in the last two years. Facebook acquired 3 companies to boost its efforts in CV: Surreal Vision (real-time 3D sense reconstruction of real things in a virtual world), Zurich Eye (enabling machines to navigate in any space), Fayteq (adding digital images into videos).
• Microsoft: Xbox as a Market?
Next version of HoloLens is expected in 2019 and should support cloud-based CV that will be capable of recognizing objects in AR. Other HMD providers from the Microsoft ecosystem could be offering new devices for MR with CV toward the end of 2019 to support next-generation Xbox (expected to hit market in 2020).

CV is a major enabler for creating more engaging customer experiences on HMDs. It reduces the invasive nature of advertisements. For more corporate use such as using HMD for employee training or collaborating on design or experiments, it could take years to create a viable common platform before collecting enough data. However, the internet has proven that advertisement alone is enough to drive massive innovation. The ability to offer location-specific experiences and services through CV would also help improve user experience for HMD.

## Personal Robots: a Visual Touch to Non-Optical Sensory Data

Currently, iRobot is probably the first thing that comes up when we think about personal robot, but cleaning bot is neither smart or multi-functional. It’s far from the humanoid that we imagine. Personal robots today are confined within the data generated by their sensors. Some of the more versatile robots, like Honda’s ASIMO in the graph above, cannot really learn despite being equipped with some cameras.

Computer vision could change all these.

CV complements sensory data in personal robots. It will enrich how personal robots can interact with the environment. CV is enabled in robots via camera mapping, 3D sensor mapping and simulations localization and mapping algorithms. It can be used for edge detection for rooms, furniture and stairs, and for floor plan modeling for cleaning robots. With CV, personal service robots could recognize different members of the family to support individual interactions and personal contexts, and assisting elderly people or people with disabilities in their own homes or in care homes. Remote healthcare for diagnostic and ongoing treatments would also become more reliable with CV and ML. At CES 2018, many robots with some implementations of CV were demonstrated. Many more should come in the next few years.

## Voice-Enabled Personal Assistant (VPA): Multi-modal Speakers

Since its introduction in 2014, more than 12,000 providers have leveraged the functionality of VPA speakers to deliver services, most of them connected home solutions around the Amazon Alexa skill set as Google and Apple were late to the party.

Originally, VPA focuses on audio rendering capabilities and connectivity to cloud-based music services, and as such, these speakers have proven to be a popular music player in the home. However, doubling down on the proven acceptance of these products, 2nd-generation VPAs are now adding cameras and screens to transform into AI-based VPAs.

With Apple’s HomdPod yet to prove its usefulness, VPA market is now dominated by Amazon with Google as the only worthy challenger, especially in the AI-based VPA field:

Amazon started the VPA trend with the introduction of Amazon Echo in 2014. It features far-field voice capturing, wireless (Wi-Fi and Bluetooth) connectivity, and high quality built-in loudspeakers for audio rendering. It was a huge success, but the AI focuses on voice not visual. In 2017, Amazon announced the Echo Show for the Alexa platform, incorporating a 7-inch LCD screen and a camera. Later that year, the Echo Spot began shipping with a circular 2.5-inch screen and camera. The biggest purpose of screens and cameras was to enable videoconferencing applications to improve the customer experience, but these two devices also serve as the basis of training Alexa’s CV capabilities. The Echo Look then shows how camera-enabled Alexa devices can be developed into CV-enabled platforms (which became available on 2018/6/6). The built-in camera can capture a user’s full body image and apply AI to create effects such as a blurred background. More importantly, cloud-based AI can analyze the attire of the user and make the appropriate shopping recommendations for similar styles. Oddly, the Echo Look does not have a screen. As a result, the rendering of the captured images and the shopping suggestions have to come from a connected device such as a smartphone running the Echo Look app, leaving room for future improvement. Imagine if mixed reality is possible with a built-in projector on next-gen Echo Look, Echo Look could project recommended clothes on your body with the camera recording it for you to review the look or share with others for opinion in real-time.
So far, Google’s participation in VPA has followed Android’ footstep: it hasn’t announced any multimodal Pixel VPA yet, but instead, it relies on hardware partners such as LG, Lenovo and others to provide multimodal devices. At CES 2018, LG announced the LG WK9, a ThinQ-enabled smart speaker device with an 8-inch touch display and camera for videoconferencing for Google Assistant. Lenovo announced its Smart Display with 8-inch or 10-inch screen options and a camera, also running Google Assistant. These device isn’t utilizing CV capabilities yet, but with Qualcomm S624 as application processor (which is designed not only for video applications in connected hubs, but also for device-based AI processing), one can imagine that these devices will have CV either through driver update or in next iteration. However, without clear “profitable” use cases as these hardware partners cannot really make money from retail, the potential remains somewhat undeveloped.

## Drone: CV to Elevate Freight and Provide Bridge to Last-Mile

Computer vision capabilities are increasingly being leveraged in drones with a potentially transformational impact in both personal and commercial drone applications.

The biggest impact could come from ambitions of drones for delivery. CV can help enable improving autonomous navigation beyond GPS in situations of pilot assistance in low visibility. CV can also enhance obstacle/collision avoidance and analysis of the best route calculation as CV, AI (ML) and simultaneous localization and mapping have been intertwined to enable 3D mapping and structure reconstruction, object detection and tracking, awareness of context, terrain analysis and path planning.

For shipping, CV could also act like Apple’s FaceID in authentication. One of the biggest concern for drone last-mile delivery is that someone could ninja your package. Using CV (if users have pre-registered forfacial recognition), identifying the right receiver won’t be a problem anymore. However, to enable this function, 5G might be a must.

## Connected Home: Personalized Internet of Things

Google is using CV in its Nest Cam IQ and Nest Cam IQ Outdoor to enable recognition of specific family members or friends, as well as the Sightline feature that identifies specific events in the video footage. The company also recently launched a smart camera, Google Clips, which can be placed around the house and will use algorithms and CV to capture “special moments.”

Nonetheless, cameras in connected-home appliances haven’t really evolved beyond home security features (which could provoke privacy concern as parents might be using these cameras by taking video footage without permission). Google introduced Nest Hello doorbell with a wide-angle camera for video able to perform facial recognition, which could be used as a means to unlock (or not to unlock) the door.

Computer vision adds a natural way for users to interact with the digital and physical worlds around them. It is enabling new interaction models for devices with users and the environment around them, but there are two main concerns around CV.

The first one is technological. As a young technology, there is no definitive algorithm for CV, and most of the popular algorithms out there are proprietary. The proprietary algorithms restricted CV capabilities on specific devices and use cases. For example, iRobot’s cleaning bots won’t share CV with your home security cameras. Facial recognition for a family in iPhones would coordinate with Amazon’s VPAs.

The second major concern is around privacy and country/region-specific regulations (such as GDPR in Europe). Many devices — HMDs and personal robots — with CV will collect a lot of data, images, video around individual consumers, a household, their routine, personal data information, information about kids, and patient information in a hospital’s reception area. The limitation of data retrieval could hamper the development of CV and AI.

Computer vision would be the closest AI that we will experience on a daily basis. Visual processing unit (VPU) for CV, 5G development and the deployment of edge computing will help CV form our future in the next few years.

Reference: Michael Wang-Medium

Posted in DevOps

# What is Kubernetes? Let’s find out how it works

## Kubernetes là gì? Cùng tìm hiểu cách hoạt động

What is Kubernetes? Kubernetes, or k8s is an open source platform that automates the management, scaling and deployment of applications in the form of containers, also known as Container orchestration engine. It eliminates a lot of the manual processes involved in the deployment and expansion of containerized applications.

Kubernetes là gì? – Kubernetes, hoặc k8s là một nền tảng mã nguồn mở tự động hoá việc quản lý, scaling và triển khai ứng dụng dưới dạng container hay còn gọi là Container orchestration engine. Nó loại bỏ rất nhiều các quy trình thủ công liên quan đến việc triển khai và mở rộng các containerized applications.

Lately, many applications have implemented containerization using docker and using it as an increasingly production environment. In production environments, because it is difficult to structure a container-based system using only docker. So using a platform Container orchestration engine such as k8s is quite popular today.

Gần đây, nhiều ứng dụng đã thực hiện container hoá bằng cách sử dụng docker và sử dụng nó như là môi trường production ngày càng tăng. Trên môi trường production, Vì việc cấu trúc hệ thống chạy bằng container chỉ sử dụng docker là rất khó khăn. Cho nên việc sử dụng một nền tảng Container orchestration engine như là k8s thì khá phổ biến hiện nay.

Actual production applications span multiple containers. These containers must be deployed on multiple server hosts. Kubernetes provides the coordination and management necessary to deploy containers to scale for those workloads.

Các ứng dụng production thực tế mở rộng nhiều containers. Các containers đó phải được triển khai trên nhiều server hosts. Kubernetes cung cấp khả năng phối hợp và quản lý cần thiết để triển khai các containers theo quy mô cho các workloads đó.

Kubernetes was originally developed and designed by engineers at Google. This is also the technology behind Google’s cloud services. Google has been creating more than 2 billion container deployments per week, all supported by the internal platform.

Kubernetes ban đầu được phát triển và thiết kế bởi các kỹ sư tại Google. Đây cũng là công nghệ đằng sau các dịch vụ đám mây của Google. Google đã và đang tạo ra hơn 2 tỷ container deployments mỗi tuần và tất cả đều được hỗ trợ bởi nền tảng nội bộ.

## Nên sử dụng Kubernetes khi nào?

• Large businesses that have a real need to scaling the system quickly, and already use containers (Docker). Projects need to run> = 5 containers of the same type for 1 service. (For example using> = 5 machines together to run code website TopDev) Innovative startups invest in technology to easily auto scale later.
• Các doanh nghiệp lớn, có nhu cầu thực sự phải scaling hệ thống nhanh chóng, và đã sử dụng container (Docker).
• Các dự án cần chạy >= 5 container cùng loại cho 1 dịch vụ. (Ví dụ dùng >=5 máy cùng để chạy code website TopDev).
• Các startup tân tiến, chịu đầu tư vào công nghệ để dễ dàng auto scale về sau.

## Kubernetes giải quyết vấn đề gì?

By using docker, on 1 host you can create multiple containers. However, if you intend to use it in production environment, you must think about the following:

• Batch management of docker hosts
• Container Scheduling
• Rolling update
• Scaling / Auto Scaling
• Monitor the life and death of the container.
• Self-hearing in case something goes wrong. (Capable of detecting and self-correct error)
• Service discovery
• Data management, work node, log
• Infrastructure as Code
• Alignment and expansion with other systems

Bằng việc sử dụng docker, trên 1 host bạn có thể tạo ra nhiều container. Tuy nhiên nếu bạn có ý định sử dụng trên môi trường production thì phải bắt buộc phải nghĩ đến những vấn đề dưới đây:

• Việc quản lý hàng loạt docker host
• Container Scheduling
• Rolling update
• Scaling/Auto Scaling
• Monitor vòng đời và tình trạng sống chết của container.
• Self-hearing trong trường hợp có lỗi xãy ra. (Có khả năng phát hiện và tự correct lỗi)
• Service discovery
• Quản lý data, work node, log
• Infrastructure as Code
• Sự liên kết và mở rộng với các hệ thống khác

Bằng việc sử dụng một Container orchestration engine như K8s có thể giải quyết được nhưng vấn đề trên đây. Trong trường hợp không sử dụng k8s, Thì sẽ phải cần thiết tạo ra cơ chế tự động hoá cho những cái kể trên, như thế thì cực kỳ tốn thời gian và không khả thi.

K8s quản lý thực thi các container sử dụng YAML để viết các Manifest.

Sau khái niệm kubernetes là gì chúng ta hãy đến với chức năng của nó. Kubernetes quản lý các docker host và cấu trúc container cluster. Ngoài ra, khi thực thi các container trên K8s, bằng cách thực hiện replicas (tạo ra nhiều container giống nhau) làm cho hệ thống có sức chịu lỗi cao và tự động thực hiện load balancing. Thông qua cơ chế load balancing, chúng ta có thể tăng giảm số lượng container replica (auto scaling).

Khi thực hiện phân chia container vào các Node (docker host), dựa trên các loại docker host kiểu như “Disk SSD” hay “số lượng clock của CPU cao”… Hoặc dựa trên loại Workload kiểu như “Disk I/O quá nhiều”, “Băng thông đến một container chỉ định quá nhiều” … K8s sẽ ý thức được việc affinity hay anti-affinity và thực hiện Scheduling một cách hợp lý cho chúng ta.

Trong trường hợp không được chỉ định host cụ thể, K8s sẽ thực hiện scheduling tuỳ thuộc vào tình trạng CPU, memmory của docker host có trống hay không. Vì vậy, chúng ta không cần quan tâm đến việc quản lý bố trí container vào các docker host như thế nào.

Hơn nữa, trường hợp resource không đủ, thì việc auto scheduling của K8s cluster cũng sẽ được thực hiện tự động.

Được xây dựng theo quan điểm tính chịu lỗi cao, K8s thực hiện monitor các container theo tiêu chuẩn. Trong trường hợp bất ngờ nào đó, khi một container process bị dừng, K8s sẽ thực hiện Self-hearing bằng cách scheduling một container nữa.

Self-hearing là một khái niệm cự kỳ quan trọng trong k8s, nếu trường hợp có một node nào đó trong cluster xảy ra vấn đề ví dụ có thể là bị die, hay node đó được di chuyển đi. Cơ chế self-hearing sẽ tự động phục hồi mà không ảnh hưởng đến service.

Thêm nữa, ngoài việc monitor hệ thống, k8s còn có khả năng thiết lập health check bằng HTTP/TCP script.

Trường hợp sau khi auto scaling, phát sinh một vấn đề của endpoint đến container. Trong trường hợp sử dụng máy ảo, bằng việc setting load balancing endpoint sẽ được sử dụng như một VIP.

K8s cũng có một chức năng tương tự như vậy đó là Service. Service của k8s cung cấp chức năng load balancing cho hàng loạt các container được chỉ định. Việc tự động thêm, xoá container thời điểm scale là điều hiển nhiên, khi một container xảy ra sự cố thì tự động cách ly.

Khi thực hiện rolling update container thì việc đầu tiên k8s sẽ làm là cách ly container cho chúng ta, vì vậy k8s có thể đảm nhận việc quản lý các endpoint ở mức SLA cao.

Trong trường hợp cấu trúc một hệ thống sử dụng docker, nên phân tách nhỏ các chức năng trong kiến trúc Microservice.

Trong kiến trúc Microservice, để sử dụng các image container được tạo ra tương ứng với từng chức năng và deploy chúng thì chức năng Service discovery thực sự cần thiết.

K8s là một Platform nhưng có khả năng liên kết tốt với các hệ sinh thái bên ngoài, có nhiều middleware chạy trên các service của k8s, trong tương lai chắc chắn sẽ còn nhiều hơn nữa.

• Ansible: Deploy container tới Kubernetes
• Apache Ignite: Sử dụng Service Discovery của Kubernetes, tự động tạo và scaling k8s clkuster
• Fluentd: gửi log của container trong Kubernetes
• Jenkins: Deploy container đến Kubernetes
• OpenStack：Cấu trúc k8s liên kết với Cloud
• Prometheus: Monitor Kubernetes
• Spark: Thực thi native job trên Kubernetes(thay thế cho YARN)
• Spinnaker：Deploy container đến Kubernetes

Thêm nữa, K8s chuẩn bị một vài cơ thế để có thể mở rộng, thực thi chức năng độc lập, nó có thể sử dụng platform như là một framework. Bằng cách sử dụng khả năng mở rộng, chúng ta có thể thực hiện release một ReplicaSet mà k8s cung cấp.

## Những khái niệm cơ bản trong Kubernetes là gì

### Master node

Là server điều khiển các máy Worker chạy ứng dụng. Master node bao gồm 4 thành phần chính:

• Kubernetes API Server: là thành phần giúp các thành phần khác liên lạc nói chuyện với nhau. Lập trình viên khi triển khai ứng dụng sẽ gọi API Kubernetes API Server này.
• Scheduler: Thành phần này lập lịch triển khai cho các ứng dụng, ưng dụng được đặt vào Worker nào để chạy
• Controler Manager: Thành phần đảm nhiệm phần quản lý các Worker, kiểm tra các Worker sống hay chết, đảm nhận việc nhân bản ứng dụng…
• Etcd: Đây là cơ sở dữ liệu của Kubernetes, tất cả các thông tin của Kubernetes được lưu trữ cố định vào đây.

### Worker node

Là server chạy ứng dụng trên đó. Bao gồm 3 thành phần chính:

• Container runtime: Là thành phần giúp chạy các ứng dụng dưới dạng Container. Thông thường người ta sử dụng Docker.
• Kubelet: đây là thành phần giao tiếp với Kubernetes API Server, và cũng quản lý các container
• Kubernetes Service Proxy: Thành phần này đảm nhận việc phân tải giữa các ứng dụng

### kubectl

Tool quản trị Kubernetes, được cài đặt trên các máy trạm, cho phép các lập trình viên đẩy các ứng dụng mô tả triển khai vào cụm Kubernetes, cũng như là cho phép các quản trị viên có thể quản trị được cụm Kubernetes.

### Pod

Pod là khái niệm cơ bản và quan trọng nhất trên Kubernetes. Bản thân Pod có thể chứa 1 hoặc nhiều hơn 1 container. Pod chính là nơi ứng dụng được chạy trong đó. Pod là các tiến trình nằm trên các Worker Node. Bản thân Pod có tài nguyên riêng về file system, cpu, ram, volumes, địa chỉ network…

### Image

Là phần mềm chạy ứng dụng đã được gói lại thành một chương trình để có thể chạy dưới dạng container. Các Pod sẽ sử dụng các Image để chạy.

Các Image này thông thường quản lý ở một nơi lưu trữ tập trung, ví dụ chúng ta có Docker Hub là nơi chứa Images của nhiều ứng dụng phổ biến như nginx, mysql, wordpress…

### Deployment

Là cách thức để giúp triển khai, cập nhật, quản trị Pod.

### Replicas Controller

Là thành phần quản trị bản sao của Pod, giúp nhân bản hoặc giảm số lượng Pod.

### Service

Là phần mạng (network) của Kubernetes giúp cho các Pod gọi nhau ổn định hơn, hoặc để Load Balancing giữa nhiều bản sao của Pod, và có thể dùng để dẫn traffic từ người dùng vào ứng dụng (Pod), giúp người dùng có thể sử dụng được ứng dụng.

### Label

Label ra đời để phân loại và quản lý Pod,. Ví dụ chúng ta có thể đánh nhãn các Pod chạy ở theo chức năng frontend, backend, chạy ở môi trường dev, qc, uat, production…

## Thực hành Kubernetes là gì

Phần thực hành sẽ giúp luyện tập với những khái niệm cơ bản ở phía trên của Kubernetes. Nội dùng phần này bao gồm việc cài đặt cụm Kubernetes gồm Master và Node thông qua Minikube.

Việc triển khai ứng dụng vào Kubernetes thông qua Deployment, sử dụng Service để giúp người dùng truy cập ứng dụng từ bên ngoài vào trong Kubernetes, và các thao tác quản trị như tăng giảm số bản sao của ứng dụng cũng như cập nhật phiên bản của ứng dụng.