© 2019 Hey Machine Learning

  • Bohdan Kaminskyi

What is Computer Vision?

Computer Vision is a field of Artificial Intelligence and Computer Science that helps computers understand the world the way a human does.

The mission of Computer Vision is to teach computers to see and understand their environment from digital images and videos through three consecutive components:

  • acquisition

  • processing

  • analysis

In other words, the ultimate goal of Computer Vision is to teach machines to make decisions just as a human would.


Acquisition


Types of cameras

Image acquisition is the process of converting the analog world around us into digital form. This is done with built-in webcams, digital and SLR cameras, and professional 3D cameras and laser range finders. As a rule, data obtained this way must be processed further before it can be used efficiently.



Processing


The next component of Computer Vision is low-level image processing. It extracts the information needed to pick out edges and points and to segment images. These are the basic geometric primitives that make up the objects in an image.

Image processing is usually implemented with complex mathematical algorithms.

Low-level image processing algorithms include:

  • edge detection

  • segmentation

  • classification

  • feature detection and matching

Edge detection covers a variety of mathematical methods that aim to identify points in an image where brightness changes sharply. The algorithm analyzes the image and converts it into a set of curved line segments. Edge detection is used to highlight the most important parts of the image and so reduce the amount of data to process.



Edge detection
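As a rough illustration of the idea, the sketch below applies the Sobel operator, one common edge detection method, to a tiny grayscale image stored as a list of rows. The function name `sobel_edges`, the threshold value and the test image are assumptions made for this example; a real system would normally rely on an optimized library.

```python
import math

# 3x3 Sobel kernels: SOBEL_X responds to left-right intensity changes,
# SOBEL_Y to top-bottom intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(image, threshold):
    """Return a binary edge map: 1 where the gradient magnitude exceeds threshold.

    `image` is a list of rows of grayscale values. Border pixels stay 0
    because the 3x3 kernel does not fit there.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    pixel = image[y + dy][x + dx]
                    gx += SOBEL_X[dy + 1][dx + 1] * pixel
                    gy += SOBEL_Y[dy + 1][dx + 1] * pixel
            if math.hypot(gx, gy) > threshold:
                edges[y][x] = 1
    return edges


# Illustrative 5x6 image: dark on the left, bright on the right.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edge_map = sobel_edges(image, threshold=200)
```

Only the pixels on either side of the dark-to-bright transition are marked, which is how edge detection cuts down the amount of data passed to later stages.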

Image segmentation is typically used to locate objects and/or boundaries in images. In the segmentation process, the algorithm assigns a label to each pixel of an image so that pixels with the same label share certain characteristics. The result is a set of segments that together cover the whole image, or a set of contours extracted from it.



Image segmentation
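One of the simplest ways to assign a label to every pixel is to threshold the image and then group connected bright pixels. The sketch below does that with a breadth-first flood fill. It assumes a grayscale image given as a list of rows; `label_regions` and the threshold are names and values chosen for this example, and practical segmentation methods are usually far more sophisticated.

```python
from collections import deque


def label_regions(image, threshold):
    """Assign a label to every pixel.

    Pixels at or below `threshold` get label 0 (background). Each
    4-connected group of brighter pixels gets its own label: 1, 2, 3, ...
    """
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 1
    for y in range(height):
        for x in range(width):
            if image[y][x] <= threshold or labels[y][x] != 0:
                continue
            # Flood-fill the region containing (y, x) with a new label.
            labels[y][x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny][nx] > threshold
                            and labels[ny][nx] == 0):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels


# Illustrative image with two separate bright objects on a dark background.
image = [
    [200, 200, 0, 0, 0],
    [200, 200, 0, 0, 180],
    [0, 0, 0, 0, 180],
]
labels = label_regions(image, threshold=100)
```

Here the bright block in the top-left corner receives label 1, the bright column on the right receives label 2, and everything else is background.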

Image classification involves extracting information about the content of images. The information obtained can be used, for example, to construct thematic maps.



Image classification
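To make the thematic map idea concrete, here is a minimal sketch of a nearest-centroid classifier that assigns each pixel the class whose reference color is closest. The class names, reference colors and function names are all hypothetical values invented for this example; in practice the references would be learned from labeled samples, and a different classifier may well be used.

```python
def nearest_class(pixel, centroids):
    """Return the class name whose reference color is closest to `pixel`."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda name: squared_distance(pixel, centroids[name]))


def thematic_map(image, centroids):
    """Classify every pixel, producing a grid of class names."""
    return [[nearest_class(pixel, centroids) for pixel in row] for row in image]


# Hypothetical reference colors (R, G, B) for three land-cover classes.
centroids = {
    "water": (30, 60, 200),
    "vegetation": (40, 160, 50),
    "soil": (150, 110, 70),
}

# Illustrative 2x2 color image.
image = [
    [(25, 70, 190), (35, 150, 60)],
    [(140, 100, 80), (45, 170, 40)],
]
result = thematic_map(image, centroids)
```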

Feature detection and matching is an important part of many computer vision tasks, including structure-from-motion, image search, object detection and much more. The main idea of this method is to find abstractions of image information, such as distinctive points, that can be compared between images.
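The matching half of this step can be sketched as a brute-force nearest-neighbor search over feature descriptors, with a ratio test to discard ambiguous matches. The descriptors below are made-up three-number vectors; real descriptors are much longer and come from a feature detector, which is not shown. The function name and the `ratio` default of 0.75 are assumptions for this example.

```python
import math


def match_features(descriptors_a, descriptors_b, ratio=0.75):
    """Match each descriptor in A to its nearest neighbor in B.

    A match is kept only if the nearest neighbor is clearly closer than
    the second nearest (a ratio test), which filters out ambiguous matches.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    matches = []
    for i, desc_a in enumerate(descriptors_a):
        ranked = sorted(
            (math.dist(desc_a, desc_b), j) for j, desc_b in enumerate(descriptors_b)
        )
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches


# Illustrative descriptors from two images.
descriptors_a = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
descriptors_b = [(0.9, 0.1, 0.0), (0.1, 0.9, 0.1), (0.0, 0.0, 1.0)]
matches = match_features(descriptors_a, descriptors_b)
```

The first two descriptors each have one clearly closest counterpart and are matched. The third is roughly equidistant from several candidates, so the ratio test rejects it.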

Analysis


Image analysis and understanding is the last step in Computer Vision, the one that allows machines to make their own decisions. This stage works with high-level data from the images and the results of the previous step. Examples of high-level analysis include 3D scene mapping, object recognition and tracking.


Where does it already apply?

Today, Computer Vision methods are used in many areas, such as robotics, human-computer interaction, and visualization. Popular applications include:

  • Autonomous Vehicles

  • Augmented Reality

  • Automobile Tracking

  • Face Detection

  • Recognition of movements and gestures

  • Image restoration and processing

Autonomous Vehicle by Google

IKEA's augmented reality application

Face detection with FaceID technology

Computer Vision challenges

Today, however, developers of Computer Vision algorithms still face a number of difficulties, such as:

  • poor quality of source data

  • limited resources

  • real-time content processing

Researchers around the world continue to work on overcoming these problems and improving existing algorithms.

What about Hey Machine Learning?

Hey Machine Learning develops computer vision systems for various tasks. We use many frameworks and tools, alone and in combination, to achieve the highest possible accuracy in detecting, classifying and tracking objects in images. One such challenge was automatic number-plate recognition.

We were tasked with developing a computer vision algorithm that automatically reads car license plate numbers in order to automate parking facilities.

To solve this problem, we labeled thousands of photos taken in different weather and lighting conditions to train an artificial neural network. We used two models in the project. The first detected the license plates of cars recorded by a surveillance camera. The second, an Optical Character Recognition (OCR) model, read the characters on each plate. As a result, the algorithm recognizes car license plates with an accuracy of 95%.
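The two-stage structure described above can be sketched roughly as follows. This is not the project's actual code: `detect_plates` and `read_text` are hypothetical placeholders for the trained detection and OCR models, the toy "image" is a grid of characters so that the example runs on its own, and exact-match accuracy is only one assumed way an accuracy figure could be measured.

```python
def recognize_plates(image, detect_plates, read_text):
    """Run a two-stage pipeline: locate plate regions, then read each one.

    `detect_plates(image)` returns a list of (x, y, width, height) boxes and
    `read_text(crop)` returns a string. Both stand in for trained models.
    """
    results = []
    for x, y, w, h in detect_plates(image):
        crop = [row[x:x + w] for row in image[y:y + h]]
        results.append(read_text(crop))
    return results


def plate_accuracy(predicted, expected):
    """Fraction of plates whose predicted text matches the expected text exactly."""
    if not expected:
        return 0.0
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Toy stand-ins so the sketch runs: the "image" is a grid of characters.
image = ["......", ".AB12.", "......"]


def fake_detector(img):
    return [(1, 1, 4, 1)]  # one hard-coded box around the "plate"


def fake_reader(crop):
    return "".join(crop[0])  # the toy crop already contains the text


plates = recognize_plates(image, fake_detector, fake_reader)

# Made-up predictions and ground truth, only to show the accuracy calculation.
accuracy = plate_accuracy(
    ["AB12", "XY99", "CD34", "EF56"],
    ["AB12", "XY98", "CD34", "EF56"],
)
```

Keeping detection and reading as separate callables means either model can be retrained or replaced without touching the other.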