What Is Computer Vision?

Computer vision is the science of making machines "see". More specifically, it uses cameras and computers in place of human eyes to recognize, track, and measure targets, and further processes the resulting images so that they are better suited to human observation or to transmission to instruments for inspection. As a scientific discipline, computer vision studies the theories and technologies needed to build artificial intelligence systems that can obtain "information" from images or multidimensional data. Information here is meant in Shannon's sense: something that can be used to help make a "decision". Because perception can be viewed as extracting information from sensory signals, computer vision can also be viewed as the science of making artificial systems "perceive" from images or multidimensional data.


Computer vision definition

Computer vision simulates biological vision using computers and related equipment. Its main task is to process captured pictures or videos to recover three-dimensional information about the corresponding scene, much as humans and many other creatures do every day.
Computer vision is the science of using cameras and computers to obtain the data and information we need about a subject. Figuratively speaking, it equips a computer with eyes (cameras) and a brain (algorithms) so that the computer can perceive its environment. The Chinese idiom "seeing is believing" and the Western saying "a picture is worth a thousand words" both express the importance of vision to human beings. It is not difficult to imagine how broad the application prospects are for machines that can see.
Computer vision is a challenging and important research area in both engineering and science. It is an interdisciplinary subject that has attracted researchers from computer science and engineering, signal processing, physics, applied mathematics and statistics, neurophysiology, and cognitive science.

Computer vision analysis

Vision is an integral part of intelligent and autonomous systems in many application areas, such as manufacturing, inspection, document analysis, medical diagnostics, and the military. Because of its importance, some advanced countries, such as the United States, classify computer vision research as a major fundamental problem in science and engineering with broad impact on the economy and science, a so-called grand challenge. The challenge of computer vision is to give computers and robots visual capabilities comparable to those of humans. Machine vision requires image signals, texture and color modeling, geometric processing and inference, and object modeling. A capable vision system should tightly integrate all of these processes. As a discipline, computer vision began in the early 1960s, but many important advances in basic computer vision research were made in the 1980s. Computer vision is closely related to human vision, and a correct understanding of human vision is very beneficial to computer vision research. For this reason, human vision is introduced first.
[Figure: The relationship between computer vision and other fields]

Computer vision principle

Computer vision uses various imaging systems in place of the visual organs as the means of sensing input, and computers in place of the brain to carry out processing and interpretation. The ultimate research goal of computer vision is to enable computers to observe and understand the world through vision as humans do, and to adapt to the environment autonomously. This goal can only be achieved through long-term effort. Therefore, until it is reached, the medium-term goal is to build vision systems that can complete specific tasks using a limited degree of visual sensing and feedback intelligence. For example, an important application area of computer vision is the visual navigation of autonomous vehicles. It is not yet feasible to build a system that can recognize and understand any environment as a human does and navigate autonomously. The practical research goal is therefore a vision-assisted driving system that can follow the road on a highway and avoid colliding with the vehicle in front.
It should be pointed out that although the computer takes the place of the human brain in a computer vision system, this does not mean that the computer must process visual information in the same way human vision does. Computer vision can and should process visual information according to the characteristics of computer systems. However, the human visual system is by far the most powerful and complete visual system known. As will be seen in the following chapters, the study of human visual processing mechanisms provides inspiration and guidance for computer vision research. It is therefore also an important and interesting research area to study the mechanisms of human vision using computer information processing methods and to establish a computational theory of human vision. This research is called computational vision, and it can be considered a research area within computer vision.

Computer vision related

The research goals of many disciplines are similar to or related to computer vision. These disciplines include image processing, pattern recognition or image recognition, scene analysis, and image understanding. Computer vision includes image processing and pattern recognition. In addition, it includes the description of spatial shapes, geometric modeling, and recognition processes. [1] Achieving image understanding is the ultimate goal of computer vision. [2]

Computer vision image processing

Image processing techniques convert an input image into another image with the desired characteristics. For example, the output image can have a higher signal-to-noise ratio through processing, or the details of the image can be highlighted through enhancement processing to facilitate operator inspection. In computer vision research, image processing techniques are often used for preprocessing and feature extraction.
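As a rough illustration of an image-to-image operation that trades detail for a better signal-to-noise ratio, the sketch below applies a 3x3 mean (box) filter. It assumes a grayscale image stored as a plain Python list of rows; the function name `box_filter` and the sample values are illustrative only, and a practical system would more likely rely on an optimized library.

```python
def box_filter(image):
    """Return a 3x3 mean-filtered copy of a 2D grayscale image (list of rows).

    Border pixels average over the neighbours that exist, so the output
    has the same size as the input.
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            output[y][x] = sum(neighbours) / len(neighbours)
    return output


# A flat grey patch (value 100) with one noisy pixel in the centre.
noisy = [[100, 100, 100],
         [100, 190, 100],
         [100, 100, 100]]
smoothed = box_filter(noisy)
print(smoothed[1][1])  # 110.0: the outlier's deviation drops from 90 to 10
```

The averaging also spreads part of the disturbance to neighbouring pixels, which is the usual price of linear smoothing.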

Computer vision pattern recognition

Pattern recognition techniques classify images into predetermined categories based on statistical characteristics or structural information extracted from the images, as in text recognition or fingerprint recognition. In computer vision, pattern recognition is often used to identify and classify certain parts of an image, such as segmented regions.
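One simple statistical classifier that fits this description is the nearest-centroid rule, sketched below. The two features (aspect ratio and fill ratio of a segmented region), the class names, and the training values are all hypothetical and chosen only to make the example concrete.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def train(samples_by_class):
    """Summarise each predetermined class by the centroid of its samples."""
    return {label: centroid(vectors) for label, vectors in samples_by_class.items()}


def classify(features, centroids):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Hypothetical features of segmented regions: (aspect ratio, fill ratio).
training = {
    "round": [[1.0, 0.78], [1.1, 0.80]],
    "bar": [[4.0, 0.95], [5.0, 0.90]],
}
centroids = train(training)
print(classify([1.2, 0.75], centroids))  # round
print(classify([4.2, 0.90], centroids))  # bar
```

In practice the features would be scaled to comparable ranges first, since Euclidean distance is dominated by whichever feature has the largest spread.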

Computer vision image understanding

Given an image, an image understanding program not only describes the image itself but also describes and interprets the scene the image represents, in order to decide what the image depicts. The term scene analysis was often used in the early days of artificial intelligence vision research to emphasize the difference between two-dimensional images and three-dimensional scenes. In addition to complex image processing, image understanding requires knowledge of the physical laws of image formation and knowledge of scene content.
Techniques from the disciplines above are all used in building computer vision systems, but the scope of computer vision research is broader than any of them. Computer vision research is closely related to research on human vision. To achieve the goal of a general computer vision system similar to the human visual system, a computational theory of human vision needs to be established.

Computer Vision Status

The outstanding features of computer vision are its diversity and its immaturity. Pioneering work in the field dates back earlier, but computer vision did not receive formal attention and development until the late 1970s, when computer performance had improved enough to handle large-scale data such as images. However, these developments usually originated from the needs of other fields, so what counts as a "computer vision problem" has never been formally defined. Naturally, there is also no standard formula for how "computer vision problems" should be solved.
Nevertheless, people have begun to master methods for solving specific computer vision tasks. Unfortunately, these methods usually apply only to a narrow set of targets (such as faces, fingerprints, or text) and cannot be used widely across different situations.
These methods are often applied as part of larger systems that solve complex problems (such as medical image processing, or quality control and measurement in industrial manufacturing). In most practical applications of computer vision, computers are preprogrammed to solve specific tasks. However, machine learning-based methods are becoming increasingly popular, and as machine learning research develops further, "general-purpose" computer vision applications may become a reality.
One of the main questions studied in artificial intelligence is how to give a system "planning" and "decision-making" capabilities so that it can perform a specific technical action (for example, moving a robot through a given environment). This problem is closely related to computer vision: here, the computer vision system acts as a sensor that provides information for decision making. Other related research directions include pattern recognition and machine learning (which also belong to artificial intelligence but have important connections with computer vision). For these reasons, computer vision is often regarded as a branch of artificial intelligence and computer science.
Physics is another area that has an important connection with computer vision.
Computer vision aims to fully understand the process by which electromagnetic waves, mainly visible light and infrared radiation, strike the surfaces of objects and form an image. This process is grounded in optics and solid-state physics, and in some cases the theory of quantum mechanics is applied to analyze the real world represented by the image. At the same time, many measurement problems in physics, such as fluid motion, can be solved with computer vision. Computer vision can therefore also be regarded as an extension of physics.
Another area of great significance is neurobiology, especially the study of the biological vision system.
Throughout the 20th century, extensive research was conducted on the eyes, neurons, and brain structures involved in processing visual stimuli in various animals. These studies have led to descriptions (although still somewhat coarse) of how "natural" visual systems work. This has also given rise to a subfield of computer vision in which people try to build artificial systems that simulate the visual processing of living things at varying levels of complexity. In addition, some machine learning-based methods in computer vision also draw on biological mechanisms.
Another field related to computer vision is signal processing. Many methods for processing single-variable signals, especially time-varying signals, extend naturally to the two-variable or multivariable signals handled in computer vision. However, because of the unique properties of image data, many methods developed in computer vision have no counterpart in single-variable signal processing. The main characteristics of such methods are their non-linearity and the multidimensionality of image information. Together, these two points have made this part of computer vision a distinct research direction within signal processing.
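To make the "natural extension" from one variable to two concrete, the sketch below reuses a one-dimensional smoothing filter to smooth an image, first along every row and then along every column (a separable filter). The 3-tap kernel (0.25, 0.5, 0.25) and the edge-repeating border rule are assumptions made for this example, not something prescribed by the text.

```python
def smooth_1d(signal, kernel=(0.25, 0.5, 0.25)):
    """Filter a 1D signal with a symmetric 3-tap kernel, repeating the edge samples."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [
        sum(weight * padded[i + j] for j, weight in enumerate(kernel))
        for i in range(len(signal))
    ]


def smooth_2d(image):
    """Extend the 1D filter to an image: filter each row, then each column."""
    rows = [smooth_1d(row) for row in image]
    columns = [smooth_1d(column) for column in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# A single bright pixel spreads into the 2D kernel built from the 1D one.
impulse = [[0, 0, 0],
           [0, 16, 0],
           [0, 0, 0]]
print(smooth_2d(impulse))
# [[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]]
```

Non-linear operations such as median filtering or morphological operators are examples of image methods that do not decompose this neatly.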
In addition to the areas mentioned above, many research topics can also be treated as purely mathematical problems. For example, many problems in computer vision are based on statistics, optimization theory, and geometry.
A major issue in the field today is how to implement existing methods in software and hardware, or how to modify them, so as to achieve reasonable execution speed without sacrificing too much accuracy.

Computer vision applications

Humanity is entering the information age, and computers are being used more and more widely in almost every field. On the one hand, more people without computer training need to use computers; on the other hand, computers are becoming more powerful and more complicated to use. This creates a sharp contradiction between the flexibility of human conversation and communication and the rigor and rigidity required when using a computer. People exchange information with the outside world through sight, hearing, and language, and can express the same meaning in different ways, whereas a computer runs only when programs are written in strict accordance with a programming language. To enable more people to use complex computers, the past situation in which people had to adapt to computers and memorize their rules of use must change. Instead, computers should adapt to people's habits and requirements and exchange information in the ways people are accustomed to; that is, computers should be able to see, hear, and speak. They must also be capable of logical reasoning and decision making. A computer with these capabilities is an intelligent computer.
Intelligent computers are not only more convenient for people to use. If they are used to control automation equipment, especially intelligent robots, these automation systems and robots can adapt to their environment and make decisions autonomously. They can then take over heavy work from people in many settings, or carry out tasks in dangerous and harsh environments.
Applications range from industrial machine vision systems that, for example, inspect bottles speeding by on a production line, to research into artificial intelligence and computers or robots that can understand the world around them. The fields of computer vision and machine vision overlap significantly. Computer vision covers the core technology of automated image analysis, which is used in many fields. Machine vision usually refers to a process that combines automated image analysis with other methods and technologies to provide automated inspection and robot guidance in industrial applications. In many computer vision applications, computers are preprogrammed to solve a particular task, but learning-based methods are now becoming increasingly common. Examples of computer vision applications include systems for:
(1) Controlling processes, for example, an industrial robot;
(2) Navigation, for example, by an autonomous car or mobile robot;
(3) Detecting events, for example, for video surveillance and people counting;
(4) Organizing information, for example, for indexing databases of images and image sequences;
(5) Modeling objects or environments, for example, medical image analysis or terrain modeling;
(6) Interaction, for example, as the input to a device for human-computer interaction;
(7) Automatic inspection, for example, in manufacturing applications.
One of the most prominent application areas is medical computer vision, or medical image processing. This area is characterized by the extraction of information from image data for the purpose of making a medical diagnosis of a patient. The image data generally take the form of microscope images, X-ray images, angiographic images, ultrasound images, and tomographic images. Examples of information that can be extracted from such data include the detection of tumors, atherosclerosis, or other malignant changes, as well as measurements of organ size, blood flow, and so on. This application area also supports medical research by providing new information, for example about the structure of the brain or the quality of medical treatments. Computer vision applications in the medical field also include enhancing images that are interpreted by humans, such as ultrasound or X-ray images, to reduce the effects of noise.
A second application area is industry, where computer vision is sometimes called machine vision and information is extracted to support a manufacturing process. One example is quality control, in which parts or final products are automatically inspected to find defects. Another example is measuring the position and orientation of parts to be picked up by a robot arm. Machine vision is also used extensively in agricultural processing to remove unwanted items from bulk material, a process known as optical sorting.
Military applications are probably one of the largest areas of computer vision. The most obvious examples are the detection of enemy soldiers or vehicles and missile guidance. More advanced missile guidance systems send the missile to an area rather than to a specific target, and the target is selected when the missile reaches the area, based on locally acquired image data. Modern military concepts such as "battlefield awareness" imply that various sensors, including image sensors, provide a wealth of information about a combat scene that can be used to support strategic decision making. In this case, automatic data processing is used to reduce complexity and to fuse information from multiple sensors to improve reliability.
A newer application area is autonomous vehicles, which include submersibles, land vehicles (small wheeled robots, cars, or trucks), aerial vehicles, and unmanned aerial vehicles (UAVs). The level of autonomy ranges from fully autonomous (unmanned) vehicles to vehicles in which computer vision-based systems support a driver or pilot in various situations. Fully autonomous vehicles typically use computer vision for navigation, that is, for knowing where they are, for building a map of their environment (SLAM), and for detecting obstacles. Computer vision can also be used to detect task-specific events, such as a UAV looking for forest fires. Examples of support systems are obstacle warning systems in cars and autonomous landing systems for aircraft. Several automakers have demonstrated systems for autonomous driving, but the technology has not yet reached the level at which it can be put on the market. There are ample examples of military autonomous vehicles, ranging from advanced missiles to UAVs used for reconnaissance missions or missile guidance. Space exploration is already using autonomous vehicles with computer vision, such as NASA's Mars Rover and the European Space Agency's ExoMars Rover.
Other application areas include:
(1) Support for visual effects production in film and broadcasting, such as camera tracking (match moving).
(2) Surveillance.

Computer vision similarities and differences

Computer vision, image processing, image analysis, robot vision, and machine vision are closely related disciplines. Textbooks bearing these names overlap considerably in the techniques and applications they cover. This shows that the basic theories of these disciplines are roughly the same, and one might even suspect that they are different names for the same discipline.
However, research institutions, academic journals, conferences, and companies often place themselves in one of these areas, and various characteristics that distinguish the disciplines have been proposed. One way of drawing the distinctions is given below, although it cannot be said to be completely accurate.
The research objects of computer vision are mainly three-dimensional scenes mapped onto single or multiple images, such as reconstruction of three-dimensional scenes. Computer vision research has largely focused on the content of images.
The research objects of image processing and image analysis are mainly two-dimensional images, which realize the transformation of images, especially for pixel-level operations, such as improving image contrast, edge extraction, noise removal and geometric transformation such as image rotation. This feature indicates that the research content of image processing or image analysis has nothing to do with the specific content of the image.
Machine vision mainly refers to vision research in the industrial field, such as the vision of autonomous robots, and the vision used for detection and measurement. This shows that in this field, through software and hardware, image sensing and control theory is often closely combined with image processing to achieve efficient robot control or various real-time operations.
Pattern recognition uses a variety of methods to extract information from signals, mainly using statistical theory. One of the main directions in this field is to extract information from image data.
Another area is called imaging technology. The initial research in this area was mainly about making images, but sometimes it also involved image analysis and processing. For example, medical imaging includes a large number of image analysis in the medical field.
Across all of these fields, it is entirely possible to work in a computer vision laboratory, do image processing, end up solving a problem in machine vision, and then publish the results at a pattern recognition conference.

Computer vision problems

Almost every specific application of computer vision technology has to solve a series of the same problems. These classic questions include:

Computer vision recognition

A classic problem common to computer vision, image processing, and machine vision is determining whether a set of image data contains a specific object, image feature, or motion state. This problem can usually be solved automatically by a machine, but so far no single method handles the general case: recognizing any object in any environment. Existing techniques can reliably solve only the recognition of specific targets, such as simple geometric shapes, faces, printed or handwritten documents, or vehicles, and even then only under specified conditions of lighting, background, and target pose.
Recognition in the broad sense has evolved into several slightly different concepts in different contexts:
Recognition (narrow sense): Recognizing one or more predefined or learned objects or object classes, usually together with their two-dimensional position in the image or three-dimensional pose in the scene.
Identification: Recognizing an individual instance of an object, for example a specific person's face or a specific fingerprint.
Detection: Finding specific conditions in the image, for example abnormal cells or tissues in medical images, or passing vehicles in traffic surveillance footage. Detection often uses simple image processing to find regions of interest in the image, which provide a starting point for more complex subsequent operations.
Several specialized tasks build on recognition:
Content-based image retrieval: Finding all images with specified content in a large collection of images. The specified content can take many forms, such as a red, roughly circular pattern, or a bicycle. Searching for the latter is clearly more complicated than searching for the former, because the former describes a low-level visual feature, while the latter involves an abstract concept (a high-level visual feature), "bicycle", whose appearance is not fixed.
Pose estimation: Estimating the position or orientation of an object relative to the camera, for example the pose and position of a robot arm.
Optical character recognition: Recognizing printed or handwritten text in an image, usually in order to convert it into an easily editable document format.
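The low-level end of content-based search can be illustrated with a deliberately crude sketch: ranking a collection by how much of each image is red. It covers only the colour part of the "red, roughly circular pattern" example and ignores shape entirely. The RGB thresholds, the image names, and the tiny 2x2 images are all made up for illustration.

```python
def red_fraction(image):
    """Fraction of pixels whose red channel clearly dominates green and blue."""
    pixels = [pixel for row in image for pixel in row]
    red = sum(1 for r, g, b in pixels if r > 150 and r > 2 * g and r > 2 * b)
    return red / len(pixels)


def retrieve(collection, min_fraction=0.5):
    """Return names of images that are at least min_fraction red, reddest first."""
    scored = [(red_fraction(image), name) for name, image in collection.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= min_fraction]


RED, GREEN, GREY = (220, 30, 30), (30, 200, 30), (128, 128, 128)
collection = {  # hypothetical 2x2 RGB images
    "stop_sign": [[RED, RED], [RED, GREY]],
    "lawn": [[GREEN, GREEN], [GREEN, GREEN]],
    "sunset": [[RED, RED], [GREY, GREY]],
}
matches = retrieve(collection)
print(matches)  # ['stop_sign', 'sunset']
```

A query such as "bicycle" cannot be answered with a fixed colour rule like this, which is why high-level concepts typically call for learned models.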

Computer vision movement

Motion analysis based on image sequences takes several forms, such as:
Egomotion: Determining the 3D rigid motion of the camera itself.
Tracking: Following moving objects through the sequence.
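A very small tracking sketch is shown below. It assumes a single bright object on a dark background, locates it in each frame by thresholding, and follows the centroid from frame to frame. The helper `make_frame`, the threshold, and the frame size are hypothetical; real trackers have to cope with clutter, occlusion, and multiple objects.

```python
def object_centroid(frame, threshold=128):
    """Centroid (x, y) of all pixels brighter than threshold, or None if there are none."""
    points = [
        (x, y)
        for y, row in enumerate(frame)
        for x, value in enumerate(row)
        if value > threshold
    ]
    if not points:
        return None
    count = len(points)
    return (sum(x for x, _ in points) / count, sum(y for _, y in points) / count)


def make_frame(x, y, width=4, height=3):
    """Synthetic test frame: one bright pixel at (x, y) on a black background."""
    return [[255 if (col, row) == (x, y) else 0 for col in range(width)]
            for row in range(height)]


frames = [make_frame(0, 1), make_frame(1, 1), make_frame(2, 1)]
path = [object_centroid(frame) for frame in frames]
velocity = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(path, path[1:])]
print(path)      # [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(velocity)  # [(1.0, 0.0), (1.0, 0.0)]
```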

Computer vision scene reconstruction

Given two or more images or a video of a scene, scene reconstruction seeks to build a computer model / three-dimensional model for the scene. The simplest case is to generate a set of points in three-dimensional space. In more complex cases, a complete 3D surface model is built.
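The simplest case, recovering a 3D point from two images, can be sketched for an idealised rectified stereo pair: two identical pinhole cameras side by side, so that a scene point appears on the same image row in both views. Under that assumption depth follows from the disparity as Z = f * B / d. The focal length, baseline, and principal point below are invented values, and `triangulate` is an illustrative helper rather than any library's API.

```python
def triangulate(x_left, x_right, y, focal_length, baseline, cx, cy):
    """Recover a 3D point from one match in a rectified stereo pair.

    x_left, x_right, y: pixel coordinates of the match (same row in both images).
    focal_length: in pixels; baseline: camera separation in metres.
    cx, cy: principal point in pixels.
    Returns (X, Y, Z) in metres in the left camera's coordinate frame.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_length * baseline / disparity
    x = (x_left - cx) * z / focal_length
    y3d = (y - cy) * z / focal_length
    return (x, y3d, z)


# Hypothetical camera: f = 500 px, baseline = 0.1 m, principal point (320, 240).
point = triangulate(370, 345, 240, focal_length=500, baseline=0.1, cx=320, cy=240)
print(point)  # approximately (0.2, 0.0, 2.0)
```

Repeating this for every matched pixel pair yields the set of 3D points mentioned above; building a surface model from those points is a separate step.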

Computer vision image restoration

The goal of image restoration is to remove noise in the image, such as instrument noise, blur, etc.
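For one particular kind of noise, isolated outlier pixels (impulse or "salt-and-pepper" noise), a 3x3 median filter is a common restoration step, sketched below on a list-of-rows grayscale image. This is only one of many restoration techniques and does not address blur; the sample image is made up.

```python
from statistics import median


def median_filter(image):
    """Replace each pixel by the median of its 3x3 neighbourhood (clipped at borders)."""
    height, width = len(image), len(image[0])
    return [
        [
            median(
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            )
            for x in range(width)
        ]
        for y in range(height)
    ]


# A uniform patch with a single "salt" pixel.
corrupted = [[10, 10, 10],
             [10, 255, 10],
             [10, 10, 10]]
restored = median_filter(corrupted)
print(restored[1][1])  # 10: the outlier is removed rather than averaged in
```

Unlike a mean filter, the median discards the outlier completely instead of spreading it over the neighbourhood.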

Computer vision system

The structure of a computer vision system depends largely on its specific application. Some systems work independently to solve specific measurement or inspection problems. Others form part of a larger, more complex system, working together with, for example, mechanical control systems, database systems, and human-machine interfaces. The specific implementation of a computer vision system also depends on whether its functionality is fixed in advance or is learned and adjusted automatically during operation. Nonetheless, some functions are needed in almost every computer vision system:

Computer vision image acquisition

A digital image is produced by one or more image sensors. Besides various types of light-sensitive cameras, these include remote sensing devices, X-ray tomography devices, radar, and ultrasonic receivers. Depending on the sensor, the resulting data can be an ordinary two-dimensional image, a three-dimensional volume, or an image sequence. Pixel values usually correspond to light intensity in one or more spectral bands (grayscale or color images), but they can also relate to other physical measures, such as depth, or the absorption or reflectance of sound waves, electromagnetic waves, or nuclear magnetic resonance.

Computer vision preprocessing

Before a computer vision method is applied to an image to extract specific information, one or more preprocessing steps are often needed so that the image meets the requirements of the subsequent method. For example:
Resampling to ensure that the image coordinate system is correct;
Smoothing and denoising to filter out device noise introduced by the sensor;
Contrast enhancement to ensure that relevant information can be detected;
Scale-space representation to enhance image structures at locally appropriate scales.
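Two of the steps listed above, resampling (here in its simplest form, keeping every second pixel) and contrast enhancement (here a linear stretch to the full 0 to 255 range), can be sketched as follows. The function names and sample values are illustrative, and the image is assumed to be a grayscale list of rows.

```python
def subsample(image, step=2):
    """Keep every step-th pixel in both directions (no anti-alias smoothing)."""
    return [row[::step] for row in image[::step]]


def stretch_contrast(image, low=0, high=255):
    """Linearly map the darkest pixel to low and the brightest to high."""
    values = [value for row in image for value in row]
    v_min, v_max = min(values), max(values)
    if v_max == v_min:  # flat image: nothing to stretch
        return [[low for _ in row] for row in image]
    scale = (high - low) / (v_max - v_min)
    return [[round(low + (value - v_min) * scale) for value in row] for row in image]


# A low-contrast 4x4 image whose grey levels span only 100..140.
image = [[100, 110, 120, 130],
         [105, 115, 125, 135],
         [110, 120, 130, 140],
         [115, 125, 135, 140]]
small = subsample(image)
print(small)                    # [[100, 120], [110, 130]]
print(stretch_contrast(small))  # [[0, 170], [85, 255]]
```

In practice the image is usually smoothed before it is reduced in size, because dropping pixels without smoothing can introduce aliasing.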

Computer vision feature extraction

Features at various levels of complexity are extracted from the image. For example:
Line and edge extraction;
Localized interest point detection, such as corner detection and blob detection;
More complex features related to texture, shape, or motion in the image.
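Edge extraction, the first item in the list, is commonly illustrated with the Sobel operator, which estimates the horizontal and vertical intensity gradients and combines them into an edge strength. The sketch below is a plain-Python version for a grayscale list-of-rows image; leaving border pixels at zero is a simplification chosen for this example.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges


def gradient_magnitude(image):
    """Sobel edge strength for interior pixels; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            output[y][x] = math.hypot(gx, gy)
    return output


# A vertical step edge between the third and fourth columns.
step = [[0, 0, 0, 100, 100],
        [0, 0, 0, 100, 100],
        [0, 0, 0, 100, 100]]
edges = gradient_magnitude(step)
print(edges[1])  # [0.0, 0.0, 400.0, 400.0, 0.0]
```

Thresholding the resulting magnitudes would give a binary edge map; corner and blob detectors build on similar local derivative measurements.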

Computer vision detection segmentation

During processing, it is sometimes necessary to decide which image points or regions are relevant for further processing, for example:
Selecting a specific set of feature points;
Segmenting the part of one or more images that contains a specific target.
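One basic way to segment out candidate targets is to threshold the image and group the remaining foreground pixels into connected regions. The sketch below uses a breadth-first flood fill with 4-connectivity; the threshold value and the sample image are assumptions made for this example.

```python
from collections import deque


def segment(image, threshold=128):
    """Group pixels brighter than threshold into 4-connected regions.

    Returns a list of regions, each a list of (x, y) pixel coordinates.
    """
    height, width = len(image), len(image[0])
    seen = set()
    regions = []
    for y in range(height):
        for x in range(width):
            if image[y][x] <= threshold or (x, y) in seen:
                continue
            queue, region = deque([(x, y)]), []
            seen.add((x, y))
            while queue:
                px, py = queue.popleft()
                region.append((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and image[ny][nx] > threshold and (nx, ny) not in seen):
                        seen.add((nx, ny))
                        queue.append((nx, ny))
            regions.append(region)
    return regions


# Two bright blobs on a dark background.
scene = [[200, 200, 0, 0],
         [200, 0, 0, 0],
         [0, 0, 0, 255],
         [0, 0, 255, 255]]
regions = segment(scene)
print(len(regions))                    # 2
print([len(region) for region in regions])  # [3, 3]
```

Each region can then be passed on separately to later stages such as measurement or classification.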

Advanced Computer Vision Processing

At this stage, the input is typically a small amount of data, such as the part of the image previously judged to contain the target object. Processing at this stage includes:
Verifying that the data obtained satisfy the required assumptions;
Estimating application-specific parameters, such as the pose and size of the target;
Classifying targets.
High-level processing involves understanding the meaning of the image content. It mainly builds on image segmentation and then interprets the segmented image regions, for example through recognition.
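As a minimal sketch of parameter estimation and classification on a segmented region, the code below computes a region's area, centroid, and in-plane orientation from its second-order central moments, and then applies a toy size rule. The pose (attitude) here is limited to a 2D orientation angle in image coordinates (y pointing down); `min_area` and the class labels are hypothetical.

```python
import math


def describe_region(points):
    """Estimate area, centroid and orientation of a region given as (x, y) pixels."""
    count = len(points)
    cx = sum(x for x, _ in points) / count
    cy = sum(y for _, y in points) / count
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    # Angle of the principal axis from the second-order central moments.
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return {"area": count, "centroid": (cx, cy), "angle_deg": math.degrees(angle)}


def classify(description, min_area=3):
    """Toy decision rule: regions smaller than min_area are treated as noise."""
    return "target" if description["area"] >= min_area else "noise"


diagonal = [(0, 0), (1, 1), (2, 2)]
info = describe_region(diagonal)
print(info["centroid"], round(info["angle_deg"], 1))  # (1.0, 1.0) 45.0
print(classify(info))                                 # target
print(classify(describe_region([(5, 5)])))            # noise
```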

Computer vision requirements

The influence of the light source layout needs to be considered carefully.
Select the correct lens group, considering magnification, space, size, distortion ...
Choose the right camera (CCD), considering function, specifications, stability, durability ...
Vision software development depends on accumulated experience: experiment widely and think through different ways of solving each problem.
The ultimate goal is to continuously improve accuracy and shorten processing time.

Computer vision conference

Computer vision top conferences

ICCV: International Conference on Computer Vision
CVPR: IEEE/CVF Conference on Computer Vision and Pattern Recognition
ECCV: European Conference on Computer Vision

Computer vision other notable conferences

ICIP: International Conference on Image Processing
BMVC: British Machine Vision Conference
ICPR: International Conference on Pattern Recognition
ACCV: Asian Conference on Computer Vision

Computer Vision Journal

Computer vision top journals

PAMI: IEEE Transactions on Pattern Analysis and Machine Intelligence
IJCV: International Journal of Computer Vision

Computer vision other notable journals

TIP: IEEE Transactions on Image Processing
CVIU: Computer Vision and Image Understanding
PR: Pattern Recognition
PRL: Pattern Recognition Letters
