December 9th, 2022

Computer Vision: Trends, Market, and Prospects

Smart video surveillance, image analytics, and biometrics have already surpassed the capabilities of even professionally trained human beings in many ways. A computer is never distracted or tired, which helps reduce the impact of human error in production, research, and everyday life. Registering traffic violations, swapping faces for social media, translating text through a smartphone camera, and Face ID on the iPhone are just a few examples of how computers have learned to interact with the outside world by "understanding" events, "seeing" objects, and distinguishing them from each other.

Computer Vision (CV) is a field of artificial intelligence related to image and video processing. It includes a set of techniques that allow computers to "see" and analyze received information – identify objects and people, recognize text, register movements, highlight homogeneous elements in images and videos, and much more.

In this article, together with the SimbirSoft ML team, we will explore the main applications of computer vision, problems, trends, and prospects for CV technology development.

CV Trends and Research Areas

Modern computer vision technologies allow companies across industries to tackle their business problems efficiently. The availability of industrial-level frameworks and libraries, numerous data sets, pre-trained models of various architectures, as well as a variety of efficient computing platforms (server-based, mobile, and embedded) facilitates this trend.

Here are some examples of traditional computer vision applications:

  • Self-driving vehicles: traffic control, route navigation, and control simulation.
  • Medicine: interpretation of CT, MRI, ultrasound and X-ray images.
  • Production: process control, detection of product defects.
  • Agriculture and forestry: monitoring of crops and weed growth, plantation inventory.

In recent years, the enhancements to artificial intelligence algorithms have led to CV technologies penetrating almost every area of everyday life: traffic control on the roads, face recognition and search through databases of criminals and missing people, counting store visitors and analyzing the availability of goods on the shelves, monitoring the use of personal protective equipment, etc.

A significant share of the CV market falls on the entertainment and shopping sector. The ViewEvo mobile application, for which we have developed a design layout, allows you to recognize and highlight various products in a photo or video: clothes, shoes, and accessories. After that, the service searches for the products or alternatives in partner online stores.

The successful application of computer vision technologies to solve a variety of problems allows us to switch to fully automatic systems, excluding human participation in decision-making. The most common example of such systems is self-driving vehicles.

Another important trend is the transition from the analysis of static images to the analysis of dynamic scenes. CV technologies allow us to identify behavior patterns of objects, and then analyze how they interact with each other over time.

The most common technology for computer vision today is convolutional neural networks. They identify local features that are characteristic of various objects and then use them to solve applied problems: detecting objects, classifying them, and even generating new images.
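To illustrate how a network layer picks up local features, here is a minimal sketch of the convolution operation used in convolutional neural networks. The toy image and the hand-written edge kernel are assumptions made for the example; in a real network the kernel weights are learned from data, and an optimized library would be used instead of plain Python loops.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Each output value depends only on a small kh x kw neighborhood.
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


# Toy 4x6 grayscale image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hand-written vertical-edge kernel (illustrative, not learned).
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The resulting feature map is non-zero only where the kernel window straddles the dark-to-bright boundary, which is the sense in which the filter responds to a local feature.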

To ensure the necessary efficiency of calculations and high speed of CV algorithms, especially on mobile and embedded devices, optimization techniques such as quantization, pruning, and knowledge distillation are often used.
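As a rough illustration of two of these techniques, the sketch below applies symmetric int8 quantization and magnitude-based pruning to a short list of weights. The function names and the sample weights are hypothetical, and the schemes shown are simplified variants chosen for clarity; production frameworks offer more elaborate versions. Knowledge distillation requires a training loop and is not shown.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from the smallest magnitude to the largest.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


# Made-up weights for illustration only.
weights = [0.8, -0.05, 0.3, -1.27, 0.01, 0.6]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
pruned = prune_by_magnitude(weights, sparsity=0.5)

print(quantized)
print(pruned)
```

Quantization trades a small rounding error (at most half of `scale` per weight here) for a representation that is four times smaller than 32-bit floats, while pruning removes the weights that contribute least so they can be skipped or stored sparsely.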

We highlight the following major research areas in computer vision:

  • Unsupervised and self-supervised learning algorithms. They make it possible to reduce or completely eliminate the expensive and time-consuming procedure of labeling the data set.
  • Application of the "transformer" architecture models, which have proven themselves in text processing tasks. The attention mechanism used by such models enables a more flexible approach to identifying patterns in images.
  • Reliability of models: ensuring their correct operation in the presence of noise on input images, deliberate attacks on the algorithm in order to achieve a certain behavior from it (adversarial attacks), changes in statistical distributions in the input data, etc.
  • Model interpretability: the ability to explain why the model produces a specific result.
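The attention mechanism mentioned in the list above can be sketched in a few lines. This is a minimal, simplified version of scaled dot-product attention: the patch embeddings are made-up numbers, and the learned query, key, and value projections that a real transformer applies are omitted, so the same vectors play all three roles.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Three hypothetical image-patch embeddings; patches 0 and 1 are similar.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = attention(patches, patches, patches)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to one and shows how strongly a patch attends to every other patch, regardless of how far apart they are in the image. This is what makes the approach more flexible than a fixed local window.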

CV Applications

Today, computer vision technologies are widely used both in production and research, as well as in the daily lives of people. Let's look at a few examples.

Security

In 2018, Hong Kong-based startup SenseTime raised $600 million in investment, becoming the most highly valued private artificial intelligence company at the time. The developers presented facial recognition and remote detection systems, as well as a solution for self-driving vehicles. A year after the launch, the company was worth, according to various estimates, from 3 to 4.5 billion dollars, and the project itself received the support of the Chinese government.

In addition to searching and comparing people's faces, video surveillance systems that use computer vision algorithms make it possible to detect various objects, monitor the situation indoors and in urban facilities, as well as protect workers in hazardous industries.

However, along with the obvious benefits that computer vision brings in terms of improving the safety of life, there are factors holding back the development of facial recognition technology. Firstly, there are legislative restrictions related to the protection of personal data of people. Secondly, there are ethical issues related to the problem of human rights violations. SenseTime, for instance, has been repeatedly mentioned in the context of the use of CV technologies against Muslims living in China.

Despite this, security remains the main use of computer vision both in Russia and worldwide. According to a 2018 TAdviser study, 32% of CV solutions come from video surveillance and security. Experts called this area of activity the most promising for computer vision in the near future.

Retail

Today, for most stores, understanding how customers behave and offering them a personalized experience is the key to success. Technology helps retailers collect customer data and audit outlets, thereby contributing to sales growth. Let's look at examples of how CV helps businesses improve their KPIs.

  • On Shelf Availability (OSA): control of goods on the shelf

Cameras installed on the sales floor analyze images of the shelves. If a product is out of stock, the neural network sends a notification to the system. Everyone benefits from this: the seller, whose customer does not leave the store to buy the product from a competitor, and the manufacturer, who receives valuable information about the movement of goods on the shelf.

  • Safety and theft protection

Facial recognition systems can scan the faces of customers at the entrance to the outlet and immediately check the resulting data against blacklists of known shoplifters. Such solutions are also able to distinguish store employees from unauthorized persons, preventing the latter from entering restricted areas.

  • Queue control

Long queues reduce customer loyalty, and fluctuating visitor traffic distorts employee KPIs. Together, these factors increase staff turnover and hurt the business. The neural network determines when the number of people in the queue exceeds the acceptable value and sends a notification to the outlet monitoring system. The technology also makes it possible to determine the average waiting time after which a customer leaves the queue without making a purchase. Retailers use the collected data, among other things, to optimize the number of staff in the outlet.

  • Heat maps

The neural network analyzes data on the movement of customers on the sales floor and shows which areas of the store are popular and which are "cold". The resulting heat maps help to confirm or refute the existing rules for displaying goods and to attract buyers to certain shelves. They also help marketing specialists plan the retail space more precisely: from posting information about promotions in the busiest places of the store to selling space to tenants in shopping centers.
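The heat maps described above boil down to counting how often customers are observed in each part of the floor. The sketch below assumes that an upstream tracker already outputs customer positions in meters. The function names, the floor dimensions, and the sample positions are all hypothetical.

```python
def build_heatmap(positions, width_m, height_m, cell_m=1.0):
    """Count how many tracked positions fall into each cell of a floor grid."""
    cols = int(width_m // cell_m)
    rows = int(height_m // cell_m)
    grid = [[0] * cols for _ in range(rows)]
    for x, y in positions:
        col = int(x // cell_m)
        row = int(y // cell_m)
        # Ignore detections that fall outside the mapped floor area.
        if 0 <= row < rows and 0 <= col < cols:
            grid[row][col] += 1
    return grid


def cold_cells(grid, threshold):
    """Return (row, col) of cells visited no more than `threshold` times."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, count in enumerate(row)
        if count <= threshold
    ]


# Made-up (x, y) positions on a 3 m x 2 m section of the sales floor.
positions = [(0.5, 0.5), (0.7, 0.2), (0.1, 0.9), (2.5, 0.5), (1.5, 1.5)]

heatmap = build_heatmap(positions, width_m=3, height_m=2)
cold = cold_cells(heatmap, threshold=0)

for row in heatmap:
    print(row)
print("cold cells:", cold)
```

In practice the counts would be accumulated over hours or days and rendered as a color overlay on the floor plan. The grid resolution is a design choice that balances detail against noise.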

Industry

In 2015, the founder of the World Economic Forum, Klaus Schwab, first used the term Fourth Industrial Revolution (or Industry 4.0). This concept describes a new approach to production, based on the penetration of technology into all spheres of the economy.

One of the characteristic features of Industry 4.0 is the introduction of artificial intelligence at industrial enterprises. Computer vision technologies used in manufacturing are also referred to as machine vision (MV). MV fully automates assembly, defect detection, laser cutting, and other processes that previously required specially trained staff.

For example, SimbirSoft experts took part in the development of an application that keeps records of raw materials in the forest industry. Using machine learning algorithms, the system measures the diameter of tree trunks in a photo to within a centimeter.

Occupational health and safety is another important application of machine vision in the industrial sector. Facial recognition systems are widely used at control panels and on production lines, where operators must maintain a high level of alertness. These systems help to make sure that the operator controls the production process in accordance with the regulations. In the event of an accident, they inform the staff about the location and severity of the incident.

Medicine

The introduction of computer vision technologies in medicine opens up opportunities for the study of a wide range of diseases. Algorithms analyze medical images (X-rays, MRI, ultrasound) and help improve the accuracy of disease diagnosis. In particular, an image may contain small details that are not noticeable to the human eye but that a CV system can detect reliably.

For example, the InnerEye system developed by Microsoft is capable of detecting abnormal formations in computed tomography data and is widely used in radiation therapy for the treatment of cancer. The company also advocates for the democratization of CV technologies in the medical industry. In 2020, the InnerEye software package was made publicly available, allowing healthtech providers to integrate machine learning models into their own systems.

Neural network algorithms are also used in computer diagnostics to plan personalized therapy and increase the accuracy of decision-making. In telemedicine, CV technologies help to conduct primary diagnosis of certain diseases from a photo, without the need to visit the doctor's office.

Current CV Issues

Despite its rapid progress and relevance, computer vision inevitably faces a number of technical problems and limitations:

  • High demand for labeled data. Among all the stages of preparing data sets for the development of computer vision algorithms, labeling is the most time-consuming and costly. At the same time, the number of images and the quality of the labels largely determine the quality of the final models. As a consequence, implementing computer vision algorithms can be difficult in areas where collecting and labeling a large enough data set is hard or impossible.
  • Interpretability of the results of the algorithm. Machine learning models have traditionally been treated as black boxes: both the input and the output data are available to us, but we cannot say how the system arrived at the result. This lack of interpretability leads to distrust of CV algorithms, especially in areas with a high cost of error (e.g., medicine).
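One common way to probe a black-box model, offered here only as an illustration and not as a method the article itself describes, is occlusion sensitivity: hide one region of the image at a time and record how much the model's score drops. In the sketch below, `toy_model` is a hypothetical stand-in for a trained classifier that, by construction, looks only at the top-left corner of the image.

```python
def toy_model(image):
    """Hypothetical classifier stand-in: scores the mean brightness of the top-left 2x2 block."""
    return sum(image[0][:2] + image[1][:2]) / 4


def occlusion_map(model, image, patch=2, fill=0.0):
    """Score drop observed when each patch x patch block is replaced by `fill`."""
    base = model(image)
    height, width = len(image), len(image[0])
    result = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            occluded = [r[:] for r in image]  # copy so the input stays intact
            for i in range(top, min(top + patch, height)):
                for j in range(left, min(left + patch, width)):
                    occluded[i][j] = fill
            # A large drop means the model relied on this region.
            row.append(base - model(occluded))
        result.append(row)
    return result


# Uniformly bright 4x4 toy image.
image = [[1.0] * 4 for _ in range(4)]
importance = occlusion_map(toy_model, image)

for row in importance:
    print(row)
```

Only the top-left block produces a score drop, which reveals what the model actually depends on without inspecting its internals. With a real network the same loop applies, but it requires one forward pass per occluded region.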

The Future of Computer Vision

According to TAdviser's preliminary estimate, the Russian CV market volume could reach almost RUB 40 billion by 2025, a fivefold increase since the study was conducted in 2019.

Today, the following trends in the development of CV technologies can be distinguished:

  • The emergence of effective algorithms and learning methods for training high-quality models on small datasets.

Today, training a model for a new task requires from several hundred to several thousand labeled images. Collecting and labeling such a data set is costly or, in some cases, not possible at all. Thus, the industry needs to learn how to use data more efficiently.

  • Development of multimodal models capable of processing several data types simultaneously, for example, images and text.

Most existing models work reasonably accurately with either text or images, but not both. However, these types of data are often found together: text with illustrations, videos with subtitles and comments, etc. There is every reason to believe that using a single model to handle such situations will simplify the algorithm development process and significantly improve the accuracy of the results.

  • The emergence of new use cases, such as answering questions about images and videos that are asked and answered in natural language.

Today, image processing is usually reduced to classification, segmentation, or object detection. Interpreting the output and transforming it into a human-understandable result, filtering, and other operations are implemented by separate algorithms or additional models. The idea of combining the model and the post-processing algorithms in a single model seems quite promising. A video at the input is converted into a text description of the events that occurred. If the input is a description of a situation, the output is a set of images depicting it. If the input is an image and a question, the output is the answer to the question based on the image. Such a solution will improve the quality of the results and simplify development.

Closing Remarks

The scope of application of CV technologies is expanding year after year. There are fewer and fewer branches of business where neural networks cannot come to our aid and sometimes completely replace manual labor or automate routine tasks.

Attitudes toward computer vision can be ambiguous. On the one hand, this technology raises many ethical questions and is regulated at the legislative level in many countries. On the other hand, it is a powerful research tool that businesses are increasingly turning to amid growing interest in artificial intelligence algorithms. According to all forecasts, the already rapidly developing computer vision market is expected to grow even faster in the foreseeable future.

