Boosting Business With Machine Learning-Based Automated Visual Data Processing: Results of Finnish Company Interviews

Machine Learning (ML) solutions are rapidly evolving and are increasingly capable of performing Automated Visual Data Processing (AVDP) tasks such as visual scene understanding, 3D model reconstruction and automated content generation. We believe that novel AVDP solutions can significantly boost a company's business, streamline the creation of 3D content and models, enable the adaptation of content in XR applications and support the creation of digital twins for a company's needs. However, there are also obstacles that limit the business use of such solutions. The objective of our research is to study the skills and insights of companies in using different kinds of AVDP solutions in their business. This article presents the results of interviews with 10 Finnish companies. The interviews comprised three sections: The first section gave the respondents a brief introduction to existing AVDP solutions. The second section collected information on the respondents' background and the respective company's experience of using ML in business and its short- and long-term interest in developing or using different AVDP solutions in its business. The third section comprised thematic interviews which covered visual data processing themes that the respondents selected to be of interest to their company's business.


I. INTRODUCTION
Recent advances in Automated Visual Data Processing (AVDP) solutions such as ML-based visual scene understanding, 3D model reconstruction and automated content generation solutions can provide great business potential and a business ecosystem for companies that either develop or use these novel solutions in their business.
In order to boost the use of novel AVDP solutions in business and to study the skills and insights of a company in using such solutions, we decided to interview companies that could benefit from such solutions and adopt different roles in an AVDP business ecosystem. The interviews comprised three sections. The first section included a brief introduction that presented examples of the newest ML-based AVDP solutions. In the second section, the respondents answered general questions that covered their background, their company's experience of using ML in business and their interest in using different AVDP solutions in their business. The third section comprised thematic interviews which covered visual data processing themes that the respondents selected to be of interest to their company. We analysed the responses and prepared a summary showing the kind of AVDP solutions that the interviewed companies need.
This paper makes the following contributions:
• A classification of ML-based AVDP solutions that provides a starting point for the company interviews.
• The results of the company interviews, which provide new information and insights about the skills and interests of the respective companies in using AVDP solutions and present aspects of ML-based AVDP solutions that the respondents thought would boost or reduce the use of such solutions in their business.

This paper is organized as follows: Section II includes a brief introduction to the state of the art of AVDP solutions. Section III presents the results of the company interviews. Section IV provides a discussion related to the results of the interviews. Finally, Section V provides concluding remarks.

II. THE STATE OF THE ART OF AUTOMATED VISUAL DATA PROCESSING
This section provides a brief overview and classification of existing ML-based AVDP solutions. The solutions can be classified in the following three categories based on their usage purpose: solutions for visual scene understanding, 3D model reconstruction and automated content generation. The following subsections discuss these categories in more detail.

A. VISUAL SCENE UNDERSTANDING
Applications such as autonomous driving, indoor navigation and even virtual or augmented reality systems need accurate and efficient segmentation mechanisms for visual scenes [2]. Visual scene understanding can be based on static (for an image) and/or dynamic (for a video) input data understanding [1] and support coarse-grained or fine-grained inference. The following classification introduces five levels of visual scene understanding from coarse-grained to fine-grained inference (based on [2] and [1]):
1) Level 1: Image classification - The most coarse-grained semantic segmentation can simply predict what the object in an image input is, or even provide a ranked list if there are many objects in the input.
2) Level 2: Object localization - is capable of providing the classes and information regarding the spatial location (e.g. centroids or bounding boxes) of these classes. This can require understanding an image at pixel level and matching each pixel to an object class (a pixelwise object recognition problem) [3].
3) Level 3: 3D object pose detection - In this case, the segmentation mechanism is capable of performing object localization and estimating the pose of a 3D object in an image input.
4) Level 4: Semantic segmentation - applied to still 2D images, video and even 3D or volumetric data - is one of the key solutions that paves the way for complete scene understanding, making dense predictions that infer a label for each pixel. This way, each pixel is labelled with the class of its enclosing object or region [2].
5) Level 5: Instance segmentation - is capable of producing separate labels for different instances of the same class and even part-based segmentation (low-level decomposition of already segmented classes into their component parts).
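The dense, pixelwise labelling of Level 4 reduces, at its final step, to taking an argmax over per-class scores at every pixel. A minimal numpy sketch (the scores below are made-up illustrative values, not output from any cited system):

```python
import numpy as np

def segment(logits: np.ndarray) -> np.ndarray:
    """Collapse per-pixel class scores (H, W, C) into a label map (H, W).

    Each pixel is labelled with the class of its highest-scoring channel,
    which is the final step of a typical semantic segmentation network.
    """
    return np.argmax(logits, axis=-1)

# Toy 2x2 image with 3 classes; the per-class scores are invented.
logits = np.array([
    [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]],
    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]],
])
labels = segment(logits)  # -> [[1, 0], [2, 1]]
```

In a real pipeline the logits would come from a trained network; everything before the argmax is where the actual learning happens.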

B. 3D MODEL RECONSTRUCTION
Reconstructing 3D geometries using active sensors, passive cameras, online images or unordered 3D points is a well-studied area of research in computer graphics and vision [4]. There is also extensive literature within the AR and robotics community on Simultaneous Localization and Mapping (SLAM), aimed at tracking a user or robot while creating a map of the surrounding physical environment [4]. Machine learning-based 3D model reconstruction solutions can be categorized into the following classes:
1) 3D model reconstruction from 2D images - In this case, a 3D model is reconstructed from one or more 2D images. For example, [5] and [7] propose neural network-based solutions to reconstruct a mesh from a single 2D image; [6] presents a solution that uses a trained CNN to map a 2D input sketch to Procedural Modelling (PM) parameters, which are then used in PM to generate a corresponding 3D shape for the sketch; and [9] introduces a 3D object reconstruction method that efficiently reconstructs a dense point cloud from a single 2D image.
2) 3D model reconstruction from 2.5D images - In this case, a 3D model is reconstructed from 2.5D input such as RGB-D images or point cloud data. For example, [10] supports reconstructing complete geometry from a point cloud acquired with a low-quality consumer-level scanning device.
3) 3D model reconstruction from 3D data - In this case, a 3D model is reconstructed from incomplete or broken 3D data. For example, [12] proposes a method that can reconstruct and complete the shape of a broken or incomplete 3D object model by using a 3D Variational AutoEncoder (VAE), which can encode meaningful latent representations, and a Generative Adversarial Network (GAN), which is capable of generating realistic samples from a latent representation using a two-player minimax game.
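A common first step when working with 2.5D input is back-projecting an RGB-D depth image into a point cloud using the pinhole camera model. The sketch below uses assumed toy intrinsics (fx, fy, cx, cy) and is not taken from any of the cited works:

```python
import numpy as np

def depth_to_points(depth: np.ndarray, fx: float, fy: float,
                    cx: float, cy: float) -> np.ndarray:
    """Back-project a depth image (H, W) into an (N, 3) point cloud.

    Uses the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with zero depth (no measurement) are dropped.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]            # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]            # keep valid depths only

# 2x2 toy depth map with one missing measurement; intrinsics are invented.
depth = np.array([[1.0, 2.0], [0.0, 4.0]])
cloud = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

The resulting cloud is the kind of raw 2.5D data that the ML-based reconstruction methods above would then complete into full geometry.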

C. AUTOMATED CONTENT GENERATION
Content creation has been a key research area in computer graphics for decades and one of its main goals has been to minimize manual intervention, while still allowing the creation of a variety of plausible 3D objects [13]. Existing automated content generation solutions provide support for:

1) SYNTHESIS OF 2D IMAGES
Generative Adversarial Networks (GANs) have shown significant promise as generative models for natural images, for multi-stage image generation including part-based models, for image generation conditioned on input text or attributes, for image generation based on 3D structure, and even for video generation [14]. The best results come from particularly well-defined use cases, such as the synthesis of faces, where learning is restricted to a 2D image type with a fairly strict structure.

2) SYNTHESIS OF 3D SHAPES
It is challenging to create a novel shape that appears to be sampled from the distribution of a given class because it is hard to encapsulate the essence of a class of objects and express it in a compact model to guide the creation of a new sample that belongs to the class [13]. GAN-based solutions address this problem and are capable of generating novel 3D shapes that are based on features of real objects. For example, [11] proposes a 3D-GAN model that consists of a generator and a discriminator, where the discriminator tries to classify real objects and objects synthesized by the generator, and the generator attempts to confuse the discriminator. In this 3D-GAN, the generator maps a 200-dimensional latent vector, randomly sampled from a probabilistic latent space, to a 64 × 64 × 64 cube, representing an object in 3D voxel space. The discriminator outputs a confidence value of whether a 3D object input is real or synthetic [11]. Voxel-based 3D-GANs typically operate at low resolution, which limits the applicability of the generated content in real-life applications.
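The generator/discriminator interfaces described above can be sketched at the shape level as follows. This is an untrained, illustrative stand-in: a single random linear layer replaces the stacked transposed 3D convolutions of the real model, and the grid is reduced from 64³ to 8³ to keep the example small:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class Toy3DGAN:
    """Shape-level sketch of the 3D-GAN interfaces described in [11].

    The real model uses stacked (transposed) 3D convolutions and a
    64x64x64 voxel grid; here one untrained linear layer and an 8x8x8
    grid stand in for them, purely to show the tensor shapes involved.
    """
    def __init__(self, latent_dim=200, res=8, seed=0):
        rng = np.random.default_rng(seed)
        self.res = res
        self.w_gen = rng.normal(0, 0.01, (latent_dim, res ** 3))
        self.w_disc = rng.normal(0, 0.01, (res ** 3,))

    def generate(self, z):
        """Map a latent vector to voxel occupancies in [0, 1]."""
        return sigmoid(z @ self.w_gen).reshape(self.res, self.res, self.res)

    def discriminate(self, voxels):
        """Return a confidence in [0, 1] that the input grid is real."""
        return float(sigmoid(voxels.reshape(-1) @ self.w_disc))

gan = Toy3DGAN()
z = np.random.default_rng(1).normal(size=200)   # 200-dim latent sample
voxels = gan.generate(z)                        # (8, 8, 8) occupancy grid
score = gan.discriminate(voxels)                # scalar confidence
```

Training would alternate gradient updates so the discriminator separates real from synthetic grids while the generator tries to fool it; that minimax loop is omitted here.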

3) SYNTHESIS OF 3D DESIGNS
Design synthesis methods can be divided into two categories: rule-based and data-driven design synthesis [15]. The former (e.g. grammar-based design synthesis) requires labelling the reference points or surfaces and defining rule sets, so that new designs are synthesized according to this hard-coded prior knowledge, while the latter learns rules/constraints from a database and generates plausible new designs with similar structure/function to exemplars in the database [15]. The existing 3D design synthesis solutions support:
a) Synthesis of designs with hierarchical structures - For example, [15] introduces a GAN-based model for problems in which exploration of the latent design space has to be staged, since the optimal solution of one component depends on the geometry of another. The model proposed in [15] synthesizes designs with hierarchical structures in two stages: the first synthesizes the parent shape from a learned latent representation, and the second synthesizes a child shape from another learned latent representation conditioned on the parent shape.
b) Synthesis of layouts with constraints - The production of layout designs with complex constraints is a problem of searching for an optimal solution in a design space confined by constraints [16]. It is a particularly challenging problem due to the non-uniqueness of the solution and because it is difficult to identify a set of suitable design variables that define the layout. It is also challenging to incorporate the constraints into conventional optimization-based methods [16]. Reference [16] proposes a method that uses a VAE to learn the constraints and to generate layout design candidates that automatically satisfy all of them. The method is based on the assumption that a design space is a specific space associated with a certain but unknown distribution [16]. A VAE model can be constructed to learn the unknown distribution of a design space by providing a dataset that satisfies all the constraints. Once the mapping is established, the VAE is capable of generating new datapoints that satisfy the constraints, which allows the design to be conducted without explicitly imposing any constraints, since the search stays within the confined space identified by the VAE [16]. The VAE network is also capable of learning the underlying physics of the design problem, leading to an efficient design tool that needs no physical simulation once the network is constructed [16].
c) Synthesis of 3D component assemblies - For example, [17] describes a neural network architecture for suggesting complementary components and their placement in an incomplete 3D part assembly.
d) Synthesis of indoor scenes - This is a process of generating functional and plausible indoor scenes with appropriate furniture and layout [18]. It is an inherently difficult task since it requires considerable knowledge of both selecting reasonable object categories and arranging objects appropriately [18]. For example, [35] presents a solution that integrates a recursive neural network with a VAE and is capable of learning hierarchical structures of 3D indoor scenes and generating a plausible 3D scene from a random vector in less than one second.
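The generative step that lets such a VAE propose new, constraint-respecting layouts is the standard reparameterization trick: sample a latent vector from the learned posterior and decode it. A minimal sketch of the sampling step (the dimensions and values are illustrative, not from [16]):

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).

    A trained layout VAE would decode z into a new design; because the
    latent space was fitted only on constraint-satisfying examples,
    designs decoded from such samples tend to respect the constraints
    without the constraints being imposed explicitly.
    """
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(42)
mu = np.zeros(4)        # encoder mean for one design (illustrative size)
log_var = np.zeros(4)   # encoder log-variance of the latent posterior
z = sample_latent(mu, log_var, rng)   # a new latent layout candidate
```

The encoder and decoder networks themselves are omitted; only the sampling bridge between them is shown.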

III. THE INTERVIEW RESULTS
Companies that were assumed to be potential users or developers of AVDP solutions were asked to participate in the interviews. As a result, a total of 17 representatives from 10 Finnish companies participated in the interviews in Helsinki or Oulu between October and December 2019. The interviews included two companies from the manufacturing sector and eight companies from the service sector that offer services such as XR solutions, 3D design services or construction planning services. Table 1 depicts a summary of the interviews. The last column in Table 1 presents the proportion of positive answers to the statements presented in the rows. The companies differed in size, service type and application domains. Company size is defined according to [8]: micro-enterprise <10, small enterprise <50, medium-sized enterprise <250 and large enterprise ≥250 employees. The interviews included representatives from seven SMEs and three large enterprises. Five interviews included respondents (e.g. CEOs, CTOs and COOs) from the C-suite; six interviews included respondents who manage R&D at a company and three interviews included respondents who perform R&D activities at a company.
The interviews comprised three sections. We performed a state-of-the-art review of ML-based AVDP solutions before the interviews and prepared a brief, general-level introduction to these solutions, which we presented in the first section of the interviews. The objective was to provide a general level of information about novel AVDP solutions for the respondents and then discuss these solutions in more detail in subsequent sections of the interviews. The second section included a questionnaire that collected information about the respondents' background and the respective company's experience and interest in using ML and AVDP solutions in their business. The representatives also defined the expected role of the company in an AVDP business ecosystem with regard to being a developer or a user of AVDP solutions, or both, if appropriate. The third section consisted of theme interviews that covered the visual data processing themes that the respondents selected to be of interest for their company's business. The results of the interviews are discussed in more detail in the following subsections.

A. PRESENT SITUATION IN THE COMPANIES
The majority of the interviews (nine out of ten) included respondents who worked on both business development and technical development. Many of the respondents had previous knowledge of ML solutions. For example, four interviews included respondents who had already used ML solutions in practice. One interview included a respondent who had extensive knowledge of ML solutions but had not yet used these solutions in practice. Four interviews included respondents who had already studied ML techniques and had some knowledge of existing ML solutions. In one interview, the respondents had no previous knowledge of ML solutions.
The respondents believed that ML could be used at least to some extent for specific tasks in their business. 90% of the companies had experimented with using ML or had used ML in business, and 30% of the companies already used ML in their business. A lack of training data is an obstacle that limits the use of ML in business: in practice, it can often be challenging to obtain the necessary training data for ML. The interviews show that there is a clear need for solutions that support the production of training data. For ML solutions to be used, the production of training data should not require too much effort from the existing personnel. The majority of companies (80%) have data that could be used in the production of training data. Unfortunately, thus far, a company's data is typically not used in ML. This data is typically raw data, such as visual data captured from the environment, which requires pre-processing and labelling, and there is neither sufficient time nor resources to categorize this data. Only one of the companies had labelled training data for ML. In addition, one company had used simulated training data for ML.
The interviews show that identifying the potential of ML can also be challenging. Although the companies had data for the production of training data, detecting problems that could be solved with existing ML solutions can be challenging. Also, there must be customers who are willing to pay for the developed ML-based features.

B. REQUIREMENT FOR AN AVDP ECOSYSTEM
The respondents identified many requirements and opportunities relating to the business use of AVDP solutions. For example, AVDP solutions can provide a competitive edge for a company, making it possible to offer products and services that differ from competitors, and create a faster workflow. For business, it is crucial that an AVDP solution works in practice. For example, an AVDP solution may work well in test conditions but may not translate well to real-world use cases. It is often the case that a large proportion of the work is about fine-tuning the small details.
The type and size of a company affect its expected role in an AVDP ecosystem. Respondents from eight companies expected their company to have a developer role, while respondents from nine companies expected their company to have a user role in an AVDP ecosystem in the future. In addition, 70% of the companies expected to have both an AVDP user and a developer role in the future.
One respondent expected that their company would have a user role with a developer partner capable of developing ML solutions for the needs of the company's customers. Larger companies could have units in both the AVDP user role and the AVDP developer role. In the future, there could also be opportunities to use ML solutions developed in foreign units.
The business domain determines the AVDP methods that are required to boost the business of a particular company. Most of the companies preferred to use existing ML solutions and development is expected to focus more on integration/adaptation tasks aiming at utilising existing ML methods in business rather than developing completely novel ML methods for a particular company's needs. Company-specific ML methods are only developed in special cases if a suitable ML solution is not available for business needs. The interviews show that there are also opportunities for developing solutions to serve a group of selected companies with similar kinds of business needs for AVDP.

C. REQUIREMENT FOR VISUAL SCENE UNDERSTANDING
Five companies expected to use all five levels of visual scene understanding presented in the interviews. The interviewed companies require methods that are capable of understanding visual space, handling occlusions and detecting partially visible objects in a physical environment. For example, a point cloud does not accurately describe a scene if certain objects are only partially visible. There is also a need for methods that are capable of understanding hierarchical structure on a level that enables the detection of elements that can be taken apart and snapped together. Visual scene understanding can also be used for inserting information content into the points of a point cloud data set.
Many companies had the need to detect the desired object types in a physical environment. The number of object classes that must be identified depends on the use case. Some use cases require the preparation of a comprehensive inventory for physical objects in a room requiring the recognition of many object classes. In some use cases, it is sufficient that a solution is capable of detecting 2D surfaces or the positions and poses of a very limited number of 3D objects. Some use cases only require text recognition that is capable of detecting sign texts on surfaces in a physical environment. There was also a need for more advanced solutions that could detect the components and structures of physical objects in a visual environment.
The input data used in visual scene understanding could be 2D data such as 2D, or 360-degree videos, or 2.5D data such as point cloud or RGB-D data captured from the environment. Environment scanning takes time, processing power, human effort and creates costs. There should also be clearly defined protocols for users scanning in a visual environment. Variant illumination and mirrored surfaces are difficult to handle in visual scene understanding. Achieving sufficient measurement precision is costly and can therefore result in great differences in the quality of the captured input data. In special cases, it is possible to use high quality scanners and professional users to capture input data from a physical environment. However, the capturing must often be carried out at minimum cost and with sensors available in consumerlevel devices.
The companies had very different functional requirements (e.g. coverage, accuracy and performance requirements) for visual scene understanding methods. The type of use stipulates different kinds of requirements for the accuracy of visual scene understanding. Many respondents stated that it was better to have a smaller number of working features than a high number of poorly working features in a visual scene understanding solution. For example, one respondent stated that 98% correct recognition of fewer classes is better than 10% accuracy for all objects in an environment. Accuracy is not so critical in entertainment applications. However, professional use such as building design or maintenance stipulates higher requirements for the accuracy of object recognition.
There are various kinds of performance requirements for visual scene understanding methods. Applications such as AR applications require virtually real-time tracking for the physical environment but there are also applications that can use offline processing. For example, some applications can capture input data in a physical environment and send the data to the cloud services that will finally perform offline processing for the visual scene understanding tasks and for the captured content.

D. REQUIREMENT FOR 3D MODEL RECONSTRUCTION
The majority of respondents expected that the 3D model reconstruction methods would have the potential to boost their business. These methods can lower modelling costs and offer new information and completely new customer bases. 3D model reconstruction can assist in the visualization of physical environments and buildings in XR environments.
The respondents noted many applications for 3D model reconstruction in business. 3D model reconstruction can be used for enhancing VR environments, producing 3D models for changing physical environments such as construction sites, combining real-world information with accurate 3D design models, or detecting differences between 3D design models and a constructed environment. For example, consumers, building designers and construction and maintenance companies can benefit from reconstructed 3D models. In some companies there was a need for real-time 3D model reconstruction with or without textures.
3D model reconstruction can be used for enriching 3D design models. For example, it is possible to generate a 3D model and then refine the model at a later point by capturing the data of a building. Not all parts of 3D objects are always visible. This requires 3D model reconstruction that is capable of performing object completion for the missing parts. For example, technically interesting structures are often located behind the surfaces of a building.
Some companies want to use 3D model reconstruction to process point cloud data. There is a need for pattern recognition and the replacement of standard components (e.g. valves and pipes) or partially visible objects in point clouds captured from physical environments. Also, manual cleaning and simplification of captured point cloud data is a difficult and tedious task that requires much effort. 3D model reconstruction methods can simplify this work as these methods can detect objects and surfaces and remove point cloud data that relates to objects and surfaces that should not be included in the captured models.
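One of the cleaning steps described above, removing points that belong to an already detected surface, can be sketched as a distance test against a fitted plane. A minimal numpy sketch (the plane, points and threshold are illustrative values, not from any interviewed company's data):

```python
import numpy as np

def remove_plane_points(points, normal, d, tol=0.05):
    """Drop points lying within `tol` of the plane n·x + d = 0.

    A detected surface (e.g. a floor found by plane fitting) can be
    stripped from a captured point cloud this way, automating part of
    the manual cleaning step. `normal` must be unit length.
    """
    dist = np.abs(points @ normal + d)
    return points[dist > tol]

# Toy cloud: two points on or near the floor plane z = 0, one above it.
cloud = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 0.01], [0.5, 0.5, 1.0]])
floor_normal = np.array([0.0, 0.0, 1.0])
cleaned = remove_plane_points(cloud, floor_normal, d=0.0)  # keeps 1 point
```

A full cleaning pipeline would first detect the surfaces (e.g. by RANSAC plane fitting or a learned segmentation) before applying such a filter.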
The quality and amount of input data can have a huge impact on the results produced with ML. The quality of input data also depends on the quality of scanning processes and scanning devices. Thus, there must be clearly defined protocols on how input data is captured in an environment, as it should be possible for anyone to capture the visual data (2D images, 360 videos and point clouds) from an environment.
There is a need for fast and accurate measurement of buildings. However, the cost is a very important factor in large volume projects. 3D laser scanning can produce accurate scanning data but takes time and effort and creates costs. A viable alternative could be the use of low-cost devices in the reconstruction of 3D models.
Mobile applications require methods that are capable of producing simplified 3D models for real-world objects to enable the fluent use of the content in mobile devices and networks. Simplifying 3D content can cause losses of information, although this is not as important in entertainment applications such as AR applications. The manual simplification of 3D models can lead to major costs if there are many 3D models that need to be simplified. Existing authoring tools such as Unity are capable of simplifying 3D content but more tools are needed to automate the simplification of 3D models.
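As an illustration of what automated simplification involves at its simplest, vertex clustering snaps vertices to a coarse grid and merges duplicates. Real decimation tools also rebuild faces and preserve sharp features; this numpy sketch only shows the vertex-merging idea, with an invented grid size:

```python
import numpy as np

def simplify_vertices(vertices, cell=0.1):
    """Vertex-clustering simplification: snap each vertex to a grid of
    size `cell` and merge the resulting duplicates.

    This is one of the simplest automated decimation schemes; the loss
    of detail grows with the cell size, which is why entertainment
    content can tolerate coarser cells than professional models.
    """
    snapped = np.round(vertices / cell) * cell
    return np.unique(snapped, axis=0)

# Two nearly coincident vertices collapse into one; the far one survives.
verts = np.array([[0.00, 0.00, 0.00],
                  [0.02, 0.01, 0.00],
                  [1.00, 1.00, 1.00]])
reduced = simplify_vertices(verts, cell=0.1)  # 3 vertices -> 2
```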

E. REQUIREMENT FOR AUTOMATED CONTENT GENERATION
The interviews demonstrate that there is a growing interest in using automatic content generation in business. The respondents saw that these solutions would boost their business as they lower both production costs and the volume of manual work, speed up workflows, reduce content creation time, and enable a way of providing less expensive services for customers. Automated content generation can also assist in selling and demonstrating a company's offering for its target market and lowers the threshold for offering services to specific customers. For example, it can make the customisation of products and services easier, thereby helping to create Proof of Concepts (PoCs) and offers to potential customers. For example, AR applications can greatly benefit from automated content generation as content creation is typically the slowest step in AR. Automated content generation can enable new use cases for XR as not all content presented in applications needs to be manually prepared on a case-by-case basis.
The use of 3D content will increase and, like 2D content today, 3D content will become a standard way of presenting things. There is a need both for fully automated content generation and for solutions that assist users in the creation of 3D content. For example, the interviewed companies require solutions that are capable of automatically generating dynamic XR experiences for customers, and solutions that assist the generation of 3D content from 2D content in VR environments. Automated content generation such as avatar generation from 2D selfies is already being used by some VR companies.
The development of automated content generation solutions takes time and creates costs. However, in the long term, these methods can produce more value and improve the User eXperience (UX) of a company's products and services. Many of the respondents emphasized the importance of UX and the usability of User Interfaces (UIs). Firstly, a better UX provides a competitive edge (e.g. it is important to achieve the so-called 'wow' effect in XR experiences). Secondly, a better customer experience will offer greater sales opportunities. The user interface must meet certain criteria and follow design principles because it is the most important aspect of the business, and a positive user experience is essential. Unfortunately, such issues can be difficult to manage in automatic content generation.
The quality requirements for 3D content can vary greatly. In some cases, 3D content can simply be meshes and textures are not needed. Some applications require better content so that the meshes and textures are in place in real time. Consumer apps often require very high-quality content and some of the interviewed companies want to provide ultra-high-quality content for their customers. In such cases, the desired quality level can be very difficult or even impossible to achieve using the existing automated content generation methods. There is also a need for solutions that are capable of adapting the quality of 3D content. For example, 3D asset complexity must be correct so that more than 60 FPS can be achieved when using VR glasses.
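Adapting content quality to a performance target, such as the 60 FPS requirement for VR glasses, can be as simple as choosing the most detailed level of detail (LOD) that fits a per-frame geometry budget. A minimal sketch with invented triangle counts (none of these numbers come from the interviews):

```python
def pick_lod(triangle_counts, budget):
    """Return the largest LOD triangle count that fits the budget,
    falling back to the coarsest LOD if none fits.

    The budget would be derived from a frame-time target (e.g. 60 FPS)
    and the capabilities of the target device.
    """
    fitting = [t for t in triangle_counts if t <= budget]
    return max(fitting) if fitting else min(triangle_counts)

# LOD sizes from coarse to fine for one asset (illustrative numbers).
lods = [500, 5_000, 50_000]
choice = pick_lod(lods, budget=8_000)   # -> 5000
```

Real engines combine such budgets with distance-based LOD switching and per-asset quality settings.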
The automatic synthesis of designs can assist in the creation of better 3D designs and structures and can make it faster to arrive at an optimal solution for a building, for example. ML can assist with the kind of issues that cannot be handled using the parametric models that are now widely used in building design. Unfortunately, at this point, it appears that the automatic synthesis of designs is only possible in very well defined and limited use cases. For example, at present, the fully automated creation of building designs is very difficult to achieve because there are many variables that must be considered in the design of a building.
Many companies were interested in using the synthesis of indoor scenes for generating 3D scenes and VR environments for specific uses. For example, the synthesis of indoor scenes can be used to enhance the visual appearance of virtual models of buildings and the décor of rooms in virtual models can assist the evaluation of room designs. Architects, among others, could benefit from the synthesis of indoor scenes that could draw up a reasonable proposal for room décor such as materials and furniture.

IV. DISCUSSION
Only companies that had expressed an interest in AVDP were asked to participate in the interviews. Thus, it is not possible to generalize the results of the interviews to all kinds of companies. For example, it can be assumed that the interviewed companies have better than average knowledge of ML solutions and thus have more potential to adopt a developer role in an AVDP ecosystem.
The business use of AVDP solutions requires companies to be able to quantify the size of the required investments, estimate the potential savings and increased margin that AVDP can generate in their business in the longer term and, finally, estimate the overall economic return on investments made in AVDP systems. This can be challenging for an external study as it requires accessing confidential information about companies, understanding the business domain and having knowledge of the possibilities and challenges of AVDP solutions. However, overall, all the companies saw an investment in AVDP as an attractive option in the long term. One important reason for this was the proven cost efficiency of the current cloud-based solutions.
The respondents identified many opportunities relating to the use of AVDP solutions in business. For example, AVDP can provide a competitive edge for a company, enabling it to offer products and services that differ from those of its competitors, and can speed up workflows.
Many companies require visual scene understanding solutions that can detect physical objects in an environment. Most of the companies also require more advanced solutions that can detect the poses, parts and structures of physical objects in a visual environment.
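Detection requirements such as these are commonly evaluated against ground-truth annotations with the intersection-over-union (IoU) metric, which also underlies the detection-accuracy concerns raised later in the interviews. A minimal, self-contained sketch (illustrative, not part of the study) is:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlapping region (empty if the boxes do not intersect).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A prediction overlapping half of a 10x10 ground-truth box:
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 50/150 ≈ 0.333
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5, which gives a concrete handle on what "inadequate detection accuracy" means in practice.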
The reconstruction of 3D models can lower modelling costs and yield new information that opens up a completely new customer base.
Automated content generation solutions can boost business as they lower production costs and the amount of manual work, speed up workflows, reduce content creation time and enable less expensive services for customers. Automated content generation can also assist in selling and demonstrating a company's offering to its target market and lowers the threshold for customising services for specific customers.
However, the respondents also identified challenges that limit the use of AVDP solutions in business. Firstly, companies must be capable of identifying use cases that benefit from AVDP, and there must be customers who are ready to pay for the developed AVDP features. Secondly, companies should be aware of the limitations of such solutions and be capable of estimating the kind of results that these solutions will produce in real-life use. Thirdly, there are practical issues that limit the use of AVDP solutions in business: inadequate object detection accuracy, challenges related to the physical environment (e.g. occlusions, illumination and mirrors) and the quality and capture costs of input data (manual work and sensors) limit the use of visual scene understanding solutions; insufficient reconstruction accuracy, together with similar physical-environment and input-data-quality issues, restricts the use of 3D model reconstruction solutions; and development costs, inadequate quality of the generated content and excessive complexity limit the business use of automated content generation solutions.

V. CONCLUSION
This paper presents the results of 10 company interviews that studied the skills and insights of companies in developing or using ML-based AVDP solutions in their business.
Most of the companies had previous knowledge or experience of using ML in business: 90% of the companies had experimented with ML or had used ML in business, and 30% of the companies had already used ML in their business in one form or another. The type and size of a company affect its expected role in an AVDP ecosystem. Eight companies expected to have a developer role and nine companies expected to have a user role in an AVDP ecosystem in the future. In addition, 70% of the companies expected to have both an AVDP user role and an AVDP developer role in the future. The respondents identified many opportunities in the AVDP solutions. For example, AVDP can provide a competitive edge for a company, enabling it to offer products and services that differ from those of its competitors, and create a faster workflow. The business use of AVDP solutions requires a company to be capable of identifying use cases that benefit from AVDP, and there must be customers who are ready to pay for the developed AVDP features. It is also very important that the adopted solutions work in real-life use. For example, issues such as inadequate quality of input data, properties of the physical environment (e.g. occlusions, illumination and mirrors) and development costs can limit the business use of AVDP solutions.
Most of the companies preferred to use existing ML solutions instead of developing new solutions for their business needs. The interviews show that there are also opportunities to develop solutions to serve a group of selected companies that have similar kinds of business needs for AVDP.
The interviews also show that a lack of training data is an obstacle that limits the use of ML in business, and there is a clear need for solutions that support the production of training data. The majority of the interviewed companies (80%) have data that could be used to produce training data. Unfortunately, in many cases, this data is currently unprocessed raw data that does not have a clearly identified use.
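As a minimal sketch of the kind of tooling that could help turn unprocessed raw data into training data, the hypothetical helper below labels records with a caller-supplied function and splits them deterministically into training and validation sets. The function names and the labelling rule are illustrative assumptions, not tools mentioned by the respondents:

```python
import random

def train_val_split(records, labeller, val_fraction=0.2, seed=0):
    """Label raw records with a caller-supplied labelling function and
    split them reproducibly into training and validation sets."""
    labelled = [(r, labeller(r)) for r in records]
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    rng.shuffle(labelled)
    cut = int(len(labelled) * (1 - val_fraction))
    return labelled[:cut], labelled[cut:]

# Hypothetical example: labelling raw sensor readings by a threshold.
raw = list(range(100))
train, val = train_val_split(raw, labeller=lambda r: r >= 50)
print(len(train), len(val))  # 80 20
```

In practice the `labeller` step is where the manual annotation effort (or a semi-automated heuristic) enters, which is exactly the bottleneck the respondents described.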
MARKO PALVIAINEN received the M.Sc. degree from the Lappeenranta University of Technology, in 1998, and the Doctor of Technology degree in computer science from the Tampere University of Technology, in 2007. He is currently a Senior Scientist working with the Collective and Collaborative AI Team, Technical Research Centre of Finland (VTT). He has several international top tier publications in the field of end-user programming. He has considerable experience of semantic technologies and models and ontology-based software development methods and tools aimed at increasing the effectiveness of software development processes. His research interests include software architecture, software development methods, data economy, machine learning solutions, and automated visual data processing. He also holds the title of Docent in adaptive interaction techniques with the University of Oulu. He has authored around 100 scientific publications. He holds over 30 patents. His current interests include machine learning methods for enabling novel interaction techniques and behavior, and context and activity detection for future wearable devices for various application domains.