Visual Cue Based Corrective Feedback for Motor Skill Training in Mixed Reality: A Survey

When learning a motor skill it is helpful to get corrective feedback from an instructor. This will support the learner to execute the movement correctly. With modern technology, it is possible to provide this feedback via mixed reality. In most cases, this involves visual cues to help the user understand the corrective feedback. We analyzed recent research approaches utilizing visual cues for feedback in mixed reality. The scope of this article is visual feedback for motor skill learning, which involves physical therapy, exercise, rehabilitation etc. While some of the surveyed literature discusses therapeutic effects of the training, this article focuses on visualization techniques. We categorized the literature from a visualization standpoint, including visual cues, technology and characteristics of the feedback. This provided insights into how visual feedback in mixed reality is applied in the literature and how different aspects of the feedback are related. The insights obtained can help to better adjust future feedback systems to the target group and their needs. This article also provides a deeper understanding of the characteristics of the visual cues in general and promotes future, more detailed research on this topic.


INTRODUCTION
Physical activity, especially exercise and physiotherapy, is important for improving and maintaining good health. In recent years, the instructions and feedback given for learning and executing the relevant body movements have been increasingly supported by technology. In particular, systems employing mixed and augmented reality have been devised to support this so-called motor skill training. Mixed and augmented reality technologies offer a platform for innovative visualization techniques for motor skill training that are worth exploring.
In this article, we survey such approaches to visual feedback using mixed reality in the fields of physical therapy, exercise and motor skill learning in general, with regard to the visual feedback and technologies involved. In comparison to the existing surveys (see Section 1.1), we look deeper into which visual cues are used and how they are employed to achieve the goal of movement correction. To this end, we devise a classification of the surveyed approaches which covers, among other aspects, the MR technologies used, the temporal and spatial characteristics of the feedback given, and the body parts addressed by the feedback. Additionally, we relate the approaches to the stages in which humans learn skills [15]. The intention is to obtain a clearer view of which visualization techniques and visual cues are suitable for given tasks and to show where there are gaps in the existing research. The former can help practitioners and developers choose appropriate visual cues for their task and system, while the latter can show researchers avenues for future research.
This survey discusses approaches from 39 papers which were selected out of 131 papers initially reviewed. These approaches were found to be relevant in terms of providing insight into the types of corrective visual motion feedback investigated in current research.
Our analysis focuses on the visual feedback in the surveyed literature. It does not cover therapeutic and medical aspects. Surveys addressing these aspects are listed in Section 1.1.

Related Work
There is a limited number of surveys discussing visual feedback in mixed reality. While their scope regarding use cases, body parts or technologies is often narrower than ours, their analysis of the types of feedback and the forms it can take is broader. The survey at hand goes into more depth regarding feedback while retaining a wide scope on use cases, body parts, and technologies. In the following paragraphs, we discuss related surveys and ways in which our work complements the existing research.
The scope of the present work, and therefore the scope of the related surveys, is located at an intersection between medicine, sports and computer science. In the medical literature, surveys like Mubin et al. [43], Schiza et al. [59], Gandhi et al. [17] and Rutowski et al. [57] provide a treatment-oriented perspective on the field of digital feedback for movements. Their overview and analysis are mainly focused on the outcome of the treatment and less so on the type of visual feedback. The present paper, however, is concerned with the visualization of the feedback given. Related surveys from the physical exercise part of the literature, in particular Perin et al. [53] and Liebermann et al. [33], look at the performance regarding exercise. One topic connected to visual feedback is (serious) gaming. Thus it is worth noting that the above-mentioned Mubin et al. [43] and Ma et al. [35] discuss serious games in health care. Sawan et al. [58] present a literature review on how various MR and AR technologies are used in the sports industry. The review provides insights into the sport-related use cases of MR and AR technologies.
Gatullo et al. [18] conducted a systematic literature review and classification for visual assets in industrial augmented reality applications.As skill learning and training are an important use case for augmented reality in the industrial context, their work is related to ours.The focus of their approach, however, lies heavily on tool handling, which we excluded from the scope of this work (see Section 3).
In addition to these generally related publications, there are a few papers which stand out as being closer to our approach, as they analyze visual aspects of feedback and hence have a scope overlapping ours: Viglialoro et al. [74] investigate literature aiming at shoulder rehabilitation supported by augmented reality.The scope of their review led to a sample collection of nine papers.The arm and hand movements of the users, as well as rehab settings, target groups, tracking technologies of each augmented reality (AR) system, user interfaces and evaluation methods were investigated.
The work of Neumann et al. [46] surveys 20 approaches focusing on virtual reality in physical exercise.They investigated activity, equipment, virtual reality (VR) technology, point of view and whether other persons than the user are present in the environment.The characteristics of test groups were looked at as well.Number of participants, gender, age range, experience type and location were investigated and documented.Additionally, the paper summarized aims, conditions, measured features, immersion and key findings of the researched literature.
Brennan et al. [6] analyzed literature about feedback design in home rehabilitation.While this comes close to our approach, they did not focus on mixed reality and limited their scope to home rehabilitation, which resulted in a smaller body of literature of only 19 research attempts.Clinical context, system components, feedback design, and the evaluation of these features were investigated.The feedback characteristics were categorized and analyzed.This resembles our approach, although we took a closer look at the visualizations per se.The smaller scope of our survey in comparison to the approach of Brennan et al. enabled us to analyze visual cues in more detail.

Organization of This Paper
This paper starts with an introduction establishing the motivation of the present survey and relating it to previous works as well as putting it in scientific context. Relevant terminology and fundamental concepts like feedback and skill learning basics are introduced in Section 2. In Section 3 our methodological approach to acquire literature is explained and the guidelines we followed to do so are described. Section 4 elaborates on how we categorized the literature and the features we chose for that matter. Table 2 shows the results of our classification and represents a central point of reference for the whole paper. To make the nature of the surveyed literature more accessible to the reader, an exemplary discussion can be found in Section 5. In Section 6 the main insights of this survey are presented. The paper is concluded by summarizing the findings, mentioning limitations and highlighting interesting open questions in Section 7.

BACKGROUND AND TERMINOLOGY
To establish a common ground for understanding and discussion, we explain the definitions of the most important terms in this section.Additionally, we provide an overview of the fundamental concepts we are working with in this article.

Feedback
Fig. 1. Illustration of the human-machine feedback loop based on [41]. The machine or system usually is a computer with some kind of display, for example an augmented reality headset.

The term feedback originates from electronics as stated by Morone et al. [41]. In this context the output of a system is combined with the input to affect the function of the system. This idea was later transferred to the social sciences, as humans observe their actual state and regulate their behaviour according to a desired state to minimize an error (or distance in our case), as stated by Morone et al. [41]. This idea is illustrated in Fig. 1. The natural feedback loop, which is represented on the left in Fig. 1, involves planning, executing, perceiving and adjusting the movement.
This feedback loop can be extended to incorporate a technical feedback system as seen in the literature we analyze (represented by the right loop in Fig. 1).To provide feedback, the machine detects certain aspects of the human output (in our case the movement).This information then enters the system as machine input via device sensors.The machine-generated feedback is then delivered to the human as a (in our case visual) machine output.
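The extended loop can be sketched in a few lines of code. This is an illustrative sketch only, not a model taken from the surveyed systems: it assumes a one-dimensional joint position, and the function name `feedback_loop`, the `gain` parameter (the fraction of the displayed error a simulated learner corrects per repetition) and the `tolerance` are hypothetical.

```python
def feedback_loop(start, target, gain=0.5, tolerance=0.01, max_steps=50):
    """Simulate the human-machine loop: sense, compare, display, adjust."""
    position = start
    history = [position]
    for _ in range(max_steps):
        error = target - position      # machine input: sensed deviation from the target
        if abs(error) <= tolerance:    # close enough, no further correction cue
            break
        cue = error                    # machine output: the displayed correction
        position += gain * cue         # the human corrects part of the displayed error
        history.append(position)
    return history


steps = feedback_loop(start=0.0, target=1.0)
```

Each entry in `steps` is the position after one pass through the loop; in this toy setting the remaining error halves with every repetition until it falls within the tolerance.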
The scope of the present paper includes literature which for the most part includes what Morone et al. [41] define as 'augmented feedback'.This means the user is already aware of the feedback signal given by the system.In our case, the visual information of the body position in space is emphasized by the feedback, and a focus is placed on movements that the users could readily detect themselves.
The term augmented feedback has to be distinguished from the term biofeedback, which is commonly used in current literature.It often occurs in the context of device-supported rehabilitation feedback.When used accordingly, it refers only to signals the users are not aware of, for example in electromyography (EMG), where the electrical activity of the muscle is measured [39].
In educational settings, feedback is traditionally given by another person, usually a teacher, instructor or trainer (see Section 2.3).In this context, feedback which enables participants to correct their behaviour is often called corrective feedback [22], [34].Similarly, in a feedback system, information qualifies as corrective feedback if it gives the user insight into how the movement can be carried out differently in order to accomplish the task at hand correctly or at least in an improved manner.
To be precise at this point, it has to be mentioned that in computer science 'feedback' is often used for the response of a system to confirm input by the user [7].This meaning of the term is not relevant for the scope of this article.

Phases of Skill Learning in Physical Activity
The acquisition of new skills proceeds in three stages or phases as described by Fitts and Posner [15]. These phases of skill learning are connected to motor skill acquisition as shown by various authors (see e.g., [61], [64], [67] and [70]). The three stages as seen in Table 1 take place one after another over the course of internalizing a movement.
Skill acquisition starts in the cognitive stage, where the learner tries to grasp the overall concept and understand what to do.The flow of information from an instructor (or instructions) to the learner plays a major role during this phase, as the learner still processes what to do.In the associative stage the learner will construct the actions to be done from minor movements (subroutines) and the information gathered during the cognitive stage.The final stage is the autonomous stage.Herein the learner has fully internalized the movement.The cognitive capacity needed for the movement is minimal in this stage.Thus additional information can be accessed or processed while making use of the skill.The efficiency and performance of the activity enacted still increase in this stage.
Once the learners have internalized an action, they can revisit stages to improve and 'relearn' their movements [25]. As stated by Fitts and Posner [15], the transitions between stages are not always clear. Nevertheless, it will be useful in the survey at hand to relate the skill level of the target group to the feedback given.

Instructor or Agent
In Fitts and Posner's work [15] discussed above, instructors play a central role.They transfer knowledge to the learner and decide what input is suitable at a given moment.In most mixed reality systems, a machine substitutes the instructor as illustrated in Fig. 1.This means that feedback systems have to be designed carefully with the user in mind (participatory design) [12].Oftentimes health care professionals are included in this process [23].
Hattie and Timperley establish the more general notion of an agent in their work [22].This notion includes teachers, peers, books, parents, the self and experience.It can be regarded as analogical to the term instructor.A mixed reality system substituting the human instructor qualifies as such an agent.Virtual Trainers, virtual medical professionals and simplified human shapes are depicted in mixed reality to provide feedback to the user, to hint at positional discrepancies, or to show in advance what positions to copy.A good example for this is the work of Mostajeran et al. [42], which utilizes a virtual coach offering instructions to the user.
A few systems included in our survey, like those described by Debarba et al. [13] and Furukawa et al. [16], provide feedback to instructors.These approaches are applied to rehabilitation but could also be applied to physical exercise or skill learning in general.One advantage of systems targeting instructors is the exact metrics they can provide to the instructors.Consequently, the instructor is well informed and can decide what information to give to the learner or client.

Immersion
As discussed by Nilsson et al. [47], there are various definitions of immersion which seem to differ from one another quite considerably. What they all seem to have in common is that immersion influences a feeling of presence, an impression of being there. Ijsselsteijn et al. [27] established that immersion and, connected with it, presence help to motivate users. They even increase the feeling of competence and control, which is highly relevant when applying given feedback and hence correcting a false movement.

Presentation based on [25].

METHODOLOGY AND SCOPE
To acquire literature for this survey, we used snowballing as our search approach. The snowballing followed a scheme similar to the one described by Wohlin [78]. We preferred snowballing to database keyword searches because it has been reported to be more effective for acquiring sources in general (see e.g., Greenhalgh et al. [19], Badampudi et al. [2]). As the start set for the snowballing, we used the papers mentioned in the related work (Section 1.1). While snowballing, we did not limit ourselves to papers, but considered all sources of interest to the research community which incorporate motor skill training in mixed reality.
To decide which of the papers obtained to include in the survey, we conducted a screening analogous to the flow diagram of the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines described by Liberati et al. [32].
Since the field of mixed reality is moving fast and there have been major changes and innovations in recent years, we additionally limited the publications to be surveyed to those published since 2016. The year 2016 marks the launch of Microsoft's AR headset HoloLens, which represents an important development for the research community in the field of mixed reality applications [51].
In detail, through snowballing with the above-mentioned scope, we identified 131 promising papers. After screening, we found that 28 papers were not relevant to our overview: their keywords or titles sounded promising, but their content did not match our scope. Subsequently, when checking for eligibility according to the use of visual corrective feedback, 64 more papers were excluded. We found that these papers did not provide corrective feedback in the sense explained in Section 2.1. In the end, 39 papers matching our scope and providing suitable content were included in the review. The methodological quality of the papers has not been evaluated.
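The selection counts reported above can be checked with simple arithmetic. The variable names below are ours and serve only to label the stages of the screening; the numbers are those stated in the text.

```python
identified = 131            # papers found through snowballing
excluded_screening = 28     # content did not match the scope
excluded_eligibility = 64   # excluded in the eligibility check for visual corrective feedback

after_screening = identified - excluded_screening    # papers left after screening
included = after_screening - excluded_eligibility    # papers included in the review
```

The result, 39, matches the number of papers included in the review.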
To verify our acquisition and selection process, we conducted a database search in Google Scholar, IEEE Xplore and the ACM Digital Library using search terms extracted from the literature matching our criteria. For this purpose, the word pairs with the highest occurrence among the paper titles (ignoring filler and linking words) were identified. Two pairs at a time were combined with AND operators into a search term, resulting in three or four words per term depending on whether the pairs shared a word. All databases were searched with the same terms. No additional relevant papers were found by this process. Thus we conclude that the snowballing was effective and sufficiently thorough.
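The construction of the search terms can be sketched as follows. This is a minimal sketch under stated assumptions, not the exact procedure used: it assumes that a word pair means two adjacent title words after filler words are removed, the stop-word list is a hypothetical stand-in, and the example titles are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Assumed filler and linking words; the list actually used is not given in the text.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def top_word_pairs(titles, n):
    """Return the n most frequent adjacent word pairs across all titles."""
    counts = Counter()
    for title in titles:
        words = [w for w in title.lower().split() if w not in STOP_WORDS]
        counts.update(zip(words, words[1:]))   # adjacent word pairs
    return [pair for pair, _ in counts.most_common(n)]


def search_terms(pairs):
    """Combine every two pairs into one AND term, dropping duplicate words."""
    terms = []
    for first, second in combinations(pairs, 2):
        words = list(dict.fromkeys(first + second))  # keeps order, removes repeats
        terms.append(" AND ".join(words))
    return terms


# Hypothetical titles, for illustration only.
titles = [
    "Visual feedback for motor skill training",
    "Motor skill training in mixed reality",
    "Visual feedback in mixed reality",
]
pairs = top_word_pairs(titles, 4)
terms = search_terms(pairs)
```

Two pairs that share a word, such as ("motor", "skill") and ("skill", "training"), yield a three-word term, while two disjoint pairs yield a four-word term, which corresponds to the three or four words per term mentioned above.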
It appears to be especially important to mention why certain types of approaches are not as present in this survey.
First, approaches that utilize movements only as an input (e.g., walking-in-place in VR applications) are not covered by this survey as they do not provide motion feedback (corrective feedback) in the sense we discussed in Section 2.1.Input movements as seen in walking-in-place approaches or even mouse clicks are merely necessary means to control systems and applications.Exertive movements as an input are usually used to increase immersion.
Second, exergames [48] or serious games are usually designed to fulfill certain objectives, like sport or rehabilitation.Corrective feedback as discussed in Section 2.1 is provided in just a few instances, for example the papers by Afyouni et al. [1], Caserman et al. [9], Raffe et al. [56] or Booth et al. [4].These works do not define their use of the term feedback.But they provide a visual incentive to carry out a certain movement correctly.
Third, mixed reality skill training occasionally focuses on handling tools. Examples of such approaches have been described by Pucihar et al. [54] and Mohr et al. [40]. These approaches give feedback for motor skills, but often focus on the tool itself, not on the body. Although handling a tool requires a complex combination of body movements, and although the position of the tool is a result of these combinations, we excluded papers that lack a focus on the body from the survey. Nevertheless, we included work that gives feedback for the body parts handling the tools. One such approach by Furukawa et al. [16] provides feedback for the hand positioning while writing calligraphy. Another relevant approach regarding tool use is the work of Cao et al. [8], who utilize augmented reality to show full-body feedback for skill training involving machine tasks.
Fourth, there is at least one research attempt which appears noteworthy although it predates 2016 and is thus not included in the main scope of this article: the approach of Tang et al. [69] provides corrective feedback involving visual cues similar to many of the other references in the present survey. Thus, this work could be categorized according to Section 4 without problems. What is especially interesting about this work are the similarities and differences between its visual feedback categories (movement arc, directional arrow, nearest arm and top-down angle) and the categories we use in Section 4.
Finally, our intent is to investigate the visual cues related to movement feedback from a visualization standpoint.Thus, we do not analyze the therapeutic aspects of the surveyed approaches.

CLASSIFICATION
To analyze the work surveyed in this article and to provide insights about it, we select different features and characteristics to classify and group the literature. The following paragraph outlines which categories were chosen, why they are important and how they impact visual feedback. Each category is described in a separate subsection below, and Table 2 depicts the complete classification. The classification is directed at the visual computing aspects of the feedback. The therapeutic and medical aspects of the topic are discussed elsewhere (see Section 1.1 for more information).
The technology used to implement mixed reality has a big influence on immersion (see Section 2.4) and feedback visualization. Additionally, the point of view is important to distinguish, as it influences the identification with the avatar as well as the immersion and therefore the impact of feedback. It also affects which body parts are visible to the user. The abstraction type of the feedback determines what kind of information is provided to the user. It is possible to provide the feedback at different times of the process. This temporal order impacts the learning process and changes the way feedback is perceived. A classification into the aforementioned stages of learning (see Section 2.2) can give a quick indication of how and in what depth the system provides feedback. The type and the scope of feedback change drastically depending on the body parts for which the feedback is provided, and both type and scope are strongly connected to the use case (sports, rehabilitation etc.) the feedback is aimed at. Lastly, the publication venue is a higher-level feature to classify the literature.

MR Technologies
There are different definitions regarding the term mixed reality (MR) [66].For this study we based our definition on the reality-virtuality continuum of Milgram et al. [38].
The mixed reality technologies used in the surveyed literature involve very diverse approaches [60]. HMDs for both VR and AR are widespread. HMDs used in AR can be further categorized as optical see-through, in which the real world is perceived through glasses and the virtual elements are added to it, and video see-through, which shows virtual elements together with the camera-captured surroundings. A special implementation of VR is CAVE, which combines life-sized screens with stereo glasses to create an immersive digital environment. In the context of AR, a room-mounted display, oftentimes displaying an augmented mirrored camera image (augmented mirror), can be used to provide feedback. Finally, there are AR approaches which augment the environment by projecting computer graphics directly into it.

Point of View
The point of view (POV) plays an important part in the type of feedback that can be given.A third person or exocentric perspective (Fig. 2, left) can provide a full-body view, which makes it easier to supply a complete feedback for multi-joint movements and complex movement sequences.
It could be argued that the immersion provided by a first person or egocentric point of view (Fig. 2, right) is superior to a third person view. This would mean a first person view offers more motivation and a feeling of control (see Section 2.4). However, immersive HMDs often feature a limited field of view [72], which can lessen the user's ability to see and correct position or movement for certain body areas.
There are approaches which combine both of the above-mentioned perspectives.

Abstraction Type
The abstraction type of movement information used for giving feedback impacts the users' experience and the corrections they execute. Directional feedback shows the direction in which a limb should be corrected. For example, arrows can be utilized to achieve this. An alternative to showing the correcting movement is to visualize the target state. Positional feedback predominantly does this with an outline, transparent target avatar or end position to show where the ideal position is. In contrast to that, guidance demonstrates the desired movement before the user executes it. The concrete visual cues being used to provide these types of information are described in Section 4.9.

Temporal Order
The temporal order in which the feedback is provided in the context of the movement execution varies in the analyzed approaches.In some cases a playback, where the feedback is shown after the execution, might increase the precision of movement, while in other cases a real time feedback offers an instantaneous in-situ opportunity to apply correction.Additionally, it is possible to show the user the future movements, which can provide information about upcoming target motions.

Stages of Learning
Considering that motor feedback can help users learn a skill, Fitts and Posner's [15] stages of learning can be applied to visual motor feedback. We use the feedback features provided by the surveyed literature to assign each approach to one of the stages of learning. This assignment can provide useful information regarding which part of the learning process the feedback addresses and which depth it can provide. The stages are explained in more detail in Section 2.2. It should also be noted that the stages transition fluidly into one another and that in some cases arguments can be made for a classification into a different category [15].

Publication Venue
The literature surveyed is sourced from several different publication venues.We categorized the venues as computer science (VR, AR, MR), computer science (HCI), computer science (other), medicine, health & sports as well as patents.These categories can give readers a general orientation in which areas most of the research is rooted and where there is still potential.The descriptions from authors from different venues also usually put emphasis on different parts of the respective approaches (e. g. application versus technology versus usefulness).

Body Parts
Furthermore the body parts for which feedback is given are of interest.The feedback changes with the degrees of freedom of different joints.It is also interesting to consider how the visibility of a body part in a neutral position interacts with the feedback (see Section 4.2).In the literature we surveyed, the feedback was provided for arms, legs, hands or the whole body.

Use Case
The analyzed approaches showed a wide variety of use cases the feedback was given for.We identified individual sports, team sports, rehabilitation and motor skill training as typical use cases.Motor skill training, here, does not only include approaches that analyzed motor skill training per se, but also the ones that had no use case taken into consideration so far.

Visual Cues
The visualizations in the literature surveyed featured several different visual cues to indicate how a target movement should be executed. These visual cues are the most in-depth description of the feedback we provide and are closely linked to other features such as technology, body parts and use case. The distribution of visual cues among the literature surveyed is listed in Table 2. Exemplary images for all of the visual cues explained in the following can be found in Fig. 3. The labels of the examples in Fig. 3 correspond to the letters found in front of the visual cue names in the following description.

a) Textual
Hints to correct the movement with words or text were categorized as textual.These cues are in most cases combined with other feedback methods (e. g., Oshita et al. [50] and Conner and Poor [11]).

b) Color Coding
Colors can be an intuitive indicator for wrong or right (e.g., red/green). For example, an avatar with color changing limbs or joints (as in Fig. 4 taken from the work of Oka et al. [49]) can be used to give feedback for a desired movement. The color coding can be utilized in many ways, but is especially well suited to be used with a 2D or 3D avatar.
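A red/green coding of this kind can be sketched as a mapping from a joint's deviation to a color. This is a minimal sketch, not the scheme of any surveyed system: the function name `joint_color` and both thresholds are hypothetical, and the linear blend between green and red is one of many possible choices.

```python
def joint_color(error, ok_threshold=0.05, bad_threshold=0.20):
    """Map a joint's deviation from the target to an RGB color.

    Deviations up to ok_threshold are fully green, deviations from
    bad_threshold upward are fully red, values in between are blended.
    The thresholds are assumed values in the same unit as the error.
    """
    span = bad_threshold - ok_threshold
    t = min(max((error - ok_threshold) / span, 0.0), 1.0)  # clamp to [0, 1]
    return (round(255 * t), round(255 * (1 - t)), 0)
```

Applied per limb or joint, such a mapping lets an avatar or skeleton show at a glance which body parts need correction.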

c) Body Outline
An outline of the body, or of certain parts of it, can provide feedback while causing limited or no occlusion of an avatar or video that represents the actual position.Showing a body outline is used in combination with video or avatars in both 3D and 2D.Ikeda et al. [29], for instance, utilize this technique to give feedback for golf strikes.

d) End Position
To show the direction or correction of a movement, it is possible to show the end positions of certain limbs or joints. This advises the users to correct their pose so that their limbs or joints fit these particular positions. Each end position is represented by a spatial coordinate. Oftentimes a volume or area is shown to allow for a certain tolerance. There are several methods implementing this, including 3D, 2D and even projection-based approaches (e.g., Sekhavat et al. [62]).
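The tolerance around an end position can be sketched as a simple containment test. The sketch below assumes a spherical tolerance volume around the target coordinate; the function name and the choice of a sphere are illustrative assumptions, and other shapes (boxes, 2D areas) are equally possible.

```python
import math


def reached_end_position(joint, target, radius):
    """Return True if the joint lies inside a sphere of the given radius around the target."""
    return math.dist(joint, target) <= radius


# Hypothetical coordinates in centimetres.
wrist = (0.0, 0.0, 0.0)
target = (3.0, 4.0, 0.0)   # 5 cm away from the wrist
inside = reached_end_position(wrist, target, radius=5.0)
outside = reached_end_position(wrist, target, radius=4.9)
```

A larger radius makes the cue more forgiving, which may suit early practice, whereas a smaller radius demands a more precise pose.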

e) Transparent Target Avatar
A transparent target avatar can be used in combination with a video or 3D/2D opaque avatar showing the current pose to create a sense for how movements should be executed.The transparency of the target pose or target movement prevents this visual feedback from occluding the actual pose.For example Barioni et al. demonstrate how this can be used to show target movements for ballet practice [3].This can be seen in Fig. 5.

f) Opaque Target Avatar
In 3D, an opaque target avatar depicting the target movement can be superimposed on an avatar showing the actual movement. Where the two objects collide, they create an intersection effect, as seen, e.g., in the work of Ikeda et al. [28]. Another use of an opaque target avatar is a video overlay, as seen for example in [31].

g) Movement Abstraction
The correction cues for some movements may be hard to perceive.This can be the case if the body part to be corrected is out of sight for the user or the correction and movement is minimal.These movements might be easier to comprehend if they are represented by an abstraction of the movement rather than showing the actual movement or start/end position.An example for this is shown by Vidal et al. [73] in their work.

h) Video Overlay
If a video stream is implemented in the system it can be used to show a superimposed target avatar.This way the system can depict what movements to execute next, a sequence of movements or correction cues.Video overlay is used in augmented reality (AR) approaches and benefits from 3D implementation but is not limited to it.One example of this visual cue can be found in the work by Furukawa et al. [16] who use video overlay to teach writing motions.

i) Rubber Bands
To indicate the direction of the target movement, the actual limb positions and the target limb position can be connected with a line. The result resembles so-called rubber bands connecting actual and target body positions. Yu et al. [79] included this along with other visual cues in their approach.

j) Arrows
Arrows are an intuitive technique to indicate a direction. Hence they can be used to show a direction in which to move, or to provide correction cues for poses and movements. Oshita et al. [50], for example, use them to show the direction in which the target pose lies.
It should be noted that arrows can be implemented as a special case of the above mentioned rubber band cues.The only difference would be that an arrow head is added.
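The relation between the two cues can be sketched in code. This is an illustrative sketch with hypothetical function names and data layout: a rubber band is treated as the segment from the actual to the target position, and an arrow as the same segment with a unit direction that a renderer could use to orient the arrow head.

```python
import math


def rubber_band(actual, target):
    """Line segment connecting the actual and the target position."""
    return (tuple(actual), tuple(target))


def arrow(actual, target):
    """Rubber band plus a unit direction for orienting an arrow head."""
    start, end = rubber_band(actual, target)
    length = math.dist(start, end)
    if length == 0:
        return None  # already at the target, no direction to indicate
    direction = tuple((e - s) / length for s, e in zip(start, end))
    return {"start": start, "end": end, "direction": direction}


cue = arrow((0.0, 0.0, 0.0), (0.0, 2.0, 0.0))
```

Here `cue["direction"]` points along the positive y axis, the direction in which the limb would have to move.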

k) Trajectories
Movements can be described by lines that represent the path of a certain joint or bone in space over time.These trajectories can show the user where or along which path to move next, or how to correct the movement executed.Clarke et al. [10] combine trajectories with a video overlay in their approach.
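One way a system could turn a trajectory into a correction cue is to measure how far a joint currently is from the target path. The following sketch is an assumption on our part rather than a method described in the surveyed papers: the path is represented as a polyline of sampled points, and the deviation is the distance to the nearest point on any of its segments.

```python
import math


def closest_point_on_segment(p, a, b):
    """Project point p onto the segment from a to b, clamped to the segment ends."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        t = max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    return tuple(ai + t * c for ai, c in zip(a, ab))


def deviation_from_trajectory(p, path):
    """Smallest distance from point p to a polyline given as a list of points."""
    return min(
        math.dist(p, closest_point_on_segment(p, a, b))
        for a, b in zip(path, path[1:])
    )


# Hypothetical target path of a wrist and a current wrist position.
path = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0)]
deviation = deviation_from_trajectory((1.0, 1.0, 0.0), path)
```

The resulting value could, for instance, drive the color or thickness of the displayed trajectory.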

l) Graphs
The data provided by the motion of a body can be used to create graphs (sometimes also called plots).This classical visualization of numerical data provides a detailed but abstract way to present movement information and correction cues to the user or instructor.As an example, Takahashi et al. [68] visualize the velocity of various joints and a ball to evaluate a baseball bat swing.

m) Limb Angles
One way of defining movements is by observing the angles at the joints between two bones. These limb angles can consequently also be used to provide feedback to the user on how to correct the movement or in what way to move next (see, e.g., Debarba et al. [13]).
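A limb angle of this kind can be computed from three tracked joint positions. The following is a minimal sketch under the assumption that joints are available as coordinate tuples; the function name `joint_angle` is hypothetical and the code does not reproduce any surveyed system. The angle at the middle joint is obtained from the dot product of the two adjacent bone vectors.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint `b`, formed by the bones b->a and b->c."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    norm_u = math.hypot(*u)
    norm_v = math.hypot(*v)
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("adjacent joints must not coincide")
    cosine = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))


# Example: shoulder, elbow and wrist forming a right angle at the elbow.
elbow_angle = joint_angle((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

The difference between the measured and the target angle could then drive an angular indicator at the joint.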

EXEMPLARY DISCUSSION
In the following discussion, we will point out examples to explain certain features that are either widespread among the surveyed literature or occur rarely. In other words, we provide a collection of representative and exceptional examples. This will help the reader understand the composition of the body of literature we are analyzing and will complement the overview provided by Table 2.

Representative Examples
The work of Oka et al. [49] utilizes color coding in its visualization to provide feedback to users through VR glasses. Users are shown color-coded cues indicating if any of their limbs need correction for an ideal execution of the exercise. The real-time feedback that is provided is supplemented by textual cues and training metadata. The visualization of an abstracted skeleton as avatar (see Fig. 4) is often used in the analyzed approaches (see for example Furukawa et al. [16] or Escalona et al. [14]).
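How a color-coded skeleton might be derived from pose data can be sketched as follows. This is an illustrative assumption and not the scheme used by Oka et al. [49]: the traffic-light colors, the thresholds of 10 and 25 degrees, and the names `limb_color` and `color_skeleton` are all hypothetical. Each limb is colored according to the absolute deviation between its actual and its target joint angle.

```python
def limb_color(error_deg, ok_below=10.0, warn_below=25.0):
    """Map an absolute angular error in degrees to a color name.

    The thresholds and the colors are hypothetical example values.
    """
    if error_deg < 0:
        raise ValueError("error must be non-negative")
    if error_deg < ok_below:
        return "green"
    if error_deg < warn_below:
        return "yellow"
    return "red"


def color_skeleton(actual_angles, target_angles):
    """Return one color per limb, based on the deviation from the target angle."""
    return {
        limb: limb_color(abs(actual_angles[limb] - target_angles[limb]))
        for limb in target_angles
    }


# Example: the left knee deviates strongly, the right elbow only slightly.
colors = color_skeleton(
    {"left_knee": 130.0, "right_elbow": 92.0},
    {"left_knee": 90.0, "right_elbow": 90.0},
)
```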
[...] Transparent target avatar, f) Opaque target avatar, g) Abstraction, h) Video overlay, i) Rubber bands, j) Arrows, k) Trajectories, l) Graphs, m) Limb angles. Images from [9], [10], [13], [16], [20], [28], [50], [55], [68], [73], [75], [77], [79].
Fig. 4. Color coded skeleton: The skeleton visualizes where to correct the movement. Image from [49].
Ikeda et al. [28] likewise present a research attempt to visualize motion feedback using VR glasses. Users can watch their body (as avatar) from a third person perspective. This enables them to gather information about the intended movement correction of the whole body. It is possible to get the feedback in real time as well as a playback after the exercise. This work addresses the matching of actual and target movement with dynamic time warping. The matching of time discrepant executions is a challenge many real-time or playback feedback solutions have to face. The exocentric perspective in combination with VR is a setup often found throughout the literature surveyed. Examples can be found in the work of Hoang et al. [24] and Ware et al. [76].
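To illustrate how dynamic time warping can match two executions of different timing, the following sketch gives a textbook formulation of the algorithm. It is not the implementation of Ikeda et al. [28]; the function name `dtw_path`, the Euclidean frame distance and the representation of frames as coordinate tuples are assumptions made for this example. The function returns the accumulated cost and the list of matched frame index pairs (actual, target).

```python
import math


def dtw_path(actual, target, dist=math.dist):
    """Align two frame sequences with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of index pairs
    (i, j) matching frame i of `actual` to frame j of `target`.
    """
    n, m = len(actual), len(target)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # cost[i][j]: minimal accumulated cost of aligning the first i and j frames.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(actual[i - 1], target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path


# Example: the user pauses one frame longer at the start than the expert.
user = [(0.0,), (0.0,), (1.0,), (2.0,)]
expert = [(0.0,), (1.0,), (2.0,)]
total_cost, alignment = dtw_path(user, expert)
```

In this example both of the first two user frames are matched to the first expert frame, so the delayed start does not count as a movement error.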
Barioni et al. [3] show upcoming poses for the users to mimic. The target pose of the user is visualized by a transparent target avatar which intersects an opaque avatar representing the user's actual pose (see Fig. 5). The execution time of the pose is illustrated by a clock, which elapses when the pose is held correctly. The experimental setup uses a room-mounted display as feedback technology, but the system can also be used with VR glasses. The method of showing the poses before the execution is a typical way of providing information about the movement. It is often used in the literature at hand and can, for example, also be found in the work of Cao et al. [8] and Han et al. [20].
Oshita et al. [50] use opaque target avatars to provide movement feedback for tennis shots. Additional arrows show the direction in which the correction is to be made and text expresses how the movement should be executed. Furthermore, limb angles are shown to depict the intended correction of a movement. After the execution, the feedback can be viewed on a large display. The target avatar and avatars showing the actual user pose, together with the option of a playback function, are representative of many approaches in the surveyed literature. Examples of this combination can be found in the work of Hoang et al. [24] and Ikeda et al. [28].

Exceptional Examples
A rehabilitation-oriented approach is presented by Debarba et al. [13]. With an optical see-through HMD they visualize realistic looking bones of the clients. The HMD is worn by the instructors, who get real-time feedback on how far the limb is bent by an angular indicator at the joint. The realistic rendering of bones is unique, since it is often more legible to have abstract movement metaphors. Debarba et al. combine these realistic skeleton bones with a simple visualization of the limb angle. This solution was chosen as an exceptional example due to the special circumstances of a rehabilitation setting in combination with the instructor as feedback receiver.
The work of Shiro et al. [63] represents another innovative attempt at giving feedback for movements. The system generates a picture which is an interpolation between the pose of the user and a recorded movement of an expert. After the movement execution, the user can set how closely the generated image should resemble the expert movement and browse through the timeline of the playback as seen in Fig. 6. This interpolation and image synthesis technique is quite exceptional in the surveyed literature. It utilizes generative adversarial networks (GAN), a machine learning (ML) technique, to synthesize the images and show a target pose for the user to imitate.
Vidal et al. [73] use a projection-based approach to visualize the movement of the trunk. Lights attached to the body show the position of the trunk on the floor or walls of the room. The correct position is represented by a mark in the room. The users need to move to match this mark with the projected crosshair (see Fig. 7). This abstract approach, which works in real time, is unique in the literature we analyzed.
Another approach utilizing projection is presented by Kosmalla et al. [31]. A projector displays an image on a climbing wall visualizing either the end position of the next motion or an instructor performing the upcoming movement. The attempt to project the feedback onto the training equipment is seen a few times throughout the literature. It is the scale and the virtual instructor on the wall that make this approach unique.

SURVEY INSIGHTS
In this section we will point out what we consider interesting findings of the classification seen in Table 2 and present what insights can be drawn from them. We discuss the insights by category.
MR Technologies. The most used MR technology in the literature reviewed was room-mounted displays, with VR glasses in second place. The reason for room-mounted displays and VR glasses being the most used MR technologies could be attributed to the fact that they are simple and inexpensive to implement. Especially room-mounted displays used as "augmented mirror" (see Section 4.1), as often seen in the literature surveyed, can be set up with a small amount of inexpensive equipment like a camera and a normal display. Additionally, an augmented mirror approach provides an intuitive way to receive full body feedback.
Contrary to our expectations, optical see-through headsets made up only a small percentage (15.4%) of the technologies used. Despite the important development the HoloLens represents, not many approaches utilized it to give visual motor feedback to participants in physical therapy and exercise.
Point of View. More than half of the approaches used a third person view in their systems. This can be directly linked to the high number of room-mounted displays, as all of these systems use a third person perspective.
Abstraction Type. A majority of research attempts we surveyed featured a positional form of feedback. It is noticeable that most directional feedback is a hybrid of directional and positional feedback. Pure directional feedback could be hard for humans to comprehend. Only showing the direction is insufficient information in many cases, as it would be difficult for the participant to stop at the target location. Information about the distance is crucial to be able to move correctly and precisely without further compensating motions.
Stages of Learning. Regarding the learning phases, the literature was predominantly categorized into the associative phase. Although the stages are overlapping and therefore a clear assignment is sometimes impossible, the dominance of the approaches that can be classified as associative can still be seen as significant. It could be argued that visual feedback is most appropriate in the associative skill learning phase. When subroutines are put together to form one uniform skill or movement, the motor feedback might be most effective and useful to the user.
When considering the connection between the used visual cues and the learning stages, we came to the conclusion that using graphs as visual cues might be most suitable for learners already familiar with the skill. Such learners would be in the autonomous stage. New learners might be overwhelmed by the detailed information, and it might not contribute to their improvement. Advanced users, already in the autonomous stage, might look for ways to improve their movement beyond their self-reliant execution and can therefore be supported with graphs. Additionally, a guidance feedback type could be helpful for learners in the cognitive phase, since the upcoming movement is demonstrated and they can take in the information about the movement.
Publication Venue. When looking at the publication venues, we can see that a majority (82.1%) of papers were published in a computer science venue. As expected, most of the publications were found in the VR, AR and MR sector (35.9%). Another substantial part (28.2%) was found in the field of HCI. Furthermore, several (15.4%) attempts were found in the fields of medicine, health and sports. It should be mentioned here that research approaches in this sector often feature a different focus, for example the medical state of the user or the impact of the system on performance. This sometimes culminates in the complete absence of statements or depictions regarding the visualizations employed in the publication. This led to an exclusion of such papers in the screening phase as described in Section 3.
Body Parts. Most feedback we observed was meant for the whole body. A few attempts concerned arms and legs and one paper covered feedback for hands. Again, it could be argued that this phenomenon is linked to the many examples of third person augmented mirrors. Augmented mirrors are most suitable for a whole body feedback type, as a mirror scenario is the most common way we experience the view of our whole body.
Use Cases. The observation that a large number of research attempts target individual sports seems trivial. HMDs, for example, are limited to one user. To give clear feedback, only one user can be addressed. If a whole team of participants had to receive feedback, it would be either very time-consuming or a large number of systems would have to be available simultaneously. However, in team sports the feedback for specific movements is traditionally given to the individual.
Visual Cues. The most popular visual cues were end position and opaque target avatar, as seen in Fig. 8. The directional cues like arrows and rubber bands are seldom used throughout the literature. The literature appears to prefer a positional approach.
It becomes apparent that there are no clear outliers regarding the occurrences of visual cues. The literature does not seem to have found one best way to provide visual feedback in MR. This can be attributed to the wide variety of different use cases appearing in the literature surveyed and to the diversity of movements associated with these use cases.
Evaluation Methods. Various approaches to evaluate the methods and technologies can be found in the literature surveyed. Evaluations were mostly conducted from a therapeutic, user experience or visualization standpoint and usually included a user study. The therapeutic evaluations were mainly concerned with the recovery of the user or patient. For example, Booth et al. [5] measured the step length, knee extension and ankle power of participants. Aside from their work [4], [5], Afyouni et al. [1], Marti [36], Sekhavat et al. [62] and Karatsidis et al. [30] provide therapeutic evaluations of their work as well.
A user experience focused evaluation was found in the work of Cao et al. [8], Barioni et al. [3], Han et al. [20], Hoang et al. [24] and Mostajeran et al. [42]. These evaluations usually include a questionnaire to identify the condition or opinion of the participants. For example, Barioni et al. [3] developed a questionnaire involving ten questions regarding the use of their system to obtain an opinion from the participants.
Evaluations from the visualization standpoint often include measures like precision or correction times. We observed this, for example, in the work of Yu et al. [79], who measured completion time and movement error. Aside from that, similar metrics can be found in the approaches of Cao et al. A clear distinction between these perspectives is not always possible. Some works analyze the feedback system from several standpoints, for example Cao et al. [8] and Hoang et al. [24]. Several more evaluations were found among the surveyed literature, but they either had a very small number of participants ([11], [14], [37], [50]) or did not deliver insights relevant for our overview ([9], [10], [20], [49], [52], [73]).

CONCLUSION
In this article, we gave an overview of the most recent literature on mixed reality feedback in the sector of physical exercise, rehabilitation and general motor skill training.
The literature has been classified and an overview of the classification is given in a table for easy reference. We discussed the different feedback types and identified potentials for a better utilization of feedback in the mentioned application areas. We believe this survey closes a gap concerning a literature analysis taking a closer look at visual cues. Furthermore, the survey could stimulate future research regarding visual cues for motor skill training as suggested in Section 7.3.

Findings
We identified several trends: Many of the papers considered use approaches that can be described as virtual mirrors, that is, a whole body view on a display with feedback for certain movements. With respect to the abstraction type, positional feedback dominates the surveyed literature. Even directional feedback is often combined with positional feedback to provide the user with sufficient information. In the examined approaches, feedback appeared to be mainly implemented for the associative skill learning phase. The papers used a variety of visual cues to pursue their goals. The most popular cues visualize the target pose or end position. The classification of literature in learning phases enables a more suitable feedback choice for specific target groups.

Limitations
This survey gives insight into which visual cues are utilized within the literature. We can only provide subtle hints as to the reasons. An adequate analysis of the reasons for visual cue utilization is yet to be conducted. The paper at hand does not deliver in-depth insights from a cognitive, therapeutic or user experience standpoint.
As discussed in Section 3, tool handling is not included in our scope. We solely focus on feedback for the human body. Nevertheless, several insights on visual cues and associated approaches might still be useful for supporting tool-based skill training with augmented reality.
There is no clear indication as to when Fitts and Posner's [15] learning phases apply, so the insights the classification can provide regarding this category are limited.

Open Questions
At this state of research, the connection between feedback and learning stages [15] is still imprecise. It is not always clear which visual cues are connected to which stages and what they invoke in users. It would be profitable to understand this connection better in order to improve motion learning in the sectors of rehabilitation, physical exercise, and private or professional skill training. With greater insight, the visual cues could be adjusted to better suit the target group and goal.
As mentioned in Section 6, there are only a few research approaches with visual corrective feedback for team sports. It could be valuable to look deeper into the individual feedback given for team sports. Further, it might be interesting to study use cases where the interaction between the movements of two people is crucial, for example in dancing.
The insights found in this work could be transferred to tool handling. Since skill training is an important application for augmented reality, it might be interesting to analyze which of the visual cues found could be applied for tool-based skill training in the industrial context. This would build upon the work of Gatullo et al. [18], applying it to the movement itself.
We found and discussed many different visual cues for motor feedback. Yet, their nature is not fully understood. It could be profitable to investigate the visual cues in more depth to allow for a better informed choice. This paper gives a first indication of when to use which type of feedback. Building upon the obtained insights, more detailed recommendations could be developed and researched in the future. To have a greater variety of visual feedback to choose from and intentionally utilize it for a given use case, one could investigate new visual cues especially tailored for mixed reality.

Fig. 2. A person exercising with a ball. Exocentric (left) and egocentric (right) view types with possible target movement (feedback) in red and the actual movement in black.

Fig. 5. An opaque avatar shows the actual pose, while a transparent target avatar depicts the target pose, which is to be taken. Image from [3].

Fig. 6. Video frame synthesized by interpolating between user and expert pose. The upper slide bar represents the degree of expertise used for the interpolation. The lower slide bar controls the current frame of the playback. Image from [63].

Fig. 7. Projection based MR for body parts that are hard to see. Assuming a correct movement execution the projected crosshair matches the marking on the floor. Image from [73].

Fig. 8. Occurrence of different visual cues in the selected literature.

TABLE 1
Fitts and Posner's [15] Stages of Skill Learning as Applied to Motor Learning

TABLE 2