I. Introduction
The first-person shooter (FPS), designed to mimic real-world warfare and combat situations, is a popular game genre. To defeat enemies in an FPS game, a player under attack must decide whether to strike back or retreat by considering the enemies’ positions and firearms. However, as the distance between the player and the enemies increases, human vision may fail to capture where the enemies are and what firearms they carry. Enemies that camouflage themselves are likewise difficult to detect by sight. In these cases, gunshots can serve as clues for estimating the enemies’ state. For example, an expert FPS gamer can recognize tiny differences in the stereophonic sound from her headphones and roughly guess the enemies’ positions and firearms. The player can build a strategy on auditory information because a game engine can reproduce how the characteristics of sound vary with distance and direction. Inspired by this realism, we hypothesize that a prediction model that localizes enemies and identifies firearms from in-game gunshots can also be applied to real-world gunshots. Specifically, a support system that spots enemies and determines the type of firearm from gunshots would assist not only beginners of the game but also soldiers and police officers who track criminals in the real world.