Empirical Study on Influencing Factors of Knowledge Product Remixing in OIC

Remixing of knowledge products has become one of the mainstream innovation models in the online innovation community (OIC). Exploring the factors that influence knowledge product remixing in the OIC is of great significance for better stimulating open innovation. We first propose an analytical model for the influencing factors of knowledge product remixing, then present a deep-learning-based method for identifying false product attributes, and finally summarize the influencing factors of knowledge product remixing after analyzing the knowledge product attributes. The results show that attention degree and user interaction have a positive impact on the remixing of knowledge products, and that there is an inverted U-shaped relationship between knowledge complexity and product remixing. Continuous innovation has no significant positive incentive effect on product remixing.


I. INTRODUCTION
As the achievements of natural science and social science expressed in certain forms, knowledge products are creative outcomes obtained by human beings relying on brainwork, knowledge, intelligence and other elements in the course of natural and social transformation [1]. Along with the development of the Internet and the knowledge-based economy, free and open "peer production", as a new organizational model and an innovation-driving mechanism for knowledge production, is highly recognized [2], [3]. Such a mechanism effectively promotes and stimulates high-quality, efficient innovation so as to achieve preset goals [4]. A good knowledge of the innovation process and its determinants is conducive to optimizing innovative production and ultimately creating better products and services [5], [6]. Through the OIC, producers (community users) can release their knowledge products. Remixing refers to a process of generating innovative products by copying, integrating and recombining existing knowledge products [7], [8]. As one of the main innovation models of knowledge products, remixing has been widely applied by such OICs as Wikipedia, Thingiverse, Scratch, GitHub, and so forth.
The associate editor coordinating the review of this manuscript and approving it for publication was Haishuai Wang.
According to the relationship between remixing objects, remixing can be divided into inheritance and derivation. Inheritance refers to the generation of a new product after transformation by the original creator, and derivation means the generation of a new product after transformation by any other user [9], [10].
Along with the rapid development of various OICs, remixing of knowledge products, as an important innovation model for online collaboration, has received increasing attention [11]. Stanko et al. [12] regard remixing as a process of information exchange and knowledge sharing and diffusion in an open environment, and propose exploring the intrinsic motivation of remixing from the perspectives of dynamic diffusion and static attributes. Hill and Monroy-Hernández [13], [14] pointed out the significant impact of product complexity, creator's reputation and knowledge accumulation on remixing innovation, and dialectically analyzed remixing behaviors from the perspectives of originality and generativity of knowledge contribution. Research on OIC innovation also provides theoretical and practical bases for a better understanding of the remixing of knowledge products. These studies mainly focus on user participation motivation, knowledge sharing, online interaction and other aspects of OIC innovation. Some scholars hold that community users' involvement in product innovation is mainly affected by epistemic motivation [15], [16], reciprocal motivation [17], interest motivation [18], [19], and perceived ease of use (PEOU) [20]. User behavior is closely related to innovation performance, and the continuity and level of knowledge sharing, as well as the willingness to share, have a significant impact on open innovation [21], [22]. Liu et al. [23] think that online interaction can enhance the affinity and trust between community members, and propel the remixing and innovation of knowledge products.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
However, existing studies have the following deficiencies: (1) most of them focus on the knowledge contribution of remixing and its promotion effect on open innovation, but few discuss in depth the influencing factors of knowledge product remixing; (2) they mainly target users, platforms, knowledge and other objects, focusing on the contribution and effect of remixing on innovation performance, but pay little attention to the mechanism of knowledge product innovation.
In fact, different knowledge products of OICs vary significantly in the innovative contribution of remixing. Most of them are seemingly insignificant, and only a few can be serialized through continuous remixing and innovation.
Consequently, this paper analyzes the impact of knowledge product attributes, of interaction among producers (platform users), and of the continuity effect of knowledge products on the innovative contribution of remixing. This paper first proposes an analytical model for the influencing factors of knowledge product remixing and designs a web crawler to obtain the public data (knowledge products and their attributes) of the OIC websites. It then presents a deep-learning-based method for identifying false product attributes and screening out valid data, and finally carries out empirical research based on such data.

II. RESEARCH THEORIES AND ASSUMPTIONS
A. RESEARCH MODEL
In essence, remixing is a process of continuous improvement of existing products and continuous creation of new products by OIC users based on such motivations as interest, knowledge seeking and practicality. Different from traditional innovation models, remixing has the basic characteristics of freedom, openness and circulation, and is a dynamic process of interaction, integration and coordination of various innovation elements. Rogers states: "An innovation is an idea, practice, or object that is perceived as new by an individual or by another 'unit of adoption'. Diffusion of innovations is a social process that communicates perceived information about a new idea; it produces an alteration in the structure and function of a social system, producing social consequences." [12]. From the perspective of knowledge sharing and innovation diffusion, remixing is an iterative process of sharing, diffusing and recombining innovation objects (knowledge products) in the open Internet space. In his diffusion of innovations (DOI) theory, Rogers characterizes the attributes of innovation objects from the perspectives of relative advantage, compatibility, complexity, trialability, observability and thinking variability, comprehensively introducing the factors affecting the identification, cognition, acceptance and spreading of innovative products.
These factors are defined as follows: relative advantage represents the degree of novelty of innovative products relative to existing products; compatibility reflects the degree to which an innovation is consistent with existing values, previous experience, and the needs of potential users; complexity reflects the degree of difficulty in understanding and applying an innovation; trialability reflects the extent to which an innovation can be tested on a limited basis; observability reflects the extent to which innovative products are visible to others; thinking variability highlights the significance of changing thinking models for innovation. A good information diffusion channel and a high attention degree are essential for better diffusion of innovation.
Based on the DOI theory, this paper designs a theoretical model and analyzes the key factors of OICs that affect the remixing of knowledge products from the perspectives of knowledge complexity, attention degree, user interaction and continuous innovation, as shown in Fig. 1. Because OICs impose strict template-based requirements on the display of knowledge products, innovative products are highly homogeneous in observability and trialability, and knowledge complexity thus becomes critical for the understanding and reuse of products. OIC users spread information about innovative products mainly through clicking, browsing, discussing and forwarding, so the attributes of attention degree and user interaction can be used to describe the product diffusion effect. Because innovative products created through "inheritance" are more compatible than original products, continuous innovation may have a potential impact on continuous remixing.

B. RESEARCH HYPOTHESES
1) RELATIONSHIP BETWEEN KNOWLEDGE COMPLEXITY AND REMIXING
Complexity is one of the important characteristics of knowledge. Nelson and Winter [24] hold that knowledge can be divided into simple knowledge and complex knowledge by degree of understandability. Zander and Kogut [25] think that knowledge complexity refers to the diversity of results caused by users' differences in capability during the sharing and transfer of knowledge. OIC users acquire experiential knowledge through spontaneous experiential learning, dialectical observation, action and introspection [26]. Simple knowledge is therefore easier to understand and master, but is inferior to complex knowledge in intrinsic value. Complex knowledge is more valuable [27], but tends to be more tacit, embedded and dependent, which makes it more difficult to understand and reuse.
The complexity of knowledge products of the OIC is mainly manifested in the ways of expression, knowledge correlation and property rights licensing [13]. When deciding which innovative products to remix, OIC users weigh participation experience, potential value and behavioral risk. Simple knowledge products are easier to understand and recognize, but offer low potential value for reinnovation. A moderately complex product may offer higher potential value and a better participation experience, and is thus more attractive for users to reinnovate. High-complexity knowledge products are, to a certain extent, more difficult for users to understand. Furthermore, a high-complexity knowledge product, despite its high knowledge contribution, struggles to attract users' attention and participation because of the many restrictions on property rights licensing and subsequent use, or the deliberate hiding of its design details. The following hypothesis is thus proposed:
Hypothesis 1: There exists an inverted U-shaped relationship between knowledge complexity and remixing activity.

2) RELATIONSHIP BETWEEN ATTENTION DEGREE AND REMIXING
In the information economy era, attention has become a scarce resource with commercial value [28]. According to the competition theory, knowledge products with different degrees of attention vary in knowledge contribution. Knowledge products that attract high attention are more likely to become advantageous products. Users prefer to remix and transform advantageous products mainly for the sake of their own interests. Benefits stimulate users' enthusiasm to spare no effort to develop advantageous products and seek continuous improvement. Considering the importance of reputation in the OIC [23], users give top priority to remixing products that enjoy a higher reputation.
From the perspective of thinking variability, higher attention can enrich the ideas injected into the innovation process [29] and give rise to a resonance effect in the user interaction process, thereby attracting more users to engage in the improvement and innovation of advantageous products. This virtuous circle reflects the importance of remixing innovation for increasing products' knowledge contribution, and significantly improves the overall innovation capacity of the OIC. The following hypothesis is thus proposed:
Hypothesis 2: The degree of attention is positively correlated with the remixing of knowledge products.

3) RELATIONSHIP BETWEEN USER INTERACTION AND REMIXING
Remixing of knowledge products is a result of knowledge sharing by OIC users. Such knowledge sharing is based on such factors as reciprocity, vision sharing, and perception of fun, as well as user interaction [30]. According to the social cognition theory, creativity of companions may stimulate one's enthusiasm to actively engage in creative work [31]. User interaction is a process of individual learning, knowledge diffusion, transfer and innovation, which is critical to stimulate the innovation of knowledge products.
Interaction of OIC users mainly involves how to design better products or solve specific design problems. Innovative thinking or creativity from users' online reviews constitutes a knowledge contribution to the OIC, and improves the OIC's innovation ability and performance level [32]. In the interaction process, some ideas are absorbed and practiced, and innovative products are created through transformation. This is the specific process of remixing of knowledge products. Although some user interaction may fail to find a feasible solution, it may attract more users' attention and help the product publisher enjoy a better reputation. In addition, it can give users a sense of accomplishment and belonging and improve their efficiency, thereby motivating them to create more innovative products. The following hypothesis is thus proposed:
Hypothesis 3: User interaction is positively correlated with the remixing of knowledge products.

4) RELATIONSHIP BETWEEN CONTINUOUS INNOVATION AND REMIXING
Re-remixing of innovative products of the OIC generated through inheritance is a typical behavior of continuous innovation, which makes innovative products more compatible with existing knowledge [33]. According to the DOI theory, creativity that is more in harmony with existing experience and values is better suited to the existing cognitive pattern and thinking paradigm of OIC members [34]. Such creativity facilitates users' recognition and acceptance, making it easy to work out an innovative design scheme for better improvement of existing products.
In the continuous innovation process, knowledge products are continuously improved through repeated processing, iteration and repair of design defects. However, this continually shrinks the space for re-transformation, and reduces the possibility of knowledge product remixing. For example, early versions of open source software, like Linux, are simple in form and imperfect in detail, but are easier to understand, with a large space for improvement and innovation [13].
Continuous innovation may promote the "multigenerational" pedigree culture of knowledge products of the OIC. Almost all innovative products inherit some attributes or functions of previous-generation knowledge products. Remixing of knowledge products mainly aims to optimize knowledge products for specific applications by repairing existing defects [35]. If original knowledge products can be optimized through inheritance and defect repair, they will attract more users' attention and stimulate their interest in continuous innovation; if knowledge products cannot be optimized through inheritance and innovation, their continuous innovation may be suspended. The following hypothesis is thus proposed:
Hypothesis 4: Continuous innovation has little impact on the remixing of knowledge products.

III. IDENTIFICATION OF FALSE PRODUCT ATTRIBUTES
Key factors that affect the remixing of knowledge products are studied here according to the attributes of knowledge products of the OIC. However, each OIC user can perform the operations of browsing, giving a like, collecting, downloading and commenting for any knowledge product in an open internet environment. In order to get greater reputation, the product creator may hire some users to make false comments on his/her product. At the same time, OIC users may deliberately make false comments that are irrelevant or meaningless to the product, which could impair the authenticity of attribute data of knowledge products, thus affecting the accuracy of analysis conclusions.
To solve the above problems, this paper proposes a deep-learning-based method for identifying false product attribute data. Deep learning is used to classify each comment of a platform user as deceptive, irrelevant or authentic. If a user's comment is judged to be authentic, his/her related operations are regarded as valid; otherwise, they are recorded as invalid. After filtering all knowledge product attribute data using this method, this paper analyzes the key factors affecting the remixing of knowledge products through the model proposed in the previous section.
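The filtering rule described above can be illustrated with a minimal sketch. The function name `count_valid_operations`, the label strings and the data layout are hypothetical and not taken from the paper; the sketch also assumes that only users with a classified comment are considered.

```python
from collections import Counter

def count_valid_operations(operations, comment_labels):
    """Count operations per type, keeping only those made by users
    whose comment was classified as authentic.

    operations     : list of (user_id, operation) pairs, e.g. ("u1", "like")
    comment_labels : dict user_id -> "authentic" | "deceptive" | "irrelevant"
    """
    # Users whose comment passed the false-comment check.
    valid_users = {u for u, label in comment_labels.items() if label == "authentic"}
    # Operations of all other users are treated as invalid and dropped.
    return Counter(op for user, op in operations if user in valid_users)

# Illustrative data only.
operations = [("u1", "like"), ("u1", "download"), ("u2", "like"), ("u3", "collect")]
comment_labels = {"u1": "authentic", "u2": "deceptive", "u3": "irrelevant"}
valid_counts = count_valid_operations(operations, comment_labels)
```

In this toy case only the two operations of user `u1` survive, so the like and download counts of the product are each reduced to one.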

A. TECHNICAL PRINCIPLE OF DEEP LEARNING
As a branch of machine learning, deep learning evolved from artificial neural networks [36], [37]. At present, the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) are widely used in computer vision and natural language processing, with many outstanding results achieved.
In the field of natural language processing, the CNN model [38] can be used for text classification. The specific process is as follows. First, the text is converted into word vectors as model input. Then, a number of features are extracted through the convolutional layer, with the convolution kernel size set as k × h, where k denotes the dimensionality of the word vector and h is the number of words covered by one convolution operation. The extracted features are fed into the pooling layer for feature filtering, which also solves the problem of variable-length input text. Finally, the selected feature vectors are fully connected and fed into the Softmax classifier to predict the class probabilities. The structure of the CNN-based text classification model is shown in Fig. 2.
The CNN uses a weight-sharing network structure similar to that of a biological neural network (BNN), which greatly reduces the number of weights and the complexity of the model and makes calculation faster. However, text is contextual data, and the CNN model does not take the contextual relationship between the input words into account, which affects the accuracy of text classification.
The RNN model, which is widely applied to machine translation and other natural language processing tasks, introduces directional recurrence to process sequential data whose elements are interrelated. However, because of gradient vanishing or gradient explosion in the BPTT (Back Propagation Through Time) optimization algorithm, the RNN model cannot capture dependencies between pieces of information separated by a long interval. As an improved method based on the RNN model, the long short-term memory network (LSTM) [39], [40] has a strong learning ability in processing sequential data. LSTM adopts an improved calculation method for the hidden units of the RNN model, and controls the deletion, retention or addition of data in memory units through the designed "forget gate", "input gate" and "output gate". The hidden unit calculation process of LSTM is shown in Fig. 3.
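The convolution and pooling steps described above can be sketched with NumPy. This is an illustrative toy, not the model in [38]: the function name `conv_max_pool`, the ReLU activation, the zero padding of short sentences and the small dimensions are all assumptions. The kernel is stored as an array of shape (h, k), i.e. h consecutive words by k vector dimensions.

```python
import numpy as np

def conv_max_pool(words, kernel, bias=0.0):
    """Slide one kernel over a sentence (stride 1) and max-pool the responses.

    words  : array of shape (n_words, k), one k-dimensional word vector per row
    kernel : array of shape (h, k), covering h consecutive words
    """
    n_words, k = words.shape
    h = kernel.shape[0]
    if n_words < h:
        # Assumption: pad short sentences with zero vectors so one window fits.
        words = np.vstack([words, np.zeros((h - n_words, k))])
        n_words = h
    # One response per window position; ReLU is an assumed activation.
    responses = np.array([
        np.sum(words[i:i + h] * kernel) + bias
        for i in range(n_words - h + 1)
    ])
    responses = np.maximum(responses, 0.0)
    # Max-over-time pooling yields one value regardless of sentence length.
    return responses.max()

rng = np.random.default_rng(0)
k = 4                                                   # toy word-vector dimensionality
kernels = [rng.normal(size=(h, k)) for h in (1, 2, 3)]  # three kernel heights
short_text = rng.normal(size=(5, k))                    # 5-word sentence
long_text = rng.normal(size=(12, k))                    # 12-word sentence
feat_short = np.array([conv_max_pool(short_text, w) for w in kernels])
feat_long = np.array([conv_max_pool(long_text, w) for w in kernels])
```

Both sentences end up as feature vectors of the same length (one entry per kernel), which is how pooling handles variable-length text before the fully connected layer.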
As shown in Fig. 3, x denotes input data, C represents memory unit information, h indicates hidden unit information, f is the forget gate, i is the input gate, and o is the output gate.
At time t, the temporary memory unit information $\tilde{C}_t$ can be expressed as follows:

$$\tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$$

where $W_c$ and $U_c$ denote the weight parameter matrices, $b_c$ is the offset vector, $x_t$ is the input data, $h_{t-1}$ represents the hidden unit information at the previous moment, and tanh is the activation function. The values of the forget gate, input gate and output gate can be calculated as follows:

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$
$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$$

where W and U represent the weight parameter matrices of each control gate, b denotes the offset vector of each control gate, $x_t$ denotes the input data, $h_{t-1}$ denotes the hidden unit information at the previous moment, and $\sigma$ denotes the activation function. The updated current memory unit information is calculated as follows:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

where $\odot$ denotes element-wise multiplication. The output of the current hidden unit can be defined as follows:

$$h_t = o_t \odot \tanh(C_t)$$
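The hidden unit calculation can be sketched in NumPy as a single time step. The sketch assumes the standard LSTM formulation (candidate memory through tanh, three sigmoid gates, element-wise gating); the names `lstm_step` and `init_params`, the dictionary layout and the toy dimensions are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step under the standard formulation (an assumption)."""
    W, U, b = params["W"], params["U"], params["b"]
    # Temporary (candidate) memory unit information.
    c_tilde = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])
    # Forget, input and output gates.
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])
    # Forget part of the old memory, then add the gated candidate.
    c_t = f_t * c_prev + i_t * c_tilde
    # Hidden output is the gated, squashed memory.
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def init_params(input_dim, hidden_dim, seed=0):
    """Small random weights and zero offsets for the four parameter sets."""
    rng = np.random.default_rng(seed)
    gates = ("c", "f", "i", "o")
    return {
        "W": {g: rng.normal(scale=0.1, size=(hidden_dim, input_dim)) for g in gates},
        "U": {g: rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)) for g in gates},
        "b": {g: np.zeros(hidden_dim) for g in gates},
    }

params = init_params(input_dim=3, hidden_dim=2)
h, c = np.zeros(2), np.zeros(2)
for x_t in np.ones((4, 3)):        # a toy sequence of four inputs
    h, c = lstm_step(x_t, h, c, params)
```

Because the memory unit is updated additively through the forget and input gates, information can persist across many steps, which is the property that lets LSTM handle long-interval dependencies.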

B. FALSE COMMENT IDENTIFICATION BASED ON CNN + LSTM
A model of false comment identification based on CNN + LSTM is designed in view of their advantages, with its overall structure as shown in Fig. 4.
The false comment identification model is mainly composed of a word vector input layer, a CNN feature extraction layer, an LSTM encoding layer, a pooling layer and a classification output layer. For the word vector input layer, the open-source pre-trained word vector model GloVe (Global Vectors for Word Representation) [41] is used to convert each word in the comment into a 300-dimensional word vector representation. As shown in Fig. 5, the CNN feature extraction layer uses convolution kernels with k = 300 and five kernel heights (h = 1, 2, 3, 4, 5); the convolution step size is set as 1, and the number of channels for each type of convolution kernel is set as 300.
After the input word vectors pass through the CNN layer, the 300 feature vectors extracted by each convolution kernel are transferred to the pooling layer for max pooling, so as to screen out the feature vector with the largest response value. The selected feature vectors are then fed into the encoding layer consisting of 256 units, which learns the dependency relationship between feature vectors, and subsequently into the average pooling layer. The pooling layer reduces the dimensionality of the feature vectors and solves the variable-length problem of comments. The extracted feature vectors are finally fed into the fully connected layers. The first and second fully connected layers are configured with 128 neurons and 3 neurons, respectively. The Softmax loss function is then used to calculate the loss and train the entire network.
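The Softmax loss function is expressed as follows:

$$L = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{\exp\left(W_{y_i}^{\top} x_i + b_{y_i}\right)}{\sum_{j=1}^{M} \exp\left(W_j^{\top} x_i + b_j\right)}$$

where W represents the weight parameter matrix (with $W_j$ its j-th column), b denotes the offset vector, $x_i$ denotes the feature vector of the i-th comment, and M denotes the number of comment categories. In this standard form of the loss, $y_i$ is the category label of the i-th comment and N is the number of comments in the training batch.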
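A minimal NumPy sketch of this loss is given below, assuming the standard averaged cross-entropy form over N comments and M categories. The function name `softmax_loss` and the array layout (W of shape d × M, one column per category) are assumptions for illustration.

```python
import numpy as np

def softmax_loss(X, y, W, b):
    """Average Softmax (cross-entropy) loss, assuming the standard form.

    X : (N, d) feature vectors, one row per comment
    y : (N,) integer category labels in [0, M)
    W : (d, M) weight matrix, one column per category
    b : (M,) offset vector
    """
    logits = X @ W + b
    # Subtract the row maximum for numerical stability (does not change the result).
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-probability of the true category, averaged over comments.
    return -log_probs[np.arange(len(y)), y].mean()

# With all-zero parameters every category gets probability 1/M,
# so the loss equals log(M).
X = np.ones((4, 5))
y = np.array([0, 1, 2, 0])
uniform_loss = softmax_loss(X, y, np.zeros((5, 3)), np.zeros(3))
```

The three output neurons of the second fully connected layer correspond to M = 3 categories here (deceptive, irrelevant, authentic).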

IV. EXPERIMENTAL VERIFICATION
A. FALSE COMMENTS RECOGNITION MODEL EVALUATION
The information provided by a user who posts a comment indirectly affects other users' choices, and some businesses even ask people to deliberately post large numbers of fake reviews or high-star ratings for indirect benefit. This brings many problems: it reduces the value of the review platform, leads users to make wrong judgments and choices, and seriously affects the remixing and correct evaluation of knowledge products. Network platforms therefore pay more and more attention to online reviews, and it is necessary to filter out these fake comments.
To verify the accuracy of the false comment recognition model proposed in this paper, this section uses TensorFlow to implement the CNN + LSTM-based false comment recognition method, and then evaluates the model on the popular gold standard dataset [42] in the field of false comment detection, which contains a total of 400 real reviews and 400 false reviews. In this section, 300 real and fake comments are used for model training, and the remaining comments are used as the test set. During training, the model is trained using cross-validation; that is, at the beginning of each round of training, 90% of the data is randomly selected as the training set and 10% is used as the validation set. After multiple rounds of iterative training, once the model's accuracy on the validation set is stable, the test set is used to evaluate the model and compare it with existing methods. Comparative experimental results are shown in Fig. 6.
The experimental results show that the proposed deep learning model achieves the best accuracy on the test set. The model combines the advantages of CNN and LSTM: it not only benefits from the shared convolution parameters of the CNN, but also uses the LSTM to learn contextual features of the text, so that the model can better learn the features of the review text and the accuracy of false review recognition increases significantly. Because no additional feature calculation is needed and the number of model parameters is smaller, the proposed method is simpler to use than currently popular methods such as fastText [43] and Deep-Bi-LSTM [44], and can quickly identify fake review content.
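The per-round random split can be sketched as follows. The text does not say whether 300 refers to each class or to the total; this sketch assumes 300 real and 300 fake comments (600 training comments), and the function name `split_round` is hypothetical.

```python
import random

def split_round(samples, val_fraction=0.1, seed=None):
    """Randomly re-split the training pool at the start of one training round."""
    rng = random.Random(seed)
    shuffled = samples[:]          # copy so the original pool stays untouched
    rng.shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    # First n_val shuffled items form the validation set, the rest the training set.
    return shuffled[n_val:], shuffled[:n_val]

# Assumed pool: 300 real + 300 fake comments, represented by placeholder ids.
train_pool = [("real", i) for i in range(300)] + [("fake", i) for i in range(300)]

for round_id in range(3):
    train_set, val_set = split_round(train_pool, seed=round_id)
    # A model would be trained on train_set and checked on val_set here.
```

Drawing a fresh 90/10 split each round means the validation accuracy is not tied to one fixed subset, which is the motivation for stopping only once it stabilizes.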

B. DATA ANALYSIS FOR KNOWLEDGE PRODUCTS
The research data come from the Thingiverse website (http://www.thingiverse.com), the world's largest OIC engaged in product design based on 3D printing models, which now displays more than 700,000 design products. Innovative products released by users are displayed on the Thingiverse website in a unified web page format. Product attribute data include number of likes, number of downloads, number of views, number of collections, number of images, number of design files, number of comments, and comment content. Valuing the contribution of remixing innovation to community innovation, Thingiverse has established a management mechanism for the remixing of knowledge products: the "remix from" label records the origin (inheritance) of knowledge products, and the "remixes" label records improvement and re-innovation (derivation) by other users. A total of 55,310 comments on remixed (derived) knowledge products, i.e., products whose "remixes" attribute is greater than 0, were extracted from the Thingiverse website, and nine attributes parsed from the product description labels were used as observed variables for statistical description and analysis. To ensure the accuracy of the research conclusions, the extracted comments were checked with the CNN + LSTM-based false comment identification model, and three thousand comments were identified as false. TABLE 1 shows the meaning and statistical description of each variable after screening by the model. Based on the analysis model and research hypotheses, "remixes" is set as the dependent variable, and the other eight variables are set as independent variables.

C. FACTOR ANALYSIS
To verify the research hypotheses, factor analysis was used for principal factor analysis and modeling based on the observed variables. Commonly used for comprehensive evaluation, factor analysis groups variables by the magnitude of their correlation. According to the analysis model, four factors may be extracted from the eight independent variables. SPSS 25.0 was used for factor analysis. The results obtained with its dimensionality reduction factor analysis module are shown in TABLE 2. As shown in the table, the KMO statistic is 0.685, greater than the minimum standard, and the p-value of Bartlett's test of sphericity is less than 0.001, indicating that the eight observed variables are suitable for factor analysis. According to the percentage of variance in the sums of squared loadings, the four factors explain more than 93% of the variance of all variables, indicating that they can comprehensively summarize the overall features of the variables. The communality of each observed variable is above 0.9, indicating that the four common factors reflect most of the content of the original observed variables.
In combination with the analysis results, the relations between observed variables and principal factors are explained as follows. Four variables, namely number of views, number of likes, number of downloads, and number of collections, are most associated with Factor 1; they reflect OIC users' attention to a certain product, and can be used for analysis of attention degree. Two variables, namely number of images and number of files, are most correlated with Factor 2. Since the images and files of released products are generated by the website users, the numbers of images and files reflect the products' knowledge complexity. The two variables number of comments and remixing (inheritance) are strongly correlated with Factor 3 and Factor 4, respectively, indicating that they are relatively independent and can be expressed and interpreted separately. Number of comments reflects the sufficiency and activity of discussion about knowledge products by OIC members. The nature of a knowledge product, namely a source innovation or an innovation based on inheritance of an original product, can reflect the impact of continuous innovation on remixing innovation. The specific relations between observed variables and principal factors of knowledge products are obtained through factor analysis, as shown in Fig. 7.
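The two suitability checks reported above were produced with SPSS; the following NumPy sketch shows how the KMO measure and the Bartlett sphericity statistic are commonly computed. It runs on synthetic data, not the Thingiverse data, and the function names `kmo` and `bartlett_statistic` are hypothetical.

```python
import numpy as np

def kmo(data):
    """Kaiser-Meyer-Olkin measure from the correlation and partial correlation matrices."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)          # partial correlations (off-diagonal)
    off = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)

def bartlett_statistic(data):
    """Chi-square statistic and degrees of freedom of Bartlett's test of sphericity."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return chi2, dof

# Synthetic example: four observed variables driven by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
data = latent + 0.5 * rng.normal(size=(500, 4))
kmo_value = kmo(data)
chi2, dof = bartlett_statistic(data)
```

A KMO value closer to 1 and a large chi-square statistic (hence a small p-value) both indicate that the variables share enough common variance for factor analysis; the p-value itself would come from a chi-square distribution with `dof` degrees of freedom.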

D. PRINCIPAL COMPONENT REGRESSION ANALYSIS
On the basis of the above principal factor analysis, the correlation between the principal component factors and the dependent variable was analyzed to verify the research hypotheses. First, the dependent variable, namely the remixing (derivation) vector, was normalized. Then, four independent variables were defined according to the research model, namely X1 (attention degree), X2 (knowledge complexity), X3 (user interaction) and X4 (continuous innovation), together with one dependent variable Y (remixing of knowledge products). The influence relationship was analyzed through a linear regression model. In addition, a quadratic term of X2 was added to the linear regression model in order to verify the inverted U-shaped relationship between knowledge complexity and remixing. The regression analysis results are shown in TABLE 3. As shown in the table, the R-squared value of the entire model is 0.578, indicating a high degree of fitting. The model is significant at p < 0.05, so the overall model fitting is valid. The results of factor impact analysis show that the standardized coefficients of X1 and X3 are positive and significant at p < 0.05, indicating that attention degree and user interaction are indeed positively correlated with the remixing of knowledge products. Knowledge complexity X2 has no significant impact in the model, but its quadratic term is significant at p < 0.05, indicating that there indeed exists an inverted U-shaped relationship between X2 and the remixing of knowledge products. The p-value of the inheritance effect variable X4 is 0.310, indicating that it has no significant impact on the remixing of knowledge products. The research hypotheses are thus supported by the above test results.
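The regression with a quadratic term can be sketched with ordinary least squares in NumPy. The data below are synthetic and the coefficients are invented for illustration; they are not the values in TABLE 3. The function names `fit_with_quadratic` and `turning_point` are hypothetical.

```python
import numpy as np

def fit_with_quadratic(X, y):
    """Least-squares fit of y on [1, X1, X2, X3, X4, X2^2].

    X : (n, 4) array whose columns are X1..X4
    Returns coefficients in the order intercept, X1, X2, X3, X4, X2 squared.
    """
    design = np.column_stack([np.ones(len(X)), X, X[:, 1] ** 2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def turning_point(b_linear, b_quadratic):
    """X2 value where the fitted parabola peaks (if b_quadratic < 0)."""
    return -b_linear / (2.0 * b_quadratic)

# Synthetic data with an inverted U in X2 and no effect of X4.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
y = (0.5 * X[:, 0] + 0.2 * X[:, 1] - 0.3 * X[:, 1] ** 2
     + 0.4 * X[:, 2] + 0.1 * rng.normal(size=2000))

coef = fit_with_quadratic(X, y)
peak = turning_point(coef[2], coef[5])
```

An inverted U-shaped relationship requires the coefficient of the squared term to be negative; the turning point then marks the complexity level at which predicted remixing is highest.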

V. CONCLUSIONS AND PROSPECTS
A. CONCLUSIONS
Influencing factors of knowledge product remixing in the OIC were explored based on the DOI theory. The study results show that attention degree and user interaction frequency have a positive impact on knowledge remixing, and that knowledge complexity has an inverted U-shaped relationship with knowledge remixing. The relationship between compatibility and innovation variability of knowledge products was also demonstrated through the empirical study. According to the empirical results, continuous innovation has no significant positive incentive effect on remixing innovation. Given that knowledge remixing is an important innovation impetus of the OIC, the research results offer practical guidance for better stimulating open innovation. The following suggestions are made based on the research conclusions: OICs and users should focus on advantageous products that attract high attention in order to stimulate innovation; OICs should place more emphasis on refined management of the complexity of user-contributed content; OICs should promote innovation by creating more active, diverse interactive environments and atmospheres.

B. PROSPECTS
The research model and hypotheses were verified in an objective and scientific way. However, the study is still subject to some limitations. For example, only 3D printed products, rather than other knowledge products of the OIC, were sampled. Therefore, the general applicability of the research results needs to be further tested. In addition, the impacts of different remixing modes and other deep, complex factors on the remixing of knowledge products were not considered. Future research will therefore collect multiple samples to explore and empirically analyze the influencing factors of knowledge product remixing in the OIC, including the impacts of different remixing modes and other deep, complex factors.