Data Analysis of Online Shopping Platform during the Epidemic of Coronavirus Disease
Zhao B and Cao J
Published on: 2020-10-31
Abstract
With the development of online platforms, more and more consumers choose the convenience of online shopping. This paper uses a time-based model in Spyder to mine the rating data of the Amazon online shopping platform, establishes a neural network model, analyses the relationships among the rating indexes, carries out descriptive statistical analysis on each index, obtains the correlations between the impact indicators, and carries out a fuzzy evaluation to analyse the impact of each evaluation index on the product. Finally, the relationship between a product's sales situation and its rating is used to provide a reliable product sales design mode for Amazon and to give some sales suggestions, so as to enhance the product's desirability.
Keywords
Data analysis; BP neural network; Fuzzy evaluation; Index system
Introduction
With the improvement in people's material quality of life, high-quality, highly praised items are favoured by more and more consumers and have become the first choice for daily shopping. After purchasing, consumers grade products through star ratings and written comments on the shopping website, which helps others select high-quality products. The managers of shopping websites can use consumer reviews to understand the shortcomings of their products relative to those on other websites of the same type, so as to grasp the advantages and disadvantages of product sales and clarify the direction of product development. For consumers, the evaluation results show the specific ranking and advantages of different products of the same type, which they can match against their own needs to shop better. After purchasing, consumers also take part in the evaluation of the goods and, as the group with the most say in the evaluation, express their choices and wishes, which prompts businesses to recognize their own shortcomings. As a result, consumers get a better shopping experience. These two groups are opposed to each other yet promote each other, forming a virtuous circle.
The Model of Problems
The model of problem 1
Selection principle of indicators: The online shopping platform is a complex system. To obtain product comment information scientifically and reasonably, the selection of indicators is a crucial step. The model can be established only when indicators that accurately measure the product are selected. The following five principles should be followed in the selection process (Figure 1).
Figure 1: Schematic diagram of index screening principle.
Scientific principles: The selection of measurement indicators must be based on scientific principles, and can truly and objectively reflect the impact of each element on the selection of indicators.
Practical principle: The construction of the evaluation system is mainly theoretical analysis, which will be affected by the data sources of various indicators in practical application. Therefore, the availability and reliability of data sources should be guaranteed in the process of selecting indicators.
The principle of system: There should be a logical relationship between the indicators. They are not isolated indexes but a system of product evaluation information: they should reflect the sales and praise of the three products from different aspects and also form a systematic, organic whole.
Principle of comparability: Different types of index data should be comparable, so that the evaluation index system constructed is universal.
The principle of relevance: The evaluation index system for the three products should be a combination of related indicators. Against the background of each product's star ratings and comments, proper analysis and evaluation of the related comments should make it possible both to evaluate the actual sales performance of the product and to judge the sales trend of the three kinds of products.
Based on the above five selection principles, the overall framework of the comprehensive evaluation index system of the market products studied in this paper is shown in the following figure (Figure 2).
Figure 2: Flow chart of index evaluation system.
Selection of evaluation indexes
According to the scientific selection principle, combined with the relevant selection criteria for product evaluation indexes and the comments and sales data of the three products provided by Sunshine Company, the evaluation indexes that best express the sales characteristics of the products are selected as follows (Table 1).
Table 1: Definition of indicators.
Index quantity | Representation method
Credibility | a
Evaluation level | b
Evaluation headline | c
Credibility: Because credibility is related to many factors, it is described here, using the given data and conditions, in terms of three aspects: whether the reviewer is an authenticated Amazon member, whether the purchase is verified, and whether the votes a review received are useful (Figure 3).
Figure 3: Credibility index.
In order to further analyse the influence of credibility and determine the influence weight of credibility, three factors affecting credibility are assigned and defined. The results are as follows:
Taking the numbers of useful and useless votes as the two voting variables, the functional relationship of the three products with respect to credibility under the selected index is obtained as follows:
Here the dependent variable is the credibility score, and the coefficients are undetermined.
In order to select the indexes that have the greatest impact on the product evaluation system, the above four indexes were reduced by principal component analysis. The results are as follows (Table 2).
Table 2: Composition coefficient.
Product | D_value | vine | Verified Purchase
Hair dryer | 0.225 | 0.78 | 0.811
Microwave | 0.515 | 0.554 | 0.416
Pacifier | 0.22 | 0.599 | 0.639
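The principal component reduction mentioned before Table 2 can be illustrated with a small sketch. It is not the authors' computation: the review records are hypothetical, the third factor (a helpful-vote share) is an assumption, and the dominant component is found by plain power iteration rather than by a statistics package.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    cov = [[0.0] * k for _ in range(k)]
    for r in rows:
        for a in range(k):
            for b in range(k):
                cov[a][b] += (r[a] - means[a]) * (r[b] - means[b])
    return [[c / (n - 1) for c in row] for row in cov]

def first_component(cov, iterations=200):
    """Dominant unit eigenvector and its eigenvalue, by power iteration."""
    k = len(cov)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[a] * sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)
    )
    return v, eigenvalue

# Hypothetical reviews: (vine member, verified purchase, helpful-vote share)
reviews = [
    (0, 1, 0.80), (0, 1, 0.65), (1, 1, 0.90), (0, 0, 0.10),
    (0, 1, 0.55), (1, 0, 0.70), (0, 0, 0.20), (0, 1, 0.75),
]
cov = covariance_matrix(reviews)
loadings, variance = first_component(cov)
# Share of total variance carried by the first component
explained = variance / sum(cov[i][i] for i in range(3))
print([round(x, 3) for x in loadings], round(explained, 3))
```

The loadings play the role of the composition coefficients: the larger the absolute value, the more that factor contributes to the combined credibility index.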
Star rating: The company has designed two kinds of star rating for different objects: store feedback and listing (product) reviews. After purchasing goods, customers can rate the store and the product to express their satisfaction with the purchase. The company then averages the star ratings given, so half stars or other fractional values can appear. The sales of the product are judged by the star rating. The evaluation grade runs from 1 to 5 stars, from low to high, and is converted directly into a judgment index. Finally, the sales situation of the product is determined by calculating the average star rating. The indicators are classified as follows (Table 3).
Table 3: Star level classification.
1 | 2 | 3 | 4 | 5
Bad | Worse | Qualified | More satisfied | Perfect
The calculation method of star rating is as follows:
Here the symbols denote, in order, the star rating, the number of stars and the star index.
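A minimal sketch of the averaging step, assuming the star rating is the review-count-weighted mean of the star levels. The review counts are hypothetical, and mapping the rounded mean back to a Table 3 level is an illustrative choice rather than something stated in the paper.

```python
def average_star(star_counts):
    """Mean star rating from a mapping {stars: number of reviews}."""
    total_reviews = sum(star_counts.values())
    if total_reviews == 0:
        raise ValueError("no reviews to average")
    weighted_sum = sum(stars * count for stars, count in star_counts.items())
    return weighted_sum / total_reviews

# Level names as listed in Table 3
LEVELS = {1: "Bad", 2: "Worse", 3: "Qualified", 4: "More satisfied", 5: "Perfect"}

# Hypothetical review counts per star level for one product
counts = {1: 4, 2: 6, 3: 10, 4: 30, 5: 50}
mean_star = average_star(counts)
print(mean_star, LEVELS[round(mean_star)])
```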
Comment title and comments: The comment title often determines whether consumers are willing to continue reading the comment, so the title is important. When evaluating the comment title, we can consider the length of the text, the number of commodity characteristic words, the number of negative emotional words, and so on. To judge the evaluation index formed by the comment title more accurately, we use the text length, the number of commodity characteristic words and the number of negative emotional words, each with its own weight, to describe the comment title.
Let the weight of text length (L) be W_1, the weight of the number of commodity characteristic words (N) be W_2, and the weight of the number of negative emotion words (M) be W_3. Spyder is used to screen all reviews of the three products, and the descriptive statistics of text length, number of commodity characteristic words and number of negative emotion words are as follows (Tables 4-6).
Table 4: Relative weight of blower.
Factor | Mean value | Maximum value | Minimum value | Relative weight
L | 307.25 | 2483 | 8 | 0.35643
N | 0.4493 | 12 | 0 | 0.21843
M | 0.11234 | 4 | 0 | 0.42514
Table 5: Relative weight of microwave oven.
Factor | Mean value | Maximum value | Minimum value | Relative weight
L | 483.587 | 6512 | 13 | 0.29844
N | 0.31641 | 6 | 0 | 0.17378
M | 0.23128 | 5 | 0 | 0.52778
Table 6: Relative weight of pacifier.
Factor | Mean value | Maximum value | Minimum value | Relative weight
L | 278.0283 | 6545 | 0 | 0.18786
N | 0.187233 | 6 | 0 | 0.26542
M | 0.092384 | 7 | 0 | 0.54672
Table 7: Comprehensive index data.
Aggregative indicator | | Indicator | Evaluation
Indicator | Pearson correlation | 1 | 0.81
Evaluation | Pearson correlation | 0.81 | 1
Through statistical analysis of the three product indexes, we obtain:
where a, b and c are the weights of the text length, the number of commodity characteristic words and the number of negative emotion words.
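One way to combine the three title factors into a single score is a weighted sum of scaled values. The sketch below uses the ranges and relative weights listed in Table 4. The min-max scaling and the purely additive form are assumptions, since the combining formula is not given in the text, and `title_score` is a hypothetical helper name.

```python
# Minimum/maximum values and relative weights as listed in Table 4
RANGES = {"L": (8, 2483), "N": (0, 12), "M": (0, 4)}
WEIGHTS = {"L": 0.35643, "N": 0.21843, "M": 0.42514}

def min_max(value, low, high):
    """Scale value to [0, 1]; min-max scaling is an assumption of this sketch."""
    return (value - low) / (high - low)

def title_score(length, feature_words, negative_words):
    """Weighted sum of the three scaled title factors (hypothetical form)."""
    raw = {"L": length, "N": feature_words, "M": negative_words}
    return sum(WEIGHTS[k] * min_max(raw[k], *RANGES[k]) for k in raw)

print(round(title_score(2483, 12, 4), 5))  # every factor at its maximum
print(round(title_score(8, 0, 0), 5))      # every factor at its minimum
print(round(title_score(307, 1, 0), 5))    # a title near the mean length
```

Because the three relative weights sum to 1, the score stays between 0 and 1 for inputs inside the tabulated ranges.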
Model building
According to the three evaluation indexes selected above, a BP neural network model is constructed, and the process of establishing the model is shown in Figure 4.
Figure 4: Building BP neural network model and solving flow chart.
Parameter setting of BP neural network model
Network layer number: The Kolmogorov theorem states that, in theory, a three-layer neural network can fit any continuous nonlinear function. To simplify the model, this paper uses a three-layer neural network model [5].
Set input layer: Four evaluation criteria are selected to describe the product, so the number of input layer neurons is 4.
Number of neurons in the hidden layer: There is no fixed algorithm for calculating the number of neurons in the hidden layer. The number is closely related to the sizes of the input layer and output layer and has to be determined by experience and repeated tests. In this paper, the number of neurons in the hidden layer is set to 4.
Output layer setting: The output result of the shopping evaluation model in this paper has only one comprehensive score about the product, so the output layer setting has only one neuron.
Solution of the BP neural network evaluation model
Step 1: Network initialization. Each connection weight between neurons in adjacent layers is assigned a random number in the interval (-1, 1). The calculation accuracy ε and the maximum number of learning iterations M are given, together with the hidden layer thresholds and the output layer thresholds.
Step 2: Input a sample and the corresponding expected output.
Step 3: Hidden layer output calculation. From the input vector X, the connection weights between the input layer and the hidden layer, and the hidden layer threshold a, calculate the hidden layer output.
Here m is the number of hidden layer nodes, and the hidden layer transfer function is applied at each node.
Step 4: Output layer output calculation. From the hidden layer output, the connection weights between the hidden layer and the output layer, and the threshold b, calculate the actual output O of the BP neural network.
Step 5: Error calculation. From the actual output O and the expected output D of the network, calculate the overall error E of the network.
Step 6: Weight update. According to the overall network error E, update the network connection weights according to the following formula,
in which the step size is given by the learning rate.
Step 7: Training and convergence. When the average error over the training samples is less than ε, training ends; otherwise the above process is repeated and the weights and thresholds are modified again. Through repeated iterations the actual output of the network gradually approaches the desired output, which is the process of the global error of the network tending to its minimum.
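The seven steps can be sketched in plain Python for a 4-4-1 network. Several choices here are assumptions rather than the paper's settings: a sigmoid transfer function in the hidden layer, a linear output neuron, a squared-error loss, per-sample gradient descent, and the values of `rate`, `eps` and `max_epochs`. The training data are synthetic.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_bp(samples, targets, hidden=4, rate=0.1, eps=1e-3,
             max_epochs=5000, seed=0):
    """Train a three-layer BP network with one linear output neuron."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Step 1: weights and thresholds start as random numbers in (-1, 1)
    w = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(n_in)]
    a = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden thresholds
    v = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden-to-output weights
    b = rng.uniform(-1, 1)                           # output threshold

    def forward(x):
        # Step 3: hidden output; Step 4: network output
        h = [sigmoid(sum(w[i][j] * x[i] for i in range(n_in)) - a[j])
             for j in range(hidden)]
        return h, sum(h[j] * v[j] for j in range(hidden)) - b

    mean_error = float("inf")
    for _ in range(max_epochs):
        total = 0.0
        for x, d in zip(samples, targets):           # Step 2
            h, o = forward(x)
            e = d - o                                # Step 5: error
            total += 0.5 * e * e
            for j in range(hidden):                  # Step 6: updates
                grad_h = h[j] * (1.0 - h[j]) * v[j] * e
                for i in range(n_in):
                    w[i][j] += rate * grad_h * x[i]
                a[j] -= rate * grad_h
                v[j] += rate * h[j] * e
            b -= rate * e
        mean_error = total / len(samples)
        if mean_error < eps:                         # Step 7: convergence
            break
    return (lambda x: forward(x)[1]), mean_error

# Synthetic data: four index values in [0, 1] and a made-up target score
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(4)] for _ in range(40)]
D = [0.4 * x[0] + 0.3 * x[1] + 0.2 * x[2] + 0.1 * x[3] for x in X]
predict, err = train_bp(X, D)
print(err, predict([0.5, 0.5, 0.5, 0.5]))
```

The thresholds are subtracted inside each neuron, so their updates carry the opposite sign to the weight updates.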
Conclusion of question 1
Principal component analysis is carried out on whether Amazon members are authenticated, whether product purchases are verified and whether votes are useful, and the credibility evaluation index is obtained. The evaluation grade is taken as the second evaluation index. The text length, the number of commodity characteristic words and the number of negative emotional words are mined using Spyder, and the corresponding values are obtained after descriptive statistical analysis, giving the third evaluation index, the comment title. The evaluation model of the three product evaluation indexes is then established using a BP neural network.
The model of problem 2
In order to find the data measures based on ratings and comments that are most worth tracking by Sunshine Company, we choose star rating, helpful votes, total votes and evaluation score as variables to establish four evaluation indexes. We use fuzzy evaluation theory to discuss these indexes and combine their impacts to determine the final data measure. The flow chart of the fuzzy evaluation model is shown in Figure 5.
Figure 5: Process of fuzzy evaluation model.
Establishment of model
 Set the factor set as the influencing factors of the four evaluation indexes on the comprehensive index. The four factors are, in order, the star rating data, the helpful votes data, the number of comments and the evaluation score.
 Randomly select 10 comments on purchased goods from each table, set the evaluation set to represent these 30 comments, and calculate the corresponding evaluation degree of each comment through the model of question 1.
 Establish the single factor evaluation matrix:
 Single factor weight: set the weight of each evaluation factor as:
 The choice of different models in fuzzy theory:
In fuzzy evaluation, different principles correspond to different models:
Solution one: $M(\wedge, \vee)$ (Main factor determinant type)
Solution two: $M(\cdot, \vee)$ (Main factor prominent type)
Solution three: $M(\wedge, \oplus)$ (Main factor prominent type)
Solution four: $M(\cdot, \oplus)$ (Weighted average model)
For this problem, the weighted average model, solution four, is the most suitable.
Define the weight coefficients. In practice, the comprehensive evaluation is positively correlated with three indicators, namely star rating, helpful votes and evaluation score, and users tend to look at helpful votes and comments. Total votes includes helpful votes as well as some votes that are negatively correlated. The weights are therefore defined on this basis.
Comprehensive evaluation:
Solution: first normalize the sample data, then normalize the whole data, and then substitute the obtained value into the formula to get the comprehensive data value.
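The weighted average composition can be sketched as follows. The weight vector and the membership matrix are hypothetical, since the paper's values are not reproduced in this text, and the min-max composition is included only for comparison with the weighted average model chosen above.

```python
def weighted_average_model(weights, R):
    """Weighted average composition: b_j = sum_i a_i * r_ij, capped at 1."""
    columns = len(R[0])
    return [
        min(1.0, sum(a * row[j] for a, row in zip(weights, R)))
        for j in range(columns)
    ]

def main_factor_model(weights, R):
    """Min-max composition (main factor determinant type), for comparison."""
    columns = len(R[0])
    return [
        max(min(a, row[j]) for a, row in zip(weights, R))
        for j in range(columns)
    ]

# Hypothetical weights for the four factors, and normalized membership
# degrees of three sampled comments (one row per factor)
A = [0.4, 0.3, 0.1, 0.2]
R = [
    [0.9, 0.5, 0.2],  # star rating
    [0.8, 0.4, 0.1],  # helpful votes
    [0.3, 0.6, 0.7],  # number of comments
    [0.7, 0.5, 0.3],  # evaluation score
]
print([round(b, 3) for b in weighted_average_model(A, R)])
print(main_factor_model(A, R))
```

The weighted average form lets every factor contribute in proportion to its weight, whereas the min-max form is driven by a single dominant factor per comment.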
Solution of model
Carry out a correlation analysis between the comprehensive index data of the sample and the evaluation degree of the sample; the results are shown in Table 7.
Conclusion: the analysis shows that the results of the two models are basically consistent, which supports the accuracy of both the neural network model of question 1 and this model. Through this model, we can provide Sunshine Company with data measures based on ratings and comments, and Sunshine Company can analyse the market for its goods according to these measures.
The model of problem b
Model establishment and solution: This model adds a time measure and establishes a time rating model using the evaluation credibility index discussed in question 1. Because the identification of time-based measures and patterns is similar for the three product data sets, only the hair dryer is discussed in detail. The microwave oven and pacifier are analysed more briefly, but their results are still stated clearly.
Establish time rating model for hair dryer: According to the evaluation grade, evaluation title and evaluation equation of question 1:
Evaluation level:
Evaluation title:
From question 1:
Through time series analysis and prediction in SPSS, we obtain the observed star rating trend of the hair dryer under the time measurement (Figure 6) and the corresponding overall trend chart (Figure 7).
According to the star rating trend and the overall trend chart, the star rating is rising. From 2013 to 2014 the rating rose rapidly and also fell rapidly within the same year, but the overall trend was still upward. In other words, the higher the star rating given by consumers, the greater the reputation of the product and the greater its influence on consumers' purchase decisions. To improve the reliability of the analysis, the evaluation score is considered together with the star rating. Correlation analysis shows that the star rating and the evaluation score are related to a certain extent, so their combined trend under the time measurement is observed, the star rating trend under the time measurement is analysed separately, and the two trends are compared to check the reliability of the results. After the software analysis, the change trend and the overall trend are obtained as shown (Figures 8 and 9).
Figure 6: Star change trend based on time measurement.
Figure 7: Overall trend of star change.
Figure 8: Trend of total evaluation scores based on time measurement.
Figure 9: Overall trend of total change.
According to the comprehensive trend chart of consumers' evaluation scores and star ratings for the hair dryer, the combined measure shows a downward trend at the end of 2015 but an upward trend overall. The figure shows that consumers' evaluation of the hair dryer reached its peak in evaluation score and star rating in 2015 and then declined. This is similar to the trend of star rating and evaluation score from 2011 to 2012, so product quality problems and consumers' evaluation psychology cannot be ruled out as causes. In short, from the comprehensive trend of evaluation score and star rating, it can be concluded that the reputation of the product will decline briefly after 2016 and then increase rapidly, while its reputation in the online market was increasing in 2015 (Figure 10).
Figure 10: Forecast chart of online market increase and decrease of hair dryer reputation.
Because the time series analysis method of evaluation and rating of microwave oven, pacifier and blower is similar, there is no detailed explanation when analysing microwave oven and pacifier.
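As a rough stand-in for the SPSS time series step, the direction of a reputation trend can be checked by averaging ratings per year and fitting a least-squares slope. The review records are hypothetical, and this simple slope is not the forecasting procedure used in the paper.

```python
from collections import defaultdict
from datetime import date

def yearly_mean(reviews):
    """Average star rating per calendar year from (date, stars) pairs."""
    buckets = defaultdict(list)
    for day, stars in reviews:
        buckets[day.year].append(stars)
    return {year: sum(v) / len(v) for year, v in sorted(buckets.items())}

def trend_slope(series):
    """Least-squares slope of value against year; positive means rising."""
    years, values = list(series.keys()), list(series.values())
    n = len(years)
    mean_x, mean_y = sum(years) / n, sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx

# Hypothetical (review date, star rating) records for one product
reviews = [
    (date(2012, 3, 1), 3), (date(2012, 9, 12), 4),
    (date(2013, 1, 20), 4), (date(2013, 7, 4), 4),
    (date(2014, 5, 9), 5), (date(2014, 11, 30), 4),
    (date(2015, 2, 14), 5), (date(2015, 8, 8), 5),
]
series = yearly_mean(reviews)
print(series, round(trend_slope(series), 3))
```

The same two functions can be applied to a combined score per review instead of the raw star rating.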
Establish time rating model for microwave oven
From question 1:
The time-based measures and patterns are identified in the microwave oven data set. The analysis of the hair dryer data set showed that the comprehensive index of star rating and evaluation score better reflects the rise and fall of a product's reputation in the online market, so the analysis of microwave ovens uses this comprehensive index directly, and the influence of the star rating alone on the product is not considered separately. After preprocessing the missing values and the time series in the microwave oven data set, we obtain the comprehensive trend chart of evaluation score and star rating based on the time series measurement (Figure 11).
Figure 11: Evaluation synthesis trend based on time series measurement.
According to the comprehensive trend chart of microwave oven evaluation score and star rating based on time series measurement, the reputation of microwave oven is slowly decreasing in the online market at this stage.
Time rating model for pacifier
From question 1:
After preprocessing the missing values and the time series in the pacifier data set, we obtain the comprehensive trend chart of evaluation score and star rating based on the time series measurement (Figure 12).
Figure 12: Comprehensive trend chart of pacifier score and star rating based on time measurement.
According to the comprehensive trend chart of evaluation score and star rating of the pacifier based on time series measurement, the reputation of the pacifier is slowly increasing in the online market at this stage.
Conclusion of question b
Based on the above analysis, it can be concluded that the reputation of hair dryer is increasing in the online market, the reputation of microwave oven is slowly decreasing, and the reputation of pacifier is slowly increasing.
The model of problem c
Analysis of model: To better analyse whether a product is a potential success or a potential failure, we choose the indicators that best reflect real product quality: star rating, evaluation score, number of comments and the proportion of five-star ratings. Each item is regarded as a point in a high-dimensional space, with each evaluation index as one dimension of the point. Using the comprehensive evaluation method of fuzzy theory, commodities with similar attributes are grouped by the fuzzy similarity of their points, and the fuzzy clustering model is constructed.
Establishment of model
Data selection: To avoid volatility in the evaluation indexes caused by too little data, we ensure that each product has more than 20 comments, randomly select 20 products from the three categories as samples, and take the average value of the sample indexes for analysis. The 20 commodities are entered into a two-dimensional matrix with respect to three variables, which is called the observation matrix:
Steps of fuzzy clustering model
Step 1: Establish observation data matrix W for sample commodities;
Step 2: The data matrix of the sample is standardized to unify the data scale. In this paper, the standard deviation transformation is used: the average value of each of the four evaluation indexes over the sample data is subtracted from each value, and the result is divided by the standard deviation of that index.
Step 3: There are many ways to calculate the relative distance between commodity points in the matrix space, such as Mahalanobis distance, absolute distance and Euclidean distance. This paper adopts the simplest and most practical one, the Euclidean distance:
Thus
Step 4: The average distance method and the shortest distance method can be used to calculate the distance between space classes.
Step 5: Through the sample matrix data, the above steps of fuzzy clustering model are carried out, and MATLAB is used to solve the problem, then the specific results can be obtained.
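The clustering steps can be sketched in plain Python instead of MATLAB. The observation matrix is hypothetical, the shortest-distance rule is used for the distance between classes, and stopping at a fixed number of clusters is an assumption made to keep the example short.

```python
import math

def standardize(matrix):
    """Column-wise (x - mean) / standard deviation (Step 2)."""
    n, k = len(matrix), len(matrix[0])
    means = [sum(r[j] for r in matrix) / n for j in range(k)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in matrix) / (n - 1))
        for j in range(k)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(k)] for r in matrix]

def euclidean(p, q):
    """Euclidean distance between two points (Step 3)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def single_linkage(points, n_clusters):
    """Merge the two closest classes until n_clusters remain (Step 4)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Shortest distance between any two members of the classes
                d = min(
                    euclidean(points[i], points[j])
                    for i in clusters[a] for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return [sorted(c) for c in clusters]

# Hypothetical observation matrix (Step 1), one row per product:
# star rating, evaluation score, number of comments, five-star share
W = [
    [4.6, 0.82, 540, 0.71],
    [4.5, 0.80, 500, 0.69],
    [4.4, 0.78, 520, 0.66],
    [2.1, 0.35, 40, 0.10],
    [2.3, 0.30, 35, 0.12],
]
Z = standardize(W)
print(single_linkage(Z, 2))
```

Standardizing first keeps the number of comments, which is on a much larger scale, from dominating the distances.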
Solution of model
Because the indicators we selected are positively related to the sales volume of the product, we select the largest data point [5400,1] in the sample data as the success point W_s, and the result calculated through the fuzzy clustering model is shown in the figure below (Figure 13).
Figure 13: Comprehensive evaluation diagram.
Take the Euclidean distances d = 0.6 and d = 1.2 as the dividing lines for potentially successful products and failed products, respectively. When d < 0.6, the product is a potential success; when d > 1.2, the product is a potential failure.
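The distance rule can be written as a small function, assuming the two dividing lines 0.6 and 1.2 act as the success and failure cut-offs. The coordinates are hypothetical standardized values, and the "undetermined" label for the band between the two lines is an assumption of this sketch.

```python
import math

def classify(point, success_point, success_cut=0.6, failure_cut=1.2):
    """Label a product by its Euclidean distance d to the success point."""
    d = math.dist(point, success_point)
    if d < success_cut:
        return d, "potential success"
    if d > failure_cut:
        return d, "potential failure"
    return d, "undetermined"  # band between the two lines (assumed label)

# Hypothetical standardized coordinates
success = (1.0, 1.0)
for product in [(0.8, 0.7), (0.2, 0.6), (-0.5, -0.2)]:
    distance, label = classify(product, success)
    print(product, round(distance, 3), label)
```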
The model of problem d and e
Index correlation analysis: Among the various aspects of product sales, star ratings and reviews are the customers' evaluation of product performance after purchase, and they can objectively reflect product quality. The basic principle of Amazon's review star rating is to add up the positive and negative ratings and then take a weighted average according to the A9 algorithm to arrive at the star figure for the shop, which gives it a certain scientific basis and accuracy. In order to explore whether star rating, text rating and product rating influence one another, the text data selected and mined in question 1 are analysed statistically to see whether each star rating is correlated with the review text and the rating. The text types are therefore divided as follows:
 The five star levels are divided intuitively into five weight standards according to the division standard in question 1;
 Through word frequency statistics on the text comments, value weights are assigned to the characteristic words;
 The number of valid votes is extracted as the score index of the product and its weight standard is obtained;
 Pearson correlation analysis is carried out on the divided indexes, and the correlation results among the three index ratings of each product are obtained as follows (Tables 8-10).
It can be seen from Tables 8-10 that the significance between the stars, reviews and scores of the three products is p = 0.00 < 0.05, indicating that there is a linear relationship; the correlations between the three evaluation indexes of each product are greater than 0.7, indicating that the indexes are highly correlated and closely related.
Table 8: Blower correlation test sheet.
Hair dryer | | star | comment | grade
star | Pearson correlation, significance | 1, 0 | 0.815, 0 | 0.744, 0
comment | Pearson correlation, significance | 0.815, 0 | 1, 0 | 0.828, 0
grade | Pearson correlation, significance | 0.744, 0 | 0.828, 0 | 1, 0
Table 9: Microwave oven correlation test form.
Microwave | | star | comment | grade
star | Pearson correlation, significance | 1, 0 | 0.817, 0 | 0.711, 0
comment | Pearson correlation, significance | 0.817, 0 | 1, 0 | 0.836, 0
grade | Pearson correlation, significance | 0.711, 0 | 0.836, 0 | 1, 0
Table 10: Pacifier correlation test form.
Pacifier | | star | comment | grade
star | Pearson correlation, significance | 1, 0 | 0.865, 0 | 0.781, 0
comment | Pearson correlation, significance | 0.865, 0 | 1, 0 | 0.823, 0
grade | Pearson correlation, significance | 0.781, 0 | 0.823, 0 | 1, 0
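For readers who want to reproduce entries of this kind, the Pearson coefficient and the t statistic used to test it against zero can be computed directly. The sample values are hypothetical, and converting the t statistic into a p-value (as SPSS reports it) is left out because it needs the t distribution.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r against zero with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical star ratings and comment scores for five reviews
stars = [1, 2, 3, 4, 5]
comment_scores = [0.2, 0.3, 0.5, 0.7, 0.8]
r = pearson(stars, comment_scores)
print(round(r, 4), round(t_statistic(r, len(stars)), 3))
```

A coefficient above 0.7 with a large t value corresponds to the strong, significant linear relationship described for Tables 8-10.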
Answer to problem d and e
In order to determine the specific linear relationship between stars, reviews and ratings, the relationship between the three indicators of each product was examined further and their impact was predicted (Figures 14-16).
Figure 14: Linear trend chart of blower indexes.
It can be seen from the figure that there is an obvious linear relationship among star rating, review content and rating score. As the star rating of a commodity increases, the comments tend to contain more praise, and the content and score also rise gradually; for low-star products, the comments contain more negative content and the score is low. Post-purchase evaluation is an interconnected system: after buying goods, customers give a star rating, a review and a product grade, and the three interact. The size of the star index affects customers' comments on the product and the subsequent rating level. The higher the star index, the longer the comment and the more positive emotional words, such as good, happy and better, customers write to express their liking for the product. It also raises the score given to the product, since customers give higher ratings based on their use of the product and the positive effect of the comments.
Figure 15: Linear trend chart of each index of pacifier.
Figure 16: Linear trend chart of microwave oven indexes.
Conclusion
This study analyses Amazon product sales strategy mainly on the basis of a neural network model. By screening and analysing each index with a BP neural network, fuzzy evaluation theory and a time rating model, the market popularity and prospects of the three products are judged. A higher star index leads to more favourable comments and higher scores, and customers write more enthusiastic words such as good, well and excellent. This reflects the sales volume of the products, provides the company with an online sales strategy and supports the time-based model. The text data help the company to work towards manufacturing successful products, and can also be used to evaluate the quality of shopping platforms and to diagnose and address platform-related problems.
Conflict of Interest
We have no conflict of interests to disclose and the manuscript has been read and approved by all named authors.
Acknowledgement
This work was supported by the Philosophical and Social Sciences Research Project of Hubei Education Department (19Y049), and the Starting Research Foundation for the Ph.D. of Hubei University of Technology (BSQD2019054), Hubei Province, China.
References
[1] Kun L. Analysis and research on emotional tendency of commodity review based on expression skills. China University Mining Technol. 2018.
[2] Xu X, Liu W, Gursoy D. The impacts of service failure and recovery efforts on airline customers' emotions and satisfaction. J Travel Res. 2018; 58: 1034-1051.
[3] Ajzen I, Fishbein M. Attitude-behavior relations: A theoretical analysis and review of empirical research. Psychological Bulletin. 1977; 84: 888-918.
[4] Chen Y. Research on the relationship between online reviews and consumers' purchase intention. Jiangsu University Sci Technol. 2014.
[5] Shouren H. Introduction to neural network. National University Defense Sci Technol Press. 1993: 113-120.
[6] Minna N, Zhihong S, Zi W, Shujia L, Jing H. Application of BP neural network model for product modeling perceptual image evaluation. J Donghua University. 2016; 42: 604-607.
[7] Ming C. Research on R and D capability evaluation of enterprises based on neural network. Ocean University China. 2015.
[8] Wanli Z. Research on credit risk assessment model of commercial banks based on artificial neural network. Changsha University Technol. 2013.