A Proactive Mobile Edge Cache Policy Based on the Prediction by Partial Matching

ARTICLE INFO
Article history: Received: 28 August, 2020; Accepted: 15 October, 2020; Online: 22 October, 2020

ABSTRACT
Proactive caching has emerged as an approach to cost-effectively boost network capacity and reduce access latency. Its performance, however, relies heavily on content prediction. In this paper, a distributed proactive cache policy is therefore proposed that considers the prediction of both the content popularity and the user location to minimise the latency and maximise the cache hit rate. A backpropagation neural network is applied to predict the content popularity, and prediction by partial matching is chosen to predict the user location. The simulation results reveal that, compared with two conventional reactive policies, LFU and LRU, the proposed cache policy improves the cache hit ratio by around 27%-60% and reduces the average latency by around 14%-60%.

Due to the limited storage capacity of the cache devices, only a part of the contents can be stored in the edge cache devices. Hence, many works focus on how to design an efficient cache content placement policy. The most common approaches are least frequently used (LFU) and least recently used (LRU), which are referred to as reactive cache policies because they decide whether to cache a specific content only after it has been requested [11] [12]. In detail, LRU keeps the most recently requested contents, while LFU keeps the most frequently requested contents [13]. However, reactive cache policies are not efficient during peak hours. Hence, the proactive caching strategy is introduced, by which the content is cached before it is requested, so that users can access their preferred content immediately when they arrive in new areas [14] [15].
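To make the difference between the two reactive baselines concrete, the following is a minimal Python sketch of LRU and LFU eviction. The class and function names (`LRUCache`, `LFUCache`, `hit_rate`) are illustrative assumptions and are not taken from [11]-[13]; the sketch also assumes that LFU counts every request it has seen, including requests for contents that are not currently cached.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Reactive cache that evicts the least recently requested content."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # key order doubles as recency order

    def request(self, content):
        """Return True on a hit; on a miss, cache the content and return False."""
        if content in self.store:
            self.store.move_to_end(content)  # mark as most recently used
            return True
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # drop the least recently used
        self.store[content] = True
        return False


class LFUCache:
    """Reactive cache that evicts the cached content with the fewest requests."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = set()
        self.counts = Counter()  # request counts of all contents seen so far

    def request(self, content):
        """Return True on a hit; on a miss, cache the content and return False."""
        self.counts[content] += 1
        if content in self.store:
            return True
        if len(self.store) >= self.capacity:
            victim = min(self.store, key=lambda c: self.counts[c])
            self.store.remove(victim)
        self.store.add(content)
        return False


def hit_rate(cache, requests):
    """Fraction of requests that were served from the cache."""
    hits = sum(cache.request(r) for r in requests)
    return hits / len(requests)


# Hypothetical request trace, used only for illustration.
trace = ["a", "b", "a", "c", "a", "b"]
lru_rate = hit_rate(LRUCache(capacity=2), trace)
lfu_rate = hit_rate(LFUCache(capacity=2), trace)
```

Both policies only react to requests that have already arrived, which is the limitation that proactive caching aims to remove.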
Many proactive schemes have been investigated. In [16], a threshold-based proactive cache scheme based on reinforcement learning is presented, aiming at minimising the average energy cost. In this case, the time variation of the content is considered, which means that the content popularity changes over time rather than being static. In practice, only the content whose lifetime has not expired can be cached. In [17], a caching scheme is presented to improve the cache hit rate and reduce energy consumption by predicting the content popularity distribution. In [18], a proactive cache based on the estimation of the content popularity is presented, which targets a higher cache hit rate and a lower content transmission expenditure. Because deep learning can improve the accuracy of content prediction, many works utilise it for proactive caching. In [19], deep learning is utilised to predict the future request probability of the content, and the content with a high predicted probability is cached. In [20], a proactive cache policy is proposed based on a deep recurrent neural network model that predicts future content requests.
Besides, in [21], a proactive cache policy for the vehicular network is proposed, where the roadside units (RSUs) are equipped with cache capability under high mobility of the moving vehicles. There, a long short-term memory (LSTM) network is utilised to predict the direction of the moving vehicles. The proactive cache problem is then modelled as a Markov decision process (MDP) and solved by a heuristic n-greedy algorithm. In [22], Gao et al. design a proactive cache scheme for the hierarchical network in which each small base station (SBS) can perceive the user mobility of its adjacent SBSs, aiming at maximising the cache hit rate and minimising the transmission latency. Specifically, the users with different moving speeds are clustered into different layers, and the cached content deployment problem is solved by a genetic algorithm. In [23], a cooperative cache framework is introduced to increase the cache hit rate and minimise the access latency, in which prediction by partial matching (PPM) is utilised to predict the vehicles' probability of arriving at the hot areas. The vehicles with a long sojourn time in a hot spot are equipped with cache capability and are regarded as cache nodes. A summary of the aforementioned works is shown in Table 1.

Table 1: Summary of the related works
[16] A reinforcement learning-based proactive cache policy is proposed to minimise energy consumption. Here, content popularity is a time-varying variable, and only the contents whose lifetime has not expired are considered for caching.
[17] An accurate content popularity prediction is adopted to improve the cache hit rate and reduce energy consumption.
[18] A proactive cache policy is proposed to increase the cache hit rate and decrease the content transmission cost. Here, transfer learning is applied to evaluate the content popularity, and a greedy algorithm is adopted to deal with the cache problem.
[19] A deep learning algorithm is applied to predict the future probability of the content, and the content with a high predicted probability is pre-cached.
[20] A proactive cache policy is proposed to alleviate data congestion and reduce the average latency, in which a deep recurrent neural network algorithm is adopted to predict future content requests.
[21] A long short-term memory (LSTM) network is utilised to predict the direction of the moving vehicles, and the proactive cache problem is modelled as an MDP and solved by a heuristic n-greedy algorithm.
[22] A two-layer cache network consisting of several MBSs and SBSs is proposed to improve the cache hit ratio and reduce the average latency. Here, the adjacent SBSs can communicate with each other. Besides, the users with different moving speeds are clustered into different layers, i.e., the MBS layer or the SBS layer.
[23] A cooperative cache framework is proposed to increase the cache hit ratio and reduce access latency. Here, a PPM algorithm is adopted to predict the vehicles' probability of arriving in the hot areas.
Different from the aforementioned works, which consider the prediction of either the content popularity or the users' location alone, this extended paper designs a proactive cache policy that jointly considers the prediction of the user preference and the user location to minimise the average latency and maximise the cache hit rate, which to the best of our knowledge has not been considered in prior research. In detail, a practical scenario is considered, in which the BSs are distributed and the users are mobile. A backpropagation (BP) neural network, one of the deep learning methods, is applied to predict the user preference based on the historical content requests. Furthermore, the user's future location is predicted via PPM, which has been introduced in our previous work [1], and the user's preferred content is pre-cached at the location at which the user is most likely to arrive. The main contributions of this paper are as follows:
- This paper focuses on minimising the average latency and maximising the cache hit rate by jointly considering the content popularity prediction and the user location prediction.
- The BP neural network is applied to predict the content popularity, and PPM is chosen to predict the user location.
- The effect of several parameters on the cache performance is investigated, i.e., the Zipf parameter, the content size, the transmission rate, the distance of the backhaul link, and the distance between the user and the BS.
The remainder of this paper is organised as follows. The system model and the problem formulation are presented in Section 2. Section 3 introduces the proactive cache policy. We show the simulation results in Section 4 and conclude in Section 5.

System model and problem formulation
In this section, we describe the system model, state the assumptions, and formulate the problem.

System model
In each time slot t, which lasts one hour, the proposed proactive cache policy adopts the PPM algorithm to obtain the probability of the user arriving at each location. The location with the highest probability is regarded as the future location. In parallel, the user preference is predicted by a trained BP neural network. Once the predicted user preference and the future location are obtained, the popular contents in the user preference are pre-cached at the future location. Consequently, when the user arrives at this location in the next time slot (t+1), the user can immediately obtain the requested content. However, if the prediction is not accurate, the BS needs to retrieve the requested content from the core network and then send it to the user, which incurs additional latency.
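As shown in Figure 1, the distributed cache architecture consists of the following network equipment (NE): a core network, $\mathcal{M}$ cache-enabled BSs, and $N$ mobile users. The $m$-th BS is denoted by $BS_m$ for $1 \le m \le \mathcal{M}$, the $n$-th user is denoted by $U_n$ for $1 \le n \le N$, the circular coverage area of $BS_m$ is denoted as $A_m$, and the set of the users served by $BS_m$ is denoted as $\mathcal{U}_m$. The content popularity at $BS_m$ is modelled by the Zipf distribution:

$$P_m(\Re) = \frac{\Re^{-\alpha}}{\sum_{i=1}^{F_m} i^{-\alpha}}$$

where $\Re$ is the rank of the content in $\mathcal{F}_m$, $\alpha$ is the Zipf parameter for $0 < \alpha < 1$, and $F_m$ is the total number of contents in $\mathcal{F}_m$.
Let $\mathcal{F}_m$ represent the set of the contents requested at $BS_m$, $P_m$ represent the content popularity at $BS_m$, and $\mathcal{F} = \{1, 2, \ldots, \mathcal{H}\}$ represent the library of all the contents requested by the mobile users served by the $\mathcal{M}$ BSs. Assume that each BS can store at most $C$ contents for $C < \mathcal{H}$, and that each content has the same size $\mathcal{B}$. Besides, each user requests at most one content in each time slot t.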
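As an illustration of the Zipf popularity model, the sketch below computes the popularity of each content rank and draws a request sequence from it. It is a minimal sketch under stated assumptions: the function names, the library size of 100 contents, the random seed, and the value `alpha=0.8` are illustrative choices that do not come from the paper; only the count of 3000 requests per user is taken from the simulation setup.

```python
import random


def zipf_popularity(num_contents, alpha):
    """Return [P(1), ..., P(F)], where P(rank) is proportional to rank**(-alpha)."""
    weights = [rank ** (-alpha) for rank in range(1, num_contents + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_requests(popularity, num_requests, seed=0):
    """Draw 1-based content ranks according to the popularity distribution."""
    rng = random.Random(seed)  # fixed seed so the trace is reproducible
    ranks = list(range(1, len(popularity) + 1))
    return rng.choices(ranks, weights=popularity, k=num_requests)


# Illustrative values: 100 contents and alpha = 0.8 are assumptions.
popularity = zipf_popularity(num_contents=100, alpha=0.8)
requests = sample_requests(popularity, num_requests=3000)
```

A larger Zipf parameter concentrates more of the probability mass on the top-ranked contents, which is why the parameter matters for how much a small cache can capture.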

Problem formulation
As mentioned above, our target is to minimise the access latency $T$, which comprises the transmission latency and the propagation latency [25]. The transmission latency is caused by transmitting the content from $EN_i$ to $EN_j$ [26], where $EN_i$ and $EN_j$ are any two pieces of network equipment. According to [27], the transmission rate $\mathbb{R}(i,j)$ is calculated as:

$$\mathbb{R}(i,j) = B \log_2\left(1 + \frac{p\, g_{i,j}}{\sigma^2}\right)$$

where $B$ (Hz) is the available spectrum bandwidth, $p$ is the transmitted power, $\sigma^2$ is the noise power and $g_{i,j}$ is the channel gain between $EN_i$ and $EN_j$.
Therefore, the transmission latency $T^{tr}(i,j)$, based on the size of the requested content and the transmission rate, is derived as:

$$T^{tr}(i,j) = \frac{\mathcal{B}_f}{\mathbb{R}(i,j)}$$

where $\mathcal{B}_f$ is the size of the requested content $f$.
The propagation latency $T^{pr}(i,j)$ is defined as the time needed to propagate the requested content from $EN_i$ to $EN_j$. It depends on the propagation speed of the electromagnetic wave and on the distance between $EN_i$ and $EN_j$, and is expressed as:

$$T^{pr}(i,j) = \frac{d(i,j)}{v}$$

where $v$ is the propagation speed of the electromagnetic wave in the corresponding channel and $d(i,j)$ is the distance between $EN_i$ and $EN_j$.
Therefore, the access latency is expressed as:

$$T = T^{tr}(i,j) + T^{pr}(i,j)$$

In detail, the content can be directly retrieved from the BS if it is hit at the BS, i.e., the content is cached at the BS. Hence, the latency $T_h$ of a cached content is:

$$T_h = \frac{\mathcal{B}_f}{\mathbb{R}(u,b)} + \frac{d(u,b)}{v_{ch}}$$

where $\mathbb{R}(u,b)$ is the transmission rate between a user and a BS, $d(u,b)$ is the distance between a user and a BS, and $v_{ch}$ is the propagation speed of the electromagnetic wave in the air.
Otherwise, if the content is missed at the BS, i.e., the content is not cached at the BS, it needs to be retrieved from the core network via the backhaul link. Following [28], the transmission rate $\mathbb{R}(c,b)$ from the core network to the BS is expressed in terms of $R^*$, the maximal transmission rate of the network.
Therefore, the latency $T_m$ of a missed content, which consists of the transmission latency and the propagation latency over both the backhaul link and the link between the user and the BS, is expressed as:

$$T_m = \frac{\mathcal{B}_f}{\mathbb{R}(u,b)} + \frac{\mathcal{B}_f}{\mathbb{R}(c,b)} + \frac{d(u,b)}{v_{ch}} + \frac{d(b,c)}{v_{bh}}$$

where $d(u,b)$ is the distance between the user and the BS, $d(b,c)$ is the distance between the BS and the core network, and $v_{bh}$ is the propagation speed of the electromagnetic wave in the backhaul link.
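Therefore, the average system latency is calculated as:

$$T_{avg} = h\, T_h + (1 - h)\, T_m$$

where $T_h$ and $T_m$ are the latencies of a cached and of a missed content, respectively, and $h$ is the cache hit rate, which is calculated as follows:

$$h = \frac{\sum_{n} \sum_{f \in Q_n} q_f\, I(f)}{\sum_{n} \sum_{f \in Q_n} q_f}$$

where $Q_n$ is the content requests of $U_n$ and $q_f$ is the number of request times of $f$. The indicator $I(f)$ is calculated as

$$I(f) = \begin{cases} 1, & \text{if } f \text{ is cached at the BS} \\ 0, & \text{otherwise} \end{cases}$$

The problem of minimising the average system latency is modelled as follows:

$$P_1: \min\; T_{avg} \tag{15}$$
$$\text{s.t.}\quad 0 < v_{bh} < v_{ch} \le 3 \times 10^{8}\ \text{m/s} \tag{16}$$
$$0 \le h \le 1 \tag{17}$$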
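The latency model can be checked numerically with a short Python sketch. It treats each latency as a transmission term (content size over rate) plus a propagation term (distance over speed) and weights the hit and miss latencies by the cache hit rate. The function names are illustrative, and several values are assumptions made only for this example: a backhaul rate of 100 Mbps, a backhaul propagation speed of 2e8 m/s, a hit rate of 0.6, and a miss latency that sums both hops. The content size, the rate between user and BS, and the two distances follow the simulation settings reported later.

```python
def hit_latency(size_bits, rate_user_bs, dist_user_bs, v_air=3e8):
    """Latency in seconds when the content is served from the BS cache."""
    return size_bits / rate_user_bs + dist_user_bs / v_air


def miss_latency(size_bits, rate_user_bs, dist_user_bs,
                 rate_backhaul, dist_backhaul, v_air=3e8, v_backhaul=2e8):
    """Latency in seconds when the BS first fetches the content from the core."""
    backhaul = size_bits / rate_backhaul + dist_backhaul / v_backhaul
    return backhaul + hit_latency(size_bits, rate_user_bs, dist_user_bs, v_air)


def average_latency(hit_rate, t_hit, t_miss):
    """Mean latency, weighting hit and miss latency by the cache hit rate."""
    return hit_rate * t_hit + (1.0 - hit_rate) * t_miss


# 400 Kb content, 50 Mbps user link, 10 km to the BS, 100 km backhaul.
# The 100 Mbps backhaul rate and the 0.6 hit rate are assumed values.
t_hit = hit_latency(400e3, 50e6, 10e3)
t_miss = miss_latency(400e3, 50e6, 10e3, rate_backhaul=100e6, dist_backhaul=100e3)
t_avg = average_latency(0.6, t_hit, t_miss)
```

Because the miss latency always exceeds the hit latency in this model, the average latency decreases monotonically as the hit rate grows, which is the lever the cache policy acts on.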

The proactive cache based on the content popularity prediction and future location prediction
In this section, a proactive cache policy is proposed to address P_1. First, the user preference is predicted by a backpropagation (BP) neural network. Then, the future location is predicted by the prediction by partial matching (PPM) algorithm. The proposed cache policy minimises the average system latency by pre-caching the predicted popular content at the predicted location.

The content popularity prediction based on backpropagation neural network
User preference is the content probability distribution of an individual user, and content popularity is the content probability distribution of a cluster of users. Because a small number of contents accounts for most of the data traffic, the cache policy only considers caching the popular contents, which reduces the computational complexity. Hence, the set of the popular contents of $U_n$ is denoted as $\mathbb{P}_n = \{f_1, f_2, \ldots, f_k\}$, which contains the top $k$ contents with the highest probability in the user preference. Therefore, the set of the popular contents at $BS_m$ is denoted as

$$\mathbb{P}_{BS_m} = \bigcup_{U_n \in \mathcal{U}_m} \mathbb{P}_n$$

where $\mathcal{U}_m$ is the set of the users served by $BS_m$.
After obtaining the popular content database of $BS_m$, the BP neural network, as shown in Figure 2, is applied to predict the content popularity. The proposed neural network comprises three layers, namely the input layer, the hidden layer, and the output layer. The number of neurons in the input layer and in the output layer is equal to the cache storage $C$. The content requests at $BS_m$ are collected each hour and form one training data set. Two consecutive training data sets are used to optimise the parameters of the neural network: the input $x$ is the request times of the top $C$ popular contents in the former training data set, and the target $y$ is the request times of the top $C$ popular contents in the latter training data set. Furthermore, the mean squared error (MSE) is utilised as the loss function in the content prediction. The MSE is formulated as

$$MSE = \frac{1}{C} \sum_{i=1}^{C} (y_i - \hat{y}_i)^2$$

where $\hat{y}_i$ is the value of the $i$-th neuron of the output layer.
Besides, the ReLU function is chosen as the activation function, which is expressed as

$$f(x) = \max(0, x)$$

With the help of stochastic gradient descent (SGD), the proposed neural network learns to predict the content popularity after sufficient training.
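The training procedure can be sketched without any deep learning framework. The following pure-Python example is a minimal sketch of a three-layer network with a ReLU hidden layer, an MSE loss, and plain SGD updates; it is not the authors' PyTorch implementation. The class name `TinyBPNetwork`, the hidden-layer width, the learning rate, the weight initialisation, and the normalised request counts are all assumptions made for illustration.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


class TinyBPNetwork:
    """Input -> hidden (ReLU) -> linear output, trained by SGD on the MSE."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.w2, self.b2)]
        return hidden, output

    def train_step(self, x, target, lr=0.05):
        """Apply one SGD update and return the MSE measured before the update."""
        hidden, output = self.forward(x)
        n_out = len(output)
        # Gradient of the MSE with respect to each output neuron.
        grad_out = [2.0 * (o - t) / n_out for o, t in zip(output, target)]
        # Backpropagate through the output weights; ReLU passes the gradient
        # only where the hidden neuron was active.
        grad_hidden = [
            sum(grad_out[k] * self.w2[k][j] for k in range(n_out))
            if hidden[j] > 0.0 else 0.0
            for j in range(len(hidden))
        ]
        for k in range(n_out):
            for j in range(len(hidden)):
                self.w2[k][j] -= lr * grad_out[k] * hidden[j]
            self.b2[k] -= lr * grad_out[k]
        for j in range(len(hidden)):
            for i in range(len(x)):
                self.w1[j][i] -= lr * grad_hidden[j] * x[i]
            self.b1[j] -= lr * grad_hidden[j]
        return sum((t - o) ** 2 for o, t in zip(output, target)) / n_out


# Hypothetical normalised request counts of the top-4 contents in two
# consecutive hours: the former hour is the input, the latter the target.
former = [0.9, 0.6, 0.3, 0.1]
latter = [0.8, 0.7, 0.2, 0.1]
net = TinyBPNetwork(n_in=4, n_hidden=8, n_out=4)
losses = [net.train_step(former, latter) for _ in range(500)]
```

In practice the input and output widths would equal the cache storage, and many pairs of consecutive hourly data sets would be used rather than the single pair shown here.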

The future location prediction based on prediction by partial matching
Before the location prediction, the historical location information is collected from a real environment model as shown in Figure 3. The areas labelled by red symbols are regarded as the hot spots with long sojourn time. The historical location information sequence, which is related to the hot spots, is denoted as ℒ.
Figure 3: The user movement model.
After obtaining the historical location information ℒ, PPM is applied to predict the user's future location. PPM is a data compression method based on finite contexts, and it has been proven effective for location prediction [23]. The probability of the future location y appearing after the given context Con is modelled as P(y|Con), where Con is a sequence of locations and the length of the sequence is called the order [29]. Furthermore, PPM uses an escape mechanism to deal with the zero-frequency problem [30]: when y does not appear after Con, PPM outputs an escape probability Pesc(esc|Con). The computation of PPM is shown in Algorithm 1. First, PPM checks whether y appears after Con. If it does, PPM records the number of appearances and outputs the probability P(y|Con); otherwise, PPM outputs the escape probability Pesc(esc|Con) and checks again whether y appears after the new Con, whose order is the original order minus 1. The process terminates when y appears after Con or the order reaches -1. The predictive probability of the future location is the product of the sub-probabilities, and the calculation is shown as:

$$P = \prod_{i} P_i, \qquad P(y|Con) = \frac{c(y)}{n + d}, \qquad P_{esc}(esc|Con) = \frac{d}{n + d}$$

where $P_i$ is the probability of step $i$, $c(y)$ represents the number of times y appears after Con, $d$ represents the number of distinct characters appearing after Con, and $n$ represents the total number of times all the characters appear after Con.
Once the probabilities of the possible locations are obtained via PPM, they are ranked in descending order, and the location with the highest probability is regarded as the future location.
Here is an example to illustrate the PPM computation, given a user path {L1, L3} and the future location L4 in the historical data sequence ℒ = {L1, L2, L3, L4, L5, L1, L3, L1, L4, L1, L2, L4, L3, L4, L1}. First, since the sequence {L1, L3, L4} cannot be found in the historical data sequence, the escape probability P(esc | L1, L3) is output based on Pesc(esc|Con) in Eq.
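The escape-and-retry computation can be traced on this historical sequence with a short Python sketch. It is a minimal sketch that assumes the escape estimate of PPM method C, in which the escape probability is the number of distinct followers divided by the sum of the total and distinct follower counts, and it falls back to a uniform distribution over the known locations at order -1. The function names are illustrative and do not reproduce Algorithm 1 line by line.

```python
from collections import Counter


def followers(history, context):
    """Count the symbols that directly follow `context` in `history`."""
    k = len(context)
    counts = Counter()
    for i in range(len(history) - k):
        if tuple(history[i:i + k]) == tuple(context):
            counts[history[i + k]] += 1
    return counts


def ppm_probability(history, context, target):
    """Probability of `target` after `context`, using PPMC-style escapes."""
    context = tuple(context)
    probability = 1.0
    while True:
        counts = followers(history, context)
        total = sum(counts.values())  # occurrences of all followers
        distinct = len(counts)        # number of different followers
        if counts[target] > 0:
            return probability * counts[target] / (total + distinct)
        if total > 0:
            probability *= distinct / (total + distinct)  # escape step
        if not context:
            # Order -1: uniform distribution over the known locations.
            return probability / len(set(history))
        context = context[1:]  # lower the order by dropping the oldest symbol


def predict_next_location(history, context):
    """Return the candidate location with the highest PPM probability."""
    candidates = sorted(set(history))
    return max(candidates, key=lambda loc: ppm_probability(history, context, loc))


history = ["L1", "L2", "L3", "L4", "L5", "L1", "L3", "L1",
           "L4", "L1", "L2", "L4", "L3", "L4", "L1"]
p_l4 = ppm_probability(history, ("L1", "L3"), "L4")
best = predict_next_location(history, ("L1", "L3"))
```

Under these assumptions, the context {L1, L3} is followed only by L1, so the first step escapes with probability 1/2. The shorter context {L3} is followed by L4 twice and by L1 once, which gives 2/(3+2), so `p_l4` is 1/2 × 2/5 = 0.2. Other escape estimators would yield different numbers.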

The pre-deployment of the popular content at the future location
In each time slot t, the future locations at which the users are most likely to arrive in the next time slot t+1 are predicted via PPM. In parallel, the user preference at t+1 is predicted via the BP neural network. The top w contents with the highest predicted number of requests are regarded as the popular contents in the future. After that, these popular contents are pre-deployed at the corresponding future location. Hence, in the next time slot t+1, if the prediction is correct, users can immediately obtain their preferred contents, which substantially reduces the average system latency.
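The pre-deployment step combines the two predictions. The sketch below is a minimal illustration in Python: it picks the most probable location and the top w contents by predicted request count. The function name `plan_precache`, the dictionaries, and all numbers are hypothetical placeholders for the outputs of the PPM and BP predictors.

```python
def plan_precache(location_probs, predicted_requests, w):
    """Pick the most likely next location and the top-w contents to place there."""
    future_location = max(location_probs, key=location_probs.get)
    ranked = sorted(predicted_requests, key=predicted_requests.get, reverse=True)
    return future_location, ranked[:w]


# Hypothetical predictor outputs for one user at time slot t.
location_probs = {"L1": 0.5, "L4": 0.2, "L2": 0.02}
predicted_requests = {"f1": 120, "f2": 45, "f3": 80, "f4": 10}

location, contents = plan_precache(location_probs, predicted_requests, w=2)
```

With these placeholder values the plan is to pre-cache `f1` and `f3` at `L1`. A full policy would merge such plans over all users of a BS and respect its storage limit.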

Simulation results and analysis
In this section, we consider a distributed BS caching network consisting of 10 BSs, 30 users, and 6 locations. Each user issues 3000 content requests. The simulation compares our proposed policy with LFU and LRU in terms of the average latency and the cache hit rate. The specific parameter settings are shown in Table 2. The simulation is implemented with PyTorch in PyCharm. To quantify the improvement of our proposed policy in the cache hit rate and its reduction in the average latency compared with the LFU and LRU policies, we define the growth ratio $G$ and the reduction ratio $R$, which are expressed as:

$$G = \frac{h_p - h_c}{h_c}, \qquad R = \frac{T_c - T_p}{T_c}$$

where $h_p$ and $h_c$ are the cache hit rates of our proposed policy and of either the LFU or the LRU policy, respectively, and $T_p$ and $T_c$ are the corresponding average latencies.
Figure 4 shows the cache hit rate (in percent) of our proactive policy and of the conventional reactive policies, i.e., LFU and LRU. The total number of content requests is 6000, and the Zipf parameter of each user varies between 1.7 and 1.8. To demonstrate the effect of the cache capacity on the cache performance, we introduce the cache capacity ratio $\delta = C / \mathcal{H}$, i.e., the number of contents $C$ that a BS can store relative to the library size $\mathcal{H}$. In this simulation, we assume $\delta$ = 2%, 4%, 6%, 8% and 10%. The cache hit rates of LFU, LRU, and our proposed policy all increase with the cache capacity ratio, because more popular contents can be cached when the cache capacity grows. We also notice that our proactive policy has the highest cache hit rate, which is around 10-25% higher than that of the LFU and LRU policies, no matter how the Zipf parameter varies. Therefore, our proposed policy outperforms the other two policies.
Figure 5 investigates the effect of the Zipf parameter on the cache hit rate of our proposed policy and of the two reactive policies. We assume that the cache capacity ratio is 10% and that the Zipf parameter of each user varies in the range [1.7, 1.8]. As the Zipf parameter grows, the cache hit rates of all the cache policies increase. The reason is that fewer contents account for more content requests as the Zipf parameter grows, and hence the popular content becomes more popular. Given the fixed total number of content requests, the number of distinct contents requested decreases. With the same capacity, the cache can therefore hold a larger share of the requested contents, and the cached contents are more popular, which contributes to a higher cache hit rate. Furthermore, the slopes of the three curves gradually decrease. The reason is that, with a larger Zipf parameter, the newly cached popular contents receive fewer content requests than the initially cached contents. We also notice that the two reactive policies have similar cache hit rates, and that the cache hit rate of our proposed policy is around 24%-38% higher than that of the two reactive policies.
The relation between the average latency and the size of the content is displayed in Figure 6. Here, the size of the content is 30Kb, 200Kb, 200Kb, 250Kb, 300Kb, 350Kb, and 450Kb, respectively. Besides, the cache capacity ratio is set to 10%, the Zipf parameter fluctuates between 1.7 and 1.8, the distance between the user and the BS is 10km, and the distance of the backhaul link is 100km. As the size of the content grows, the average latencies of all the policies increase, because the transmitter needs more time to send a larger content into the channel. The average latency obtained by our proposed policy is around 60% lower than that of LFU and LRU regardless of the size of the content, which implies that our proposed policy outperforms the two reactive policies.
Figure 7 shows the relationship between the average latency and the transmission rate between the user and the BS. Here, the content size is 400Kb, the storage capacity ratio is 10%, the distance between the user and the BS is 10km, and the distance of the backhaul link is 100km. The transmission rate between the user and the BS is 10Mbps, 20Mbps, 30Mbps, 40Mbps, and 50Mbps, respectively. As the transmission rate between the user and the BS grows, the average latencies of all the policies decrease, because a larger transmission rate reduces the latency between the user and the BS. Also, the average latency of our proposed policy is 31%-64% lower than that of the other two policies.
As shown in Figure 8, the average latency is plotted as a function of the Zipf parameter. Here, the Zipf parameter of each user varies, the transmission rate between the user and the BS is 50Mbps, the content size is 400Kb, the storage capacity ratio is 10%, the distance between the user and the BS is 10km, and the distance of the backhaul link is 100km. As the Zipf parameter increases, the average latencies of the three policies decrease. The reason is that, with a larger Zipf parameter, more of the requested contents are cached locally, and hence fewer contents need to be retrieved from the remote core network, while the latency from the BS is lower than that from the core network. Also, as the Zipf parameter grows, the slopes of the three curves gradually decrease, because the newly cached contents are less popular than the initially cached contents. Furthermore, the average latency of our proposed policy is around 14%-53% lower than that of the two reactive policies.
The effect of the cache capacity ratio on the average latency is shown in Figure 9. In this simulation, we assume δ = 2%, 4%, 6%, 8% and 10%. Besides, the content size is 400Kb, the transmission rate is 50Mbps, the distance between the user and the BS is 10km, and the distance of the backhaul link is 100km. The average latencies of the three policies decrease as the cache capacity ratio increases. A larger cache capacity means that more contents can be cached, so more of the long-distance propagation time from the core network to the BS can be avoided. Also, the average latency of our proposed policy is around 35%-55% lower than that of LFU and LRU.
Figure 9: The average latency vs. cache capacity ratio.

Conclusion
In this paper, a distributed proactive cache policy is proposed to minimise the average latency and maximise the cache hit rate. The cache performance of the policy relies on accurate prediction. Specifically, a BP neural network is applied to predict the content popularity, and a PPM algorithm is applied to predict the user location. The simulation results (Fig. 4 and Fig. 5) reveal that, compared with the LFU and LRU policies, our proposed cache policy improves the cache hit rate by around 10%-38% no matter how the cache capacity and the Zipf parameter vary. As for the average latency, our proposed policy achieves a decrease of at least 14% no matter how the parameters change, i.e., the content size (Fig. 6), the transmission rate between the user and the BS (Fig. 7), the Zipf parameter (Fig. 8), and the cache capacity (Fig. 9). Consequently, our proposed policy outperforms the LFU and LRU policies.