Channel Charting for Pilot Reuse in mMTC with Spatially Correlated MIMO Channels

Massive multiple-input multiple-output (MIMO) systems and pilot reuse are essential ingredients for internet of things (IoT) networks and massive machine-type communications (mMTC). The large number of devices and the limited amount of resources (time, frequency, and power) preclude the allocation of orthogonal pilot sequences to all users. We propose a pilot reuse strategy based on channel charting to mitigate pilot contamination in massive MIMO systems with spatially correlated channels. In particular, we use channel charting as a means to extract angular domain information from channel covariance matrices and assign orthogonal pilots to users with overlapping angle of arrival intervals. The simulation results show that the proposed pilot reuse method significantly improves the channel estimation performance and the symbol error rate as compared to existing schemes.


I. INTRODUCTION
The evolving 5G and 6G networks will be the cornerstones enabling the future internet of things (IoT) connectivity. Because vast numbers of different devices will be connected to the internet, technological solutions for massive machine-type communications (mMTC) are required [1]. Massive multiple-input multiple-output (MIMO) technology is one of the main enablers in 5G systems and beyond for capacity enhancements, including the connectivity of vast numbers of things. It relies on the spatial multiplexing provided by the large number of antennas to improve the spectral and/or energy efficiency [2]. To exploit the massive MIMO features, channel state information (CSI) is required at the base station (BS). The CSI is typically acquired by uplink training: the user equipments (UEs) send known symbols, or pilot sequences, and the BS estimates the channel from the received signal. To minimize the signaling overhead and avoid the need for explicit downlink channel estimation, a time-division duplexing (TDD) protocol is usually preferred over a frequency-division duplexing (FDD) one. One can thereby exploit the reciprocity between the uplink and downlink channels within a coherence block [3].
In mMTC, it is impractical to allocate orthogonal pilot sequences to all the users, as doing so would consume excessive time and frequency resources due to the large number of devices. One solution is to reuse the same pilot sequence across different UEs, which can, however, lead to interference between users sharing the same pilot. This problem is known as pilot contamination and is typically assumed to exist between the neighboring cells of a cellular network [4]. As presented in [5], pilot contamination can be alleviated by exploiting the spatial correlation of channels. In [6]-[8], the location information of the UEs is used to perform pilot assignment that reduces pilot contamination.
Li et al. [9] developed a location-aware pilot reuse algorithm for a massive MIMO multi-cell scenario. They proposed first to group the users within the cells to alleviate intra-cell pilot contamination. Then, to mitigate inter-cell interference, they match the groups for neighboring cells such that groups sharing the same set of pilots do not reside in the same direction. The main practical limitation of such location-aware pilot assignment is the requirement of knowledge about UEs' positions. This was overcome in [10], where You et al. showed the feasibility of pilot reuse for massive MIMO cells with correlated channels and proposed a pilot reuse strategy based on the channel covariance without knowing the UEs' positions. Also, they derived a robust receiver, which takes into account the degradation in channel estimation accuracy caused by the pilot reuse. Their results show significant improvement in the spectral efficiency as compared to the case where all users have orthogonal pilots.
An alternative way for employing pilot reuse without knowing the exact UEs' positions could be provided by a recent framework, channel charting (CC), proposed in [11]. CC estimates the positions of devices in an unsupervised manner and maps the features obtained from CSI into a low-dimensional chart, in which the relative positions of UEs are preserved. The advantage of CC is that, after an initial training phase, the positions can be retrieved and updated from CSI obtained when estimating the channel, avoiding further overheads.
In this paper, we propose a CC-based pilot reuse algorithm for massive MIMO networks where the number of users is larger than the number of available orthogonal pilot sequences. The main idea is to perform CC using angular domain features such that users with similar angle of arrival (AoA) intervals lie near each other in the CC map. This angular information is then exploited to perform greedy pilot assignment. Our main contribution is the use of CC as a means to assign the pilots. The main advantage lies in the fact that most existing pilot reuse schemes require knowledge of the UEs' positions, an assumption that can be impractical. Unlike [10], which uses channel covariance matrices as input to a greedy pilot allocation algorithm, we perform CC on them, resulting in improved efficiency for our proposed pilot allocation algorithm.

II. SYSTEM MODEL
We consider an uplink communication scenario with $K$ single-antenna UEs communicating with a BS that is equipped with an $M$-element uniform linear array (ULA). We further assume block flat-fading channels, where the channels are static within a coherence block but can vary from one block to another. Following the channel model presented in [7], [12], the uplink channel vector for user $k$, $\mathbf{h}_k \in \mathbb{C}^{M}$, is modeled as a superposition of $L$ paths between the $k$th user and the BS, as given in (1). In (1), $\alpha_{k,l}$ is the complex gain of the $l$th path, which is modeled as an independent and identically distributed (i.i.d.) complex Gaussian random variable with zero mean and $\mathbb{E}\{|\alpha_{k,l}|^2\} = 1$. The large-scale fading coefficient for user $k$, $\beta_k \in \mathbb{C}$, is defined in (2), where $b_k(d_k) \in \mathbb{R}$ is the attenuation due to the path loss, $\lambda$ is the wavelength, and $d_k$ is the distance between user $k$ and the BS. A steering vector for the receiver ULA is of the form

$$\mathbf{e}_r(\theta_{k,l}) = \left[1, e^{-j 2\pi \Delta_r \cos(\theta_{k,l})}, \ldots, e^{-j 2\pi (M-1) \Delta_r \cos(\theta_{k,l})}\right]^T \tag{3}$$

where $\theta_{k,l}$ denotes the AoA of the $l$th path for user $k$ at the ULA, and $\Delta_r$ is the normalized spacing between the antenna elements [13]. The AoA $\theta_{k,l}$ is modeled as a random variable with mean $\bar{\theta}_k$ and standard deviation $\sigma_\theta$, where $\bar{\theta}_k$ is the incident angle between the ULA and user $k$, and the angular standard deviation $\sigma_\theta$ specifies the AoA interval $\mathcal{A}_k$ around the mean value $\bar{\theta}_k$. The AoA interval is defined as the set of possible angles of the incoming multipath components arriving from user $k$.
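As an illustration of the channel model, the following minimal sketch builds the steering vector of (3) and draws one multipath channel realization. Only the steering vector follows the text directly. The Gaussian AoA distribution, the $1/\sqrt{L}$ normalization, the unit large-scale fading, and the function names `steering_vector` and `multipath_channel` are assumptions made for this example and may differ from (1) and (2).

```python
import numpy as np

def steering_vector(theta, M, delta_r=0.5):
    """ULA steering vector of (3): exp(-j*2*pi*m*delta_r*cos(theta)), m = 0..M-1."""
    m = np.arange(M)
    return np.exp(-1j * 2 * np.pi * m * delta_r * np.cos(theta))

def multipath_channel(theta_mean, sigma_theta, M, L, rng, delta_r=0.5):
    """Superposition of L paths with i.i.d. CN(0, 1) gains.

    Assumptions (not taken from the paper): Gaussian AoAs around theta_mean,
    a 1/sqrt(L) normalization, and unit large-scale fading.
    """
    thetas = rng.normal(theta_mean, sigma_theta, size=L)
    alphas = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
    E = np.stack([steering_vector(t, M, delta_r) for t in thetas], axis=1)  # M x L
    return E @ alphas / np.sqrt(L)

rng = np.random.default_rng(0)
h = multipath_channel(np.deg2rad(60), np.deg2rad(10), M=128, L=200, rng=rng)
```

Every entry of the steering vector has unit modulus, so the path gains and the large-scale fading alone set the channel power in this sketch.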
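We adopt the widely accepted assumption that the channel is wide-sense stationary [5], [7], [11]. We further assume that the channel covariance matrices are known at the BS. From (1), the channel covariance matrix for user $k$ is denoted $\mathbf{R}_k \in \mathbb{C}^{M \times M}$. The received signal at the BS at time instant $t$, $\mathbf{y}_t \in \mathbb{C}^{M}$, is given by

$$\mathbf{y}_t = \mathbf{H}\mathbf{x}_t + \mathbf{n}_t$$

where $\mathbf{H} = [\mathbf{h}_1, \ldots, \mathbf{h}_K] \in \mathbb{C}^{M \times K}$ is the channel matrix, $\mathbf{x}_t = [x_{1,t}, \ldots, x_{K,t}]^T$ collects the symbols transmitted by the $K$ users, and $\mathbf{n}_t \in \mathbb{C}^{M}$ is the noise vector at the $M$ receiver antennas. We model the noise as i.i.d. complex Gaussian, $\mathbf{n}_t \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I}_M)$, where $\sigma_n^2$ is the noise power. We assume that all transmitted symbols have the same power, i.e., $\mathbb{E}\{|x_{k,t}|^2\} = \sigma_x^2$, where $\sigma_x^2$ is the transmit signal power for user $k$ at time instant $t$. Thus, the signal-to-noise ratio (SNR) is defined as $\rho = \sigma_x^2 / \sigma_n^2$.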
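The received-signal model and the SNR definition can be made concrete with a short simulation. This is a minimal sketch, assuming unit-power symbols ($\sigma_x^2 = 1$) so that the noise power follows directly from $\rho$. The function name `received_signal` is hypothetical.

```python
import numpy as np

def received_signal(H, x, snr_db, rng):
    """Return y = H x + n for unit-power symbols (sigma_x^2 = 1 is assumed)."""
    M = H.shape[0]
    sigma_n2 = 10 ** (-snr_db / 10)  # from rho = sigma_x^2 / sigma_n^2 with sigma_x^2 = 1
    n = np.sqrt(sigma_n2 / 2) * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    return H @ x + n

rng = np.random.default_rng(0)
M, K = 128, 4
H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
x = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(4, size=K)))  # unit-power QPSK
y = received_signal(H, x, snr_db=10, rng=rng)
```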

III. UPLINK CHANNEL ESTIMATION AND DATA TRANSMISSION
The linear minimum mean square error (MMSE) receiver optimally trades off between interference and noise to achieve maximum signal-to-interference-plus-noise ratio (SINR) [13]. In this work, we deploy an MMSE-based receiver developed in [10], which takes into account the channel estimation error.
Let $\mathcal{K} = \{1, \ldots, K\}$ represent the set of UEs and $\mathcal{T} = \{1, \ldots, \tau\}$ the set of indices of available pilot sequences. The pilot sequence assigned to user $k \in \mathcal{K}$ is denoted as $\mathbf{x}_k = \boldsymbol{\phi}_{\pi_k}$, where $\pi_k \in \mathcal{T}$. UEs $j$ with $\pi_j = \pi_k$ share the same pilot sequence as UE $k$. The received pilot signal for channel estimation, $\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_\tau] \in \mathbb{C}^{M \times \tau}$, can be written as

$$\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{N}$$

where $\mathbf{N} = [\mathbf{n}_1, \ldots, \mathbf{n}_\tau] \in \mathbb{C}^{M \times \tau}$ is the noise matrix and $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_K]^T \in \mathbb{C}^{K \times \tau}$ is the pilot signal matrix. The MMSE estimate of the channel between user $k$ and the BS, $\mathbf{h}_k$ in (1), is derived in [10]. It is computed from $\mathbf{y}^d_k$, the decorrelated received signal for the pilot sequence assigned to user $k$, and from $\mathbf{Q}_k \in \mathbb{C}^{M \times M}$, the covariance matrix of the received signal. The expression in (7) shows the degradation of channel estimation accuracy caused by the pilot reuse. However, according to [10], [14], when the number of BS antennas tends to infinity, the array response vectors for UEs with nonoverlapping AoAs are orthogonal, which makes it possible to avoid pilot contamination between UEs with distinct AoAs.
As highlighted in [5], even though UEs are not completely spatially uncorrelated in practice, we can minimize the pilot contamination effect by assigning orthogonal pilot sequences to UEs with overlapping AoA intervals, i.e., to UEs $k$ and $j$ for which $\mathcal{A}_k \cap \mathcal{A}_j \neq \emptyset$.
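The asymptotic orthogonality argument can be checked numerically for single-path array responses. The sketch below is an illustration only: it evaluates the normalized inner product of two ULA steering vectors with distinct AoAs for a growing number of antennas. The chosen angles and the function name `normalized_correlation` are assumptions of this example.

```python
import numpy as np

def steering_vector(theta, M, delta_r=0.5):
    """ULA steering vector with normalized element spacing delta_r."""
    m = np.arange(M)
    return np.exp(-1j * 2 * np.pi * m * delta_r * np.cos(theta))

def normalized_correlation(theta_a, theta_b, M, delta_r=0.5):
    """|e(theta_a)^H e(theta_b)| / M, which equals 1 when the two AoAs coincide."""
    a = steering_vector(theta_a, M, delta_r)
    b = steering_vector(theta_b, M, delta_r)
    return np.abs(np.vdot(a, b)) / M

# The correlation between two distinct AoAs is bounded by a term that decays as 1/M.
for M in (8, 32, 128):
    print(M, normalized_correlation(np.deg2rad(40), np.deg2rad(100), M))
```

For finite $M$ the correlation is small but nonzero, which is why UEs with nearby or overlapping AoA intervals should still receive orthogonal pilots.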

B. Uplink Data Transmission
For TDD systems, the downlink and uplink channels are reciprocal, i.e., the downlink channel can be inferred at the BS from the uplink one [13]. In this paper, we concentrate on the uplink data transmission phase and the design of the MMSE receiver at the BS. Given the estimated channel $\hat{\mathbf{H}}$, we use the optimum linear MMSE receiver derived in [10], which takes into account the error covariance matrix $\mathbf{R}_{\tilde{\mathbf{h}}_k}$. Because the channel estimation error is independent of $\hat{\mathbf{h}}_k$ [15], we can decompose the channel as $\mathbf{h}_k = \hat{\mathbf{h}}_k + \tilde{\mathbf{h}}_k$, where $\tilde{\mathbf{h}}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_{\tilde{\mathbf{h}}_k})$ is the channel estimation error for user $k$, with the error covariance matrix $\mathbf{R}_{\tilde{\mathbf{h}}_k}$ defined as in [11]. The symbol vector at time instant $t$ is then recovered by applying the MMSE receiver to the received signal.

IV. CHANNEL CHARTING BASED PILOT ALLOCATION
Next, we present the proposed channel charting (CC) aided pilot reuse scheme for massive MIMO deployments. Prior to the algorithm description, we present the principles of CC.

A. Channel Charting
The first step in building a CC is to extract suitable features from CSI. As reported in [11], the large-scale properties of multi-antenna systems can be captured in the second-order statistics of the radio environment, i.e., $\mathbf{R}_k$ carries information about large-scale fading. The large-scale propagation effects are related to the power dissipation through the propagation environment and the presence of objects between the transmitter and receiver. Thus, they vary over relatively large distances and time scales as compared to small-scale fading [16]. This is desirable for the features to be used by CC.
Our CC-based pilot assignment algorithm relies on angular domain information of channel covariance matrices and aims at assigning orthogonal pilots to users with overlapping AoA intervals to avoid pilot contamination. To this end, we must convert the information captured in the second-order statistics to the angular domain. We perform this transformation by computing the discrete Fourier transform (DFT) of the channel covariance matrices $\mathbf{R}_k$. For a large-antenna system, we can approximate the eigenvectors of the channel covariance matrix by the unitary DFT matrix $\mathbf{D}$, which satisfies $\mathbf{D}\mathbf{D}^H = \mathbf{I}_M$ [14], [17]. Therefore, we define the feature associated with UE $k$ in the angular domain, $\mathbf{C}_k$, as the representation of $\mathbf{R}_k$ in the DFT domain. To improve the quality of the features, the work in [11] proposes to take their absolute value; thus, we set $\tilde{\mathbf{C}}_k = |\mathbf{C}_k|$. This enhances the continuity and trustworthiness measures, which assess, respectively, whether neighbors in the high-dimensional space remain neighbors in the low-dimensional space and whether false neighbors are introduced in the CC domain.
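The feature extraction step can be sketched as follows. The ordering $\mathbf{D}^H \mathbf{R}_k \mathbf{D}$ used here is an assumption, since $\mathbf{D}\mathbf{R}_k\mathbf{D}^H$ is an equally plausible reading of the DFT-domain representation. The rank-one toy covariance and the function names `unitary_dft` and `angular_feature` are also illustrative.

```python
import numpy as np

def unitary_dft(M):
    """Unitary M x M DFT matrix D, so that D @ D.conj().T is the identity."""
    return np.fft.fft(np.eye(M)) / np.sqrt(M)

def angular_feature(R):
    """Magnitude of the covariance matrix expressed in the DFT (angular) basis.

    The ordering D^H R D is an assumption made for this sketch.
    """
    D = unitary_dft(R.shape[0])
    return np.abs(D.conj().T @ R @ D)

# Toy covariance: rank-one matrix built from a single steering vector (illustrative only).
M = 16
a = np.exp(-1j * 2 * np.pi * 0.5 * np.arange(M) * np.cos(np.deg2rad(60)))
R = np.outer(a, a.conj())
C_tilde = angular_feature(R)
f = C_tilde.reshape(-1)  # vectorized feature of length M**2
```

In this toy case the steering vector coincides with one DFT column, so the energy of `C_tilde` concentrates in a single angular bin. A covariance matrix with a wider AoA interval would spread over several neighboring bins.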
Let $\mathbf{f}_k \in \mathbb{R}^{M^2}$ represent the vectorized feature $\tilde{\mathbf{C}}_k$, such that $\mathbf{F} = [\mathbf{f}_1, \ldots, \mathbf{f}_K]$. After extracting the features $\mathbf{F}$, we apply a function $\mathcal{C}$ that generates the CC by mapping these features to a low-dimensional domain, i.e.,

$$\mathcal{C}: \mathbf{f}_k \mapsto \mathbf{z}_k \tag{13}$$

where $\mathbf{z}_k \in \mathbb{R}^{N}$ is the point in the $N$-dimensional CC corresponding to feature $\mathbf{f}_k$, and typically $N \ll M$. The remaining step is to determine $\mathcal{C}$ in (13), which maps the extracted features to a lower-dimensional embedding. Several unsupervised dimensionality reduction techniques have been proposed for this purpose. Principal component analysis (PCA) is a widely used algorithm, which performs a linear projection of high-dimensional data onto a subspace of lower dimensionality [18]. In this work, we use PCA as the CC method to obtain the information required by the pilot allocation algorithm.
The main idea behind PCA is to find the components that maximize the variance of the projected data. To obtain the CC points via PCA, we first center the features at zero by subtracting their mean. Then, we apply eigenvalue decomposition to the covariance matrix of the centered features, which yields $\mathbf{\Sigma}$, the diagonal matrix formed by the eigenvalues sorted in descending order, $\lambda_1, \ldots, \lambda_K$, and $\mathbf{U} = [\mathbf{u}_1, \ldots, \mathbf{u}_K] \in \mathbb{R}^{K \times K}$, the unitary eigenvector matrix related to $\mathbf{\Sigma}$. Since $\mathbf{U}$ is unitary, the eigenvalues in $\mathbf{\Sigma}$ represent the variances related to each eigenvector [19]. Finally, we obtain the CC points $\mathbf{Z} = [\mathbf{z}_1, \ldots, \mathbf{z}_K] \in \mathbb{R}^{N \times K}$ from the $N$ principal components.
Fig. 1 illustrates the CC mapping. Fig. 1(a) shows a scenario where 64 UEs are served by a massive MIMO BS equipped with a 128-element ULA. We considered the channel model described in (1) with $\sigma_\theta = 10^\circ$, $\Delta_r = 0.5$, and $L = 200$. The triangle indicates the BS position, while the UEs are represented by circles that are colored based on their AoAs with respect to the BS. Fig. 1(b) shows the CC obtained via PCA. We notice that the CC preserves the angular information well for users that are near each other in the angular domain, i.e., UEs with a similar AoA are mapped close together in the CC domain. However, UEs that have large azimuth distances suffer from distortions in the CC, which incorrectly maps them closer than they should be [20]. The authors in [20] propose a dimensionality reduction method based on Euclidean distance matrix completion to handle this issue. Despite this minor distortion, the simulation results in Section V show that the proposed CC-based method performs well.
Fig. 1. (a) Considered system scenario. (b) Channel charting mapping.
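The PCA steps described above can be sketched as follows. This is a hedged illustration, not the paper's exact expressions: it assumes the chart is built from the $K \times K$ Gram matrix of the centered features, with the top-$N$ eigenvectors scaled by the square roots of their eigenvalues, which is one standard PCA formulation. The function name `pca_channel_chart` and the random placeholder features are hypothetical.

```python
import numpy as np

def pca_channel_chart(F, N=2):
    """Map features F (one column per UE) to an N x K channel chart.

    Assumption: the chart uses the K x K Gram matrix of the centered features
    and scales the top-N eigenvectors by the square roots of their eigenvalues.
    """
    F_centered = F - F.mean(axis=1, keepdims=True)   # subtract the mean feature
    gram = F_centered.T @ F_centered                  # K x K, symmetric PSD
    eigvals, eigvecs = np.linalg.eigh(gram)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:N]             # indices of the N largest
    lam = np.clip(eigvals[order], 0.0, None)          # guard against tiny negatives
    return (eigvecs[:, order] * np.sqrt(lam)).T       # shape (N, K)

rng = np.random.default_rng(1)
F = rng.random((256, 64))        # placeholder features with M^2 = 256 and K = 64
Z = pca_channel_chart(F, N=2)    # one 2-D chart point per UE
```

Working with the $K \times K$ Gram matrix instead of an $M^2 \times M^2$ covariance keeps the eigendecomposition small when $K \ll M^2$, as in the 64-UE, 128-antenna scenario.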

B. Pilot Allocation Strategy
Here, we present the proposed CC-based pilot allocation algorithm, shown in Algorithm 1. Our algorithm relies on the observation that a PCA-based CC map for a typical mMTC uplink communication scenario has a curved line shape, as illustrated in Fig. 1(b). Thus, provided that the CC captures the angular distances among UEs well, our greedy algorithm allocates the orthogonal pilot sequences in the order $\boldsymbol{\phi}_1, \boldsymbol{\phi}_2, \ldots, \boldsymbol{\phi}_\tau, \boldsymbol{\phi}_1, \boldsymbol{\phi}_2, \ldots$ while traversing the CC curve, aiming at maximizing the distances between UEs that share the same pilot sequence.
The first step in Algorithm 1 is to allocate the first pilot sequence $\boldsymbol{\phi}_1$ to one of the users. For now, allow $\boldsymbol{\phi}_1$ to be allocated to a random UE; later on, we discuss the impact of an alternative initialization. Then, we greedily allocate the next orthogonal pilot sequence, $\boldsymbol{\phi}_2$, to the unassigned UE with the smallest distance to the previously allocated user $k'$, i.e., we find $k$ that minimizes $\|\mathbf{z}_k - \mathbf{z}_{k'}\|_2$ (Line 9). We repeat this process until all orthogonal pilot sequences have been allocated. Once the last orthogonal pilot sequence $\boldsymbol{\phi}_\tau$ has been allocated after $\tau$ iterations, we repeat the steps, starting by allocating the first sequence $\boldsymbol{\phi}_1$ to the unassigned UE closest to the previously allocated user. We continue until all UEs have been assigned a pilot sequence. Note that once the CC has been generated, the pilot allocation strategy summarized in Algorithm 1 enjoys low computational complexity owing to the greedy search among the UEs.

Algorithm 1: CC-based pilot assignment algorithm
Input : 1) The number of UEs $K$, 2) the pilot length $\tau$, and 3) the CC points $\mathbf{z}_k$, $k \in \mathcal{K}$.
Output: A pilot assignment $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_K]^T$.
1  Initialize the set of unassigned UEs $\mathcal{K}_{un} = \mathcal{K}$, and the set of unassigned pilots $\mathcal{T}_{un} = \mathcal{T}$.
2  Select a random UE $k$ and initialize the auxiliary variable $k'$ with it, i.e., $k' = k$.
3  Assign $\boldsymbol{\phi}_1$ to user $k$ and update the sets of unassigned UEs and pilots, i.e., $\mathcal{K}_{un} \leftarrow \mathcal{K}_{un} \setminus \{k\}$ and $\mathcal{T}_{un} \leftarrow \mathcal{T}_{un} \setminus \{1\}$.
4  Initialize the auxiliary variable: $p = 2$.
5  while $\mathcal{K}_{un} \neq \emptyset$ do
6      if $\mathcal{T}_{un} = \emptyset$ then
7          Reinitialize: $\mathcal{T}_{un} = \mathcal{T}$ and $p = 1$.
8      end
9      Assign pilot $\boldsymbol{\phi}_p$ to user $k$, i.e., $\mathbf{x}_k = \boldsymbol{\phi}_p$, that satisfies $k = \arg\min_{j \in \mathcal{K}_{un}} \|\mathbf{z}_j - \mathbf{z}_{k'}\|_2$.
10     Update the set of unassigned UEs, $\mathcal{K}_{un} \leftarrow \mathcal{K}_{un} \setminus \{k\}$, and the set of unassigned pilots, $\mathcal{T}_{un} \leftarrow \mathcal{T}_{un} \setminus \{p\}$.
11     Update $k' = k$ and $p = p + 1$.
12 end
Assuming that the CC perfectly captures the angular distances among UEs and that we make a "good" choice when assigning the first pilot sequence (i.e., we choose one of the UEs at the extremities of the angular domain), our algorithm efficiently assigns the pilot sequences to avoid pilot contamination. Note that the initialization causes additional interference only if the last UE processed by the algorithm receives the first pilot sequence.
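The greedy traversal of Algorithm 1 can be sketched in a few lines. This is a minimal illustration that uses zero-based pilot indices and returns them instead of the pilot matrix. The function name `cc_pilot_assignment` and the optional `start` argument, which fixes the first UE for reproducibility, are hypothetical additions.

```python
import numpy as np

def cc_pilot_assignment(Z, tau, start=None, rng=None):
    """Greedy pilot assignment following Algorithm 1.

    Z is the N x K chart and tau the number of orthogonal pilots. Returns an
    array of pilot indices in {0, ..., tau - 1}. The optional `start` argument
    (hypothetical) fixes the first UE instead of drawing it at random.
    """
    K = Z.shape[1]
    if start is None:
        rng = np.random.default_rng() if rng is None else rng
        start = int(rng.integers(K))
    pilots = np.full(K, -1)
    pilots[start] = 0                       # first pilot goes to the initial UE
    unassigned = set(range(K)) - {start}
    prev, p = start, 1
    while unassigned:
        if p == tau:                        # all pilots used: restart the cycle
            p = 0
        cand = np.array(sorted(unassigned))
        dists = np.linalg.norm(Z[:, cand] - Z[:, [prev]], axis=0)
        k = int(cand[np.argmin(dists)])     # nearest unassigned UE in the chart
        pilots[k] = p
        unassigned.remove(k)
        prev, p = k, p + 1
    return pilots

Z = np.vstack([np.arange(8.0), np.zeros(8)])   # eight UEs on a straight line
assignment = cc_pilot_assignment(Z, tau=4, start=0)
```

For this toy chart, starting from the UE at one extremity gives the assignment `[0, 1, 2, 3, 0, 1, 2, 3]`, so UEs sharing a pilot are four positions apart along the line.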

V. SIMULATION RESULTS
We consider $K = 64$ users uniformly distributed in a cell, as depicted in Fig. 1(a), and a BS equipped with a critically spaced ($\Delta_r = 0.5$) 128-element ULA. The propagation channel between the $k$th user and the BS consists of $L = 200$ paths. We assume channel normalization, i.e., $\beta_k = 1, \forall k$. We use binary phase shift keying (BPSK) for the channel estimation and quadrature phase shift keying (QPSK) for the data transmission.
We consider three baseline pilot assignment methods:
• "Random": A random pilot assignment scheme.
• "Real position": A multi-cell pilot assignment method proposed in [9], adapted to a single-cell scenario. This method relies on the exact UEs' positions and does not require covariance information.
• "SGPS": The statistical greedy pilot scheduling (SGPS) developed in [10] that, similarly to our method, relies on the knowledge of the channel covariance matrices. In contrast, we perform CC on these matrices to assign the pilots, which also obviates the need to store them at the BS.
We evaluate the performance of the methods in terms of 1) the normalized mean square error in the channel estimation, which is numerically evaluated via Monte-Carlo averaging over independent simulation realizations and presented as the normalized average square error (NASE), and 2) the symbol error rate (SER), which assesses the impact of the pilot reuse on the data rate.
Fig. 2 depicts the impact of the angular standard deviation $\sigma_\theta$ on the channel estimation error (Figs. 2(a)-(c)) and the SER (Figs. 2(d)-(f)) as a function of SNR for a pilot reuse factor $K/\tau = 4$. We can see from Figs. 2(a)-(c) that the proposed method outperforms the baseline methods in all simulated scenarios. One should note that the "Real position" pilot assignment method was primarily developed for a multi-cell scenario: UEs with similar AoAs are grouped and assigned orthogonal sequences, but no policy guides the allocation within the groups. Also, we observe that the smaller the AoA interval $\mathcal{A}_k$, the smaller the channel estimation error. This happens because the likelihood of UEs having overlapping AoA intervals decreases as the intervals shrink. Since the receiver relies on the estimated channel, the SER is expected to improve as the channel estimate improves, which is confirmed in Figs. 2(d)-(f). It can be further noticed that $\sigma_\theta$ has a great impact on both the SER and the NASE, e.g., the NASE for the random pilot assignment with $\sigma_\theta = 10^\circ$ is lower than that obtained for all compared methods with $\sigma_\theta = 15^\circ$.
Fig. 3 presents the performance of the proposed method for different pilot reuse factors $K/\tau \in \{2, 4, 8\}$ and a fixed angular standard deviation $\sigma_\theta$. Following [4], we adopt $\sigma_\theta = 10^\circ$, which is a suitable value for an urban environment. Similar to Fig. 2, the proposed scheme outperforms all compared methods. We observe from Fig. 3(c) that the proposed algorithm provides a significant improvement for a pilot reuse factor of 2: for example, at 20 dB SNR, we achieve more than 12.5 dB gain with respect to random pilot assignment and 6 dB gain with respect to SGPS. From Figs. 3(d)-(f), we notice how strongly the pilot reuse factor affects the SER. At 20 dB SNR, the proposed method achieves SER values ranging from $2.8 \times 10^{-1}$ for a pilot reuse factor of 8 down to $5 \times 10^{-5}$ when $K/\tau = 2$.
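The two performance metrics can be computed along the following lines. This is a sketch under stated assumptions: the NASE is taken here as the squared estimation error normalized by the true channel energy and averaged over Monte-Carlo realizations, which may differ from the paper's exact definition. The function names `nase_db` and `symbol_error_rate` are hypothetical.

```python
import numpy as np

def nase_db(H_true, H_est):
    """Normalized squared estimation error in dB, averaged over realizations.

    Inputs have shape (realizations, M, K). Normalizing by the true channel
    energy is an assumption; the paper's exact definition may differ.
    """
    err = np.sum(np.abs(H_est - H_true) ** 2, axis=(1, 2))
    ref = np.sum(np.abs(H_true) ** 2, axis=(1, 2))
    return 10 * np.log10(np.mean(err / ref))

def symbol_error_rate(sent, detected):
    """Fraction of detected symbols that differ from the transmitted ones."""
    sent, detected = np.asarray(sent), np.asarray(detected)
    return np.mean(sent != detected)

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 4, 3)) + 1j * rng.standard_normal((5, 4, 3))
nase = nase_db(H, 0.9 * H)   # a 10 % amplitude error corresponds to -20 dB
ser = symbol_error_rate([0, 1, 2, 3], [0, 1, 3, 3])
```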

VI. CONCLUSIONS
This paper addressed the pilot contamination problem in a single cell for massive MIMO systems. We proposed a novel pilot reuse approach which relies on CC to exploit the spatial/angular information present in CSI to tackle the pilot contamination problem. The proposed pilot assignment utilizes CC mapping to maximize the AoA distances between the same pilot sequences. The proposed method showed significant improvements in terms of channel estimation error and symbol error rate as compared to existing pilot allocation schemes.