Circuit-aware design of energy-efficient massive MIMO systems


Published in:
2014 6th IEEE International Symposium on Communications, Control, and Signal Processing (ISCCSP)

Document Version:
Peer reviewed version

Publisher rights
© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Download date: 08. Dec. 2018
CIRCUIT-AWARE DESIGN OF ENERGY-EFFICIENT MASSIVE MIMO SYSTEMS

Emil Björnson†  Michail Matthaiou‡  Mérouane Debbah*

* Alcatel-Lucent Chair on Flexible Radio, SUPELEC, Gif-sur-Yvette, France
† ACCESS Centre, Dept. of Signal Processing, KTH Royal Institute of Technology, Stockholm, Sweden
‡ ECIT Institute, Queen’s University Belfast, Belfast, U.K. and S2, Chalmers University of Technology, Gothenburg, Sweden

ABSTRACT

Densification is a key to greater throughput in cellular networks. The full potential of coordinated multipoint (CoMP) can be realized by massive multiple-input multiple-output (MIMO) systems, where each base station (BS) has very many antennas. However, the improved throughput comes at the price of more infrastructure; hardware cost and circuit power consumption scale linearly/affinely with the number of antennas. In this paper, we show that one can make the circuit power increase with only the square root of the number of antennas by circuit-aware system design. To this end, we derive achievable user rates for a system model with hardware imperfections and show how the level of imperfections can be gradually increased while maintaining high throughput. The connection between this scaling law and the circuit power consumption is established for different circuits at the BS.

1. INTRODUCTION

We consider a cellular network where each BS communicates with $K$ unique single-antenna user equipments (UEs). Interference is a major limiting factor in these systems but can be handled in the spatial domain by CoMP methods, where several antennas, $N$, are deployed at each BS [1]. The massive MIMO paradigm, where $N \gg K$, has gained particular traction in recent years [2], because it allows for distributed interference coordination and brings robustness to imperfect channel state information (CSI).

Two important practical issues with the deployment of large antenna arrays are the increased hardware cost and circuit power consumption [3]; these scale linearly/affinely with $N$ unless we redesign the network with these two issues in mind. Low-cost energy-efficient transceiver equipment suffers from hardware imperfections, which must be modeled properly if accurate conclusions are to be drawn [4–6]. In this paper, we consider an uplink system distorted by multiplicative phase-drifts, additive distortion noise, and noise amplifications. We derive achievable user rates and prove that the level of imperfections can be gradually increased with $N$. The practical implications of this scaling law are established for three circuits at the BS: analog-to-digital converter (ADC), low noise amplifier (LNA), and local oscillator (LO). These are the main components of the typical receiver illustrated in Fig. 1.

2. SYSTEM MODEL

We consider the uplink of a network with $L \geq 1$ cells. The flat-fading channel from UE $k$ in cell $l$ to BS $j$ is denoted as $h_{jlk} \triangleq [h_{jlk}^1 \ldots h_{jlk}^N]^T \in \mathbb{C}^N$ and is modeled as block fading; thus, it is static for a coherence block of $T$ channel uses and has independent realizations between blocks. Each channel is zero-mean circularly symmetric complex Gaussian distributed as $h_{jlk} \sim \mathcal{CN}(0, \lambda_{jlk} I_N)$, where the average channel attenuation $\lambda_{jlk} > 0$ depends on the large-scale fading.

The signal $x_{lk}(t)$ sent by UE $k$ in cell $l$ at channel use $t$ satisfies a power constraint of $\mathbb{E}\{|x_{lk}(t)|^2\} = p_{lk}$, and $x_{l}(t) \triangleq [x_{l1}(t) \ldots x_{lK}(t)]^T \in \mathbb{C}^K$ is the transmit signal in cell $l$.

Contrary to prior works on massive MIMO (e.g., [2] and references therein), the receiver branches at the BSs are assumed to be imperfect. Interestingly, it is shown in [4–6] that imperfect hardware mainly causes multiplicative phase-drifts, additive distortion noise, and noise amplifications. Based on these prior works, the received signal $y_j(t)$ in $\mathbb{C}^N$ at BS $j$ at channel use $t \in \{1, \ldots, T\}$ in the coherence block is modeled as

$$y_j(t) = D_{\phi_j}(t) \sum_{l=1}^{L} H_{jl} x_l(t) + v_j(t) + \eta_j(t) \tag{1}$$

where $H_{jl} \triangleq [h_{jl1} \ldots h_{jlK}] \in \mathbb{C}^{N \times K}$ is the channel from all UEs in cell $l$ to BS $j$.

The SINR of UE $k$ in cell $j$ at channel use $t$ is given by

$$\text{SINR}_{jk}(t) = \frac{p_{jk} |\mathbb{E}\{v_{jk}^H(t) h_{jjk}(t)\}|^2}{\sum_{l=1}^L \sum_{m=1}^K p_{lm} \mathbb{E}\{|v_{jk}^H(t) h_{jlm}(t)|^2\} - p_{jk} |\mathbb{E}\{v_{jk}^H(t) h_{jjk}(t)\}|^2 + \mathbb{E}\{|v_{jk}^H(t) v_{j}(t)|^2\} + \sigma^2 \mathbb{E}\{\|v_{jk}(t)\|^2\}} \tag{7}$$

where $\mathbf{D}_{\delta(t)} = \text{diag}(e^{-\frac{\delta}{2}(t-1)}, e^{-\frac{\delta}{2}(t-2)}, \ldots, e^{-\frac{\delta}{2}(t-B)})$, $\mathbf{X}_{\kappa(t)} = \sum_{l=1}^K \lambda_{jl} \mathbf{x}_{\ell m} + \sigma^2 \mathbf{I}_B$, $\mathbf{X}_{\kappa(t)} = \mathbf{x}_{\ell m} + \kappa^2 \mathbf{D}_{\kappa(t)}$, and $\mathbf{D}_{\kappa(t)} = \text{diag}(|\mathbf{x}_{\ell m}(1)|^2, \ldots, |\mathbf{x}_{\ell m}(B)|^2)$. The channel estimates in Lemma 1 are utilized to select receive filters $v_{jk}(t) \in \mathbb{C}^N$. Using an approach from [7] and [6, Lemma 1], an achievable rate for UE $k$ in cell $j$ is

$$R_{jk} = \frac{1}{T} \sum_{t=B+1}^{T} \log_2 (1 + \text{SINR}_{jk}(t)) \quad \text{[bit/channel use]}$$

where $\text{SINR}_{jk}(t)$ is given in (7). The expectations in (7) can be computed in closed form if the BS applies maximum ratio combining (MRC): $v_{jk}(t) = \hat{\mathbf{h}}_{jk}(t)$.
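As a rough illustration of the rate expression, the following sketch averages $\log_2(1 + \text{SINR}_{jk}(t))$ over the data part of the coherence block. The function name and the constant SINR value are hypothetical; only $B = 8$ and $T = 500$ are taken from the numerical example in Section 5. In practice the per-use SINRs would come from (7).

```python
import math

def achievable_rate(sinr_per_use, pilot_length, coherence_block):
    """R = (1/T) * sum_{t=B+1}^{T} log2(1 + SINR(t)), in bit/channel use.

    sinr_per_use[i] is the linear-scale SINR at channel use t = B + 1 + i.
    """
    expected = coherence_block - pilot_length
    if len(sinr_per_use) != expected:
        raise ValueError(f"need {expected} SINR values, got {len(sinr_per_use)}")
    # The first B channel uses carry pilots, so they add nothing to the sum,
    # but the normalization is still by the full block length T.
    return sum(math.log2(1.0 + s) for s in sinr_per_use) / coherence_block

# Hypothetical input: a constant SINR of 3 (about 4.8 dB) on every data use.
B, T = 8, 500
rate = achievable_rate([3.0] * (T - B), B, T)
print(f"{rate:.3f} bit/channel use")  # 1.968 bit/channel use
```

The pre-log loss from pilots appears here as the factor $(T - B)/T$: with a constant SINR the rate is $(492/500) \cdot 2$ bit/channel use.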

**Lemma 2.** If the MRC receive filter is used, then

$$\mathbb{E}\{\|v_{jk}(t)\|^2\} = N \lambda^2_{jk} \hat{x}_{jk}^2 \mathbb{E}\{|v_{jk}^H(t)|^2\}$$

$$\mathbb{E}\{v_{jk}^H(t) h_{jjk}(t)\} = \mathbb{E}\{\|v_{jk}(t)\|^2\}$$

$$\mathbb{E}\{|v_{jk}^H(t) h_{jlm}(t)|^2\} = \mathbb{E}\{\|v_{jk}(t)\|^2\}$$

$$+ N \lambda^2_{jk} \hat{x}_{jk}^2 \mathbb{E}\{\mathbf{D}_{\delta(t)}\mathbf{X}_{\kappa(t)} \mathbf{X}_{\kappa(t)}^{-1} \mathbf{D}_{\delta(t)}^H \mathbf{X}_{\kappa(t)} \}$$

$$\times \{\lambda^2_{jk} \hat{x}_{jk}^2 \mathbf{D}_{\delta(t)} \mathbf{X}_{\kappa(t)} \mathbf{X}_{\kappa(t)}^{-1} \mathbf{D}_{\delta(t)}^H \mathbf{X}_{\kappa(t)} \}$$

if a CLO is used, or

$$\mathbb{E}\{|v_{jk}^H(t) v_{j}(t)|^2\} = \kappa^2 \mathbb{E}\{\|v_{jk}(t)\|^2\} \sum_{l=1}^L \sum_{m=1}^K p_{lm} \lambda_{jl} \hat{x}_{jl}^2$$

$$+ \mathbb{E}\{\|v_{jk}(t)\|^2\} \sum_{l=1}^L \sum_{m=1}^K p_{lm} N \lambda^2_{jk} \hat{x}_{jk}^2 \mathbb{E}\{\mathbf{D}_{\delta(t)} \mathbf{X}_{\kappa(t)} \mathbf{X}_{\kappa(t)}^{-1} \mathbf{D}_{\delta(t)}^H \mathbf{X}_{\kappa(t)} \}$$

The expectations for SLOs were previously derived in [6, Theorem 2] and, interestingly, only the second-order moments are different with a CLO. Hence, the case with the smallest interference variance $\sum_{l=1}^L \sum_{m=1}^K p_{lm} \mathbb{E}\{|v_{jk}^H(t) h_{jlm}(t)|^2\}$ gives the largest rate for UE $k$ in cell $j$. This term depends mainly on the pilot sequences and phase drifts, as seen from the expressions in Lemma 2. The difference is that \( X_{\ell m} \) with a CLO is replaced by \( D_{\delta(t)} X_{\ell m} D_{\delta(t)}^H \) with SLOs. These terms are equal when there are no phase drifts (i.e., \( \delta = 0 \)), while the difference grows larger with \( \delta \). In particular, the term \( X_{\ell m} \) is unaffected by the time index \( t \), while the corresponding term for SLOs decays as \( e^{-\delta t} \) (from \( D_{\delta(t)} \)). Hence, we expect SLOs to provide larger user rates than a CLO, because interference reduces faster with \( t \) when the independent phase drifts mitigate pilot contamination.

By letting \( N \to \infty \) in Lemma 2, one can obtain closed-form expressions for the asymptotic SINRs. It can be seen that the detrimental impact of hardware imperfections vanishes almost completely as \( N \) grows large [6, Corollary 1]. This result holds for any fixed values of the parameters \( \delta, \kappa, \) and \( \xi \). It is also possible to increase these parameters with \( N \). This gives a gradual degradation of the circuits’ quality at the BS, provided that the increase fulfills the following scaling law.

**Lemma 3.** Suppose the hardware imperfection parameters are replaced as \( \kappa^2 \to \kappa_0^2 N^{\tau_1}, \) \( \xi \to \xi_0 N^{\tau_2}, \) and \( \delta \to \delta_0 (1 + \log_e(N^{\tau_3})) \), for some scaling parameters \( \tau_1, \tau_2, \tau_3 \geq 0 \) and some initial values \( \kappa_0, \xi_0, \delta_0 \geq 0 \). If

\[
\begin{cases}
\max(\tau_1, \tau_2) \leq \frac{1}{2} \quad \text{and} \quad \tau_3 = 0 & \text{if a CLO} \\
\max(\tau_1, \tau_2) + \frac{\delta_0 (1 + \log_e(N^{\tau_3}))}{2} \tau_3 \leq \frac{1}{2} & \text{if SLOs},
\end{cases}
\]

then \( \text{SINR}_{jk}(t) \) with MRC has a non-zero limit as \( N \to \infty \).

**Proof.** This follows along the lines of [6, Corollary 3].

Lemma 3 proves that the circuit design can be relaxed as \( N \) increases. By accepting larger distortions we can achieve better energy efficiency in the circuits or lower hardware costs; see Section 4. The scaling law shows that the variances of the additive distortion noise and noise amplification can be increased simultaneously, each proportionally to a power of \( N \). The phase-drift variance can scale only for SLOs and only logarithmically with \( N \), since it affects the signal itself. In this case, Lemma 3 manifests a trade-off between increasing the imperfections that cause additive distortions and those that cause multiplicative distortions.
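To make the substitutions in Lemma 3 concrete, the sketch below evaluates the scaled parameters for a given \( N \) and checks the CLO branch of the condition. The function names are hypothetical, and the SLO branch is deliberately not implemented here; the initial values are those quoted in Section 5.

```python
import math

def scaled_imperfections(n, kappa0, xi0, delta0, tau1, tau2, tau3):
    """Return (kappa^2, xi, delta) after the substitutions in Lemma 3."""
    kappa_sq = kappa0**2 * n**tau1                   # kappa^2 -> kappa0^2 N^tau1
    xi = xi0 * n**tau2                               # xi -> xi0 N^tau2
    delta = delta0 * (1.0 + math.log(n**tau3))       # delta -> delta0 (1 + ln N^tau3)
    return kappa_sq, xi, delta

def clo_condition_holds(tau1, tau2, tau3):
    """CLO branch of Lemma 3: max(tau1, tau2) <= 1/2 and tau3 = 0."""
    return max(tau1, tau2) <= 0.5 and tau3 == 0

# Initial values from Section 5, scaled to N = 256 with tau1 = tau2 = 1/2.
kappa_sq, xi, delta = scaled_imperfections(256, 2**-8, 10**0.2, 1.6e-4, 0.5, 0.5, 0)
print(kappa_sq == 2**-12, clo_condition_holds(0.5, 0.5, 0))  # True True
```

With \( \tau_1 = \frac{1}{2} \) and \( N = 256 \), the distortion variance grows by a factor of 16, from \( 2^{-16} \) to \( 2^{-12} \), while the phase-drift variance is unchanged because \( \tau_3 = 0 \).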

### 4. SCALING LAW AWARE CIRCUIT DESIGN

We now exemplify what the scaling law in Lemma 3 means for the hardware components at the BS, depicted in Fig. 1.

#### 4.1. Analog-to-Digital Converter (ADC)

The ADC quantizes the received signal to \( b \)-bit resolution. The quantization error can be included in the additive distortion noise \( v_j(t) \) and contributes \( 2^{-2b} \) to \( \kappa^2 \) [5]. The scaling law in Lemma 3 allows us to increase the variance \( \kappa^2 \) as \( N^{\tau_1} \) for \( \tau_1 \leq \frac{1}{2} \). This corresponds to reducing the resolution of the ADC by \( \frac{\tau_1}{2} \log_2(N) \) bits, which allows for substantial cost reduction. For example, with \( \tau_1 = \frac{1}{2} \) we can reduce the ADC resolution by 2 bits if we deploy 256 antennas instead of one. For very large arrays, it is even sufficient to use 1-bit ADCs.

The power dissipation of an ADC, \( P_{ADC} \), is proportional to \( 2^{2b} \) [5, Eq. (8)] and can, thus, be decreased as \( 1/N^{\tau_1} \). If each antenna has a separate ADC, the total power \( NP_{ADC} \) still increases with \( N \) but proportionally to \( N^{1-\tau_1} \) for \( \tau_1 \leq \frac{1}{2} \), instead of \( N \), due to the gradually lower ADC resolution.
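The ADC arithmetic can be checked with a short sketch. Since the quantization contribution to \( \kappa^2 \) is \( 2^{-2b} \), multiplying \( \kappa^2 \) by \( N^{\tau_1} \) removes \( \frac{\tau_1}{2} \log_2(N) \) bits, and a power proportional to \( 2^{2b} \) then falls as \( 1/N^{\tau_1} \). The function names below are hypothetical.

```python
import math

def adc_bit_reduction(n, tau1):
    """Bits that can be dropped when kappa^2 ~ 2^(-2b) grows as N^tau1."""
    return 0.5 * tau1 * math.log2(n)

def total_adc_power_ratio(n, tau1):
    """N * P_ADC(N) relative to a single full-resolution ADC: N^(1 - tau1)."""
    return n ** (1.0 - tau1)

print(adc_bit_reduction(256, 0.5))      # 2.0
print(total_adc_power_ratio(256, 0.5))  # 16.0
```

For 256 antennas and \( \tau_1 = \frac{1}{2} \), this reproduces the 2-bit reduction quoted above, and the total ADC power grows 16-fold rather than 256-fold.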

#### 4.2. Low Noise Amplifier (LNA)

The LNA is an analog circuit that amplifies the received signal. It is shown in [8] that the behavior of an LNA is characterized by the figure-of-merit (FoM) expression

\[
\text{FoM}_{LNA} = \frac{G}{(\xi - 1) P_{LNA}}
\]

where \( \xi \) is the noise amplification factor defined in Section 2, \( G \) is the amplifier gain, and \( P_{LNA} \) is the power consumption of the LNA. For optimized LNAs, \( \text{FoM}_{LNA} \) is a constant determined by the circuit architecture [8]; thus, \( \text{FoM}_{LNA} \) basically scales with the hardware cost. The scaling law in Lemma 3 allows us to increase \( \xi \) proportionally to \( N^{\tau_2} \) for \( \tau_2 \leq \frac{1}{2} \). The noise figure, defined as \( 10 \log_{10}(\xi) \), can thus be increased by \( 10 \tau_2 \log_{10}(N) \) dB. For example, with \( \tau_2 = \frac{1}{2} \) we can increase it by 10 dB if we deploy 100 antennas instead of one.

For a given architecture, the invariance of \( \text{FoM}_{LNA} \) [8] implies that we can decrease the power consumption (roughly) proportionally to \( 1/N^{\tau_2} \). Hence, we can make the total power consumption of the \( N \) LNAs, \( NP_{LNA} \), increase as \( N^{1-\tau_2} \) instead of \( N \) by increasing the noise amplification.
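A minimal sketch of the LNA trade-off follows, assuming the gain \( G \) and \( \text{FoM}_{LNA} \) are held fixed so that \( P_{LNA} \) is proportional to \( 1/(\xi - 1) \). The function names and the numeric inputs are hypothetical.

```python
import math

def noise_figure_increase_db(n, tau2):
    """Increase of 10*log10(xi) in dB when xi grows as N^tau2."""
    return 10.0 * tau2 * math.log10(n)

def lna_power_ratio(xi_old, xi_new):
    """P_new / P_old at fixed FoM and gain: P is proportional to 1/(xi - 1)."""
    return (xi_old - 1.0) / (xi_new - 1.0)

# 100 antennas with tau2 = 1/2 allow a 10 dB higher noise figure.
print(f"{noise_figure_increase_db(100, 0.5):.1f} dB")  # 10.0 dB

# Raising xi from 2 to 11 cuts the LNA power to one tenth.
print(lna_power_ratio(2.0, 11.0))  # 0.1
```

Because the figure of merit depends on \( \xi - 1 \) rather than \( \xi \), the power saving equals \( 1/N^{\tau_2} \) only approximately, which is why the text says "roughly"; the approximation tightens as \( \xi \) grows well above 1.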

#### 4.3. Local Oscillator (LO)

Phase noise in the LOs is the main source of multiplicative phase drifts. If the LOs are free-running, the drifts are modeled by the Wiener process, defined in Section 2 with variance

\[
\delta = 4\pi^2 f_c^2 T_s \zeta \tag{10}
\]

where \( f_c \) is the carrier frequency, \( T_s \) is the symbol time, and \( \zeta \) is a constant that characterizes the quality of the LO [9]. Moreover, the power dissipation \( P_{LO} \) in an LO is directly coupled to \( \zeta \), such that \( P_{LO} \zeta \approx \text{FoM}_{LO} \) where the FoM value \( \text{FoM}_{LO} \) depends on the circuit architecture [9, 10] and naturally on the hardware cost. For a given architecture, we can increase \( \delta \) and, thereby, decrease the power \( P_{LO} \). The scaling law in Lemma 3 allows us to increase \( \delta \) proportionally to \( (1 + \log_e(N^{\tau_3})) \) when using SLOs. Hence, the power dissipation in the LOs can be reduced as \( \frac{1}{1 + \log_e(N^{\tau_3})} \). This reduction is only logarithmic in \( N \), which stands in contrast to the \( 1/\sqrt{N} \) scalings for ADCs and LNAs (with \( \tau_1 = \tau_2 = \frac{1}{2} \)). Since linear increase is much faster than logarithmic decay, the total power \( NP_{LO} \) with SLOs increases almost linearly with \( N \). Note that no scalings are allowed when having a CLO.
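The contrast between logarithmic and power-law savings can be seen numerically. The sketch below assumes \( P_{LO} \zeta \) is constant for a given architecture, so that \( P_{LO} \) falls as \( 1/(1 + \log_e(N^{\tau_3})) \). The value \( \tau_3 = 0.5 \) is purely illustrative and is not checked against the condition in Lemma 3; the function names are hypothetical.

```python
import math

def lo_power_ratio(n, tau3):
    """P_LO(N) / P_LO(1) when delta grows as 1 + ln(N^tau3) at fixed FoM_LO."""
    return 1.0 / (1.0 + tau3 * math.log(n))

def total_lo_power_ratio(n, tau3):
    """N * P_LO(N) relative to one unscaled LO."""
    return n * lo_power_ratio(n, tau3)

def sqrt_law_total(n):
    """Reference: total power under the 1/sqrt(N) per-circuit scaling of ADCs/LNAs."""
    return n / math.sqrt(n)

tau3 = 0.5  # illustrative value only
for n in (1, 10, 100, 1000):
    print(n, round(total_lo_power_ratio(n, tau3), 1), round(sqrt_law_total(n), 1))
```

The second column grows nearly in proportion to \( N \), whereas the third grows only as \( \sqrt{N} \), matching the observation that the total LO power with SLOs increases almost linearly.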

The LO variance formula in (10) gives other possibilities than decreasing the circuit power. In particular, one can increase the carrier frequency \( f_c \) with \( N \) by exploiting the scaling law in Lemma 3. This is interesting because massive MIMO has been identified as a key enabler for operating in millimeter wave bands, in which phase noise is more severe since the variance \( \delta \) in (10) increases as \( f_c^2 \). Fortunately, massive MIMO has an inherent resilience towards phase noise.

5. NUMERICAL EXAMPLE

The analytic results are corroborated by simulating a scenario with 16 cells and wrap-around to avoid edge effects; see Fig. 2. Each square cell is 250 × 250 meters and divided into 8 virtual sectors; each sector contains one uniformly distributed UE (with minimum distance 35 meters). Each sector has an orthogonal pilot sequence from a DFT matrix [6], but the same pilot is reused in the same sector of other cells.

The channel attenuations are $\lambda_{jlk} = 10^{s_{jlk} - 1.53} / d_{jlk}^{3.76}$, where $d_{jlk}$ is the distance in meters between BS $j$ and UE $k$ in cell $l$ and $s_{jlk} \sim \mathcal{N}(0, 0.25)$ is a realization of the shadow-fading. The transmit powers are $p_{lk} = -47$ dBm/Hz, the thermal noise power is $\sigma^2 = -174$ dBm/Hz, $B = 8$ is the pilot sequence length, and the coherence block is $T = 500$.
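The large-scale fading model can be sketched as below, reading the attenuation as $10^{s - 1.53} / d^{3.76}$ and assuming that 0.25 is the variance of the shadow-fading term (so its standard deviation is 0.5). The function name, the seed, and the example distance are hypothetical.

```python
import random

def channel_attenuation(distance_m, shadowing):
    """lambda = 10^(s - 1.53) / d^3.76, with d in meters."""
    return 10.0 ** (shadowing - 1.53) / distance_m ** 3.76

rng = random.Random(0)        # fixed seed so the sketch is repeatable
s = rng.gauss(0.0, 0.5)       # assumption: N(0, 0.25) means variance 0.25
lam = channel_attenuation(100.0, s)
print(f"attenuation at 100 m: {lam:.3e}")

# Without shadowing, a UE at the 35 m minimum distance sees a much
# stronger channel than one a full cell width (250 m) away.
print(channel_attenuation(35.0, 0.0) / channel_attenuation(250.0, 0.0))
```

The path-loss exponent of 3.76 means that each doubling of distance reduces the average channel gain by a factor of about $2^{3.76} \approx 13.5$.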

The achievable sum rate is shown in Fig. 3 for a system with either ideal hardware or hardware imperfections given by $b = 8$ bit ADCs, 2 dB noise figure in the LNAs, and LOs with a phase noise variance of $1.6 \cdot 10^{-4}$. These values correspond to the initial values $(\kappa_0, \xi_0, \delta_0) = (2^{-8}, 10^{0.2}, 1.6 \cdot 10^{-4})$ when we scale the hardware imperfections with $N$ as described in Lemma 3.
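The mapping from circuit specifications to the initial imperfection values can be written out as a short sketch. It assumes $\kappa_0 = 2^{-b}$ (so that $\kappa_0^2 = 2^{-2b}$, as in Section 4.1) and $\xi_0 = 10^{\text{NF}/10}$ for a noise figure NF in dB; the function name is hypothetical.

```python
def initial_imperfections(adc_bits, noise_figure_db, phase_noise_variance):
    """Map circuit specifications to (kappa0, xi0, delta0)."""
    kappa0 = 2.0 ** (-adc_bits)              # kappa0^2 = 2^(-2b)
    xi0 = 10.0 ** (noise_figure_db / 10.0)   # noise figure = 10 log10(xi0)
    delta0 = phase_noise_variance            # used directly as the drift variance
    return kappa0, xi0, delta0

kappa0, xi0, delta0 = initial_imperfections(8, 2.0, 1.6e-4)
print(kappa0 == 2**-8, round(xi0, 3), delta0)  # True 1.585 0.00016
```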

We see that the throughput is reduced by hardware imperfections. The loss is larger when the imperfections are increased with $N$, but the difference essentially vanishes as $N \to \infty$ if the scaling law for SLOs in Lemma 3 is satisfied. The throughput loss is large when the scaling law is not followed. We observe that SLOs provide higher throughput than a CLO. This is because parts of the interference average out.

6. CONCLUSION

Massive MIMO systems are prone to hardware imperfections in ADCs, LNAs, and LOs. We have shown that these systems have an inherent resilience to such imperfections. The distortions can be increased with $N$, which allows the circuit power of ADCs and LNAs to increase as $\sqrt{N}$ instead of $N$. The analysis shows that having a CLO is better in terms of energy efficiency and cost, while SLOs provide higher throughput.

7. REFERENCES


