# Analysis of Body Bias Control for Real Time Systems

Carlos C. Cortes Torres, Hayate Okuhara, Akram Ben Ahmed, Nobuyuki Yamasaki, Hideharu Amano

Department of Information and Computer Science

Keio University

3-14-1 Hiyoshi, Kohoku-ku, Yokohama, Kanagawa, Japan 223-8522 {cortesc, hayate, akram, hunga} @am.ics.keio.ac.jp, yamasaki@ny.ics.keio.ac.jp

Abstract – In the past decade, Real-Time Systems (RTSs) have been widely studied. RTSs must meet their timing constraints to avoid catastrophic consequences, and they should also be energy efficient because they can be embedded in devices where battery life is paramount.

This paper is the first study to introduce dynamic body biasing (BB) to RTSs. We investigate the energy efficiency of RTSs by analyzing the ability of BB to provide a satisfactory tradeoff between performance and energy. The study was conducted using accurate parameters extracted from real chip measurements of a low-power microcontroller using Silicon On Thin Box (SOTB) technology; with this approach, we were able to achieve a 46% energy reduction.

# I. Introduction

Nowadays, Real-Time Systems (RTSs) have become part of our daily life as they are employed in different domains, such as home appliances, medical systems, robotics, security, aeronautics, and many others. One class of these systems is used for highly timing-critical tasks which must be executed before a predefined deadline. When a task fails to meet this deadline, its results can be corrupted or the entire system may even fail, leading to possibly catastrophic consequences.

At the same time, with the rising popularity of the Internet of Things (IoT), designing RTSs that can be embedded in small devices has become a necessity. In addition, this kind of embedded RTS requires a battery life of a few years and must operate on a very low power budget, on the order of milliwatts. As technology continues to scale, the leakage current keeps increasing, and strict control is needed to find an optimal operating region. Therefore, the energy consumption of RTSs should be kept to a minimum while making sure that the timing constraints are strictly met.

The energy efficiency of RTSs has been intensively studied. Some techniques originally used in general-purpose VLSI designs have been employed, including *Dynamic Power Management* (DPM) [1], *Dynamic Voltage Scaling* (DVS) [2], and *Dynamic Voltage Frequency Scaling* (DVFS) [3]. Other solutions were proposed specifically for RTSs. As explained in the next section, these techniques can improve the energy efficiency to a certain extent; however, they are either too complex to implement, still consume a relatively high amount of energy, or are not flexible enough to accommodate modifications after chip fabrication.



*Fig. 1 Cross-sectional view of an SOTB MOSFET: (a) pMOS and (b) nMOS.* 

One of the promising solutions to reduce the energy consumption is the use of Body Bias (BB), as it provides an efficient trade-off between leakage power and performance [4, 5]. This efficiency is further improved when BB is implemented with Silicon On Thin Box (SOTB) technology [6]. SOTB is an FD-SOI device technology; a cross-sectional view of its MOSFETs is shown in Fig. 1.

Unlike other conventional FD-SOI devices, an SOTB device is formed on an ultrathin (about 10 nm) buried oxide (BOX) layer, which enables a wide range of body bias control. Consequently, SOTB allows a more efficient reduction of leakage current through body bias control than conventional MOSFETs.

As is the case for any FD-SOI technology, the default state of a given MOSFET in SOTB is called zero-bias. If a reverse voltage is applied to the body (VBN < VS and VBP > VS), the depletion width increases and hence the threshold voltage increases; this is known as Reverse Body Bias (RBB). Conversely, if a forward voltage is applied to the body (VBN > VS and VBP < VS), the depletion width decreases and hence the threshold voltage decreases; this is known as Forward Body Bias (FBB).

FBB can achieve high operating speeds at the cost of leakage current, while RBB can reduce the leakage current at the price of gate delay. Consequently, the body bias for an RTS should be carefully selected, and accurate timing and energy models are needed in order to avoid unnecessary leakage overhead (excessive FBB) or timing violations (excessive RBB).

In this paper, we study the energy efficiency of RTSs by analyzing the effect of dynamic BB on both performance and energy. To this end, we propose a power model that describes the energy consumption of a task execution under a given deadline constraint. We conducted two experiments with different frequencies, supply voltages, and BB voltages to explore the efficacy of BB and demonstrate its flexibility in adjusting both energy and performance.



Fig. 2 First scenario: the system works at low frequency to allow the task execution (instruction) to be finished at the deadline. V850 microcontroller 1 instruction = 1 core clock.

# II. Related Work

Many approaches have been proposed to enhance the energy efficiency of embedded RTSs, including power management techniques, voltage scaling algorithms, and device-level solutions.

Dynamic Power Management (DPM) is one of the well-known techniques. It consists of shutting down the power supply so that the chip does not consume any power in idle states. The commonly known circuit-level implementations usually employ sleep transistors (where a high-threshold device is connected in series with low-threshold transistors). When tasks are executed at low frequency, these gating techniques must also employ an algorithm, such as a dynamically adjusted sleep window size, in order to maintain the RTS deadlines and ensure that all tasks are executed.

Nevertheless, this technique requires time, when turning the power on and off, for saving register values and restoring cache contents, which introduces an execution overhead. Furthermore, as in other existing circuit-level shut-down techniques, the performance degradation is unpredictable. Power gating techniques can be applied to a wide range of chips; however, they must be included in the design phase and incur a hardware layout penalty, and therefore cannot be applied to existing chips. In addition, the leakage power dissipation increases exponentially due to the subthreshold leakage current, although some techniques minimize this effect (e.g., Multiple-Threshold Voltage CMOS (MTCMOS), which creates virtual power supply and ground rails whose voltage levels are very close to the real ones). Another problem is that they are prone to reduced performance and noise [1, 2, 7]. Hence, this kind of solution is impractical for embedded RTSs.

*Real-time Dynamic Voltage Scaling* (RT-DVS) algorithms have been widely employed in modern computer systems. DVS algorithms have been shown to achieve dramatic power reductions while providing the necessary peak computation power in general-purpose systems. However, conventional DVS algorithms overlook timing constraints, so scaling the processor frequency could be detrimental to time-constrained applications (RTSs). In order to obtain the benefits of DVS in embedded RTSs while meeting the deadline criteria, schedulers must be implemented to ensure that tasks are executed in time; this is the role of RT-DVS [8]. Some of the proposed RT-DVS algorithms are not well balanced. One problem is that an algorithm may lower the frequency in one cycle such that a high voltage and frequency are required in the next cycle to meet the deadlines, which incurs a performance penalty (scheduling overhead). In addition, the frequency is lowered by comparing the worst-case specification with the unused utilization; this can lead to excessively conservative assumptions and to non-deterministic behavior.

Fig. 3 Second scenario: the system works at high frequency and the task execution (instruction) is finished before the deadline. V850 microcontroller 1 instruction = 1 core clock.

The schedulers commonly used for RT-DVS are *Earliest Deadline First* (EDF) and *Rate Monotonic* (RM). Both algorithms prioritize tasks according to their timing parameters (deadlines for EDF, periods for RM), either dynamically or statically, and delay task execution as late as possible in order to group idle periods. In this way, the processor can stay idle for longer periods with a smaller number of power transitions. Other proposed techniques are based on either timeout mechanisms or stochastic methods; however, they cannot be applied to RTSs due to their unpredictability [9].

To the best of our knowledge, no work has been reported on using dynamic body biasing for the energy reduction of Real-Time Systems.

# III. Proposed Approach

In the present study, we focus on two possible scenarios for executing a given task under a predefined deadline. In the first scenario, the system works at the minimum frequency at which the task execution finishes just before the deadline, as illustrated in Fig. 2. This means that the minimum supply (VDD) and body bias (VBN) voltages are supplied. The second scenario, described in Fig. 3, consists of increasing VDD and VBN to boost the frequency, so that the task is executed in a much shorter time than in the first scenario. Hereafter, we first present the timing characteristics of an RTS, followed by the power and energy model used to illustrate the energy characteristics of each scenario.

## A. System model:

We define *Texe* as the execution time of a given critical task, and this task is executed with N instructions. Assuming that each instruction is executed in *Cpi* cycles and the clock period is *T*, *Texe* can be represented as:

$$Texe = N \cdot Cpi \cdot T \tag{1}$$

We also define *D* as the deadline before which the given critical task should be executed. In general, the following condition should be satisfied to meet the timing constraints of a given RTS:

$$Texe + \varepsilon \le D \tag{2}$$

Here, $\varepsilon$ represents the additional overhead required to reach the necessary operational frequency, i.e., the time to establish the necessary VDD and VBN supply voltages.

For simplicity, we assume that Texe = D in the first scenario, as shown in Fig. 2. In addition, we set $\varepsilon = 0$ in both scenarios. Since the frequency is the inverse of the clock period, the minimum operating frequency can be expressed as:

$$f = \frac{N \cdot Cpi}{Texe} \tag{3}$$
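The timing relations in equations (1) to (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the example values (10,000 instructions, a 1 ms deadline, a 40 MHz clock) are assumptions chosen for the example, not part of the model itself.

```python
def execution_time(n_instr: int, cpi: float, clock_period_s: float) -> float:
    """Equation (1): Texe = N * Cpi * T."""
    return n_instr * cpi * clock_period_s


def min_frequency(n_instr: int, cpi: float, t_exe_s: float) -> float:
    """Equation (3): f = N * Cpi / Texe."""
    return n_instr * cpi / t_exe_s


def meets_deadline(t_exe_s: float, deadline_s: float, overhead_s: float = 0.0) -> bool:
    """Equation (2): Texe + epsilon <= D."""
    return t_exe_s + overhead_s <= deadline_s


if __name__ == "__main__":
    # Illustrative values only (assumed for this example).
    n_instr, cpi, deadline_s = 10_000, 1.0, 1e-3

    # First scenario: Texe = D, so the clock runs at the minimum frequency.
    f_min = min_frequency(n_instr, cpi, deadline_s)
    print(f"minimum frequency: {f_min / 1e6:.1f} MHz")

    # Second scenario: a faster clock finishes the task before the deadline.
    t_exe = execution_time(n_instr, cpi, 1.0 / 40e6)
    print(f"Texe at 40 MHz: {t_exe * 1e3:.2f} ms, "
          f"deadline met: {meets_deadline(t_exe, deadline_s)}")
```

With $\varepsilon = 0$, any frequency at or above the value returned by `min_frequency` satisfies equation (2).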

## B. Energy model:

Having given a simple representation of the timing characteristics of an RTS, we now present the power and energy model. In general VLSI systems, the power consumption can be represented as:

$$P = Ps + Pd \tag{4}$$

where *Ps* is the static power and *Pd* is the dynamic one which can be obtained from the following equations, respectively:

$$Ps = I \cdot 10^{A \cdot VDD + B \cdot VBN} \cdot VDD \tag{5}$$

$$Pd = \alpha_{at} \cdot C \cdot VDD^2 \cdot f \tag{6}$$

In equation (5), *VDD* and *VBN* represent the supply and body bias voltages, *I* is the leakage current, and *A* and *B* are the coefficients of the exponential terms for *VDD* and *VBN*, respectively [10]. In equation (6), $\alpha_{at}$ is the switching activity factor, *C* is the capacitance, and *f* is the minimum operational frequency (the minimum frequency required to meet the deadline).

As the energy is the product of the power (P) and the execution time (Texe), the static energy (Es) can be simply calculated by:

$$Es = I \cdot 10^{A \cdot VDD + B \cdot VBN} \cdot VDD \cdot Texe \tag{7}$$

Using equations (3) and (6), the dynamic energy (Ed) can be simplified as:

$$Ed = \alpha_{at} \cdot C \cdot VDD^2 \cdot N \cdot Cpi \tag{8}$$
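For reference, the substitution behind equation (8) can be written out step by step, assuming the dynamic energy is accumulated only over the execution time *Texe*:

$$Ed = Pd \cdot Texe = \alpha_{at} \cdot C \cdot VDD^2 \cdot f \cdot Texe = \alpha_{at} \cdot C \cdot VDD^2 \cdot \frac{N \cdot Cpi}{Texe} \cdot Texe = \alpha_{at} \cdot C \cdot VDD^2 \cdot N \cdot Cpi$$

The frequency cancels out, so under this model it influences the dynamic energy only through the supply voltage needed to reach it.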

Applying the above equations, the energy consumption of the first scenario can be formulated as:

$$E = I \cdot 10^{A \cdot VDD + B \cdot VBN} \cdot VDD \cdot Texe + \alpha_{at} \cdot C \cdot VDD^2 \cdot N \cdot Cpi \tag{9}$$
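A minimal Python sketch of equations (5) to (9) is given below for illustration. The function names are hypothetical, `act_cap` stands for the product $\alpha_{at} \cdot C$, and the coefficient values in the example are placeholders chosen only to exercise the formulas; they are not measured values.

```python
def static_power(i_leak, a, b, vdd, vbn):
    """Equation (5): Ps = I * 10^(A*VDD + B*VBN) * VDD."""
    return i_leak * 10.0 ** (a * vdd + b * vbn) * vdd


def dynamic_power(act_cap, vdd, freq_hz):
    """Equation (6): Pd = alpha_at * C * VDD^2 * f (act_cap = alpha_at * C)."""
    return act_cap * vdd ** 2 * freq_hz


def scenario1_energy(i_leak, a, b, act_cap, vdd, vbn, n_instr, cpi, t_exe_s):
    """Equation (9), returned as the pair (Es, Ed)."""
    e_static = static_power(i_leak, a, b, vdd, vbn) * t_exe_s  # equation (7)
    e_dynamic = act_cap * vdd ** 2 * n_instr * cpi             # equation (8)
    return e_static, e_dynamic


if __name__ == "__main__":
    # Placeholder coefficients (assumptions, not values from a real chip).
    params = dict(i_leak=1e-4, a=1.0, b=2.0, act_cap=1e-10)
    e_s, e_d = scenario1_energy(vdd=0.5, vbn=0.0, n_instr=10_000,
                                cpi=1.0, t_exe_s=1e-3, **params)
    print(f"Es = {e_s:.3e}, Ed = {e_d:.3e}, E = {e_s + e_d:.3e}")
```

Keeping the static and dynamic terms separate makes it easy to see which one dominates for a given operating point.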

On the other hand, in the second scenario, another portion of energy should be considered. Once the task execution is completed (i.e., for a time $t$ with $Texe \le t \le D$), the system consumes energy in the standby state. This energy is only static (*Ed* = 0). To reduce the standby energy, a strong Reverse Body Bias (RBB) is applied as soon as the task execution is finished. Consequently, and similarly to equation (7), the standby energy consumption (*Esb*) for the second scenario can be represented as:

$$Esb = I \cdot 10^{A \cdot VDD + B \cdot VBN_{sb}} \cdot VDD \cdot Tsb \tag{10}$$

$$Tsb = D - Texe - \varepsilon_{RBB} \tag{11}$$

Here, $VBN_{sb}$ is the applied strong RBB and *Tsb* is the standby time. In equation (11), $\varepsilon_{RBB}$ represents the overhead of applying the RBB; it is not considered in this first study.
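The standby terms of equations (10) and (11) can be sketched as follows. The function names and the coefficients in the example are hypothetical placeholders, used only to show how a reverse bias in standby scales the leakage energy relative to staying at zero bias.

```python
def standby_time(deadline_s, t_exe_s, rbb_overhead_s=0.0):
    """Equation (11): Tsb = D - Texe - eps_RBB."""
    t_sb = deadline_s - t_exe_s - rbb_overhead_s
    if t_sb < 0.0:
        raise ValueError("no standby time left: the task misses the deadline")
    return t_sb


def standby_energy(i_leak, a, b, vdd, vbn_sb, t_sb_s):
    """Equation (10): Esb = I * 10^(A*VDD + B*VBN_sb) * VDD * Tsb."""
    return i_leak * 10.0 ** (a * vdd + b * vbn_sb) * vdd * t_sb_s


if __name__ == "__main__":
    # Placeholder coefficients (assumptions, not values from a real chip).
    i_leak, a, b, vdd = 1e-4, 1.0, 2.0, 0.5
    t_sb = standby_time(deadline_s=1e-3, t_exe_s=0.25e-3)

    e_zero_bias = standby_energy(i_leak, a, b, vdd, 0.0, t_sb)
    e_rbb = standby_energy(i_leak, a, b, vdd, -1.0, t_sb)
    print(f"Tsb = {t_sb * 1e3:.2f} ms, Esb ratio (RBB / zero bias) = {e_rbb / e_zero_bias:.3f}")
```

In this model the ratio between the two standby energies is $10^{B \cdot VBN_{sb}}$, so the benefit of the standby RBB depends only on the coefficient *B* and the applied bias.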

## C. Finding optimal VDD and VBN:

Finally, we demonstrate how the optimal VDD and VBN voltages can be calculated. To decide the supply voltage at a given frequency  $f_{max}$ , we use the alpha power law [11] represented as:

$$f_{max} = \frac{F \cdot (VDD - Vth)^{\alpha}}{VDD} \tag{12}$$

where *F* is a coefficient related to frequency,  $\alpha$  is equal to 2 (in the case of SOTB technology), and *Vth* can be approximated as:

$$Vth = V_{t0} - K_{\gamma} \cdot VBN \tag{13}$$

Here, $V_{t0}$ is the threshold voltage at zero-bias and $K_{\gamma}$ is a process parameter. From equation (12) with $\alpha = 2$, the optimal supply voltage VDD can be expressed as:

$$VDD = \frac{\left(2Vth + \frac{f_{max}}{F}\right) + \sqrt{\left(2Vth + \frac{f_{max}}{F}\right)^2 - 4Vth^2}}{2} \tag{14}$$

As for the optimal VBN, it can be extracted from equations (12) and (13) and simplified as:

$$VBN = \frac{\left(\frac{VDD \cdot f_{max}}{F}\right)^{\frac{1}{\alpha}} - (VDD - V_{t0})}{K_{\gamma}} \tag{15}$$
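The voltage selection can be sketched in Python as below. The supply voltage is obtained by rearranging equation (12) with $\alpha = 2$ into the quadratic $VDD^2 - (2Vth + f_{max}/F) \cdot VDD + Vth^2 = 0$ and taking the larger root (the one with VDD > Vth); the body bias follows equation (15). The function names are hypothetical, the value $V_{t0} = 0.3$ V is an assumed placeholder (no value is given here), and treating frequencies as Hz and voltages as V is also an assumption.

```python
import math


def threshold_voltage(vt0, k_gamma, vbn):
    """Equation (13): Vth = Vt0 - K_gamma * VBN."""
    return vt0 - k_gamma * vbn


def max_frequency(f_coeff, vdd, vth, alpha=2.0):
    """Equation (12): fmax = F * (VDD - Vth)^alpha / VDD."""
    return f_coeff * (vdd - vth) ** alpha / vdd


def supply_voltage(f_max, f_coeff, vth):
    """Solve equation (12) for VDD with alpha = 2 (larger quadratic root)."""
    s = 2.0 * vth + f_max / f_coeff
    return (s + math.sqrt(s * s - 4.0 * vth * vth)) / 2.0


def body_bias(f_max, f_coeff, vdd, vt0, k_gamma, alpha=2.0):
    """Equation (15): VBN needed to reach fmax at a given VDD."""
    return ((vdd * f_max / f_coeff) ** (1.0 / alpha) - (vdd - vt0)) / k_gamma


if __name__ == "__main__":
    f_coeff, k_gamma = 3.7121e8, 0.11104  # core values of F and K_gamma from Table 1
    vt0 = 0.3                             # assumed placeholder, not from the paper
    vdd = supply_voltage(40e6, f_coeff, threshold_voltage(vt0, k_gamma, 0.0))
    print(f"VDD for 40 MHz at zero bias: {vdd:.3f} V")
    print(f"check: fmax = {max_frequency(f_coeff, vdd, vt0) / 1e6:.2f} MHz")
```

Substituting the computed VDD back into `max_frequency` is a quick consistency check on the closed form.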

In the next section, we apply the above model and analyze the BB effect on performance and energy efficiency.

# IV. Evaluation

## A. Evaluation methodology:

To evaluate the proposed methodology, some parameters are obtained from a real chip, the V850, which is a 32-bit RISC microcontroller for car electronics, digital signal processing, and digital servo-motor control [12]. It is composed of a standard five-stage pipeline, 46.2k gate logic cells, and 128kb instruction/data memories. The chip is implemented with LEAP 65-nm FD-SOI SOTB technology. The chip photograph of the V850 is shown in Fig. 4. The chip measurement is done with an evaluation board, shown in Fig. 5. The evaluation board can statically change VDD and the body bias voltages with DC-DC converters, and can also control the state of the V850 with an FPGA.

Fig. 4 Chip photograph of V850 microcontroller.

Fig. 5 Evaluation board of V850. This system was employed to perform the measurements; in these tests, a time constraint was pre-defined to emulate an RTS.

The V850 executes one instruction per cycle. Therefore, the Cpi parameter previously defined in equation (1) is set to one. It is also important to mention that the V850 contains a processing core and on-chip memory. These two components have different timing and power characteristics, and in fact have separate supply voltages. However, for simplicity, we supply the two components with the same supply voltage (VDD) in this evaluation. On the other hand, the core and memory have different BB voltages, called VBN and VBNM, respectively. Consequently, these two components should be modeled independently, and the total energy consumption of the target microcontroller for both scenarios ($E_{sc1}$ and $E_{sc2}$) can be represented by equations (16) and (17), respectively.

$$E_{sc1} = Es_{core} + Es_{mem} + Ed_{core} + Ed_{mem} \tag{16}$$

$$E_{sc2} = Es_{core} + Es_{mem} + Ed_{core} + Ed_{mem} + Esb_{core} + Esb_{mem} \tag{17}$$

Here, $Es_{core}$ and $Es_{mem}$ are obtained from equation (7), $Ed_{core}$ and $Ed_{mem}$ from equation (8), and $Esb_{core}$ and $Esb_{mem}$ from equation (10). Equation (17) accounts for the total chip energy (static, dynamic, and standby), treating one execution phase followed by one standby phase as a single transaction.
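As an illustration of how equations (16) and (17) combine the per-domain terms, the following Python sketch sums precomputed energies for the core and memory domains. The `DomainEnergy` type and the numeric values are hypothetical placeholders; in an actual comparison the static and dynamic terms of each scenario would be evaluated at that scenario's own operating point.

```python
from dataclasses import dataclass


@dataclass
class DomainEnergy:
    """Energy terms of one bias domain (core or memory)."""
    static: float   # Es, equation (7)
    dynamic: float  # Ed, equation (8)
    standby: float  # Esb, equation (10)


def total_energy_sc1(domains):
    """Equation (16): static plus dynamic energy summed over all domains."""
    return sum(d.static + d.dynamic for d in domains)


def total_energy_sc2(domains):
    """Equation (17): equation (16) plus the standby energy of each domain."""
    return sum(d.static + d.dynamic + d.standby for d in domains)


if __name__ == "__main__":
    # Placeholder energies (assumptions, not measured values).
    core = DomainEnergy(static=2.0e-7, dynamic=3.0e-7, standby=1.0e-8)
    mem = DomainEnergy(static=1.0e-7, dynamic=4.0e-7, standby=2.0e-8)
    print(f"E_sc1 = {total_energy_sc1([core, mem]):.3e}")
    print(f"E_sc2 = {total_energy_sc2([core, mem]):.3e}")
```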

Finally, the chip characteristics under each voltage condition and the parameters of the power model are obtained as described in [11]. The parameters of the power model are shown in Table 1.

Table 1 Coefficients for the proposed power model.

| Parameter       | Core                     | Memory                   |
|-----------------|--------------------------|--------------------------|
| A               | $2.5876 \times 10^{-4}$  | $3.0523 \times 10^{-3}$  |
| B               | 0.51921                  | 0.45172                  |
| I               | 1.7926                   | 2.1563                   |
| F               | $3.7121 \times 10^{8}$   | $5.5363 \times 10^{8}$   |
| $K_{\gamma}$    | 0.11104                  | 0.068157                 |
| $\alpha_{at} C$ | $6.2478 \times 10^{-11}$ | $1.3669 \times 10^{-10}$ |

## B. Ideal case of the second scenario

First, the efficiency of an ideal case of the second scenario is shown. As previously mentioned, this case has two states: an operational state with zero bias and a standby state with no switching logic and a strong reverse bias. To evaluate dynamic BB in the ideal case, the energy and time overheads of dynamic BB are assumed to be zero. The leakage current of the standby state is also ignored, because a strong reverse bias can drastically reduce leakage currents. As described before, the number of instructions is determined by the first scenario with Cpi = 1, since the V850 executes one instruction per core clock. For this initial study we use Cpi = 1 to simplify the calculations; however, other values (e.g., 0.5 or 0.2) are possible depending on the task. Thus, operational frequencies higher than that of the first scenario allow all the instructions of each task to be executed before the deadline. When all instructions have finished, the system can be put in the standby state until the next operational state.

The energy consumption of the second scenario, compared to the 10MHz and 20MHz operations of the first scenario, is shown in Figs. 6 and 7, respectively. Here, the deadline is assumed to be 1ms, and the power supply voltage for each operational frequency is obtained from equation (14) with VBN = 0. As shown in these graphs, higher operational frequencies consume more dynamic energy because higher VDDs are needed. On the other hand, the static energy is reduced at higher frequencies. This can be explained by the fact that the system spends less time in the operational state, which draws a large leakage current at zero bias. In both graphs, the second scenario can achieve lower energy consumption than the first scenario. In Fig. 6, the 40MHz operation of the second scenario reduces the energy consumption by 45.9% compared to the first scenario. Similarly, in Fig. 7, an 18.5% energy reduction can be achieved.

Fig. 8 shows the energy reduction ratios based on Figs. 6 and 7. Each curve of this graph corresponds to the case of 10,000 or 20,000 instructions per task. As the number of instructions increases, the system's energy becomes dominated by dynamic energy, which degrades the efficiency of the leakage reduction. Therefore, provided that a task does not include many instructions, dynamic BB is a promising technique for low-power chip operation.
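To make the comparison concrete, the short Python sketch below computes how the 1 ms deadline is split between the operational and standby states at several clock frequencies for a 10,000-instruction task with Cpi = 1, together with a helper for the energy reduction ratio. The function names are hypothetical, and defining the ratio as $(E_{sc1} - E_{sc2}) / E_{sc1}$ is an assumption about how the percentages are computed.

```python
def energy_reduction(e_sc1: float, e_sc2: float) -> float:
    """Fraction of the first-scenario energy saved by the second scenario."""
    return (e_sc1 - e_sc2) / e_sc1


def time_split(n_instr: int, freq_hz: float, deadline_s: float, cpi: float = 1.0):
    """Return (Texe, Tsb) for one task, assuming zero transition overhead."""
    t_exe = n_instr * cpi / freq_hz
    if t_exe > deadline_s * (1.0 + 1e-12):  # small tolerance for rounding
        raise ValueError("frequency too low to meet the deadline")
    return t_exe, max(0.0, deadline_s - t_exe)


if __name__ == "__main__":
    deadline_s = 1e-3
    for freq_mhz in (10, 20, 40):
        t_exe, t_sb = time_split(10_000, freq_mhz * 1e6, deadline_s)
        print(f"{freq_mhz:>2} MHz: Texe = {t_exe * 1e3:.2f} ms, Tsb = {t_sb * 1e3:.2f} ms")
```

The standby share of the deadline grows with the frequency, which is where the static energy saving of the second scenario comes from.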



Fig. 6 Comparison of energy consumption between 10MHz (first scenario) and higher frequency of second scenario.



Fig. 7 Comparison of energy consumption between 20MHz of first scenario and higher frequency of second scenario.

## C. Leakage current at standby state

Although the leakage current in the standby state was ignored in the previous section, a small leakage current is actually consumed, as previously stated. For a more accurate evaluation, we consider this leakage current in this subsection. Here, the standby state is assumed to use 1.0V of reverse bias (VBN = VBNM = -1.0V). The leakage energy consumed in the standby state is calculated using equation (10).

Fig. 9 depicts the energy reduction of the two cases represented in Fig. 8 when the standby leakage current is taken into account. As the graph clearly shows, the degradation of the energy reduction is small: the maximum degradation is only 1.6% in both cases of 10,000 and 20,000 instructions per task. Therefore, when discussing the efficiency of dynamic BB, we can ignore the leakage current in the standby state.

## D. Operation with optimized voltage condition

So far, we have assumed that the microcontroller operates at zero bias. However, as described in our previous work [11], an optimal set of power supply and body bias voltages can be obtained for each operational frequency. In this section, we assume that the chip works at the optimal voltage condition for each frequency. This optimal voltage condition is defined as the one that achieves the lowest power consumption in the proposed model for each operational frequency. Since our previous work succeeded in the optimization at operational frequencies of 22, 30, 40, and 47MHz, we assume that the operational frequency of the first scenario is 22MHz and that the higher frequencies (30, 40, and 47MHz) correspond to the second scenario.

Fig. 8 Comparison of energy reduction on different instruction density.

Fig. 9 Energy reduction ratio considering leakage current at standby state.

Fig. 10 shows the energy consumption of this case study. As shown in the graph, the first scenario achieves the lowest energy consumption. The optimal voltage condition already sets the V850 to a low-leakage condition, so the gain from a deep reverse bias is reduced. However, operation at the optimal voltage needs many power supplies, because each BB domain requires its own voltage regulator.

Here, we compare the energy consumption at the optimal voltage conditions without dynamic BB to that of zero-bias operation with dynamic BB. The dynamic BB case assumes negligible leakage current in the standby state. Fig. 11 shows the energy consumption of the zero-bias operation normalized by that at the optimal voltage condition. Since zero-bias operation causes a large leakage current, the energy consumption is increased when compared to the energy at the optimal voltage conditions without dynamic BB. However, the graph also shows that dynamic BB can suppress the energy overhead of zero bias. Therefore, dynamic BB can provide a reasonable compromise between the number of regulators and the energy reduction. Although the first scenario with zero-bias operation (without dynamic BB) causes a 71% energy overhead, the second scenario reduces this degradation to 45%.

# V. Summary and Conclusions

In this work, we presented one of the first studies to analyze BB control as a technique to improve the energy efficiency of RTSs. The study takes the timing constraints into consideration and analyzes how to reduce the energy consumption. We started by proposing an analytical model that combines a conventional timing model with an energy model of RTSs employing BB control. Accurate parameters were extracted from real chip measurements and used in the proposed model. From the evaluation results, we observed that manipulating the supply and BB voltages to boost the frequency and execute a given task in a shorter time can reduce the energy consumption by up to 46% at 40MHz. The energy reduction increases with the frequency up to 40MHz, which is the optimal point: at lower speeds more time is needed to complete the tasks, and at higher speeds more energy is consumed to reach such frequencies. We also showed that the standby leakage current can be ignored when the operational state uses zero bias. In addition, we noticed that the proposed approach may not provide the desired energy efficiency when compared to the optimal power setting that we previously proposed. Nevertheless, the proposed approach can be adopted if lower implementation complexity is preferred.

Fig. 10 Comparison of energy consumption between 22MHz of the first scenario and higher frequency of the second scenarios with optimal operational conditions.

# VI. Future work

In the proposed approach, the voltage transition time overhead is not considered for simplicity. Therefore, we plan to extend the present energy model to include the BB control latency and power overheads. This will give a more accurate energy evaluation while ensuring that hard real-time constraints are met to satisfy the requirements of RTSs.

Fig. 11 Energy consumption of zero-bias normalized by the energy achieved at optimal voltage condition.

# References

- [1] Y. J. Chen, C. L. Yang, J. W. Chi, and J. J. Chen, "TACLC: Timing-aware cache leakage control for hard real-time systems," *IEEE Trans. Comput.*, vol. 60, no. 6, pp. 767–782, 2011.
- [2] H. Kweon, Y. Do, J. Lee, and B. Ahn, "An efficient power-aware scheduling algorithm in real time system," *IEEE Pacific RIM Conf. Commun. Comput. Signal Process. - Proc.*, pp. 350–353, 2007.
- [3] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," *IEEE Trans. Comput.*, vol. 61, no. 10, pp. 1484–1494, 2012.
- [4] A. Bonnoit, "Reducing Power using Body Biasing in Microprocessors With Dynamic Voltage/Frequency Scaling," 2010.
- [5] L. Yan, J. Luo, and N. K. Jha, "Joint dynamic voltage scaling and adaptive body biasing for heterogeneous distributed real-time embedded systems," *IEEE Trans. Comput. Des. Integr. Circuits Syst.*, vol. 24, no. 7, pp. 1030–1041, 2005.
- [6] T. Ishigaki et al., "Ultralow-power LSI Technology with Silicon on Thin Buried Oxide (SOTB) CMOSFET," *Solid State Circuits Technologies*, Jacobus W. Swart (Ed.), InTech, pp. 146–156, 2010.
- [7] D. Duarte, Y.-F. Tsai, N. Vijaykrishnan, and M. J. Irwin, "Evaluating run-time techniques for leakage power reduction," *Proc. Asia and South Pacific and the 15th Int. Conf. on VLSI Design*, pp. 31–38, 2002.
- [8] P. Pillai and K. Shin, "Real-time dynamic voltage scaling for low-power embedded operating systems," *Proc. 18th ACM Symp. Oper. Syst. Princ.*, pp. 89–102, 2001.
- [9] M. S. Lee and C. H. Lee, "Enhanced cycle-conserving dynamic voltage scaling for low-power real-time operating systems," *IEICE Trans. Inf. Syst.*, vol. E97-D, no. 3, pp. 480–487, 2014.
- [10] Y. Lee, K. P. Reddy, and C. M. Krishna, "Scheduling Techniques for Reducing Leakage Power in Hard Real-Time Systems," *Real-Time Syst.*, pp. 1–8, 2003.
- [11] H. Okuhara, K. Kitamori, Y. Fujita, K. Usami, and H. Amano, "An optimal power supply and body bias voltage for a ultra low power micro-controller with silicon on thin box MOSFET," *IEEE/ACM International Symp. On Low Power Electronics and Design*, pp. 207–212, 2015.
- [12] K. Kitamori et al., "Power optimization of a micro-controller with Silicon On Thin Buried Oxide," in *The 18th Workshop on Synthesis And System Integration of Mixed Information technologies*, Oct. 2013, pp. 68–73.