# Majority Neuron Circuit Having Large Fan-in with Non-volatile Synaptic Weight

Hisanao Akima, Yasuhiro Katayama, Koji Nakajima, Masao Sakuraba, and Shigeo Sato

*Abstract*—We present a design of a majority neuron circuit with non-volatile synaptic weights. It is based on an analog majority circuit composed of controlled current inverters (CCIs). The proposed circuit is immune to device parameter fluctuations, and its fan-in is estimated about 1000. Synaptic weights are realized on the neuron circuit by adding variable resistors. We consider a design of a non-volatile synaptic weight by using a three-terminal magnetic domain-wall motion (DWM) device. The operation of a fully connected recurrent neural network composed of the proposed circuits has been confirmed by SPICE simulation.

# I. INTRODUCTION

HARDWARE implementation of artificial neural networks (ANNs) demands neuron and synapse circuits with small chip area and low power consumption in order to achieve practical large-scale ANNs. A majority neuron circuit based on majority logic, which returns 1 if more than half of its inputs are 1, realizes uniform summation and thresholding of binary inputs as a part of the neuron function. Although majority logic is simple, it is not easy to implement when the number of inputs increases. In a digital implementation, the number of logic gates is more than  $C_{(N+1)/2}^N$  for N inputs, which results in a large chip area. Other implementations with counters and comparators also consume a large chip area [1], [2]. An analog majority circuit, in which 2(N+1) MOS transistors are required for N inputs, has been proposed [3]. However, fluctuations of device characteristics caused by process parameter variations seriously limit the degree of integration. Recently, a spin majority gate based on a lateral spin valve, in which spin-polarized current is utilized as well as charge current, has been proposed. It seems to achieve a low-power and compact neuron-synapse unit [4]; however, its fan-in would be limited to several tens because spin-polarized current decays exponentially with the distance along the conducting channel.

In this study, we propose a compact neuron circuit with non-volatile synaptic weights. It is based on an analog majority circuit composed of controlled current inverters (CCIs) [5] and requires 4(N+2) MOS transistors for N inputs. The proposed circuit is immune to device parameter fluctuations and has a large fan-in of more than 1000. The synaptic weights are realized by using magnetic tunnel junctions (MTJs) [6] as programmable resistive units.

Y. Katayama is with Toshiba Corporation Semiconductor & Storage Products Company 580-1, Horikawa-Cho, Saiwai-ku, Kawasaki, Japan (e-mail: yasuhiro.katayama@toshiba.co.jp).

This paper is organized as follows. Section II describes the design of the majority neuron circuit based on the CCI without synaptic weights and estimates the margin of the proposed circuit. Section III shows the design of synaptic weights realized on the neuron circuit. SPICE simulation results for a fully connected recurrent ANN composed of the proposed circuits are shown in Section IV. The conclusion is given in Section V.

# II. MAJORITY NEURON CIRCUIT

The majority logic is expressed as follows:

$$n = \sum_{i=1}^{N} x_{i}, \qquad x_{i} \in \{0,1\},$$

$$y = \begin{cases} 1 & (n > N/2) \\ 0 & (n < N/2) \end{cases}, \qquad (1)$$

where  $x_i$ , n, y, and N are the *i*-th binary input, the number of high inputs, the output, and the total number of inputs, respectively. The output y is 1 if n is greater than N/2, and 0 otherwise. The input-output relation expressed in (1) is equivalent to that of a McCulloch-Pitts neuron [7], except that all synaptic weights are 1. We discuss the design of a majority neuron circuit with given synaptic weights in the next section.
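As a minimal software sketch of the majority logic in (1), the following function counts the high inputs and thresholds the count at N/2. The function name `majority` is hypothetical, and the tie case n = N/2, which (1) leaves open, is mapped to 0 here in line with the sentence above; for odd N a tie cannot occur.

```python
def majority(x):
    """Majority logic of Eq. (1) for a list of binary inputs x_i in {0, 1}.

    Returns 1 if more than half of the inputs are 1, and 0 otherwise.
    Mapping the tie n = N/2 to 0 is an assumption taken from the prose;
    Eq. (1) itself does not define that case.
    """
    n = sum(x)        # number of high inputs
    N = len(x)        # total number of inputs
    return 1 if 2 * n > N else 0


print(majority([1, 1, 0]))
print(majority([1, 0, 0]))
```

Comparing 2n with N avoids any floating-point division when N is odd.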

Let us first introduce a previous majority circuit, since understanding it is helpful for the following discussion. The circuit, which was proposed by C. L. Lee and C. W. Jen [3], is shown in Fig. 1. The circuit comprises complementary MOS (CMOS) inverters, which we refer to as "Output-Wired Inverters" (OWIs), and a CMOS inverter as an output buffer. Each binary input  $v_i$ , which takes a high or low voltage, corresponds to one binary input of the majority logic. The output nodes of the OWIs are connected together at the node M in Fig. 1. The MOS transistors in the OWIs work as resistors, and the voltage  $V_{\rm M}$  changes according to the numbers of high and low inputs. The output buffer amplifies the voltage  $V_{\rm M}$  as the decision of the majority logic.

Fig. 1. Majority circuit composed of output-wired inverters.

H. Akima, K. Nakajima, M. Sakuraba and S. Sato are with the Research Institute of Electrical Communication Tohoku University 2-1-1 Katahira, Aoba-ku, Sendai, JAPAN (e-mail: akima@riec.tohoku.ac.jp; hello @riec.tohoku.ac.jp; sakuraba.masao@myad.jp; shigeo@riec.tohoku.ac.jp)

Although the OWI circuit has a simple structure and seems easy to implement, it has a serious defect associated with the offset of the threshold voltage. When the number of high inputs *n* is exactly half of the total number of inputs *N*, i.e., n = N/2,  $V_M$  should be identical to the threshold voltage of the output buffer. However, it is very difficult to exclude an offset caused by the difference between the threshold voltages of the nMOS and pMOS transistors. Therefore, the conductance of the MOS transistors must be adjusted by choosing the device dimensions. This requirement becomes more critical for larger *N*, yet some fluctuations are inevitable depending on the LSI fabrication process.

Fig. 2. Majority circuit composed of controlled-current inverters.

Fig. 3. Simulation results of OWI and CCI circuits for N = 101.

The majority circuit we utilize as a neuron circuit is shown in Fig. 2. Instead of CMOS inverters, modified inverters, which we refer to as "Controlled-Current Inverters" (CCIs), are used. In comparison with an OWI, two additional MOS transistors, pMOS<sub>bs</sub> and nMOS<sub>bs</sub>, are added for adjusting the current. The gates of the pMOS<sub>bs</sub> and nMOS<sub>bs</sub> are biased with the same voltage  $V_{\rm ref}$ , which is generated by the reference voltage generator. This has two effects. First, the offset voltage is canceled out, which means that the difference between the voltage  $V_{\rm M}$  of the node M at n = N/2 and the threshold voltage of the output buffer becomes zero. This is because no offset occurs as long as the CCIs and the output buffer are composed of transistors with the same dimensions. This assumption is valid with the help of a symmetric design such as a common-centroid layout [8]. Therefore, the CCI circuit is highly robust and easy to design. Second, the gain of the CCI circuit is very large compared with that of the OWI circuit. Fig. 3 shows SPICE simulation results for N = 101. A large gain means that the node voltage  $V_{\rm M}$  changes steeply near n = N/2, resulting in a large margin. We define the margin as follows:

$$\text{margin} = \min\left(\text{margin}^{+}, \text{margin}^{-}\right),$$

$$\text{margin}^{-} = V_{\rm M} \left( n = \frac{N-1}{2} \right) - V_{\rm ref}, \qquad (2)$$

$$\text{margin}^{+} = V_{\rm ref} - V_{\rm M} \left( n = \frac{N+1}{2} \right),$$

where "min" returns the smallest of its arguments. As shown in Fig. 3, the margin of the CCI circuit is larger than that of the OWI circuit. A large margin is suitable for majority logic operation in which the numbers of high and low inputs are almost even. This large margin originates from the nonlinear characteristic of a MOS transistor. The MOS transistors connected to  $V_{\rm M}$  in the CCIs operate in the saturation region near n = N/2, whereas those in the OWIs operate in the linear region. A MOS transistor operating in the saturation region has a larger source-drain resistance than one operating in the linear region and therefore causes a larger voltage drop.

Next, we show the analytical results of the margin for the CCI circuit. If N is relatively small,  $V_{\rm M}$  is given as a function of n by

$$V_{\rm M}(n) = \begin{cases} \left(V_{\rm ref} - V_{\rm DD} - V_{\rm tp}\right) \left\{1 - \sqrt{\frac{2}{N - n} \left(\frac{N}{2} - n\right)}\right\} + V_{\rm DD} & \left(n < \frac{N}{2}\right) \\[2ex] \left(V_{\rm ref} - V_{\rm tn}\right) \left\{1 - \sqrt{\frac{2}{n} \left(n - \frac{N}{2}\right)}\right\} & \left(n > \frac{N}{2}\right) \end{cases} \qquad (3)$$

where  $V_{\rm tn}$  and  $V_{\rm tp}$  are the threshold voltages of the nMOS and pMOS transistors, respectively. Then, the margins expressed in (2) are calculated as follows:

$$\text{margin}^{-} = -V_{\rm tp} + \left(V_{\rm DD} - V_{\rm ref} + V_{\rm tp}\right)\sqrt{\frac{2}{N+1}},$$

$$\text{margin}^{+} = V_{\rm tn} + \left(V_{\rm ref} - V_{\rm tn}\right)\sqrt{\frac{2}{N+1}}. \qquad (4)$$
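As a numerical cross-check, the sketch below evaluates (3) at n = (N-1)/2 and n = (N+1)/2, applies the definition (2), and compares the result with the closed forms in (4). The function names are hypothetical, and the values chosen for  $V_{\rm DD}$  and  $V_{\rm ref}$  are illustrative assumptions that the text does not state; the threshold magnitude of 0.7 V is the one quoted later in this section.

```python
import math


def vm_small_n(n, N, v_ref, v_dd, v_tn, v_tp):
    """Node voltage V_M(n) from Eq. (3); not defined at n = N/2."""
    if n < N / 2:
        root = math.sqrt(2.0 / (N - n) * (N / 2 - n))
        return (v_ref - v_dd - v_tp) * (1.0 - root) + v_dd
    if n > N / 2:
        root = math.sqrt(2.0 / n * (n - N / 2))
        return (v_ref - v_tn) * (1.0 - root)
    raise ValueError("Eq. (3) is not defined at n = N/2")


def margins_small_n(N, v_ref, v_dd, v_tn, v_tp):
    """Closed-form margins (margin-, margin+) from Eq. (4) for odd N."""
    root = math.sqrt(2.0 / (N + 1))
    margin_minus = -v_tp + (v_dd - v_ref + v_tp) * root
    margin_plus = v_tn + (v_ref - v_tn) * root
    return margin_minus, margin_plus


# Assumed supply and reference voltages (illustrative only).
V_DD, V_REF = 3.3, 1.65
V_TN, V_TP = 0.7, -0.7
N = 11

closed_minus, closed_plus = margins_small_n(N, V_REF, V_DD, V_TN, V_TP)
# The same margins obtained directly from the definition in Eq. (2).
direct_minus = vm_small_n((N - 1) // 2, N, V_REF, V_DD, V_TN, V_TP) - V_REF
direct_plus = V_REF - vm_small_n((N + 1) // 2, N, V_REF, V_DD, V_TN, V_TP)

print("margin-:", closed_minus, direct_minus)
print("margin+:", closed_plus, direct_plus)
```

Under (3), the node voltage also reaches  $V_{\rm DD}$  at n = 0 and 0 at n = N, which gives a quick sanity check on the reconstructed expression.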

Fig. 4. Margin of the CCI circuit.

In the case of large N,  $V_{\rm M}$  and the margins are given approximately by the following equations:

$$V_{\rm M}(n) = \frac{N-2n}{n\Delta_n - (N-n)\Delta_p} + V_{\rm ref},$$

$$\text{margin}^{-} = \frac{-2}{\Delta_n(1-N) + \Delta_p(1+N)}, \qquad (5)$$

$$\text{margin}^{+} = \frac{2}{\Delta_n(1+N) + \Delta_p(1-N)},$$

where  $\Delta_n$  and  $\Delta_p$  are parameters depending on the nMOS and pMOS characteristics, respectively. Fig. 4 shows the margin of the CCI circuit relative to  $V_{\rm DD}$  as a function of N. The simulation result agrees with the ideal curves obtained from the above equations. We assume  $\Delta_n = -\Delta_p = 7.1 \times 10^{-3} \,\mathrm{V}^{-1}$ with the MOS parameters  $V_{\rm tn} = -V_{\rm tp} = 0.7 \,\mathrm{V}$ , body effect coefficient  $\gamma = 0.5 \,\mathrm{V}^{1/2}$ , and channel length modulation coefficient  $\lambda = 0.01 \,\mathrm{V}^{-1}$ . As shown in Fig. 4, up to about 1000 inputs can be handled with a margin of a few percent.
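The large-N expressions in (5) can be evaluated in the same way. The sketch below uses the quoted values  $\Delta_n = -\Delta_p = 7.1 \times 10^{-3}\,\mathrm{V}^{-1}$ , for which both margins reduce to  $1/(N\Delta_n)$ . The function names are hypothetical, and the supply voltage used to express the margin as a percentage is an assumption, since the text does not state  $V_{\rm DD}$ .

```python
def vm_large_n(n, N, v_ref, delta_n, delta_p):
    """Approximate V_M(n) from Eq. (5); meaningful only near n = N/2."""
    return (N - 2 * n) / (n * delta_n - (N - n) * delta_p) + v_ref


def margins_large_n(N, delta_n, delta_p):
    """Margins (margin-, margin+) from Eq. (5) for odd N."""
    margin_minus = -2.0 / (delta_n * (1 - N) + delta_p * (1 + N))
    margin_plus = 2.0 / (delta_n * (1 + N) + delta_p * (1 - N))
    return margin_minus, margin_plus


DELTA_N = 7.1e-3    # V^-1, value quoted in the text
DELTA_P = -7.1e-3   # V^-1, value quoted in the text
V_DD = 3.3          # assumed supply voltage, not given in the text

for N in (101, 1001):
    m_minus, m_plus = margins_large_n(N, DELTA_N, DELTA_P)
    margin = min(m_minus, m_plus)
    print(f"N = {N}: margin = {margin:.4f} V "
          f"({100.0 * margin / V_DD:.2f} % of the assumed V_DD)")
```

With symmetric parameters the margin falls off as 1/N, which is why the usable fan-in is bounded by the smallest margin the output buffer can resolve.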

# III. SYNAPTIC WEIGHTS REALIZED ON CCI MAJORITY NEURON CIRCUIT

As mentioned earlier, a majority circuit works as a neuron circuit whose synaptic weights are all 1. In this section, we discuss synaptic weights realized on the CCI majority circuit. In the CCI majority circuit, each high and low input turns on the nMOS<sub>sw</sub> and pMOS<sub>sw</sub>, respectively (see Fig. 2). If we consider those MOS transistors as switches with finite on-resistances  $R_{\rm n-on}$  and  $R_{\rm p-on}$ , and the nMOS<sub>bs</sub> and pMOS<sub>bs</sub> as resistors whose resistances  $R_{\rm n-bs}$  and  $R_{\rm p-bs}$  vary with their bias conditions, an equivalent circuit of the CCI circuit is obtained as shown in Fig. 5 (a). The equivalent circuit is simplified to a series connection of the parallel-combined resistances  $R_{\rm L} = R_{\rm p-on}/(N-n)$ ,  $R_{\rm L-bs} = (\sum_{i} 1/R_{{\rm p-bs}\,i})^{-1}$ ,  $R_{\rm H-bs} = (\sum_{i} 1/R_{{\rm n-bs}\,i})^{-1}$ , and  $R_{\rm H} = R_{\rm n-on}/n$ , as shown in Fig. 5 (b). The ratio of  $(R_{\rm L} + R_{\rm L-bs})$  to  $(R_{\rm H-bs} + R_{\rm H})$  determines  $V_{\rm M}$ ;  $V_{\rm M}$  is pulled to  $V_{\rm DD}$  or GND if the upper or lower path relative to node M, respectively, has the lower resistance. The CCI majority neuron circuit "fires" if  $V_{\rm M}$  decreases below some threshold voltage. In the CCI majority circuit, each high and low input changes  $R_{\rm H}$  and  $R_{\rm L}$  equally, by  $R_{\rm n-on}$  and  $R_{\rm p-on}$ , respectively. To achieve a synaptic weight, a weighted input should change  $R_{\rm H}$  or  $R_{\rm L}$  according to the polarity and absolute value of its synaptic weight. For an excitatory weight, high and low inputs decrease  $R_{\rm H}$  and  $R_{\rm L}$ , respectively, in proportion to the weight value. An inhibitory weight acts inversely.

Fig. 5. (a) Equivalent circuit of the CCI majority circuit for N = 5, n = 2 (reference voltage generator omitted). (b) Simplified circuit of (a).
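The simplified circuit of Fig. 5 (b) is a resistive divider between  $V_{\rm DD}$  and GND, and the direction in which  $V_{\rm M}$  moves can be sketched as follows. This is a deliberately crude model: all branch resistances are treated as constants with hypothetical values, whereas in the actual circuit  $R_{\rm n-bs}$  and  $R_{\rm p-bs}$  depend on the bias condition, which is what produces the steep transition near n = N/2.

```python
def node_voltage(n, N, v_dd, r_p_on, r_p_bs, r_n_on, r_n_bs):
    """V_M of the series circuit in Fig. 5(b) with constant resistances.

    Upper path (to V_DD): N - n identical pMOS branches in parallel,
    i.e. R_L + R_L-bs = (r_p_on + r_p_bs) / (N - n).
    Lower path (to GND): n identical nMOS branches in parallel,
    i.e. R_H-bs + R_H = (r_n_bs + r_n_on) / n.
    Conductances are used so that n = 0 and n = N need no special case.
    """
    g_up = (N - n) / (r_p_on + r_p_bs)
    g_down = n / (r_n_on + r_n_bs)
    return v_dd * g_up / (g_up + g_down)


# Hypothetical values: 1 V supply and equal 1 kOhm resistances everywhere.
for n in range(6):
    print(n, node_voltage(n, 5, 1.0, 1e3, 1e3, 1e3, 1e3))
```

With equal constant resistances the model gives  $V_{\rm M} = V_{\rm DD}(N-n)/N$ , a straight line in n, so the large gain reported for the CCI circuit cannot be reproduced without the bias-dependent resistances.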

The synapse function described above can be achieved by using the neuron-synapse sub-circuit shown in Fig. 6. Two weight resistors  $Rw_p$  and  $Rw_n$ , two cutoff switches, and a weight polarity selector are required for each synapse, and one output buffer is required for the inverted input from the other neuron. We set  $V_{pol}$  = high or low for an excitatory or inhibitory weight, respectively. The two weight resistors have the same resistance, which is inversely proportional to the weight value. The inset table in Fig. 6 shows the weight resistor enabled by each combination of  $V_{pol}$  and  $V_{in}$ . For example, the combination of  $V_{in}$  = low and  $V_{pol}$  = high, which means an inhibitory input weighted by an excitatory synapse, turns on the pMOS<sub>sw</sub> below  $Rw_p$ , and current flows through  $Rw_p$ , which increases  $V_M$ . The cutoff switches are required to achieve a weight value of zero unless the resistance of the weight resistors can be made large enough to be regarded as open.

Fig. 6. Neuron-synapse sub-circuit.

Fig. 7. Structure of three-terminal domain-wall motion device.
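The selection logic of the neuron-synapse sub-circuit can be summarized in a few lines of code. Only the row  $V_{in}$  = low,  $V_{pol}$  = high is stated explicitly in the text; the other three rows below are inferred from the rule that an excitatory weight lowers  $R_{\rm H}$  for a high input and  $R_{\rm L}$  for a low input, while an inhibitory weight acts inversely. The function name and the association of  $Rw_n$  with the GND-side path are assumptions, so this is a sketch rather than a transcription of the inset table in Fig. 6.

```python
def enabled_weight_resistor(v_in_high, v_pol_high, weight_is_zero=False):
    """Return which weight resistor conducts, or None if the synapse is cut off.

    v_in_high:  True if the input V_in is high.
    v_pol_high: True for an excitatory weight (V_pol = high),
                False for an inhibitory weight (V_pol = low).
    weight_is_zero: True if the cutoff switches disconnect the synapse.
    """
    if weight_is_zero:
        return None
    # Inferred rule: matching input and polarity pull V_M down through Rw_n,
    # mismatching input and polarity pull V_M up through Rw_p.
    return "Rw_n" if v_in_high == v_pol_high else "Rw_p"


for v_pol_high in (True, False):
    for v_in_high in (True, False):
        print("V_pol high:", v_pol_high, "V_in high:", v_in_high,
              "->", enabled_weight_resistor(v_in_high, v_pol_high))
```

Written this way, the polarity selector behaves like an XNOR of the input and the polarity bit, which is consistent with an inhibitory weight simply inverting the effect of the input.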

To achieve a variable synaptic weight, a variable resistive device is required. Non-volatile memory devices such as MRAM [9], PRAM [10], and ReRAM [11] are suitable for low-power operation. Here we consider a three-terminal magnetic domain-wall motion (DWM) device [6] with, for example, the structure shown in Fig. 7. The free layer consists of a DWM region of length  $L_{\rm DW}$  and two fixed regions located at both ends. The tunnel barrier and the reference layer are stacked on the free layer, forming a magnetic tunnel junction (MTJ). Unlike the original device proposed in [6], the MTJ region extends over the whole length  $L_{\rm DW}$ . The spin directions of the two fixed regions are fixed antiparallel to each other by magnetic coupling with the spins in the adjacent pinning layers. The domain wall (DW) in the DWM region can be moved by injecting spin-polarized current from write bit line 1 to write bit line 2, and vice versa. A spin-polarized current from left to right moves the DW to the right side, so that the up-spin part of the DWM region grows; the number of spins aligned parallel to the spin in the reference layer then becomes larger than the number aligned antiparallel. As a result, the resistance of the MTJ,  $R_{\rm MTJ}$ , decreases owing to the tunnel magnetoresistance (TMR) effect. A spin-polarized current in the opposite direction conversely increases  $R_{\rm MTJ}$ .  $R_{\rm MTJ}$  takes the minimum  $R_{\text{MTJ-min}}$  and the maximum  $R_{\text{MTJ-max}}$  when the DW is located at the left and right edges of the DWM region, respectively. The ratio of  $R_{\rm MTJ-max}$  to  $R_{\rm MTJ-min}$  defines the TMR ratio:

$$\text{TMR ratio} = \frac{R_{\text{MTJ-max}} - R_{\text{MTJ-min}}}{R_{\text{MTJ-min}}} \times 100 \ (\%). \qquad (6)$$

For a binary memory, a TMR ratio of more than 600 % has been achieved [12]. Multilevel  $R_{\rm MTJ}$  can be achieved by incorporating submicron notches in the DWM region to stabilize the location of the DW [13].

# IV. NETWORK SIMULATION

Fig. 8. Weight resistance and conductance determined from synaptic weight.

Using SPICE, we simulated the proposed circuit in associative memory operation by constructing a fully connected recurrent ANN. This ANN can memorize bit patterns and recall them from incomplete key patterns. In our simulation, the bit values -1 and 1 correspond to the low and high outputs of the CCI neuron, respectively. The ANN consists of N = 50 neurons, and the  $50 \times 50$  matrix of synaptic weights  $w_{ij}$ , where  $w_{ij}$  weighs the input from the *j*-th neuron to the *i*-th neuron, is calculated by auto-associative learning [14]. The resistance of a weight resistor  $r_{wij}$  is determined so that the conductance  $g_{wij} = r_{wij}^{-1}$  increases linearly with the synaptic weight  $w_{ij}$ , except for  $w_{ij} = 0$ , and is expressed by the following equation:

$$r_{wij} = \frac{R_{\rm MTJ-min} R_{\rm MTJ-max} w_{\rm max}}{\left(R_{\rm MTJ-max} - R_{\rm MTJ-min}\right) w_{ij} + R_{\rm MTJ-min} w_{\rm max}} , \qquad (7)$$

where  $w_{\text{max}}$  is the maximum of  $w_{ij}$ .  $R_{\text{MTJ-max}}$  and  $R_{\text{MTJ-min}}$  were assumed to be 70 k $\Omega$  and 10 k $\Omega$ , respectively. Fig. 8 shows  $r_{wij}$  and  $g_{wij}$  determined by (7). The number of memorized patterns is m = 5, which corresponds to a "load parameter" l = m / N = 0.1, and  $w_{ij}$  takes seven different values:  $0, \pm 1, \pm 3$ , and  $\pm 5$ . We evaluated the network operation by the directional cosine *a*, calculated from a memorized pattern vector  $\boldsymbol{\xi}$  and a recalled pattern vector  $\boldsymbol{\eta}$  as follows:

$$a = \frac{1}{N} \sum_{i=1}^{N} \xi_i \eta_i, \qquad \xi_i, \eta_i \in \{-1, 1\}. \qquad (8)$$

Here, *a* = 1.0 if a recalled pattern agrees with the corresponding memorized pattern. Fig. 9 shows *a* as a function of the number of flipped bits  $N_f$  in the key patterns. Each value of *a* is the average over 5 key patterns in which different sets of  $N_f$  bits are flipped with respect to one memorized pattern selected at random.
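The weight-to-resistance mapping in (7) and the directional cosine in (8) are easy to check numerically. The sketch below uses the resistances assumed in the text (10 k$\Omega$ and 70 k$\Omega$), which by (6) correspond to a TMR ratio of 600 %. The function names are hypothetical, and applying (7) to the magnitude of the weight, with the sign handled by the polarity selector, is an assumption.

```python
def tmr_ratio_percent(r_min, r_max):
    """TMR ratio in percent, Eq. (6)."""
    return (r_max - r_min) / r_min * 100.0


def weight_resistance(w, w_max, r_min, r_max):
    """Weight resistance r_w from Eq. (7) for a weight magnitude 0 < w <= w_max."""
    return r_min * r_max * w_max / ((r_max - r_min) * w + r_min * w_max)


def directional_cosine(xi, eta):
    """Directional cosine a from Eq. (8) for two patterns in {-1, +1}^N."""
    return sum(x * e for x, e in zip(xi, eta)) / len(xi)


R_MIN, R_MAX = 10e3, 70e3   # ohms, values assumed in the text
W_MAX = 5                   # largest weight magnitude reported in the text

print("TMR ratio (%):", tmr_ratio_percent(R_MIN, R_MAX))
for w in (1, 3, 5):
    r = weight_resistance(w, W_MAX, R_MIN, R_MAX)
    print(f"|w| = {w}: r_w = {r:.1f} ohm, g_w = {1.0 / r:.3e} S")

# A key pattern with 10 of 50 bits flipped relative to the memorized pattern.
memorized = [1] * 50
key = [-1] * 10 + [1] * 40
print("a =", directional_cosine(memorized, key))
```

Rearranging (7) gives  $g_{wij} = 1/R_{\text{MTJ-max}} + (1/R_{\text{MTJ-min}} - 1/R_{\text{MTJ-max}})\, w_{ij}/w_{\max}$ , so the conductance is linear in the weight and reaches  $1/R_{\text{MTJ-min}}$  at  $w_{ij} = w_{\max}$ .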

The transient responses ( $t < 50 \ \mu$ s) of  $V_{\rm M}$  and  $V_{\rm out}$  in each CCI neuron for  $N_f$ = 10 are shown in Fig. 10. The solid and dashed lines show  $V_{\rm M}$  and  $V_{\rm out}$ , respectively, and the inset numbers show the memorized pattern bits. The graphs enclosed by thick lines show the transient responses of neurons whose bit is flipped with respect to the memorized pattern. As shown in Fig. 10, the output voltages that start with wrong values recover to the correct values.

Fig. 9. Directional cosine with varying the number of flipped bit in key pattern.

Fig. 10. Transient responses of V<sub>M</sub> and V<sub>out</sub> in each CCI neuron.

The values of the directional cosine *a* are plotted in Fig. 11 as a function of the load parameter *l*. Each value of *a* is the average over 5 ensembles of key patterns, in each of which a different single bit is flipped with respect to the *m* memorized patterns. Although the key patterns start out almost identical to the memorized patterns, *a* stays near 1.0 even beyond l = 0.14, the theoretical estimate of the load parameter up to which the ANN can recall the memorized patterns correctly [15].
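For orientation, the recall experiment can be mimicked at the behavioral level, without any circuit detail. The sketch below is not the SPICE simulation reported here: it assumes that the auto-associative learning is the outer-product (Hebbian) rule with zero diagonal, which is consistent with the seven weight values reported for m = 5, and it replaces the analog neuron by an ideal sign threshold with synchronous updates. All function names are hypothetical.

```python
import random


def hebbian_weights(patterns):
    """Outer-product weights w_ij = sum over patterns of xi_i * xi_j, w_ii = 0."""
    N = len(patterns[0])
    w = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i != j:
                w[i][j] = sum(p[i] * p[j] for p in patterns)
    return w


def recall(w, key, max_steps=20):
    """Synchronous sign-threshold updates until the state stops changing."""
    state = list(key)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(w):
            h = sum(wij * sj for wij, sj in zip(row, state))
            # Keep the previous bit when the weighted sum is exactly zero.
            new_state.append(1 if h > 0 else -1 if h < 0 else state[i])
        if new_state == state:
            break
        state = new_state
    return state


def flip_bits(pattern, n_flip, rng):
    """Return a copy of pattern with n_flip randomly chosen bits inverted."""
    key = list(pattern)
    for i in rng.sample(range(len(pattern)), n_flip):
        key[i] = -key[i]
    return key


def directional_cosine(xi, eta):
    """Directional cosine a from Eq. (8)."""
    return sum(x * e for x, e in zip(xi, eta)) / len(xi)


rng = random.Random(0)
N, m = 50, 5
patterns = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(m)]
w = hebbian_weights(patterns)

key = flip_bits(patterns[0], 10, rng)
recalled = recall(w, key)
print("a before recall:", directional_cosine(patterns[0], key))
print("a after recall: ", directional_cosine(patterns[0], recalled))
```

Because the patterns are random, the recalled overlap depends on the seed; averaging over several key patterns, as done for Fig. 9 and Fig. 11, smooths this out.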

# V. CONCLUSION

We have proposed a neuron circuit based on the CCI majority circuit. It achieves a large fan-in with a small chip area by utilizing the nonlinear characteristics of MOS transistors. The tolerable number of inputs is estimated to be about 1000 by theoretical analysis and SPICE simulation. Synaptic weights are realized on the neuron circuit by adding, to each CCI in the neuron circuit, two weight resistors and eight MOS transistors that compose two cutoff switches and a weight polarity selector, together with four MOS transistors for an additional output buffer. We assume the use of a three-terminal DWM device as a non-volatile variable resistor. The network operation of a recurrent ANN composed of the proposed circuits has been confirmed by SPICE simulation.

The proposed circuit is suitable for large fan-in applications; however, its power consumption seems to be relatively high because current always flows from  $V_{\rm DD}$  to GND through the MOS transistors in the CCIs. A power gating technique using the cutoff switches will be necessary for low-power operation. The estimation of power consumption is left for future work.

Fig. 11. Directional cosine with varying load parameter.

# REFERENCES

- [1] E. E. Swartzlander Jr., "Parallel counters," *IEEE Trans. Comput.*, vol. C-22, pp. 1021-1024, Nov. 1973.
- [2] Y. Leblebici, H. Özdemir, A. Kepkep, and U. Çilingiroglu, "A compact high-speed (31, 5) parallel counter circuit based on capacitive threshold-logic gates," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 8, pp. 1177-1183, Aug. 1996.
- [3] C. L. Lee and C. W. Jen, "A novel design of binary majority gate and its application to median filtering," in *Proc. 1990 IEEE Int. Symp. on Circuits and Systems (ISCAS '90)*, vol. 1, 1990, pp. 570-573.
- [4] M. Sharad, C. Augustine, G. Panagopoulos, and K. Roy, "Spin-based neuron model with domain-wall magnets as synapse," *IEEE Trans. Nanotechnol.*, vol. 11, no. 4, July 2012.
- [5] Y. Katayama, K. Suzuki, S. Sato, and K. Nakajima, "Implementation of a large fan-in majority circuit," in *Proc. 2000 Int. Symp. on Nonlinear Theory and its Applications (NOLTA 2000)*, 2000, pp. 413-416.
- [6] S. Fukami, N. Ishiwata, N. Kasai, M. Yamanouchi, H. Sato, S. Ikeda, and H. Ohno, "Scalability prospect of three-terminal magnetic domain-wall motion device," *IEEE Trans. Magn.*, vol. 48, no. 7, pp. 2152-2157, July 2012.

- [7] W. S. McCulloch, and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," Bulletin of Mathematical Biophysics, vol. 5, 1943.
- [8] R. J. Baker, "CMOS Circuit Design, Layout, and Simulation 3rd Edition," Wiley, 2010, pp. 111-112.
- [9] N. Sakimura, T. Sugibayashi, R. Nebashi, H. Honjo, S. Saito, Y. Kato, and N. Kasai, "A 250-MHz 1-Mbit embedded MRAM macro using 2T1MTJ cell with bitline separation and half-pitch shift architecture," in IEEE ISSCC Tech. Dig., pp. 216-219, Nov. 2007.
- [10] R. Bez, F. Pellizzer, "Progress and perspective of phase-change memory," *Proc. EPCOS*, Sep. 2007.
- [11] B. J. Choi, S. Choi, K. M. Kim, Y. C. Shin, C. S. Hwang, S. Y. Hwang, S. S. Cho, S. Park, and S. K. Hong, "Study on the resistive switching time of TiO<sub>2</sub> thin films," *Appl. Phys. Lett.*, vol. 89, 012906, July 2006.
- [12] S. Ikeda, J. Hayakawa, Y. Ashizawa, Y. M. Lee, K. Miura, H. Hasegawa, M. Tsunoda, F. Matsukura, and H. Ohno, "Tunnel magnetoresistance of 604% at 300 K by suppression of Ta diffusion in CoFeB/MgO/CoFeB pseudo-spin-valves annealed at high temperature," *Appl. Phys. Lett.*, vol. 93, 082508, Aug. 2008.
- [13] D. Lacour, J. A. Katine, L. Folks, T. Block, J. R. Childress, M. J. Carey, and B. A. Gurney, "Experimental evidence of multiple stable locations for a domain wall trapped by a submicron notch," *Appl. Phys. Lett.*, vol. 84, pp. 1910-1912, Mar. 2004.
- [14] T. Kohonen, "Self-organization and associative memory 2nd edition," Springer-Verlag, pp. 183, Dec. 1988.
- [15] S. Amari and K. Maginu, "Statistical Neurodynamics of Associative Memory," Neural Networks vol.1, pp. 63–73, 1988.