University of Notre Dame
Department of Computer Science and Engineering
CSE 40547/60547: Computing at the Nanoscale
Out: 09/27/07
NOTE:
This assignment is designed to walk you through some of
the most important issues associated with transistor scaling. While some questions may seem to have a
lot of "text" associated with them, more often than not, this text is designed
to (a) explain the fundamental principles behind the questions that you will
answer, (b) point you to a good resource to answer the question if you need
more information, and (c) link the different lecture topics together.
There are nine questions in this assignment –
although about half of the questions have sub-questions associated with
them. No one question or
sub-question should require more than 5 or 6 sentences to answer – and
some questions can probably just be answered with one or two sentences. I should also add that more often than
not, I'm looking for qualitative rather than quantitative answers. That said, while a good answer may only
require a short paragraph, some questions may require 20-30 minutes of research
and thinking first. Thus, this assignment is longer than the last one and will probably take a little more time to do. In other words, I would NOT wait until
the last minute to get started :).
I will also be adding a bonus question to this
assignment. You do not have to
answer the bonus question to receive full credit, but if you do, I'll replace
your lowest question score with the score of the bonus question – but
only if the points you receive for the bonus question would help your overall
score. (In other words, answering
the question will NOT hurt your grade.)
Like before, feel free to discuss answers to questions
with your classmates. However, you
need to do your write-ups individually.
This assignment is due by the end of the October 10th
class period (i.e. in about 2 weeks).
If the assignment is turned in later than this, I'll subtract 15% from
your overall score for every day it is late – i.e. if you turn in the
assignment on October 10th at 5 p.m., you can at most receive an
85%. If you turn in the assignment
on October 11th at 2:30 p.m., you can at most receive a 70% (and so on).
Finally, I'm always happy to answer questions...
Question 1:
Background:
In Lecture 4, we derived equations for current flow
– which is responsible for the movement and computation of binary
information – in a MOSFET.
More specifically, we saw how the current through a transistor (IDS)
was dependent on the parameters that define its geometry and material
composition (W, L, tox, μ). Note that inherent to this discussion
was an explanation of how the voltage between drain and source affects current.
In Lecture 5, we derived an expression for the
propagation delay of a CMOS-based logic gate (an inverter).
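As a rough illustration of how these parameters enter the two expressions, the sketch below assumes the standard long-channel (square-law) saturation model and a first-order 0.69·R·C delay. The exact forms derived in Lectures 4 and 5 may differ, and the function names are hypothetical.

```python
# Illustrative long-channel (square-law) model. The equation forms are
# assumptions (standard textbook expressions), not necessarily the exact
# ones derived in lecture.
EPS_OX = 3.9 * 8.854e-12  # permittivity of SiO2 in F/m


def ids_saturation(w, l, tox, mu, vgs, vt):
    """Saturation current: (mu * Cox / 2) * (W / L) * (Vgs - Vt)^2, SI units."""
    if vgs <= vt:
        return 0.0  # the square-law model ignores sub-threshold conduction
    cox = EPS_OX / tox  # gate capacitance per unit area (F/m^2)
    return 0.5 * mu * cox * (w / l) * (vgs - vt) ** 2


def rc_delay(r_eq, c_load):
    """First-order propagation delay of an RC stage: 0.69 * R * C (seconds)."""
    return 0.69 * r_eq * c_load
```

Calling `ids_saturation` with different values of W, L, tox, and μ shows how each parameter changes the drive current in this simplified model.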
Below, the figure on the left illustrates clock rate as a
function of feature size (note that time advances from left-to-right along the
x-axis) while the figure on the right illustrates how the supply voltage,
threshold voltage, and oxide thickness have scaled as a function of device
feature size.


Figure 1
Figure 2
Question:
In the context of the material discussed in
Lectures 4 and 5, explain why the clock rates shown here were able to increase
steadily – especially for long-channel devices. (Hint – explain in terms of the IDS
equations derived in Lecture 4 and/or the RC delay expression derived in
Lecture 5.)
Question 2:
Background:
In Lecture 5, we talked about using both NMOS and PMOS transistors to perform
logic. The shift to CMOS-based
logic (from exclusively NMOS-based logic) occurred in the 1980s as an
increasing amount of static power dissipation put a firm upper bound on the
number of gates that could be added to a single die (see slide 8 of Lecture 5
for a picture of an NMOS inverter).
The switch was advantageous as, with a CMOS inverter, there should be no
direct path between the supply and ground rails under steady state operating
conditions (i.e. when the inputs and outputs remain constant). This significantly reduces static power
dissipation.
Question (Part A):
The switch to CMOS-based logic still has
implications on transistor layout.
For example, as was seen in Lecture 5, one of the consequences of
creating CMOS inverters was that the NMOS transistor needed to be sized
differently than the PMOS transistor.
Explain why.
Question (Part B):
The answer to Part A of this question is also
adversely affected as channel lengths are scaled down. Explain one solution to this problem
for sub-100 nm channel lengths.
Question 3:
Background:
In the context of the inverter discussed in Lecture 5,
we also started to think about the amount of energy that each switching event
would dissipate. We began with
dynamic power and saw that it was proportional to the gate's capacitance (CL),
the supply voltage (Vdd), the clock frequency (f), and the activity
factor (α) – or in other words, $P \sim C_L V_{dd}^2 f \alpha_{0 \to 1}$.
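The proportionality above can be evaluated directly. The following is a minimal sketch that treats the relation as an equality in SI units; the function name and the example values in the usage note are illustrative assumptions, not figures from the lecture.

```python
def dynamic_power(c_load, vdd, freq, activity):
    """Dynamic power estimate C_L * Vdd^2 * f * alpha (watts for SI inputs).

    c_load   -- switched capacitance C_L in farads
    vdd      -- supply voltage in volts
    freq     -- clock frequency in hertz
    activity -- activity factor alpha (probability of a 0 -> 1 transition)
    """
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity factor must lie in [0, 1]")
    return c_load * vdd ** 2 * freq * activity
```

For example, `dynamic_power(1e-15, 1.0, 1e9, 0.1)` evaluates a hypothetical 1 fF load switched at 1 GHz from a 1 V supply with a 10% activity factor.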
Question (Part A):
What components of this equation are directly
influenced by the transistor geometry?
How? (i.e. do they make dynamic power worse or better as feature sizes
scale down?)
Question (Part B):
Assuming a full-scaling model, how does dynamic power scale as gate length
decreases? Assuming a
general-scaling model, how does dynamic power scale as gate length
decreases?
Question 4:
Background (Part A):
The energy lost from switching events is not the only
contributor to the power budget.
Leakage current – the source of static power loss – also
plays a role. As you saw in
Homework #1, static power dissipation is especially bad as no computation (i.e.
useful work) is associated with it.
As we discussed in class, there are two components of leakage current
and inherent contributors to static power – sub-threshold leakage and
gate leakage.
Sub-threshold leakage is a "weak" inversion current
across a transistor – or in other words, at the onset of inversion (see
Lecture 3 and 4), carriers are available to conduct current even if Vgs
< Vt, and some current does flow. A 2003 paper
by Narayanan, et al. describes the relationship between sub-threshold leakage
current (Isub) and the threshold and supply voltages:
![]()
In this equation, K1 and n are experimentally
derived, W is the gate width, and Vθ is called the "thermal voltage". Typically, Vθ is about 25 mV
at room temperature – but it increases linearly with temperature. Note that (a) static power is caused in part by Isub,
(b) static power increases the temperature, and (c) that increases in
temperature further increase Isub. Thus, the potential for thermal runaway exists. Additionally, note that Isub
will increase exponentially as the threshold voltage Vt is
reduced. Ironically, part of the
reason for this is that the supply voltage has been continually scaled down to
reduce the dynamic
power consumption of logic gates – especially as dynamic power is
dependent on the square of the supply voltage as seen in Question 3. As supply voltage is scaled down, to
maintain performance, the threshold voltage has to be reduced as well.
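To make the dependence on Vt and temperature concrete, the sketch below assumes a commonly cited form of the sub-threshold leakage expression, Isub = K1 · W · exp(−Vt / (n · Vθ)) · (1 − exp(−V / Vθ)), with Vθ = kT/q. This form is an assumption (the equation image above did not survive), and the function names are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temp_kelvin):
    """Thermal voltage V_theta = k*T/q in volts (about 25.9 mV at 300 K)."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def subthreshold_leakage(k1, n, width, vt, vdd, temp_kelvin=300.0):
    """Assumed form: K1 * W * exp(-Vt/(n*V_theta)) * (1 - exp(-V/V_theta)).

    k1 and n are the experimentally derived fitting parameters; width is the
    gate width W; vt is the threshold voltage; vdd is the supply voltage V.
    """
    v_theta = thermal_voltage(temp_kelvin)
    return (k1 * width * math.exp(-vt / (n * v_theta))
            * (1.0 - math.exp(-vdd / v_theta)))
```

Evaluating `subthreshold_leakage` at two threshold voltages, or at two temperatures, with all other arguments held fixed shows the exponential sensitivity described above.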
Question (Part A):
Explain how we could reduce Isub
– and hence static power – without raising Vt.
Background (Part B):
Gate leakage is the result of current tunneling through
the gate oxide insulator (smaller oxides result in the gate not being an
ideally isolated electrode). Gate
leakage current (Iox) can be approximated by the equation below:
![]()
Here, K2 and α are experimentally derived, V is the supply voltage, and Tox
represents the oxide thickness.
The key component of this equation is the oxide thickness. Clearly, increasing Tox will
reduce gate leakage.
Unfortunately, it also degrades the transistor's effectiveness because Tox
must decrease proportionally with process scaling to avoid short channel
effects like drain induced barrier lowering (more on DIBL in Question 5).
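The role of Tox can be explored numerically. The sketch below assumes a commonly cited form of the gate leakage expression, Iox = K2 · W · (V / Tox)^2 · exp(−α · Tox / V). This form is an assumption (the equation image above did not survive), and the function name is hypothetical.

```python
import math


def gate_leakage(k2, alpha, width, v, tox):
    """Assumed form: K2 * W * (V / Tox)^2 * exp(-alpha * Tox / V).

    k2 and alpha are the experimentally derived fitting parameters; width is
    the gate width W; v is the supply voltage; tox is the oxide thickness.
    """
    if v <= 0 or tox <= 0:
        raise ValueError("supply voltage and oxide thickness must be positive")
    return k2 * width * (v / tox) ** 2 * math.exp(-alpha * tox / v)
```

Comparing `gate_leakage` for two values of `tox` with the other arguments fixed shows how strongly the result depends on oxide thickness in this model.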
Question (Part B):
An alternative – introduced in a 45 nm
process developed by Intel – is to use a different, high-k material for the gate
oxide. How does this help
alleviate gate leakage (Iox) while not degrading transistor
performance?
Question 5:
Background (Part A):
As mentioned in Question 4, one way to alleviate the
sub-threshold leakage current (Isub) is to raise the threshold
voltage Vt. For short
channel devices, this is easier said than done! Downward gate length scaling has led to an effect called
Drain Induced Barrier Lowering (DIBL).
DIBL has an adverse effect on Vt and can in turn adversely
affect Isub and hence lead to an increase in static power.
Question (Part A):
In 4 or 5 sentences, qualitatively explain how
DIBL affects Vt (and static power). Hint: refer to
(a) your lecture notes, (b) Section 3.3 of Jan Rabaey's VLSI text on reserve in
the engineering library, or (c) the paper "Turning Silicon on Its Edge" by Nowak, et al. to
answer this question. Also, in
just 1 sentence, explain how changing Tox could theoretically
alleviate some of the effects of DIBL.
Background (Part B):
One way to alleviate performance issues created by DIBL
is to increase the doping concentration in the transistor body. Unfortunately, increasing the doping
concentration exacerbates another short channel effect – sub-threshold
swing (St).
Question (Part B):
Again, in 4-5 sentences, explain what
sub-threshold swing is and how an increase in this metric degrades transistor
performance. Hint: refer to (a) your lecture notes, and
(b) the paper "Turning Silicon on Its Edge" by Nowak, et al. to
answer this question.
Background (Part C):
One way to mitigate the effects of both DIBL and
sub-threshold swing in planar MOSFETs is to use a technique called halo doping
(this technique may also be referred to in the context of halo implants). While more information about halo
doping can be found in (a) your lecture notes, (b) the paper "CMOS design near the
limit of scaling" by Taur, or (c) "In
Search of 'Forever,' Continued Transistor Scaling One New Material at a Time"
by Thompson, et al., the technique is briefly summarized here as well. Namely, a halo implant is a non-uniform
doping profile. In an NMOS device,
there is a greater concentration of N-type dopant atoms near the source and
drain – and the concentration decreases as you move away from the source
and drain. Similarly, there is a
higher concentration of P-type dopant atoms directly under the gate in the
channel.
Question (Part C):
In the context of pn-junctions (see Lecture
3), explain why this helps improve sub-threshold swing and DIBL for short channel
devices. (Hint: Think about this doping profile in the
context of electric field lines).
Question 6:
Background:
Up until now, all of the questions in this assignment
have been associated with the structures required to perform logical
operations. More than just logic
gates are required for computationally interesting structures. We also need materials to connect gates
to form circuits, to connect circuits to form functional units, and to connect
functional units to one another, storage systems, etc. Of course the subject of this question
is on-chip interconnect. It is
fairly well accepted that local interconnect – the wires used to connect
logic gates for example – does scale with feature size. However, in terms of global interconnect, one of the
consequences of downward technology scaling is illustrated in Figure 3. As one can see, the percentage of a die
reachable in a single clock cycle decreases as feature size decreases. Inherently, this chart says that global
interconnect delay does not scale well as feature sizes decrease.

Figure 3
Question:
Explain why global interconnect delay does not
scale well as feature sizes decrease.
Question 7:
Background:
By now, it should be somewhat obvious that the last 6 questions
have talked about the problems that arise as transistor feature sizes scale
downward – and the effects on performance as a result! The last 3 questions will talk about
viable "solutions" to this problem with an eye toward improving systems-level performance. There's a good chance that the first
solution is already at work in your laptop or desktop – namely, if you've
bought a new computer at any point in the last several years, there's a good
chance that it has more than one processor core. (For a short review paper on multi-core designs,
scalability, etc., see the Intel white paper "From
a Few Cores to Many".)
Question:
Using the scaling trends illustrated in Figure
1, Figure 2, and Figure 4 (below), explain why multi-core architectures have
helped to improve performance even in the face of processor clock rates that
have plateaued between 2 and 3 GHz.

Figure 4
Question 8:
Background (Parts A – C):
As we discussed in class, smaller device feature sizes
make a circuit more susceptible to error inducing noise, defects caused by
lithographic variation, etc. One
way to leverage smaller feature sizes and the "faultiness" that comes with them
is to find ways to design circuits that can tolerate the faultiness while still
offering a performance improvement.
To this end, we discussed two versions of "Probabilistic CMOS". The first effort discussed was based on
work by Krishna Palem at the Georgia Institute of Technology. (A paper summarizing Palem's approach
can be found here.) The second effort is based on work by
Iris Bahar at Brown University. (A
paper summarizing Bahar's approach can be found here,
and other papers describing this work are linked off of the course website in
the Lecture 10 entry.)
Question A:
Briefly explain the differences between the
two approaches. ~5 sentences on
each should be sufficient.
(Hint: think about what
aspect of "performance" each method seeks to improve.)
Question B:
Palem admits that his approach is not well
suited for a general purpose processor and is better matched with particular
applications. In your opinion, is
Bahar's approach suited for a general-purpose processor? Why or why not? (There's no right answer to this
question; your score for Question 8B will be based on how well you justify your
response. That said, a paragraph
is sufficient.)
Question C:
Bahar's approach advocates operating the gates
(transistors) that form her MRF-based circuits with a gate voltage that is below a given transistor's
threshold voltage. If one were to
compare the MRF-based inverter discussed by Bahar (low voltage, 20 transistors)
to a standard, CMOS inverter (higher voltage for noise tolerance, 2
transistors), how might the subthreshold leakage currents compare? Would the MRF circuit's leakage current
be about the same, better, or worse?
Why?
Question 9:
Background:
In a sense, the work that formed the premise of Question
8 was centered on "living" with the performance degradations induced by short
channel effects/gate scaling.
Another approach – and the subject of the final question of this
assignment – is to redesign our FETs to alleviate short channel effects
and the resulting performance degradations.
Much of the research in this area is focused on building multiple gate
devices (FinFETs, MUGFETs, Trigates, etc.). As discussed in the paper "Turning Silicon on Its Edge" (see
link in Question 5B), in multiple gate transistors, the longitudinal electric
field generated by the drain is better screened from the source end of the
channel due to the proximity to the channel of the second gate. This reduces short channel effects
(especially DIBL) and hence Isub and static power.
Question:
How would a switch to FinFETs affect dynamic
power? (Hint: read the bottom of page 3 and see the
figure on page 4 of the document linked here
– an overview of Intel's trigate device. Then, consider how the transistor geometry affects the
components of the dynamic power equation discussed in Question 3.)
Bonus Question:
A bonus question on "band gaps" is forthcoming :).