UNIT VII
DESIGNING WITH PROGRAMMABLE GATE ARRAYS AND COMPLEX PROGRAMMABLE LOGIC DEVICES:
Xilinx 3000 Series FPGAs, Designing with FPGAs, Using a One-Hot State Assignment,
Altera Complex Programmable Logic Devices (CPLDs), Altera FLEX 10K Series CPLDs.
Xilinx 3000 series FPGAs
Fig.1 shows part of the basic structure of a Xilinx XC3020
logic cell array (LCA), which consists of an interior array of 64 Configurable
Logic Blocks (CLBs) surrounded by a ring of 64 input-output interface blocks.
The interconnections between these blocks can be programmed by storing data in
internal configuration memory cells. Each CLB contains some combinational logic
and two D flip-flops and can be programmed to perform a variety of logic
functions.
The configuration memory cells are programmed after power is
applied to the LCA, and the programmed logic functions and interconnections are
retained until the power is turned off.
Fig.1: Layout of part of a programmable logic cell array
During configuration, each memory cell is selected in turn.
When a WRITE signal is applied to the pass transistor, DATA is stored in the
cell. Each connection point in the LCA has an associated memory cell, and the
data stored in that cell determines whether the connection is made or not.
Fig.2: Configuration Memory Cell
Fig.3: Xilinx 3000 Series Logic Cell
A CLB has five logic inputs (A, B, C, D, E), a data input (DI), a
clock input (K), a clock enable (EC), a direct reset (RD), and two outputs (X
and Y). The trapezoidal blocks on the diagram represent multiplexers, which can
be programmed to select one of their inputs. For example, the X output can come
either from the upper flip-flop (QX) or from the F output of the “Combinational
Function” block. Similarly, the Y output can come either from the lower
flip-flop (QY) or from G. Each M represents a configuration memory cell, and
the data in the cell determines which mux input is selected.
The combinational function block contains RAM memory cells
and can be programmed to realize any five-variable function or any two
functions of four variables. The functions are stored in truth table format, so
the number of gates required to realize the functions is not important.
The block can be operated in three different modes (fig.4), which are
selected by programming the multiplexers at its inputs.
The FG mode generates two functions of four variables each. One variable (A)
must be common to both functions. The next two variables can be chosen from
B, C, QX, and QY, and the remaining variable can be either D or E.
For example, the block can generate F = AB’ + QX E and G = A’C + QY D.
If QX and QY are not used, the two four-variable functions must
have A, B, and C in common, and the fourth variable can be D or E.
Fig.4: Combinational Logic Options
The F mode can generate one function of five variables
(A, D, E, and two variables chosen from B, C, QX, and QY). The functions
realized can range in complexity from a simple AND gate, F = G = ABCDE, to a parity
function, F = G = A xor B xor C xor D xor E, which has 16 terms when expanded to a
sum of products.
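The truth-table idea can be illustrated with a minimal Python sketch. It is not Xilinx tooling: the helper names (`truth_table`, `minterm_count`) are hypothetical, and the 32-entry list simply stands in for the RAM cells of the function block in F mode. It suggests why gate count does not matter: the AND gate and the parity function fill the same 32 entries, even though their sum-of-products forms have 1 and 16 terms.

```python
from itertools import product

def truth_table(func, n_inputs=5):
    """Return the 2**n_inputs-entry table a lookup memory would store for func."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]

def minterm_count(table):
    """Number of input combinations for which the function is 1."""
    return sum(table)

def and5(a, b, c, d, e):
    return a & b & c & d & e

def parity5(a, b, c, d, e):
    return a ^ b ^ c ^ d ^ e

and_table = truth_table(and5)
parity_table = truth_table(parity5)

print(len(and_table), minterm_count(and_table))        # prints: 32 1
print(len(parity_table), minterm_count(parity_table))  # prints: 32 16
```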
The FGM mode uses a multiplexer with E as a control input to
select one of two four-variable functions. Each function uses inputs A, D, and
two of the inputs B, C, QX, and QY. The FGM mode can realize some functions of
six or seven variables. For example, this mode could realize the seven-variable
function
F=G=E(AB’+QXD) + E’(A’C+QYD)
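As a rough check of the FGM idea, the sketch below models the two four-variable tables and the multiplexer controlled by E, then compares the result with the seven-variable function written out directly. The table layout and function names are assumptions made for illustration; only the equation itself comes from the text.

```python
from itertools import product

def lut4(func):
    """Store a four-variable function as a 16-entry table keyed by its inputs."""
    return {bits: func(*bits) for bits in product((0, 1), repeat=4)}

# First table: AB' + QX*D, with inputs (A, B, QX, D).
F_TABLE = lut4(lambda a, b, qx, d: (a & (1 - b)) | (qx & d))
# Second table: A'C + QY*D, with inputs (A, C, QY, D).
G_TABLE = lut4(lambda a, c, qy, d: ((1 - a) & c) | (qy & d))

def fgm_output(a, b, c, d, e, qx, qy):
    """E = 1 selects the first table and E = 0 the second, as in the equation."""
    return F_TABLE[(a, b, qx, d)] if e else G_TABLE[(a, c, qy, d)]

def reference(a, b, c, d, e, qx, qy):
    """The seven-variable function E(AB' + QX D) + E'(A'C + QY D)."""
    upper = (a & (1 - b)) | (qx & d)
    lower = ((1 - a) & c) | (qy & d)
    return (e & upper) | ((1 - e) & lower)

mismatches = [v for v in product((0, 1), repeat=7)
              if fgm_output(*v) != reference(*v)]
print(len(mismatches))  # prints: 0
```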
The D input on each flip-flop can be programmed to come from
F, G, or the DI data input. The two flip-flops have a common clock. The MUX
connected to the CLOCK input (K) can be programmed to select either K or K’, so
the D flip-flops will change state either on the rising or falling edge of the
clock. The clock is either always enabled, or it is controlled by the Enable
Clock (EC) input. The MUX connected to the D input of each flip-flop is used to
effectively disable the clock. If EC=0, the Q output is fed back to the D input
so that Q+ = Q, and the flip-flop never changes state even though the
clock is changing. If EC=1, the D input is connected to F, G, or DI, and state
changes occur in response to the clock. The D flip-flop and MUX combination is
equivalent to a D flip-flop with an enable clock (EC) input as shown in fig.5.
Since Q can change only when EC=1, the following characteristic equation
describes the flip-flop behaviour:
Q+ = EC D + EC’ Q
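The characteristic equation can be exercised with a short Python sketch. The function names and the input sequence are made up for illustration; each list element represents one active clock edge.

```python
def next_q(q, d, ec):
    """Characteristic equation Q+ = EC*D + EC'*Q for one active clock edge."""
    return (ec & d) | ((1 - ec) & q)

def run(d_values, ec_values, q=0):
    """Apply one clock edge per (D, EC) pair and record Q after each edge."""
    trace = []
    for d, ec in zip(d_values, ec_values):
        q = next_q(q, d, ec)
        trace.append(q)
    return trace

# Q loads 1, holds it for two edges while EC = 0 (even though D = 0),
# then follows D again once EC returns to 1.
trace = run(d_values=[1, 0, 0, 1, 0], ec_values=[1, 0, 0, 1, 1])
print(trace)  # prints: [1, 1, 1, 1, 0]
```

Because the hold is built into the D path, the clock itself is never gated in this model, which mirrors the point made in the text.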
Using this type of flip-flop makes it unnecessary to gate
the clock with a control signal. Since the clock can go directly to each
flip-flop input, achieving proper synchronous operation is much easier. The
flip-flops have an active high asynchronous reset (RD). The direct reset input
(if it is not inhibited) will clear both flip-flops when it is set to 1. The
global reset will clear the flip-flops in all of the cells in the array.
Fig.5: Flip-Flops with Clock Enable
For example, consider implementing a parallel adder-subtracter with an
accumulator using an XC3020. The overall structure is similar to fig.6, except
that control signals are needed for both add and subtract.
Fig.6: Parallel adder with accumulator
If Ad=1, the B input will be added to the accumulator. If
Su=1, the B input will be subtracted from the accumulator. Subtraction is
accomplished by adding the 2’s complement of B (i.e., the 1’s complement of B
plus 1) to the accumulator. If Ad=Su=0, the accumulator should remain unchanged.
Since each logic cell has two flip-flops, it might be
possible to implement two bits of the accumulator in one cell. If two bits of
the adder-subtracter were implemented in one cell, two outputs from the
accumulator flip-flops plus a carry output to the next cell would be required.
Since each cell has only two outputs, this scheme would not work. Therefore, we
can implement only one bit per cell.
Fig.7: Parallel Adder-Subtractor Logic Cell
Fig.7 shows a typical cell of the parallel adder-subtracter,
where the logic inputs are b_i, c_i (carry from the previous
cell), and Su. The accumulator flip-flop output (a_i) is fed back
internally within the cell. The combinational function block implements the
following equations:
F = sum = a_i+ = a_i xor (b_i xor Su) xor c_i
G = c_(i+1) = carry out = a_i c_i + (a_i + c_i)(b_i xor Su)
If Su=0, these equations reduce to the standard equations
for a full adder. If Su=1, b_i is complemented by the XOR. If the carry-in to
the LSB is also connected to Su, then when Su=1 the 2’s complement of B is added
to A, so that subtraction occurs. Both F and G are 4-variable functions
of the same variables, so they can be implemented by the combinational function
block using the FG mode in fig.4. In fig.7, c_i and b_i are
connected to the A and B block inputs, so the internal feedback from the a_i
flip-flop (QX) must be routed to the third block input. The remaining
input, Su, can then be connected to block input D or E. Since the accumulator should
change only when Ad=1 or Su=1, we connect the clock enable (EC) to the signal
Ad+Su. An OR gate in another logic cell generates this signal, which is used by
all of the adder-subtracter cells.
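The cell equations and the clock-enable connection can be checked with a small behavioural sketch in Python. The 4-bit width and the function names are assumptions chosen for the example; the F and G expressions, the EC = Ad + Su connection, and the carry-in tied to Su follow the description above.

```python
def cell(a, b, c, su):
    """One cell: F = a xor (b xor Su) xor c, G = a*c + (a + c)*(b xor Su)."""
    bx = b ^ su
    f = a ^ bx ^ c
    g = (a & c) | ((a | c) & bx)
    return f, g

def clock_accumulator(acc, b, ad, su, width=4):
    """Return the accumulator value after one clock edge."""
    if not (ad | su):          # EC = Ad + Su = 0: the flip-flops hold
        return acc
    carry = su                 # carry-in of the LSB is tied to Su
    result = 0
    for i in range(width):
        f, carry = cell((acc >> i) & 1, (b >> i) & 1, carry, su)
        result |= f << i
    return result

print(clock_accumulator(9, 3, ad=1, su=0))  # prints: 12
print(clock_accumulator(9, 3, ad=0, su=1))  # prints: 6
print(clock_accumulator(9, 3, ad=0, su=0))  # prints: 9
```

Results are taken modulo 2**width, since the carry out of the last cell is not stored in this sketch.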
Fig.8: Signal Paths within Adder-Subtractor Logic Cell
The dashed lines in fig.8 indicate the relevant signal paths
that are present within the logic cell after it has been programmed. The F
function is connected to the D input of the accumulator flip-flop (a_i),
and the G function is connected to the carry out (c_(i+1)).
Input-Output Blocks:
Fig.9 shows a configurable IOB. The I/O pad connects to one
of the pins on the IC package so that external signals can be input to or
output from the array of logic cells. To use the cell as an input, the 3-state
control must be set to place the tristate buffer, which drives the output pin,
in the high-impedance state. To use the cell as an output, the tristate buffer
must be enabled. Flip-flops are provided so that input and output values can be
stored within the IOB. The flip-flops are bypassed when direct input or output is
desired. Two clock lines (CK1 and CK2) can be programmed to connect to either
flip-flop. The input flip-flop can be programmed to act as an edge-triggered D
flip-flop or as a transparent latch. Even if the I/O pin is not used, the I/O
flip-flops can still be used to store data.
Fig.9: Xilinx 3000 series I/O Block
An OUT signal coming from the logic array first goes through
an exclusive-OR gate, where it is either complemented or not, depending on how
the OUT-INVERT bit is programmed. The OUT signal can be stored in the flip-flop
if desired. Depending on how the OUTPUT-SELECT bit is programmed, either the
OUT signal or the flip-flop output goes to the output buffer.
If the 3-STATE signal is 1 and the 3-STATE INVERT bit is 0
(or if the 3-STATE signal is 0 and the 3-STATE INVERT bit is 1), the output
buffer has a high-impedance output. Otherwise, the buffer drives the output
signal to the I/O pad. When the I/O pad is used as an input, the output buffer must
be in the high-impedance state. An external signal coming into the I/O pad goes
through a buffer and then to the input of a D flip-flop. The buffer output
provides a DIRECT IN signal to the logic array. Alternatively, the input signal
can be stored in the D flip-flop, which provides the REGISTERED IN signal to
the logic array.
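The output path just described can be summarized in a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, the stored flip-flop value is passed in as `ff_q`, and OUTPUT-SELECT = 1 is assumed to pick the flip-flop output (the text only says that the bit chooses between the two sources).

```python
def iob_pad(out, ff_q, out_invert, output_select, three_state, three_state_invert):
    """Return the value driven onto the pad, or 'Z' when the buffer is off."""
    direct = out ^ out_invert                  # programmable inversion (XOR gate)
    value = ff_q if output_select else direct  # assumed polarity of OUTPUT-SELECT
    high_z = three_state ^ three_state_invert  # 1/0 or 0/1 turns the buffer off
    return "Z" if high_z else value

print(iob_pad(1, 0, out_invert=1, output_select=0,
              three_state=0, three_state_invert=0))  # prints: 0
print(iob_pad(1, 0, out_invert=0, output_select=0,
              three_state=1, three_state_invert=0))  # prints: Z
print(iob_pad(1, 0, out_invert=0, output_select=0,
              three_state=1, three_state_invert=1))  # prints: 1
```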
Each IOB has a number of I/O options, which can be selected
by configuration memory cells. The input threshold can be programmed to respond
to either TTL or CMOS signal levels. The SLEW RATE bit controls the rate at
which the output signal can change. When the output drives an external device,
reduction of the slew rate is desirable to reduce the induced noise that can
occur when the output changes rapidly. When the PASSIVE PULL-UP bit is set, a
pull-up resistor is connected to the I/O pad. This internal pull-up resistor
can be used to avoid floating inputs.
Programmable Interconnects
The programmable interconnections between the configurable logic blocks
and I/O blocks can be made several ways – general-purpose interconnects, direct
interconnects, and long lines.
Fig.10: General Purpose Interconnects
Fig.10 shows the general-purpose interconnect system. Signals between CLBs or
between CLBs and IOBs can be routed through switch matrices as they travel
along the horizontal and vertical interconnect lines.
Direct connection of adjacent CLBs is possible as shown in fig.11.
Fig.11: Direct Interconnection Between Adjacent CLBs
Long lines are provided to connect CLBs that are far apart. All the
interconnections are programmed by storing bits in internal configuration
memory cells within the LCA. Long lines provide for high fan-out, low-skew
distribution of signals that must travel a relatively long distance.
Fig.12: Vertical and Horizontal Long Lines
From fig.12, there are four vertical long lines between each pair of
adjacent columns of CLBs, and two of these can be used for clocks. There are
two horizontal long lines between each pair of adjacent rows of CLBs. The long
lines span the entire length or width of the interconnection area.
Each logic cell has two adjacent tristate buffers that connect to
horizontal long lines. Designers can use these long lines and buffers to
implement tristate busses.
(a). Multiplexer Implementation
(b). Wired-AND Implementation
Fig.13: Uses of Tristate Buffers
Fig.13(a) shows how tristate buffers can be used to multiplex signals onto
a horizontal long line. These buffers have an active-low output enable, so when
A=0, DA is driven onto the line. A weak keeper circuit at the end of
the line remembers the last value driven onto the line, so it is never left
floating. Care must be taken to avoid bus contention, which would occur if both
a 0 and 1 were driven onto the bus at the same time.
The tristate buffers can also be used to implement a wired-AND function,
as shown in fig.13(b). When one or more of the D inputs is 0, the line is
driven to 0. When all the D inputs are 1, all the buffer outputs are high-Z,
and the pull-up resistor pulls the line up to a 1.
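Both uses of the tristate buffers can be modelled with a short Python sketch. It assumes one wiring that produces the behaviour described: for the wired-AND, each D input is connected to both the data input and the active-low enable of its buffer. The second mux channel (`db`, `b`) and all function names are hypothetical.

```python
def tristate(data, enable_n):
    """Buffer with active-low enable: drives data when enable_n = 0, else None (high-Z)."""
    return data if enable_n == 0 else None

def resolve_line(drivers, idle_value):
    """Value on the long line; idle_value models the keeper or the pull-up."""
    driven = {v for v in drivers if v is not None}
    if len(driven) > 1:
        raise ValueError("bus contention: both 0 and 1 driven onto the line")
    return driven.pop() if driven else idle_value

def mux_line(da, db, a, b, kept=0):
    """Multiplexer use: A = 0 puts DA on the line, B = 0 puts DB on the line."""
    return resolve_line([tristate(da, a), tristate(db, b)], idle_value=kept)

def wired_and(d_inputs):
    """Wired-AND use: a 0 input drives the line low, a 1 input leaves it high-Z."""
    return resolve_line([tristate(d, d) for d in d_inputs], idle_value=1)

print(mux_line(da=1, db=0, a=0, b=1))  # prints: 1
print(wired_and([1, 1, 1]))            # prints: 1
print(wired_and([1, 0, 1]))            # prints: 0
```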
Fig.14: Crystal Oscillator
A crystal oscillator may be implemented using an internal high-speed
inverting buffer together with an external crystal (Y1), resistors, and
capacitors, as shown in fig.14. The external components connect to two of the
IOB pins, and the oscillator output connects to the alternate clock buffer. The
alternate clock buffer drives a horizontal long line, which in turn can be used
to drive vertical long lines and the K (clock) inputs to the logic blocks. If
an external clock is used, it can be connected to the global clock buffer. This
buffer drives a global network, which provides a high fan-out, synchronized
clock to all the IOBs and logic blocks. If a symmetric clock is required, the
oscillator output can be routed through a flip-flop that divides the frequency by 2.
The XC3020 FPGA has 64 CLBs (8 X 8), 64 user I/Os, 256 flip-flops (128 in
the CLBs and 128 in the IOBs), 16 horizontal long lines, and 14,779
configuration data bits. Other members of the XC3000 family have up to 484 CLBs
(22 X 22), 176 user I/Os, 1320 flip-flops, 44 horizontal long lines, and 94,984
configuration data bits.
Designing with FPGAs
Sophisticated CAD tools are available to assist with the
design of systems using PGAs.
The following steps are used to design a digital system with
an FPGA:
1. Draw a block diagram of the digital system. Define condition and control
signals and construct SM charts or state graphs that describe the required
sequence of operations.
2. Write a Verilog module to describe the system. Simulate and debug the Verilog
code, and make any necessary corrections to the design developed in step 1.
3. Work out the detailed logic design of the system using gates, flip-flops,
registers, counters, adders, etc.
4. Enter a logic diagram of the system into the computer using a schematic capture
program. Simulate and debug the logic diagram, and make any necessary
corrections to the design of step 3.
5. Run a partitioning program. This program will break the logic diagram into
pieces that fit into the configurable logic blocks.
6. Run an automatic place and route program, which places the logic blocks in
appropriate places in the FPGA and then routes the interconnections between the
logic blocks.
7. Run a program that will generate the bit pattern necessary to program the FPGA.
8. Download the bit pattern into the internal configuration memory cells in the
FPGA and test the operation of the FPGA.
Automatic synthesis tools take a Verilog HDL description of a
system as input and generate an interconnection of gates and flip-flops to
realize the system. When the final system is built, the bit pattern for
programming the FPGA is normally stored in an EPROM and automatically loaded
into the FPGA when the power is turned on. The EPROM is
connected to the FPGA as shown in fig.1. The FPGA resets itself after the power
has been applied. Then it reads the configuration data from the EPROM by
supplying a sequence of addresses to the EPROM inputs and storing the EPROM
output data in the FPGA internal configuration memory cells.
Fig.1: EPROM Connections for LCA Initialization
Fig.2: Block Diagram of Dice Game
The dice game can be implemented using an XC3020 LCA.
Fig.3: Dice Game Block Diagram as entered using ViewDraw CAD Software
The heavy lines indicate busses, which connect some of the
modules. The IPADs and OPADs represent the input and output pins on the XC3020.
The output buffers (OBUFs) drive external LEDs to indicate the state of each
counter and WIN or LOSE. IPADs P12 and P14 are connected to an external RC
network, which together with the GOSC module forms an RC clock. This oscillator
drives all the CLK inputs on the LCA through the ACLK buffer. External push
buttons connected to P11 and P16 are used for the GAME_RESET and the roll
button, respectively. The roll button is connected to two D flip-flops, which
serve to debounce the signal from the push button and synchronize it with the
clock. The dice_controller module can be designed from the SM charts of
fig.4.
Fig.4: SM Chart for the dice_controller
The main control has four states and requires two
flip-flops. Using the state assignment T0: AB=00, T1: AB=01, T2: AB=10, T3:
AB=11, the simplified logic equations derived from the SM Chart are
A+ = A’B’ Dn_roll D711 + A’B’ Dn_roll D2312 + A’B Dn_roll Eq + A’B Dn_roll D7 + A Reset’
B+ = A’B’ Dn_roll D711’ + A’B Dn_roll’ + A’B Eq’ + AB Reset’
Win=AB’
Lose=AB
En_roll=A’
Sp=A’B’ Dn_roll D711’ D2312’
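The main-control equations can be evaluated directly in Python to trace a few transitions. The sketch below is only a transcription of the equations above; the function name and the dictionary of outputs are illustrative choices, and state names follow the assignment T0=00, T1=01, T2=10, T3=11.

```python
def main_control(a, b, dn_roll=0, d711=0, d2312=0, d7=0, eq=0, reset=0):
    """Evaluate the next-state and output equations for present state AB."""
    na, nb = 1 - a, 1 - b
    a_next = ((na & nb & dn_roll & d711) | (na & nb & dn_roll & d2312)
              | (na & b & dn_roll & eq) | (na & b & dn_roll & d7)
              | (a & (1 - reset)))
    b_next = ((na & nb & dn_roll & (1 - d711)) | (na & b & (1 - dn_roll))
              | (na & b & (1 - eq)) | (a & b & (1 - reset)))
    outputs = {
        "Win": a & nb,
        "Lose": a & b,
        "En_roll": na,
        "Sp": na & nb & dn_roll & (1 - d711) & (1 - d2312),
    }
    return (a_next, b_next), outputs

# First roll is 7 or 11: T0 -> T2 (Win is asserted in T2).
print(main_control(0, 0, dn_roll=1, d711=1)[0])  # prints: (1, 0)
# First roll is a point: Sp is asserted and T0 -> T1.
print(main_control(0, 0, dn_roll=1))             # prints: ((0, 1), {'Win': 0, 'Lose': 0, 'En_roll': 1, 'Sp': 1})
# Later roll of 7 in T1: T1 -> T3 (Lose is asserted in T3).
print(main_control(0, 1, dn_roll=1, d7=1)[0])    # prints: (1, 1)
```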
The roll control requires one flip-flop, and by inspection
of its SM Chart,
Q+=Q’En_roll Rb + Q Rb
Roll=Q Rb
Dn_roll=Q Rb’
Since En_roll is always 1 whenever Q=1 (state S1), we can rewrite the
equation for Q+ as
Q+ = Q’ En_roll Rb + Q En_roll Rb = En_roll Rb
Fig.5 shows a ViewDraw schematic that implements the above
equations.
Fig.5(a): Main Controller
Fig.5(b): Dice roll controller
Fig.5: Dice Game Controller Module
The logic equations for the modulo-6 counter are
Q2+ = Q1Q0+Q2Q1’
Q1+=Q2’Q0’+Q1’Q0
Q0+=Q0’
The 3-bit counter module is implemented as shown in fig.6.
The CHIP_ENABLE input connects to CE on the flip-flops, so the counter increments
only when CHIP_ENABLE=1.
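Stepping the three counter equations in Python gives a quick sanity check. Starting from 001, the equations cycle through six states, which read as the die values 1 to 6; that interpretation comes from evaluating the equations here, not from the figure. The function names and the `chip_enable` parameter name are illustrative.

```python
def next_count(q2, q1, q0, chip_enable=1):
    """Next state from Q2+ = Q1Q0 + Q2Q1', Q1+ = Q2'Q0' + Q1'Q0, Q0+ = Q0'."""
    if not chip_enable:
        return q2, q1, q0          # CE = 0: flip-flops hold their state
    n2 = (q1 & q0) | (q2 & (1 - q1))
    n1 = ((1 - q2) & (1 - q0)) | ((1 - q1) & q0)
    n0 = 1 - q0
    return n2, n1, n0

def run(state=(0, 0, 1), steps=7):
    """Return the counter value before each of `steps` clock edges."""
    values = []
    for _ in range(steps):
        values.append(state[0] * 4 + state[1] * 2 + state[2])
        state = next_count(*state)
    return values

print(run())  # prints: [1, 2, 3, 4, 5, 6, 1]
```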
Fig.6: Modulo-6 Counter
After the final design has been partitioned into logic cells, the logic cells
placed, and the connections routed, 29 out of the 64 logic cells on the 3020 are
used to implement the dice game. Fig.7 shows the final routing of the
interconnections for the dice game.
Fig.7: Layout and Routing for Dice Game for XC3020
Realizing Functions with Six or More Variables:
Although some 6-variable logic functions can be realized
with one or two logic cells, a general 6-variable function may require three
cells.
A general method for realizing any 6-variable function starts by expanding
the function as
Z(a,b,c,d,e,f) = a’Z(0,b,c,d,e,f) + aZ(1,b,c,d,e,f) = a’Z0 + aZ1
This is an example of Shannon’s expansion theorem. It can be verified by
setting a=0 and then a=1: in each case both sides of the equation reduce to
the same expression.
This equation leads directly to the
network of fig.8(a), which uses two cells to realize Z0 and Z1.
Fig.8(a): General 6-variable function
Fig.8: Realizing 6- and 7-variable functions
Half of a third cell is used to realize the 3-variable
function, Z = a’Z0 + aZ1.
For example, consider the function
Z = abcd’ef’ + a’b’c’def’ + b’cde’f
Setting a=0 gives Z0 = 0·bcd’ef’ + 1·b’c’def’ + b’cde’f = b’c’def’ + b’cde’f
and setting a=1 gives Z1 = 1·bcd’ef’ + 0·b’c’def’ + b’cde’f = bcd’ef’ + b’cde’f
Since Z0 and Z1 are 5-variable functions, each of them can
be realized by a single cell.
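The expansion in this example can be checked exhaustively with a few lines of Python. The sketch compares the original 6-variable function with a’Z0 + aZ1 for all 64 input combinations; the function names are illustrative.

```python
from itertools import product

def n(x):
    """Complement of a 0/1 value."""
    return 1 - x

def z(a, b, c, d, e, f):
    """Z = abcd'ef' + a'b'c'def' + b'cde'f."""
    return ((a & b & c & n(d) & e & n(f))
            | (n(a) & n(b) & n(c) & d & e & n(f))
            | (n(b) & c & d & n(e) & f))

def z0(b, c, d, e, f):
    """Cofactor for a = 0: b'c'def' + b'cde'f."""
    return (n(b) & n(c) & d & e & n(f)) | (n(b) & c & d & n(e) & f)

def z1(b, c, d, e, f):
    """Cofactor for a = 1: bcd'ef' + b'cde'f."""
    return (b & c & n(d) & e & n(f)) | (n(b) & c & d & n(e) & f)

def z_network(a, b, c, d, e, f):
    """Third cell: Z = a'Z0 + aZ1."""
    return (n(a) & z0(b, c, d, e, f)) | (a & z1(b, c, d, e, f))

agree = all(z(*v) == z_network(*v) for v in product((0, 1), repeat=6))
print(agree)  # prints: True
```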
Any 7-variable function can be realized with 6 or fewer
logic cells. The expansion for a general 7-variable function is
Z(a,b,c,d,e,f,g) = a’b’ Z(0,0,c,d,e,f,g) + a’b Z(0,1,c,d,e,f,g) + ab’ Z(1,0,c,d,e,f,g)
+ ab Z(1,1,c,d,e,f,g)
= a’b’Y0 + a’b Y1 + ab’ Y2 + ab Y3
The above equation can be obtained by applying the expansion
theorem twice, first expanding about a and then expanding about b.
For example, consider the 7-variable function
Z = c’de’fg + bcd’e’fg’ + a’c’def’g + a’b’d’ef’g’ + ab’defg’
Substituting a=b=0 gives Y0 = c’de’fg + c’def’g + d’ef’g’
Substituting a=0 and b=1 gives Y1 = c’de’fg + cd’e’fg’ + c’def’g
Substituting a=1 and b=0 gives Y2 = c’de’fg + defg’
Substituting a=b=1 gives Y3 = c’de’fg + cd’e’fg’
Fig.8(b): General 7-variable function
The above equations are implemented by the network shown in
fig.8(b). Four cells implement the 5-variable functions Y0, Y1, Y2,
and Y3. A fifth cell implements the 4-variable function Z0
= a’b’ Y0 + a’b Y1, and the remaining cell implements the
5-variable function Z = Z0 + ab’ Y2 + ab Y3.
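The six-cell network for the 7-variable example can be checked the same way. In this Python sketch each function stands for one logic cell; the names (`y`, `z_six_cells`) are illustrative, and the check runs over all 128 input combinations.

```python
from itertools import product

def n(x):
    """Complement of a 0/1 value."""
    return 1 - x

def z(a, b, c, d, e, f, g):
    """Z = c'de'fg + bcd'e'fg' + a'c'def'g + a'b'd'ef'g' + ab'defg'."""
    return ((n(c) & d & n(e) & f & g)
            | (b & c & n(d) & n(e) & f & n(g))
            | (n(a) & n(c) & d & e & n(f) & g)
            | (n(a) & n(b) & n(d) & e & n(f) & n(g))
            | (a & n(b) & d & e & f & n(g)))

def y(index, c, d, e, f, g):
    """Cells 1 to 4: the 5-variable functions Y0..Y3 listed in the text."""
    common = n(c) & d & n(e) & f & g
    if index == 0:
        return common | (n(c) & d & e & n(f) & g) | (n(d) & e & n(f) & n(g))
    if index == 1:
        return common | (c & n(d) & n(e) & f & n(g)) | (n(c) & d & e & n(f) & g)
    if index == 2:
        return common | (d & e & f & n(g))
    return common | (c & n(d) & n(e) & f & n(g))

def z_six_cells(a, b, c, d, e, f, g):
    y0, y1, y2, y3 = (y(i, c, d, e, f, g) for i in range(4))
    z0 = (n(a) & n(b) & y0) | (n(a) & b & y1)   # cell 5: 4-variable function
    return z0 | (a & n(b) & y2) | (a & b & y3)  # cell 6: 5-variable function

agree = all(z(*v) == z_six_cells(*v) for v in product((0, 1), repeat=7))
print(agree)  # prints: True
```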
As the number of variables (n) increases, the maximum number
of logic cells required to realize an n-variable function increases rapidly.
Hence, PLDs may be a better solution than LCAs when n is large.
Using a One-Hot State Assignment
Because each logic cell in a PGA contains
two flip-flops, minimizing the number of flip-flops is usually not the most
important goal. It is more important to reduce the total number of logic cells
used and the interconnections between the cells. To design
faster logic, the number of cells required to realize each equation should be
reduced, and a one-hot state assignment often helps to accomplish this.
A one-hot assignment uses one flip-flop for each state,
so a state machine with N states requires N flip-flops. For example, a system with
four states (T0, T1, T2, and T3) can use four flip-flops (Q0, Q1, Q2, and Q3) with
the following state assignment:
T0: Q0Q1Q2Q3=1000, T1: 0100, T2: 0010, T3: 0001
- The other 12 combinations are not used.
- The next-state and output equations can be written by inspection of the state
graph or by tracing link paths on an SM chart.
Fig.1: Partial State Graph
Consider the partial state graph shown
in fig.1. The next-state equation for flip-flop Q3 can be written as
Q3+ = X1Q0Q1’Q2’Q3’ + X2Q0’Q1Q2’Q3’ + X3Q0’Q1’Q2Q3’ + X4Q0’Q1’Q2’Q3
Since Q0=1 implies Q1=Q2=Q3=0, the
factor Q1’Q2’Q3’ in the first term is redundant and can be dropped. By the same
reasoning, all the primed state variables can be eliminated from the other terms,
and the next-state equation reduces to
Q3+ = X1Q0 + X2Q1 + X3Q2 + X4Q3
- Each term has exactly one state variable.
- Similarly, each term of every output equation contains exactly one state
variable:
Z1 = X1Q0 + X3Q2    Z2 = X2Q1 + X4Q3
When a one-hot assignment is used, the
next-state equation for each flip-flop will contain one term for each arc (or
link path) leading into the corresponding state. Hence, in general, each term in
every next-state equation and in every output equation will contain exactly one
state variable. For an asynchronous network, a “holding term” is additionally
required for each next-state equation.
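The reduction of the Q3 equation relies on the machine only ever being in a one-hot state. The Python sketch below illustrates this: the full and reduced equations agree on the four valid states for every input combination, and they can differ on an unused state. The function names are illustrative.

```python
from itertools import product

ONE_HOT_STATES = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

def q3_full(x, q):
    """Q3+ with every state variable written out in each term."""
    x1, x2, x3, x4 = x
    q0, q1, q2, q3 = q
    n = lambda v: 1 - v
    return ((x1 & q0 & n(q1) & n(q2) & n(q3))
            | (x2 & n(q0) & q1 & n(q2) & n(q3))
            | (x3 & n(q0) & n(q1) & q2 & n(q3))
            | (x4 & n(q0) & n(q1) & n(q2) & q3))

def q3_reduced(x, q):
    """Q3+ = X1Q0 + X2Q1 + X3Q2 + X4Q3."""
    x1, x2, x3, x4 = x
    q0, q1, q2, q3 = q
    return (x1 & q0) | (x2 & q1) | (x3 & q2) | (x4 & q3)

agree_on_valid_states = all(
    q3_full(x, q) == q3_reduced(x, q)
    for x in product((0, 1), repeat=4)
    for q in ONE_HOT_STATES
)
print(agree_on_valid_states)  # prints: True

# On an unused combination such as 1100 the two forms may disagree,
# which is harmless only because that state never occurs.
print(q3_full((1, 0, 0, 0), (1, 1, 0, 0)), q3_reduced((1, 0, 0, 0), (1, 1, 0, 0)))  # prints: 0 1
```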
Method-1:
With a one-hot assignment, resetting the
system requires that one flip-flop be set to 1 instead of resetting all
flip-flops to 0. If the flip-flops used do not have a preset input, we can
replace Q0 with Q0’ throughout. The state assignment then becomes
T0: Q0Q1Q2Q3=0000, T1: 1100, T2: 1010, T3: 1001
and the modified equations are
Q3+=X1Q0’+X2Q1+X3Q2+X4Q3
Z1=X1Q0’+X3Q2
Z2=X2Q1+X4Q3
Method-2:
To solve the reset problem without
modifying the one-hot assignment, add an extra term to the equation for the
flip-flop that should be 1 in the starting state.
For example, consider eq.1 and fig.2 for the
main dice game control.
Fig.2: SM chart for serially linked state machine
The next-state equation for Q0 is
Q0+ = Q0 Dn_roll’ + Q2 Reset + Q3 Reset
If the system is reset to state 0000
after power-up, we can add the term Q0’Q1’Q2’Q3’ to the equation for Q0+.
The state then changes to 1000 (T0), the correct starting state, after the
first clock.
In general, both an assignment with a minimum number of state
variables and a one-hot assignment should be tried to see which one leads to a
design with the smallest number of logic cells. If speed is the main
requirement, the design that runs fastest should be chosen instead. When a
one-hot assignment is used, more next-state equations are required, but in
general both the next-state and output equations contain fewer variables and
hence require fewer logic cells to realize. An equation with few
variables fits in a single cell, whereas an equation with six variables may
require cascading two cells and one with seven variables three cells. As more
cells are cascaded, the propagation delay increases and operation becomes slower.
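The effect of the extra term in Method-2 can be seen by evaluating the Q0 equation in Python. Only the Q0 equation is modelled, since it is the only one given in the text; the function name and the `self_start` flag are illustrative.

```python
from itertools import product

ONE_HOT_STATES = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

def q0_next(q, dn_roll, reset, self_start=True):
    """Q0+ = Q0 Dn_roll' + Q2 Reset + Q3 Reset, optionally plus Q0'Q1'Q2'Q3'."""
    q0, q1, q2, q3 = q
    value = (q0 & (1 - dn_roll)) | (q2 & reset) | (q3 & reset)
    if self_start:
        value |= (1 - q0) & (1 - q1) & (1 - q2) & (1 - q3)
    return value

# From the all-zero power-up state, only the version with the extra term sets Q0.
print(q0_next((0, 0, 0, 0), dn_roll=0, reset=0, self_start=False))  # prints: 0
print(q0_next((0, 0, 0, 0), dn_roll=0, reset=0, self_start=True))   # prints: 1

# In every valid one-hot state the extra term is 0, so normal operation is unchanged.
unchanged = all(
    q0_next(q, dn, rst, True) == q0_next(q, dn, rst, False)
    for q in ONE_HOT_STATES
    for dn, rst in product((0, 1), repeat=2)
)
print(unchanged)  # prints: True
```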
Altera Complex Programmable Logic Devices (CPLDs)
A CPLD is an extension of the PAL concept: it is an IC that
consists of a number of PAL-like logic blocks together with a programmable
interconnect matrix. Each PAL-like logic block has a programmable
AND array that feeds macrocells, and the outputs of these macrocells can be
routed to the inputs of other logic blocks within the same IC. Many CPLDs are
electrically erasable and reprogrammable; such devices are sometimes called
EPLDs (Erasable PLDs).
Features of the Altera MAX 7000 series:
- A family of high-performance CMOS CPLDs.
- Uses EEPROM-based configuration memory cells, so once the configuration is
programmed, it is retained until it is erased.
Fig.1: Altera 7000 series Architecture for EPM7032, 7064, and 7096 Devices
The basic 7000 series architecture consists of a number of
Logic Array Blocks (LABs), I/O Control Blocks, and a Programmable Interconnect
Array (PIA). Each LAB contains 16 macrocells, each of which contains
combinational logic and a flip-flop. Each LAB has 36 inputs from the PIA and 16
outputs to the PIA. From 8 to 16 outputs from each LAB can be routed to the I/O
pins through the I/O control block, and inputs from the I/O pins can be routed
through the I/O control block to the PIA. The global clock input (GCLK)
and the global clear input (GCLRn) connect to all macrocells. Two output enable
signals (OE1n and OE2n) connect to all I/O control blocks.
Fig.2: Macrocell for EPM7032, 7064, and 7096 Devices
Each macrocell includes a logic array, a product-term select
matrix that feeds an OR gate, and a programmable flip-flop. The vertical lines
in the logic array, which are common to all of the macrocells in a LAB, are driven
by the programmable interconnect signals from the PIA and by shared logic
expanders. Product terms are formed in the logic array just as they are in a
PAL. Five product terms are provided in each macrocell, and these product terms
are allocated by the product term select matrix. A product term may be used as
an OR gate input, an XOR gate input, a logic expander, or as a flip-flop
preset, clear, clock, or enable input.
The flip-flop in each macrocell is a D flip-flop with clock
enable and asynchronous preset and clear. The clock input can be driven by the
global clock or from a product term. The clock enable can be driven from a
product term or Vcc (always enabled). The clear can be driven from the global
clear or from a product term. The preset can be driven from a product term. The
D input always comes from the XOR gate output. Either the flip-flop Q output or
the XOR gate output can be selected by the Register Bypass Multiplexer. The
selected output goes to the PIA and to the I/O control block. The D flip-flop
in a cell can be converted to a T flip-flop using the XOR gate, since the
characteristic equation for a T flip-flop is Q+ = Q XOR T. Using T flip-flops
to implement counters and adders often requires fewer gates than using D
flip-flops.
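A small Python sketch shows the D-to-T conversion at work in a 3-bit binary up-counter. The toggle inputs T0 = 1, T1 = Q0, T2 = Q1 Q0 are the usual textbook choice for such a counter and are an assumption here, not something taken from the Altera documentation; each is a single product term.

```python
def d_ff_with_xor(q, t):
    """D input driven by the XOR gate: D = Q xor T, so Q+ = Q xor T."""
    return q ^ t

def count_step(q2, q1, q0):
    """One clock edge of a 3-bit up-counter built from T-style flip-flops."""
    t0, t1, t2 = 1, q0, q1 & q0
    return d_ff_with_xor(q2, t2), d_ff_with_xor(q1, t1), d_ff_with_xor(q0, t0)

state = (0, 0, 0)
values = []
for _ in range(8):
    state = count_step(*state)
    values.append(state[0] * 4 + state[1] * 2 + state[2])
print(values)  # prints: [1, 2, 3, 4, 5, 6, 7, 0]
```

For comparison, driving the most significant bit as a plain D input would need D2 = Q2 Q1’ + Q2 Q0’ + Q2’ Q1 Q0, three product terms instead of one.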
Although only five product terms are available in each macrocell, more complex
functions can be implemented by utilizing unused product terms from other
macrocells. Two types of expanders are provided: sharable expanders and parallel expanders.
Fig.3: Sharable Expanders
From fig.3, the selected product term is fed back to the
logic array through an inverter and hence the inverted product term can be used
as an input to any macrocell AND gate. When sharable expanders are used, the
realization is equivalent to a three level NAND-AND-OR network. An AND-OR logic
expression with more than five terms can often be factored to utilize sharable
expanders from other macrocells.
For example, P = AB + B’C + C’D + E’F + E’G + E’H + F’I + F’J =
AB + B’C + C’D + E’(F’G’H’)’ + F’(I’J’)’
We can use shareable
expanders to generate (F’G’H’)’ and (I’J’)’. The XOR gate in a cell can be used
to complement a function, since F = F’ XOR 1. Sometimes the complement of a
function (F’) requires fewer terms than the original function (F), in which case it is
more economical to implement F’ and complement it using the XOR gate.
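The factoring in this example can be confirmed by brute force. The Python sketch below compares the eight-term expression with the five-term version that uses the two expander outputs, over all 1024 combinations of the ten inputs; the function names and the names `x1` and `x2` for the expander outputs are illustrative.

```python
from itertools import product

def n(v):
    """Complement of a 0/1 value."""
    return 1 - v

def p_original(a, b, c, d, e, f, g, h, i, j):
    """Eight product terms: AB + B'C + C'D + E'F + E'G + E'H + F'I + F'J."""
    return ((a & b) | (n(b) & c) | (n(c) & d)
            | (n(e) & f) | (n(e) & g) | (n(e) & h)
            | (n(f) & i) | (n(f) & j))

def p_factored(a, b, c, d, e, f, g, h, i, j):
    """Five product terms using two inverted expander product terms."""
    x1 = n(n(f) & n(g) & n(h))   # (F'G'H')' = F + G + H
    x2 = n(n(i) & n(j))          # (I'J')'   = I + J
    return (a & b) | (n(b) & c) | (n(c) & d) | (n(e) & x1) | (n(f) & x2)

agree = all(p_original(*v) == p_factored(*v) for v in product((0, 1), repeat=10))
print(agree)  # prints: True
```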
Fig.4: Parallel Expanders
The parallel expanders as shown in fig.4 allow unused
product terms from a macrocell to be used in a neighboring macrocell. The
parallel expander product terms can be chained from one macrocell to the next
within two groups: macrocells 8 down to 1 and macrocells 16 down to 9. When parallel
expanders are used without shareable expanders, the maximum number of product
terms in any logic function is 20: five from the macrocell itself and three
additional groups of five chained from neighboring macrocells.
Fig.5: I/O Control Block for EPM 7032, 7064, and 7096
Fig.5 shows an I/O control block for an I/O pin, which
allows each I/O pin to be configured as an input, output, or bidirectional pin.
A tristate buffer drives the I/O pin. The OE control mux is programmed to
select either Vcc, Gnd, or one of the global output enable signals. If Vcc is
selected, the macrocell output is enabled to the I/O pin. If Gnd is selected,
the buffer is disabled and the I/O pin can be used as an input. Otherwise, the
buffer is controlled by OE1n or OE2n.
Software provided by Altera can be used to optimize and
partition a design to fit it into logic cells and route the connections between
the cells.
Fig.6: Parallel adder with accumulator
For example, suppose we use the Altera software to implement two bits of
the parallel adder with accumulator of fig.6.
The software first determines that T flip-flops will require
fewer gates than D flip-flops, and then it factors the equations to utilize
shareable expanders. The resultant equations are
C3 = A1 X01 X02 + B1 C1 X02 + A2 B2
where X01 = B1 + C1 and X02 = A2 + B2 are shareable
expander outputs,
T2 = Ad A1 B2’ C1 + Ad B1 B2’ X03 + Ad B1’ B2 X04 + Ad A1’ B2 C1’
where X03 = A1 + C1 and X04 = A1’ + C1’ are shareable
expander outputs, and
T1 = Ad B1 C1’ + Ad B1’ C1
Each logic equation has five or fewer terms, so
it fits in a logic cell. Implementing these equations requires three logic
cells and four shareable expanders.
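The macrocell equations can be compared against ordinary 2-bit addition with a brute-force Python sketch. Two points are assumptions of the sketch: the carry-out is evaluated in the form C3 = A2 B2 + X02 (A1 X01 + B1 C1), which is the 2-bit ripple carry written with the expander outputs, and each accumulator bit is updated as A+ = A xor T. Function names are illustrative.

```python
from itertools import product

def n(v):
    """Complement of a 0/1 value."""
    return 1 - v

def macrocells(ad, a1, a2, b1, b2, c1):
    """Return (C3, T2, T1) from the factored equations with expander outputs."""
    x01 = b1 | c1
    x02 = a2 | b2
    x03 = a1 | c1
    x04 = n(a1) | n(c1)
    c3 = (a1 & x01 & x02) | (b1 & c1 & x02) | (a2 & b2)
    t2 = ((ad & a1 & n(b2) & c1) | (ad & b1 & n(b2) & x03)
          | (ad & n(b1) & b2 & x04) | (ad & n(a1) & b2 & n(c1)))
    t1 = (ad & b1 & n(c1)) | (ad & n(b1) & c1)
    return c3, t2, t1

def check():
    """Compare against a1 + 2*a2 + b1 + 2*b2 + c1 for every input combination."""
    for ad, a1, a2, b1, b2, c1 in product((0, 1), repeat=6):
        c3, t2, t1 = macrocells(ad, a1, a2, b1, b2, c1)
        total = (a1 + 2 * a2) + (b1 + 2 * b2) + c1
        if c3 != (total >> 2) & 1:
            return False
        if ad:   # accumulator bits toggle into the sum bits
            if (a1 ^ t1, a2 ^ t2) != (total & 1, (total >> 1) & 1):
                return False
        elif (t1, t2) != (0, 0):   # Ad = 0: accumulator holds
            return False
    return True

print(check())  # prints: True
```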
For the dice game, the schematics are entered and compiled
for the design, the resulting equations require 23 logic cells and 8 shareable
expanders, which fit into an EPM7032 CPLD.
Altera manufactures several other series of CPLDs. The MAX
7000S series is similar to the MAX 7000 series, except that it is in-circuit
programmable rather than requiring a programmer. The MAX 9000 series, which is
an enhanced version of the MAX 7000S series, has higher density and additional
routing resources. The FLEX 8000 and FLEX 10K series use RAM-Based
configuration memory cells instead of EEPROM-based cells.
ALTERA FLEX 10K SERIES CPLDs
The Altera FLEX 10K embedded programmable logic family
provides high-density logic along with RAM memory in each device. The logic and
interconnections are programmed using configuration RAM cells similar to those
of Xilinx FPGAs.
Fig.34 shows the block diagram of a FLEX 10K device, in which
each row of the logic array contains several logic array blocks (LABs) and an
embedded array block (EAB). Each LAB contains eight logic elements and a local
interconnect channel. The EAB contains 2048 bits of RAM memory. The LABs and
EABs can be interconnected through fast row and column interconnect channels,
called the FastTrack Interconnect. Each input-output element (IOE) can be used as
an input, output, or bidirectional pin. Each IOE contains a bidirectional
buffer and a flip-flop that can be used to store either input or output data. A
single FLEX 10K device provides from 72 to 624 LABs, 3 to 12 EABs, and up to
406 IOEs. It can utilize from 10,000 to 100,000 equivalent gates in a typical
application.
Fig.34: FLEX 10K Device Block Diagram
The block diagram of a FLEX 10K LAB, shown in fig.35,
contains 8 logic elements (LEs). The local interconnect channel has 22 or more
inputs from the row interconnect and 8 inputs fed back from the LE outputs. Each
LE has four data inputs from the local interconnect channel as well as
additional control inputs. The LE outputs can be routed to the row or column
interconnects, and connections can also be made between the row and column
interconnects.
Fig.35: FLEX 10K Logic Array Block
Fig.36: FLEX 10K Logic Element
Each logic element contains a function generator that can
implement any function of four variables using a lookup table (LUT). A cascade
chain provides connections to adjacent LEs so functions of more than four
variables can be implemented. The cascade chain can be used in an AND or in an
OR configuration as shown in fig.37.
Fig.37: Cascade Chain Operation
When used in the arithmetic mode, an LE can implement the
sum and carry for one bit of a full adder. The carry chain provides for
propagation of carries between adjacent cells. Each LE contains one D flip-flop
with a clock enable and asynchronous clear and preset inputs. The LE output can
come from the flip-flop or directly from the combinational logic.
Functions of more than four variables require multiple LEs
for implementation. For example, consider a six-variable function Z(a,b,c,d,e,f), which
can be implemented using six LEs.
By applying the expansion theorem,
Z(a,b,c,d,e,f) = a’b’Z0(c,d,e,f) + a’bZ1(c,d,e,f) + ab’Z2(c,d,e,f) + abZ3(c,d,e,f)
Z0, Z1, Z2, and Z3 can
each be implemented with an LE. The outputs of these LEs can be connected to
inputs of other LEs via the local interconnect. The 4-variable functions
Y0 = a’b’Z0 + a’bZ1 and Y1 = ab’Z2 + abZ3 each require another LE. Y0
can be ORed with Y1 using the cascade chain, so no additional LE is
required.
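The six-LE decomposition can be sketched in Python by treating each LE as a 16-entry lookup table. This is an illustration only: `make_lut`, `build_six_le`, and the sample function `z_example` are hypothetical, and the final OR stands in for the cascade chain.

```python
from itertools import product

def make_lut(func):
    """Model one LE's function generator as a 16-entry lookup table."""
    return {bits: func(*bits) for bits in product((0, 1), repeat=4)}

def build_six_le(z):
    """Return six 4-input LUTs realizing a 6-variable function z(a,b,c,d,e,f)."""
    # LEs 1 to 4: cofactors Z0..Z3 of (c, d, e, f) for ab = 00, 01, 10, 11.
    cofactors = [make_lut(lambda c, d, e, f, a=a, b=b: z(a, b, c, d, e, f))
                 for a, b in product((0, 1), repeat=2)]
    # LE 5: Y0 = a'b'Z0 + a'bZ1.  LE 6: Y1 = ab'Z2 + abZ3.
    y0 = make_lut(lambda a, b, z0, z1: ((1 - a) & (1 - b) & z0) | ((1 - a) & b & z1))
    y1 = make_lut(lambda a, b, z2, z3: (a & (1 - b) & z2) | (a & b & z3))
    return cofactors, y0, y1

def evaluate(luts, a, b, c, d, e, f):
    cofactors, y0, y1 = luts
    z0, z1, z2, z3 = (table[(c, d, e, f)] for table in cofactors)
    return y0[(a, b, z0, z1)] | y1[(a, b, z2, z3)]  # OR supplied by the cascade chain

def z_example(a, b, c, d, e, f):
    """An arbitrary 6-variable test function (not from the text)."""
    return (a & b & c) ^ (d | e) ^ f

luts = build_six_le(z_example)
agree = all(evaluate(luts, *v) == z_example(*v) for v in product((0, 1), repeat=6))
print(agree)  # prints: True
```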
Fig.38: FLEX 10K Embedded Array Block
Fig.38 shows an embedded array block. The inputs from the
row interconnect go through the EAB local interconnect and can be utilized as
data inputs or address inputs to the EAB. The internal memory array can be used
as a RAM or ROM of size 256×8, 512×4,1024×2, or 2048×1. Several EABs can be
used together to form a larger memory. The memory data outputs can be routed to
either the row or column interconnects. All memory inputs and outputs are
connected to registers so the memory can be operated in a synchronous mode.
Alternatively, the registers can be bypassed and the memory operated
asynchronously.
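The four memory organizations all use the same 2048 bits, which a very small Python model makes explicit. The class name and its methods are hypothetical and ignore the registers and timing; the sketch only shows how the word width fixes the depth.

```python
EAB_BITS = 2048

class EmbeddedArrayBlock:
    """Toy model of one EAB configured with a given data width."""

    def __init__(self, width):
        if width not in (1, 2, 4, 8):
            raise ValueError("width must be 1, 2, 4, or 8")
        self.width = width
        self.depth = EAB_BITS // width
        self.words = [0] * self.depth

    def write(self, address, value):
        # Keep only as many bits as the configured width allows.
        self.words[address] = value & ((1 << self.width) - 1)

    def read(self, address):
        return self.words[address]

shapes = [(EmbeddedArrayBlock(w).depth, w) for w in (8, 4, 2, 1)]
print(shapes)  # prints: [(256, 8), (512, 4), (1024, 2), (2048, 1)]

ram = EmbeddedArrayBlock(4)
ram.write(511, 0x1F)
print(ram.read(511))  # prints: 15
```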
Use of CPLDs such as the FLEX 10K series allows us to
implement a complex digital system using a single IC.