2.6.6 Other Datapath Operators

Figure 2.32 shows symbols for some other datapath elements. The combinational datapath cells (NAND, NOR, and so on) and the sequential datapath cells (flip-flops and latches) have standard-cell equivalents and function identically. I use a bold outline (1 point) for datapath cells instead of the regular (0.5 point) line I use for scalar symbols. We call a set of identical cells a vector of datapath elements, in the same way that a bold symbol, A, represents a vector and a regular-weight A represents a scalar.

A subtracter is similar to an adder, except that in a full subtracter we have a borrow-in signal, BIN; a borrow-out signal, BOUT; and a difference signal, DIFF:

DIFF = A ⊕ NOT(B) ⊕ NOT(BIN) = SUM(A, NOT(B), NOT(BIN))   (2.65)

NOT(BOUT) = A · NOT(B) + A · NOT(BIN) + NOT(B) · NOT(BIN) = MAJ(A, NOT(B), NOT(BIN))   (2.66)

These equations are the same as those for the FA (Eqs. 2.38 and 2.39) except that the B input is inverted and the sense of the carry chain is inverted. To build a subtracter that calculates (A - B) we invert the entire B input bus and connect the BIN[0] input to VDD (not to VSS as we did for CIN[0] in an adder). As an example, to subtract B = '0011' from A = '1001' we calculate '1001' + '1100' + '1' = '0110' (discarding the carry out of the MSB). As with an adder, the true overflow is XOR(BOUT[MSB], BOUT[MSB - 1]). We can build a ripple-borrow subtracter (a type of borrow-propagate subtracter), a borrow-save subtracter, and a borrow-select subtracter in the same way we built these adder architectures.

An adder/subtracter has a control signal that gates the A input with an exclusive-OR cell (forming a programmable inversion) to switch between an adder and a subtracter. Some adder/subtracters gate both inputs to allow us to compute (-A - B). We must be careful to connect the input to the LSB of the carry chain (CIN[0] or BIN[0]) correctly when changing between addition (connect to VSS) and subtraction (connect to VDD).
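The subtracter equations can be checked with a short bit-level model. The following Python sketch is only an illustration, not a cell-library implementation: the function names are hypothetical, and it assumes that the signal rippling along the chain is NOT(borrow), so that tying the LSB input high (the VDD connection described above) means "no borrow in."

```python
def maj(x, y, z):
    """Majority function of three bits."""
    return (x & y) | (x & z) | (y & z)


def full_subtracter(a, b, nbin):
    """One bit slice. nbin is NOT(BIN); returns (DIFF, NOT(BOUT))."""
    nb = b ^ 1                   # invert the B input
    diff = a ^ nb ^ nbin         # Eq. 2.65: SUM(A, NOT(B), NOT(BIN))
    nbout = maj(a, nb, nbin)     # Eq. 2.66: A.NOT(B) + A.NOT(BIN) + NOT(B).NOT(BIN)
    return diff, nbout


def ripple_borrow_subtract(a, b, width=4):
    """Ripple-borrow model of A - B. Returns (difference, NOT(BOUT) of the MSB)."""
    nborrow = 1                  # assumed: LSB chain input tied high (VDD)
    result = 0
    for i in range(width):
        d, nborrow = full_subtracter((a >> i) & 1, (b >> i) & 1, nborrow)
        result |= d << i
    return result, nborrow


diff, nbout = ripple_borrow_subtract(0b1001, 0b0011)
print(format(diff, "04b"))       # 0110, matching the worked example in the text
```

In this model the final chain output is 1 when no borrow leaves the MSB (A >= B for unsigned operands) and 0 otherwise.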
A barrel shifter rotates or shifts an input bus by a specified amount. For example, if we have an eight-input barrel shifter with input '1111 0000' and we specify a shift of '0001 0000' (3, coded by bit position, here counted from the MSB end), the right-shifted 8-bit output is '0001 1110'. A barrel shifter may rotate left or right (or switch between the two under a separate control). A barrel shifter may also have an output width that is smaller than the input. To use a simple example, we may have an 8-bit input and a 4-bit output. This situation is equivalent to having a barrel shifter with two 4-bit inputs and a 4-bit output. Barrel shifters are used extensively in floating-point arithmetic to align (we call this normalize and denormalize) floating-point numbers (with sign, exponent, and mantissa).

A leading-one detector is used with a normalizing (left-shift) barrel shifter to align mantissas in floating-point numbers. The input is an n-bit bus, A; the output is an n-bit bus, S, with a single '1' in the bit position corresponding to the most significant '1' in the input. Thus, for example, if the input is A = '0000 0101' the leading-one detector output is S = '0000 0100', indicating that the leading one in A is in bit position 2 (bit 7 is the MSB, bit zero is the LSB). If we feed the output, S, of the leading-one detector to the shift-select input of a normalizing (left-shift) barrel shifter, the shifter will normalize the input A. In our example, with an input of A = '0000 0101' and a left-shift select of S = '0000 0100', the barrel shifter will shift A left by five bits and the output of the shifter is Z = '1010 0000'. Now that Z is aligned (with the MSB equal to '1') we can multiply Z with another normalized number.

The output of a priority encoder is the binary-encoded position of the leading one in an input. For example, with an input A = '0000 0101' the leading 1 is in bit position 2 (the MSB is bit position 7), so the output of a 4-bit priority encoder would be Z = '0010' (2).
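The shifting and leading-one examples above can be reproduced with a small behavioral model. This Python sketch is an illustration under stated assumptions: the function names and the 8-bit default width are hypothetical, the shift amount is passed as a plain integer instead of a one-hot select bus, and nothing here reflects how a datapath library actually builds these cells.

```python
WIDTH = 8  # assumed bus width for the examples in the text


def barrel_shift_right(a, amount, width=WIDTH):
    """Logical right shift of a width-bit word by `amount` positions."""
    return (a & ((1 << width) - 1)) >> amount


def leading_one_detector(a, width=WIDTH):
    """One-hot word marking the most significant '1' of a (0 if a is 0)."""
    for pos in range(width - 1, -1, -1):
        if (a >> pos) & 1:
            return 1 << pos
    return 0


def normalize(a, width=WIDTH):
    """Left-shift a until its MSB is '1'. Returns (shifted word, shift amount)."""
    s = leading_one_detector(a, width)
    if s == 0:
        return 0, 0                      # nothing to normalize
    pos = s.bit_length() - 1             # bit position of the leading one
    shift = (width - 1) - pos            # distance from the leading one to the MSB
    return (a << shift) & ((1 << width) - 1), shift


print(format(barrel_shift_right(0b11110000, 3), "08b"))    # 00011110
print(format(leading_one_detector(0b00000101), "08b"))     # 00000100
z, shift = normalize(0b00000101)
print(format(z, "08b"), shift)                             # 10100000 5
```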
In some cell libraries the priority-encoder output is reversed so that the MSB has an output code of zero; for the input A = '0000 0101' this gives Z = '0101' (5). This second, reversed, encoding scheme is useful in floating-point arithmetic. If A is a mantissa and we normalize A to '1010 0000', we have to subtract 5 from the exponent; this exponent correction is equal to the output of the priority encoder.

An accumulator is an adder/subtracter and a register. Sometimes these are combined with a multiplier to form a multiplier-accumulator (MAC).

An incrementer adds 1 to the input bus, Z = A + 1, so we can use this function, together with a register, to negate a two's complement number, for example. The implementation is Z[i] = XOR(A[i], CIN[i]) and COUT[i] = AND(A[i], CIN[i]). The carry-in control input, CIN[0], thus acts as an enable: if it is set to '0' the output is the same as the input.

The implementation of arithmetic cells is often a little more complicated than we have explained. CMOS logic is naturally inverting, so it is faster to implement an incrementer as Z[i (even)] = XOR(A[i], CIN[i]) and COUT[i (even)] = NAND(A[i], CIN[i]). This inverts COUT, so that in the following stage we must invert it again. If we push an inverting bubble to the input CIN (so that CIN[i] for an odd bit is the inverted carry), we find that Z[i (odd)] = XNOR(A[i], CIN[i]) and COUT[i (odd)] = NOR(NOT(A[i]), CIN[i]). In many datapath implementations all odd-bit cells operate on inverted carry signals, and thus the odd-bit and even-bit datapath elements are different. In fact, all the adder and subtracter datapath elements we have described may use this technique. Normally this is completely hidden from the designer in the datapath assembly, and any output control signals are inverted, if necessary, by inserting buffers.

A decrementer subtracts 1 from the input bus; the logical implementation is Z[i] = XOR(A[i], CIN[i]) and COUT[i] = AND(NOT(A[i]), CIN[i]).
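The incrementer and decrementer equations, including the variant with alternating carry polarity, can be compared in a bit-level model. The Python sketch below is only a behavioral illustration: the function names, the 8-bit default width, and the `enable` parameter (standing in for CIN[0]) are assumptions introduced for the example.

```python
def increment(a, width=8, enable=1):
    """Ripple incrementer: Z[i] = XOR(A[i], CIN[i]), COUT[i] = AND(A[i], CIN[i])."""
    c, z = enable, 0
    for i in range(width):
        abit = (a >> i) & 1
        z |= (abit ^ c) << i
        c = abit & c
    return z


def increment_inverting(a, width=8, enable=1):
    """Same function with inverting carry cells: even cells take a true carry
    and emit an inverted one; odd cells take the inverted carry and emit a true one."""
    c, z = enable, 0                      # true carry into bit 0
    for i in range(width):
        abit = (a >> i) & 1
        if i % 2 == 0:
            zbit = abit ^ c               # XOR(A, CIN)
            c = 1 - (abit & c)            # NAND(A, CIN): carry leaves inverted
        else:
            zbit = 1 - (abit ^ c)         # XNOR(A, inverted CIN)
            c = 1 - ((1 - abit) | c)      # NOR(NOT(A), inverted CIN): carry is true again
        z |= zbit << i
    return z


def decrement(a, width=8, enable=1):
    """Ripple decrementer: Z[i] = XOR(A[i], CIN[i]), COUT[i] = AND(NOT(A[i]), CIN[i])."""
    c, z = enable, 0
    for i in range(width):
        abit = (a >> i) & 1
        z |= (abit ^ c) << i
        c = (abit ^ 1) & c
    return z


print(format(increment(0b00000111), "08b"))            # 00001000
print(format(increment_inverting(0b00000111), "08b"))  # 00001000
print(format(decrement(0b00001000), "08b"))            # 00000111
```

As a usage example, inverting a word and then incrementing it negates a two's complement number in this model: `increment(0b00000101 ^ 0xFF)` returns `0b11111011`, the 8-bit two's complement of 5.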
A decrementer implementation may also invert the odd carry signals, with CIN[0] again acting as an enable. An incrementer/decrementer has a second control input that gates the input, inverting the input to the carry chain. This has the effect of selecting either the increment or decrement function.

When using all-zeros detectors and all-ones detectors, remember that, for a 4-bit number, for example, zero in ones' complement arithmetic is '1111' or '0000', and that zero in signed magnitude arithmetic is '1000' or '0000'.

A register file (or scratchpad memory) is a bank of flip-flops arranged across the bus; sometimes these have the option of multiple ports (multiport register files) for read and write. Normally these register files are the densest logic and the hardest to fit in a datapath. For large register files it may be more appropriate to use a multiport memory. We can add control logic to a register file to create a first-in first-out register (FIFO) or a last-in first-out register (LIFO).

In Section 2.5 we saw that the standard-cell version and gate-array macro version of the sequential cells (latches and flip-flops) each contain their own clock buffers. The reason for this is that (without intelligent placement software) we do not know where a standard cell or a gate-array macro will be placed on a chip. We also have no idea of the condition of the clock signal coming into a sequential cell. The ability to place the clock buffers outside the sequential cells in a datapath gives us more flexibility and saves space. For example, we can place the clock buffers for all the clocked elements at the top of the datapath (together with the buffers for the control signals) and river route the connections to the clock lines (in river routing the interconnect lines all flow in the same direction on the same layer). This saves space and allows us to guarantee the clock skew and timing. It may mean, however, that there is a fixed overhead associated with a datapath.
For example, it might make no sense to build a 4-bit datapath if the clock and control buffers take up twice the space of the datapath logic.

Some tools allow us to design logic using a portable netlist. After we complete the design we can decide whether to implement the portable netlist in a datapath, standard cells, or even a gate array, based on area, speed, or power considerations.




