## HW#4 Solution

**Q.1.** Consider a technology library containing the following cells:

| Cell                                   | Area Cost |
|----------------------------------------|-----------|
| INV(x1) = x1'                          | 1         |
| NAND2(x1, x2) = (x1 x2)'               | 2         |
| NAND3 $(x1, x2, x3) = (x1 x2 x3)'$     | 3         |
| NOR2 $(x1, x2) = (x1 + x2)'$           | 2         |
| AOI21 $(x1, x2, x3) = ((x1 x2) + x3)'$ | 3         |
| OAI21(x1, x2, x3) = $((x1+x2)x3)'$     | 3         |

(i) Show the **pattern trees** of the library cells using **NAND2** and **INV** as base functions. Assume that symmetric representations do not need to be stored.

| -  | Cell  | gate        | Pattern tree                           | cost |
|----|-------|-------------|----------------------------------------|------|
| €I | INV   |             | •Ē                                     | 1    |
|    |       | -CF         |                                        | 2    |
| ŧ3 | NAND3 | -Do-mar Do- | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 3    |
| ŧų | NOR2  | -Do-Do-Do-  |                                        | 2    |
| ts | A0I21 |             |                                        | 3    |
| to | 01721 | D-          |                                        | 3    |
|    |       |             |                                        | 1    |

(ii) Decompose the function f = a + b' + c using NAND2 and INV as base functions into all possible **non-symmetric decompositions**. Then, **map** the decomposed circuits using the given library and determine the decomposition that leads to a **lower area cost**.

The function f = a + b + c can be decomposed into 2-input OR gates by one of the following decompositions; The first and second decompositions have the same structure and will lead to a solution of the same cost. So, we will consider mapping the first and third decompositions. The minimum solution for mapping this decomposition has a cost of 5 as shown below; a - Do - - - - - cost=5 Next, we consider the third decomposition. and a porto The minimum solution for mapping this decomposition has a cost of 4 as shown below:  $c \rightarrow 2$  cost = 4

(iii) Using the dynamic programming approach, **map** the circuit given below using the given library into the **minimum area** cost solution. Inputs are  $\{a, b, c, d, e, f, g\}$  and output is  $\{Z\}$ .



| vertex | Motch      | gate              | cost         |
|--------|------------|-------------------|--------------|
|        | <b>t</b> 2 | Nandz(a,b)        | 2            |
|        | ŧI         | INV(c)            | t            |
|        | tı         | INV(d)            | 1            |
| 33     | tz         | Nandz (eff)       | 2            |
| 84     | t2         | Nand2 (91,92)     | 2+2+1=5      |
| 95     | 61         | INV (84)          | 1+2=3        |
| 96     | t2         | Nandz (95,93)     | 2+5+1=8      |
| 97     |            | Nondz (96,8)      | 2+3=5        |
| 98     | 42<br>43   | Nondz (erf.g)     | 3            |
|        | ti         | INV (27)          | 1+8=9        |
| 99     | ts         | A0I21 (31,92,d)   | 3+2+1= 5     |
|        | tz         | Nond2 (99,98)     | 2+6+3=11     |
| 310    | E3         | Nand 3 (95,33, 98 | ) 3+5+1+3=12 |
|        |            |                   |              |

Thus, the minimum cost is II and the mapped solution is shown below;



(iv) Using the given library, use the SIS command *read\_libray* q1.lib to read the library. Then, map the circuit to the library using the sis command *map* -s -m 0. Compare your solution to the solution obtained in (iii). You can save the mapped circuit using the sis command *write\_blif* -n. Why do you think the solution obtained by SIS is better than your solution?

The solution achieved by SIS is given below with a total area of 10.



The obtained solution is better than the solution we obtained because SIS uses an additional optimization technique by inserting pairs of inverters at each line in the subject graph and then finding an optimal mapping of the subject graph. Any mapped pair of inverters can then be eliminated. Using this technique it is possible to obtain a better mapping of the given network.

(v) Assuming **Boolean matching**, determine the <u>number</u> of ROBDD's that need to be stored in the cell library for each of the following cells. <u>Justify your answer</u>.

a. f=ab+ac+bc
Since a, b, and c are all symmetric
C3 = { (a,b,c) } = # ROBOD's = 1
b. f=abc+a'b'd
we have 2 binate variables {a,b}
and 2 unate variables {c,d}
variables a and b are symmetric.
⇒ # ROBOD's = 2

| Present State | Next State, Output |       |       |       |
|---------------|--------------------|-------|-------|-------|
|               | 00                 | 01    | 11    | 10    |
| S1            | S2, 0              | -,-   | S5, 1 | -,-   |
| S2            | S1, 0              | S3, - | -, -  | _,_   |
| S3            | S3, -              | S1, 1 | S5, – | S4, 1 |
| S4            | -, -               | S2, – | S1, - | -, -  |
| S5            | S3, -              | S2, 1 | -, 0  | S6, - |
| S6            | S6, 1              | S1, – | S2, – | S5, – |

**Q.2.** Consider the incompletely-specified FSM that has 6 states, two inputs and one output, represented by the following state table:

(i) Determine the incompatible and the compatible states along with their implied pairs.



The incompatible states are ; (S1, 54), (S1, 55), (S1, 56), (S2, 56), (S3, 54) compatible states with their implied pairs: The (51,52) £ (52,53) (51,53) *←* (*S*1, *S*3) (52,53) £ (52,53) (S2, S4) € (51,53), (52,53) (52, 55) € (S1, S2), (S4, S6) (53, 55) € (S2, S5) , (S4, S5) (33, 56) (542 55)  $\in (S_1, S_2)$  $\in (S_1, S_2), (S_3, S_6)$ 56) 🗲 (Si, S2) (54) (55, 56)

(ii) Compute the maximal compatible classes along with their implied state pairs.

(iii) Reduce the state table into the minimum number of states and show the reduced state table.

| classes to    | find a ,<br>ither cic<br>nimum of<br>ne state of<br>ates by u | 2C3C5<br>four<br>hable c<br>using h | or c<br>reduced | , we<br>1 C2 CUCS<br>States.<br>reduced |
|---------------|---------------------------------------------------------------|-------------------------------------|-----------------|-----------------------------------------|
| The reduced   | d state ta                                                    |                                     | t,<br>_, output |                                         |
| Present state | 20                                                            | ٥١                                  | ι <b>ι</b>      | 10                                      |
| 5123          | 5123,0                                                        | 5123,1                              | ارکەنى          | 545,1                                   |
| <br>5વ 5      | 5123, -                                                       | 5123,1                              | 5123,0          | 56,-                                    |
| 56            | ا ر 56                                                        | ــ را\$                             | s 2, -          | \$5,-                                   |
|               |                                                               |                                     |                 |                                         |

| Product | Input | Present State | Next State | Output |
|---------|-------|---------------|------------|--------|
| P1      | 10    | S1            | S2         | 11     |
| P2      | 00    | S2            | S2         | 11     |
| P3      | 01    | S2            | S2         | 00     |
| P4      | 00    | S3            | S2         | 00     |
| P5      | 10    | S2            | S1         | 11     |
| P6      | 10    | S3            | S1         | 11     |
| P7      | 00    | S1            | S1         |        |
| P8      | 01    | S3            | SO         | 00     |
| P9      | 11    | S1            | S1         | 10     |
| P10     | 11    | S3            | <b>S</b> 3 | 01     |
| P11     | 11    | <b>S</b> 0    | <b>S</b> 0 | 11     |

Consider the incompletely-specified FSM that has 4 states, two inputs and two **Q.3**. outputs, represented by the following state table:

(i) Assuming the following constraints, S0 = S1 OR S3, and that the code of S2 is covered by all other state codes, the state table can be reduced into the table shown below. Using implicant merging, covering and disjunctive relations show step by step how you can obtain the reduced state stable given below.

| Input | Present State | Next State | Output |
|-------|---------------|------------|--------|
| -0    | S1, S2        | S2         | 11     |
| 10    | S2, S3        | S1         | 11     |
| 00    | <b>S</b> 1    | S1         |        |
| 01    | S3            | S0         | 00     |
| 11    | S0, S1        | S1         | 10     |
| 11    | S0, S3        | <b>S</b> 3 | 01     |

- p5 and p6 can be merged using implicant merging into the following row; \$2, \$3 51 u 10
- be changed to : can PI 52 11 impose the covering constraint \$12.52 51 - 0 we if

changed to; be can

- P2 52 11 if we impose the covering constraint SI ] SZ These two rows can now be merged to:
- S 2 u s1, S2 -0

(ii) Show the encoding constraint matrix and compute all the seed dichotomies of the encoding constraint matrix. Then, eliminate seed dichotomies that violate the given covering and disjunctive constraints.

| Encoding constraint Matri                                                                            | × 3 s  |                  |
|------------------------------------------------------------------------------------------------------|--------|------------------|
| $A = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$ | X      |                  |
| Seed Dichotomies:                                                                                    |        |                  |
| sia: (si,sz), (so) ×                                                                                 | 52a ;  | (5,152) , (53) X |
| sib: (so), (si,sz)                                                                                   | sz b ; | (53) , (51,52)   |
| 53a; (s2, 53), (s0)×                                                                                 | 54× 3  | (52, 53), (51)X  |
| 536; (s0), (s2, s3)                                                                                  | 5463   | (51), (52, 53)   |
| 55a: (So, SI),(52)                                                                                   | 56a;   | (so, si), (s3)   |
| 55b: (52), (So, SI)X                                                                                 | 56b;   | (s3), (s0, si) X |
| 57a; (50,53), (51)                                                                                   | 58a %  | (50,53), (52)    |
| 57b; (51), (50,53)X                                                                                  | 526 %  | (52), (50,53) X  |

(iii) Compute all the prime dichotomies and eliminate those that violate the disjunctive constraints.



(iv) Find a state encoding satisfying the given constraints. Verify that your encoding satisfies all the constraints.

| A minim | um cover | is pl and p2 which       |
|---------|----------|--------------------------|
| results | in the   | following state encoding |
| state   | code     |                          |
| 50      | 1 1      |                          |
| SI      | 0        |                          |
| 52      | 0 0      |                          |
| ક્ર     | 10       |                          |

Note that the disjunctive constraint so = SI ORS3  
is satisfied since II = OI OR IO.  
Also, the code of 
$$S2 = OO$$
, which is covered  
by all other codes.  
Also,  $SI.S2 = O-$ , which does not intersect  
with other codes.  
 $S2.S3 = -O$ , which does not intersect  
with other codes.  
 $So:S1 = -I$ , which does not intersect  
with other codes.  
 $So:S3 = I-$ , which does not intersect  
with other codes.

(v) Using K-MAP, obtain the equations for the output and flip-flops. Compare your solution to the solution obtained by running the SIS command *stg\_to\_network* using the state codes obtained in (iii).



The circuit resulting based on the stg\_to\_network command is as follows:

sis> read\_blif hw4q3\_081.blif sis> stg\_to\_network sis> print [0] = IN\_0 IN\_1 LatchOut\_v2 + IN\_0' IN\_1 LatchOut\_v2 [1] = IN\_0 IN\_1' LatchOut\_v3' + IN\_0' IN\_1 LatchOut\_v2 + IN\_0' LatchOut\_v3 + IN\_1 LatchOut\_v3 {OUT\_0} = IN\_0 IN\_1' LatchOut\_v3' + IN\_1 LatchOut\_v3 + IN\_1' LatchOut\_v2' {OUT\_1} = IN\_0 IN\_1 LatchOut\_v2 + IN\_0 IN\_1' LatchOut\_v3' + IN\_1' LatchOut\_v2' sis> print\_stats hw4q3\_081 pi=2 po=2 nodes= 4 latches= 2 lits(sop)= 31 #states(STG)= 4

Note that the number of literals is more than what we have obtained using K-map because the objective here is to minimize the number of products and not the number of literals. To optimize the number of literals, we need to do single output optimization using the command stg\_to\_network –e 2 which produces the following circuit:

```
sis> read_blif hw4q3_081.blif

sis> stg_to_network -e 2

sis> print

[0] = IN_1 LatchOut_v2

[1] = IN_0 IN_1' LatchOut_v3' + IN_0' IN_1 LatchOut_v2 + IN_0' LatchOut_v3

+ IN_1 LatchOut_v3

{OUT_0} = IN_0 IN_1' + IN_1' LatchOut_v2' + LatchOut_v3

{OUT_1} = IN_0 LatchOut_v2 + IN_1' LatchOut_v2'

sis> print_stats

hw4q2_081 pi= 2 po= 2 nodes= 4 latches= 2

lits(sop)= 21 #states(STG)= 4
```

(vi) Perform state assignment using the program nova by running the SIS command *state\_assign nova*. Compare the obtained solution to your solution in (v) in terms of number of literals.

The result produced is as follows which is worse that the solution obtained in (v).

```
sis> read_blif hw4q3_081.blif
sis> state assign nova
Running nova, written by Tiziano Villa, UC Berkeley
sis> print
             \{OUT \ 0\} = IN \ 0 IN \ 1' LatchOut \ v2' + IN \ 1 LatchOut \ v2 LatchOut \ v3 + IN \ 1' LatchOut \ v2 + IN \ 1' LatchOut \ 
IN_1' LatchOut_v3' + LatchOut_v2 LatchOut_v3'
             {OUT_1} = IN_0 LatchOut_v2' + IN_1' LatchOut_v2 + IN_1' LatchOut_v3' + LatchOut_v2
LatchOut_v3'
             v4.0 = IN_0 IN_1' LatchOut_v2' + IN_0' IN_1 LatchOut_v3 + IN_0' LatchOut_v2 + IN_1
LatchOut v2 LatchOut v3 + LatchOut v2 LatchOut v3'
            v4.1 = IN_0 LatchOut_v2' + IN_0' LatchOut_v2 + IN_1 LatchOut_v2 LatchOut_v3
sis> print stats
hw4q3_081
                                                           pi=2 po=2 nodes= 4
                                                                                                                                                                     latches = 2
lits(sop) = 40 \# states(STG) = 4
```

The state assignment generated by nova is:  $\{S0 = 10, S1 = 11, S2 = 00, S3 = 01\}$ .

**Q.4.** Consider the following circuit with inputs {a, b, c} and outputs {F1, F2}. Assume that the delay of an Inverter is 1 unit delay, the delay of a 2-input AND gate is 2 unit delays, and the delay of a 2-input OR gate is 2 unit delays. Consider the circuit given below:



(i) Determine the critical path of this circuit and the maximum propagation delay.

(ii) Using only the **Retiming** transformation, reduce the critical path of this circuit with the minimum number of flip-flops possible.

The resulting circuit after retiming is .



It is clear that the critical path of the retimed circuit is 4 delay units.

(iii) Read the library q4.lib using the command *read\_library q4.lib*. Then, map your design to the library using the command *map –s*. Then, retime the circuit using the command *retime -i*. Compare the maximum arrival time before and after retiming. Compare the obtained solution to the solution you obtained in (ii).

The result of applying the retiming transformation is given below:

final cycle delay = 4.00final number of registers = 4final logic cost = 11.00final register cost = 64.00

RETIME: Final cycle time achieved = 4.00

The resulting circuit is: .model hw4q4\_081.blif .inputs a b c .outputs f1 f2 .default\_input\_arrival 0.00 0.00 .default\_output\_required 0.00 0.00 .default\_input\_drive 0.00 0.00 .default\_output\_load 1.00 .default\_max\_input\_load 999.00 .mlatch dff\_re D=[90] Q=[121] NIL 3 .mlatch dff\_re D=f1 Q=[124] NIL 3 .mlatch dff\_re D=[70] Q=[127] NIL 3 .mlatch dff\_re D=f2 Q=[130] NIL 3 .gate and2 1A=a 1B=b O2=g1 .gate or2 1A=c 1B=g1 O1=[90] .gate and2 1A=[121] 1B=[124] O2=g3 .gate inv 1A=f2 O=[70] .gate or2 1A=g3 1B=[127] O1=f1 .gate and2 1A=[121] 1B=[130] O2=f2 .end



Note that the number of registers obtained by SIS is different due to our description of the circuit having the fanout connecting to G5 not coming directly from F2. This is why we were able to minimize the number of registers more than what the tool could have done. If we have connected F2 to a buffer representing S2 and then let S2 fanout, SIS will obtain the same solution we have obtained.