# Recursive bisection placement algorithm with the predicted wirelength

# Hao Jie Ma Hong Peng Silong

(National ASIC Design Engineering Center Institute of Automation Chinese Academy of Sciences Beijing 100190 China)

Abstract: To obtain a better placement result a partition ing\_based placement algorithm with wirelength prediction called HLPl is presented A new method is proposed to estimate proximity of interconnects in a netlist which is capable of predicting not only short interconnects but long interconnects accurately The predicted wirelength is embedded into the partitioning tool of b isection based g bbal placement which can guide our placement towards a solution with shorter interconnects. In addition the timing objective can be handled within the algorithm by minimizing the critical path delay Experimental results show that compared to Capqo.5 mPI6 and NTUplace HJP1 outperforms these placers in terms of wirelength and run time The improvements in terms of average wire length over Capq<sub>0.5</sub> mH<sub>6</sub> and NHU place are 13%, 3%, and 9% with on 19 19%, 91%, and 99% of their runting respectively By integrating the predicted wirelength-driven clustering into Capq of the placer is able to reduce average wirelength by 3%. The tim ing\_driven HJP1 can reduce the critical path delay by23%.

Key words hierarchy, interconnect placement VLSI circuit wirelength prediction

Process in deep submicron technologies A lot of algorithms have been proposed in the past30 years including partitioning based methods of partitioning based methods of partitioning based placement a gorithms determine cell locations by recursively dividing an initial region with successive bisections or quadrisections. Advances in partitioning research have provided a number of fast algorithms which produce extremely good results

The placement problem can be driven by different objectives such as timing routability thermal distribution or a combination of them. The classical placement objective function is the total wire length which correlates well with global routing resource demand. Interconnection prediction is very important for early feasibility studies in design flow During the past two decades different wirelength prediction approaches have been proposed because of the need for early interconnect optimization in the design flow to achieve timing closure Rent's rule has been a basic tool to estimate the wire length. In Refs [9-11], wirelengths are estimated based on circuit characteristics Most of the estimation techniques provide estimates of just the average wire length. We develop an individual wire length prediction based on circuit characteristics to estimate the wire length of a layout design in advance

before placement. In fact, the final wire lengths depend on placement algorithms W ire length predictions that are accurate for one placement algorithm may be inaccurate for another. In our placement tool the predicted wire length as a main optimal objective is embedded into placement flow Experimental results show that the wire lengths of final placement can trend to predicted wire lengths

In this paper we propose a novel partitioning based placement a gorithm for standard cells. The major contributions of this paper can be summarized as follows.

- 1) We find that the basic circuit characteristics (e.g. net degree and net area) and node level can be used to predict wirelengths in the final layout These prelayout measures demonstrate good correlations with postlayout interconnect lengths. We propose to couple the wirelength predictions with our placement flows
- 2) In conventional partitioning based placements partitioning tools are typically done with the minimum objective. In order to reduce the lengths of intracluster interconnects we introduce a new objective function that incorporates a predicted wirelength component for partitioning tools Experiments show that we can obtain a betterwirelength result with little bas in time

# Overall of Placement Tool

Fig. 1 shows that our placement framework consists of pur stages. We predict individual wirelengths for all nets based on circuit characteristics in stage 1. In stage 2 we partition both the chip region and the circuit netlists recursive v by horizon tal and vertical cut lines. In the partitioning phase wire length-driven clustering is performed to reduce the wire lengths of intracluster interconnects. A fter clustering for bet term in cut wire length-driven refinement is executed without loss of wirelength quality. The subcircuits after partitioning



Fig 1 Fram ework of our placement tool

R eceived 2008-05-12

B iographies Hao Jie(1981-) male graduate Peng Silong corresponding author, male doctor professor silong peng@ ia ac cn

Foundation item: The National Key Project of Scientific and Technical Supporting Programs (No 2006BAK07B04).

C itation: Hao Jie MaHong Peng Silong Recursive bisection placement algorithm with the Predicted wirelength J. Journal of Southeast University

<sup>(</sup>English Edition). 2008. 24(4):462-467. (English Edition). 2008. All rights reserved. http://www.cnki.net

are assigned to rectangular bins A bin based simulated an nealing is performed to improve the current placement The fi nal step simply spreads overlapped cells and makes bcal inprovements to obtain the detailed placement

## Wirelength Prediction

In this section we explain our wirelength prediction technique in detail A circuit netlist can be modeled by a hyper. graph H(V E) with a node celly set  $V= \frac{1}{2} V$   $= 1, 2, \dots, M$ and a hyperedge (net) set E= 款e | ± 1, 2 ..., m归 Each net e is a subset of V with cardinality | e| \geq 2. An edge (s t) \in e is an output of source nodes and an input of sink node, t which is a subset of the hyperedge. The degree of the node y denoted by d(v) is the number of nets incident to it The de. gree of net e denoted by d(e) is the number of nodes inci dent to it Each node is associated with an area cost area (v). Net area is the total area of nodes be longing to that  $\text{net name } V \text{ area}(\underbrace{e}_{i}) = \sum_{v \in F} \text{area}(\underbrace{v}_{i}).$ 

Individual wirelength is dependent on three factors in this paper 1) The net degree and the net area which are the ba ses of the wirelength; 2) The shape distribution of a node lev el3) The range of a node level Wire length prediction is per formed in three phases in which we calculate three different weights for all the nets according to the above three factors The final predicted wirelength is obtained by combining these weights

#### 2. 1 Basic wirelength

Obviously the netwith a larger net degree and net area tends to have a larger fan out range. It is intuited that larger fan out nets usually correspond to longer net lengths We use net degree and net area of net as its basic wire length. The ar ea factor of net e is computed by

$$A(e) = \sum_{v \in e} \frac{a \operatorname{rea}(v_i)}{d(e)}$$
 (1)

The basic wirelength of net e is the combination of the area factor and the number of nodes belonging to the net and is given as

$$I_{\text{Basic}} (9 = 1 + \frac{A(e)}{A_{\text{max}}} + \frac{d(e)}{d_{\text{max}}}$$
 (2)

where Anax is the largest node area amongst all nodes, does is the largest net degree over all nets

#### 2. 2. Shape distribution of node level

Define level (v) the level of node v as the maximum topological depth over all directed paths beginning at a PI (Primary input) and terminating at node v num\_level(v) is the number of nodes at level (v). Fig 2 shows the shape of the rdy3 circuit in the MCNC (Microelectronics Center of North Carolina) benchmark In Fig. 2, the number of nodes at level 1 is maximal and the placement tool needs more sites for the nodes at level 1. So the wirelength of nets with nodes at level 1 may dilate during the placement phase. The nets connecting the nodes with larger num levels will have larger fan out ran

the dilatability due to the shape distribution of node levels for net e can be calculated as

$$DNL(e) = \sum_{v \in e} \frac{num_{-} |eve|(v)}{d(e)}$$
 (3)



For example referring to Fig. 3 num\_level( $\frac{x}{0}$ ) =  $\operatorname{num}_{\underline{\phantom{a}}} |\operatorname{evel}(u) = 1 \quad \operatorname{num}_{\underline{\phantom{a}}} |\operatorname{evel}(x)| = \operatorname{num}_{\underline{\phantom{a}}} |\operatorname{evel}(x)| =$ 

 $\operatorname{num}_{\underline{}} |\operatorname{eve}_{\underline{}}(\underline{x}) = \operatorname{num}_{\underline{}} |\operatorname{eve}_{\underline{}}(\underline{x}) = \mathbf{5} \operatorname{num}_{\underline{}} |\operatorname{eve}_{\underline{}}(\underline{x}) =$  $num_{\underline{}} |eve|(x_{7}) = 2$ 



Fig. 3 An example of node level

Then we can obtain DNL( $\dot{\mu} \stackrel{X}{\downarrow} = 1$  DNL( $\dot{\mu} \stackrel{X}{\downarrow} \stackrel{X}{\downarrow} = 1$ 3.67,  $DNL(\ \ \frac{x}{3}) = \dots = DNL(\ \ \frac{x}{5}) = 3$ ,  $DNL(\ \ \frac{x}{6}) =$ 1. 5 DNL( $\frac{1}{7}$ )=3. 5.

Since DNL(y, x, x) is larger than others, it indicates that net ( u x x ) has a stronger dilatability than others

#### 2.3 Range of node level

If node levels are within a limited range the wirelength of the output net will likely to be less than that of a net with widely distributed node levels To capture the ranges of the node levels of net e we define the factor RNL(e) as

$$RNL(e) = \sum_{\substack{y \in e \\ \neq j}} \frac{|\text{eve } (v_i) - |\text{eve} (v_j)|}{d(e)}$$
(4)

If the nodes connecting the net have the same node level the RNL factor of the net is 0. In Fig. 3 from our defini tion, RNL( $\frac{u}{3}$ )=1, RNL( $\frac{u}{5}$ )=2 That is by the met ric net (u x) has a longer wire length than net (u x) Note that sometimes the RNL and DNL are approximately orthogonal. When the RNL factor of a net is of the DNL factor of the net may be very large

The predicted result of the apexy circuit is shown in Fg4. In these figures on the x axis are net id and on the y axis are net lengths. The solid line represents predicted wire.

ges in the final layout The factor DNL (e) used to measure ?1994-2015 China Academic Journal Electronic Publish onic Publishing House. All rights reserved. http://www.cnki.net

lengths and the points represent the actual wire lengths (in percentages, in the final placement using our HJP! The shape distribution of node level DNL and the range of node level RNL in estimation of circuits are presented in Figs 4

(b) and (c) As shown in Fig. 4 wirelengths of long inter connects increase rapidly with an increase in RNL DNL is more efficient for short interconnects



 $F \not E \not A \qquad Prediction \ results \ of \ apex \\ 2 \quad (a) \ \ I_{pre} \ vs. \ actual \ w \ ire \ leng \ th, \ (b) \ DNL \ vs. \ actual \ w \ ire \ leng \ th, \ (c) \ RNL \ vs. \ actual \ w \ ire \ leng \ th, \ (d) \ Connumber \ Appendix \ Prediction \ results \ of \ apex \\ 2 \quad (d) \ Connumber \ Appendix \ Prediction \ results \ of \ apex \\ 3 \quad (d) \ Connumber \ Appendix \ Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ results \ of \ apex \\ 4 \quad Prediction \ apex \\ 4 \quad Prediction \ apex \\ 4 \quad Prediction \ apex \\ 4 \quad P$ nectivity vs actual wirelength, (e) Edge separability vs actual wirelength

#### Individual wirelength 2.4

The wirelength of net e Lp, is its basic wirelength L<sub>hasic</sub> (e) adjusted by DNL(e) and RNL(e):

$$\int_{\mathbb{R}^{n}} (e) = a \times I_{\text{fasic}}(e) \times DNL(e) + b \times I_{\text{fasic}}(e) \times RNL(e)$$
(5)

$$I_{pre}(e) = \frac{\int_{pre} (e)}{\int_{max}}$$
 (6)

where a b are parameters that control the trade off between DNL and RNL; have is the largest hope (e) amongst all nets.

Most proposed wirelength predictions have poor results for long interconnects Fgs 4 (d) and (e) are the predicted results using connectivity and edge separability 10-11, which are not sensitive for long connections Fig 4 (a) shows that our approach has good predictions for bng interconnects

# Recursive Bisection Based Global Placement

Recursive bisection based placement algorithms seek to decompose a given placement instance into smaller in Partition the bin into smaller bins using WL ?1994-2015 China Academic Journal Electronic Publishing House. All rights reserved. http://www.cnki.net

stances by subdividing the placement region assigning cells to subregions reformulating constraints and cutting the net[st (see Fig 1). The top-down placement process can be viewed as a sequence of passes where each pass examines all blocks and if required divides them into two smaller blocks using min\_cut partitioning Such netlist decomposition is typically done with the min cut objective A novel clustering\_refinement multilevel partitioning algorithm by incorporating a min. wire length objective is proposed in this paper The global placement algorithm is as follows

GLOBAL PLACEMENT (H(V E), Layout)//Q: the queue of placement bins

While bin size is big enough

do Q→ a bin;

Choose a (horizontal or vertical) cut line for the

bin;

Partitioning attempts to split each bin roughly in half

Build partitioning hypergraph from netlist and cells contained in the bin.

Partition the bin into smaller bins using WL PAR-

TIF DN NG(
$$H_i(V_i E_i)$$
);  
 $//H_i(V_i E_i)$  is a sub-hypergraph of  $H(V_i E)$ 

Q each child bin;

for all finest bins//bin basedo swapping

do swap two bins

if HPW Lincreases then undo swapping:

# WL\_PARTITON NG(H(V E))

for  $\not\in E_{//}$  clustering  $p_hase$ 

do if the area of partition area the number of all cells num th

// area\_th is the area threshold; num\_th is them inimal specified number of cells after clustering

then Cluster the cells of nete in a nonincreasing net weight order

Initial random bisection,

for  $\leftarrow$  partition boundary refinement Phase

do Compute  $g = \lambda I_{pre} + (1 - \lambda) g$ ;

Choose cellmove according to FM\_scheme;

# 3.1 Wirelength-driven clustering

The nets are initially sorted in a nonincreasing netweight order which is computed as the predicted wirelength L The nets of the same weight are sorted in a nondecreasing net area order. Then the nets are visited in that order and for each net that connects cells that have not yet been grouped the cells are grouped together. Thus, this scheme gives preference to the nets that have large weight After all of the nets have been visited the groups of cells that have been matched are contracted together to form the next level coarser netlist The cells that are not parts of any contracted nets are simply copied to the next level coarser netlist

# Tim ing\_driven clustering

In addition timing objectives can be handled within our a gorithm by minim izing the critical path delay. The critical path delay can be determined by computing the arrival time iteratively making use of the following equation.

$$ARR(t) = \begin{cases} 0 & \text{epi} \\ \max_{s \in E} ARR(s) + d(s, t) & \text{otherwise} \end{cases}$$
 (7)

where d(s t) is the delay of edge (s t). The critical path delay is

$$T = \max_{\epsilon \to 0} ARR(t)$$
 (8)

Sim jarly we can also compute the required arrival time for each node using the following equation

$$REQ(s) = \begin{cases} T & \in FO \\ \underset{(s) \in E}{\text{min}} ARR(t) - d(s, t) & \text{otherwise} \end{cases}$$

$$(9)$$

We can now compute the slack value for each edge which measures how much additional delay can be added to an edge without increasing the critical path delay of the whole circuit The slack of a given edge (s t) can be computed as

$$\operatorname{slack}(s) = \operatorname{REQ}(s) - \operatorname{ARR}(s) - \operatorname{d}(s)$$
 (10)

where  $d(s, t) = r(s, t) (c(s, t)/2 + c(T_s)) c(s, t) =$  $(c_a W(s, t) + c_b) (s, t) r(s, t) = r_b (s, t)/W(s, t) (s, t) = r_b (s, t)/W(s, t)$  $I_{pre}(e)/d(e)$  w(s, t)  $c_s$   $c_s$  and  $r_s$  are wine width, area cal pacitance fringing capacitance and resistance for unitwidth wire respectively;  $T_v$  is the subtree rooted at v;  $c(T_v)$  is the capacitance of a dc connected subtree in T rooted at  $T_{v}$  's root Finally according to Eq. (10) we can obtain the net we ght with tining objective

weight (e) = 
$$1 - \sum_{s \in \mathbb{R}} \frac{\operatorname{slack}(s)}{s}$$
 (11)

where  $s_{max}$  is the largest slack sum over all nets. For optim izing timing objectives as the netweight weight (e) instead of I is embedded into the clustering Phase. The net with small slack will be protected by the timing-driven cluste. ring which can effectively minimize critical path delay dur ing placement

### 3 3 Wirelength-driven refinement

In the refinement phase the FM algorithm [2] is used to reduce the cutsize We change the original gain of FM by introducing a predicted wirelength objective to reduce the degeneration of the final wirelength. The wirelength-aware gain of FM swapping is given as

$$g_{w} = \lambda I_{pre} + (1 - \lambda) g \qquad (12)$$

where is a parameter that controls the trade\_off between wirelength and the orginal gain of FM which is less than 1 and larger than 0

# Bin\_based swapping

After clustering form in in izing the wirelengths of intrapartition interconnects bin-based simulated annealing is conducted to find a good beatin for each partition to be placed in thus minimizing the total wirelength between bins There are three types of moves in bin based simulated annealing horizontal switch vertical switch and diagonal switch These moves switch two adjacent bins

#### Detail Placement

In the detailed placement step overlaps between cells have to be resolved to obtain a legal placement When pla cing cells to remove overlaps we have to consider two conflicting factors, the total wirelength and the legality of re. sult We use a greed heuristic to obtain a legal placement while the local swapping tries to reduce the wirelength of the placement

After global placement we divide each row in the place. ment region into placeable segments based on the overlap of the blockages with the row We now use a greedy heuristic to bring every now in the placement region to within its ca pacity by moving the standard cells Once the rows have been brought under capacity we move the cells among the placeable segments to satisfy their respective capacities The cells are then assigned to legal positions within each seg. m ent

A fter all overhps are removed greedy local in Provement 94-2015 China Academic Journal Electronic Publishing House. All rights reserved. http://www.cnki.net

is performed We try to switch adjacent standard cells to see if total wire length can be improved. This step will cure the wire length loss caused by the blind cell spreading step of legalization. For each row, all the bins from left to right are traversed and an attempt is made to place in the bin onto the site array. If there are blocked sites, we ignore it and go to the right site of the chunk of blocked sites.

#### 5 Experiment

Our placement algorithm is implemented in C programming language and compiled with gcc 4.0 on a Linux PC with an Intel Pentium IV 1.5 GHz CPU and 768 MBm enc. BY These algorithms are tested using a set of benchmarks published in ISPD 2002 ISPD 2005 ISCA S89 and PEKQ In the first experiment we compare HJ-Pl with three well-known academic placement tools namely Capolo. 5<sup>[3]</sup>, mPI6<sup>[4]</sup> and NTUP lace 13 Moreover, the wirelength driven clustering algorithm is an individual soft package which can

embed other open source took For the second experiment we integrate the wirelength clustering with Capolo 5 due to its code availability. Our wirelength prediction tool estimates on MHPW L in all the experiments For the last experiment we compare the timing-driven HJ-Pl with the original HJ-Pl

### 5 1 Comparison with other placers

For all experiments we give an average of 10 runs and use the ISPD2002 benchmarks with 20% to alwhite space and random pad locations. The statistical information of benchmarks is listed in Tab 1. Tab 1 shows that HJPl obtains a HPWL in provement of 13% versus Capq 0.5 and is 5.24 times faster than Capol 0.5. The runtine of Capq 0.5 for the ibm 6 circuit is very long HJPl (WL) reduces runtime by 10% and 9% overmP16 and NTUP lace with 3% and 1% shorter wirelengths respectively

Tab 1 The resulting HPW L and run time for different placers

| C'mu't     | Ha all  | HJ-Pl                      |            | Capo <sub>10.</sub> 5                           |            | mPL6                                  |            | NTU p Jace                          |           |
|------------|---------|----------------------------|------------|-------------------------------------------------|------------|---------------------------------------|------------|-------------------------------------|-----------|
| Circuit    | #ce]]   | HPW L/(10 <sup>6</sup> μm) | Time/s     | $\overline{\text{HPWL}/(10^6\mu^{\mathrm{m}})}$ | Time/s     | $\overline{\text{HPWL/(10^6 \mu m)}}$ | Timey s    | $\overline{\text{HPWL/(10^6\mum)}}$ | Time/s    |
| ibm01      | 12 752  | 2. 02                      | 257. 17    | 2. 54                                           | 1 035. 84  | 2. 12                                 | 615. 38    | 2. 23                               | 97. 4     |
| $ibm_{02}$ | 19 601  | 4. 24                      | 315. 32    | 5. 63                                           | 1 585. 57  | 4. 63                                 | 765. 09    | 4. 74                               | 165. 1    |
| ibm07      | 45 926  | 9. 35                      | 1131. 41   | 11. 13                                          | 1 656. 26  | 9. 68                                 | 1 707. 49  | 10. 24                              | 993. 8    |
| 1000       | 51 309  | 11. 19                     | 2 483. 09  | 13. 55                                          | 5 006. 82  | 11. 22                                | 2 550. 87  | 11. 99                              | 798. 6    |
| ibm09      | 53 395  | 12. 19                     | 2 153. 11  | 14. 33                                          | 1 983. 73  | 12. 08                                | 2 777. 82  | 12. 30                              | 959. 8    |
| ibm10      | 69 429  | 28. 85                     | 3 031. 21  |                                                 |            | 29. 28                                | 3 701. 04  | 28. 85                              | 1 350. 3  |
| ibт11      | 70 558  | 17. 80                     | 3 069. 43  | 20. 02                                          | 6 835. 09  | 17. 43                                | 3 569. 79  | 17. <b>7</b> 9                      | 1 644. 9  |
| $ibm_{12}$ | 71 076  | 31. 97                     | 2 324. 90  | 38. 91                                          | 4 347. 09  | 32. 77                                | 2 119. 54  | 32. 86                              | 1 231. 2  |
| ibm16      | 183 484 | 51. 49                     | 5 485. 23  | 61. 70                                          | 98 889. 93 | 53. 45                                | 5 646. 05  | 55. 49                              | 6 286. 4  |
| $ibm_{18}$ | 210613  | 40. 17                     | 10 174. 31 | 44. 78                                          | 22 158. 01 | 41. 98                                | 10 152. 01 | 51. 17                              | 17 264. 3 |
| Ratio      |         | 1                          | 1          | 1. 13                                           | 5. 24      | 1. 03                                 | 1. 10      | 1. 09                               | 1. 01     |

### 5.2 Integration with Capo

We integrate our wire length driven clustering with Capolo 5 which is based on min cut placement techniques In addition the wire length driven clustering using Connectivity 10 -11 is embedded into Capolo 5 as a comparison with our method Tab 2 shows the results without and with wire length driven clustering based on the PEKO

benchmark with 10% total white space uniform cell sizes and random pad locations and the ISPD 2005 benchmarks. From Tab 2 the original Capolo 5 averages 3% more wirelength than Capolo 5 with wirelength-driven clustering and runs faster. The experimental results also reveal the fact that our wirelength prediction is superior to Connectivity for placement quality.

Tab 2 HPWL and runtine comparison of orginal Capo and Capowith wirelength-driven clustering

| Circuit               | Capo <sub>10.</sub> 5       | S(WL)      | Capo <sub>10.5</sub> ( co   | nnec tiv itv) | Capo <sub>10.</sub> 5       |            |  |
|-----------------------|-----------------------------|------------|-----------------------------|---------------|-----------------------------|------------|--|
| Chealt                | HPW L/(10 <sup>6</sup> μ m) | Time/s     | HPW L/(10 <sup>6</sup> μ m) | Timeys        | HPW L/(10 <sup>6</sup> μ m) | Time/s     |  |
| pek401                | 1 43                        | 671. 42    | 1. 56                       | 674. 78       | 1. 57                       | 514. 38    |  |
| $pek_{05}$            | 4 11                        | 1 423. 34  | 4. 97                       | 1 400. 17     | 4. 09                       | 1 138. 97  |  |
| $pek_{00}$            | 14. 11                      | 3 978. 12  | 15. 31                      | 4 309. 12     | 14. 52                      | 3 823. 33  |  |
| $pek_{015}$           | 45. 92                      | 39 871. 44 | 44. 88                      | 40 671. 02    | 46. 81                      | 38 625 60  |  |
| pekol8                | 38. 73                      | 12 150. 71 | 40. 01                      | 11 098. 15    | 39. 64                      | 11 427. 60 |  |
| adapteq               | 87. 43                      | 26 587. 89 | 88. 01                      | 27 564. 77    | 88. 72                      | 25 065 40  |  |
| adapteo2              | 96. 89                      | 33 158. 54 | 98. 90                      | 32453. 98     | 98. 25                      | 32 530 90  |  |
| bigb Jue <sub>l</sub> | 101. 14                     | 40 138. 23 | 104. 65                     | 41 453. 01    | 106. 85                     | 39 549 60  |  |
| Ratio                 | 1                           | 1          | 1. 02                       | 1 01          | 1. 03                       | 0. 96      |  |

#### 5 3 Tim ing-driven HLPl

The results without and with timing-driven clustering are shown in Tab 3. A fter placement (using the ISCAS89 bench mark) we perform global and detailed routing RC extraction

and tining analysis using commercial tools The electrical parameters have been chosen to resemble a typical 90 m process Tab 3 shows that HJ-Pl(T) with tining driven clustering reduces the critical path delay by about 23%.

Tab 3 Critical path delay comparison of orginal HJPl and HJPlw ith timing driven clustering

| Circuit | Critical path delay/ns |        |  |  |
|---------|------------------------|--------|--|--|
| Chealt  | HJ-Pl(T)               | HJ-Pl  |  |  |
| C6288   | 14. 21                 | 18. 34 |  |  |
| C7552   | 18. 77                 | 23. 90 |  |  |
| S9234   | 27. 96                 | 32. 89 |  |  |
| S13207  | 26. 33                 | 37. 41 |  |  |
| S15850  | 35. 47                 | 39. 19 |  |  |
| Ratjo   | 1                      | 1 23   |  |  |

### 6 Conclusion

A new method for standard cell placement is presented in this paper. The proposed method is based on a new min. cut partitioning with wirelength-driven clustering and refinement Experiments show that we can obtain shorter HPWL than Capo mPL and NTUP ace Our tool is also capable of doing power driven placement in the future

# References

- [1] Adya SN Markov II, Combinatorial techniques form ixed size Placement J. ACM Transactions on Design Automation of Electronic Systems, 2005 10(1):58-90
- [2] Doll K, Johannes FM, Antreich K, J. Iterative Placement improvement by nework flow methods [3]. EEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 1994, 13(10): 1189-1200
- [3] Roy JA, Papa DA, NgAN, et al Satisfying whitespace requirements in top-down placement Q//Proc of Internation.
  al Symposium on Physical Desgn. San Jose California
  USA: EEE Press 2006: 206-208
- [4] Wang M, Yang X, Sarrafzadeh M, Dragon 2000: fast standard-cell placement for large circuit C // Proc of the International Conference on Computer Aided Design San Jose

California USA 2000: 260-263

- [5] Vygen J A Borithms for large scale flat placement C<sub>J</sub> // Proc of Design Automation Conference Anahem, Califor n a USA 1997: 746-751.
- [6] Chan T Cong J Joseph R et al mPL6: enhanced multi level mixed-size placement C //Proc of International Symposium on Physical Design San Jose California USA 2006: 212-214
- [7] Donath W. E. Placement and average interconnection lengths of computer logiq. J. EFE Transactions on Circuits and System, \$1979. 26(4):272-277.
- [8] Callwell A.E. Kahng A.B. Mantik S. et al. On wire length estimations for row-based placement. J. IEEE Transactions on Computer. Aided Design of Integrated Circuits and Systems 1999 18(9): 1265-1278
- [9] Balachandran S Bhatia D A proriwire length interconnect estimation based on circuit characteristics J. IEEE Transactions on Computer Aided Des En of Integrated Circuits and System, s 2005 24(7):1054-1065.
- [10] HuB Magorzata Marck-Sadowaka Wire length Prediction based clustering and its application in placement C]//Proc of Design Automation Conference Anahem, California USA 2003: 800-805
- [11] Liu Q Magorzata Marck-Sadowaka Senindividual wirelength prediction with application to pgic synthesis J. EEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2006 25(4):611-624
- [12] Fiduccia C M, Mattheyses R M. A linear time heuristic for improving network partitions C]//Proc of Design Automation Conference Las Vegas, Nevada USA 1982:175-181.
- [13] Chen Tung-Chieh Hsu Tien-Chang Jiang Zhe-Wei et al NTUP lace a ratio partitioning based placement algorithm for large scale mixed size designs [C]//Proc of International Symposium on Physical Design San Francisco California USA 2005: 236-238

# 预测线长驱动的二分布局算法

蒿 杰 马 鸿 彭思龙

(中国科学院自动化研究所国家专用集成电路设计工程研究中心,北京 100190)

摘要: 为了有效提高布局质量,提出一种基于预测线长的二分布局算法 HJP.l该算法对长、短互连线都有很好的预测效果.通过将预测线长嵌入到布局框架中,使算法可以预先控制布局后可能产生的长互连线,有效降低它们在布局过程中被分割的几率,从而达到减小总线长的目的.另外,该算法通过最小化关键通路时延,得到了较好的时序优化效果.实验表明,与现有的  $Capo_{10.5}$ , NTUPlace  $mPI_6$  算法相比,该布局算法可分别减小线长 13%, 3%, 9%.将预测线长目标集成到  $Capo_{10.5}$ 中可减小线长 3%.带有时序驱动功能的 HJP 可以减小关键通路时延 23% 左右.

关键词: 层次 化; 互连线; 布局; 超大规模集 成电路; 线长 预测中图分类号: TP391. 72