Technical press coverage traditionally focuses on bleeding-edge nodes because that’s where the biggest challenges are. But today, our industry sits astride two distinct paths. The first is next-generation process nodes. The second is mature nodes, which are getting extended life thanks to exploding markets such as the Internet of Things.
And those mature nodes, along with their attendant methodologies, are constantly being refreshed to address evolving ASIC/FPGA design demands. To get a sense of what’s required, I chatted with Nilesh Ranpura, engineering manager at eInfochips, about his company’s experience in physical design (netlist-to-GDSII implementation).
A: We did physical design for 40+ blocks in a recent project. Block sizes varied from 400K to 2M instances, with an average of 90 memory modules per block. Macro placement was done manually, based on data flow diagrams. Static timing analysis (STA) signoff, across more than 10 timing scenarios with crosstalk and on-chip variation (OCV) analysis, is being carried out in PTSI.
A: Congestion was one. We identified high local congestion in the core area. Analysis showed that the flops had high connectivity across modules. The netlist was modified, and paths had to take detours because of the logic cell placement density. A global clock channel was also present, so macros could not be placed properly.
Then we had to deal with long runtime iterations. Steps like placement, clock tree synthesis and routing require weeks of compute runtime for each block in the chip. That runtime compels designers to act smartly and think twice before firing off any new iteration. The chip schedule cannot sustain more than two iterations, and most blocks in our design have an implementation runtime of over two weeks!
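To make the iteration budget concrete, here is a minimal Python sketch of the arithmetic. Only the "over 2 weeks" runtime per block comes from the interview; the 5-week schedule window and the function name `max_iterations` are assumptions chosen purely for illustration, and the sketch assumes blocks run in parallel so that one block's runtime sets the pace.

```python
def max_iterations(schedule_weeks: float, runtime_weeks_per_iteration: float) -> int:
    """Return how many complete implementation iterations fit in the schedule."""
    if runtime_weeks_per_iteration <= 0:
        raise ValueError("runtime per iteration must be positive")
    # Only whole iterations count: a partially finished run produces no usable result.
    return int(schedule_weeks // runtime_weeks_per_iteration)


# Hypothetical schedule window of 5 weeks; 2 weeks per iteration is the
# lower bound quoted in the interview ("over 2 weeks").
iterations = max_iterations(schedule_weeks=5.0, runtime_weeks_per_iteration=2.0)
print(iterations)  # 2
```

With numbers like these, a third iteration simply does not fit, which is why each run has to be planned carefully before it is launched.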
And then there was timing closure. Memory-to-flop paths have many logic levels and a memory delay of ~400ps. We needed to create placement regions near the memories to reduce buffering in those paths and achieve timing closure. Clock gating issues also caused high congestion in localized areas.
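The effect of buffering on a memory-to-flop path can be sketched with a simplified setup-slack calculation. Only the ~400ps memory delay comes from the interview; the 1000ps clock period, the logic, buffer and setup values, and the `MemToFlopPath` class are hypothetical, and the model ignores clock skew, uncertainty and OCV derating.

```python
from dataclasses import dataclass


@dataclass
class MemToFlopPath:
    """Simplified memory-to-flop path; all delays in picoseconds."""
    memory_access_ps: int  # clock-to-output delay of the memory
    logic_ps: int          # combinational logic between memory and flop
    buffer_ps: int         # delay added by buffers and long wires
    setup_ps: int          # setup time of the capturing flop

    def arrival_ps(self) -> int:
        return self.memory_access_ps + self.logic_ps + self.buffer_ps

    def slack_ps(self, clock_period_ps: int) -> int:
        # Positive slack means the path meets setup timing.
        return clock_period_ps - self.arrival_ps() - self.setup_ps


CLOCK_PERIOD_PS = 1000  # hypothetical 1 GHz clock

# Logic placed far from the memory: more buffering on the path (hypothetical values).
far = MemToFlopPath(memory_access_ps=400, logic_ps=450, buffer_ps=200, setup_ps=50)
# Logic kept in a placement region near the memory: less buffering.
near = MemToFlopPath(memory_access_ps=400, logic_ps=450, buffer_ps=60, setup_ps=50)

print(far.slack_ps(CLOCK_PERIOD_PS))   # -100
print(near.slack_ps(CLOCK_PERIOD_PS))  # 40
```

Because the memory delay and the logic depth are fixed, buffering is the term that placement can actually reduce, which is the motivation for placement regions near the memories.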
A: Chip and block floorplanning has to handle a huge number of macros or hard blocks of various shapes and sizes. None of the current floorplanning tools provides an ideal macro placement solution. With the right macro placement, one can get the best standard cell placement.
The number of parameters that need to be controlled and minimized has increased substantially. Previously, timing, area, and power were the three main parameters. But today, many more design constraints are being added: different placement rules for different types of standard cells and macros, local power drop targets on top of the global targets, and special clock tree design to take care of skew and to minimize the number of buffers on the clock tree. We also need to avoid signal integrity issues and electromigration on power as well as signal nets, as we check timing on hundreds of PVT corners.
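A short Python sketch shows how the number of timing scenarios multiplies into the hundreds. Every list below (process, voltage, temperature, extraction corner and mode names) is a hypothetical example, not the set used on the project described here.

```python
from itertools import product

# Hypothetical corner and mode definitions, for illustration only.
PROCESS = ["ss", "tt", "ff"]
VOLTAGE_V = [0.81, 0.90, 0.99]
TEMPERATURE_C = [-40, 25, 125]
EXTRACTION = ["cworst", "cbest", "rcworst", "rcbest", "typical"]
MODES = ["functional", "scan_shift", "scan_capture"]


def enumerate_scenarios() -> list[dict]:
    """Build the full cross product of modes and PVT/extraction corners."""
    return [
        {"mode": m, "process": p, "voltage_v": v, "temperature_c": t, "extraction": x}
        for m, p, v, t, x in product(MODES, PROCESS, VOLTAGE_V, TEMPERATURE_C, EXTRACTION)
    ]


scenarios = enumerate_scenarios()
print(len(scenarios))  # 405 = 3 modes * 3 * 3 * 3 * 5 corners
```

In practice a team would usually prune this cross product down to the dominant scenarios rather than run all of them, but the raw count explains why signoff effort grows so quickly.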
At the end of the day, given these challenges, the development of the overall implementation flow and the maturity of the flow checklists determine the success or failure of the design. We have matured our design flow checklists over dozens of projects.
We have addressed these challenges for global clients.
A: Yes, I will host a session on ‘Implementation Challenges for Large 28nm SoCs’ from 9:30 am to 10:10 am. The session will cover these challenges and also explore our solution approach and the results we achieved.
A: We’ll see you there! We have a booth and if your readers would like to schedule a meeting, they can write to firstname.lastname@example.org.