PHASE 1 — OpenLANE Flow Familiarity
The initial steps to invoke the OpenLANE flow.
Prepping the design involves setting the environment variables, such as the PDK, the SDC file, and the standard cell library used. The statistics below show the synthesis report of the standard cells used to synthesize the picorv32a netlist.
Flop ratio = number of flip-flops / total number of cells = 1613/15762 ≈ 10.2%.
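The flop ratio arithmetic can be reproduced with a few lines of Python. This is only a sketch: the function name `flop_ratio` is an illustrative choice, and the two counts are the ones quoted from the synthesis report above.

```python
def flop_ratio(num_flops: int, total_cells: int) -> float:
    """Return the fraction of cells in the netlist that are flip-flops."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return num_flops / total_cells


# Counts quoted from the picorv32a synthesis report above.
ratio = flop_ratio(1613, 15762)
print(f"Flop ratio: {ratio:.1%}")  # prints "Flop ratio: 10.2%"
```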
The synthesis report shows the area occupied by the logic.
Timing report generated after the synthesis step. Both TNS and WNS are zero, and the worst slack is 0.52 ns. The library used is the ``nom`` (typical) library.
PHASE 2 — Floorplan Fundamentals
Aspect ratio and utilization factor in the config.tcl file.
Die area from the DEF file generated after the floorplan step.
Changing the utilization factor to 30% and then running the floorplan again.
The die area has changed, as can be observed from the new DEF file.
RAM is a macro, which is different from a standard cell. Macros are placed along with the preplaced cells during floorplanning, before standard cell placement. During the placement stage, the standard cells are placed. Macros are black-boxed: they are already implemented, so only their input and output pins are visible to the tool.
PHASE 3 — Timing Literacy with Ideal Clocks
The pre_sta.conf file used to run timing analysis with OpenSTA.
After replacing a cell that drives a large fanout of 10.
The new slack value from the report_checks command.
Changing the synthesis strategy from "AREA 0" to "DELAY 3" and then rerunning the timing analysis changes the slack value to -2.79 ns.
PHASE 4 — CTS and Timing with Real Clocks
Entering the OpenROAD interactive shell to load the post-CTS database and timing libraries. The crucial step here is executing the set_propagated_clock command. This instructs the timing engine to stop using "ideal" clock estimates (zero delay) and start calculating the actual physical wire and buffer delays introduced during Clock Tree Synthesis.
A detailed STA report for the worst-case hold path. Real physical clock skew can sometimes cause a destination flip-flop to receive the clock edge too late, causing fast-moving data to overwrite previous values. This results in a hold timing violation, which the tool will later repair by inserting delay buffers on the data path during the routing stages.
A detailed Static Timing Analysis (STA) report for the worst-case setup path. Notice that the clock path now explicitly lists actual physical delays through standard cell clock buffers (like sky130_fd_sc_hd__clkbuf_8 and sky130_fd_sc_hd__clkbuf_16). The introduction of real-world clock delays has exposed a setup violation (negative slack), which is normal at this stage and will be targeted during post-CTS optimization.
Using the report_clock_skew command for both setup and hold checks. Because physical clock buffers and wires are now in the design, the clock signal no longer arrives at all flip-flops at exactly the same time. This report quantifies that physical skew (for example, showing 1.07 ns of skew for setup), which is a realistic reflection of the physical layout.
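To make the effect of skew on the two checks concrete, here is a minimal sketch of the textbook setup and hold slack equations. The function names and every number in it are hypothetical and are not taken from the reports above; OpenSTA's own calculation also accounts for effects such as uncertainty and derating that are left out here.

```python
def setup_slack(period, launch_clk, capture_clk, data_delay_max, setup_time):
    """Setup slack (ns): data must arrive before the next capture edge."""
    required = period + capture_clk - setup_time
    arrival = launch_clk + data_delay_max
    return required - arrival


def hold_slack(launch_clk, capture_clk, data_delay_min, hold_time):
    """Hold slack (ns): data must not change too soon after the capture edge."""
    required = capture_clk + hold_time
    arrival = launch_clk + data_delay_min
    return arrival - required


# Hypothetical numbers in ns, chosen only to illustrate the effect of skew.
ideal = hold_slack(launch_clk=0.0, capture_clk=0.0, data_delay_min=0.3, hold_time=0.1)
skewed = hold_slack(launch_clk=1.0, capture_clk=1.5, data_delay_min=0.3, hold_time=0.1)
print(f"hold slack, ideal clock:      {ideal:+.2f} ns")
print(f"hold slack, propagated clock: {skewed:+.2f} ns")
```

With an ideal clock the same path passes the hold check; once the capture flip-flop sees its clock edge 0.5 ns later than the launch flip-flop, the slack turns negative.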
PHASE 5 — PDN Awareness
In the RTL-to-GDSII flow, the PDN is built immediately after floorplanning (and sometimes macro placement), long before a single data signal is routed. Power and ground (VDD and VSS) need to travel across the entire chip with the least possible resistance. Therefore, they claim the top, thickest metal layers.
PHASE 1 — ORFS Execution in GitHub Codespaces
Cloned the repo https://github.com/vsdip/vsd-scl180-orfs and then launched GitHub Codespaces. This task verifies that the OpenROAD devcontainer environment was successfully built in the cloud and that all necessary physical design tools are correctly installed and accessible in the system path.
1. Toolchain Version Verification
Terminal output displaying the successful installation and version checks for the core toolchain components within the GitHub Codespaces container: OpenROAD (v2.0), Yosys (0.58), Python (3.10.12), and GNU Make (4.3).
2. Synthesis Execution
The execution log of the synthesis stage showing the Yosys engine successfully parsing the Verilog RTL, linking the Sky130 standard cell libraries, and generating the initial OpenROAD database (1_synth.odb).
3. Synthesis Statistical Report
The synthesis statistical report detailing the post-synthesis metrics. It shows an initial estimated design area of 61,097 µm² at 100% utilization before physical floorplanning and placement expand the core.
4. Floorplan Initialization Log
The floorplan execution log confirming the core area initialization, standard cell track generation, and an initial design area of 61,496 µm² with 46% utilization.
5. Power Delivery Network (PDN) Generation
Terminal output showing the successful insertion of the Power Delivery Network (PDN) grid, which is essential for routing VDD and VSS across the chip with minimal voltage drop.
6. Physical Placement Execution
Terminal log for the global and detailed placement stages, showing standard cells legally placed into rows with an updated design area of 69,948 µm² and a core utilization of 52%.
7. Detailed Placement GUI Visualization
The OpenROAD GUI rendering the detailed placement database, visually confirming that all standard cells are packed and snapped to the site rows within the defined core boundary.
8. Clock Tree Synthesis (CTS) & Resizing
The Clock Tree Synthesis log showing buffer insertions and aggressive resizing to repair timing violations, resulting in a design area of 75,857 µm² at 57% utilization.
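The design area and utilization figures quoted in the floorplan, placement, and CTS logs above can be cross-checked against each other. The sketch below assumes that the reported utilization is simply design area divided by core area; the helper name is illustrative, and because the logs print utilization as a whole percentage, the result is only approximate.

```python
def implied_core_area(design_area_um2: float, utilization_pct: float) -> float:
    """Core area implied by a design area and its reported utilization."""
    return design_area_um2 / (utilization_pct / 100.0)


# (design area in um^2, utilization in %) as quoted in the logs above.
stages = {
    "floorplan": (61496, 46),
    "placement": (69948, 52),
    "cts": (75857, 57),
}

for stage, (area, util) in stages.items():
    print(f"{stage:10s} implied core area ~ {implied_core_area(area, util):,.0f} um^2")
```

All three stages imply a core area of roughly 133,000 to 135,000 µm², which is what one would expect if the core stays fixed and only the cell area grows as buffers are added and cells are resized.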
9. Routing Stage Completion
Terminal output confirming the successful completion of the global and detailed routing stages, followed by the insertion of 10,342 filler cells.
10. Routing Violations Check
The routing antenna and DRC checker output showing zero net violations and zero pin violations, indicating a clean routing database.
11. Detailed Routing GUI Visualization
The OpenROAD GUI showcasing the massive, multi-layered maze of metal interconnects successfully routing data signals and the clock tree across the standard cells.
12. Final GDSII Generation
The final, fab-ready .gds binary file rendered visually via KLayout in the Codespaces environment, marking the successful end of the automated physical layout flow.
13. Total Execution Runtime
The final make elapsed summary table aggregating the exact wall-clock compute time. The complete cloud-based RTL-to-GDS flow successfully finished with a total runtime of 2189 seconds (~36.5 minutes).
14. Final Post-Route Timing Report
The final post-route timing report. After the engine's optimizations, the design is left with a Worst Negative Slack (WNS) of -0.57 ns and a Total Negative Slack (TNS) of -10.31 ns, so some paths still miss the target clock period.
PHASE 2 — Toolchain Understanding (Devcontainer Deep Dive)
| S.No | Tool Name | Installed From | Purpose in the flow | Stage used |
|---|---|---|---|---|
| 1 | OpenROAD | Compiled from source | Bind all the tools for PnR flow | All the stages after Synthesis |
| 2 | Yosys | Compiled from source | Synthesizing the netlist | Synthesis |
| 3 | TritonCTS | Integrated within OpenROAD | Generation of Clock Tree | During CTS |
| 4 | FastRoute | Integrated within OpenROAD | Global Routing | Routing |
| 5 | OpenSTA | Integrated within OpenROAD | Static Timing Analysis | Timing check at all stages |
| 6 | KLayout | Compiled from source | Layout Viewer | Final sign-off & GDS generation |
| 7 | Python | Package Manager | Calculating Run times | Reporting & log creation |
| 8 | Make | Package Manager | Running flow scripts | All the stages |
| 9 | Git | Package Manager | Version Control | Environment setup |
ORFS automates the passing of data between completely separate point tools (Yosys -> OpenROAD -> KLayout). It automates the complete RTL2GDSII flow using Makefiles. Using one command "make all", it generates all the reports and final GDS with the required results.
It provides a make target for every step of the flow, such as make synth, make floorplan, make place, and make gui_place. Each target runs a script that takes its inputs from the correct locations and writes its results to the appropriate location.
The true power of a Makefile is Dependency Tracking. If you edit your Verilog file, the Makefile knows it has to rerun Yosys. It orchestrates the flow by tracking the timestamps of input and output files to ensure steps are only run when necessary.
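The timestamp rule described above can be sketched in a few lines of Python. This is an illustration of the idea rather than how GNU Make is implemented; the function name `needs_rebuild` and the file names in the demo are made up for the example.

```python
import os
import tempfile


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


# Demo with throwaway files; the file names are illustrative only.
with tempfile.TemporaryDirectory() as workdir:
    rtl = os.path.join(workdir, "design.v")
    netlist = os.path.join(workdir, "1_synth.v")

    open(rtl, "w").close()
    missing = needs_rebuild(netlist, [rtl])      # no netlist yet

    open(netlist, "w").close()
    os.utime(rtl, (100, 100))                    # RTL written first
    os.utime(netlist, (200, 200))                # netlist written later
    up_to_date = needs_rebuild(netlist, [rtl])   # nothing to do

    os.utime(rtl, (300, 300))                    # RTL edited after synthesis
    after_edit = needs_rebuild(netlist, [rtl])   # synthesis must rerun

print(missing, up_to_date, after_edit)  # prints "True False True"
```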
Synthesis means converting RTL into a gate-level netlist. Yosys creates the netlist. The SDC constraints and the netlist from synthesis are merged into a single .odb file, which is the input for the place-and-route steps.
Physical design begins when OpenROAD starts up and imports three separate things: the Yosys Verilog netlist, the technology files (LEF/LIB), and the SDC constraints (a separate file). Once OpenROAD has read all three pieces into memory to initialize the floorplan, it saves that combined state as the very first 1_synth.odb database.
STA is run at every stage. After synthesis and after placement it is pre-layout STA with ideal clocks; after CTS it is pre-layout STA with actual (propagated) clocks; after routing it is post-layout STA.
OpenROAD finishes by producing a .def or .odb file, which is an abstract description of where the cells and wires are (DEF is text, ODB is binary). The GDS is produced during the make finish stage. A separate tool (KLayout or Magic) takes the abstract OpenROAD routing data and merges it with the exact polygon shapes of the cells provided in the foundry PDK (SkyWater) to draw the final mask layout (.gds).
PHASE 3 — Local Installation (Self-Owned Environment)
The image below shows the tools that are locally installed to run the flow.
The OpenROAD toolchain supports two installation approaches: a pre-built binary installation (used in the Cloud Codespaces environment) and a manual compilation from source (used for the local Ubuntu VM).
The dependencies required to compile OpenROAD from source.
Successful compilation of the OpenROAD toolchain from source. The terminal indicates a 100% build completion and verifies the executable by querying the installed version (26Q2-1229-g08035c8730).
PHASE 4 — Re-Run RTL-to-GDS Locally
This phase documents the successful local execution of OpenROAD Flow Scripts (ORFS) from RTL synthesis all the way down to final GDSII generation, including intermediate database inspections and timing analysis.
The synthesis report generated by Yosys shows the mapping of the Verilog code to the sky130_fd_sc_hd standard cell library. According to the log, the total estimated chip area for the top module 'riscv' is 61,480.21 µm², with sequential elements (like flip-flops) occupying roughly 51.32% of that total area.
2. Physical Placement (DEF Visualization)
The OpenROAD GUI displaying the chip layout after the global placement stage. The standard cells are now legally placed into the standard cell rows (visible as horizontal tracks) within the core boundary, establishing the physical footprint of the logic gates before routing begins.
3. Clock Tree Synthesis (CTS) Log
A snippet from the CTS execution log confirming the physical dimensions of the chip: a design area of 77,272 µm² with a core utilization of 57%.
4. Post-CTS Timing Reality Check
The timing report immediately following Clock Tree Synthesis. The introduction of real physical clock buffers and wire delays exposes significant clock skew, resulting in a Worst Negative Slack (WNS) of -0.86ns and a Total Negative Slack (TNS) of -42.04ns.
The ultimate output of the physical design flow. The fully routed, multi-layer .gds binary file opened in KLayout, ready for fabrication handoff.
6. Post-Route Timing Optimization
The final post-route timing report. Notice the large improvement in TNS, from -42.04 ns after CTS to -16.28 ns. This highlights the timing repairs, gate sizing, and buffer insertions the engine performs during the routing phase to fix the violations exposed during CTS.
A directory listing validating the successful generation of all required databases (DEF/ODB), logs, and reports across every stage of the flow.
The master summary table generated at the conclusion of the flow. It aggregates the total elapsed wall-clock time, peak memory usage, and the secure hashes for every step of the local build.
| Metric | Cloud | Local |
|---|---|---|
| Runtime (s) | 2189 | 6855 |
| WNS (ns) | -0.57 | -0.76 |
| TNS (ns) | -10.31 | -16.28 |
| GDS Generated | Yes | Yes |
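The comparison in the table can be summarized numerically. The sketch below only restates the table values; the dictionary layout and variable names are illustrative choices.

```python
# Values copied from the cloud vs. local comparison table above.
runs = {
    "cloud": {"runtime_s": 2189, "wns_ns": -0.57, "tns_ns": -10.31},
    "local": {"runtime_s": 6855, "wns_ns": -0.76, "tns_ns": -16.28},
}

runtime_ratio = runs["local"]["runtime_s"] / runs["cloud"]["runtime_s"]
wns_gap = runs["local"]["wns_ns"] - runs["cloud"]["wns_ns"]
tns_gap = runs["local"]["tns_ns"] - runs["cloud"]["tns_ns"]

print(f"Local run took {runtime_ratio:.2f}x the cloud runtime")
print(f"Local WNS differs by {wns_gap:+.2f} ns, TNS by {tns_gap:+.2f} ns")
```

The local run takes a little over three times as long as the cloud run and ends with somewhat worse slack; both runs produce a GDS.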
PHASE 5 — Debugging and Unix Literacy
During the RTL-to-GDSII flow, navigating thousands of lines of logs and deeply nested directories requires basic Unix literacy. Here are three incredibly powerful commands used to extract physical design metrics efficiently:
Description: Short for "concatenate," cat is used to instantly read the contents of a file and print it directly to the terminal. In physical design, it is perfect for quickly reading short summary files, like the final timing reports, without having to open a heavy text editor.
Description: The find command is a search engine for your terminal. When an automated flow generates hundreds of intermediate database files, find allows you to instantly locate a specific file (like a .def or .gds) by searching through the current directory and all sub-directories.
Description: Short for "Global Regular Expression Print," grep is arguably the most used Unix command in VLSI. It allows you to search for a specific word or phrase inside a massive file and prints only the lines containing that phrase. Here, it is used to instantly extract the exact "Design area" metric from a massive synthesis log file without scrolling manually.
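The same find and grep steps can be scripted when metrics need to be collected from many runs. The sketch below is a rough Python equivalent, not part of the flow itself. It assumes the log prints the metric on a line shaped like `Design area 61496 u^2 46% utilization.`; the regular expression, function names, and sample text are illustrative.

```python
import re
from pathlib import Path

# Assumed line shape: "Design area <number> <unit> <number>% utilization."
DESIGN_AREA_RE = re.compile(r"Design area\s+([\d.]+)\s+\S+\s+([\d.]+)%\s+utilization")


def find_files(root, pattern):
    """Like `find root -name pattern`: search root and all sub-directories."""
    return sorted(Path(root).rglob(pattern))


def grep_design_area(log_text):
    """Like `grep "Design area"`: return (area, utilization %) per matching line."""
    matches = []
    for line in log_text.splitlines():
        found = DESIGN_AREA_RE.search(line)
        if found:
            matches.append((float(found.group(1)), float(found.group(2))))
    return matches


# Made-up log excerpt; only the "Design area" line format matters here.
sample_log = "Some earlier log line\nDesign area 61496 u^2 46% utilization.\nDone.\n"
print(grep_design_area(sample_log))  # prints "[(61496.0, 46.0)]"
```

In practice, `find_files("logs", "*.log")` would supply the paths and each file's text would be passed to `grep_design_area`; the `logs` directory name is an assumption about the layout.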
PHASE 1 — Analyze the Top-Level Wrapper
The top-level module of the design is user_project_wrapper. This module acts as an interface between the user project and the Caravel SoC. It connects the design to the Wishbone bus, GPIO pins, logic analyzer signals, and interrupt outputs.
Used for communication between the CPU and the user logic.
- wb_clk_i -> Clock input
- wb_rst_i -> Reset input
- wbs_stb_i, wbs_cyc_i, wbs_we_i -> Control signals
- wbs_dat_i, wbs_adr_i -> Input data and address
- wbs_ack_o, wbs_dat_o -> Output response
Used for interaction with external devices.
- io_in -> Input pins
- io_out -> Output pins
- io_oeb -> Output enable
Used for internal debugging and signal observation.
- la_data_in
- la_data_out
- la_oenb
Used to send interrupt signals to the processor.
user_irq
- Clock: wb_clk_i
- Reset: wb_rst_i
- Target clock frequency: 100 MHz
- Clock period: 10 ns
The following modules are instantiated inside the wrapper:
debug_regs
This module implements debug registers accessible through the Wishbone interface.
user_project_gpio_example (optional)
Instantiated only when GPIO_TESTING is enabled.
user_project_la_example (optional)
Instantiated only when LA_TESTING is enabled.
user_project_wrapper
├── debug_regs
├── user_project_gpio_example (optional)
└── user_project_la_example (optional)
The following RTL files are required for synthesis:
- user_project_wrapper.v
- debug_regs.v
- defines.v
- user_project_gpio_example.v (optional)
- user_project_la_example.v (optional)
The modules must be compiled in the following order:
1. debug_regs.v
2. user_project_gpio_example.v (optional)
3. user_project_la_example.v (optional)
4. user_project_wrapper.v
The wrapper depends on the lower-level modules.
PHASE 2 — Prepare the ORFS Design Environment
The directory structure of the files required to run the flow is given below. The config.mk file also specifies fixed die and core areas instead of using core_utilization to determine the chip area. The core and die areas are chosen to accommodate all I/O pins with the required margin, as the previous run resulted in an error with the earlier area settings.
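When die and core areas are fixed by hand, it helps to check that the core sits inside the die with the intended margin before launching the flow. The sketch below assumes the usual ORFS convention of giving DIE_AREA and CORE_AREA as "llx lly urx ury" strings in microns. The coordinates are hypothetical and are not the values used in this project.

```python
def parse_box(text):
    """Parse an 'llx lly urx ury' string into four floats (microns)."""
    llx, lly, urx, ury = (float(value) for value in text.split())
    if urx <= llx or ury <= lly:
        raise ValueError(f"degenerate box: {text!r}")
    return llx, lly, urx, ury


def core_margin(die_area, core_area):
    """Smallest gap between core and die edges; negative if the core sticks out."""
    d_llx, d_lly, d_urx, d_ury = parse_box(die_area)
    c_llx, c_lly, c_urx, c_ury = parse_box(core_area)
    return min(c_llx - d_llx, c_lly - d_lly, d_urx - c_urx, d_ury - c_ury)


# Hypothetical values, for illustration only.
margin = core_margin("0 0 500 500", "20 20 480 480")
print(f"Core-to-die margin: {margin} um")  # prints "Core-to-die margin: 20.0 um"
```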
Tools used
Yosys
OpenROAD
The following executable paths were used for the complete RTL-to-GDSII flow:
/home/vsdsquadron/workspace/yosys/yosys
/home/vsdsquadron/workspace/OpenROAD/build/bin/openroad
PHASE 3 — Apply 100 MHz Clock Constraint
The objective of Phase-3 is to define and apply a 100MHz clock constraint for the design. This enables timing-driven synthesis and ensures that the design meets the required operating frequency.
The clock signal identified in the top-level module user_project_wrapper is wb_clk_i.
The signal wb_clk_i serves as the primary clock input and drives all synchronous elements in the design through the Wishbone interface.
The clock constraint is defined in the Synopsys Design Constraints (SDC) file as:
create_clock -name wb_clk_i -period 10 [get_ports wb_clk_i]
This constraint specifies the clock characteristics for timing analysis.
- File: constraint.sdc
- Content: create_clock -name wb_clk_i -period 10 [get_ports wb_clk_i]
The parts of the command are:
- create_clock -> Defines a clock for timing analysis
- -name wb_clk_i -> Assigns a name to the clock
- -period 10 -> Specifies the clock period in nanoseconds
- [get_ports wb_clk_i] -> Applies the constraint to the clock input port
This ensures that all timing paths are evaluated with respect to the defined clock period.
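The period in the constraint follows directly from the target frequency (period in ns = 1000 / frequency in MHz). As a small sketch, the constraint line can be generated from the frequency so the two never drift apart; the helper names are illustrative and are not part of ORFS.

```python
def clock_period_ns(freq_mhz: float) -> float:
    """Convert a clock frequency in MHz to a period in ns."""
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / freq_mhz


def create_clock_sdc(port: str, freq_mhz: float) -> str:
    """Build a create_clock line for a clock entering on the given port."""
    period = clock_period_ns(freq_mhz)
    return f"create_clock -name {port} -period {period:g} [get_ports {port}]"


print(create_clock_sdc("wb_clk_i", 100))
# prints "create_clock -name wb_clk_i -period 10 [get_ports wb_clk_i]"
```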
During the ORFS flow, the constraint file is read and applied during synthesis and timing analysis.
The clock constraint is used to:
- Perform static timing analysis (STA)
- Optimize logic to meet timing requirements
- Evaluate setup and hold constraints
- Generate timing reports
Successful application of the clock constraint is confirmed when:
- The clock is detected in timing reports
- No missing clock warnings are reported
- Timing analysis (STA) is performed
- Slack values are computed for timing paths
PHASE 4 — Run the RTL-to-GDS Flow
This section documents the complete RTL2GDSII flow executed for the user_project_wrapper design, along with screenshots captured at each major stage of the flow.
In the synthesis stage, the RTL design is converted into a gate-level netlist using the standard cell library.
In the floorplanning stage, the core area, die area, and initial physical layout constraints are defined for the design. They are set explicitly because letting the utilization factor determine the area results in an error in the placement stage.
In the placement stage, the synthesized standard cells are placed inside the core area based on timing and congestion considerations.
In the CTS stage, clock buffers are inserted to distribute the clock signal properly and reduce clock skew across the design.
The minimum clock period is 2.53 ns, corresponding to a maximum frequency of 394.48 MHz.
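The relation between the minimum period and the maximum frequency can be checked with a short sketch. The function names are illustrative. The second helper uses the common approximation that the minimum period is the target period minus the worst setup slack, which ignores effects the real tool accounts for.

```python
def fmax_mhz(min_period_ns: float) -> float:
    """Maximum clock frequency in MHz for a given minimum period in ns."""
    return 1000.0 / min_period_ns


def period_from_slack(target_period_ns: float, worst_setup_slack_ns: float) -> float:
    """Approximate minimum period: target period minus worst setup slack."""
    return target_period_ns - worst_setup_slack_ns


print(f"fmax for a 2.53 ns period: {fmax_mhz(2.53):.2f} MHz")
print(f"min period for 7.52 ns slack at 10 ns: {period_from_slack(10.0, 7.52):.2f} ns")
```

A period of exactly 2.53 ns gives about 395.3 MHz. The reported 394.48 MHz is slightly lower, which would be consistent with the printed period having been rounded to two decimals; that is an inference, not something the report states.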
In the routing stage, all placed cells are connected through metal interconnect layers according to the netlist connectivity.
In the fill insertion stage, filler cells are added after routing to maintain physical continuity and satisfy layout requirements.
In the final database generation stage, the routed and filled design database is finalized as 6_final.odb for signoff checks and downstream export. 6_final.odb is the last physical-design database created inside the EDA tool, after routing, fill insertion, and signoff checks. The final GDS generation then exports that completed database into the tapeout-ready GDSII file for fabrication.
In the final GDS generation stage, the completed physical layout is exported as a GDSII file for fabrication handoff.
The final timing analysis report is used to verify that the design meets timing requirements after implementation.
WNS = 0, TNS = 0, and the worst slack is 7.52 ns.
PHASE 5 — Outputs for Gate-Level Verification Preparation
This section documents all the necessary files and their locations after the RTL2GDSII run is completed for gate-level verification.
Gate-level representation of the RTL, produced by the make synth stage.
/home/vsdsquadron/workspace/vsd-scl180-orfs/orfs/flow/results/sky130hd/user_project_wrapper/base/1_2_yosys.v
The post-route netlist is generated after placement, CTS, and routing are completed, including the insertion of clock buffers, clock tree and any final optimizations.
/results/sky130hd/user_project_wrapper/base/6_final.v
A routed database (.db or similar format) is a comprehensive, tool-specific binary file that saves the complete physical and logical state of a design—including geometry, placement, routing traces, timing data, and library references—so the exact layout environment can be reloaded later without starting over.
While the routed netlist contains only logical cells and their connections, the database file contains cell connections, physical coordinates, metal routing shapes, power grids, and timing/constraint data.
/results/sky130hd/user_project_wrapper/base/5_route.odb
The final filled database contains all the geometric and logical information from the routed database, but it includes essential non-logical additions required by the silicon foundry for manufacturing.
The routed database is the complete functional design. The final database takes that functional design and adds filler cells, dummy metal, and redundant vias to make it physically manufacturable at the foundry.
/results/sky130hd/user_project_wrapper/base/6_1_fill.odb
The exported, translated version of the final filled database. While the final database is a tool-specific file (like a Cadence or Synopsys database; in this case OpenROAD's .odb) that knows about logical connections, timing, and constraints, GDSII strips all of that context away.
When the final database is converted to GDSII, the output is simply a standardized, binary file containing raw 2D geometric shapes (polygons, lines, and text labels) mapped to specific fabrication layers.
/results/sky130hd/user_project_wrapper/base/6_final.gds
A timing report is generated using Static Timing Analysis (STA) to check that the chip meets its target clock speed and functions correctly under the analyzed conditions before it is sent to the foundry.
/home/vsdsquadron/workspace/vsd-scl180-orfs/orfs/flow/reports/sky130hd/user_project_wrapper/base/6_finish.rpt
/home/vsdsquadron/workspace/vsd-scl180-orfs/orfs/flow/reports/sky130hd/user_project_wrapper/base/4_cts_final.rpt (timing report after cts)
Output at the end of the flow
PHASE 6 — Debugging and Issue Resolution
This section lists the issues faced during the flow and the debugging methods used.
Error in the synthesis stage because the OpenROAD executable path was not given. Even for synthesis, which deals with logic gates and not physical implementation, the OpenROAD executable path is required to generate the .odb file that is needed further down the flow.
defines.v has to be read first so that its definitions are initialized before they are used; otherwise it results in the following error. To get rid of this error, the Verilog files are listed in the required order in the config.mk file instead of simply locating and pointing to all the .v files in the folder.
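The ordering fix can be sketched as a small helper that moves defines.v to the front of the file list and keeps the wrapper last. This is only an illustration of the idea; the function name and the `rtl/` paths are hypothetical, and in this project the order is written by hand in config.mk.

```python
from pathlib import PurePosixPath


def order_verilog_files(files, first=("defines.v",), last=("user_project_wrapper.v",)):
    """Put definition files first and the top-level wrapper last."""
    def rank(path):
        name = PurePosixPath(path).name
        if name in first:
            return 0
        if name in last:
            return 2
        return 1
    # sorted() is stable, so files with the same rank keep their given order.
    return sorted(files, key=rank)


# Hypothetical paths; only the file names come from the list of RTL files above.
unordered = ["rtl/user_project_wrapper.v", "rtl/debug_regs.v", "rtl/defines.v"]
print(" ".join(order_verilog_files(unordered)))
# prints "rtl/defines.v rtl/debug_regs.v rtl/user_project_wrapper.v"
```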
Because of this error, the die area and core area are set explicitly in config.mk instead of using the core_utilization factor.