Pa1mantri/VSDSquadron_RTL_GDSII

VSD-Squadron RTL to GDSII SOC Implementation

WEEK-1 (Digital VLSI SoC Design and Planning – Foundation Phase)

PHASE 1 — OpenLANE Flow Familiarity

The initial steps to invoke the OpenLANE flow.

Image

Prepping the design involves setting the environment variables such as the PDK, the SDC file, and the standard cell library to use. The statistics below show the synthesis report on the standard cells used to synthesize the picorv32a netlist.

Image

Flop ratio = number of flip-flops / total number of cells = 1613 / 15762 ≈ 10.2%.
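The ratio above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the two counts are copied from the synthesis statistics quoted above, and the variable names are illustrative.

```python
# Counts taken from the synthesis statistics quoted in this section.
flip_flops = 1613
total_cells = 15762

# Flop ratio = number of flip-flops divided by the total number of cells.
flop_ratio = flip_flops / total_cells
flop_percent = round(flop_ratio * 100, 1)

print(f"Flop ratio: {flop_ratio:.4f} ({flop_percent}%)")
```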

Image Image

The synthesis report shows the area occupied by the logic.

Image

The timing report generated after the synthesis step. Both TNS and WNS are zero, and the worst slack is 0.52 ns. The library used is the ``nom`` typical library.

PHASE 2 — Floorplan Fundamentals

Image

Aspect ratio and utilization factor in the config.tcl file.

Image

Die area from the DEF file generated after the floorplan step.

Image

Changing the utilization factor to 30% and then running the floor plan again.

Image

The die area has changed, as can be observed in the new DEF file.
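Why the area grows when the utilization factor drops can be sketched with the usual floorplan relation: core area = standard-cell area / utilization, with the aspect ratio fixing the height-to-width ratio. The cell area of 100,000 µm² below is a made-up number for illustration, not a value from this run, and the sketch ignores the core-to-die margin that the tool adds on top.

```python
import math

def core_size(cell_area_um2, utilization, aspect_ratio=1.0):
    """Return (core_area, width, height) for a given cell area and utilization.

    utilization is a fraction (0.30 means 30%); aspect_ratio is height / width.
    """
    core_area = cell_area_um2 / utilization
    width = math.sqrt(core_area / aspect_ratio)
    height = aspect_ratio * width
    return core_area, width, height

# Hypothetical cell area; only the trend matters here.
default_area, _, _ = core_size(100_000, 0.50)
relaxed_area, relaxed_w, relaxed_h = core_size(100_000, 0.30)

print(f"50% utilization: {default_area:.0f} um^2")
print(f"30% utilization: {relaxed_area:.0f} um^2 ({relaxed_w:.1f} x {relaxed_h:.1f} um)")
```

Lowering the utilization spreads the same cells over a larger core, which is the change visible in the new DEF file.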

Image
Image
Image

A RAM is a macro, which is different from a standard cell. Macros are placed along with the other preplaced cells before the placement stage; standard cells are placed during the placement stage. Macros are treated as black boxes: they are already implemented, so only their input and output pins are visible to the flow.

PHASE 3 — Timing Literacy with Ideal Clocks

The pre_sta.conf file used to run timing analysis using OpenSTA

Image
Image
Image
Image
Image
Image

After replacing a cell with large fanout of 10

Image

The new slack value reported by the report_checks command.

Image

After changing the synthesis strategy from "AREA 0" to "DELAY 3" and rerunning the timing analysis, the slack value changes to -2.79 ns.

Image
Image

PHASE 4 — CTS and Timing with Real Clocks

Topic: Loading the Design and Propagating Clocks

Image

Entering the OpenROAD interactive shell to load the post-CTS database and timing libraries. The crucial step here is executing the set_propagated_clock command. This instructs the timing engine to stop using "ideal" clock estimates (zero delay) and start calculating the actual physical wire and buffer delays introduced during Clock Tree Synthesis.

Image

Topic: Hold Timing with Real Clocks (Min Path)

Image

A detailed STA report for the worst-case hold path. Real physical clock skew can sometimes cause a destination flip-flop to receive the clock edge too late, causing fast-moving data to overwrite previous values. This results in a hold timing violation, which the tool will later repair by inserting delay buffers on the data path during the routing stages.
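The effect of a late capture clock on a hold check can be sketched with the textbook relation: hold slack = data arrival time minus the earliest time the data is allowed to change. This is a simplified model, not what OpenSTA computes internally (it ignores uncertainty, derating, and CRPR), and all delay values are hypothetical.

```python
def hold_slack(launch_clock_latency, capture_clock_latency, data_delay, hold_time):
    """Simplified hold slack in ns for one launch/capture flip-flop pair."""
    arrival = launch_clock_latency + data_delay      # when new data reaches the D pin
    required = capture_clock_latency + hold_time     # data must stay stable until then
    return arrival - required

# Ideal clocks: both flops see the edge at time 0.
ideal = hold_slack(0.0, 0.0, data_delay=0.30, hold_time=0.05)

# Propagated clocks: the capture flop sees the edge 0.40 ns later than the launch flop.
skewed = hold_slack(0.20, 0.60, data_delay=0.30, hold_time=0.05)

print(f"ideal clock hold slack:      {ideal:+.2f} ns")
print(f"propagated clock hold slack: {skewed:+.2f} ns")
```

With these made-up numbers the same data path passes with ideal clocks and fails once the capture clock arrives late, which is why adding delay on the data path repairs the violation.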

Image

Topic: Setup Timing with Real Clocks (Max Path)

Image
Image
Image

A detailed Static Timing Analysis (STA) report for the worst-case setup path. Notice that the clock path now explicitly lists actual physical delays traveling through standard cell clock buffers (like sky130_fd_sc_hd__clkbuf_8 and 16). The introduction of real-world clock delays has exposed a setup violation (negative slack), which is completely normal at this stage and will be targeted during post-CTS optimization.
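The way real clock delays can turn a passing setup path into a failing one can be sketched as follows. It uses the simplified relation slack = required time minus arrival time; the clock period and all delays are hypothetical values chosen for illustration, not numbers from the report.

```python
def setup_slack(clock_period, launch_clock_latency, capture_clock_latency,
                data_delay, setup_time):
    """Simplified setup slack in ns for one launch/capture flip-flop pair."""
    arrival = launch_clock_latency + data_delay
    required = clock_period + capture_clock_latency - setup_time
    return required - arrival

# Ideal clocks: zero latency on both clock paths.
ideal = setup_slack(5.0, 0.0, 0.0, data_delay=4.6, setup_time=0.2)

# Propagated clocks: the launch clock arrives later than the capture clock.
propagated = setup_slack(5.0, 0.9, 0.5, data_delay=4.6, setup_time=0.2)

print(f"ideal clock setup slack:      {ideal:+.2f} ns")
print(f"propagated clock setup slack: {propagated:+.2f} ns")
```

A launch clock that arrives later than the capture clock eats into the available cycle time, so a path with positive slack under ideal clocks can go negative after CTS.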

Image

Topic: Post-CTS Clock Skew Analysis

Using the report_clock_skew command for both setup and hold checks. Because physical clock buffers and wires are now in the design, the clock signal no longer arrives at all flip-flops at the exact same time. This report quantifies that physical skew (for example, showing a 1.07ns variance for setup), which is a realistic reflection of the physical layout.
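As a rough illustration of what skew means, the sketch below takes a set of clock arrival times at flip-flop clock pins and reports the spread between the earliest and latest. The pin names and latencies are hypothetical, and this global spread is a simplification: report_clock_skew reports skew for specific launch/capture pairs.

```python
# Hypothetical clock arrival times (ns) at three flip-flop clock pins.
clock_latency_ns = {
    "reg_a/CLK": 0.82,
    "reg_b/CLK": 1.35,
    "reg_c/CLK": 1.10,
}

earliest_pin = min(clock_latency_ns, key=clock_latency_ns.get)
latest_pin = max(clock_latency_ns, key=clock_latency_ns.get)

# Skew is the difference between the latest and earliest clock arrival.
skew_ns = clock_latency_ns[latest_pin] - clock_latency_ns[earliest_pin]

print(f"earliest: {earliest_pin}, latest: {latest_pin}, skew: {skew_ns:.2f} ns")
```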

Image

PHASE 5 — PDN Awareness

Image
Image
Image

In the RTL-to-GDSII flow, the PDN is built immediately after floorplanning (and sometimes macro placement), long before any data signal is routed. Power and ground (VDD and VSS) need to travel across the entire chip with the least possible resistance. Therefore, they claim the top, thickest metal layers.


WEEK-2 (Toolchain Mastery and ORFS Execution [Cloud to Local])

PHASE 1 — ORFS Execution in GitHub Codespaces

Task 1.1 — Repository Setup & Environment Verification

Cloned the repo https://github.com/vsdip/vsd-scl180-orfs and then launched GitHub Codespaces. This task verifies that the OpenROAD devcontainer environment was successfully built in the cloud and that all necessary physical design tools are correctly installed and accessible in the system path.

1. Toolchain Version Verification

Tool Versions

Terminal output displaying the successful installation and version checks for the core toolchain components within the GitHub Codespaces container: OpenROAD (v2.0), Yosys (0.58), Python (3.10.12), and GNU Make (4.3).

2. Synthesis Execution

Synthesis Run

The execution log of the synthesis stage showing the Yosys engine successfully parsing the Verilog RTL, linking the Sky130 standard cell libraries, and generating the initial OpenROAD database (1_synth.odb).

3. Synthesis Statistical Report

Synthesis Report

The synthesis statistical report detailing the post-synthesis metrics. It shows an initial estimated design area of 61,097 µm² at 100% utilization before physical floorplanning and placement expand the core.

4. Floorplan Initialization Log

Floorplan Log

The floorplan execution log confirming the core area initialization, standard cell track generation, and an initial design area of 61,496 µm² with 46% utilization.

5. Power Delivery Network (PDN) Generation

Floorplan PDN

Terminal output showing the successful insertion of the Power Delivery Network (PDN) grid, which is essential for routing VDD and VSS across the chip without voltage drop.

6. Physical Placement Execution

Placement Execution

Terminal log for the global and detailed placement stages, showing standard cells legally placed into rows with an updated design area of 69,948 µm² and a core utilization of 52%.

7. Detailed Placement GUI Visualization

Placement GUI

The OpenROAD GUI rendering the detailed placement database, visually confirming that all standard cells are packed and snapped to the site rows within the defined core boundary.

8. Clock Tree Synthesis (CTS) & Resizing

CTS Log and Resizer

The Clock Tree Synthesis log showing buffer insertions and aggressive resizing to repair timing violations, resulting in a design area of 75,857 µm² at 57% utilization.
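The area and utilization figures reported at floorplan, placement, and CTS can be cross-checked against each other, since utilization = design area / core area and the core area is fixed after floorplanning. The sketch below assumes the log rounds utilization to the nearest whole percent (an assumption, not something stated in the logs) and computes the range of core areas consistent with each stage.

```python
# (design area in um^2, utilization in %) as reported in the logs above.
stages = {
    "floorplan": (61_496, 46),
    "placement": (69_948, 52),
    "cts": (75_857, 57),
}

def core_area_bounds(design_area_um2, utilization_pct):
    """Core areas consistent with a utilization rounded to a whole percent."""
    low = design_area_um2 / ((utilization_pct + 0.5) / 100)
    high = design_area_um2 / ((utilization_pct - 0.5) / 100)
    return low, high

bounds = {name: core_area_bounds(area, util) for name, (area, util) in stages.items()}

# The stages agree if their ranges overlap.
common_low = max(low for low, _ in bounds.values())
common_high = min(high for _, high in bounds.values())

for name, (low, high) in bounds.items():
    print(f"{name:10s} core area between {low:,.0f} and {high:,.0f} um^2")
print(f"overlap: {common_low:,.0f} to {common_high:,.0f} um^2")
```

Under that rounding assumption the three ranges overlap, so the rising utilization reflects cells being added (buffers, resized gates) inside an unchanged core.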

9. Routing Stage Completion

Routing Completion

Terminal output confirming the successful completion of the global and detailed routing stages, followed by the insertion of 10,342 filler cells.

10. Routing Violations Check

Routing DRC Violations

The routing antenna and DRC checker output showing zero net violations and zero pin violations, indicating a clean routing database.

11. Detailed Routing GUI Visualization

Routing GUI

The OpenROAD GUI showcasing the massive, multi-layered maze of metal interconnects successfully routing data signals and the clock tree across the standard cells.

12. Final GDSII Generation

GDS in KLayout

The final, fab-ready .gds binary file rendered visually via KLayout in the Codespaces environment, marking the successful end of the automated physical layout flow.

13. Total Execution Runtime

Total Runtime

The final make elapsed summary table aggregating the exact wall-clock compute time. The complete cloud-based RTL-to-GDS flow successfully finished with a total runtime of 2189 seconds (~36.5 minutes).

14. Final Post-Route Timing Report

Final Timing Report

The final post-route timing report. After the engine's optimizations, the design still shows a Worst Negative Slack (WNS) of -0.57 ns and a Total Negative Slack (TNS) of -10.31 ns, so some setup paths remain in violation at the constrained clock period.



PHASE 2 — Toolchain Understanding (Devcontainer Deep Dive)

Tools used in the flow

| S.No | Tool Name | Installed From | Purpose in the Flow | Stage Used |
|---|---|---|---|---|
| 1 | OpenROAD | Compiled from source | Binds all the tools for the PnR flow | All stages after synthesis |
| 2 | Yosys | Compiled from source | Synthesizing the netlist | Synthesis |
| 3 | TritonCTS | Integrated within OpenROAD | Generation of the clock tree | CTS |
| 4 | FastRoute | Integrated within OpenROAD | Global routing | Routing |
| 5 | OpenSTA | Integrated within OpenROAD | Static timing analysis | Timing checks at all stages |
| 6 | KLayout | Compiled from source | Layout viewer | Final sign-off and GDS generation |
| 7 | Python | Package manager | Calculating run times | Reporting and log creation |
| 8 | Make | Package manager | Running flow scripts | All stages |
| 9 | Git | Package manager | Version control | Environment setup |

What ORFS automates

ORFS automates the passing of data between completely separate point tools (Yosys -> OpenROAD -> KLayout). It automates the complete RTL2GDSII flow using Makefiles. Using one command "make all", it generates all the reports and final GDS with the required results.


How makefiles orchestrate the flow

The Makefile provides a target for every step of the flow, such as make synth, make floorplan, make place, and make gui_place. Each target runs a script that takes its inputs from the correct locations and writes its results to the appropriate location.

The true power of a Makefile is Dependency Tracking. If you edit your Verilog file, the Makefile knows it has to rerun Yosys. It orchestrates the flow by tracking the timestamps of input and output files to ensure steps are only run when necessary.
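The timestamp rule that Make applies can be sketched in Python: a target is rebuilt if it does not exist or if any dependency is newer than it. This is a minimal illustration of the idea, not the ORFS Makefile logic; the file names are hypothetical and the timestamps are set by hand for the demo.

```python
import os
import tempfile

def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any of its dependencies."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)

with tempfile.TemporaryDirectory() as workdir:
    rtl = os.path.join(workdir, "design.v")        # hypothetical RTL input
    netlist = os.path.join(workdir, "netlist.v")   # hypothetical synthesis output

    open(rtl, "w").close()
    missing_output = needs_rebuild(netlist, [rtl])  # output does not exist yet

    open(netlist, "w").close()
    os.utime(rtl, (1_000, 1_000))       # RTL last modified at t=1000
    os.utime(netlist, (2_000, 2_000))   # netlist produced at t=2000
    up_to_date = needs_rebuild(netlist, [rtl])

    os.utime(rtl, (3_000, 3_000))       # RTL edited after the netlist was built
    after_edit = needs_rebuild(netlist, [rtl])

print(missing_output, up_to_date, after_edit)
```

The three checks correspond to a first run, a rerun with nothing changed, and a rerun after editing the Verilog file.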


Where synthesis ends and physical design begins

Synthesis converts the RTL into a gate-level netlist; Yosys creates this netlist. The SDC constraints and the synthesized netlist are then merged into a single .odb file, which is the input to the place-and-route steps.

Physical design begins when OpenROAD starts up and imports three separate things: the Yosys Verilog netlist, the technology files (LEF/LIB), and the SDC constraints (a separate file). Once OpenROAD has read all three into memory to initialize the floorplan, it saves that combined state as the very first database, 1_synth.odb.


Where timing is checked

Timing is checked at every stage. After synthesis and after placement, it is pre-layout STA with ideal clocks; after CTS, it is pre-layout STA with real (propagated) clocks; after routing, it is post-layout STA.


Where GDS is produced

OpenROAD finishes by producing a .def or .odb file, which is an abstract description of where the cells and wires are. The GDS is produced during the make finish stage. A separate tool (KLayout or Magic) takes the abstract OpenROAD routing data and merges it with the exact polygon shapes of each cell provided with the foundry PDK (SkyWater) to produce the final mask layout (.gds).


PHASE 3 — Local Installation (Self-Owned Environment)

The image below shows the tools that are locally installed to run the flow

OpenROAD Binary Output

The OpenROAD toolchain supports two installation approaches: a pre-built binary installation (used in the Cloud Codespaces environment) and a manual compilation from source (used for the local Ubuntu VM).

OpenROAD Success Both

The dependencies required to compile OpenROAD from source.

OpenROAD Dependencies

Successful compilation of the OpenROAD toolchain from source. The terminal indicates a 100% build completion and verifies the executable by querying the installed version (26Q2-1229-g08035c8730).

Finally Success

PHASE 4 — Re-Run RTL-to-GDS Locally

Local RTL-to-GDSII Flow Execution

This phase documents the successful local execution of the OpenROAD Flow Script (ORFS) from RTL synthesis all the way down to final GDSII generation, including intermediate database inspections and timing analysis.

1. Synthesis Statistics

Synthesis Report

The synthesis report generated by Yosys shows the mapping of the Verilog code to the sky130_fd_sc_hd standard cell library. According to the log, the total estimated chip area for the top module 'riscv' is 61,480.21 µm², with sequential elements (like flip-flops) occupying roughly 51.32% of that total area.

2. Physical Placement (DEF Visualization)

Placement DEF in KLayout

The OpenROAD GUI displaying the chip layout after the global placement stage. The standard cells are now legally placed into the standard cell rows (visible as horizontal tracks) within the core boundary, establishing the physical footprint of the logic gates before routing begins.

3. Clock Tree Synthesis (CTS) Log

CTS Log - Area and Utilization

A snippet from the CTS execution log confirming the physical dimensions of the chip: a design area of 77,272 µm² with a core utilization of 57%.

4. Post-CTS Timing Reality Check

CTS Timing Report

The timing report immediately following Clock Tree Synthesis. The introduction of real physical clock buffers and wire delays exposes significant clock skew, resulting in a Worst Negative Slack (WNS) of -0.86ns and a Total Negative Slack (TNS) of -42.04ns.

5. Final Routing to GDSII

Final GDS in KLayout

The ultimate output of the physical design flow. The fully routed, multi-layer .gds binary file opened in KLayout, ready for fabrication handoff.

6. Post-Route Timing Optimization

Final Timing Report

The final post-route timing report. Notice the massive improvement in TNS (reduced from -42.04ns down to -16.28ns). This highlights the aggressive timing repairs, gate sizing, and buffer insertions the engine performs during the routing phase to heal the violations exposed during CTS.
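The size of that TNS improvement can be put into a number with a short calculation. The two TNS values are the ones quoted above; the variable names are illustrative.

```python
tns_after_cts_ns = -42.04    # TNS reported right after CTS
tns_post_route_ns = -16.28   # TNS in the final post-route report

# How much negative slack was recovered, in ns and as a share of the post-CTS TNS.
recovered_ns = tns_post_route_ns - tns_after_cts_ns
improvement_pct = recovered_ns / abs(tns_after_cts_ns) * 100

print(f"recovered {recovered_ns:.2f} ns of negative slack ({improvement_pct:.1f}%)")
```

The remaining -16.28 ns means the design still has violating paths after routing, even though most of the post-CTS negative slack was removed.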

7. Complete Flow Outputs

Generated Files Directory

A directory listing validating the successful generation of all required databases (DEF/ODB), logs, and reports across every stage of the flow.

8. Total Execution Summary

Final Flow Summary

The master summary table generated at the conclusion of the flow. It aggregates the total elapsed wall-clock time, peak memory usage, and the secure hashes for every step of the local build.


| Metric | Cloud | Local |
|---|---|---|
| Runtime | 2189 s | 6855 s |
| WNS | -0.57 ns | -0.76 ns |
| TNS | -10.31 ns | -16.28 ns |
| GDS Generated | Yes | Yes |
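To make the comparison easier to read, the runtimes can be converted to minutes and the local slowdown expressed as a ratio. The numbers are the ones in the table above; the dictionary layout and helper name are just one way to organize them.

```python
runs = {
    "cloud": {"runtime_s": 2189, "wns_ns": -0.57, "tns_ns": -10.31},
    "local": {"runtime_s": 6855, "wns_ns": -0.76, "tns_ns": -16.28},
}

def as_minutes_seconds(seconds):
    """Split a duration in seconds into whole minutes and leftover seconds."""
    return divmod(seconds, 60)

# How many times longer the local run took compared with the cloud run.
slowdown = runs["local"]["runtime_s"] / runs["cloud"]["runtime_s"]

for name, run in runs.items():
    minutes, seconds = as_minutes_seconds(run["runtime_s"])
    print(f"{name}: {minutes} min {seconds} s, WNS {run['wns_ns']} ns, TNS {run['tns_ns']} ns")
print(f"local run took {slowdown:.2f}x the cloud runtime")
```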

PHASE 5 — Debugging and Unix Literacy

Essential Unix Commands for Physical Design

During the RTL-to-GDSII flow, navigating thousands of lines of logs and deeply nested directories requires basic Unix literacy. Here are three incredibly powerful commands used to extract physical design metrics efficiently:

1. The cat Command

Description: Short for "concatenate," cat is used to instantly read the contents of a file and print it directly to the terminal. In physical design, it is perfect for quickly reading short summary files, like the final timing reports, without having to open a heavy text editor.

CAT Command - WNS and TNS

2. The find Command

Description: The find command is a search engine for your terminal. When an automated flow generates hundreds of intermediate database files, find allows you to instantly locate a specific file (like a .def or .gds) by searching through the current directory and all sub-directories.

FIND Command

3. The grep Command

Description: Short for "Global Regular Expression Print," grep is arguably the most used Unix command in VLSI. It allows you to search for a specific word or phrase inside a massive file and prints only the lines containing that phrase. Here, it is used to instantly extract the exact "Design area" metric from a massive synthesis log file without scrolling manually.

GREP Command - Design Area
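For readers who want to script the same lookups, here is a rough Python equivalent of the find and grep usage described above. It is a sketch of the idea, not a replacement for the Unix tools; the directory layout and log contents in the demo are made up, with the "Design area" line modeled on the synthesis figures quoted earlier.

```python
import os
import re
import tempfile

def find_files(root, suffix):
    """Like `find root -name '*suffix'`: collect matching paths under root."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)

def grep(path, pattern):
    """Like `grep pattern path`: return the lines matching a regular expression."""
    regex = re.compile(pattern)
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if regex.search(line)]

# Demo on a throwaway directory; file names and log text are hypothetical.
with tempfile.TemporaryDirectory() as root:
    log_dir = os.path.join(root, "logs")
    os.makedirs(log_dir)
    log_path = os.path.join(log_dir, "synth.log")
    with open(log_path, "w", encoding="utf-8") as handle:
        handle.write("Reading netlist\n"
                     "Design area 61097 u^2 100% utilization.\n"
                     "Done\n")

    found = [os.path.basename(path) for path in find_files(root, ".log")]
    area_lines = grep(log_path, r"Design area")

print(found)
print(area_lines)
```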


WEEK-4 (RTL-to-GDS Implementation of User Project Wrapper)

PHASE 1 — Analyze the Top-Level Wrapper

The top-level module of the design is user_project_wrapper. This module acts as an interface between the user project and the Caravel SoC. It connects the design to the Wishbone bus, GPIO pins, logic analyzer signals, and interrupt outputs.

Interface Overview

Wishbone Bus Interface

Used for communication between the CPU and the user logic.

  • wb_clk_i -> Clock input
  • wb_rst_i -> Reset input
  • wbs_stb_i, wbs_cyc_i, wbs_we_i -> Control signals
  • wbs_dat_i, wbs_adr_i -> Input data and address
  • wbs_ack_o, wbs_dat_o -> Output response

GPIO Interface

Used for interaction with external devices.

  • io_in -> Input pins
  • io_out -> Output pins
  • io_oeb -> Output enable (active low)

Logic Analyzer Interface

Used for internal debugging and signal observation.

  • la_data_in
  • la_data_out
  • la_oenb

Interrupt Interface

Used to send interrupt signals to the processor.

  • user_irq

Clock and Reset

  • Clock: wb_clk_i
  • Reset: wb_rst_i
  • Target clock frequency: 100 MHz
  • Clock period: 10 ns

Module Instantiations

The following modules are instantiated inside the wrapper:

debug_regs

This module implements debug registers accessible through the Wishbone interface.

user_project_gpio_example (optional)

Instantiated only when GPIO_TESTING is enabled.

user_project_la_example (optional)

Instantiated only when LA_TESTING is enabled.

Dependency Tree

user_project_wrapper
├── debug_regs
├── user_project_gpio_example (optional)
└── user_project_la_example (optional)

RTL Files Required

The following RTL files are required for synthesis:

  • user_project_wrapper.v
  • debug_regs.v
  • defines.v
  • user_project_gpio_example.v (optional)
  • user_project_la_example.v (optional)

Compilation Dependencies

The modules must be compiled in the following order:

  1. debug_regs.v
  2. user_project_gpio_example.v (optional)
  3. user_project_la_example.v (optional)
  4. user_project_wrapper.v

The wrapper depends on the lower-level modules.
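A compile order like the one above can be derived from the dependency tree with a topological sort. The sketch below uses Python's standard graphlib module; treating defines.v as a dependency of every other file is an assumption made for this illustration (it is listed among the required RTL files, but its dependents are not spelled out here).

```python
from graphlib import TopologicalSorter

# Map each file to the files that must be read before it.
# The defines.v edges are an assumption for illustration.
dependencies = {
    "defines.v": set(),
    "debug_regs.v": {"defines.v"},
    "user_project_gpio_example.v": {"defines.v"},
    "user_project_la_example.v": {"defines.v"},
    "user_project_wrapper.v": {
        "defines.v",
        "debug_regs.v",
        "user_project_gpio_example.v",
        "user_project_la_example.v",
    },
}

# static_order() yields every file after all of its dependencies.
compile_order = list(TopologicalSorter(dependencies).static_order())

for position, filename in enumerate(compile_order, start=1):
    print(position, filename)
```

The order of the three middle files relative to each other is not fixed by the graph; only the constraints "defines.v first" and "wrapper last" are guaranteed.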


PHASE 2 — Prepare the ORFS Design Environment

The directory structure of the files required to run the flow is given below. The config.mk file also specifies fixed die and core areas instead of using core_utilization to determine the chip area. The core and die areas are chosen to accommodate all I/O pins with the required margin, as the previous run resulted in an error with the earlier area settings.

Directory Structure and configuration file

Tools used

Yosys
OpenROAD

The following executable paths were used for the complete RTL-to-GDSII flow:

/home/vsdsquadron/workspace/yosys/yosys
/home/vsdsquadron/workspace/OpenROAD/build/bin/openroad

PHASE 3 — Apply 100 MHz Clock Constraint

The objective of Phase-3 is to define and apply a 100MHz clock constraint for the design. This enables timing-driven synthesis and ensures that the design meets the required operating frequency.


1. Clock Identification

The clock signal identified from the top-level module user_project_wrapper is wb_clk_i.

The signal wb_clk_i serves as the primary clock input and drives all synchronous elements in the design through the Wishbone interface.

2. Clock Constraint Definition

The clock constraint is defined in the Synopsys Design Constraints (SDC) file as:

create_clock -name wb_clk_i -period 10 [get_ports wb_clk_i]

This constraint specifies the clock characteristics for timing analysis.

3. Constraint File

  • File: constraint.sdc

Content:

create_clock -name wb_clk_i -period 10 [get_ports wb_clk_i]

Constraints file

4. Explanation of Constraint

  • create_clock -> Defines a clock for timing analysis
  • -name wb_clk_i -> Assigns a name to the clock
  • -period 10 -> Specifies the clock period in nanoseconds
  • [get_ports wb_clk_i] -> Applies the constraint to the clock input port

This ensures that all timing paths are evaluated with respect to the defined clock period.
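The link between the 100 MHz target and the -period 10 value can be sketched with a small helper that derives the period from the frequency and formats the constraint line. The helper name is hypothetical and not part of ORFS or OpenSTA; it only reproduces the arithmetic (period in ns = 1000 / frequency in MHz).

```python
def create_clock_line(port, freq_mhz):
    """Build a create_clock SDC line for a clock port at the given frequency."""
    period_ns = 1_000.0 / freq_mhz   # 1 / (freq in MHz) gives microseconds; x1000 gives ns
    return f"create_clock -name {port} -period {period_ns:g} [get_ports {port}]"

sdc_line = create_clock_line("wb_clk_i", 100.0)
print(sdc_line)
```

For wb_clk_i at 100 MHz this yields the same line that appears in constraint.sdc.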

5. Role in ORFS Flow

During the ORFS flow, the constraint file is read and applied during synthesis and timing analysis.

The clock constraint is used to:

  • Perform static timing analysis (STA)
  • Optimize logic to meet timing requirements
  • Evaluate setup and hold constraints
  • Generate timing reports

6. Constraint Validation

Successful application of the clock constraint is confirmed when:

  • The clock is detected in timing reports
  • No missing clock warnings are reported
  • Timing analysis (STA) is performed
  • Slack values are computed for timing paths

PHASE 4 — Run the RTL-to-GDS Flow

This section documents the complete RTL2GDSII flow executed for the user_project_wrapper design, along with screenshots captured at each major stage of the flow.


1. Synthesis

In the synthesis stage, the RTL design is converted into a gate-level netlist using the standard cell library.

Synthesis Success

Synthesis Statistics


2. Floorplanning

In the floorplanning stage, the core area, die area, and initial physical layout constraints are defined for the design. The areas are set explicitly because letting the utilization factor determine the area results in an error in the placement stage.

Floorplan Configuration

Floorplan Success


3. Placement

In the placement stage, the synthesized standard cells are placed inside the core area based on timing and congestion considerations.

Placement Success

Placement GUI


4. Clock Tree Synthesis

In the CTS stage, clock buffers are inserted to distribute the clock signal properly and reduce clock skew across the design.

CTS Success

The minimum clock period is 2.53 ns, which corresponds to a maximum frequency of 394.48 MHz.
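The period and frequency figures are two views of the same number, and the conversion can be checked quickly. The sketch below starts from the reported 394.48 MHz and compares it with the 100 MHz target; the function and variable names are illustrative.

```python
def mhz_to_period_ns(freq_mhz):
    """Clock period in ns for a frequency given in MHz."""
    return 1_000.0 / freq_mhz

reported_fmax_mhz = 394.48   # from the CTS report
target_mhz = 100.0           # from the clock constraint

implied_period_ns = mhz_to_period_ns(reported_fmax_mhz)
target_period_ns = mhz_to_period_ns(target_mhz)

# How much faster the design could run than the constrained clock.
headroom = reported_fmax_mhz / target_mhz

print(f"period implied by fmax: {implied_period_ns:.3f} ns")
print(f"target period:          {target_period_ns:.1f} ns")
print(f"frequency headroom:     {headroom:.2f}x")
```

The implied period agrees with the reported 2.53 ns to two decimal places, well inside the 10 ns target.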

CTS Report


5. Routing

In the routing stage, all placed cells are connected through metal interconnect layers according to the netlist connectivity.

Routing Success


6. Fill Insertion

In the fill insertion stage, filler cells are added after routing to maintain physical continuity and satisfy layout requirements.

Fill Insertion


7. Final Database Generation

In the final database generation stage, the routed and filled design database is finalized as 6_final.odb for signoff checks and downstream export. 6_final.odb is the last physical-design database created inside the EDA tool, after routing, fill insertion, and signoff checks. The final GDS generation then exports that completed database into the tapeout-ready GDSII file for fabrication.


8. Final GDS Generation

In the final GDS generation stage, the completed physical layout is exported as a GDSII file for fabrication handoff.

Final GUI


Timing Analysis

The final timing analysis report is used to verify that the design meets timing requirements after implementation.

WNS = 0, TNS = 0, and the worst slack is 7.52 ns.

Timing Analysis

PHASE 5 — Outputs for Gate-Level Verification Preparation

This section documents all the necessary files and their locations after the RTL2GDSII run is completed for gate-level verification.


1. Synthesized netlist

Gate-level representation of the RTL, produced by the make synth stage.

Location

/home/vsdsquadron/workspace/vsd-scl180-orfs/orfs/flow/results/sky130hd/user_project_wrapper/base/1_2_yosys.v

2. Final netlist

The post-route netlist is generated after placement, CTS, and routing are completed, including the insertion of clock buffers, clock tree and any final optimizations.

Location

/results/sky130hd/user_project_wrapper/base/6_final.v

3. Routed database

A routed database (.db or a similar format; .odb in OpenROAD) is a comprehensive, tool-specific binary file that saves the complete physical and logical state of a design (geometry, placement, routing traces, timing data, and library references) so the exact layout environment can be reloaded later without starting over.

While the routed netlist contains only logical cells and their connections, the database file contains the cell connections plus physical coordinates, metal routing shapes, power grids, and timing/constraint data.

Location

/results/sky130hd/user_project_wrapper/base/5_route.odb

4. Final filled database

The final filled database contains all the geometric and logical information from the routed database, but it includes essential non-logical additions required by the silicon foundry for manufacturing.

The routed database is the complete functional design. The final database takes that functional design and adds filler cells, dummy metal, and redundant vias to make it physically manufacturable at the foundry.

Location

/results/sky130hd/user_project_wrapper/base/6_1_fill.odb

5. GDSII

The exported, translated version of the final filled database. While the final database is a tool-specific file (comparable to a Cadence or Synopsys database; in this case an OpenROAD .odb) that knows about logical connections, timing, and constraints, GDSII strips all of that context away.

When the final database is converted to GDSII, the output is simply a standardized, binary file containing raw 2D geometric shapes (polygons, lines, and text labels) mapped to specific fabrication layers.

Location

/results/sky130hd/user_project_wrapper/base/6_final.gds

6. Timing report

A timing report is generated using Static Timing Analysis (STA) to check that the chip meets its target clock speed and functions correctly under the analyzed conditions before it is sent to the foundry.

Location

/home/vsdsquadron/workspace/vsd-scl180-orfs/orfs/flow/reports/sky130hd/user_project_wrapper/base/6_finish.rpt
/home/vsdsquadron/workspace/vsd-scl180-orfs/orfs/flow/reports/sky130hd/user_project_wrapper/base/4_cts_final.rpt (timing report after cts)

Output folders generated after the flow

final folders

Output at the end of the flow

terminal at the end of the flow
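A quick way to confirm that the files listed in this phase exist before starting gate-level verification is a small presence check. The sketch below uses the file names from the locations above; the helper name is hypothetical, and the demo runs against a temporary directory rather than the real results folder.

```python
import os
import tempfile

# File names taken from the locations listed in this phase.
EXPECTED_OUTPUTS = [
    "1_2_yosys.v",    # synthesized netlist
    "6_final.v",      # final netlist
    "5_route.odb",    # routed database
    "6_1_fill.odb",   # final filled database
    "6_final.gds",    # GDSII
]

def missing_outputs(results_dir, expected=EXPECTED_OUTPUTS):
    """Return the expected files that are not present in results_dir."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(results_dir, name))]

# Demo: create every file except the GDS in a throwaway directory.
with tempfile.TemporaryDirectory() as results_dir:
    for name in EXPECTED_OUTPUTS[:-1]:
        open(os.path.join(results_dir, name), "w").close()
    still_missing = missing_outputs(results_dir)

print(still_missing)
```

Pointing missing_outputs at the actual results directory for user_project_wrapper should return an empty list after a complete run.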

PHASE 6 — Debugging and Issue Resolution

This section lists the issues faced during the flow and the debugging methods used.


An error occurs in the synthesis stage when the OpenROAD executable path is not provided. Even though synthesis deals with logic gates rather than physical implementation, the OpenROAD executable is still required to generate the .odb file that is used further down the flow.

Openroad path error

defines.v has to be read first so that its definitions are available before they are used; otherwise the following error is reported. To avoid it, the Verilog files are listed in the required order in the config.mk file instead of simply pointing to all the .v files in the folder.

Verilog files order mismatch

Because of the I/O pin error shown below, the die area and core area are set explicitly in config.mk instead of using the core_utilization factor.

I/O pin mismatch


About

Verified a physically implemented 32-bit Wishbone slave wrapper using mixed-level simulation, successfully integrating the Gate-Level Netlist within the full Caravel RTL SoC environment to ensure system-level compatibility.
