Good first issues

These are beginner-friendly ways to contribute to MolForge. Each is self-contained, well-scoped, and documented enough that you can start immediately. Open one as an issue with the template below, claim it, and submit a PR.

How to claim

  1. Open a new GitHub issue
  2. Title: "Good first issue: " followed by the name of the task you picked
  3. Comment "I'll take this" on the issue to claim it
  4. Submit a PR referencing the issue number

Chemistry / data features (no ML required)

  • InChIKey deduplication on import — when importing CSV / SDF, detect duplicate compounds by InChIKey and offer to merge. Good starter because RDKit.js can already generate InChIKeys (mol.get_inchi() followed by get_inchikey_for_inchi()).
  • Export selected compounds to SDF — there's already a full-database SDF export; add an option to export only checked rows.
  • Copy-as-SMILES from context menu — right-click on any compound → copy SMILES to clipboard.
  • Bulk edit custom column values — select N compounds, set a custom column value on all of them at once.
  • Sort by descriptor column — clicking the MW / LogP / TPSA / HBA / HBD / RotB / MPO column headers should sort the grid. Currently only the Name and IC50 columns are sortable.
  • Salt stripping option on import — add a checkbox in the Import modal: "Strip salts (largest fragment)". Uses RDKit.js get_molblock_with_no_salts() if available.
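
For the InChIKey deduplication item, the matching step itself needs no chemistry code once each record carries a key. The sketch below is only an illustration: the function name findDuplicates, the record shape ({ name, inchikey }), and the placeholder keys are assumptions, not part of the MolForge code, and it assumes the InChIKey was already computed during import (for example with RDKit.js).

```javascript
// Hypothetical helper (not from the MolForge source): split an import batch
// into new compounds and duplicates of compounds that are already known.
// Each record is assumed to look like { name: string, inchikey: string }.
function findDuplicates(existing, incoming) {
  const seen = new Map(); // InChIKey -> first record seen with that key
  for (const rec of existing) {
    if (rec.inchikey) seen.set(rec.inchikey, rec);
  }

  const fresh = [];      // records to import as new compounds
  const duplicates = []; // pairs the UI could offer to merge

  for (const rec of incoming) {
    const match = rec.inchikey ? seen.get(rec.inchikey) : undefined;
    if (match) {
      duplicates.push({ incoming: rec, kept: match });
    } else {
      fresh.push(rec);
      // Remember the key so repeats inside the same file are also caught.
      if (rec.inchikey) seen.set(rec.inchikey, rec);
    }
  }
  return { fresh, duplicates };
}

// Usage with placeholder keys (real InChIKeys are 27 characters long).
const result = findDuplicates(
  [{ name: "Compound 1", inchikey: "KEY-A" }],
  [
    { name: "Compound 1 (resupply)", inchikey: "KEY-A" },
    { name: "Compound 2", inchikey: "KEY-B" },
    { name: "Compound 2 (copy)", inchikey: "KEY-B" },
  ]
);
console.log(result.fresh.length, result.duplicates.length);
```

Records without a key are always treated as new here; whether they should instead be flagged for manual review is a design decision for the issue.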

UI / UX polish

  • Keyboard shortcuts — add Ctrl/Cmd+S to trigger Save, Ctrl/Cmd+N for new compound, / for search focus.
  • Dark mode — add a toggle in the header; apply Tailwind's dark: utility prefix to the main views.
  • Grid row drag-to-reorder — allow users to manually reorder compounds in the grid (persisted in the customCols ordering).
  • Inline-edit custom column values — currently custom columns are only editable via the profile modal. Allow double-click in-place editing like the standard columns.
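
For the keyboard shortcuts item, one possible shape is a pure function that maps a keydown event to an action name, plus a thin listener that calls it. This is a sketch under assumptions: shortcutAction, installShortcuts, and the action names "save", "new", and "focus-search" are hypothetical and do not come from the existing code.

```javascript
// Hypothetical mapping from a keydown event to a shortcut action.
// Returns "save", "new", "focus-search", or null when nothing matches.
function shortcutAction(event) {
  const mod = Boolean(event.ctrlKey || event.metaKey); // Ctrl (Win/Linux) or Cmd (macOS)
  const key = String(event.key).toLowerCase();

  if (mod && key === "s") return "save";
  if (mod && key === "n") return "new";

  // "/" should only focus search when the user is not already typing.
  const target = event.target || {};
  const typing =
    target.tagName === "INPUT" ||
    target.tagName === "TEXTAREA" ||
    Boolean(target.isContentEditable);
  if (!mod && key === "/" && !typing) return "focus-search";

  return null;
}

// Attach the shortcuts to any object with addEventListener (e.g. window).
// `handlers` is an object such as { save: fn, new: fn, "focus-search": fn }.
// Returns a cleanup function, which suits a React useEffect.
function installShortcuts(target, handlers) {
  const listener = (event) => {
    const action = shortcutAction(event);
    if (action && typeof handlers[action] === "function") {
      event.preventDefault(); // suppress the browser's default for this key
      handlers[action]();
    }
  };
  target.addEventListener("keydown", listener);
  return () => target.removeEventListener("keydown", listener);
}
```

Inside a component this could be wired up as useEffect(() => installShortcuts(window, handlers), []). Some browsers reserve Ctrl/Cmd+N for opening a new window and may not let a page intercept it, so that binding is worth testing early.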

Analysis features

  • Free–Wilson analysis in SAR Matrix — compute per-R-group contribution to ΔpIC50, render as a bar chart.
  • pIC50 unit toggle — some compound files report IC50 in nM, some in µM, and some report Kd instead of IC50. Detect the unit (and measurement type) per row and normalize. Add a cell-level unit override.
  • Multiple-assay comparison — if a compound has entries for two assays (two rows), join them and plot assay A vs assay B pIC50.
  • Activity-cliff highlighter — in the Chemical Space scatter, highlight any pair of compounds with Tanimoto similarity ≥ 0.7 and |ΔpIC50| ≥ 1.5 (a common activity-cliff definition).
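
Two of the items above reduce to short calculations: converting an IC50 with a concentration unit to pIC50 (pIC50 = -log10 of the IC50 in mol/L), and testing compound pairs against the activity-cliff thresholds. The sketch below is illustrative only. The names toPIC50, tanimoto, and findActivityCliffs, the compound shape ({ name, bits, pIC50 }), and the representation of fingerprints as arrays of on-bit indices are all assumptions rather than MolForge's actual data model. Kd values are deliberately not converted, since Kd is a different measurement and not a unit.

```javascript
// Conversion factors from a concentration unit to mol/L.
const UNIT_TO_MOLAR = { M: 1, mM: 1e-3, "µM": 1e-6, uM: 1e-6, nM: 1e-9, pM: 1e-12 };

// pIC50 = -log10(IC50 in mol/L). Returns null for unknown units or
// non-positive values so callers can flag the row instead of plotting it.
function toPIC50(value, unit) {
  const factor = UNIT_TO_MOLAR[unit];
  if (factor === undefined || !(value > 0)) return null;
  return -Math.log10(value * factor);
}

// Tanimoto similarity of two fingerprints given as arrays of on-bit indices:
// |A ∩ B| / |A ∪ B|.
function tanimoto(bitsA, bitsB) {
  const a = new Set(bitsA);
  const b = new Set(bitsB);
  let shared = 0;
  for (const bit of a) if (b.has(bit)) shared += 1;
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Return every pair that is structurally similar but differs strongly in potency.
// Thresholds default to the values named in the issue description.
function findActivityCliffs(compounds, minSimilarity = 0.7, minDelta = 1.5) {
  const cliffs = [];
  for (let i = 0; i < compounds.length; i += 1) {
    for (let j = i + 1; j < compounds.length; j += 1) {
      const a = compounds[i];
      const b = compounds[j];
      if (a.pIC50 == null || b.pIC50 == null) continue; // skip unmeasured compounds
      const similarity = tanimoto(a.bits, b.bits);
      const delta = Math.abs(a.pIC50 - b.pIC50);
      if (similarity >= minSimilarity && delta >= minDelta) {
        cliffs.push({ a: a.name, b: b.name, similarity, delta });
      }
    }
  }
  return cliffs;
}

// Usage with made-up fingerprints and potencies.
const cliffs = findActivityCliffs([
  { name: "A", bits: [1, 2, 3, 4, 5, 6, 7, 8], pIC50: 8.0 },
  { name: "B", bits: [1, 2, 3, 4, 5, 6, 7, 9], pIC50: 6.0 },
  { name: "C", bits: [20, 21], pIC50: 5.0 },
]);
console.log(cliffs);
```

The pairwise loop is quadratic in the number of compounds, which should be acceptable for small datasets; a larger grid would probably need a similarity cutoff or precomputed neighbours.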

Exports & integration

  • Markdown table export — for pasting SAR tables into reports / Slack / papers.
  • Reproducible analysis snapshot — serialize all tab state (filters, selections, active tab) into a single JSON that can be re-loaded later.
  • Image-by-image structure export — bulk-export all compound structures as individual PNG / SVG files in a zip.
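
For the Markdown table export, the core is a small string builder. The following is a minimal sketch, assuming rows are plain objects and columns are described by { key, label } pairs; both shapes and the function names escapeCell and toMarkdownTable are hypothetical, not taken from the existing export code.

```javascript
// Make a value safe for a Markdown table cell: empty string for missing
// values, escaped pipes, and no line breaks.
function escapeCell(value) {
  if (value === null || value === undefined) return "";
  return String(value).replace(/\|/g, "\\|").replace(/\r?\n/g, " ");
}

// Build a pipe-delimited Markdown table.
// `columns` is an array of { key, label }; `rows` is an array of objects.
function toMarkdownTable(rows, columns) {
  const line = (cells) => `| ${cells.join(" | ")} |`;
  const header = line(columns.map((col) => escapeCell(col.label)));
  const divider = line(columns.map(() => "---"));
  const body = rows.map((row) => line(columns.map((col) => escapeCell(row[col.key]))));
  return [header, divider, ...body].join("\n");
}

// Usage with made-up rows.
const table = toMarkdownTable(
  [
    { name: "Cmpd 1", pIC50: 7.2 },
    { name: "Cmpd 2 | salt", pIC50: null },
  ],
  [
    { key: "name", label: "Name" },
    { key: "pIC50", label: "pIC50" },
  ]
);
console.log(table);
```

SMILES strings can contain backslashes and other characters that some Markdown renderers interpret, so a SMILES column may need extra escaping or code formatting beyond what this sketch does.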

Documentation / outreach

  • Video walkthrough — record a 3-minute YouTube walkthrough using the 10 demo compounds; link from README.
  • Translation — translate README.md into another language (French, Spanish, Mandarin, Japanese welcome).
  • Real-world SAR notebook — pick a public dataset (ChEMBL assay) and write a notebook-style walkthrough in docs/walkthroughs/.

Testing

  • Cypress smoke test — add a minimal Cypress test that loads the page, asserts 10 compounds, clicks SAR Analysis, verifies the scaffold tab renders. This unblocks a real CI testing story.
  • Visual regression test — use Playwright + a small reference-image folder to catch accidental UI breakage.

Ground rules

  • Never commit compound data that is not in the 10-drug whitelist. The CI workflow will fail your PR.
  • Keep the single-file architecture. No build step; no separate source files. Edits go directly into molforge_database.html.
  • Follow the existing code style. Tailwind utility classes, React hooks, no new dependencies unless strongly justified.
  • One issue per PR. Easier to review and merge.