Authoritative code map for the current repository state as of 2026-06-02. Update this file whenever classes, public structs, or significant functions change.
openzoom::OpenZoomApp
- Owns the Qt application lifecycle, UI wiring, camera control, CPU/GPU processing selection, output capture, and settings persistence.
- Public API:
  - `OpenZoomApp(int& argc, char** argv)`
  - `~OpenZoomApp()`
  - `int Run()`
- Core slots:
  - `OnFrameTick()`
  - `OnPresetSelectionChanged(QListWidgetItem* current, QListWidgetItem* previous)`
  - `OnCameraSelectionChanged(int index)`
  - `OnBlackWhiteToggled(bool checked)`
  - `OnBlackWhiteThresholdChanged(int value)`
  - `OnZoomToggled(bool checked)`
  - `OnZoomAmountChanged(int value)`
  - `OnDebugViewToggled(bool checked)`
  - `OnZoomCenterXChanged(int value)`
  - `OnZoomCenterYChanged(int value)`
  - `OnRotationSelectionChanged(int index)`
  - `OnControlsCollapsedToggled(bool checked)`
  - `OnVirtualJoystickToggled(bool checked)`
  - `OnBlurToggled(bool checked)`
  - `OnBlurSigmaChanged(int value)`
  - `OnBlurRadiusChanged(int value)`
  - `OnFocusMarkerToggled(bool checked)`
  - `OnSpatialSharpenToggled(bool checked)`
  - `OnSpatialUpscalerChanged(int index)`
  - `OnSpatialSharpnessChanged(int value)`
  - `OnTemporalSmoothToggled(bool checked)`
  - `OnTemporalSmoothStrengthChanged(int value)`
  - `OnOcrAssistToggled(bool checked)`
  - `OnVlmAssistToggled(bool checked)`
  - `OnAssistiveOverlayToggled(bool checked)`
  - `OnAssistiveOverlayUpdated(const QString& title, const QString& body, bool visible)`
- Important helpers:
  - preset/config workflow: `CaptureCurrentAdvancedConfig()`, `ApplyAdvancedConfig(...)`, `PopulatePresetList()`, `RefreshPresetSelection()`, `UpdatePresetDescription()`, `SyncCurrentConfigToPersistence()`, `PromoteCurrentConfigToPreset()`
  - assistive workflow: `UpdateAssistiveRuntimeState()`, `MaybeRequestAssistiveAnalysis(...)`
  - camera lifecycle: `EnumerateCameras()`, `PopulateCameraCombo()`, `RefreshCameraModesList(size_t)`, `StartCameraCapture(size_t)`, `StopCameraCapture()`
  - processing/presentation: `BuildCompositeAndPresent(UINT, UINT)`, `PresentFitted(...)`, `EnsureCudaSurface(UINT, UINT)`, `ProcessFrameWithCuda(UINT, UINT)`
  - input/focus: `SetZoomCenter(float, float, bool)`, `ApplyInputForces()`, `MapViewToSource(...)`, `RotateNormalizedPoint(...)`
  - output: `CaptureSnapshot(...)`, `MaybeRecordFrame(...)`, `EnsureOutputSubdir(const QString&)`
  - persistence/status: `ApplyPersistentSettings(...)`, `SavePersistentSettings()`, `UpdateProcessingStatusLabel()`
- Key state:
  - UI widget pointers from `MainWindow`
  - preset/config persistence via `settings::PersistentSettings persistentSettings_`
  - `MediaCapture mediaCapture_`
  - `processing::CpuFramePipeline cpuPipeline_`
  - `VideoRecorder videoRecorder_`
  - `std::unique_ptr<D3D12Presenter> presenter_`
  - `std::unique_ptr<CudaInteropSurface> cudaSurface_`
  - per-feature booleans and tuning values for BW, zoom, blur, temporal smoothing, sharpening, focus marker, joystick, debug view, rotation, and recording
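The input/focus helpers have to keep the zoom center consistent when the frame is rotated in quarter turns. The following is a minimal sketch of that idea, not the repository's implementation: the name `RotateNormalizedPointSketch`, its signature, and the clockwise, y-down convention are all assumptions, since this map elides the real `RotateNormalizedPoint(...)` parameters.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for RotateNormalizedPoint(...); the real signature is
// not shown in this map. Rotates a point in [0,1] x [0,1] (origin top-left,
// y pointing down) clockwise by quarterTurns * 90 degrees.
std::pair<float, float> RotateNormalizedPointSketch(float x, float y, int quarterTurns) {
    assert(x >= 0.0f && x <= 1.0f && y >= 0.0f && y <= 1.0f);
    // Normalize the turn count into 0..3 so negative values rotate the other way.
    const int turns = ((quarterTurns % 4) + 4) % 4;
    for (int i = 0; i < turns; ++i) {
        // One clockwise quarter turn in a y-down frame: (x, y) -> (1 - y, x).
        const float nx = 1.0f - y;
        const float ny = x;
        x = nx;
        y = ny;
    }
    return {x, y};
}
```

Under these assumptions the top-left corner `(0, 0)` moves to the top-right corner `(1, 0)` after one turn, and four turns return the original point.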
openzoom::InteractionController
- Converts keyboard, wheel, mouse-drag, and virtual joystick input into zoom-center updates.
- Public API:
  - `explicit InteractionController(OpenZoomApp& app)`
  - `bool HandlePanKey(int key, bool pressed)`
  - `bool HandlePanScroll(const QWheelEvent* wheelEvent)`
  - `void HandleZoomWheel(int delta, const QPointF& localPos)`
  - `void ApplyInputForces()`
  - `void BeginMousePan(const QPointF& pos, const QSize& widgetSize)`
  - `bool UpdateMousePan(const QPointF& pos)`
  - `void EndMousePan()`
  - `bool IsMousePanActive() const`
  - `void ResetJoystick()`
  - `void SetJoystickAxes(float x, float y)`
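To illustrate how joystick axes could turn into zoom-center updates, here is a rough sketch of a per-tick update. Everything in it is an assumption made for illustration: the `PanStateSketch` struct, the `*Sketch` function names, the `stepPerTick` parameter, and the choice to divide the step by the zoom amount. The real controller forwards its result to `OpenZoomApp` rather than owning the center itself.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical state holder; not a type from the repository.
struct PanStateSketch {
    float centerX = 0.5f;  // normalized zoom center, 0..1
    float centerY = 0.5f;
    float axisX = 0.0f;    // joystick axes, -1..1
    float axisY = 0.0f;
};

// Stores the joystick deflection, clamped to the valid axis range.
void SetJoystickAxesSketch(PanStateSketch& s, float x, float y) {
    s.axisX = std::clamp(x, -1.0f, 1.0f);
    s.axisY = std::clamp(y, -1.0f, 1.0f);
}

// Advances the zoom center by one tick. stepPerTick is an assumed tuning
// constant expressed in normalized units at 1x zoom.
void ApplyInputForcesSketch(PanStateSketch& s, float zoomAmount, float stepPerTick) {
    assert(zoomAmount >= 1.0f);
    // Shrink the step at higher zoom so on-screen pan speed stays roughly constant.
    const float step = stepPerTick / zoomAmount;
    s.centerX = std::clamp(s.centerX + s.axisX * step, 0.0f, 1.0f);
    s.centerY = std::clamp(s.centerY + s.axisY * step, 0.0f, 1.0f);
}
```

Clamping both the axes and the resulting center keeps the zoom window inside the source frame no matter how long the joystick is held.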
Namespace openzoom::settings
- `struct AdvancedConfig`: full stage-2 tuning payload including image-processing flags, focus/rotation state, and OCR/VLM scaffolding flags
- `struct PresetDefinition`: stage-1 quick-mode metadata: preset id, name, description, target config id, built-in flag
- `struct PersistentSettings`: persists camera/UI state, selected preset id, current live advanced config, and user-created configs/presets
- Functions:
  - `QString ResolveSettingsPath()`
  - `void EnsureSettingsDirectory(const QString& path)`
  - `std::optional<PersistentSettings> Load(const QString& path)`
  - `bool Save(const QString& path, const PersistentSettings& settings)`
  - `const std::vector<AdvancedConfig>& BuiltInConfigs()`
  - `const std::vector<PresetDefinition>& BuiltInPresets()`
  - `QString DefaultPresetId()`
  - `const AdvancedConfig* FindAdvancedConfigById(...)`
  - `const PresetDefinition* FindPresetById(...)`
  - `std::optional<AdvancedConfig> ResolveConfigForPreset(...)`
  - `bool AreConfigsEquivalent(...)`
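The preset-to-config relationship described above (a preset names a target config id, which is then looked up) can be sketched as follows. This is a simplified illustration under stated assumptions: the `*Sketch` types use `std::string` instead of `QString`, carry only a few made-up fields, and the function parameters are guesses, because the map elides the real argument lists.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-ins; the real structs use QString and many more fields.
struct AdvancedConfigSketch {
    std::string id;
    bool enableZoom = false;
    float zoomAmount = 1.0f;
};

struct PresetDefinitionSketch {
    std::string id;
    std::string name;
    std::string targetConfigId;
    bool builtIn = true;
};

// Linear lookup by id; returns nullptr when the id is unknown.
template <typename T>
const T* FindByIdSketch(const std::vector<T>& items, const std::string& id) {
    for (const T& item : items) {
        if (item.id == id) return &item;
    }
    return nullptr;
}

// Resolves preset -> target config. Returns std::nullopt if either the preset
// or the config it points at is missing.
std::optional<AdvancedConfigSketch> ResolveConfigForPresetSketch(
    const std::vector<PresetDefinitionSketch>& presets,
    const std::vector<AdvancedConfigSketch>& configs,
    const std::string& presetId) {
    const PresetDefinitionSketch* preset = FindByIdSketch(presets, presetId);
    if (preset == nullptr) return std::nullopt;
    const AdvancedConfigSketch* config = FindByIdSketch(configs, preset->targetConfigId);
    if (config == nullptr) return std::nullopt;
    assert(config->id == preset->targetConfigId);  // lookup invariant
    return *config;
}
```

Returning a copy inside `std::optional` lets the caller edit the resolved config freely without mutating the stored preset data.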
Namespace openzoom::app_constants
- UI scaling and step constants for zoom, panning, and blur controls
- Helper functions:
  - `SliderValueToSigma(int sliderValue)`
  - `SnapBlurRadius(int value)`
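As a rough illustration of what slider-mapping helpers of this kind typically do, the sketch below converts an integer slider position to a blur sigma and snaps a radius to a fixed grid. The scale factor, the radius range, the step, and the `*Sketch` names are all invented for the example; the repository's actual constants are not listed in this map.

```cpp
#include <algorithm>
#include <cassert>

// Assumed constants, chosen only for illustration.
constexpr float kSigmaPerSliderStepSketch = 0.1f;
constexpr int kBlurRadiusMinSketch = 1;
constexpr int kBlurRadiusMaxSketch = 25;
constexpr int kBlurRadiusStepSketch = 2;

// Maps an integer slider position to a strictly positive sigma.
float SliderValueToSigmaSketch(int sliderValue) {
    assert(sliderValue >= 0);
    return static_cast<float>(std::max(sliderValue, 1)) * kSigmaPerSliderStepSketch;
}

// Clamps the radius to the allowed range, then rounds it to the nearest value
// of the form min + k * step.
int SnapBlurRadiusSketch(int value) {
    const int clamped = std::clamp(value, kBlurRadiusMinSketch, kBlurRadiusMaxSketch);
    const int offset = clamped - kBlurRadiusMinSketch;
    const int steps = (offset + kBlurRadiusStepSketch / 2) / kBlurRadiusStepSketch;
    return std::min(kBlurRadiusMinSketch + steps * kBlurRadiusStepSketch,
                    kBlurRadiusMaxSketch);
}
```

With these assumed constants the snapped radii are the odd values 1 through 25.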
Types:
- `struct MediaFrame`: `data`, `subtype`, `width`, `height`, `stride`, `dataSize`
- `using FrameCallback = std::function<void(const MediaFrame& frame)>`
- `struct CameraDescriptor`: `name`, `symbolicLink`, `activation`
- `struct VideoFormat`: `width`, `height`, `numerator`, `denominator`
openzoom::MediaCapture
- Media Foundation camera enumeration and threaded source-reader capture.
- Public API:
  - `MediaCapture()`
  - `~MediaCapture()`
  - `bool Initialize()`
  - `void Shutdown()`
  - `std::vector<CameraDescriptor> EnumerateCameras()`
  - `std::vector<VideoFormat> EnumerateFormats(const CameraDescriptor& descriptor)`
  - `bool StartCapture(const CameraDescriptor& descriptor, FrameCallback callback, GUID preferredSubtype = MFVideoFormat_ARGB32)`
  - `void StopCapture()`
  - `const std::string& LastError() const`
- Internal helpers:
  - `ConfigureReader(...)`
  - `CaptureLoop(FrameCallback callback)`
  - `ExtractFormats(IMFSourceReader* reader)`
  - `HrToString(HRESULT hr)`
openzoom::AssistiveRuntime
- Asynchronous assistive-analysis runtime owned by `OpenZoomApp`.
- OCR path:
  - exports the current frame to a temporary PNG
  - runs `tesseract.exe` asynchronously
  - returns extracted text to the overlay
- VLM path:
  - JPEG-encodes the current frame
  - posts an OpenAI-compatible `chat/completions` request to a configured endpoint
  - returns the text response to the overlay
- Public API:
  - `AssistiveRuntime(QObject* parent = nullptr)`
  - `~AssistiveRuntime()`
  - `void SetModes(bool ocrEnabled, bool vlmEnabled)`
  - `bool WantsAnalysis() const`
  - `bool IsBusy() const`
  - `void SubmitFrame(const uint8_t* bgraData, int width, int height)`
- Signal:
  - `OverlayUpdated(const QString& title, const QString& body, bool visible)`
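The VLM path above sends a JPEG-encoded frame in an OpenAI-compatible `chat/completions` request. A minimal sketch of building such a request body is shown below, assuming the image travels as a base64 `data:` URL inside an `image_url` content part. The function names, the model string, and the prompt are placeholders, and the repository may well build its JSON with Qt classes instead of by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Standard base64 (RFC 4648) with '=' padding.
std::string Base64EncodeSketch(const std::vector<std::uint8_t>& bytes) {
    static const char kTable[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((bytes.size() + 2) / 3) * 4);
    for (std::size_t i = 0; i < bytes.size(); i += 3) {
        const std::size_t remaining = bytes.size() - i;
        std::uint32_t n = static_cast<std::uint32_t>(bytes[i]) << 16;
        if (remaining > 1) n |= static_cast<std::uint32_t>(bytes[i + 1]) << 8;
        if (remaining > 2) n |= bytes[i + 2];
        out += kTable[(n >> 18) & 63];
        out += kTable[(n >> 12) & 63];
        out += remaining > 1 ? kTable[(n >> 6) & 63] : '=';
        out += remaining > 2 ? kTable[n & 63] : '=';
    }
    return out;
}

// Escapes quotes, backslashes, and control characters for a JSON string.
std::string JsonEscapeSketch(const std::string& text) {
    static const char kHex[] = "0123456789abcdef";
    std::string out;
    for (const char c : text) {
        const unsigned char u = static_cast<unsigned char>(c);
        if (c == '"') out += "\\\"";
        else if (c == '\\') out += "\\\\";
        else if (c == '\n') out += "\\n";
        else if (c == '\r') out += "\\r";
        else if (c == '\t') out += "\\t";
        else if (u < 0x20) { out += "\\u00"; out += kHex[u >> 4]; out += kHex[u & 15]; }
        else out += c;
    }
    return out;
}

// Builds the request body; model and prompt are caller-supplied placeholders.
std::string BuildVlmRequestBodySketch(const std::string& model, const std::string& prompt,
                                      const std::vector<std::uint8_t>& jpegBytes) {
    assert(!jpegBytes.empty());
    return "{\"model\":\"" + JsonEscapeSketch(model) +
           "\",\"messages\":[{\"role\":\"user\",\"content\":["
           "{\"type\":\"text\",\"text\":\"" + JsonEscapeSketch(prompt) + "\"},"
           "{\"type\":\"image_url\",\"image_url\":{\"url\":\"data:image/jpeg;base64," +
           Base64EncodeSketch(jpegBytes) + "\"}}]}]}";
}
```

The resulting string would be posted with a `Content-Type: application/json` header to whatever endpoint is configured.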
Namespace openzoom::processing
- Format conversion helpers:
  - `CopyArgbToBgra(...)`
  - `CopyRgbxToBgra(...)`
  - `ConvertNv12ToBgra(...)`
  - `ConvertYuy2ToBgra(...)`
- CPU effect helpers:
  - `ApplyBlackWhite(...)`
  - `ApplyZoom(...)`
  - `ApplyGaussianBlur(...)`
  - `ApplyTemporalSmoothCpu(...)`
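For readers unfamiliar with the format-conversion step, the sketch below shows one common way to turn an NV12 frame into BGRA, using the widely used integer approximation of the BT.601 limited-range matrix. It is not the repository's `ConvertNv12ToBgra(...)`: the name, the signature, the assumption that the UV plane shares the luma stride, and the choice of BT.601 are all illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

static std::uint8_t ClampToByteSketch(int v) {
    return static_cast<std::uint8_t>(std::clamp(v, 0, 255));
}

// NV12 layout assumed here: a full-resolution Y plane of `stride * height`
// bytes, followed by a half-height plane of interleaved U,V pairs with the
// same stride. Output is tightly packed BGRA with opaque alpha.
std::vector<std::uint8_t> Nv12ToBgraSketch(const std::uint8_t* nv12, int width,
                                           int height, int stride) {
    assert(nv12 != nullptr && width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0 && stride >= width);
    const std::uint8_t* yPlane = nv12;
    const std::uint8_t* uvPlane = nv12 + static_cast<std::size_t>(stride) * height;
    std::vector<std::uint8_t> bgra(static_cast<std::size_t>(width) * height * 4);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Each U,V pair is shared by a 2x2 block of luma samples.
            const std::uint8_t* uv =
                uvPlane + static_cast<std::size_t>(y / 2) * stride + (x / 2) * 2;
            const int c = yPlane[static_cast<std::size_t>(y) * stride + x] - 16;
            const int d = uv[0] - 128;
            const int e = uv[1] - 128;
            std::uint8_t* out = &bgra[(static_cast<std::size_t>(y) * width + x) * 4];
            out[0] = ClampToByteSketch((298 * c + 516 * d + 128) / 256);            // B
            out[1] = ClampToByteSketch((298 * c - 100 * d - 208 * e + 128) / 256);  // G
            out[2] = ClampToByteSketch((298 * c + 409 * e + 128) / 256);            // R
            out[3] = 255;                                                           // A
        }
    }
    return bgra;
}
```

A production converter would also need to handle odd dimensions and the color matrix advertised by the camera; both are left out to keep the sketch short.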
Namespace openzoom::processing
- `struct CpuPipelineConfig`:
  - `enableBlackWhite`, `blackWhiteThreshold`
  - `enableZoom`, `zoomAmount`, `zoomCenterX`, `zoomCenterY`
  - `enableBlur`, `blurRadius`, `blurSigma`
  - `enableTemporalSmooth`, `temporalSmoothAlpha`
- `struct CpuPipelineOutput`: `data`, `width`, `height`, `isComposite`
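To make the `enableBlackWhite` / `blackWhiteThreshold` pair concrete, here is a small sketch of a threshold-based black/white pass over a BGRA buffer. The function name, the parameter list, and the integer luma weights (an approximation of BT.601) are assumptions; the map does not show how `ApplyBlackWhite(...)` actually computes luminance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Converts each BGRA pixel in place to pure black or pure white depending on
// whether its approximate luma reaches `threshold` (0..255). Alpha is kept.
void ApplyBlackWhiteSketch(std::uint8_t* bgra, int width, int height,
                           int strideBytes, int threshold) {
    assert(bgra != nullptr && width > 0 && height > 0);
    assert(strideBytes >= width * 4);
    for (int y = 0; y < height; ++y) {
        std::uint8_t* row = bgra + static_cast<std::size_t>(y) * strideBytes;
        for (int x = 0; x < width; ++x) {
            std::uint8_t* px = row + x * 4;
            // Integer luma: weights 29/150/77 sum to 256 (roughly 0.114/0.587/0.299).
            const int luma = (29 * px[0] + 150 * px[1] + 77 * px[2]) >> 8;
            const std::uint8_t value = luma >= threshold ? 255 : 0;
            px[0] = value;
            px[1] = value;
            px[2] = value;
        }
    }
}
```

Passing the stride separately from the width allows the same routine to work on padded rows.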
CpuFramePipeline
- Owns the CPU fallback pipeline and the intermediate stage buffers.
- Public API:
  - `bool ConvertFrameToBgra(...)`
  - `bool RotateRawBuffer(int quarterTurns, UINT& width, UINT& height)`
  - `CpuPipelineOutput BuildStages(UINT width, UINT height, const CpuPipelineConfig& config, bool debugViewEnabled)`
  - `bool ResampleToFill(UINT targetWidth, UINT targetHeight, float centerXNorm, float centerYNorm)`
  - `void ResetTemporalHistory()`
  - `const std::vector<uint8_t>& StageRaw() const`
  - `UINT RawWidth() const`
  - `UINT RawHeight() const`
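Temporal smoothing with a resettable history is commonly implemented as an exponential moving average across frames. The sketch below illustrates that pattern under assumptions: the class name is hypothetical, `alpha` is taken to be the weight of the newest frame, and the history is reseeded whenever the frame size changes. The real pipeline's interpretation of `temporalSmoothAlpha` is not documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not a class from the repository.
class TemporalSmootherSketch {
public:
    // Drops the history so the next frame passes through unchanged.
    void Reset() { history_.clear(); }

    // Blends `frame` in place with the running average.
    // alpha = 1 disables smoothing; smaller values smooth more.
    void Apply(std::vector<std::uint8_t>& frame, float alpha) {
        assert(alpha > 0.0f && alpha <= 1.0f);
        if (history_.size() != frame.size()) {
            // First frame or resolution change: seed the history.
            history_.assign(frame.begin(), frame.end());
            return;
        }
        for (std::size_t i = 0; i < frame.size(); ++i) {
            history_[i] = alpha * frame[i] + (1.0f - alpha) * history_[i];
            frame[i] = static_cast<std::uint8_t>(history_[i] + 0.5f);  // round to nearest
        }
    }

private:
    std::vector<float> history_;  // running average per byte
};
```

Keeping the history in floating point avoids the banding that builds up when the average is repeatedly rounded to 8 bits.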
openzoom::VideoRecorder
- Thin Media Foundation sink-writer wrapper for processed MP4 output.
- Public API:
  - `VideoRecorder()`
  - `~VideoRecorder()`
  - `bool Start(const std::wstring& filePath, UINT width, UINT height, UINT fps)`
  - `void Stop()`
  - `bool IsRecording() const`
  - `bool AddFrame(const uint8_t* bgraData, size_t strideBytes)`
  - `double DurationSeconds() const`
  - `const std::string& LastError() const`
- Internal helpers:
  - `InitializeSink(...)`
  - `SetError(const std::string& err)`
openzoom::D3D12Presenter
- Manages the D3D12 device, swap chain, upload buffer, shared fence, and readback buffer.
- Public API:
  - `D3D12Presenter()`
  - `~D3D12Presenter()`
  - `void Initialize(HWND hwnd, UINT width, UINT height)`
  - `bool IsInitialized() const`
  - `void Resize(UINT width, UINT height)`
  - `void Present(const uint8_t* data, UINT width, UINT height)`
  - `void PresentFromTexture(ID3D12Resource* texture, UINT width, UINT height, const FenceSyncParams* fenceSync = nullptr)`
  - `bool ReadbackTexture(ID3D12Resource* texture, UINT width, UINT height, std::vector<uint8_t>& outBgra)`
  - `ID3D12Device* GetDevice() const`
  - `ID3D12Fence* GetFence() const`
  - `UINT64 GetLastSignaledFenceValue() const`
Supporting types:
- `struct FenceSyncParams`: `enable`, `waitValue`, `signalValue`
- `enum class SpatialUpscaler`: `kFsrEasuRcas`, `kNis`
- `enum class CudaBufferFormat`: `kRgba8`, `kRgba16F`
- `struct ProcessingSettings`: toggles and parameters for BW, zoom, blur, focus marker, spatial sharpening, temporal smoothing, and staging format
- `struct ProcessingInput`: `hostPixels`, `hostStrideBytes`, `pixelSizeBytes`, `width`, `height`
openzoom::CudaInteropSurface
- Imports a D3D12 texture into CUDA and runs the GPU effect chain.
- Public API:
  - `explicit CudaInteropSurface(ID3D12Resource* texture, ID3D12Fence* sharedFence = nullptr)`
  - `~CudaInteropSurface()`
  - `bool IsValid() const`
  - `bool HasExternalSemaphore() const`
  - `void RunGradientDemoKernel(unsigned int width, unsigned int height, float timeSeconds)`
  - `bool ProcessFrame(const ProcessingInput& input, const ProcessingSettings& settings, const FenceSyncParams& fenceSync)`
  - `const std::string& LastError() const`
  - `void ResetTemporalHistory()`
Kernel launch wrappers:
- `LaunchGradientKernel(...)`
- `LaunchBlackWhiteKernel(...)`
- `LaunchZoomKernel(...)`
- `LaunchBlackWhiteLinear(...)`
- `LaunchZoomLinear(...)`
- `LaunchGaussianBlurLinear(...)`
- `LaunchFocusMarkerLinear(...)`
- `LaunchFsrEasuRcasLinear(...)`
- `LaunchNisLinear(...)`
- `LaunchTemporalSmoothLinear(...)`
- `bool UploadGaussianKernel(int radius, float sigma, cudaStream_t stream)`
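`UploadGaussianKernel` takes a radius and a sigma, which suggests that filter weights are computed on the host before upload. A host-side sketch of building normalized 1D Gaussian weights is given below. The function name is hypothetical, and whether the repository stores the full symmetric kernel or only one half is not stated in this map.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns 2 * radius + 1 weights sampled from a Gaussian with the given sigma
// and normalized so that they sum to 1.
std::vector<float> BuildGaussianWeightsSketch(int radius, float sigma) {
    assert(radius >= 0 && sigma > 0.0f);
    std::vector<float> weights(static_cast<std::size_t>(2 * radius + 1));
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        const float w = std::exp(-static_cast<float>(i * i) / (2.0f * sigma * sigma));
        weights[static_cast<std::size_t>(i + radius)] = w;
        sum += w;
    }
    // Normalizing keeps overall image brightness unchanged after the blur.
    for (float& w : weights) {
        w /= sum;
    }
    return weights;
}
```

Because a Gaussian is separable, the same 1D weights can serve both a horizontal and a vertical pass.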
openzoom::RenderWidget
- Native widget that hosts the D3D12 presenter.
- Public API:
  - `explicit RenderWidget(QWidget* parent = nullptr)`
  - `QPaintEngine* paintEngine() const override`
  - `void setPresenter(D3D12Presenter* presenter)`
  - `bool isPresenterReady() const`
openzoom::JoystickOverlay
- Circular on-canvas joystick overlay.
- Public API:
  - `explicit JoystickOverlay(QWidget* parent = nullptr)`
  - `void ResetKnob()`
- Signal:
  - `JoystickChanged(float normX, float normY)`
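A circular joystick has to turn a pointer position into the normalized axes carried by `JoystickChanged`. One plausible way to do that is sketched below: offset from the center, divide by the radius, and clamp to the unit circle. The struct, the function name, and the parameters are hypothetical; the overlay's actual geometry code is not described in this map.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type for the sketch.
struct JoystickAxesSketch {
    float x;
    float y;
};

// Maps a pointer position (widget pixels) to axes in the unit circle.
// centerX/centerY and radius describe the joystick's circular area.
JoystickAxesSketch KnobToAxesSketch(float px, float py, float centerX,
                                    float centerY, float radius) {
    assert(radius > 0.0f);
    float dx = (px - centerX) / radius;
    float dy = (py - centerY) / radius;
    const float length = std::sqrt(dx * dx + dy * dy);
    if (length > 1.0f) {
        // Pointer is outside the circle: keep the direction, cap the magnitude.
        dx /= length;
        dy /= length;
    }
    return {dx, dy};
}
```

Clamping to the circle rather than to a square keeps diagonal deflection from being faster than horizontal or vertical deflection.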
openzoom::MainWindow
- Builds the UI shell and exposes widget accessors used by `OpenZoomApp`.
- Public API includes getters for:
  - camera selection and mode list
  - quick-mode preset list, preset description label, and quick-option promotion button
  - BW, zoom, blur, temporal smoothing, and spatial sharpening controls
  - OCR/VLM scaffolding checkboxes plus assistive overlay toggle
  - focus sliders, rotation combo, debug toggle, focus marker, joystick toggle
  - capture and recording buttons
  - processing status label
- Event handling:
  - arrow-key routing for panning
  - event filter on the render widget for Ctrl+wheel zoom, plain-wheel pan, and middle-button drag pan
openzoom::AssistiveOverlay
- Semi-transparent overlay widget drawn on top of the render surface.
- Public API:
  - `explicit AssistiveOverlay(QWidget* parent = nullptr)`
  - `void SetContent(const QString& title, const QString& body, bool visible)`
- On Windows, constructs `OpenZoomApp` and runs it inside a `try/catch`.
- On non-Windows platforms, exits with an unsupported-platform message.