Problem

Existing graphics engines are either too heavy for small projects or too limiting for advanced rendering techniques. Developers need a lightweight, modern C++ graphics engine that provides maximum performance while maintaining flexibility for custom rendering pipelines.

Approach

The engine is built from scratch in C++20, with a focus on performance, modularity, and cross-platform compatibility. It pairs current rendering techniques with a clean, extensible architecture:

Advanced Rendering Pipeline

Physically-Based Rendering: Full PBR implementation with IBL and multiple light types
Clustered Deferred Rendering: Efficient handling of thousands of dynamic lights
Temporal Upsampling: Reconstructs a higher-resolution, anti-aliased image from lower-resolution frames at low performance cost
Dynamic Resolution Scaling: Adaptive quality based on performance targets
Volumetric Lighting: Real-time fog, god rays, and atmospheric scattering
Screen-Space Reflections: High-quality reflections with temporal filtering
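
The list above names dynamic resolution scaling without showing a control loop. The following is a minimal sketch of one possible controller, assuming a measured GPU frame time is available every frame. The class name DynamicResolution, the 5% step, and the dead-band thresholds are illustrative assumptions, not the engine's actual API or tuning.

```cpp
#include <algorithm>

// Hypothetical controller (not part of the engine's published code).
// It nudges the render scale down when the GPU is over budget and back up
// when there is headroom, with a dead band in between to avoid oscillation.
class DynamicResolution {
public:
    explicit DynamicResolution(float targetMs, float minScale = 0.5f, float maxScale = 1.0f)
        : targetMs_(targetMs), minScale_(minScale), maxScale_(maxScale), scale_(maxScale) {}

    // Feed the last measured GPU frame time; returns the scale for the next frame.
    float update(float gpuFrameMs) {
        constexpr float kStep = 0.05f;            // assumed step size per frame
        if (gpuFrameMs > targetMs_ * 1.05f) {
            scale_ -= kStep;                      // over budget: render fewer pixels
        } else if (gpuFrameMs < targetMs_ * 0.85f) {
            scale_ += kStep;                      // clear headroom: restore quality
        }                                         // otherwise: inside the dead band
        scale_ = std::clamp(scale_, minScale_, maxScale_);
        return scale_;
    }

    float scale() const noexcept { return scale_; }

private:
    float targetMs_;
    float minScale_;
    float maxScale_;
    float scale_;
};
```

A renderer would multiply its output resolution by the returned scale before allocating or selecting render targets, then let the temporal upsampler reconstruct the full-resolution image. A production controller would likely smooth the timing over several frames instead of reacting to a single sample.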

Engine Architecture

Entity-Component-System: Data-oriented design for maximum performance
Multi-threaded Rendering: Parallel command buffer generation and submission
GPU-Driven Rendering: Minimize CPU overhead with compute-based culling
Memory Pool Allocators: Custom allocation strategies for different resource types
Hot-Reload System: Real-time asset reloading for rapid iteration
Cross-Platform Support: Windows, Linux, macOS with unified API
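
To make the memory pool entry above concrete, here is a minimal sketch of a fixed-size pool with an intrusive free list. PoolAllocator and its create/destroy interface are hypothetical names chosen for illustration; the engine's real allocators are not shown in this document and may differ.

```cpp
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical fixed-capacity object pool (illustrative only).
// Free slots store a pointer to the next free slot in the same bytes
// that hold the object while the slot is live.
template<typename T>
class PoolAllocator {
    union Slot {
        Slot* next;                                   // used while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // used while the slot is live
    };
    std::vector<Slot> slots_;      // never resized, so slot addresses stay stable
    Slot* freeList_ = nullptr;

public:
    explicit PoolAllocator(std::size_t capacity) : slots_(capacity) {
        for (Slot& s : slots_) { s.next = freeList_; freeList_ = &s; }
    }
    PoolAllocator(const PoolAllocator&) = delete;
    PoolAllocator& operator=(const PoolAllocator&) = delete;

    // Constructs a T in a free slot; returns nullptr when the pool is exhausted.
    template<typename... Args>
    T* create(Args&&... args) {
        if (!freeList_) return nullptr;
        Slot* s = freeList_;
        freeList_ = s->next;
        try {
            return ::new (static_cast<void*>(s->storage)) T(std::forward<Args>(args)...);
        } catch (...) {
            s->next = freeList_;   // construction failed: hand the slot back
            freeList_ = s;
            throw;
        }
    }

    // Destroys the object and returns its slot to the free list.
    // The pointer must have come from create() on this pool.
    void destroy(T* p) noexcept {
        if (!p) return;
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);
        s->next = freeList_;
        freeList_ = s;
    }
};
```

Allocation and release are both O(1) and never touch the system heap after construction, which is the usual motivation for pooling many same-sized objects such as components or command packets. The sketch does not destroy objects still live when the pool itself is destroyed; that is left to the caller.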

Core Implementation

// ---------------------------------------------------------------------------
// strong_handle.hpp
// Modern C++20 strong handle with generation tracking and hashing
// ---------------------------------------------------------------------------
#pragma once
#include <cstddef>     // size_t
#include <cstdint>
#include <functional>  // std::hash
#include <tuple>       // std::tie
 
template<typename Tag>
class StrongHandle {
    uint32_t _index{0};
    uint32_t _generation{0};
public:
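    constexpr StrongHandle() = default;
    // explicit: a raw integer must not silently convert into a typed handle
    constexpr explicit StrongHandle(uint32_t idx, uint32_t gen = 0) noexcept
        : _index(idx), _generation(gen) {}
 
    // Index 0 is reserved as the null handle, so a default-constructed handle is invalid.
    constexpr bool is_valid() const noexcept { return _index != 0; }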
    constexpr uint32_t index() const noexcept { return _index; }
    constexpr uint32_t generation() const noexcept { return _generation; }
 
    friend constexpr bool operator==(StrongHandle a, StrongHandle b) noexcept {
        return a._index == b._index && a._generation == b._generation;
    }
    friend constexpr bool operator!=(StrongHandle a, StrongHandle b) noexcept {
        return !(a == b);
    }
    friend constexpr bool operator<(StrongHandle a, StrongHandle b) noexcept {
        return std::tie(a._generation, a._index) < std::tie(b._generation, b._index);
    }
};
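 
// Hash support so handles can key std::unordered_map / std::unordered_set.
template<typename Tag>
struct std::hash<StrongHandle<Tag>> {
    size_t operator()(StrongHandle<Tag> h) const noexcept {
        // Pack both fields into 64 bits first: shifting a 32-bit size_t by 32
        // would be undefined behavior on 32-bit targets.
        const uint64_t packed = (static_cast<uint64_t>(h.index()) << 32) | h.generation();
        return std::hash<uint64_t>{}(packed);
    }
};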
 
// Forward tags
struct BufferTag {};
using BufferHandle = StrongHandle<BufferTag>;
struct ShaderTag {};
using ShaderHandle = StrongHandle<ShaderTag>;
 
 
// ---------------------------------------------------------------------------
// bitflags.hpp
// Enum-class bitflags utilities
// ---------------------------------------------------------------------------
#pragma once
#include <type_traits>
 
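// Matches scoped enums only: unscoped enums convert implicitly to int.
// (C++23 offers std::is_scoped_enum_v for the same check.)
template<typename E>
concept EnumClass = std::is_enum_v<E> && !std::is_convertible_v<E, int>;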
 
template<EnumClass E>
constexpr E operator|(E a, E b) {
    using U = std::underlying_type_t<E>;
    return static_cast<E>(static_cast<U>(a) | static_cast<U>(b));
}
template<EnumClass E>
constexpr E operator&(E a, E b) {
    using U = std::underlying_type_t<E>;
    return static_cast<E>(static_cast<U>(a) & static_cast<U>(b));
}
template<EnumClass E>
constexpr E operator~(E e) {
    using U = std::underlying_type_t<E>;
    return static_cast<E>(~static_cast<U>(e));
}
template<EnumClass E>
constexpr bool has(E value, E flag) {
    return static_cast<bool>(value & flag);
}
 
 
// ---------------------------------------------------------------------------
// culling_system.hpp
// GPU-driven frustum + Hi-Z occlusion culling
// ---------------------------------------------------------------------------
#pragma once
#include "strong_handle.hpp"
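#include <cstdint>
 
// Matrix4f and Vector4f are stored by value in CullData, so their complete
// definitions must be visible here; a forward declaration is not enough.
// ShaderBinder, ComputeShader, and CommandList likewise come from engine
// headers that this excerpt does not show.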
 
class CullingSystem {
public:
    struct alignas(16) CullData {
        Matrix4f viewProj;
        Vector4f frustumPlanes[6];
        uint32_t objectCount;
        float lodBias;
        uint32_t depthPyramidMip; // extra param for Hi-Z
    };
 
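    void dispatch(const CullData& data) {
        ShaderBinder _(cullingCS_);                 // RAII binder: releases the shader at scope exit
        cullingCS_.set_uniform("uCull", data);
        // One thread per object in groups of 64; round up so a partial last group still runs.
        const uint32_t groups = (data.objectCount + 63) / 64;
        cmd_.dispatch(groups, 1, 1);
    }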
 
    BufferHandle visible_objects() const { return visBuffer_; }
 
    // Debug helper: draws a micro-UI bar with dispatch cost
    void draw_gui() const;
private:
    ComputeShader cullingCS_;
    BufferHandle visBuffer_;
    CommandList cmd_;
};
 
 
// ---------------------------------------------------------------------------
// material_system.hpp
// Automatic shader-variant baker with hot reload
// ---------------------------------------------------------------------------
#pragma once
#include "bitflags.hpp"
#include "strong_handle.hpp"
#include <unordered_map>
 
class MaterialSystem {
public:
    enum class Flags : uint32_t {
        NONE            = 0,
        HAS_ALBEDO_MAP  = 1 << 0,
        HAS_NORMAL_MAP  = 1 << 1,
        HAS_METALLIC    = 1 << 2,
        HAS_ROUGHNESS   = 1 << 3,
        ALPHA_TESTED    = 1 << 4,
        DOUBLE_SIDED    = 1 << 5,
    };
 
    ShaderHandle get_shader(Flags flags) {
        if (auto it = cache_.find(flags); it != cache_.end())
            return it->second;
 
        ShaderPermutation perm;
        perm.define("ALBEDO_MAP",   has(flags, Flags::HAS_ALBEDO_MAP));
        perm.define("NORMAL_MAP",   has(flags, Flags::HAS_NORMAL_MAP));
        perm.define("METALLIC_MAP", has(flags, Flags::HAS_METALLIC));
        perm.define("ROUGHNESS_MAP",has(flags, Flags::HAS_ROUGHNESS));
        perm.define("ALPHA_TEST",   has(flags, Flags::ALPHA_TESTED));
        perm.define("DOUBLE_SIDED", has(flags, Flags::DOUBLE_SIDED));
 
        auto shader = shaders_.compile("pbr_material", perm);
        return cache_.emplace(flags, shader).first->second;
    }
 
    // Call once per frame (cheap) – will rebuild any shaders whose file changed
    void hot_reload() { shaders_.reload_dirty(); }
 
private:
    std::unordered_map<Flags, ShaderHandle> cache_;
    ShaderCompiler shaders_;
};
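
The StrongHandle listing above carries a generation counter but does not show how a stale handle is detected. The sketch below illustrates one common scheme under stated assumptions: Handle is a minimal stand-in for StrongHandle so the example compiles on its own, and SlotTable is a hypothetical container, not a class from the engine.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Minimal stand-in for StrongHandle<Tag> (assumption, for self-containment).
struct Handle {
    std::uint32_t index = 0;       // 0 means "null", as in StrongHandle::is_valid()
    std::uint32_t generation = 0;
};

// Hypothetical slot table: each slot remembers a generation that is bumped on
// removal, so handles issued before the removal no longer match.
template<typename T>
class SlotTable {
    struct Slot {
        T value{};
        std::uint32_t generation = 0;
        bool live = false;
    };
    std::vector<Slot> slots_;
    std::vector<std::uint32_t> free_;   // indices available for reuse

public:
    SlotTable() : slots_(1) {}          // slot 0 is reserved for the null handle

    Handle insert(T value) {
        std::uint32_t idx;
        if (!free_.empty()) {
            idx = free_.back();
            free_.pop_back();
        } else {
            idx = static_cast<std::uint32_t>(slots_.size());
            slots_.emplace_back();
        }
        Slot& s = slots_[idx];
        s.value = std::move(value);
        s.live = true;
        return Handle{idx, s.generation};
    }

    // Returns nullptr for null, out-of-range, removed, or stale handles.
    T* get(Handle h) {
        if (h.index == 0 || h.index >= slots_.size()) return nullptr;
        Slot& s = slots_[h.index];
        return (s.live && s.generation == h.generation) ? &s.value : nullptr;
    }

    bool remove(Handle h) {
        if (!get(h)) return false;
        Slot& s = slots_[h.index];
        s.live = false;
        ++s.generation;                 // invalidates every outstanding handle to this slot
        free_.push_back(h.index);
        return true;
    }
};
```

After a remove, the slot index is recycled but the old handle fails the generation check, so a dangling BufferHandle or ShaderHandle resolves to nullptr instead of aliasing a newer resource. The 32-bit generation can wrap after about four billion reuses of one slot; the sketch ignores that case.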

Technical Achievements

Performance Optimization

GPU-Driven Architecture: All culling and LOD selection on GPU using compute shaders
Bindless Resources: Direct GPU access to textures without CPU binding overhead
Parallel Command Generation: Multi-threaded command buffer recording
Memory-Mapped Buffers: Persistent mapping for streaming data updates
Custom Memory Allocators: Stack, ring, and pool allocators for different use cases
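
As a CPU-side reference for the GPU culling named above, the sketch below shows a bounding-sphere test against six frustum planes, plus the workgroup-count arithmetic used when dispatching one thread per object. The Plane and Sphere layouts, the inward-facing plane convention, and the function names are assumptions for illustration; the engine's compute shader is not shown in this document.

```cpp
#include <array>
#include <cstdint>

// Assumed convention: a point p is inside a plane when n.p + d >= 0.
struct Plane  { float nx, ny, nz, d; };
struct Sphere { float x, y, z, radius; };

// Returns true when the sphere is at least partly inside all six planes.
// The test is conservative: spheres near frustum corners may pass even
// though they are just outside, which is acceptable for culling.
inline bool sphere_in_frustum(const std::array<Plane, 6>& planes, const Sphere& s) {
    for (const Plane& p : planes) {
        const float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius) return false;   // entirely behind this plane
    }
    return true;
}

// Number of workgroups needed to cover `objects` threads, rounding up.
constexpr std::uint32_t group_count(std::uint32_t objects, std::uint32_t groupSize = 64) {
    return (objects + groupSize - 1) / groupSize;
}
```

A compute shader version would run the same loop once per object and append surviving object indices to the visible-object buffer, typically followed by the Hi-Z occlusion test.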

Innovation Highlights

Automatic Shader Variants: Runtime shader compilation with caching system
Temporal Resource Management: Smart resource lifetime tracking and cleanup
Hot-Reload Everything: Shaders, textures, meshes, and even C++ code
Debugging Tools: Built-in profiler with GPU timing and memory tracking
Cross-API Abstraction: Unified interface for Vulkan, DirectX 12, and Metal
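
One way to read "temporal resource management" is deferred destruction: a resource released on the CPU is only destroyed once the GPU can no longer be reading it. The sketch below assumes a fixed number of frames in flight and a monotonically increasing frame counter. DeferredDeleter and its methods are hypothetical names, not the engine's actual interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical deferred-destruction queue (illustrative only).
// Assumes retire() is called with non-decreasing frame numbers, which keeps
// the queue sorted by retirement frame.
class DeferredDeleter {
    struct Entry {
        std::uint64_t retireFrame;          // first frame on which destruction is safe
        std::function<void()> destroy;
    };
    std::deque<Entry> pending_;
    std::uint64_t framesInFlight_;

public:
    explicit DeferredDeleter(std::uint64_t framesInFlight)
        : framesInFlight_(framesInFlight) {}

    // Schedule `destroy` to run once `framesInFlight` further frames have begun.
    void retire(std::uint64_t currentFrame, std::function<void()> destroy) {
        pending_.push_back({currentFrame + framesInFlight_, std::move(destroy)});
    }

    // Call once at the start of each frame; returns how many resources were destroyed.
    std::size_t collect(std::uint64_t currentFrame) {
        std::size_t released = 0;
        while (!pending_.empty() && pending_.front().retireFrame <= currentFrame) {
            pending_.front().destroy();
            pending_.pop_front();
            ++released;
        }
        return released;
    }
};
```

With two frames in flight, a buffer retired during frame 10 is destroyed by the collect call for frame 12. A real implementation might key retirement on a GPU fence or timeline value instead of a frame count.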

Roadmap

Phase 1 - Foundation (Complete)

Core engine architecture with ECS
Vulkan renderer with basic PBR
Asset loading and management system
Basic scene graph and transforms
Memory management and profiling

Phase 2 - Advanced Rendering (Current)

Phase 3 - Next Generation

Ray tracing integration (RTX/RDNA2)
Machine learning-based upscaling
Advanced post-processing pipeline
VR/AR rendering optimizations
Procedural geometry generation

Phase 4 - Production Ready

Visual scripting system
Physics integration (custom or Bullet)
Audio system with spatial audio
Networking for multiplayer games
Complete editor with visual tools

Goober Graphics