Rendering content: pl_frame, pl_renderer, and pl_queue

This example roughly builds off the previous entry, and as such will not cover the basics of how to create a window, initialize a pl_gpu and get pixels onto the screen.

Renderer

The pl_renderer set of APIs represents the highest-level interface into libplacebo, and is what most users who simply want to display e.g. a video feed on-screen will want to be using.

The basic initialization is straightforward, requiring no extra parameters:

pl_renderer renderer;

init()
{
    renderer = pl_renderer_create(pllog, gpu);
    if (!renderer)
        goto error;

    // ...
}

uninit()
{
    pl_renderer_destroy(&renderer);
}

What makes the renderer powerful is the large number of pl_render_params it exposes. By default, libplacebo provides several presets to use:

pl_render_fast_params: Disables everything except for defaults. This is the fastest possible configuration.
pl_render_default_params: Contains the recommended default parameters, including some slightly higher quality scaling, as well as dithering.
pl_render_high_quality_params: A preset of reasonable defaults for a higher-end machine (i.e. anything with a discrete GPU). This enables most of the basic functionality, including upscaling, downscaling, debanding and better HDR tone mapping.

Covering all of the possible options exposed by pl_render_params is out-of-scope of this example and would be better served by looking at the API documentation.

Frames

pl_frame is the struct libplacebo uses to group textures and their metadata together into a coherent unit that can be rendered using the renderer. This is not currently a dynamically allocated or refcounted heap object, it is merely a struct that can live on the stack (or anywhere else). The actual data lives in corresponding pl_tex objects referenced in each of the frame's planes.

bool render_frame(const struct pl_frame *image,
                  const struct pl_swapchain_frame *swframe)
{
    struct pl_frame target;
    pl_frame_from_swapchain(&target, swframe);

    return pl_render_image(renderer, image, target,
                           &pl_render_default_params);
}

Renderer state

The pl_renderer is conceptually (almost) stateless. The only thing that is needed to get a different result is to change the render params, which can be varied freely on every call, if the user desires.

The one case where this is not entirely true is when using frame mixing (see below), or when using HDR peak detection. In this case, the renderer can be explicitly reset using pl_renderer_flush_cache.

To upload frames, the easiest methods are made available as dedicated helpers in <libplacebo/utils/upload.h>, and <libplacebo/utils/libav.h> (for AVFrames). In general, I recommend checking out the demo programs for a clearer illustration of how to use them in practice.

Shader cache

The renderer internally generates, compiles and caches a potentially large number of shader programs, some of which can be complex. On some platforms (notably D3D11), these can be quite costly to recompile on every program launch.

As such, the renderer offers a way to save/restore its internal shader cache from some external location (managed by the API user). The use of this API is highly recommended:

static uint8_t *load_saved_cache();
static void store_saved_cache(uint8_t *cache, size_t bytes);

void init()
{
    renderer = pl_renderer_create(pllog, gpu);
    if (!renderer)
        goto error;

    uint8_t *cache = load_saved_cache();
    if (cache) {
        pl_renderer_load(renderer, cache);
        free(cache);
    }

    // ...
}

void uninit()
{
    size_t cache_bytes = pl_renderer_save(renderer, NULL);
    uint8_t *cache = malloc(cache_bytes);
    if (cache) {
        pl_renderer_save(renderer, cache);
        store_saved_cache(cache, cache_bytes);
        free(cache);
    }

    pl_renderer_destroy(&renderer);
}

Cache safety

libplacebo performs only minimal validity checking on the shader cache, and in general, cannot possibly guard against malicious alteration of such files. Loading a cache from an untrusted source represents a remote code execution vector.

Frame mixing

One of the renderer's most powerful features is its ability to compensate for differences in framerates between the source and display by using frame mixing to blend adjacent frames together.

Using this API requires presenting the renderer, at each vsync, with a pl_frame_mix struct, describing the current state of the vsync. In principle, such structs can be constructed by hand. To do this, all of the relevant frames (nearby the vsync timestamp) must be collected, and their relative distances to the vsync determined, by normalizing all PTS values such that the vsync represents time 0.0 (and a distance of 1.0 represents the nominal duration between adjacent frames). Note that timing vsyncs, and determining the correct vsync duration, are both left as problems for the user to solve.¹. Here could be an example of a valid struct:

(struct pl_frame_mix) {
    .num_frames = 6
    .frames = (const struct pl_frame *[]) {
        /* frame 0 */
        /* frame 1 */
        /* ... */
        /* frame 5 */
    },
    .signatures = (uint64_t[]) {
        0x0, 0x1, 0x2, 0x3, 0x4, 0x5 // 
These must be unique per frame, but always refer to the same frame. For
    example, this could be based on the frame's PTS, the frame's numerical ID
    (in order of decoding), or some sort of hash. The details don't matter,
    only that this uniquely identifies specific frames.

    },
    .timestamps = (float[]) {
        -2.4, -1.4, -0.4, 0.6, 1.6, 2.6, // 
Typically, for CFR sources, frame timestamps will always be separated in
    this list by a distance of 1.0. In this example, the vsync falls roughly
    halfway (but not quite) in between two adjacent frames (with IDs 0x2 and
    0x3).

    },
    .vsync_duration = 0.4, // 24 fps video on 60 fps display
}

Frame mixing radius

In this example, the frame mixing radius (as determined by pl_frame_mix_radius is 3.0, so we include all frames that fall within the timestamp interval of [-3, 3). In general, you should consult this function to determine what frames need to be included in the pl_frame_mix - though including more frames than needed is not an error.

Frame queue

Because this API is rather unwieldy and clumsy to use directly, libplacebo provides a helper abstraction known as pl_queue to assist in transforming some arbitrary source of frames (such as a video decoder) into nicely packed pl_frame_mix structs ready for consumption by the pl_renderer:

#include <libplacebo/utils/frame_queue.h>

pl_queue queue;

void init()
{
    queue = pl_queue_create(gpu);
}

void uninit()
{
    pl_queue_destroy(&queue);
    // ...
}

This queue can be interacted with through a number of mechanisms: either pushing frames (blocking or non-blocking), or by having the queue poll frames (via blocking or non-blocking callback) as-needed. For a full overview of the various methods of pushing and polling frames, check the API documentation.

In this example, I will assume that we have a separate decoder thread pushing frames into the pl_queue in a blocking manner:

static void decoder_thread(void)
{
    void *frame;

    while ((frame = /* decode new frame */)) {
        pl_queue_push_block(queue, UINT64_MAX, &(struct pl_source_frame) {
            .pts        = /* frame pts */,
            .duration   = /* frame duration */,
            .map        = /* map callback */,
            .unmap      = /* unmap callback */,
            .frame_data = frame,
        });
    }

    pl_queue_push(queue, NULL); // signal EOF
}

Now, in our render loop, we want to call pl_queue_update with appropriate values to retrieve the correct frame mix for each vsync:

bool render_frame(const struct pl_swapchain_frame *swframe)
{
    struct pl_frame_mix mix;
    enum pl_queue_status res;
    res = pl_queue_update(queue, &mix, pl_queue_params(
        .pts            = /* time of next vsync */,
        .radius         = pl_frame_mix_radius(&render_params),
        .vsync_duration = /* if known */,
        .timeout        = UINT64_MAX, // 
Setting this makes pl_queue_update block indefinitely until sufficiently
    many frames have been pushed into the pl_queue from our separate
    decoding thread.

    ));

    switch (res) {
    case PL_QUEUE_OK:
        break;
    case PL_QUEUE_EOF:
        /* no more frames */
        return false;
    case PL_QUEUE_ERR:
        goto error;
    // 
There is a fourth status, PL_QUEUE_MORE, which is returned only if the
    resulting frame mix is incomplete (and the timeout was reached) -
    basically this can only happen if the queue runs dry due to frames not
    being supplied fast enough.
In this example, since we are setting timeout to UINT64_MAX, we will
never get this return value.

    }


    struct pl_frame target;
    pl_frame_from_swapchain(&target, swframe);

    return pl_render_image_mix(renderer, &mix, target,
                               &pl_render_default_params);
}

Deinterlacing

The frame queue also vastly simplifies the process of performing motion-adaptive temporal deinterlacing, by automatically linking together adjacent fields/frames. To take advantage of this, all you need to do is set the appropriate field (pl_source_frame.first_frame), as well as enabling deinterlacing parameters.

However, this may change in the future, as the recent introduction of the Vulkan display timing extension may result in display timing feedback being added to the pl_swapchain API. That said, as of writing, this has not yet happened. ↩