
Bristol Vision Institute - Research

Reached state-of-the-art on the BVI-RLV benchmark (29.22 dB PSNR) by re-engineering a CNN-RNN video restoration pipeline on HPC infrastructure.
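
Here, PSNR is the standard peak signal-to-noise ratio between restored and ground-truth frames. A minimal sketch of the metric, assuming tensors scaled to [0, 1] (the function name and signature are illustrative, not from the pipeline):

import torch

def psnr(restored: torch.Tensor, reference: torch.Tensor, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    mse = torch.mean((restored - reference) ** 2)
    return (10.0 * torch.log10(max_val ** 2 / mse)).item()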

Outcomes

  • Higher temporal consistency across restored frames
  • Faster experiment cycles via SLURM-scheduled HPC runs
  • More stable tensor pipelines through explicit shape instrumentation

Stack

  • PyTorch
  • SLURM
  • Python
  • HPC
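
Experiments ran as SLURM jobs on shared HPC nodes. A representative submission script is sketched below; the resource figures, script name, and config path are placeholders, not the actual cluster setup:

#!/bin/bash
#SBATCH --job-name=bvi-rlv-train
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=24:00:00

source venv/bin/activate
python train.py --config configs/bvi_rlv.yaml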

What I Learned

  • Feature alignment: Bidirectional warping aligns forward/backward features before fusion (a minimal warp sketch follows this list).
  • Temporal stabilization: ConvGRU passes preserve frame-to-frame detail under motion (a cell sketch follows the code snippet below).
  • Regression prevention: Tensor-shape instrumentation catches concat and recurrent mismatches.
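
The warp used in the alignment step (and called in the snippet below) is assumed to be standard backward warping via torch.nn.functional.grid_sample; a minimal sketch under that assumption, with flow given in pixel units:

import torch
import torch.nn.functional as F

def warp(feat: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp features [B, C, H, W] by optical flow [B, 2, H, W] (pixels)."""
    _, _, h, w = feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=feat.device, dtype=feat.dtype),
        torch.arange(w, device=feat.device, dtype=feat.dtype),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + flow[:, 0]            # displaced x coordinates
    grid_y = ys.unsqueeze(0) + flow[:, 1]            # displaced y coordinates
    grid_x = 2.0 * grid_x / max(w - 1, 1) - 1.0      # normalize to [-1, 1]
    grid_y = 2.0 * grid_y / max(h - 1, 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)     # [B, H, W, 2], (x, y) order
    return F.grid_sample(feat, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)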

Implementation Notes

  • Frame batches enter forward and backward optical-flow warping blocks.
  • Bidirectional features are fused before recurrent refinement.
  • ConvGRU layers propagate temporal context across multiple passes.
  • Decoder reconstructs denoised frames and computes reconstruction losses.

Code Snippet

# CNN-RNN re-engineering loop with explicit tensor guards
# features: [B, T, C, H, W]; flow_fwd / flow_bwd: [B, T, 2, H, W]
import torch

B, seq_len, C, H, W = features.shape
state = features.new_zeros(B, hidden_channels, H, W)    # initial ConvGRU state
out = torch.empty_like(features)                        # restored frames (assumes decoder output has C channels)

for t in range(seq_len):
    fwd = warp(features[:, t], flow_fwd[:, t])          # [B, C, H, W]
    bwd = warp(features[:, seq_len - 1 - t], flow_bwd[:, t])

    fused = torch.cat([fwd, bwd], dim=1)                # [B, 2C, H, W]
    if fused.shape[1] != 2 * hidden_channels:
        raise RuntimeError(f"unexpected channels: {fused.shape}")

    state = conv_gru(fused, state)                      # temporal refinement
    out[:, t] = decoder(state)                          # reconstruct frame t
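
The conv_gru call above is assumed to be a convolutional GRU cell, i.e. a GRU whose gates are computed with convolutions instead of dense matmuls. A minimal cell sketch under that assumption (class name and kernel size are illustrative):

import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """GRU cell with 3x3 convolutions in place of dense layers."""
    def __init__(self, in_ch: int, hid_ch: int, kernel: int = 3):
        super().__init__()
        pad = kernel // 2
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, kernel, padding=pad)
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, kernel, padding=pad)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde             # update-gated blend

Matching the loop above, conv_gru = ConvGRUCell(2 * hidden_channels, hidden_channels) would accept the fused [B, 2C, H, W] tensor while keeping the hidden state at hidden_channels channels.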