Audio inpainting methods aim to reconstruct missing sections of a recording. Diffusion-based probabilistic models can generate consistent solutions for medium-length gaps of 300 milliseconds, but struggle to maintain semantic consistency over gaps of multiple seconds. This work introduces the hybrid Similarity-Guided Diffusion Posterior Sampling (SimDPS) method to control the content of the reconstructed audio. SimDPS first selects an auxiliary candidate through a similarity search over a corpus of contextually relevant signals; this candidate then guides a modified diffusion-based sampling scheme toward a similar reconstruction. Subjective evaluation of 2-s inpainting tasks on piano recordings shows that SimDPS improves perceptual plausibility compared to unguided diffusion-based inpainting, and that it frequently improves upon a similarity-based method alone when the candidates are moderately plausible. The results demonstrate the potential of a hybrid, similarity-based approach for guiding diffusion-based inpainting in audio.
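The candidate-selection step described above can be illustrated with a minimal sketch. It is not the SimDPS implementation: the abstract does not specify the similarity measure or the features, so this example assumes cosine similarity on raw samples of the context around the gap, and the names `cosine` and `select_candidate` are hypothetical. The diffusion guidance step is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length sequences (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def select_candidate(context_before, context_after, corpus, gap_len):
    """Brute-force search for the corpus segment whose surroundings best match the gap.

    Assumed criterion (not taken from the paper): cosine similarity between the
    known context on both sides of the gap and the samples on both sides of each
    possible segment of length gap_len in the corpus.
    Returns (score, signal_index, gap_start); (-inf, -1, -1) if nothing fits.
    """
    query = list(context_before) + list(context_after)
    n_b, n_a = len(context_before), len(context_after)
    best = (-math.inf, -1, -1)
    for idx, signal in enumerate(corpus):
        # Last window position that still leaves room for context, gap and context.
        last_start = len(signal) - (n_b + gap_len + n_a)
        for start in range(last_start + 1):
            gap_start = start + n_b
            gap_end = gap_start + gap_len
            key = list(signal[start:gap_start]) + list(signal[gap_end:gap_end + n_a])
            score = cosine(query, key)
            if score > best[0]:
                best = (score, idx, gap_start)
    return best
```

Under these assumptions, the slice `corpus[idx][gap_start:gap_start + gap_len]` would play the role of the auxiliary candidate. A practical system would more likely compare perceptually motivated features than raw samples, and would replace the exhaustive loop with an indexed search.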