I feel like I'm really close with this. Here's my second #Ebsynth experiment, an incremental step over the last one. This time I used keyframes every 100 frames or so. I think I need more keyframes, but the result is a lot clearer than the last one. So, net success!
Here, we did the original video with a deliberately incomplete CC4 model. I wanted to see if the Ebsynth/#stablediffusion combo I'm using would complete missing or broken facial expressions. And what's interesting here is that the answer is... kinda? It greatly improved the mouth and erased some of the weird clipping that was going on, but I don't feel like it did the eyes any favors.
The way I got here was also interesting. The first round was the video, which I shot in #Vseeface, using #OBS as a recorder. The transparent background in OBS took this experiment from a failure state to fully workable, because the default Vseeface backgrounds have a lot of weird noise in them.
Anyway, from there, I broke it out into raw frames and wrote some quick PHP code to pull out a keyframe every 100 or so frames. Then I pulled out SD again to trace each image with a shared seed, which gave me at least some character continuity at a low denoising setting.
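For anyone who wants to try the keyframe step, here's a rough sketch of the idea in Python. It is not my original PHP script. It assumes the raw frames are zero-padded PNGs (like 00000.png) sitting in one folder, and the function names and folder layout are made up for illustration.

```python
import shutil
from pathlib import Path


def pick_keyframes(frames, step=100):
    """Return every `step`-th item from an ordered sequence, starting with the first."""
    return list(frames)[::step]


def copy_keyframes(frames_dir, keys_dir, step=100, pattern="*.png"):
    """Copy every `step`-th frame from frames_dir into keys_dir.

    Assumes zero-padded file names, so that sorting by name matches
    playback order. Original file names are kept so each keyframe can
    still be matched back to its source frame.
    """
    frames = sorted(Path(frames_dir).glob(pattern))
    keys_dir = Path(keys_dir)
    keys_dir.mkdir(parents=True, exist_ok=True)
    picked = pick_keyframes(frames, step)
    for frame in picked:
        shutil.copy2(frame, keys_dir / frame.name)
    return [frame.name for frame in picked]
```

Calling `copy_keyframes("frames", "keys", step=100)` would copy frames 0, 100, 200 and so on into the keys folder. Dropping `step` to 50 is the easy way to test whether more keyframes clears things up.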
If you're interested, my #prompt against the photo was "detailed painting, sharp, high resolution, detailed face, 10k resolution, detailed skin, sharp, crisp, realistic skin, detailed hair, silent film star, film noir, detailed eyes, beautiful woman with red hair."
My CFG level here was 4.5, and my denoising was set to a very conservative 0.31, with a seed of 3420642523.
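To make the shared-seed idea concrete, here's a small Python sketch that builds one img2img request per keyframe using the same settings every time. The field names (`init_images`, `cfg_scale`, `denoising_strength`, `seed`) are an assumption modeled on the AUTOMATIC1111 web UI API, which may not match the front end you use, and the function name is hypothetical. It only builds the payload and doesn't send anything.

```python
import base64

# Prompt and settings as given in this post.
PROMPT = (
    "detailed painting, sharp, high resolution, detailed face, 10k resolution, "
    "detailed skin, sharp, crisp, realistic skin, detailed hair, silent film star, "
    "film noir, detailed eyes, beautiful woman with red hair"
)
SETTINGS = {"cfg_scale": 4.5, "denoising_strength": 0.31, "seed": 3420642523}


def build_img2img_payload(image_bytes, prompt=PROMPT, settings=SETTINGS):
    """Build a request body for one keyframe.

    Field names are assumed (AUTOMATIC1111-style); adjust them for your tool.
    The same seed and settings are reused for every keyframe, which is what
    keeps the character roughly consistent from key to key.
    """
    return {
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        **settings,
    }
```

The point of the design is that nothing varies between keyframes except the input image. With denoising this low, SD stays close to each source frame, and the fixed seed keeps it from reinventing the face each time.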
In the next iteration, I'm going to see what happens when I add a background to the frames as I record it. I want to see if I can get that ethereal AI vibe back, and I think it would be fun to experiment a bit more.
All images that came from SD were upscaled.
Note to self: Do the film edits _before_ sampling the final video next time. Doh!
Advantages over the other method with the first order and thin plate spline models: much higher resolution. This one actually sampled in full HD, which is a huge improvement, even if the way it happens isn't quite as smart. But unlike those, there's a lot of room for process improvement here.