A lot of this depends on how seriously you want to take it. Most of the tools companies market to regular consumers, meant to run on whatever hardware they have, yield low-res, low-fidelity output. If you're running this stuff locally, the requirements climb very quickly, as does the prerequisite knowledge needed to get results good enough to pass a blind test between real and fake. Hands are the biggest telltale sign (and, up until recently, so were teeth, tongues, smiles, and facial expressions of shock), but even that can be solved if you're willing to do some manual work so the AI understands what something should look like.
Most of that level of work requires either lots of time or basically the best GPU hardware you can get your hands on (VRAM being the primary limiter in most of this stuff).
Most AI-generated stuff falls apart when it comes to convincing likeness, but with upscalers you can get things looking really nice even without much (if any) manual work.
View attachment 412695