I had a go with one of those AI Video Generation tools...it's quite easy to use

Started by Matty, January 02, 2025, 08:44:29


Matty

It seems like it would be extremely easy to make a game like the old Dragon's Lair or Space Ace arcade games with this tech. 

Just define each scene in sequence, let the tool render them out, then go into any basic movie editor such as Windows Movie Maker and splice the finished product into segments. Then a simple JavaScript program plays a video, offers a choice of some sort, and plays the subsequent 'success' or 'failure' video to continue on from there.
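
Something like this is what I mean by a simple JavaScript program - just a rough, untested sketch with made-up clip filenames, but it shows the play-then-branch idea:

[code]
<video id="player" autoplay></video>
<div id="choices"></div>

<script>
// Minimal branching player: each node names a clip and the nodes
// its choices lead to. Clip filenames here are placeholders.
const scenes = {
  intro:  { clip: "intro.mp4",  choices: { "Enter the bar": "bar", "Walk away": "street" } },
  bar:    { clip: "bar.mp4",    choices: {} },
  street: { clip: "street.mp4", choices: {} }
};

const player  = document.getElementById("player");
const choices = document.getElementById("choices");

function playScene(id) {
  const scene = scenes[id];
  choices.innerHTML = "";      // clear old buttons
  player.src = scene.clip;
  player.play();
  // when the clip ends, show a button per choice (or stop if there are none)
  player.onended = () => {
    for (const [label, next] of Object.entries(scene.choices)) {
      const btn = document.createElement("button");
      btn.textContent = label;
      btn.onclick = () => playScene(next);
      choices.appendChild(btn);
    }
  };
}

playScene("intro");
</script>
[/code]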

I'm thinking of remaking a version of Leisure Suit Larry 1 as a test case to see if I can get it to work. Basically, as a starting point, make twenty 10-second clips and then link them with multiple-choice options of a sort... could be a minigame like:

Show 4 colours out of 5 (blue/red/green/yellow/purple) for a second each, and if you can correctly pick the missing colour it takes you to the 'success' option, else the 'fail' option... which just shows the next 10-second clip... or similar.
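
The minigame logic could be as dumb as this (again an untested sketch - the colour "flashing" is only logged here, in a real version you'd colour a div or show an image):

[code]
// Pick which colour is missing: show 4 of the 5, the player guesses the 5th.
const COLOURS = ["blue", "red", "green", "yellow", "purple"];

function startColourRound(onSuccess, onFail) {
  // quick-and-dirty shuffle of a copy, then hide the last colour
  const shuffled = [...COLOURS].sort(() => Math.random() - 0.5);
  const shown    = shuffled.slice(0, 4);
  const missing  = shuffled[4];

  // flash each shown colour for one second (just logged in this sketch)
  shown.forEach((c, i) => setTimeout(() => console.log("showing:", c), i * 1000));

  // after the 4 seconds, ask for the guess and branch
  setTimeout(() => {
    const guess = prompt("Which colour was missing? " + COLOURS.join("/"));
    if (guess && guess.toLowerCase() === missing) onSuccess(); else onFail();
  }, shown.length * 1000);
}

// e.g. hook it into the branching player from the earlier sketch:
// startColourRound(() => playScene("success"), () => playScene("fail"));
[/code]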

Matty

Okay... so I tried to control the output by specifying that I wanted 10-second clips, 50 scenes of 1 clip each, so I could split it into sections specifically to put into a form of "choose your own adventure" that remade the old Leisure Suit Larry game by Sierra as a CYOA instead of a point-and-click adventure game. Unfortunately the result wasn't workable... I got this:



Derron

Learn to do local editing of videos ... I bet if these videos weren't just "Ken Burns zoom-in effects" and "slowmo-to-look-epic" scenes ... they would look more "movie-like".
Dunno if simply speeding up some scenes would already help.

And yes... you need to find a way to "re-do" scenes so you can choose the best one (without face morphs and all these oddities). I guess smart scene descriptions would work around the limited capabilities of these AI models.
Meaning:
... if you need people in the background, put them in a dark corner so that changes/flickering/... are less obvious.
... avoid requesting specific quantities ("10 diamonds on a table"), as these will most probably change between scenes.
...

Any element people will remember during the scene must be consistent (the main character's look). Yeah, easily said ... but nobody said it comes free of cost.
You pay with time (run it locally on the CPU, as you most probably do not have the VRAM) or with money (let others, or a strong GPU, do the calculations).


Guess things become interesting once the AI models do special effects for us! Meaning: analysing given footage and, e.g., adding fire etc.
Or if you give them a storyboard (e.g. mannequins/pose figures in Blender, Poser, DAZ 3D, ...) and they add the details, the "render look" etc.
Guess this will happen rather soon... (as "draw what I sketched" already exists).


bye
Ron

RemiD

Quote from: Matty on January 05, 2025, 04:11:50
wanted 10 second clips, 50 scenes of 1 clip each, so I could split it into sections specifically to put into a form of "choose your own adventure"
What a good idea!

I will have to experiment with that!

Matty

I actually had an idea - even though I don't have a VR kit, imagine this possibility:

The person wears a VR kit or uses a webcam with eye-tracking technology.
Use AI video generation to produce a seamlessly branching cinematic experience.
It's a choose-your-own-adventure style game, but the choices aren't visible to the player. The game instead tracks your eye position at certain points of the video playback to decide which video to play next.
So you might be sitting at a cafe (in the VR simulation) and a woman sits down at a table near you. If your eyes look in her direction at a certain time during the video, maybe when she winks at you, then the next video has you get up and go over to her table. Otherwise you proceed down a different route, which might be that you order a meal and make a phone call.

Stuff like that....
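
If I were to prototype it, the branching decision might look roughly like this - purely a sketch, where getGaze() is a completely made-up stand-in for whatever the VR headset or eye-tracking webcam actually exposes, and player/playScene come from my earlier sketch:

[code]
// Hypothetical gaze-driven branching: during a "decision window" in the
// clip, check whether the player's gaze falls inside a region of interest.
// getGaze() is a made-up stand-in for a real eye-tracking API.

const decision = {
  start: 42,           // seconds into the clip when the window opens
  end: 45,             // and when it closes
  region: { x: 0.6, y: 0.3, w: 0.3, h: 0.4 },  // normalised screen coords
  lookedAt: "approachHer",  // next scene if the player looked there
  ignored:  "orderMeal"     // next scene otherwise
};

let looked = false;

player.ontimeupdate = () => {
  const t = player.currentTime;
  if (t < decision.start || t > decision.end) return;
  const g = getGaze();  // hypothetical: returns { x, y } in 0..1 screen space
  const r = decision.region;
  if (g.x >= r.x && g.x <= r.x + r.w && g.y >= r.y && g.y <= r.y + r.h) {
    looked = true;      // remember that the player glanced at her
  }
};

// for this scene, override the generic button-based onended handler
player.onended = () => playScene(looked ? decision.lookedAt : decision.ignored);
[/code]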


Matty

Amazing - this one actually came out pretty darn good. Yes, it still has some weirdness and awkwardness, but it's pretty impressive for an AI-generated video that took less than a few minutes to build...


Derron

The glitches in every scene (I only scanned the first 2 minutes) made it look unwatchable to me. Characters' eyes move during frontal views, the character itself changes within scenes, the render style changes between scenes.
The character has a heart attack (literally) when the story says it's his stomach that's acting up ... NO

Just because someone wears a space suit and some "ships" are flying around does not make a good movie - no matter whether it's AI-based or not.
You should not upload every experiment you do - compare what others upload and you will see that they avoid these "hiccups" in their videos. Maybe they edit them manually here and there (cutting inappropriate scenes). Uploading "anything" makes your content "blunt", like in a dollar store... you know there is a lot of garbage or rubbish quality, so you don't even bother trying your luck to find a gem.

I am surprised it allows the use of the "golden M" (as it is trademarked) ...


So: as you suggested, maybe use the AI to generate certain "scenes" and direct things on your own. You "just" need to find a way to have consistency among scenes ... there are AI models which are trained to reuse a "specific" character design (e.g. for virtual "influencers" etc). Maybe something like that exists for videos too. If that is not feasible, then extend your videos with scenes not showing characters. E.g. dialogues can be "visually" cut by showing different video sequences here and there (people chat ... visual cut while they still chat ... and you see a spaceship flying through a valley to approach the village in which the people are still chatting).

Extending your video this way requires more effort - but you can overcome some of the limitations of the AI models and potentially make more interesting videos.

Maybe start with 1-2 minute videos - but make them "more convincing" ... and in the process also improve your video editing skills.


bye
Ron

Matty

Look, I can see the flaws just as well as you can - but I'm also pretty amazed that a piece of software can create all this just from some simple text. Yes, there's warping, weird crazy sh!t and problems with the results - but look at the positive: all these videos were created just by giving a computer program a textual description of what you want, in some cases only a few lines of text. It's pretty impressive in my opinion, even if the result is deeply flawed.

Sometimes I think your constructive criticism is just a veil for being negative rather than positive, and for finding fault rather than seeing the good in things.

Derron

Maybe I am more into being negative than positive. Dunno.

But for me "Video AI" is "image AI" plus "chatgpt". The interesting part of video AI is ... how good it achieves to have a "memory" resulting in temporal consistency (eg characters looking the same throughout a video). The image creation and the ability of chatgpt (etc) to create "stories" (eg use them as chatbots - even for .. odd dialogues) is what I find more intriguing than the video AI (for now).

This is where the next steps will have to happen.


bye
Ron

Matty

Note: I just erased all my content from online sites - so much of it seemed to be sourced illegally, despite the AI generation tool claiming it was all created "pixel by pixel as originals" - total bs. Some of it turned out to be created by small-budget independent film producers, probably deliberately chosen because only serious film buffs (such as one of my good friends) would recognise the originals being used.

Derron

I think the AI models still "create pixel by pixel" ... but I think it is similar to a child who has never seen a cat and asks another child what a cat looks like ... they will describe a cat ... maybe the cat they have at home. A gray cat with short hair.

Now any cat the first child draws, if you ask them to, will be a gray cat (exception: if you give them only red and green pencils :P).

If you now ask them to make the cat look funnier, it will still be gray ... maybe the head becomes bigger, or it gets yellow dots (okay, colour can actually change - kids tend to ignore colour "errors", they almost even ignore skin colours unless you draw their attention to them ... but let's ignore that, else the example does not work).


What I mean is: the AI has a limited "memory" of how things can look. It learns "certain humans" and is then able to "morph" between those things. Yeah... they talk about feature extraction etc. ("a human has to have 2 eyes" etc.), but the AI in this case has no clue what those 2 features in the head "are". It has no understanding of "eyes" (but because of the mix with "ChatGPT"-style models it can recite "eyes are used for..." etc.).
This limited memory can be compensated by ... more memory. More things to "memorize" means more variation - but also more manifestation of "standards". Give it 999 new white-skinned characters and 1 black one ... and I am pretty sure most generated characters will be white. This is also kind of natural - your observations build "your POV of the world".

So now ... on to training material. Most public sources of training material are:
- photos that people uploaded somewhere, scraped (illegally)
- images from media someone produced "commercially" -> sequences extracted from movies, "show/event" pictures, press pictures, ...
- images from artists uploading their stuff in "showcases" etc. ... (like you, an AI can learn to "ignore" a watermark on a picture - and still grasp what is beneath)

This is why you will see a lot of characters bearing a resemblance to stars/actors/... you know.
And this is why female characters often have very "perfect" (in the sense of Western-world "assumptions") body proportions. And if you request some "beautiful" character, it will fit into the various categories you would place actors in ("big chest", "small chest, sporty figure" ...). It is all in the training material.

And this is why "training material" is tried to get from anywhere possible: to extend the variation, to "learn" more. But it comes with costs (training a model can costs millions). Save costs - and your training material will source on way less media.



@ erasing your content
I still think that a lot of created content can be described as "individually crafted". As said, "the more you know", the more often you will re-recognize things. If you knew all the "popular" films made in the last 100 years, you would recognize certain elements in almost every "movie" of "today": "oh, this chase is like a mix of movieA, movieB and the colour grade of movieC".
But ... is that bad? Isn't this what most of us even learn in school: to recreate certain artistic approaches ("paint an image in the style of Van Gogh")?

"Creativity" can also be an almost unique "mix" of things, not just the invention of "the unseen".
The issue with the former is: it becomes harder to find a _unique_ mix. Invent a music tune: chances are high that you have never heard the melody you hum in your brain - but once you create the tune and put it online, someone could come along and claim that it sounds sooo similar to an existing one.

My kids also "invent" things and think they are smart (they are!) ... but then you tell them, they aren't the first - but they did not know about the other things before.
It is your _limited_ knowledge of the world. And the AI memory (dunno when to call it "knowledge") is also still limited - but way way way bigger than yours (they know the complete wikipedia - do you? :D ... PS: in many languages!)

What I want to say: if you have fun creating this stuff - maybe just make the videos private (so you can still "host" them there). Think of "AI model prompting" as a skill you can improve. Learn how to ask an AI model to get specific output. I am sure "prompting" is a soft skill that is gaining popularity these days.


bye
Ron

Matty

Your comment is good, but I want to say something about the potentially illegal content used in the films, which shows through in the end result/output:

1. McDonald's logo - you noticed this yourself.
2. Alien Xenomorph designs - clearly no licence for this stuff.
3. End credits of someone else's movie appearing in one film: "Directed by Brad Bird"
4. Subtitles and actual footage from a Chinese independent film that was shown at a local film festival.

So yeah - there's more than that - but it's sufficient for me to think it shouldn't be hosted online... or used... it's building off stuff it has no legal right to use in its output.