FactorSD
[–] FactorSD@lemmy.dbzer0.com 1 points 1 year ago (1 children)

It does seem to work fairly well, although it doesn't fit my workflow at all, so I haven't done a lot of testing. I do think there are some UI things you could look at, though. Engine and Dimensions shouldn't be collapsible lists, because the fields only take up as much space as the label does. Also, your tooltips are outrageously large, covering about 75% of the width of a 1080p monitor, which makes them quite hard to actually read.

[–] FactorSD@lemmy.dbzer0.com 2 points 1 year ago

It's hard to give precise figures because there are always tricks to squeezing out a little more or less, but from my (admittedly limited) testing SDXL is significantly more demanding, and 10+GB of VRAM is probably going to be the minimum to run it. I don't remember exactly what I was doing, but I run an RTX A4500, and I managed to max out its 20GB of VRAM with a single SDXL process, whereas normally I can run a LORA training job and 512x768 generation at the same time.
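
If you want to see where the memory actually goes, it's easy to watch from inside a generation script. A minimal sketch using PyTorch's CUDA stats (the tags and call sites are just illustration; wrap your own pipeline calls with it):

```python
import torch

def report_vram(tag: str) -> None:
    """Print current and peak VRAM usage for the default CUDA device."""
    used = torch.cuda.memory_allocated() / 1024**3
    peak = torch.cuda.max_memory_allocated() / 1024**3
    total = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"[{tag}] allocated {used:.1f} GiB, peak {peak:.1f} GiB, total {total:.1f} GiB")

# Call report_vram("after load") and report_vram("after generate")
# around your SDXL pipeline calls to see what actually fits in 20GB.
```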

[–] FactorSD@lemmy.dbzer0.com 1 points 1 year ago

Protip - if an image is good but not quite perfect, keep the same seed and use the X/Y plot script to re-run the image at a range of CFG values.
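
If you're scripting it instead of using the A1111 UI, the same idea in diffusers looks roughly like this (a sketch; the model ID, seed, and CFG range are just examples):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "your almost-perfect prompt"
seed = 1234567890  # the seed that produced the near-miss image

# Re-run the identical seed at several CFG values and compare.
for cfg in (4.0, 5.5, 7.0, 8.5, 10.0):
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, guidance_scale=cfg, generator=generator).images[0]
    image.save(f"cfg_{cfg}.png")
```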

[–] FactorSD@lemmy.dbzer0.com 4 points 1 year ago

A lot of the time I try to just let images come out as the AI imagines them: running img2img prompts, often in big batches, then picking the pictures that best reflect what I wanted.

But I do also have another process for when I want something specific: use img2img to generate a pose and general composition, turn that image into both a ControlNet input (for composition) and a Segment Anything mask (for Latent Couple), then respin the same image with the same seed under those new constraints. When you run with the ControlNet and the mask, you can turn the CFG way down (3 or 4) but keep the coherence of the image, so you get much more naturalistic outputs.

This is also a good way to work with LORAs that are either poorly made or don't play well together: the initial output might look really burned, but once the composition is locked in you can run the LORAs at much lower strength and with lower CFG so they sit together better.
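
For the curious, the respin step translates to something like this in diffusers. A rough sketch of the ControlNet half only (I'm leaving out the Latent Couple mask, and the model IDs and filenames are just examples):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Composition ControlNet fed with a map extracted from the draft image.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

control_image = load_image("draft_edges.png")  # edges/pose from the draft
generator = torch.Generator("cuda").manual_seed(1234567890)  # same seed as the draft

# With composition locked in by the ControlNet, CFG can drop to 3-4
# for more naturalistic output without losing coherence.
image = pipe(
    "your prompt here",
    image=control_image,
    guidance_scale=3.5,
    generator=generator,
).images[0]
image.save("respin.png")
```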

[–] FactorSD@lemmy.dbzer0.com 3 points 1 year ago (1 children)

The real value of SDXL isn't the higher native resolution, it's the improvements in rendering fingers, text, and so on. But honestly I haven't been super impressed by SDXL yet; it's like wanting to keep playing the old game with all its DLC and mods. SDXL is good, but until we have the same depth of resources available I'm staying with 1.5.

[–] FactorSD@lemmy.dbzer0.com 2 points 1 year ago

The community will decide what's best by which model it supports.

[–] FactorSD@lemmy.dbzer0.com 3 points 1 year ago (1 children)

I am planning on cooking a LORA today - I'll give this a go and report back.

[–] FactorSD@lemmy.dbzer0.com 6 points 1 year ago

I guess YMMV on whether focused is boring or not. I agree that I never really found stimulants to be super interesting, but that's partly because it was too expensive to do coke just to work on whatever project was on my mind.

[–] FactorSD@lemmy.dbzer0.com 2 points 1 year ago

Most SD tooling requires specific versions of everything, and as you say, the documentation is poor even on Windows. Try other forks and you may get lucky.
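
One thing that helps before jumping forks: check what you actually have installed against what the fork's docs ask for. A tiny sketch (the pins here are made up for illustration; substitute whatever the fork's requirements file lists):

```python
from importlib.metadata import version, PackageNotFoundError

# Hypothetical pins - replace with the versions the fork's README asks for.
expected = {"torch": "2.0.1", "xformers": "0.0.20", "transformers": "4.30.2"}

for pkg, want in expected.items():
    try:
        have = version(pkg)
    except PackageNotFoundError:
        have = "not installed"
    status = "OK" if have == want else "MISMATCH"
    print(f"{pkg}: want {want}, have {have} [{status}]")
```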

[–] FactorSD@lemmy.dbzer0.com 2 points 1 year ago (1 children)

How is it meaningfully different from the existing Scribble and Lineart ControlNets that already work in Automatic1111?

[–] FactorSD@lemmy.dbzer0.com 1 points 1 year ago

Prompt was presumably "Shaq to the moon!"

[–] FactorSD@lemmy.dbzer0.com -4 points 1 year ago (7 children)

You really think people would spend a lifetime writing books if they couldn't make money from it?

Things that are free have no value, economic or societal. Even when we pirate stuff, at least our society still encourages creative labour.


There are days when the gods of SD are just mocking you. That hand was made from a depth map extracted from a real human pose, and the map very clearly shows FOUR FUCKING FINGERS. And yet...

I am starting to wonder if this model was trained exclusively on people with polydactyly. If so, well played internet.

I have tried a bunch of stuff to get good hands and feet out of SD, and nothing is even slightly reliable. Textual inversions and negative prompts sometimes just don't work, or even make things worse. Inpainting takes forever to get right. I thought ControlNet would crack it, but apparently not.
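
For reference, the inpainting pass I keep fighting with looks roughly like this in diffusers (a sketch, not my exact settings; the mask is just a hand-painted white blob over the bad hand):

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = load_image("bad_hand.png")  # the full generated image
mask = load_image("hand_mask.png")  # white where the hand should be redrawn

result = pipe(
    prompt="detailed hand, five fingers, natural anatomy",
    negative_prompt="extra fingers, fused fingers, deformed hand",
    image=image,
    mask_image=mask,
    strength=0.75,  # how much noise to add back into the masked region
).images[0]
result.save("fixed_hand.png")
```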

How do you guys deal with dodecadactyl mutants?
