Found a very interesting implementation of AI for NSFW chats. It requires you to download sex scripts and a local OpenAI simulator, but it runs all operations locally. Works best if you have a reasonably powerful GPU. You can read more by following the link below.
https://ss.deviatenow.com/viewtopic.php?f=7&t=1116
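For anyone wondering what a "local OpenAI simulator" means in practice: LM Studio (discussed below) can run a local server that mimics the OpenAI chat-completions API, so a script just sends its requests to localhost instead of OpenAI's servers. A minimal sketch of building such a request, where port 1234 is LM Studio's usual default and the model name is a placeholder (both assumptions on my part):

```python
# Sketch: building a request for a local OpenAI-compatible chat endpoint.
# The port (1234, LM Studio's usual default) and the placeholder model name
# are assumptions; a local server serves whichever model is currently loaded.
import json
from urllib import request

def build_chat_request(prompt, base_url="http://localhost:1234/v1"):
    """Build an HTTP POST request for a local chat-completions endpoint."""
    payload = {
        "model": "local-model",  # placeholder; ignored by most local servers
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Hello")
print(req.full_url)  # http://localhost:1234/v1/chat/completions
```

Sending the request with `urllib.request.urlopen(req)` only works while the local server is actually running, of course.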
AI Mistress (sex scripts)
Re: AI Mistress (sex scripts)
Thanks for this - I will try to have a play with LM Studio models to see what I can use.
Unfortunately I don't have a computer with 16GB+ RAM and the latest & greatest GPU with 24GB+ VRAM.
I'm hoping I can use a 7B model such as xwin-mlewd-7b-v0.2.Q5_K_M.gguf.
When I tried the 13B model, my laptop ran out of memory and GPU resources, crashing Windows.
Re: AI Mistress (sex scripts)
giannitrib wrote: ↑Sun Nov 26, 2023 11:11 pm
    Thanks for this - I will try to have a play with LM Studio models to see what I can use.
    Unfortunately I don't have a computer with 16GB+ RAM and the latest & greatest GPU with 24GB+ VRAM.
    I'm hoping I can use a 7B model such as xwin-mlewd-7b-v0.2.Q5_K_M.gguf.
    When I tried the 13B model, my laptop ran out of memory and GPU resources, crashing Windows.

I'm able to use a 13B model with 16 gigs of RAM with GPU offloading turned off. ...you don't need 24 gigs of VRAM for a 13B model, because even the entire thing could be GPU offloaded with less, but I won't get into that.
You didn't say how much RAM you have; my suggestion below assumes you aren't able to offload anything to dedicated GPU VRAM at all.
On the community page for that 7B model you mentioned (xwin-mlewd 7B), Undi95, the person who made the merge, reported that it was a failed merge: you'd be getting xwin 7B on its own, and the mlewd part was completely lost.
Fear not, because there are other options that I've tried myself. I suggest using TheBloke\OpenHermes-2-Mistral-7B-GGUF. If you are super short on RAM, q4_k_m would be a starting point; if your system can handle that, q5_k_m is a higher-quality version without much of a performance difference. But if you have 8 gigs of RAM and no real dedicated GPU VRAM to offload 4-6 layers to, you'll be at the limit, and maybe q4_k_m is all you can muster.
Don't use the OpenHermes 2.5 version; something is different about it, and it behaves in a way that's very incompatible with role play and with AI Mistress. The older version 2 is the one to use, based on my experience.
Alternatively, Toppy-M-7B-GGUF q5_k_m works okay-ish. It's bad at the function-calling side, because it won't limit itself to 2 function calls, but it creates reasonable dialog that fits the purpose.
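On that last point about a model refusing to stop at 2 function calls: I don't know AI Mistress's internals, but a generic guard on the script side could simply keep the first two parsed calls and drop the rest. A sketch (the call names here are made up for illustration):

```python
# Hypothetical guard for a chatty model that emits too many function calls.
# The call names below are invented examples, not real AI Mistress functions.
def clamp_function_calls(calls, limit=2):
    """Keep only the first `limit` function calls from a model response."""
    return calls[:limit]

calls = ["set_mood", "give_task", "give_task", "set_mood"]
print(clamp_function_calls(calls))  # ['set_mood', 'give_task']
```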
Re: AI Mistress (sex scripts)
Thanks. Yeah, I only have 8GB RAM and a GeForce GTX 1660 graphics card - I suspect it's the lack of RAM that caused the crash.
I'll have a read through what you suggested and see if any other model will work.
Re: AI Mistress (sex scripts)
8 gigs of RAM, and it looks like the GeForce 1660 has 6 gigs of VRAM. So if you find the right number of GPU layers to offload so that roughly 4 to 5 gigs of VRAM are filled, you should theoretically have enough for a 13B model. I'm running with only 2 gigs of VRAM and 16 gigs of RAM, but it also depends on how much of the RAM is taken up by the OS and other running applications.

I *think* the best way to figure this out, assuming you have Windows 10, is to open Task Manager (Ctrl+Shift+Esc), go to the Performance tab, and check the dedicated memory quantity in the GPU section. Then, in LM Studio, try setting 10 layers of GPU offload, load the model, and watch the dedicated VRAM number. If it fills completely and crashes, bring the number down. If it's crashing because of not enough memory but the VRAM doesn't get above 5 gigs, add a few more layers of GPU offload and see if you can find a balance that will run the 13B LLM.
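The layer-count guesswork above can be turned into rough arithmetic: each offloaded layer costs roughly the model's file size divided by its layer count in VRAM. A sketch, where the ~40-layer figure for a 13B GGUF and the file size are ballpark assumptions of mine, so treat the result as a starting guess, not a guarantee:

```python
# Ballpark sketch of the layer-offload arithmetic. A 13B GGUF has roughly
# 40 layers (assumption), so each offloaded layer costs about
# file_size / n_layers gigabytes of VRAM.
def max_offload_layers(file_size_gb, vram_budget_gb, n_layers=40):
    """Estimate how many layers fit in a given VRAM budget."""
    per_layer_gb = file_size_gb / n_layers
    return int(vram_budget_gb / per_layer_gb)

# A ~9.2 GB 13B q5_k_m, budgeting 4.5 of a 6 GB card for the model:
print(max_offload_layers(9.2, 4.5))  # 19
```

In practice you'd start near that number in LM Studio and nudge it up or down while watching Task Manager, exactly as described above.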
Which quantization option did you download? If q5_k_m doesn't work, try q4_k_m or q4_k_s; the smaller file size means less gets loaded into memory. Keep in mind q3 or lower on a 13B model isn't likely to produce great responses, so if a q4 won't work, I'd aim for a 7B model, because those use fewer resources. I think the OpenHermes 2 Mistral one I mentioned earlier is about 5 gigs, so if everything works right, you might be able to get most of it into VRAM and possibly have better performance than me, even with my extra RAM and newer machine, because I only have 2 gigs of VRAM on a low-end laptop Nvidia MX350 GPU, and I'm certain the GeForce 1660 is quite a bit better than mine.
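A back-of-the-envelope way to compare those quantization files against your free memory: a GGUF needs roughly its own file size loaded, plus headroom for the KV cache and runtime. The 1.5 GB overhead figure and the file sizes in the example are my own rough assumptions:

```python
# Rough rule of thumb (my assumption, not an exact formula): a GGUF model
# needs about its file size in memory, plus headroom for KV cache/runtime.
def fits_in_memory(file_size_gb, free_memory_gb, overhead_gb=1.5):
    """Very rough check: model file size plus overhead vs free memory."""
    return file_size_gb + overhead_gb <= free_memory_gb

# A 13B q5_k_m is roughly 9.2 GB; on an 8 GB machine it won't fit:
print(fits_in_memory(9.2, 8.0))  # False
# A 7B q4_k_m is roughly 4.4 GB; with ~6 GB free it should squeeze in:
print(fits_in_memory(4.4, 6.0))  # True
```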