projectmoon

joined 1 year ago
[–] projectmoon@lemm.ee 1 points 15 hours ago

The biggest nation in the world!

[–] projectmoon@lemm.ee 15 points 21 hours ago

They can build a keyboard into it, sure. It's just UI elements and a bunch of buttons. Won't be a good keyboard, but it can be done.

[–] projectmoon@lemm.ee 5 points 1 day ago (2 children)

Skálmöld in Sauðárkrókur? When are they planning to come?

[–] projectmoon@lemm.ee 8 points 2 days ago

Well, VTR is a roleplaying game. It's similar to Vampire: The Masquerade, but with a different setting and somewhat different mechanics. I guess it's best explained as "nutrients." Animal blood and blood from e.g. blood bags give less Vitae (the magic blood points resource) than blood harvested from living humans. And as the character becomes more powerful, eventually that "lesser" blood can't give them Vitae at all.

The vampiric curse in VTR is explicitly stated to be supernatural, though, so there isn't necessarily a scientific explanation for it. The curse imparts the Beast, which is the predator in all vampires.

[–] projectmoon@lemm.ee 17 points 2 days ago (2 children)

Some fiction has it. In Vampire: The Requiem, low-level vampires can survive on animal blood. But more powerful ones need human or even vampire blood.

[–] projectmoon@lemm.ee 3 points 6 days ago (2 children)

Where can I get a sub 400 AMD card with 26 GB of VRAM?

[–] projectmoon@lemm.ee 23 points 1 week ago (1 children)

https://agnos.is/posts/tech-recruitment-is-out-of-control.html

This was my experience at the beginning of 2024. It was bad enough that I had to write a blog post about it.

[–] projectmoon@lemm.ee 2 points 2 weeks ago

Have you tried Matrix?

[–] projectmoon@lemm.ee 5 points 2 weeks ago (1 children)

LLMs are statistical word association machines, or more accurately, token association machines. So if you tell one not to make mistakes, it'll likely weight the output towards having validation, checks, etc. It might still produce silly output claiming no mistakes were made despite having bugs or logic errors. But LLMs are just a tool! So use them for what they're good at and can actually do, not what they themselves claim they can do lol.

[–] projectmoon@lemm.ee 1 points 3 weeks ago

OpenWebUI connected to tabbyUI's OpenAI endpoint. I will try reducing the temperature and see if that makes it more accurate.
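For anyone wondering why lowering temperature might help: here's a minimal sketch of temperature scaling on a toy set of scores. The logits are made up for illustration, and this is not how OpenWebUI or the backend implements sampling, just the general idea that a lower temperature concentrates probability on the top candidate.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens (not from a real model).
logits = [2.0, 1.0, 0.1]

for t in (1.0, 0.5, 0.2):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

As the temperature drops, the highest-scoring token takes a larger share of the probability, so unlikely tokens (like a sudden switch of language) get sampled less often.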

[–] projectmoon@lemm.ee 1 points 3 weeks ago (2 children)

Context was set to anywhere between 8k and 16k. It was responding in English properly, and then about halfway to 3/4s of the way through a response, it would start outputting tokens in either a foreign language (Russian/Chinese in the case of Qwen 2.5) or things that don't make sense (random code snippets, improperly formatted text). Sometimes the text was repeating as well. But I thought that might have been a template problem, because it seemed to be answering the question twice.

Otherwise, all settings are the defaults.

[–] projectmoon@lemm.ee 1 points 3 weeks ago (4 children)

I tried it with both Qwen 14b and Llama 3.1. Both were exl2 quants produced by bartowski.

 

Over the weekend (this past Saturday specifically), GPT-4o seems to have gone from capable and rather free for generating creative writing to not being able to generate basically anything due to alleged content policy violations. It'll just say "can't assist with that" or "can't continue." But 80% of the time, if you regenerate the response, it'll happily continue on its way.

It's like someone updated some policy configuration over the weekend and accidentally put an extra 0 in a field for censorship.

GPT-4 and GPT-3.5 seem unaffected by this, which makes it even weirder. Switching to GPT-4 will have none of the issues that 4o is having.

I noticed this happening literally in the middle of generating text.

See also: https://old.reddit.com/r/ChatGPT/comments/1droujl/ladies_gentlemen_this_is_how_annoying_kiddie/

https://old.reddit.com/r/ChatGPT/comments/1dr3axv/anyone_elses_ai_refusing_to_do_literally_anything/

 

Current situation: I've got a desktop with 16 GB of DDR4 RAM, a 1st gen Ryzen CPU from 2017, and an AMD RX 6800 XT GPU with 16 GB VRAM. I can run 7-13b models extremely quickly using ollama with ROCm (19+ tokens/sec). I can run Beyonder 4x7b Q6 at around 3 tokens/second.

I want to get to a point where I can run Mixtral 8x7b at Q4 quant at an acceptable token speed (5+/sec). I can run Mixtral Q3 quant at about 2 to 3 tokens per second. Q4 takes an hour to load, and assuming I don't run out of memory, it also runs at about 2 tokens per second.

What's the easiest/cheapest way to get my system to be able to run the higher quants of Mixtral effectively? I know that I need more RAM. Another 16 GB should help. Should I upgrade the CPU?
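A rough back-of-the-envelope sketch of the memory math, in case it helps frame the question. The parameter count (about 46.7 billion for Mixtral 8x7b) and the bits-per-weight figures for each quant are my assumptions, not measured file sizes, and this ignores context/KV cache and OS overhead entirely.

```python
def estimate_weight_gib(params_billions, bits_per_weight):
    """Approximate size of the model weights in GiB for a given quantization."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30

MIXTRAL_PARAMS_B = 46.7  # assumption: approximate total parameter count
VRAM_GIB = 16            # RX 6800 XT
RAM_GIB = 16             # current system RAM

# Assumed average bits per weight for each quant level (rough guesses).
quants = {"Q3": 3.5, "Q4": 4.5, "Q6": 6.5}

for name, bits in quants.items():
    size = estimate_weight_gib(MIXTRAL_PARAMS_B, bits)
    spill = max(0.0, size - VRAM_GIB)  # portion that has to live in system RAM
    fits = "fits" if spill <= RAM_GIB else "does not fit"
    print(f"{name}: ~{size:.1f} GiB weights, ~{spill:.1f} GiB outside VRAM ({fits} in {RAM_GIB} GiB RAM)")
```

Under these assumptions, Q4 leaves several GiB of weights that have to sit in system RAM alongside everything else running, which would be consistent with the slow loading and low token speed.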

As an aside, I also have an older Nvidia GTX 970 lying around that I might be able to stick in the machine. Not sure if ollama can split across different brand GPUs yet, but I know this capability is in llama.cpp now.

Thanks for any pointers!

 

Not sure if this has been asked before or not. I tried searching and couldn't find anything. I have an issue where any pictures from startrek.website do not show up on the homepage. It seems to only affect startrek.website. Going to the link directly loads the image just fine. Is this something wrong with lemm.ee?

submitted 1 year ago* (last edited 1 year ago) by projectmoon@lemm.ee to c/protonprivacy@lemmy.world
 

For the past few days, the Android app has been very slow. The app itself loads fine and is responsive, but it takes many seconds to load messages, sometimes up to 30 seconds. At first I thought it was a blip, but it's been going on for a few days now. Anyone else have this problem?

Edit: clearing cache in the app settings (not system settings) fixed it.

 

This has probably already been asked before, but:

The magazines of kbin federate as Lemmy communities, but is the microblog section of a kbin magazine accessible via Lemmy?
