Singularity


Everything pertaining to the technological singularity and related topics, e.g. AI, human enhancement, etc.

26
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/Cr4zko on 2024-01-23 21:38:56+00:00.


I see lots and lots of posts here saying FDVR won't be a thing, but I believe it will happen before the decade is out. What will you do with FDVR? I already have a long wishlist...

27
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/adalgis231 on 2024-01-23 21:32:16+00:00.


Understanding and reasoning about spatial relationships is a fundamental capability for Visual Question Answering (VQA) and robotics. While Vision Language Models (VLM) have demonstrated remarkable performance in certain VQA benchmarks, they still lack capabilities in 3D spatial reasoning, such as recognizing quantitative relationships of physical objects like distances or size differences. We hypothesize that VLMs' limited spatial reasoning capability is due to the lack of 3D spatial knowledge in training data and aim to solve this problem by training VLMs with Internet-scale spatial reasoning data. To this end, we present a system to facilitate this approach. We first develop an automatic 3D spatial VQA data generation framework that scales up to 2 billion VQA examples on 10 million real-world images. We then investigate various factors in the training recipe, including data quality, training pipeline, and VLM architecture. Our work features the first internet-scale 3D spatial reasoning dataset in metric space. By training a VLM on such data, we significantly enhance its ability on both qualitative and quantitative spatial VQA. Finally, we demonstrate that this VLM unlocks novel downstream applications in chain-of-thought spatial reasoning and robotics due to its quantitative estimation capability.
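
To make the kind of data generation the abstract describes more concrete, here is a minimal, hypothetical sketch of one such step: given 3D object centroids already extracted from an image (the actual pipeline derives these from real-world imagery at scale), a metric-distance question-answer pair can be produced from a template. The function, templates, and example values below are my own illustration, not the paper's code.

    # Hypothetical sketch of one metric-space spatial-QA generation step,
    # assuming per-object 3D centroids (in metres, camera frame) are already
    # available from a detection/depth pipeline. Illustrative only.
    import numpy as np

    def make_distance_qa(name_a, centroid_a, name_b, centroid_b):
        """Turn two object centroids into a quantitative QA pair."""
        dist_m = float(np.linalg.norm(np.asarray(centroid_a) - np.asarray(centroid_b)))
        question = f"How far is the {name_a} from the {name_b}?"
        answer = f"The {name_a} is roughly {dist_m:.1f} meters from the {name_b}."
        return {"question": question, "answer": answer}

    # Example: a chair at (0.4, 0.0, 2.1) m and a table at (-0.3, 0.1, 3.0) m.
    qa = make_distance_qa("chair", (0.4, 0.0, 2.1), "table", (-0.3, 0.1, 3.0))
    print(qa["question"])
    print(qa["answer"])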

28
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/adalgis231 on 2024-01-23 21:29:10+00:00.


The field of AI agents is advancing at an unprecedented rate due to the capabilities of large language models (LLMs). However, LLM-driven visual agents mainly focus on solving tasks for the image modality, which limits their ability to understand the dynamic nature of the real world, making it still far from real-life applications, e.g., guiding students in laboratory experiments and identifying their mistakes. Considering the video modality better reflects the ever-changing and perceptually intensive nature of real-world scenarios, we devise DoraemonGPT, a comprehensive and conceptually elegant system driven by LLMs to handle dynamic video tasks. Given a video with a question/task, DoraemonGPT begins by converting the input video with massive content into a symbolic memory that stores task-related attributes. This structured representation allows for spatial-temporal querying and reasoning by sub-task tools, resulting in concise and relevant intermediate results. Recognizing that LLMs have limited internal knowledge when it comes to specialized domains (e.g., analyzing the scientific principles underlying experiments), we incorporate plug-and-play tools to assess external knowledge and address tasks across different domains. Moreover, we introduce a novel LLM-driven planner based on Monte Carlo Tree Search to efficiently explore the large planning space for scheduling various tools. The planner iteratively finds feasible solutions by backpropagating the result's reward, and multiple solutions can be summarized into an improved final answer. We extensively evaluate DoraemonGPT in dynamic scenes and provide in-the-wild showcases demonstrating its ability to handle more complex questions than previous studies.
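
As a toy illustration of the "symbolic memory" idea (my own sketch, not the paper's implementation), the schema, rows, and example query below are assumptions: a video is reduced to a table of task-related attributes, which sub-task tools can then query spatio-temporally with ordinary SQL.

    # Toy illustration of a symbolic memory for a lab-experiment video:
    # perception models would populate the rows; a sub-task tool queries them.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memory (t_sec REAL, object TEXT, action TEXT, location TEXT)"
    )
    # Rows like these would come from perception models run over the video.
    conn.executemany(
        "INSERT INTO memory VALUES (?, ?, ?, ?)",
        [
            (3.0, "student", "pours liquid", "bench"),
            (7.5, "student", "heats beaker", "burner"),
            (12.0, "student", "touches beaker", "burner"),  # potential mistake
        ],
    )

    # A sub-task tool answering "what did the student do right after heating?"
    rows = conn.execute(
        "SELECT t_sec, action FROM memory WHERE t_sec > 7.5 ORDER BY t_sec LIMIT 1"
    ).fetchall()
    print(rows)  # [(12.0, 'touches beaker')]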

29
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/yottawa on 2024-01-23 21:23:56+00:00.


From tweet: Survey of users of Replika, an AI companion, finds that the people drawn to use it were quite lonely (90% were).

3x more people said it helped them increase their social interaction with humans than replace it. 3% of users said it stopped suicidal ideation.

30
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/czk_21 on 2024-01-23 21:00:50+00:00.

31
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/EcstaticVenom on 2024-01-23 20:45:12+00:00.


What models would you create if creating a fine-tune was easy?

If you could generate a high quality dataset of any kind, what dataset would you generate to finetune your model?

I’m trying to understand whether we have a lack-of-data problem, and that’s why there aren’t any domain-, topic-, or character-specific fine-tunes, or whether it’s more of a “we have good enough generic models and those fine-tunes aren’t necessary/too specific” thing.

Would love thoughts and opinions. Also, as foundation models continue to get better, will our need for fine-tuning disappear?

32
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/ForeignAffairsMag on 2024-01-23 18:34:42+00:00.

33
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/ScopedFlipFlop on 2024-01-23 18:01:10+00:00.


Definition: somebody who believes that technological innovation must be accelerated for the purpose of bringing about the singularity.

Reasoning: one very analytic redditor instructed me to differentiate between "accelerationist" and "techno-optimist". Thus, I have combined them.

34
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/johnnd on 2024-01-23 16:06:52+00:00.

35
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/CKR12345 on 2024-01-23 16:04:12+00:00.


Think about it: since GPT-4, we’ve heard from all the experts, and even OpenAI themselves, that we’re on this exponential curve and that it’s hard to wrap our minds around how quickly the technology is advancing. With that in mind, if GPT-5 comes out this year and is just GPT-4 with some tweaks and improvements, that would be… let’s say very bad for OpenAI, and perhaps for the AI sector as a whole.

Obviously I don’t expect this to be the case; I think GPT-5 will be incredible. But it’s also worth contemplating something: Sam says AGI isn’t coming this year. So if GPT-5 is exponentially better than 4 but still not AGI, what benchmarks is it not hitting? Is it not AGI because OpenAI won’t let it be, or is it not capable of being one?

At the end of the day we’re all speculating and I’m just a dude who thinks about this stuff from time to time and wanted to share my thoughts.

36
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/dennislubberscom on 2024-01-23 15:50:36+00:00.

37
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/Mk_Makanaki on 2024-01-23 14:32:07+00:00.

38
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/throwaway472105 on 2024-01-23 14:13:47+00:00.


And we still haven't gotten a model that outperforms the March 2023 release of GPT-4.

39
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/thumbsdrivesmecrazy on 2024-01-23 14:12:44+00:00.


The article introduces a new approach to code generation by LLMs - a test-based, multi-stage, code-oriented iterative flow that improves the performance of LLMs on code problems: Code Generation with AlphaCodium - from Prompt Engineering to Flow Engineering

Comparing these results to those obtained with a single well-designed direct prompt shows how the AlphaCodium flow consistently and significantly improves the performance of LLMs on CodeContests problems - for both open-source (DeepSeek) and closed-source (GPT) models, and for both the validation and test sets.
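
For readers who haven't seen the article, here is a minimal sketch of the kind of generate, run tests, repair loop it describes. This is my own toy illustration, not AlphaCodium's actual code; the `generate_fn` callable, the test format, and the fake model in the demo are assumptions.

    # Minimal sketch of a test-driven iterative code-generation loop.
    # `generate_fn` stands in for any LLM call; in practice the generated
    # code should be executed in a sandbox, not with bare exec().
    from typing import Callable, List, Tuple

    def iterative_codegen(
        problem: str,
        tests: List[Tuple[str, str]],          # (expression, expected repr)
        generate_fn: Callable[[str], str],     # prompt -> candidate source code
        max_rounds: int = 5,
    ) -> str:
        prompt = problem
        for _ in range(max_rounds):
            code = generate_fn(prompt)
            namespace: dict = {}
            exec(code, namespace)
            failures = []
            for expr, expected in tests:
                got = repr(eval(expr, namespace))
                if got != expected:
                    failures.append(f"{expr} -> {got}, expected {expected}")
            if not failures:
                return code                     # all public tests pass
            # Feed the failing cases back into the next generation round.
            prompt = problem + "\nPrevious attempt failed:\n" + "\n".join(failures)
        return code

    # Toy demo with a fake "model" that gets it right on the second try.
    attempts = iter(["def add(a, b):\n    return a - b\n",
                     "def add(a, b):\n    return a + b\n"])
    solution = iterative_codegen("Write add(a, b).",
                                 [("add(2, 3)", "5")],
                                 lambda prompt: next(attempts))
    print(solution)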

40
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/melnitr on 2024-01-23 12:42:49+00:00.


There's no video of it unfortunately, but it sounds like good news for robotics.

41
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/raucousdaucus on 2024-01-23 12:01:34+00:00.


I must live in a bubble, because I had no idea the anti-AI sentiment was so pervasive. I created a custom GPT for GPT plus to help people use scientific skepticism to analyze claims and I thought, "hey, r/skeptic would probably appreciate this." I mean, we live in an era where misinformation is rampant, and a lot of people are unfortunately ill-equipped with the tools or skills to critically evaluate the claims they encounter. Instead of appreciating an effort to help regular folks think more skeptically, I got blasted with anti-AI sentiment:

No. This will only happen if we want it (we don’t) and douches like OP keep pushing “use cases”. (Please stop doing this.) We don’t want another god.

Boo. We don’t want more AI. We want less.

How pervasive is this? I talk about AI frequently at work and nobody has given me the impression they feel this way.

Here's my post, by the way:

(maybe throw an upvote at the person who defended it and got downvoted because of me?)

42
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/345Y_Chubby on 2024-01-23 10:53:07+00:00.

43
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/Specialist_Effort161 on 2024-01-23 09:24:49+00:00.


I recently tried out Bard's new feature that allows it to generate summaries of YouTube videos. I wanted to read quick summaries of podcasts posted on YouTube to help me decide whether or not I want to invest time in watching the full podcast. However, my experience has been far from what I expected.

Half the time, Bard simply refuses to generate a summary, stating that it cannot generate text based on the video. Initially, I thought this might be due to the absence of transcripts for some videos. But to my surprise, Bard failed to provide summaries even for videos with available transcripts. I'm not sure if this is a bug in Bard's new feature or if Google is somehow limiting the use of Bard's capabilities.

I'm curious to know if anyone else has tried this feature and what your experiences have been. Is it just me facing these issues, or is this a common problem? Any insights would be appreciated.

44
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/FarWinter541 on 2024-01-23 07:01:43+00:00.


"I think we will both invent AGI sooner than most of the world thinks, and in those first few years, it will change the world much less, and in the long term, it'll change it more." -- Sam Altman

45
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/golden_negg on 2024-01-23 04:14:54+00:00.


Is anyone else having literal bad dreams about AI? I'm sick of lucid dreaming the potential brain chip / VR glitchy world a la Upload. Last night I was deemed "unhelpful" by the robot overlords and had to compete to become a stewardess. I lost because I'm not nice and was stabbed. Then I woke up.

46
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/Alone-Competition-77 on 2024-01-23 03:53:48+00:00.


From NPR Marketplace.

47
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/Rare-Force4539 on 2024-01-23 03:27:54+00:00.


The first few weeks of 2024 have been pretty quiet so far and people are starting to get impatient. Predictions are getting pushed, AI winter is here, AGI is cancelled they say.

But is this really an accurate portrayal of the situation? The hype may be forgotten but it is not gone—it is primed to explode again as soon as the next big model drops. And now the whole world is watching, not just the nerds.

2024 is going to be the biggest year in the history of AI and civilization in general. January is just the quiet before the storm. Brace yourselves…

48
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/xSNYPSx on 2024-01-23 00:52:58+00:00.


Hello, fellow Redditors and tech enthusiasts!

Recently, I've been tinkering with the concept of creating a simple robot that can be controlled using the capabilities of OpenAI's GPT-4 Vision and Open Broadcaster Software (OBS). The aim is to create a setup where GPT-4 Vision can process live video feeds, interpret the content, and issue commands to the robot in real-time, allowing for a seamless interaction between AI and a physical machine.

The Current Challenge

The idea sounds straightforward, but there's a significant hurdle that we need to overcome. As far as I know, the OpenAI API doesn't support live video streaming as an input for processing. Instead, it can only handle individual image frames or short video clips. This limitation requires a workaround that involves manually extracting frames from a live video, sending them to the API for analysis, and then acting on the received information.
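
For anyone who wants to try that workaround today, a rough sketch might look like the following, assuming OBS exposes a virtual camera and an OpenAI API key is set in the environment. The model name, camera index, prompt, and command handling are placeholders to adapt to your own setup.

    # Rough sketch of the frame-by-frame workaround: grab frames from OBS's
    # virtual camera, send each one to the vision model, act on the reply.
    # Requires OPENAI_API_KEY in the environment; camera index will vary.
    import base64
    import time
    import cv2
    from openai import OpenAI

    client = OpenAI()
    cap = cv2.VideoCapture(0)  # OBS virtual camera (index depends on your system)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        _, jpeg = cv2.imencode(".jpg", frame)
        b64 = base64.b64encode(jpeg.tobytes()).decode()

        reply = client.chat.completions.create(
            model="gpt-4-vision-preview",
            max_tokens=20,
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "You control a small robot. Answer with exactly one "
                             "of: FORWARD, LEFT, RIGHT, STOP."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
                ],
            }],
        )
        command = reply.choices[0].message.content.strip().upper()
        print("model says:", command)   # here you would forward it to the robot
        time.sleep(2)                   # API latency keeps this loop slow

    cap.release()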

The Vision for GPT-4 Vision and OBS Integration

If OpenAI were to introduce live streaming video capabilities to their API, the potential applications would be enormous. For our robot project, it would mean we could directly feed the video stream from OBS into the GPT-4 Vision API. The AI could then analyze the stream in real-time and instruct the robot to perform actions based on what it "sees."

For example, if the robot's camera sees an obstacle in its path, GPT-4 could command the robot to stop, turn, or navigate around the obstacle. All of this would happen fluidly, without the need for "kludges" or complicated intermediary steps.

How It Could Work

  1. Stream Capture: OBS captures the video from the robot's camera as it explores its environment.
  2. API Communication: The live video stream is sent directly to the GPT-4 Vision API.
  3. AI Processing: GPT-4 Vision processes the stream, understands the environment, and determines appropriate actions.
  4. Command Execution: The API sends back real-time commands, which are relayed to the robot's control system to perform the required actions.

The Benefits of Streamlined Integration

With direct streaming support, the latency between visual recognition and robot action would be significantly reduced. It would allow for more sophisticated and responsive behaviors from the robot, providing a more interactive and engaging experience for users and viewers alike.

Conclusion and Call to Action

The integration of GPT-4 Vision with OBS to control a simple robot is an exciting prospect, but it hinges on the ability to process live streaming video directly through the AI API. This functionality would not only benefit our project but could also unlock new possibilities in telepresence, remote operation, and live event monitoring.

I'm reaching out to the community to discuss how such an integration could be brought to life and to call on OpenAI to consider adding live video streaming capabilities to their API. It's a feature that could catalyze countless innovative projects and applications.

What are your thoughts on the potential of live video processing with AI? How could it change the game for robotics and beyond? Let's brainstorm in the comments!

49
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/haandsom1 on 2024-01-22 22:21:18+00:00.


"Far from being “stochastic parrots,” the biggest large language models seem to learn enough skills to understand the words they’re processing."

"A trained and tested LLM, when presented with a new text prompt, will generate the most likely next word, append it to the prompt, generate another next word, and continue in this manner, producing a seemingly coherent reply. Nothing in the training process suggests that bigger LLMs, built using more parameters and training data, should also improve at tasks that require reasoning to answer.

BUT THEY DO. Big enough LLMs demonstrate abilities — from solving elementary math problems to answering questions about the goings-on in others’ minds — that smaller models don’t have, even though they are all trained in similar ways."

50
 
 
This is an automated archive.

The original was posted on /r/singularity by /u/PerceptionHacker on 2024-01-22 22:18:31+00:00.
