Most Popular
1500 questions
5
votes
2 answers
What type of neural network would be most feasible for playing a realtime game?
For implementing a neural network algorithm that can play air hockey, I had two ideas for input, and I'm trying to figure out which design would be most viable.
The output must be two analog values that dictate the best position on half of the table…
Patrick Roberts
- 153
- 5
5
votes
2 answers
In a neural network given partial inputs and complete outputs, is it possible to predict the remaining inputs?
For example, if there is a simple feed-forward neural network with 3 input neurons, 3 hidden neurons, and one output neuron, is it possible to predict the value of an input neuron given the values and weights for the other two inputs and the…
Beryllium
- 217
- 1
- 3
5
votes
3 answers
How do open source LLMs compare to GPT-4?
I have heard some back and forth regarding open source LLMs like Llama.
I have heard that on certain benchmarks they perform close to, the same as, or better than GPT-4, with the caveat that they tend to lack the diversity and range of GPT-4, and also fail to…
Julius Hamilton
- 225
- 2
- 10
5
votes
1 answer
Who invented DAN?
DAN was a prompt that went through many, many iterations during the initial months of ChatGPT’s release to the public. DAN is an acronym that stood for “Do Anything Now”, and it was a prompt specifically designed to circumvent the guardrails OpenAI…
Julius Hamilton
- 225
- 2
- 10
5
votes
1 answer
How can a decoder-only transformer be used for document embedding?
GPT-3 and GPT-4 are both examples of decoder-only models. However, OpenAI offers a text embedding API endpoint based on these models. This raises the general question: how can one obtain text embeddings from a decoder-only transformer model?
Goods
- 153
- 1
- 5
5
votes
1 answer
What is a good heuristic function for A* to solve the "blocks world" game?
I am developing a heuristic solution for the blocks world problem.
I tried using the number of blocks out of place as my $h(n)$. It seems a little ineffective.
Can someone please point out a suitable heuristic for the problem and explain with a few…
user3758749
- 51
- 1
- 3
5
votes
3 answers
For an LLM, how can I estimate its memory requirements based on storage usage?
It is easy to see the amount of disk space consumed by an LLM (downloaded from Hugging Face, for instance). Just go to the relevant directory and check the file sizes.
How can I estimate the amount of GPU RAM required to run the model?
For…
ahron
- 265
- 2
- 7
5
votes
1 answer
Detect patterns in sequences of actions
I have to analyse sequences of actions that look more or less like this JSON blob. The question I'm trying to answer is whether there are recurring (sub)patterns that different users adopt when asked to perform a certain specific task -- in this…
Morpheu5
- 101
- 4
5
votes
3 answers
Can you confirm that the transformer works strictly deterministically and there is no randomness inside or between the attention layers?
At a high level, temperature and randomness affect the output of a generative language model:
Lower temperature: Produces more focused, conservative, and consistent responses.
Moderate temperature: Strikes a balance between creativity and…
Hans-Peter Stricker
- 931
- 1
- 8
- 23
5
votes
1 answer
Why is AI safety so much harder than Isaac Asimov's "Three Laws of Robotics"?
I understand that AI researchers are trying to create AI designs that allow for desired behavior without undesirable side-effects. A classic example of an attempt is Isaac Asimov's Three Laws of Robotics. This idea seems to have been debunked due to…
N00b101
- 191
- 1
- 5
5
votes
2 answers
Machine learning with graph as input and output
In my application, I have inputs and outputs that could be represented as graphs. I have a number of acceptable pairs of input and output graphs. I want to use these to train a model.
I am looking for pointers where simple examples of learning…
Suresh
- 159
- 6
5
votes
2 answers
What is curriculum learning in reinforcement learning?
I recently came across the term "curriculum learning" in the context of DRL and was intrigued by its potential to improve the learning process. As such, what is curriculum learning? And how can it be helpful for the convergence of RL algorithms?
Robin van Hoorn
- 2,780
- 2
- 12
- 33
5
votes
2 answers
LLM-like architecture capable of dynamically learning from its own output
Large Language Models (LLMs) have demonstrated remarkable capabilities in quick learning during inference. They can effectively grasp a concept from a single example and generate relevant outputs. However, a noticeable limitation of LLMs is their…
MaiaVictor
- 405
- 4
- 11
5
votes
3 answers
Where would I start if I wanted to create an AI agent to play a 2d game?
I am keen on creating a little project that can play a fairly basic 2D game (more complex than, say, Snake but not as complex as Mario Kart) and would like some pointers on where to begin. I'm entirely new to any coding/programming but have a basic…
Eckavolo
- 51
- 1
- 3
5
votes
2 answers
What makes the approximation capabilities of neural networks different from something like, say, Fourier series?
People often cite the universal approximation theorem as a reason why neural networks are so effective at capturing patterns or features of various training data. However, this seems unremarkable to me, because something like Fourier series are…
MaximusIdeal
- 153
- 4