Most Popular

1500 questions
5
votes
0 answers

Why did fuzzy logic fall out of fashion?

Fuzzy logic seemed like an active area of research in machine learning and data mining back when I was in grad school (early 2000s). Fuzzy inference systems, fuzzy c-means, fuzzy versions of the various neural network and support vector machine…
Alex S King
5
votes
2 answers

Why did the L1/L2 regularization technique not improve my accuracy?

I am training a multilayer neural network with 146 samples (97 for the training set, 20 for the validation set, and 29 for the testing set). I am using: automatic differentiation, SGD method, fixed learning rate + momentum term, logistic…
5
votes
2 answers

Can artificial intelligence applications be hacked?

Can artificial intelligence (or machine learning) applications or agents be hacked, given that they are software applications, or are all AI applications secure?
ME.
5
votes
1 answer

How to constrain the output value of a neural network?

I am training a deep neural network. There is a constraint on the output value of the neural network (e.g. the output has to be between 0 and 180). I think some possible solutions are using sigmoid, tanh activation at the end of the layer. Are there…
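The sigmoid/tanh idea mentioned in the question can be sketched as follows: squash the final pre-activation into a fixed range, then rescale it to the target interval. This is only an illustrative sketch in plain Python; the function names `scaled_sigmoid` and `scaled_tanh` and the default bounds 0 and 180 (taken from the question's example) are assumptions, not part of any library.

```python
import math

def scaled_sigmoid(z, low=0.0, high=180.0):
    """Squash an unbounded pre-activation z into [low, high] (hypothetical helper)."""
    # Numerically stable logistic function: never exponentiates a large positive number.
    if z >= 0:
        s = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        s = e / (1.0 + e)
    # Rescale from [0, 1] to [low, high].
    return low + (high - low) * s

def scaled_tanh(z, low=0.0, high=180.0):
    """Same idea with tanh: shift [-1, 1] to [0, 1], then rescale (hypothetical helper)."""
    return low + (high - low) * (math.tanh(z) + 1.0) / 2.0

if __name__ == "__main__":
    for z in (-5.0, 0.0, 5.0):
        print(z, scaled_sigmoid(z), scaled_tanh(z))
```

Both mappings saturate near the bounds, so gradients become small when the target lies close to 0 or 180; whether that matters depends on the task.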
5
votes
1 answer

Which model should I use to determine the similarity between predefined sentences and new sentences?

The Levenshtein algorithm and some ratio and proportion may handle this use case. Based on the pre-defined sequence of statements, such as "I have a dog", "I own a car" and many more, I must determine if another input statement such as "I have a…
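The Levenshtein-plus-ratio approach mentioned in the question could look roughly like the sketch below. The helper names `levenshtein`, `similarity`, and `best_match` are hypothetical, and normalizing by the longer string's length is one assumed choice of "ratio" among several. Note that this measures character-level edits only, not meaning.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Assumed ratio: 1 - distance / length of the longer string, in [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest

def best_match(query, candidates):
    """Return the predefined sentence with the highest similarity to query."""
    return max(candidates, key=lambda c: similarity(query, c))

if __name__ == "__main__":
    predefined = ["I have a dog", "I own a car"]
    print(best_match("I have a cat", predefined))
```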
5
votes
0 answers

Does overfitting imply an upper bound on model size/complexity?

Suppose that I have a model M that overfits a large dataset S such that the test error is 30%. Does that mean that there will always exist a model that is smaller and less complex than M that will have a test error less than 30% on S (and does not…
5
votes
3 answers

Which neural network to use for optical mark recognition?

I've created a neural net using the ConvNetSharp library which has 3 fully connected hidden layers. The first having 35 neurons and the other two having 25 neurons each, each layer with a ReLU layer as the activation function layer. I'm using this…
5
votes
1 answer

How can my Q-learning agent trained to solve a specific maze generalize to other mazes?

I implemented Q-learning to solve a specific maze. However, it doesn't solve other mazes. How could my Q-learning agent be able to generalize to other mazes?
5
votes
0 answers

What are the ways to calculate the error rate of a deep Convolutional Neural Network, when the network produces different results using the same data?

I am new to the object recognition community. Here I am asking about the broadly accepted ways to calculate the error rate of a deep CNN when the network produces different results using the same data. 1. Problem introduction Recently I was trying…
5
votes
1 answer

Should the reward or the Q value be clipped for reinforcement learning?

When extending reinforcement learning to the continuous states, continuous action case, we must use function approximators (linear or non-linear) to approximate the Q-value. It is well known that non-linear function approximators, such as neural…
5
votes
2 answers

How do I calculate the gradient of the hinge loss function?

With reference to the research paper entitled Sentiment Embeddings with Applications to Sentiment Analysis, I am trying to implement its sentiment ranking model in Python, for which I am required to optimize the following hinge loss function:…
5
votes
4 answers

Traffic signs dataset

I'm looking for an annotated dataset of traffic signs. I was able to find Belgian, German and many more traffic sign datasets. The only problem is that these datasets contain only cropped images, like this: While I need (for YOLO -- You Only Look Once…
5
votes
2 answers

How fast is TensorFlow compared to self-written neural nets?

I made my first neural net in C++ without any libraries. It was a net to recognize numbers from the MNIST dataset. In a 784 - 784 - 10 net with sigmoid function and 5 epochs with every 60000 samples, it took about 2 hours to train. It was probably…
Evator
5
votes
1 answer

How does L2 regularization make weights smaller?

I'm learning logistic regression and $L_2$ regularization. The cost function looks like below. $$J(w) = -\displaystyle\sum_{i=1}^{n} (y^{(i)}\log(\phi(z^{(i)}))+(1-y^{(i)})\log(1-\phi(z^{(i)})))$$ And the regularization term is added. ($\lambda$ is a…
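As a rough illustration of the shrinking effect the title asks about: assuming the added penalty has the common form $\frac{\lambda}{2}\lVert w\rVert^2$ (the excerpt is cut off before stating it), its gradient is $\lambda w$, so each gradient-descent step multiplies the weights by $(1 - \eta\lambda)$ before applying the data gradient. The function name `l2_step`, the learning rate `lr`, and the value of `lam` below are all hypothetical.

```python
def l2_step(w, grad, lr=0.1, lam=0.5):
    """One gradient-descent step on J(w) + (lam / 2) * ||w||^2.

    w    : list of current weights
    grad : gradient of the unregularized cost J at w
    Assumed penalty form; its gradient lam * w pulls every weight toward zero.
    """
    return [(1.0 - lr * lam) * wi - lr * gi for wi, gi in zip(w, grad)]

if __name__ == "__main__":
    w = [1.0, -2.0]
    # With a zero data gradient, only the penalty acts: weights decay geometrically.
    for _ in range(3):
        w = l2_step(w, [0.0, 0.0])
        print(w)
```

With a nonzero data gradient the two terms compete, so the weights settle where the pull of the data balances the pull toward zero.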
5
votes
1 answer

Do good approximations produce good gradients?

Let’s say I have a neural net doing classification and I’m doing stochastic gradient descent to train it. If I know that my current approximation is a decent approximation, can I conclude that my gradient is a decent approximation of the gradient of…
Stella Biderman