
Let $\mathcal{S}$ be the training data set, where each input $u^i \in \mathcal{S}$ has $d$ features.

I want to design an ANN so that the cost function below (the sum of squared pairwise differences between model outputs) is minimized and the given constraint is satisfied, where $w$ is the ANN's parameter vector.

\begin{align} \min_{w} \quad & \sum_{\{i, j\} \in \mathcal{S}}\left(f\left(w, u^{i}\right)-f\left(w, u^{j}\right)\right)^{2} \\ \text{s.t.} \quad & f\left(w, u^{i}\right) \geq q_{\min}, \quad i \in \mathcal{S} \end{align}

What kind of ANN is suitable for this purpose?


2 Answers


If I understand your query correctly, you want to create a latent space that groups similar objects. You should then probably look into Siamese networks. However, your loss function will need another term to increase dissimilarity between different labels. Otherwise, as pointed out by Mike NZ, the net would collapse (yes, this is possible). Perhaps this will give some insights.
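For instance, a minimal Siamese/contrastive sketch in PyTorch might look like the following (the encoder layout, the margin value, and the convention that `y = 1` marks a similar pair are illustrative assumptions, not something given in the question):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Shared encoder used by both branches of the Siamese network."""
    def __init__(self, d, latent_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, u):
        return self.net(u)

def contrastive_loss(z_i, z_j, y, margin=1.0):
    """y = 1: similar pair (pull embeddings together); y = 0: dissimilar pair (push apart)."""
    dist = torch.norm(z_i - z_j, dim=1)
    pull = y * dist.pow(2)                                      # attract similar pairs
    push = (1 - y) * torch.clamp(margin - dist, min=0).pow(2)   # repel dissimilar pairs up to the margin
    return (pull + push).mean()
```

The second (push) term is the extra dissimilarity term mentioned above; it is what prevents the embedding from collapsing to a single point.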

Note that the above method is not completely unsupervised. There are, in fact, a few claims of unsupervised classification via clustering, although your objective function would look very different. You could go through this paper (called SCAN) for more details.

Hope it helps.

[Edit]

If you want a (lower-dimensional) representation of the objects themselves, browsing through this could help. For a complex problem, linear reductions like PCA, although helpful, probably aren't what you're looking for. In that case, you can try training an autoencoder. Your loss function would still work there, along with some regularization term.
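As a rough sketch (the layer sizes and the MSE reconstruction loss are assumptions), an autoencoder in PyTorch could look like:

```python
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Maps d-dimensional inputs to a low-dimensional code and reconstructs them."""
    def __init__(self, d, code_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 32), nn.ReLU(), nn.Linear(32, d))

    def forward(self, u):
        code = self.encoder(u)
        return self.decoder(code), code

# Trained with a reconstruction loss such as nn.MSELoss()(reconstruction, u),
# plus whatever regularization term you choose; the code vector is the learned representation.
```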


ChatGPT May 12 Version says:

The cost function you have specified is a pairwise loss function that penalizes differences between the outputs of the ANN for any two input instances. It is similar to the contrastive loss function used in Siamese networks. Siamese networks are a type of neural network architecture consisting of two or more identical subnetworks that share the same weights and are used to compare two inputs.

Therefore, a Siamese Network is a suitable choice for minimizing the cost function and satisfying the given constraint. In particular, you can use a Siamese Network with a shared encoder that maps each input instance to a fixed-length vector representation, followed by a distance metric that computes the pairwise distance between the encoded representations. The distance metric can be implemented using a fully connected layer or a custom loss function.
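To make this concrete, a minimal sketch of the question's pairwise objective with a single weight-sharing network (PyTorch; the hidden-layer sizes and the batch-wise pairing are illustrative assumptions) might be:

```python
import itertools
import torch.nn as nn

d = 10  # number of input features (placeholder value)

# One scalar-output network f(w, u); applying the same module to every input is what makes the weights "shared".
f = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))

def pairwise_loss(U):
    """Sum of (f(w, u_i) - f(w, u_j))^2 over all pairs {i, j} in the batch U of shape (batch, d)."""
    outputs = f(U).squeeze(-1)      # shape: (batch,)
    loss = outputs.new_zeros(())
    for i, j in itertools.combinations(range(len(outputs)), 2):
        loss = loss + (outputs[i] - outputs[j]) ** 2
    return loss
```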

To satisfy the given constraint, you can add a constraint layer to the network that enforces a minimum output value for each input instance. The constraint layer can be implemented with an output activation that is bounded below by the minimum value, for example by adding $q_{\min}$ to a non-negative activation such as ReLU or softplus.
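One way to realize such an output layer (a sketch; `q_min` and the layer sizes are placeholders) is to add $q_{\min}$ to a softplus activation:

```python
import torch.nn as nn
import torch.nn.functional as F

class ConstrainedNet(nn.Module):
    """Scalar-output network whose output can never fall below q_min."""
    def __init__(self, d, q_min):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))
        self.q_min = q_min

    def forward(self, u):
        # softplus(x) > 0 for every x, so q_min + softplus(x) always satisfies f(w, u) >= q_min
        return self.q_min + F.softplus(self.body(u))
```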

Overall, the architecture of the Siamese Network depends on the specifics of the problem, including the dimensionality of the input space and the desired level of accuracy. You may need to experiment with different architectures and hyperparameters to achieve the desired performance.
