
I would like to build a neural network that takes a natural number and generates a one-hot encoding vector corresponding to that number.

Example: $2 \rightarrow (0,0,1,0,\dots)$

More formally, I want it to take an input $i \in \{0, \dots, K\}$ and produce an output vector $(o_0, \dots, o_K)$ where $o_j = 0$ for all $j \neq i$ and $o_i = 1$ (i.e. a one-hot encoding).
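
For concreteness, the target vector I want the network to reproduce is exactly what PyTorch's built-in torch.nn.functional.one_hot computes (here with $K = 24$, i.e. 25 classes, matching my code below):

import torch
import torch.nn.functional as F

# The desired mapping: 2 -> (0, 0, 1, 0, ..., 0)
target = F.one_hot(torch.tensor(2), num_classes=25)
print(target)  # a length-25 vector with a 1 at index 2 and 0s elsewhere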

However, I am not sure what the specific architecture should be. I believe it needs more than one layer, since there is no linear relationship between the input and the output.
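
To see why a single linear layer cannot work: each output would be an affine function $o_j = w_j i + b_j$ of the input, but at any fixed interior index $j$ the target takes the values $0, 1, 0$ as $i$ runs over $j-1, j, j+1$, and no affine function of $i$ can rise and then fall.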

I have tried several architectures, but the best accuracy I have been able to get is 33%. I am using the following code:

import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
import random
random.seed(0)
torch.manual_seed(0)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(1, 55)
        self.fc2 = nn.Linear(55, 30)
        self.fc3 = nn.Linear(30, 25)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p=0.2)
        self.softmax = nn.Softmax(dim=0)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc3(x)
        x = self.softmax(x)
        return x

net = Net()
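
As a quick sanity check (an illustrative snippet, not part of my training code), the forward pass on an unbatched input gives a 25-dimensional probability vector:

x = torch.tensor([2.0])    # a single unbatched input of size 1
out = net(x)
print(out.shape)           # torch.Size([25])
print(out.sum().item())    # ~1.0: Softmax(dim=0) normalizes over all 25 outputs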

# A list of 10000 random numbers between 0 and 24

dataset = pd.DataFrame([random.randint(0, 24) for _ in range(10000)])
dataset['label'] = dataset[0]  # The label is the same as the input
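
Just to confirm the layout (a small check, not part of the pipeline): column 0 holds the number and 'label' repeats it, so the model's input and its class label are the same value.

assert (dataset[0] == dataset['label']).all()  # both columns are identical by construction
print(dataset.shape)  # (10000, 2)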

train, test = train_test_split(dataset, test_size=0.2)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

for epoch in range(5):  # Train
    running_loss = 0.0
    for i in range(len(train)):
        inputs = train.iloc[i, 1:].tolist()
        inputs = torch.tensor(inputs, dtype=torch.float)
        labels = train.iloc[i, 0]
        labels = torch.tensor(labels, dtype=torch.long)

        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

# Evaluate
correct = 0
total = 0
with torch.no_grad():
    for i in range(len(test)):
        inputs = test.iloc[i, 1:].tolist()
        inputs = torch.tensor(inputs, dtype=torch.float)
        labels = test.iloc[i, 0]
        labels = torch.tensor(labels, dtype=torch.long)
        outputs = net(inputs)
        predicted = outputs.argmax()
        total += 1
        correct += (predicted == labels).item()

print('Accuracy of the network on the test set: {:.2f}%'.format(100 * correct / total))

I believe my task is pretty simple, but I just can't work out the right architecture for it. Do you have any ideas? :)
