
I am currently trying to build a DQN agent that plays the game UNO. The observation it gets looks like this:

{
   'agent_cards': BoundedArraySpec(shape=(108,2), dtype=np.int32, minimum=[0,0] , maximum=[5,15], name='agent_cards'),
   'top_discard_card': (
       BoundedArraySpec(shape=(2,), dtype=np.int32, minimum=0, maximum=5, name='top_discard_card_color'),
       BoundedArraySpec(shape=(2,), dtype=np.int32, minimum=0, maximum=15, name='top_discard_card_type')
   ),
   'enemy_card_amounts': BoundedArraySpec(shape=(3,), dtype=np.int32, minimum=0, maximum=108, name='enemy_card_amounts'),
   'turn_order_reversed': BoundedArraySpec(shape=(1,), dtype=np.int32, minimum=0, maximum=1, name='turn_order_reversed')
}

My action spec looks like this:

BoundedArraySpec(shape=(), dtype=np.int32, minimum=0, maximum=107, name='card_input')

The idea is to give the agent a list of up to 108 cards (the maximum hand size) and specify each card's color (0-5 for none, red, green, blue, yellow, and wild) as well as its type (0 for none, 1-10 for the numbers 0-9, and 11-15 for draw 2, reverse, skip, wild, and draw 4).
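To make the encoding concrete, here is a minimal sketch of how a hand could be turned into the (108, 2) array described above. The names `COLORS`, `TYPES`, and `encode_hand` are hypothetical and only illustrate the integer codes listed in this post; they are not taken from the colab.

```python
import numpy as np

MAX_HAND = 108  # maximum hand size from the observation spec

# Integer codes as described in the post (the dictionary names are illustrative).
COLORS = {"none": 0, "red": 1, "green": 2, "blue": 3, "yellow": 4, "wild": 5}
TYPES = {
    "none": 0,
    **{str(n): n + 1 for n in range(10)},  # number cards 0-9 map to 1-10
    "draw2": 11, "reverse": 12, "skip": 13, "wild": 14, "draw4": 15,
}

def encode_hand(cards):
    """Encode a list of (color, type) name pairs as a (108, 2) int32 array.

    Unused slots stay at (0, 0), which means 'none' in both columns.
    """
    if len(cards) > MAX_HAND:
        raise ValueError("hand exceeds the maximum size")
    hand = np.zeros((MAX_HAND, 2), dtype=np.int32)
    for slot, (color, card_type) in enumerate(cards):
        hand[slot] = (COLORS[color], TYPES[card_type])
    return hand

hand = encode_hand([("red", "7"), ("wild", "draw4"), ("blue", "skip")])
print(hand[:4])  # three filled slots followed by an empty (0, 0) slot
```

With this layout, the color and type columns are plain integer category IDs, which is exactly what the Reshape and cast layers below would pass on to the network.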

The agent also receives a tuple of the data for the card on top of the discard pile, as well as the number of cards the other 3 players have, and whether the turn order has been reversed.

The current action space has the agent select a value from 0 to 107 corresponding to a card in its hand. This is the card it will attempt to play onto the discard pile.
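As a rough sketch of what "attempt to play" could mean for this action space: the action indexes a slot in the hand, and the play only succeeds if the slot is non-empty and the card matches the top discard card. The function `resolve_action` is hypothetical, and the matching rule is a simplified assumption (same color, same type, or a wild card); it ignores the color chosen after a wild card and any draw 4 restrictions.

```python
import numpy as np

WILD_COLOR = 5  # color code for wild cards in the encoding described above

def resolve_action(agent_cards, top_color, top_type, action):
    """Return the (color, type) pair selected by `action`.

    Returns None if the slot is empty or the card does not match the top
    discard card under the simplified rule assumed here.
    """
    color, card_type = (int(v) for v in agent_cards[action])
    if color == 0 and card_type == 0:
        return None  # empty slot, nothing to play
    playable = (
        color == WILD_COLOR or color == top_color or card_type == top_type
    )
    return (color, card_type) if playable else None

# Example hand: red 7, wild draw 4, blue skip; all other slots are empty.
agent_cards = np.zeros((108, 2), dtype=np.int32)
agent_cards[0] = (1, 8)
agent_cards[1] = (5, 15)
agent_cards[2] = (3, 13)

# Top of the discard pile: red 2 (color 1, type 3).
print(resolve_action(agent_cards, 1, 3, 0))   # red 7 matches by color
print(resolve_action(agent_cards, 1, 3, 2))   # blue skip does not match
print(resolve_action(agent_cards, 1, 3, 50))  # empty slot
```

Note that most of the 108 actions point at empty slots in a typical hand, so the environment has to decide what happens when the agent picks one.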

My main question is: how should I encode the card data for the agent? Is my current method a good fit for machine learning, or is there another encoding scheme that would fit my agent better?

In case it matters, my preprocessing layers are:

preprocessing_layers = {
      'agent_cards': keras.Sequential([
          keras.layers.Reshape((216,)),
          TypeCastLayer(dtype="float32")
      ]),
      'top_discard_card': keras.Sequential([
          keras.layers.Concatenate(axis=-1, name="con2"),
          TypeCastLayer()
      ]),
      'enemy_card_amounts': keras.Sequential([TypeCastLayer()]),
      'turn_order_reversed': keras.Sequential([TypeCastLayer()])
  }

(Note that TypeCastLayer simply converts int32 to float32.) My preprocessing combiner is keras.layers.Concatenate(axis=-1, name="con3").
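To show what the network ends up seeing, here is a NumPy-only sketch of the same flatten, cast, and concatenate steps for a single unbatched observation. The function `flatten_observation` is hypothetical, and the order in which the four parts are concatenated is an assumption; the real Keras layers operate on batched tensors and the combiner may order its inputs differently.

```python
import numpy as np

def flatten_observation(obs):
    """Mimic the preprocessing layers and combiner on one observation."""
    # Reshape (108, 2) -> (216,) and cast, like the 'agent_cards' branch.
    cards = obs["agent_cards"].reshape(-1).astype(np.float32)
    # Concatenate the (color, type) tuple, like the 'top_discard_card' branch.
    top = np.concatenate(obs["top_discard_card"], axis=-1).astype(np.float32)
    enemies = obs["enemy_card_amounts"].astype(np.float32)
    reversed_flag = obs["turn_order_reversed"].astype(np.float32)
    # Final combiner: one flat feature vector.
    return np.concatenate([cards, top, enemies, reversed_flag], axis=-1)

# Dummy observation using the shapes from the observation spec.
obs = {
    "agent_cards": np.zeros((108, 2), dtype=np.int32),
    "top_discard_card": (
        np.zeros((2,), dtype=np.int32),
        np.zeros((2,), dtype=np.int32),
    ),
    "enemy_card_amounts": np.array([7, 7, 7], dtype=np.int32),
    "turn_order_reversed": np.array([0], dtype=np.int32),
}

features = flatten_observation(obs)
print(features.shape)  # (224,) = 216 + 4 + 3 + 1
```

So with the specs as written, the Q-network receives a 224-element float vector in which the color and type codes appear as raw numbers.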

A link to the colab can be found here: https://colab.research.google.com/drive/1qZuIXChYpNJ4a9nKZDNIL_tdU2hF359N#scrollTo=V5AbtOYLakmk

(It currently gives the error "'NoneType' object has no attribute 'is_first'", but that is because I have not yet implemented the actual _step method.)
