
I am completely new to computer vision and I am working on a small hobby project. The goal is to use camera footage of a foosball table to map the image to already well defined object geometry with few moving parts.

To illustrate the problem, the input could look like this: [Video footage of a foos match]

I want to identify the features in the image and map them onto an exact model that could be rendered like so: [Geometry features mapped on the model]. The rendering shows the positions of the ball and the player rods.

I would already have an exact definition of what the image contains like:

  • Exact size of the playing field
  • Exact positions of player rods relative to the length of the field
  • The player sizes and positions on the bar
  • Every detail of the table is already described in the model and cannot deviate from it

Some limitations:

  • The camera always views the table at an angle, so part of the playing field is cropped and would have to be extrapolated.
  • The lighting can vary from one set-up to another.

There are only a few moving parts that need to be pinpointed: the position of the rods (rotation can be disregarded for now) and the position of the ball.

It sounds quite complicated, but since I already know exactly what I am looking for, that may make the task easier.

What would be the most straightforward approach to solving this problem? Any suggestions or further reading would be greatly appreciated.

My current idea is to identify a few anchor points in the image. For example, I could identify the top edge of the playing field, where the green pitch ends, and then the position of the topmost player rod, the goalie rod. Because I know the exact spatial relationship between these two features in the pre-defined table model, measuring the distance between the two lines in the image would let me estimate how steeply the camera looks down on the table. With that, I could calculate where all the other features should appear in the image. Narrowing down the image regions that need to be searched for each table feature would add further constraints and simplify the task.
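One way to make the anchor-point idea concrete is a planar homography: the playing field is flat, so four known model points and their pixel positions determine a mapping between table coordinates and image coordinates. The sketch below is a minimal pure-Python illustration, not a finished solution. It assumes the four anchors are the points where the top pitch edge and the goalie rod line meet the two side walls. All dimensions, pixel coordinates, and function names are hypothetical placeholders.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(src, dst):
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping 4 src points to 4 dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, from
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through H, including the perspective divide."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical table model in millimetres: x across the table, y along its length.
FIELD_WIDTH = 680.0
GOALIE_ROD_Y = 150.0
model_pts = [(0.0, 0.0), (FIELD_WIDTH, 0.0),
             (0.0, GOALIE_ROD_Y), (FIELD_WIDTH, GOALIE_ROD_Y)]

# Hypothetical pixel positions of the same four points in one video frame.
image_pts = [(400.0, 100.0), (880.0, 100.0), (390.0, 160.0), (890.0, 160.0)]

model_to_image = fit_homography(model_pts, image_pts)
image_to_model = fit_homography(image_pts, model_pts)

# Predict where other rods (hypothetical y positions) should appear in the
# frame, including parts of the table that may lie outside the cropped image.
for rod_y in (300.0, 450.0, 600.0):
    left = apply_homography(model_to_image, (0.0, rod_y))
    right = apply_homography(model_to_image, (FIELD_WIDTH, rod_y))
    print(rod_y, left, right)

# Convert a detected ball pixel (hypothetical) back into table coordinates.
print(apply_homography(image_to_model, (640.0, 300.0)))
```

Since both anchor lines sit close together at one end of the table, small detection errors are magnified when extrapolating towards the far end. Adding anchors that are spread further apart, and fitting them in a least-squares sense, would likely be more robust.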

Another idea I had was to use an image of the table surface to identify the table orientation. For instance, given the input picture above, I might also have a hardcoded reference picture like this: [Reference image of the playing area]

Can this reference picture be located in the input image, even though it appears skewed and cropped there, and can the table position then be inferred from overlaying the two images?

apriede