Questions tagged [yolo]

For questions related to the family of models known as YOLO (which stands for "You Only Look Once"), which were proposed by Joseph Redmon et al. There are at least three YOLO models (versions 1, 2, and 3).

83 questions
11
votes
3 answers

Is it difficult to learn the rotated bounding box for a (rotated) object?

I have checked out many methods and papers, like YOLO, SSD, etc., with good results in detecting a rectangular box around an object. However, I could not find any paper that shows a method that learns a rotated bounding box. Is it difficult to learn…
9
votes
1 answer

In YOLO, what exactly do the values associated with each anchor box represent?

I'm going through Andrew Ng's course, which talks about YOLO, but he doesn't go into the implementation details of anchor boxes. After having looked through the code, each anchor box is represented by two values, but what exactly are these values…
7
votes
1 answer

What are the differences between Yolo v1 and CenterNet?

I recently read a new paper (late 2019) about a one-shot object detector called CenterNet. Apart from this, I'm using the Yolo (V3) one-shot detector, and what surprised me is the close similarity between Yolo V1 and CenterNet. First, both frameworks…
7
votes
2 answers

What's the role of bounding boxes in object detection?

I'm quite new to the field of computer vision and was wondering what the purpose of bounding boxes in object detection is. Obviously, they show where the detected object is, and using a classifier can only classify one object per image,…
5
votes
1 answer

In YOLO, when is $\mathbb{1}_{i j}^{\mathrm{obj}} = 1$, and what are the ground-truth labels for $x_i$ and $y_i$?

I'm trying to implement a custom version of the YOLO neural network. Originally, it was described in the paper You Only Look Once: Unified, Real-Time Object Detection (2016). I have some problems understanding the loss function they used. Basic…
4
votes
1 answer

What are the main differences between YOLOv3 and RetinaNet object detection algorithms?

I am looking at a certain project that compares performance on a certain dataset for an object detection problem using YOLOv3 and RetinaNet (or the "SSD_ResNet50_FPN" from TF Model Zoo). Both YOLOv3 and RetinaNet seem to have similar features like…
4
votes
1 answer

What is a unified neural network model?

In many articles (for example, in the YOLO paper, this paper or this one), I see the term "unified" being used. I was wondering what the meaning of "unified" in this case is.
4
votes
1 answer

How can I incrementally train a Yolo model without catastrophic forgetting?

I have successfully trained a Yolo model to recognize k classes. Now I want to add a (k+1)-th class and train on top of the pre-trained weights (k classes) without forgetting the previous k classes. Ideally, I want to keep adding classes and train over the previous…
3
votes
3 answers

Would YOLO be able to detect objects in "different" positions?

I have the following question about the You Only Look Once (YOLO) algorithm for object detection. I have to develop a neural network to recognize web components in web applications, for example, login forms, text boxes, and so on. In this context, I…
3
votes
2 answers

Which YOLO algorithm can I use for noisy images in a real-time implementation?

I want to detect drivers with or without seatbelts at crossroads. Since this is real-time, I am going to use the YOLO algorithm/model. To collect the training data set (the images), I placed a camera. By recording it and collecting…
3
votes
2 answers

How can I detect thin objects (like pens and pencils) without a bounding box but only 2 endpoints and the orientation?

I am looking to detect thin objects, like pens, pencils, and surgical instruments. The bounding box is not important, but I am looking to see if I can train a model to detect both the object and its orientation. Typical object detection…
3
votes
4 answers

Labeling policy for airplane detecting YOLO

I am training my YOLO to detect airplanes and drones. In some of the pictures, it is impossible to tell that the object is indeed an airplane, and it even looks like a drone (the pictures are taken from very far away), but I know from the context that…
3
votes
1 answer

YOLO - are the anchor boxes used only in training?

Another question about YOLO. I've read about how YOLO adjusts anchor boxes by offsets to create the final bounding boxes. What I do not understand is when YOLO does it. Is it being done only during the training process, or also during the common use of…
3
votes
1 answer

How to treat (label and process) edge case inputs in machine learning?

In every computer vision project, I struggle with labeling guidelines for border cases. Benchmark datasets don't have this problem, because they are 'cleaned', but in real life, uncertain cases often constitute the majority of the data. Is 15% of a cat's…
3
votes
0 answers

How is the ground truth provided to each pyramid map in the RetinaNet or YOLOv3 paper? How are the feature pyramids mapped to the ground truth?

So YOLO V3 and RetinaNet both use feature pyramids, which look something like this: (except b and e, which have one output). I'm just confused about how the predictions and training are done. Do we have to give each feature map a different Y label? If…