For questions related to object detection (OD), where objects can be e.g. humans, dogs, or houses. The meaning of the term varies with context: OD can refer to the task of locating an object in an image, i.e. finding its coordinates (in which case it is a synonym for object localization), or to the task of locating the object and classifying it (i.e. object localization + object classification).
Questions tagged [object-detection]
233 questions
12
votes
0 answers
Extending FaceNet’s triplet loss to object recognition
FaceNet uses a novel loss metric (triplet loss) to train a model to output embeddings (128-D from the paper), such that any two faces of the same identity will have a small Euclidean distance, and such that any two faces of different identities will…
rossignol
11
votes
3 answers
Is it difficult to learn the rotated bounding box for a (rotated) object?
I have checked out many methods and papers, like YOLO, SSD, etc., with good results in detecting a rectangular box around an object. However, I could not find any paper that shows a method that learns a rotated bounding box.
Is it difficult to learn…
Ankish Bansal
7
votes
1 answer
What are the differences between Yolo v1 and CenterNet?
I recently read a new paper (late 2019) about a one-shot object detector called CenterNet. Apart from this, I'm using the Yolo (V3) one-shot detector, and what surprised me is the close similarity between Yolo V1 and CenterNet.
First, both frameworks…
Louis Lac
7
votes
2 answers
What's the role of bounding boxes in object detection?
I'm quite new to the field of computer vision and was wondering what the purpose of bounding boxes in object detection is.
Obviously, it shows where the detected object is, and using a classifier can only classify one object per image,…
Cody Chung
6
votes
1 answer
Formal definition of the Object Detection problem
For many problems in computer science, there is a formal, mathematical problem definition.
Something like: Given ..., the problem is to ...
How can the Object Detection problem (i.e. detecting objects in an image) be formally defined?
Given a set of…
JavAlex
6
votes
1 answer
How does the region proposal method work in Fast R-CNN?
I have read many articles and the Fast R-CNN paper, but I'm still confused about how the region proposal method works in Fast R-CNN.
As you can see in the image below, they say they used a proposal method, but it is not specified how it works.
What…
ozoubia
6
votes
0 answers
Are there any easy ways to create annotated training images for object detection?
For the purposes of object detection, are there any easy ways to create annotated training images? For example, if we have $10,000$ images and want to draw bounding boxes on 2 objects for each image, do we have to physically draw those boxes? Is…
James
5
votes
1 answer
Why are object detection algorithms poor at optical character recognition?
OCR is still a very hard problem. We don't have universal powerful solutions. We use the CTC loss function (An Intuitive Explanation of Connectionist Temporal Classification | Towards Data Science; Sequence Modeling With CTC | Distill), which is very…
user40943
5
votes
1 answer
Should I train different models for detecting subsets of objects?
Suppose we have $1000$ products that we want to detect. For each of these products, we have $500$ training images/annotations. Thus we have $500,000$ training images/associated annotations. If we want to train a good object detection algorithm to…
NebulousReveal
5
votes
1 answer
Do models train better if the labelling information is more specific (or dense)?
I'm working on a project where there is a limited dataset of videos (about 200). We want to train a model that can detect a single class in the videos. That class can be of multiple different types of shapes (thin wire, a huge area of the screen,…
NateW
5
votes
4 answers
Can bounding boxes further improve the performance of a CNN classifier?
Suppose I have a standard image classification problem (i.e. a CNN is shown a single image and predicts a single classification for it). If I were to use bounding boxes to surround the target image (i.e. convert this into an object detection problem),…
user4779
5
votes
1 answer
Which neural network can count the number of objects in an image?
I'm looking for a neural network architecture that excels in counting objects. For example, a CNN that can output the number of balls (or any other object) in a given image.
I already found articles about crowd counting. I'm looking for articles about…
ron653
4
votes
1 answer
What loss function should one use for object detection, knowing that the input image contains exactly one target object?
What loss function should one use, knowing that the input image contains exactly one target object?
I am currently using MSE to predict the center of ROI coordinates and its width and height. All values are relative to image size. I think that such…
don_pablito
4
votes
1 answer
Are vision transformers scale invariant like CNNs?
I was trying to implement a vision transformer (RT-DETR) for object detection. I trained the model on 640x640 px images and tested it on a 2000x2000 px image containing many objects - the outputs did not make sense.
From CNN-based object detection…
Lockhart
4
votes
1 answer
How to add negative samples for object detection?
My question is: how can I add negative samples to the training dataset to suppress those samples that are wrongly recognized as the object?
For example, suppose I want to train a car detector. All my training images are outdoor images with at least one car.…
fnhdx