A generic tool to train/finetune a DETR model, provided as the [1. finetune_detr.ipynb](./1.%20finetune_detr.ipynb) notebook. This notebook is practically the same as the one developed by the [AIVC lab](https://github.com/aivclab/detr.git).
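
A finetuning notebook of this kind typically swaps DETR's classification head for one sized to the custom dataset and reuses the remaining COCO-pretrained weights. Below is a minimal sketch of that idea, assuming the `detr_resnet50` torch.hub entry point and the checkpoint URL from the upstream DETR model zoo; the `num_classes` value is a made-up label count and this is not the notebook's exact code.

```python
import torch

# Build DETR with a classification head sized for a hypothetical custom
# dataset (num_classes=6 is illustrative only). pretrained=False because the
# COCO head (91 classes) would not match this head's shape.
model = torch.hub.load('facebookresearch/detr', 'detr_resnet50',
                       pretrained=False, num_classes=6)

# COCO-pretrained weights from the upstream model zoo.
checkpoint = torch.hub.load_state_dict_from_url(
    'https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth',
    map_location='cpu', check_hash=True)
state_dict = checkpoint['model']

# Drop the classification head, whose shape depends on the number of classes,
# then load everything else before training on the custom data.
for key in ('class_embed.weight', 'class_embed.bias'):
    state_dict.pop(key, None)
model.load_state_dict(state_dict, strict=False)
```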
# Inference / Testing
...

Three key scripts for inference:
...
# Evaluation
1. A generic tool to analyse results by comparing ground truth against predictions in terms of Precision, Recall and F1-score. This is done in the [2. Results_analysis](./2.%20Results_analysis.ipynb) notebook.
2. A generic tool to analyse results at multiple confidence thresholds to identify the most suitable compromise between Precision and Recall. This is done in the [3. Results_analysis_multiple_conf_thresh](./3.%20Results_analysis_multiple_conf_thresh.ipynb) notebook (a sketch of the underlying metric computation follows this list).
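
Both notebooks ultimately reduce to counting matched and unmatched boxes at a chosen confidence threshold and turning those counts into Precision, Recall and F1. A self-contained sketch of that computation and of a threshold sweep is below; the detection records and ground-truth count are made up for illustration, and the notebooks' own IoU matching is more involved.

```python
def precision_recall_f1(tp, fp, fn):
    """Detection metrics from true-positive / false-positive / false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical per-detection records: (confidence score, matched a ground-truth box?).
detections = [(0.92, True), (0.85, True), (0.60, False), (0.40, True), (0.30, False)]
num_gt = 4  # hypothetical number of ground-truth boxes in the evaluation set

# Sweep the confidence threshold: keep only detections above it, recount
# TP/FP/FN, and pick the threshold with the best Precision/Recall trade-off.
for thresh in (0.3, 0.5, 0.7, 0.9):
    kept = [matched for score, matched in detections if score >= thresh]
    tp = sum(kept)
    fp = len(kept) - tp
    fn = num_gt - tp
    p, r, f1 = precision_recall_f1(tp, fp, fn)
    print(f"conf >= {thresh:.1f}: P={p:.2f}  R={r:.2f}  F1={f1:.2f}")
```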
$`\textcolor{red}{\text{Note: From here on, the documentation relates to the original DETR README, and has valuable general information about the package.}}`$
...

The main script, main.py, has a variety of parameters available that can be changed ...
PyTorch training code and pretrained models for **DETR** (**DE**tection **TR**ansformer).
We replace the full complex hand-crafted object detection pipeline with a Transformer, and match Faster R-CNN with a ResNet-50, obtaining **42 AP** on COCO using half the computation power (FLOPs) and the same number of parameters. Inference in 50 lines of PyTorch.
![DETR](./example_images/DETR.png)
**What it is**. Unlike traditional computer vision techniques, DETR approaches object detection as a direct set prediction problem. It consists of a set-based global loss, which forces unique predictions via bipartite matching, and a Transformer encoder-decoder architecture.
Given a fixed small set of learned object queries, DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions in parallel. Due to this parallel nature, DETR is very fast and efficient.
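
As a quick illustration of the "inference in 50 lines" point, here is an even shorter, hedged sketch that loads the COCO-pretrained model through the upstream torch.hub entry point; the image path is a placeholder and the 0.7 confidence cut-off is arbitrary.

```python
import torch
import torchvision.transforms as T
from PIL import Image

# COCO-pretrained DETR via the upstream torch.hub entry point.
model = torch.hub.load('facebookresearch/detr', 'detr_resnet50', pretrained=True)
model.eval()

# Resize + ImageNet normalisation, as used by the DETR demo.
transform = T.Compose([
    T.Resize(800),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

img = Image.open('example_images/some_image.jpg').convert('RGB')  # placeholder path
with torch.no_grad():
    outputs = model(transform(img).unsqueeze(0))

# Class probabilities, dropping the trailing "no object" class, and the
# predicted boxes in normalised (cx, cy, w, h) format.
probs = outputs['pred_logits'].softmax(-1)[0, :, :-1]
keep = probs.max(-1).values > 0.7  # arbitrary confidence cut-off
print(probs[keep].argmax(-1), outputs['pred_boxes'][0, keep])
```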