Instantiate one of these objects by giving it the name of the .cfg and .weights file, then call predict() as often as necessary to determine what the images contain.
Classes

Declaration | Description |
---|---|
struct PredictionResult | Structure used to store interesting information on predictions. |
Public Types

Declaration | Description |
---|---|
enum ESort { ESort::kUnsorted, ESort::kAscending, ESort::kDescending } |  |
typedef std::map<std::string, std::string> MStr | Map of strings where both the key and the value are std::string. |
typedef std::vector<std::string> VStr | Vector of text strings. Typically used to store the class names. |
typedef std::vector<cv::Scalar> VColours | Vector of colours to use by annotate(). |
typedef std::map<int, float> MClassProbabilities | Map of a class ID to a probability that this object belongs to that class. |
typedef std::vector<PredictionResult> PredictionResults | A vector of predictions for the image analyzed by predict(). |
Public Member Functions

Declaration | Description |
---|---|
virtual ~DarkHelp() | Destructor. This automatically calls reset() to release memory allocated by the neural network. |
DarkHelp() | Constructor. When using this constructor, the neural network remains uninitialized until init() is called. |
DarkHelp(const std::string &cfg_filename, const std::string &weights_filename, const std::string &names_filename="", const bool verify_files_first=true) | Constructor. |
virtual std::string version() const | Get a version string for the DarkHelp library. E.g., could be 1.0.0-123. |
virtual DarkHelp & init(const std::string &cfg_filename, const std::string &weights_filename, const std::string &names_filename="", const bool verify_files_first=true) | Initialize ("load") the darknet neural network. |
virtual void reset() | The opposite of init(). This is automatically called by the destructor. |
virtual PredictionResults predict(const std::string &image_filename, const float new_threshold=-1.0f) | Use the neural network to predict what is contained in this image. |
virtual PredictionResults predict(cv::Mat mat, const float new_threshold=-1.0f) | Use the neural network to predict what is contained in this image. |
virtual PredictionResults predict_tile(cv::Mat mat, const float new_threshold=-1.0f) | Similar to predict(), but automatically breaks the image down into individual tiles if it is significantly larger than the network dimensions. |
virtual cv::Mat annotate(const float new_threshold=-1.0f) | Takes the most recent prediction_results, and applies them to the most recent original_image. |
virtual std::string duration_string() | Return the duration as a text string which can then be added to the image during annotation. |
virtual cv::Size network_size() | Determine the size of the network. For example, 416x416, or 608x480. |
Static Public Member Functions

Declaration | Description |
---|---|
static VColours get_default_annotation_colours() | Obtain a vector of at least 25 different bright colours that may be used to annotate images. |
static MStr verify_cfg_and_weights(std::string &cfg_filename, std::string &weights_filename, std::string &names_filename) | Look at the names and/or the contents of all 3 files and swap the filenames around if necessary so the .cfg, .weights, and .names files are assigned where they should be. |
static size_t edit_cfg_file(const std::string &cfg_filename, MStr m) | Used to insert lines into the [net] section of the configuration file. |
Public Attributes

Declaration | Description |
---|---|
void * net | The Darknet network, but stored as a void* pointer so we don't have to include darknet.h. |
VStr names | A vector of names corresponding to the identified classes. |
std::chrono::high_resolution_clock::duration duration | The length of time it took to initially load the network and weights (after the DarkHelp object has been constructed), or the length of time predict() took to run on the last image to be processed. |
float threshold | Image prediction threshold. |
float hierarchy_threshold | Used during prediction. |
float non_maximal_suppression_threshold | Non-Maximal Suppression (NMS) threshold suppresses overlapping bounding boxes and only retains the bounding box that has the maximum probability of object detection associated with it. |
PredictionResults prediction_results | A copy of the most recent results after applying the neural network to an image. This is set by predict(). |
bool names_include_percentage | Determines if the name given to each prediction includes the percentage. |
bool annotation_auto_hide_labels | Hide the label if the size of the text exceeds the size of the prediction. |
float annotation_shade_predictions | Determines the amount of "shade" used when drawing the prediction rectangles. |
bool include_all_names | Determine if multiple class names are included when labelling an item. |
VColours annotation_colours | The colours to use in annotate(). |
cv::HersheyFonts annotation_font_face | Font face to use in annotate(). Defaults to cv::HersheyFonts::FONT_HERSHEY_SIMPLEX. |
double annotation_font_scale | Scaling factor used for the font in annotate(). Defaults to 0.5. |
int annotation_font_thickness | Thickness of the font in annotate(). Defaults to 1. |
int annotation_line_thickness | Thickness of the lines to draw in annotate(). Defaults to 2. |
bool annotation_include_duration | If set to true then annotate() will call duration_string() and display on the top-left of the image the length of time predict() took to process the image. |
bool annotation_include_timestamp | If set to true then annotate() will display a timestamp on the bottom-left corner of the image. |
bool fix_out_of_bound_values | Darknet sometimes will return values that are out-of-bound, especially when working with low thresholds. |
cv::Mat original_image | The most recent image handled by predict(). |
cv::Mat annotated_image | The most recent output produced by annotate(). |
ESort sort_predictions | Determines if the predictions will be sorted the next time predict() is called. |
bool enable_debug | This enables some non-specific debug functionality within the DarkHelp library. |
bool enable_tiles | Determines if calls to predict() are sent directly to Darknet, or processed first by predict_tile() to break the image file into smaller sections. |
size_t horizontal_tiles | The number of horizontal tiles the image was split into by predict_tile() prior to calling predict(). |
size_t vertical_tiles | The number of vertical tiles the image was split into by predict_tile() prior to calling predict(). |
cv::Size tile_size | The size that was used for each individual tile by predict_tile(). |
bool modify_batch_and_subdivisions | When training, the "batch=..." and "subdivisions=..." values in the .cfg file are typically set to a large value. |
std::set<int> annotation_suppress_classes | Determines which classes to suppress during the call to annotate(). |
bool combine_tile_predictions | When tiling is enabled, objects may span multiple tiles. |
float tile_edge_factor | This value controls how close to the edge of a tile an object must be to be considered for re-combining when both tiling and recombining have been enabled. |
float tile_rect_factor | This value controls how closely the rectangles need to line up on two tiles before the predictions are combined. |
Protected Member Functions

Declaration | Description |
---|---|
PredictionResults predict_internal(cv::Mat mat, const float new_threshold=-1.0f) | Used by all the other predict() calls to do the actual network prediction. |
DarkHelp & name_prediction(PredictionResult &pred) | Give a consistent name to the given prediction result. |
Detailed Description

Instantiate one of these objects by giving it the name of the .cfg and .weights file, then call predict() as often as necessary to determine what the images contain.
For example:
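Below is a minimal sketch of that workflow. The header name and every filename are placeholders rather than values taken from this page; substitute your own network files.

```cpp
// Minimal usage sketch; "DarkHelp.hpp" and all filenames are assumed placeholders.
#include <DarkHelp.hpp>
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // load the neural network from the .cfg, .weights, and .names files
    DarkHelp darkhelp("mynetwork.cfg", "mynetwork_best.weights", "mynetwork.names");

    // analyze an image; this returns a vector of PredictionResult objects
    const auto results = darkhelp.predict("test_image_01.jpg");
    std::cout << "found " << results.size() << " objects" << std::endl;

    // draw the predictions onto the original image and save the result to disk
    cv::Mat annotated = darkhelp.annotate();
    cv::imwrite("test_image_01_annotated.png", annotated);

    return 0;
}
```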
Instead of calling annotate(), you can get the detection results and iterate through them:
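A sketch along these lines, using the same placeholder filenames as above; only best_probability is referenced since it is the one PredictionResult field named on this page:

```cpp
#include <DarkHelp.hpp>
#include <iostream>

int main()
{
    DarkHelp darkhelp("mynetwork.cfg", "mynetwork_best.weights", "mynetwork.names");
    const auto results = darkhelp.predict("test_image_01.jpg");

    for (const auto & prediction : results)
    {
        // each PredictionResult describes one object found in the image
        std::cout << "object found with a best probability of "
                  << prediction.best_probability * 100.0f << "%" << std::endl;
    }

    return 0;
}
```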
Instead of writing your own loop, you can also use the std::ostream operator<<() like this:
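For example, continuing from the previous sketch (darkhelp is the object constructed above):

```cpp
// stream the entire vector of results in one statement instead of looping manually
const auto results = darkhelp.predict("test_image_01.jpg");
std::cout << results << std::endl;
```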
typedef std::map<std::string, std::string> DarkHelp::MStr

Map of strings where both the key and the value are std::string.

typedef std::vector<std::string> DarkHelp::VStr

Vector of text strings. Typically used to store the class names.

typedef std::vector<cv::Scalar> DarkHelp::VColours

Vector of colours to use by annotate().

typedef std::map<int, float> DarkHelp::MClassProbabilities

Map of a class ID to a probability that this object belongs to that class.

The key is the zero-based index of the class, while the value is the probability that the object belongs to that class.

typedef std::vector<PredictionResult> DarkHelp::PredictionResults

A vector of predictions for the image analyzed by predict().

Each PredictionResult entry in the vector represents a different object in the image.
enum class DarkHelp::ESort

Enumerator | Description |
---|---|
kUnsorted | Do not sort predictions. |
kAscending | Sort predictions using PredictionResult::best_probability in ascending order (low values first, high values last). |
kDescending | Sort predictions using PredictionResult::best_probability in descending order (high values first, low values last). |
virtual DarkHelp::~DarkHelp()
Destructor. This automatically calls reset() to release memory allocated by the neural network.
DarkHelp::DarkHelp()
Constructor. When using this constructor, the neural network remains uninitialized until init() is called.
DarkHelp::DarkHelp(const std::string & cfg_filename, const std::string & weights_filename, const std::string & names_filename = "", const bool verify_files_first = true)
Constructor.
This constructor automatically calls init().
The order in which the filenames are given does not matter as long as verify_files_first is set to true (the default value). This is because init() will call verify_cfg_and_weights() to correctly determine which is the .cfg, .weights, and .names file, and swap the names around as necessary so Darknet is given the correct filenames.
virtual std::string DarkHelp::version() const

Get a version string for the DarkHelp library. E.g., could be 1.0.0-123.
virtual DarkHelp & DarkHelp::init(const std::string & cfg_filename, const std::string & weights_filename, const std::string & names_filename = "", const bool verify_files_first = true)

Initialize ("load") the darknet neural network.

If verify_files_first has been enabled (the default) then this method will also call the static method verify_cfg_and_weights() to perform some last-minute validation prior to darknet loading the neural network.

Throws std::runtime_error if the call to darknet's load_network_custom() has failed.
virtual void DarkHelp::reset()
The opposite of init(). This is automatically called by the destructor.
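A sketch of pairing init() and reset() with the default constructor; the filenames are placeholders, and re-initializing with a second network after reset() is an assumption based on the descriptions above:

```cpp
DarkHelp darkhelp;                  // default constructor: nothing is loaded yet
darkhelp.init("cars.cfg", "cars_best.weights", "cars.names");
// ... call predict() as often as necessary ...
darkhelp.reset();                   // release the memory allocated by the neural network
darkhelp.init("animals.cfg", "animals_best.weights", "animals.names");
```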
virtual PredictionResults DarkHelp::predict(const std::string & image_filename, const float new_threshold = -1.0f)
Use the neural network to predict what is contained in this image.
This results in a call to either predict_internal() or predict_tile() depending on how enable_tiles has been set.
Parameter | Description |
---|---|
image_filename [in] | The name of the image file to load from disk and analyze. The member original_image will be set to this image. If the image is larger or smaller than the dimensions of the neural network, then Darknet will stretch the image to match the exact size of the neural network. Stretching the image does not maintain the aspect ratio. |
new_threshold [in] | Which threshold to use. If less than zero, the previous threshold will be applied. If >= 0, then threshold will be set to this new value. The threshold must be either -1, or a value between 0.0 and 1.0 meaning 0% to 100%. |

Throws std::invalid_argument if the image failed to load.
virtual PredictionResults DarkHelp::predict(cv::Mat mat, const float new_threshold = -1.0f)
Use the neural network to predict what is contained in this image.
This results in a call to either predict_internal() or predict_tile() depending on how enable_tiles has been set.
Parameter | Description |
---|---|
mat [in] | An OpenCV image (cv::Mat) which has already been loaded and which needs to be analyzed. The member original_image will be set to this image. If the image is larger or smaller than the dimensions of the neural network, then Darknet will stretch the image to match the exact size of the neural network. Stretching the image does not maintain the aspect ratio. |
new_threshold [in] | Which threshold to use. If less than zero, the previous threshold will be applied. If >= 0, then threshold will be set to this new value. The threshold must be either -1, or a value between 0.0 and 1.0 meaning 0% to 100%. |

Throws std::invalid_argument if the image is empty.
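For example, a sketch using an image already loaded with OpenCV; the filename is a placeholder and darkhelp is assumed to have been constructed as shown earlier:

```cpp
cv::Mat mat = cv::imread("test_image_01.jpg");
if (mat.empty() == false)
{
    const auto results = darkhelp.predict(mat);
    std::cout << results << std::endl;
}
```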
virtual PredictionResults DarkHelp::predict_tile(cv::Mat mat, const float new_threshold = -1.0f)
Similar to predict(), but automatically breaks the image down into individual tiles if it is significantly larger than the network dimensions.

This is explained in detail in Image Tiling.

Note that predict() will only hand an image over to predict_tile() when enable_tiles has been enabled; tiling does not happen when enable_tiles is false (which is the default).

Conceptually, a large image might be broken into 4 tiles which Darknet processes individually. It is important to understand that neither the individual image tiles nor their results are returned to the caller. DarkHelp only returns the final results once each tile has been processed and the vectors have been merged together.

Throws std::invalid_argument if the image is empty.
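A sketch of calling predict_tile() directly (this bypasses the enable_tiles check); the filename is a placeholder and darkhelp is assumed to exist:

```cpp
cv::Mat large_image = cv::imread("very_large_image.jpg");
const auto results = darkhelp.predict_tile(large_image);

// both predict() and predict_tile() record how the image was tiled
std::cout << "tiles used: " << darkhelp.horizontal_tiles << "x"
          << darkhelp.vertical_tiles << std::endl;
```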
virtual cv::Mat DarkHelp::annotate(const float new_threshold = -1.0f)

Takes the most recent prediction_results, and applies them to the most recent original_image.

The output annotated image is stored in annotated_image as well as returned to the caller.

Parameter | Description |
---|---|
new_threshold [in] | Which threshold to use. If less than zero, the previous threshold will be applied. If >= 0, then threshold will be set to this new value. |
Turning down the threshold in annotate() won't bring back predictions that were excluded due to a higher threshold originally used with predict(). Here is an example:
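A sketch of the situation described here; the filename is a placeholder:

```cpp
darkhelp.predict("image.jpg", 0.75f);   // the threshold is set to 75% for prediction
darkhelp.annotate(0.25f);               // the threshold is lowered to 25%, but only for annotation
```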
In the previous example, when annotate() is called with the lower threshold of 25%, the predictions have already been limited by the 75% threshold given to predict(). This means any prediction >= 25% and < 75% was excluded from the prediction results. The only way to get those predictions is to re-run predict() with a threshold of 0.25.
Throws std::logic_error if an attempt is made to annotate an empty image.
virtual std::string DarkHelp::duration_string()

Return the duration as a text string which can then be added to the image during annotation.

For example, this might return "912 microseconds" or "375 milliseconds".
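For example, a brief sketch (darkhelp is assumed to exist and the filename is a placeholder):

```cpp
darkhelp.predict("test_image_01.jpg");
std::cout << "prediction took " << darkhelp.duration_string() << std::endl;
```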
virtual cv::Size DarkHelp::network_size()

Determine the size of the network. For example, 416x416, or 608x480.

Throws std::logic_error if the neural network has not yet been initialized.
static VColours DarkHelp::get_default_annotation_colours()

Obtain a vector of at least 25 different bright colours that may be used to annotate images.

OpenCV uses BGR, not RGB. For example:

- {0, 0, 255} is pure red
- {255, 0, 0} is pure blue

The colours returned by this function are intended to be used by OpenCV, and thus are in BGR format.

Default colours returned by this method are:
Index | RGB Hex | Name |
---|---|---|
0 | FF355E | Radical Red |
1 | 299617 | Slimy Green |
2 | FFCC33 | Sunglow |
3 | AF6E4D | Brown Sugar |
4 | FF00FF | Pure magenta |
5 | 50BFE6 | Blizzard Blue |
6 | CCFF00 | Electric Lime |
7 | 00FFFF | Pure cyan |
8 | 8D4E85 | Razzmic Berry |
9 | FF48CC | Purple Pizzazz |
10 | 00FF00 | Pure green |
11 | FFFF00 | Pure yellow |
12 | 5DADEC | Blue Jeans |
13 | FF6EFF | Shocking Pink |
14 | AAF0D1 | Magic Mint |
15 | FFC000 | Orange |
16 | 9C51B6 | Purple Plum |
17 | FF9933 | Neon Carrot |
18 | 66FF66 | Screamin' Green |
19 | FF0000 | Pure red |
20 | 4B0082 | Indigo |
21 | FF6037 | Outrageous Orange |
22 | FFFF66 | Laser Lemon |
23 | FD5B78 | Wild Watermelon |
24 | 0000FF | Pure blue |
static MStr DarkHelp::verify_cfg_and_weights(std::string & cfg_filename, std::string & weights_filename, std::string & names_filename)

Look at the names and/or the contents of all 3 files and swap the filenames around if necessary so the .cfg, .weights, and .names are assigned where they should be.

This is necessary because darknet tends to segfault if it is given the wrong filename. (For example, if it mistakenly tries to parse the .weights file as a .cfg file.) This function does a bit of sanity checking, determines which file is which, and also returns a map of debug information related to each file.

On input, it doesn't matter which file goes into which parameter. Simply pass in the filenames in any order.

On output, the .cfg, .weights, and .names will be set correctly. If needed for display purposes, some additional information is also passed back using the MStr string map, but most callers should ignore this.
Exception | Condition |
---|---|
std::invalid_argument | if at least 2 unique filenames have not been provided |
std::runtime_error | if the size of the files cannot be determined (one or more file does not exist?) |
std::invalid_argument | if the cfg file doesn't exist |
std::invalid_argument | if the cfg file doesn't contain "[net]" near the top of the file |
std::invalid_argument | if the configuration file does not have a line that says "classes=..." |
std::invalid_argument | if the weights file doesn't exist |
std::invalid_argument | if weights file has an invalid version number (or weights file is from an extremely old version of darknet?) |
std::runtime_error | if the number of lines in the names file doesn't match the number of classes in the configuration file |
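A sketch of calling this method directly; the filenames are placeholders and are deliberately assigned to the wrong variables to show that the order does not matter:

```cpp
std::string cfg     = "cars_best.weights";   // intentionally mixed up
std::string weights = "cars.names";
std::string names   = "cars.cfg";

const DarkHelp::MStr info = DarkHelp::verify_cfg_and_weights(cfg, weights, names);

// once the call returns, each variable refers to the correct file
std::cout << "cfg file is " << cfg << std::endl;
```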
static size_t DarkHelp::edit_cfg_file(const std::string & cfg_filename, MStr m)
This is used to insert lines into the [net] section of the configuration file.
Pass in a map of key-value pairs, and if the key exists it will be modified. If the key does not exist, then it will be added to the bottom of the [net] section.
For example, this is used by init() when modify_batch_and_subdivisions is enabled.
Exception | Condition |
---|---|
std::invalid_argument | if the cfg file does not exist or cannot be opened |
std::runtime_error | if a valid start and end to the [net] section wasn't found in the .cfg file |
std::runtime_error | if we cannot write a new .cfg file |
std::runtime_error | if we cannot rename the .cfg file |
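For example, a sketch of forcing batch and subdivisions to 1 in a configuration file; the filename is a placeholder, and init() already performs an equivalent edit when modify_batch_and_subdivisions is enabled:

```cpp
DarkHelp::MStr m;
m["batch"]        = "1";
m["subdivisions"] = "1";

// keys that exist in [net] are modified, keys that don't are added to that section
DarkHelp::edit_cfg_file("cars.cfg", m);
```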
PredictionResults DarkHelp::predict_internal(cv::Mat mat, const float new_threshold = -1.0f) [protected]
Used by all the other predict() calls to do the actual network prediction.
This uses the image stored in original_image.
Exception | Condition |
---|---|
std::logic_error | if the network is invalid. |
std::logic_error | if the image is invalid. |
DarkHelp & DarkHelp::name_prediction(PredictionResult & pred) [protected]

Give a consistent name to the given prediction result.
This gets called by both predict_internal() and predict_tile() and is intended for internal use only.
void* DarkHelp::net
The Darknet network, but stored as a void* pointer so we don't have to include darknet.h.
VStr DarkHelp::names
A vector of names corresponding to the identified classes.
This is typically setup in the constructor, but can be manually set afterwards.
std::chrono::high_resolution_clock::duration DarkHelp::duration
The length of time it took to initially load the network and weights (after the DarkHelp object has been constructed), or the length of time predict() took to run on the last image to be processed.
If using predict_tile(), then this will store the sum of all durations across the entire set of tiles.
float DarkHelp::threshold

Image prediction threshold.

Defaults to 0.5.
Quote:
[...] threshold is what is used to determine whether or not there is an object in the predicted bounding box. The network predicts an explicit 'objectness' score separate from the class predictions that if above the threshold indicates that a bounding box will be returned. [source]
float DarkHelp::hierarchy_threshold

Used during prediction.

Defaults to 0.5.
Quote:
[...] the network traverses the tree of candidate detections and multiples through the conditional probabilities for each item, e.g. object * animal * feline * house cat. The hierarchical threshold is used in this second step, completely after and separate from whether there is an item or not, to decide whether following the tree further to a more specific class is the right action to take. When this threshold is 0, the tree will basically follow the highest probability branch all the way to a leaf node. [source]
This threshold was previously named hierchy_threshold; the typo in the name was fixed in December 2019.

float DarkHelp::non_maximal_suppression_threshold
Non-Maximal Suppression (NMS) threshold suppresses overlapping bounding boxes and only retains the bounding box that has the maximum probability of object detection associated with it.
Defaults to 0.45.
Quote:
[...] nms works by looking at all bounding boxes that made it past the 'objectness' threshold and removes the least confident of the boxes that overlap with each other above a certain IOU threshold [source]
(IOU – "intersection over union" – is a ratio that describes how much two areas overlap, where 0.0 means two areas don't overlap at all, and 1.0 means two areas perfectly overlap.)
PredictionResults DarkHelp::prediction_results
A copy of the most recent results after applying the neural network to an image. This is set by predict().
bool DarkHelp::names_include_percentage

Determines if the name given to each prediction includes the percentage.

For example, the name for a prediction might be "dog" when this flag is set to false, or it might be "dog 98%" when set to true. Defaults to true.
Examples: names_include_percentage=true vs. names_include_percentage=false (comparison images).
bool DarkHelp::annotation_auto_hide_labels

Hide the label if the size of the text exceeds the size of the prediction.

This can help "clean up" some images which contain many small objects. Set to false to always display every label. Set to true if DarkHelp should decide whether a label must be shown or hidden. Defaults to true.
Examples: auto_hide_labels=true vs. auto_hide_labels=false (comparison images).
float DarkHelp::annotation_shade_predictions

Determines the amount of "shade" used when drawing the prediction rectangles.

When set to zero, the rectangles are not shaded. When set to 1.0, prediction rectangles are completely filled. Values in between are semi-transparent. For example, the default value of 0.25 means the rectangles are filled at 25% opacity.
Examples: shade_predictions=0.0, 0.25, 0.50, 0.75, and 1.0 (comparison images).
bool DarkHelp::include_all_names

Determine if multiple class names are included when labelling an item.

For example, if an object is 95% car or 80% truck, then the label could say "car, truck" when this is set to true, and simply "car" when set to false. Defaults to true.
VColours DarkHelp::annotation_colours

The colours to use in annotate().

Defaults to get_default_annotation_colours().

Remember that OpenCV uses BGR, not RGB. So pure red is (0, 0, 255).
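A sketch of customizing the colours; remember the cv::Scalar values are BGR, and darkhelp is assumed to have been constructed already:

```cpp
darkhelp.annotation_colours = DarkHelp::get_default_annotation_colours();
darkhelp.annotation_colours[0] = cv::Scalar(0, 0, 255);   // draw class #0 in pure red (BGR)
```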
cv::HersheyFonts DarkHelp::annotation_font_face

Font face to use in annotate(). Defaults to cv::HersheyFonts::FONT_HERSHEY_SIMPLEX.
double DarkHelp::annotation_font_scale

Scaling factor used for the font in annotate(). Defaults to 0.5.
int DarkHelp::annotation_font_thickness

Thickness of the font in annotate(). Defaults to 1.
int DarkHelp::annotation_line_thickness

Thickness of the lines to draw in annotate(). Defaults to 2.
bool DarkHelp::annotation_include_duration

If set to true then annotate() will call duration_string() and display on the top-left of the image the length of time predict() took to process the image.

Defaults to true.
bool DarkHelp::annotation_include_timestamp

If set to true then annotate() will display a timestamp on the bottom-left corner of the image.

Defaults to false.
bool DarkHelp::fix_out_of_bound_values

Darknet sometimes will return values that are out-of-bound, especially when working with low thresholds.

For example, the X or Y coordinates might be less than zero, or the width and height might extend beyond the edges of the image. When fix_out_of_bound_values is set to true (the default) then the results (prediction_results) after calling predict() will be capped so all values are positive and do not extend beyond the edges of the image. When set to false, the exact values as returned by darknet will be used. Defaults to true.
cv::Mat DarkHelp::original_image
The most recent image handled by predict().
cv::Mat DarkHelp::annotated_image
The most recent output produced by annotate().
ESort DarkHelp::sort_predictions
Determines if the predictions will be sorted the next time predict() is called.
When set to ESort::kUnsorted, the predictions are in the exact same order as they were returned by Darknet. When set to ESort::kAscending or ESort::kDescending, the predictions will be sorted according to PredictionResult::best_probability.
If annotations will be drawn on the image for visual consumption, then it is often preferable to have the higher probability predictions drawn last so they appear "on top". Otherwise, lower probability predictions may overwrite or obscure the more important ones. This means using ESort::kAscending (the default).
If you want to process only the first few predictions instead of drawing annotations, then you may want to sort using ESort::kDescending to ensure you handle the most likely predictions first.
Defaults to ESort::kAscending.
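For example, a sketch that processes the most likely prediction first (darkhelp and the filename are assumed from the earlier examples):

```cpp
darkhelp.sort_predictions = DarkHelp::ESort::kDescending;
const auto results = darkhelp.predict("test_image_01.jpg");
if (results.empty() == false)
{
    std::cout << "most likely object: "
              << results[0].best_probability * 100.0f << "%" << std::endl;
}
```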
bool DarkHelp::enable_debug

This enables some non-specific debug functionality within the DarkHelp library.

The exact results of enabling this are undocumented, and may change or be completely removed without prior notice. It is not meant for the end-user, but instead is used by developers debugging DarkHelp and Darknet. Default value is false.
bool DarkHelp::enable_tiles

Determines if calls to predict() are sent directly to Darknet, or processed first by predict_tile() to break the image file into smaller sections.

This flag is only checked when predict() is called. If you call predict_tile() directly, then it bypasses the check for enable_tiles and DarkHelp will assume that the image is a candidate for tiling.

Both predict() and predict_tile() will set the values tile_size, vertical_tiles, and horizontal_tiles once they have finished running. The caller can then reference these to determine what kind of tiling was used. Even when an image is not tiled, these variables will be set; for example, tile_size may be set to 1x1, and the horizontal and vertical sizes will match the neural network dimensions.

The default value for enable_tiles is false, meaning that calling predict() won't automatically result in image tiling.
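A sketch of enabling tiling before calling predict(); the filename is a placeholder:

```cpp
darkhelp.enable_tiles = true;   // allow predict() to hand large images to predict_tile()
const auto results = darkhelp.predict("very_large_image.jpg");

std::cout << "processed as " << darkhelp.horizontal_tiles << "x" << darkhelp.vertical_tiles
          << " tiles of " << darkhelp.tile_size.width << "x" << darkhelp.tile_size.height
          << std::endl;
```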
size_t DarkHelp::horizontal_tiles

The number of horizontal tiles the image was split into by predict_tile() prior to calling predict().

This is set to 1 if calling predict(). It may be > 1 if calling predict_tile() with an image large enough to require multiple tiles.
size_t DarkHelp::vertical_tiles

The number of vertical tiles the image was split into by predict_tile() prior to calling predict().

This is set to 1 if calling predict(). It may be > 1 if calling predict_tile() with an image large enough to require multiple tiles.
cv::Size DarkHelp::tile_size

The size that was used for each individual tile by predict_tile().

This will be the size of the network when calling predict().

For example, if the network is 416x416, and the image used with predict_tile() measures 1280x960, then:

- horizontal_tiles will be set to 3
- vertical_tiles will be set to 2
- tile_size will be set to (427, 480)
bool DarkHelp::modify_batch_and_subdivisions

When training, the "batch=..." and "subdivisions=..." values in the .cfg file are typically set to a large value.

But when loading a neural network for inference, as DarkHelp is designed to help with, both of those values in the .cfg should be set to "1". When modify_batch_and_subdivisions is enabled, DarkHelp will edit the configuration file once DarkHelp::init() is called. This ensures the values are set as needed prior to Darknet loading the .cfg file.

The default value for modify_batch_and_subdivisions is true, meaning the .cfg file will be modified. If set to false, DarkHelp will not modify the configuration file.

Example use:
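(A sketch only; the flag is already true by default and is set explicitly here just to make the behaviour visible. The filenames are placeholders.)

```cpp
DarkHelp darkhelp;                               // default constructor: nothing loaded yet
darkhelp.modify_batch_and_subdivisions = true;   // the default; the .cfg is edited during init()
darkhelp.init("cars.cfg", "cars_best.weights", "cars.names");
```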
std::set<int> DarkHelp::annotation_suppress_classes

Determines which classes to suppress during the call to annotate().

Any prediction returned by Darknet for a class listed in this std::set will be ignored: no bounding box will be drawn, and no label will be shown. The set may be modified at any point and will take effect the next time annotate() is called.

It is initialized by init() to contain any classes where the label name begins with the text "dont_show", as described in https://github.com/AlexeyAB/darknet/issues/2122.
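Classes can also be suppressed directly from C++. A sketch, where class #2 is an arbitrary example ID and darkhelp is assumed to exist:

```cpp
darkhelp.annotation_suppress_classes.insert(2);   // class #2 gets no bounding box and no label
darkhelp.predict("test_image_01.jpg");
cv::Mat output = darkhelp.annotate();
```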
For example, if the .names file is modified so that certain class names begin with "dont_show", those classes will automatically be added to annotation_suppress_classes by init(), and the corresponding objects will no longer appear in the annotated image.

bool DarkHelp::combine_tile_predictions
When tiling is enabled, objects may span multiple tiles.
When this flag is set to true, DarkHelp will attempt to combine predictions that cross two or more tiles into a single prediction. This has no impact when tiling is off, or when the image processed fits within a single tile. Default is true.
For example, consider a portion of a much larger image where a tile boundary (shown as a blue horizontal line in the example images) crosses an object. When combine_tile_predictions=false, the predictions which are split by the tile boundary remain as separate objects. When combine_tile_predictions=true, the predictions split by the tile boundary are re-combined into a single object.
float DarkHelp::tile_edge_factor

This value controls how close to the edge of a tile an object must be to be considered for re-combining when both tiling and recombining have been enabled.

The smaller the value, the closer the object must be to the edge of a tile. The factor is multiplied by the width and height of the detected object.

Possible range to consider would be 0.01 to 0.5. If set to zero, then the detected object must be right on the tile boundary to be considered. The default is 0.25.
float DarkHelp::tile_rect_factor

This value controls how closely the rectangles need to line up on two tiles before the predictions are combined.

This is only used when both tiling and recombining have been enabled. The closer this value is to 1.0, the more perfectly the rectangles on the two tiles must line up before they are combined; with values below 1.0 the match becomes impossible and predictions will never be combined. A possible range to consider might be 1.10 to 1.50 or even higher; this also depends on the nature/shape of the objects detected and how the tiles are split up. For example, if the objects are pear-shaped, where the smaller end is on one tile and the larger end on another tile, you may need to increase this value as the object on the different tiles will be of different sizes. The default is 1.20.
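A sketch of tuning both factors when tiling and recombining are enabled; the two values are arbitrary examples within the ranges suggested above, and darkhelp is assumed to exist:

```cpp
darkhelp.enable_tiles             = true;
darkhelp.combine_tile_predictions = true;
darkhelp.tile_edge_factor = 0.10f;   // objects must be nearer to a tile edge to be considered
darkhelp.tile_rect_factor = 1.30f;   // tolerate a looser match between the rectangles
const auto results = darkhelp.predict("very_large_image.jpg");
```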