# MILDMS: Multiple Instance Learning via DD Constraint and Multiple Part Similarity


## Abstract


## 1. Introduction

- We make the first attempt to recast MIL as a multiple part similarity problem and analyze the relationship between the two.
- Combining the similarity of the most positive and most negative instances with multiple part similarities achieves significant improvements in robust MIL, where noisy labels are provided.
- The single-target-concept and multiple-target-concept problems in MIL are tackled in one framework. We also combine hand-crafted and CNN features in this framework, which gives it a more powerful feature representation.
- Experiments on MIL data sets including MUSK, PASCAL VOC, and COREL show that our proposal can outperform state-of-the-art baselines, including traditional and deep MIL algorithms.

## 2. Related Works

## 3. Algorithm Overview

#### 3.1. The Analysis of MIL and Multiple Part Similarity

#### 3.2. Overview of Our Proposed MILDMS Algorithm

## 4. MILDMS Algorithm

#### 4.1. Review of Diverse Density (DD) Algorithm

#### 4.2. Target Concept Locating

#### 4.3. De-Noising Mislabeled Instances

#### 4.4. Multiple Part Similarity Kernel Construction

#### 4.5. MIL Kernel Construction

Here, for two bags ${B}_{i}$ and ${B}_{j}$, ${C}_{i}$ and ${C}_{j}$ are the two sets that satisfy ${B}_{i}={C}_{i}\cup \left\{{x}_{i{h}_{i}^{+}},{x}_{i{h}_{i}^{-}}\right\}$ and ${B}_{j}={C}_{j}\cup \left\{{x}_{j{h}_{j}^{+}},{x}_{j{h}_{j}^{-}}\right\}$. The kernel functions ${k}_{p}({x}_{i{h}_{i}^{+}},{x}_{j{h}_{j}^{+}})$ and ${k}_{n}({x}_{i{h}_{i}^{-}},{x}_{j{h}_{j}^{-}})$ compute the similarity of the most positive and the most negative instances, respectively, and ${k}_{o}({C}_{i},{C}_{j})$ is the predefined multiple part similarity kernel.
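To make the roles of ${k}_{p}$, ${k}_{n}$, and ${k}_{o}$ concrete, the following is a minimal sketch of one way the three terms could be combined into a bag-level kernel. Several things here are assumptions rather than the paper's definitions: the weighted-sum combination with weights `d`, the Gaussian RBF used for ${k}_{p}$ and ${k}_{n}$, the reading of ${C}_{i}$ as the bag without its two selected instances, and above all `set_kernel`, which is a simple mean-pairwise-similarity stand-in for the optimal-flow multiple part similarity kernel of Section 4.4. All function and parameter names are hypothetical.

```python
import numpy as np


def rbf(x, y, gamma=1.0):
    """Gaussian RBF kernel between two instance vectors."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))


def set_kernel(C_i, C_j, gamma=1.0):
    """Stand-in for k_o: mean pairwise RBF similarity between two instance sets.

    This is NOT the optimal-flow multiple part similarity kernel; it only
    lets the combination below run end to end.
    """
    C_i = np.atleast_2d(np.asarray(C_i, dtype=float))
    C_j = np.atleast_2d(np.asarray(C_j, dtype=float))
    # Squared distances between every instance of C_i and every instance of C_j.
    sq = ((C_i[:, None, :] - C_j[None, :, :]) ** 2).sum(axis=2)
    return float(np.exp(-gamma * sq).mean())


def mil_kernel(bag_i, bag_j, pos_i, neg_i, pos_j, neg_j,
               d=(1 / 3, 1 / 3, 1 / 3), gamma=1.0):
    """Assumed combination: d[0]*k_p + d[1]*k_n + d[2]*k_o."""
    bag_i = np.asarray(bag_i, dtype=float)
    bag_j = np.asarray(bag_j, dtype=float)
    # C_i, C_j: each bag with its most positive / most negative instance removed
    # (assumes the selected instances are not also members of C).
    C_i = np.delete(bag_i, [pos_i, neg_i], axis=0)
    C_j = np.delete(bag_j, [pos_j, neg_j], axis=0)
    k_p = rbf(bag_i[pos_i], bag_j[pos_j], gamma)
    k_n = rbf(bag_i[neg_i], bag_j[neg_j], gamma)
    k_o = set_kernel(C_i, C_j, gamma)
    return d[0] * k_p + d[1] * k_n + d[2] * k_o
```

The sketch expects each bag to contain at least three instances, so that ${C}_{i}$ is non-empty. With non-negative weights the result is a non-negative combination of kernels, which is what makes it usable inside a kernel classifier.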

#### 4.6. Feature Extraction

#### 4.7. Algorithm View

**Algorithm 1.** Learning MIL Classifier

- Repeat iterations
  - For $k^{+} = 1, \dots, M$
    - For $k^{-} = 1, \dots, M$
      - Step 1: Solve the optimization (Equation (10)) to obtain the potential target concepts $P$, $Q$ and the indicator matrices $\delta$, $\xi$.
      - Step 2: Use C3 in Equation (13) to select the most positive and negative target concepts ${\left\{{t}_{i}\right\}}_{i=1}^{2}$.
      - Step 3: Denoise the mislabeled instances by applying Equation (12).
      - Step 4: Compute the DD function using Equation (12) to check the constraint C4 in Equation (8).
    - End
  - End
- Until the constraint C4 is met
- For each training bag ${B}_{i}$ in set $B$
  - Step 1: Choose the most positive and negative instances ${x}_{{ih}^{+}}$, ${x}_{{ih}^{-}}$ using Equation (13).
  - Step 2: Optimize Equation (14) to obtain the optimal flow and construct the multiple part kernel using Equation (15).
- End
- Optimize Equation (18) to obtain the decision variables ${\alpha}^{\ast}$, ${d}^{\ast}$, and ${b}^{\ast}$.
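Step 4 of Algorithm 1 evaluates the DD function. As a point of reference, the sketch below implements the classical noisy-or Diverse Density of a candidate concept $t$, in which a positive bag supports $t$ if at least one of its instances is close to $t$ and a negative bag supports $t$ only if all of its instances are far from it. It is the textbook formulation, not the constrained variant used by MILDMS (Equations (8) to (12) are not reproduced here); the function names and the `scale` parameter are illustrative assumptions.

```python
import numpy as np


def instance_prob(bag, t, scale=1.0):
    """exp(-scale * ||x - t||^2) for every instance x in the bag."""
    bag = np.atleast_2d(np.asarray(bag, dtype=float))
    sq = ((bag - np.asarray(t, dtype=float)) ** 2).sum(axis=1)
    return np.exp(-scale * sq)


def diverse_density(t, positive_bags, negative_bags, scale=1.0):
    """Classical noisy-or Diverse Density of a candidate concept t."""
    dd = 1.0
    for bag in positive_bags:
        # Noisy-or: probability that at least one instance matches t.
        dd *= 1.0 - np.prod(1.0 - instance_prob(bag, t, scale))
    for bag in negative_bags:
        # Probability that no instance of a negative bag matches t.
        dd *= np.prod(1.0 - instance_prob(bag, t, scale))
    return float(dd)
```

A concept shared by all positive bags and absent from every negative bag scores close to 1, while a concept that coincides with any negative instance scores exactly 0. In practice the product is usually maximized in log space to avoid underflow when there are many bags.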

**Algorithm 2.** Classifying MIL Bag

- For each unknown bag ${B}_{i}$ in set $U$
  - Step 1: Choose the most positive and negative instances ${x}_{{ih}^{+}}$, ${x}_{{ih}^{-}}$ using Equation (13).
  - Step 2: Optimize Equation (14) to obtain the optimal flow and construct the multiple part kernel using Equation (15).
  - Step 3: Use Equation (19) to predict the label of bag ${B}_{i}$.
- End
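The first and last steps of Algorithm 2 can be illustrated with a short sketch. It assumes that the most positive and most negative instances are the ones nearest to the positive and negative target concepts (a nearest-neighbour reading; Equation (13) is not reproduced here), and that the prediction rule has the usual kernel-SVM form $\mathrm{sign}\left(\sum_{i}\alpha_{i}y_{i}K({B}_{i},B)+b\right)$, which may differ in detail from Equation (19). The names `select_extreme_instances` and `predict_bag` are hypothetical.

```python
import numpy as np


def select_extreme_instances(bag, t_pos, t_neg):
    """Indices of the instances nearest to the positive and negative concepts."""
    bag = np.atleast_2d(np.asarray(bag, dtype=float))
    d_pos = ((bag - np.asarray(t_pos, dtype=float)) ** 2).sum(axis=1)
    d_neg = ((bag - np.asarray(t_neg, dtype=float)) ** 2).sum(axis=1)
    return int(np.argmin(d_pos)), int(np.argmin(d_neg))


def predict_bag(kernel_to_train, alpha, y_train, b):
    """Kernel-SVM style decision: sign(sum_i alpha_i * y_i * K(B_i, B) + b).

    kernel_to_train[i] holds the kernel value between training bag i and
    the unknown bag.
    """
    weights = np.asarray(alpha, dtype=float) * np.asarray(y_train, dtype=float)
    score = float(np.dot(weights, np.asarray(kernel_to_train, dtype=float)) + b)
    return 1 if score >= 0 else -1
```

The kernel values passed to `predict_bag` would come from the bag-level kernel of Section 4.5, evaluated between the unknown bag and each training bag.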

## 5. Experiments

#### 5.1. Standard MIL Dataset

#### 5.2. Object Detection

#### 5.3. Image Classification Application

#### 5.4. Sensitivity to Labeling Noise

## 6. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

**Table 1.** Multiple instance learning (MIL) data sets, with the respective number of positive and negative bags, features, and instances.

Data Set | Positive Bags | Negative Bags | Features | Instances |
---|---|---|---|---|
MUSK1 | 47 | 45 | 166 | 476 |
MUSK2 | 39 | 63 | 166 | 6598 |
Elephant | 100 | 100 | 230 | 1391 |
Fox | 100 | 100 | 230 | 1220 |
Tiger | 100 | 100 | 230 | 1320 |

**Table 2.** Results (in %) of the proposed MILDMS and of MIL algorithms available in the literature, on both the MUSK and the image data sets. A slash (/) marks a data set for which no result is reported.

Algorithm/Data Set | MUSK1 | MUSK2 | Elephant | Tiger | Fox | Average |
---|---|---|---|---|---|---|
MILDMS | 91.9 | 92.2 | 84.40 | 87.10 | 66.80 | 84.48 |
APR [5] | 92.4 | 89.2 | / | / | / | 90.80 |
SMILE [20] | 91.3 | 91.6 | 84.30 | 86.50 | 65.90 | 83.92 |
MILES [12] | 87.7 | 86.3 | 84.10 | 80.70 | 63.00 | 80.54 |
DD [6] | 84.8 | 84.9 | 84.80 | 81.50 | 64.30 | 81.52 |
DD-SVM [11] | 85.8 | 91.3 | 82.90 | 75.80 | 55.00 | 77.76 |
Attention [25] | 89.1 | 84.1 | 83.20 | 73.40 | 49.10 | 77.28 |
Gated-Attention [25] | 89.8 | 86.2 | 81.45 | 72.20 | 59.40 | 78.23 |
MI-Net with DS [24] | 89.8 | 85.1 | 78.30 | 72.10 | 56.10 | 75.24 |

**Table 3.** Results (in %) for COREL 2000: average accuracy, with the 95% confidence interval in brackets.

