Scan: Overlapping slices should be grouped in the scan generated issues #1845

kevinmessiaen · 2024-03-14T07:55:31Z

🚀 Feature Request

If a slice is completely contained in another slice, we should report only the largest one.

🔈 Motivation

It will make the scan report more concise and avoid duplication. Furthermore, checking those sub-slices takes time and memory and doesn't really provide any value.

abhibongale · 2024-06-01T14:03:29Z

Hi @kevinmessiaen,

I noticed this issue and would like to contribute to it. Is it still open and relevant? If so, I would appreciate any guidance or additional information that could help me get started.

Thank you!

kevinmessiaen · 2024-06-04T04:06:13Z

Hello @abhibongale

Yes, the issue is still relevant, and we would appreciate your contribution on this one!

Basically, in the Scanner (giskard.scanner.scanner.py) we run a bunch of evaluators depending on the model type.

For regression and classification models, the detectors use the SliceFinder (giskard.slicing.slice_finder.py) to generate slices that are then tested. Some of those slices might overlap (e.g. a slice for the car sub-category can sit inside the slice for the transportation category). This is fine, since the dataset might have an issue in only one of those categories.

However, there are cases where the whole transportation category has an issue (meaning that the car and other sub-categories would also contain this issue). That's why we want to filter the sub-slices out of the scan report in order to improve it.
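To make the filtering concrete, here is a minimal sketch of the idea, not Giskard code. It assumes each flagged slice can be represented by the set of dataset row indices it selects; the function name `drop_contained_slices` and the slice labels are hypothetical and do not come from the Giskard API. In the real detector the comparison would presumably be restricted to slices flagged for the same issue.

```python
from typing import Dict, FrozenSet, List


def drop_contained_slices(slices: Dict[str, FrozenSet[int]]) -> List[str]:
    """Return the names of slices that are not contained in another kept slice.

    `slices` maps a (hypothetical) slice label to the row indices it selects.
    """
    # Visit the largest slices first, so a container is always seen
    # before any of the sub-slices it contains.
    ordered = sorted(slices, key=lambda name: len(slices[name]), reverse=True)

    kept: List[str] = []
    for name in ordered:
        rows = slices[name]
        # `<=` is the subset test; an exact duplicate is also dropped.
        if any(rows <= slices[other] for other in kept):
            continue
        kept.append(name)
    return kept


# Illustrative data only: row indices selected by each flagged slice.
issue_slices = {
    "category == transportation": frozenset({0, 1, 2, 3, 4, 5}),
    "sub_category == car": frozenset({0, 1, 2}),
    "sub_category == bike": frozenset({3, 4}),
    "price > 100": frozenset({4, 5, 6}),
}

print(drop_contained_slices(issue_slices))
# ['category == transportation', 'price > 100']
```

The car and bike slices are dropped because every row they select is already covered by the transportation slice, while the price slice is kept because it only partially overlaps. Checking against the kept slices alone is enough, because containment is transitive.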

I think you can start by having a look at the PerformanceDetector (giskard.scanner.performance.performance_bias_detector.py)

kevinmessiaen added the "enhancement" and "good first issue" labels Mar 14, 2024

kevinmessiaen assigned abhibongale Jun 4, 2024

kevinmessiaen unassigned abhibongale Aug 30, 2024
