- The document addresses the evaluation of "why-not" queries over scientific workflow provenance. A why-not query asks why a given data item was not returned by a workflow execution.
- It proposes a solution for evaluating why-not queries in workflows whose black-box modules do not preserve attribute information from their inputs. The solution traverses the workflow's modules from sink to source to identify the "picky" modules responsible for a data item not appearing in the results.
- To identify picky modules, the approach harvests information from the web: it searches for traces of scientific module invocations, uses them to find valid candidate inputs, and determines whether a module accepts those inputs or is likely picky. It conducts an experiment using real workflows to test the effectiveness of
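
The sink-to-source search summarized above can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the workflow representation (`upstream`), the `accepts` predicate, the module names, and the item identifier are all hypothetical, and treating "no harvested trace shows the module accepting the item" as "likely picky" is a simplifying assumption.

```python
from collections import deque


def find_picky_modules(upstream, sink, accepts, candidate):
    """Walk the workflow from the sink toward its sources and collect
    every module that is likely picky for `candidate`.

    upstream : dict mapping a module name to the modules feeding it
               (hypothetical representation of the workflow graph)
    accepts  : callable (module, candidate) -> bool, assumed to be
               backed by invocation traces harvested elsewhere
    """
    picky = []
    seen = {sink}
    queue = deque([sink])
    while queue:
        module = queue.popleft()
        # Assumption: a module with no evidence of accepting the
        # candidate is flagged as likely picky.
        if not accepts(module, candidate):
            picky.append(module)
        for parent in upstream.get(module, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return picky


# Hypothetical three-module workflow: fetch -> align -> render (sink).
upstream = {"render": ["align"], "align": ["fetch"], "fetch": []}

# Hypothetical harvested traces: inputs each module was seen to accept.
traces = {"render": {"item-42"}, "align": set(), "fetch": {"item-42"}}


def accepts_from_traces(module, candidate):
    return candidate in traces.get(module, set())


result = find_picky_modules(upstream, "render", accepts_from_traces, "item-42")
print(result)  # ['align']
```

A breadth-first walk is used here only because it visits modules closest to the sink first; any traversal order that follows the edges backwards would serve the same illustrative purpose.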