Infoscience

conference paper

The Importance of Parameters in Database Queries

Grohe, Martin
•
Kimelfeld, Benny
•
Lindner, Peter
•
Standke, Christoph
Editors: Cormode, Graham • Shekelyan, Michael
March 1, 2024
Leibniz International Proceedings in Informatics, LIPIcs
27th International Conference on Database Theory

We propose and study a framework for quantifying the importance of the choices of parameter values to the result of a query over a database. These parameters occur as constants in logical queries, such as conjunctive queries. In our framework, the importance of a parameter is its Shap score. This score is a popular instantiation of the game-theoretic Shapley value for measuring the importance of feature values in machine learning models. We make the case for using this score by explaining the intuition behind Shap, and by showing that we arrive at it via two different, apparently opposing, approaches to quantifying the contribution of a parameter. Applying the Shap score requires two components in addition to the query and the database: (a) a probability distribution over the combinations of parameter values, and (b) a utility function that measures the similarity between the result for the original parameters and the result for hypothetical parameters. The main question addressed in the paper is the complexity of calculating the Shap score for different distributions and similarity measures. We first address the case of probabilistically independent parameters. As one would expect, the problem is hard for a fragment of queries that is itself hard to evaluate, but it remains hard even for the fragment of acyclic conjunctive queries. In some cases, though, one can efficiently list all relevant parameter combinations, and then the Shap score can be computed in polynomial time under reasonably general conditions. Also tractable is the case of full acyclic conjunctive queries for certain (natural) similarity functions. We extend our results to conjunctive queries with inequalities between variables and parameters. Finally, we discuss a simple approximation technique for the case of correlated parameters.
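To make the setting in the abstract concrete, the following is a minimal brute-force sketch in Python, not the authors' algorithm. It assumes the standard Shap formulation: the value of a set of parameters is the expected similarity to the original result when those parameters keep their original values and the remaining ones are drawn from the distribution, and the score is the Shapley value of that game. The toy relation `EMPLOYEES`, the query, the Jaccard similarity, and the distributions are all hypothetical choices made for illustration, and the parameters are assumed to be independent.

```python
from itertools import combinations, product
from math import factorial

# Hypothetical toy database: one relation of (name, dept, salary) tuples.
EMPLOYEES = [
    ("ann", "db", 50),
    ("bob", "db", 70),
    ("cat", "ml", 70),
    ("dan", "ml", 90),
]

def query(dept, min_salary):
    """Names of employees in `dept` earning at least `min_salary`."""
    return frozenset(n for n, d, s in EMPLOYEES if d == dept and s >= min_salary)

def jaccard(a, b):
    """Similarity of two query results (defined as 1.0 when both are empty)."""
    return 1.0 if not a and not b else len(a & b) / len(a | b)

def shap_scores(query, similarity, reference, distributions):
    """Brute-force Shap score per parameter, assuming independent parameters.

    reference: tuple with the original parameter values.
    distributions: one dict {value: probability} per parameter.
    """
    n = len(reference)
    target = query(*reference)

    def value(fixed):
        # Expected similarity when the parameters in `fixed` keep their
        # reference values and all others are drawn independently.
        supports = [
            [(reference[i], 1.0)] if i in fixed else list(distributions[i].items())
            for i in range(n)
        ]
        total = 0.0
        for combo in product(*supports):
            prob = 1.0
            for _, p in combo:
                prob *= p
            total += prob * similarity(target, query(*(v for v, _ in combo)))
        return total

    scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        score = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude parameter i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                fixed = set(subset)
                score += weight * (value(fixed | {i}) - value(fixed))
        scores.append(score)
    return scores

# Original parameters: department "db", salary threshold 60.
reference = ("db", 60)
distributions = [
    {"db": 0.5, "ml": 0.5},
    {40: 0.25, 60: 0.5, 80: 0.25},
]
scores = shap_scores(query, jaccard, reference, distributions)
print(scores)
```

On this toy input the department parameter receives a score of about 0.406 and the salary threshold about 0.281. The two scores sum to 1 minus the expected similarity under fully random parameters, as the efficiency property of the Shapley value requires. The enumeration is exponential in the number of parameters and linear in the number of parameter combinations, so it only illustrates the definition. It says nothing about the tractable cases the paper identifies.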

Type
conference paper
DOI
10.4230/LIPIcs.ICDT.2024.14
Scopus ID

2-s2.0-85188615157

Author(s)
Grohe, Martin

Rheinisch-Westfälische Technische Hochschule Aachen

Kimelfeld, Benny

Technion - Israel Institute of Technology

Lindner, Peter  

École Polytechnique Fédérale de Lausanne

Standke, Christoph

Rheinisch-Westfälische Technische Hochschule Aachen

Editors
Cormode, Graham
•
Shekelyan, Michael
Date Issued

2024-03-01

Publisher

Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing

Published in
Leibniz International Proceedings in Informatics, LIPIcs
ISBN of the book

9783959773126

Book part number

290

Article Number

14

Subjects

query parameters

•

SHAP score

•

Shapley value

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
DATA  
Event name

27th International Conference on Database Theory

Event place

Paestum, Italy

Event date

2024-03-25 - 2024-03-28

Available on Infoscience
January 26, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/244743
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved