15 июн. 2022 г.

How to find useless foreign keys in a database

With times, old databases tend to overgrow with unnecessary or just empty fields. Whilst regular fields are not much of a concern, those with foreign key constraints, especially in tables with millions of records, would needlessly inflate the database file and put performance penalty on every insert/update/delete operation, as appropriate index need to be updated.

For example, an index on a field that holds nothing more than NULL values over a table with 100 000 000 records has a size approximately of 600 MB.

The query below helps to spot fields with foreign keys, which contain no more than a given count of unique values (by default, the variable maxuniqvalues is set to 1 to find all fields which are empty or contain exactly one value). Additionally, the condition on minimal record count in a table in question could be set through variable minreccnt.

The result data set includes the following columns: name of the relation, number of records in the relation, name of the field, value (only the first value is shown), and a list of objects dependent on the field. The latter will help a lot if the field to be deleted later.

  EXECUTE BLOCK
  RETURNS(
    rn VARCHAR(31),
    reccnt INTEGER,
    fkfieldname VARCHAR(31),
    val INTEGER,
    dependent VARCHAR(8192)
  )
AS
  DECLARE VARIABLE minreccnt DOUBLE PRECISION = 1000000;
  DECLARE VARIABLE maxuniqvalues DOUBLE PRECISION = 1;
BEGIN
  FOR
    SELECT
      rc.rdb$relation_name,
      CAST((1 / idx.rdb$statistics) AS INTEGER),
      idxsfk.rdb$field_name,
      (SELECT
         LIST(TRIM(d.rdb$dependent_name))
         FROM rdb$dependencies d
         WHERE 
           d.rdb$depended_on_name = rc.rdb$relation_name
           AND 
           d.rdb$field_name = idxsfk.rdb$field_name)
    FROM
      rdb$relation_constraints rc
      JOIN rdb$indices idx
        ON idx.rdb$index_name = rc.rdb$index_name
      JOIN rdb$index_segments idxs
        ON idxs.rdb$index_name = idx.rdb$index_name
        AND idxs.rdb$field_position = 0
      JOIN rdb$relation_constraints rcfk
        ON rcfk.rdb$constraint_type = 'FOREIGN KEY'
        AND rcfk.rdb$relation_name = rc.rdb$relation_name
      JOIN rdb$indices idxfk
        ON idxfk.rdb$index_name = rcfk.rdb$index_name
      JOIN rdb$index_segments idxsfk
        ON idxsfk.rdb$index_name = idxfk.rdb$index_name
        AND idxsfk.rdb$field_position = 0
    WHERE
      rc.rdb$constraint_type = 'PRIMARY KEY'
      AND
      (idx.rdb$statistics > 0 
        AND idx.rdb$statistics < (1.0 / :minreccnt))
      AND
      idxfk.rdb$statistics >= (1.0 / :maxuniqvalues)
    ORDER BY
      idx.rdb$statistics ASC
    INTO
      :rn, :reccnt, :fkfieldname, :dependent
  DO BEGIN
    EXECUTE STATEMENT 
      'SELECT FIRST 1 ' || 
      :fkfieldname || 
      ' FROM ' || 
      :rn
      INTO :val;
    SUSPEND;
  END
END