博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
PostgreSQL 源码解读(45)- 查询语句#30(query_planner函数#6)
阅读量:2505 次
发布时间:2019-05-11

本文共 42881 字,大约阅读时间需要 142 分钟。

先前的章节已介绍了函数query_planner中子函数deconstruct_jointree函数的实现逻辑以及等价类的基本概念和数据结构等,本节介绍函数reconsider_outer_join_clauses和generate_base_implied_equalities的主要实现逻辑。

query_planner代码片段:

//...     /*      * Examine the targetlist and join tree, adding entries to baserel      * targetlists for all referenced Vars, and generating PlaceHolderInfo      * entries for all referenced PlaceHolderVars.  Restrict and join clauses      * are added to appropriate lists belonging to the mentioned relations. We      * also build EquivalenceClasses for provably equivalent expressions. The      * SpecialJoinInfo list is also built to hold information about join order      * restrictions.  Finally, we form a target joinlist for make_one_rel() to      * work from.      */     build_base_rel_tlists(root, tlist);//构建"base rels"的投影列      find_placeholders_in_jointree(root);//处理jointree中的PHI      find_lateral_references(root);//处理jointree中Lateral依赖      joinlist = deconstruct_jointree(root);//分解jointree     /*      * Reconsider any postponed outer-join quals now that we have built up      * equivalence classes.  (This could result in further additions or      * mergings of classes.)      */     reconsider_outer_join_clauses(root);//已创建等价类,那么需要重新考虑被下推后处理的外连接表达式      /*      * If we formed any equivalence classes, generate additional restriction      * clauses as appropriate.  (Implied join clauses are formed on-the-fly      * later.)      */     generate_base_implied_equalities(root);//等价类构建后,生成因此外加的约束语句      //...

一、重要的数据结构

RelOptInfo

与上节一样,RelOptInfo结构体贯彻逻辑优化和物理优化过程的始终,需不时Review.

typedef struct RelOptInfo {     NodeTag     type;//节点标识      RelOptKind  reloptkind;//RelOpt类型      /* all relations included in this RelOptInfo */     Relids      relids;         /*Relids(rtindex)集合 set of base relids (rangetable indexes) */      /* size estimates generated by planner */     double      rows;           /*结果元组的估算数量 estimated number of result tuples */      /* per-relation planner control flags */     bool        consider_startup;   /*是否考虑启动成本?是,需要保留启动成本低的路径 keep cheap-startup-cost paths? */     bool        consider_param_startup; /*是否考虑参数化?的路径 ditto, for parameterized paths? */     bool        consider_parallel;  /*是否考虑并行处理路径 consider parallel paths? */      /* default result targetlist for Paths scanning this relation */     struct PathTarget *reltarget;   /*扫描该Relation时默认的结果 list of Vars/Exprs, cost, width */      /* materialization information */     List       *pathlist;       /*访问路径链表 Path structures */     List       *ppilist;        /*路径链表中使用参数化路径进行 ParamPathInfos used in pathlist */     List       *partial_pathlist;   /* partial Paths */     struct Path *cheapest_startup_path;//代价最低的启动路径     struct Path *cheapest_total_path;//代价最低的整体路径     struct Path *cheapest_unique_path;//代价最低的获取唯一值的路径     List       *cheapest_parameterized_paths;//代价最低的参数化?路径链表      /* parameterization information needed for both base rels and join rels */     /* (see also lateral_vars and lateral_referencers) */     Relids      direct_lateral_relids;  /*使用lateral语法,需依赖的Relids rels directly laterally referenced */     Relids      lateral_relids; /* minimum parameterization of rel */      /* information about a base rel (not set for join rels!) */     //reloptkind=RELOPT_BASEREL时使用的数据结构     Index       relid;          /* Relation ID */     Oid         reltablespace;  /* 表空间 containing tablespace */     RTEKind     rtekind;        /* 基表?子查询?还是函数等等?RELATION, SUBQUERY, FUNCTION, etc */     AttrNumber  min_attr;       /* 最小的属性编号 smallest attrno of rel (often <0) */     AttrNumber  max_attr;       /* 最大的属性编号 largest attrno of rel */     Relids     *attr_needed;    /* 数组 array indexed [min_attr .. max_attr] */     int32      *attr_widths;    /* 属性宽度 array indexed [min_attr .. max_attr] */     List       *lateral_vars;   /* 关系依赖的Vars/PHVs LATERAL Vars and PHVs referenced by rel */     Relids      lateral_referencers;    /*依赖该关系的Relids rels that reference me laterally */     List       *indexlist;      /* 该关系的IndexOptInfo链表 list of IndexOptInfo */     List       *statlist;       /* 统计信息链表 list of StatisticExtInfo */     BlockNumber pages;          /* 块数 size estimates derived from pg_class */     double      tuples;         /* 元组数 */     double      allvisfrac;     /* ? */     PlannerInfo *subroot;       /* 如为子查询,存储子查询的root if subquery */     List       *subplan_params; /* 如为子查询,存储子查询的参数 if subquery */     int         rel_parallel_workers;   /* 并行执行,需要多少个workers? wanted number of parallel workers */      /* Information about foreign tables and foreign joins */     //FWD相关信息     Oid         serverid;       /* identifies server for the table or join */     Oid         userid;         /* identifies user to check access as */     bool        useridiscurrent;    /* join is only valid for current user */     /* use "struct FdwRoutine" to avoid including fdwapi.h here */     struct FdwRoutine *fdwroutine;     void       *fdw_private;      /* cache space for remembering if we have proven this relation unique */     //已知的,可保证唯一的Relids链表     List       *unique_for_rels;    /* known unique for these other relid                                      * set(s) */     List       *non_unique_for_rels;    /* 已知的,不唯一的Relids链表 known not unique for these set(s) */      /* used by various scans and joins: */     List       *baserestrictinfo;   /* 如为基本关系,存储约束条件 RestrictInfo structures (if base rel) */     QualCost    baserestrictcost;   /* 解析约束表达式的成本? cost of evaluating the above */     Index       baserestrict_min_security;  /* 最低安全等级 min security_level found in                                              * baserestrictinfo */     List       *joininfo;       /* 连接语句的约束条件信息 RestrictInfo structures for join clauses                                  * involving this rel */     bool        has_eclass_joins;   /* 是否存在等价类连接? T means joininfo is incomplete */      /* used by partitionwise joins: */     bool        consider_partitionwise_join;    /* 分区? consider partitionwise                                                  * join paths? (if                                                  * partitioned rel) */     Relids      top_parent_relids;  /* Relids of topmost parents (if "other"                                      * rel) */      /* used for partitioned relations */     //分区表使用     PartitionScheme part_scheme;    /* 分区的schema Partitioning scheme. */     int         nparts;         /* 分区数 number of partitions */     struct PartitionBoundInfoData *boundinfo;   /* 分区边界信息 Partition bounds */     List       *partition_qual; /* 分区约束 partition constraint */     struct RelOptInfo **part_rels;  /* 分区的RelOptInfo数组 Array of RelOptInfos of partitions,                                      * stored in the same order of bounds */     List      **partexprs;      /* 非空分区键表达式 Non-nullable partition key expressions. */     List      **nullable_partexprs; /* 可为空的分区键表达式 Nullable partition key expressions. */     List       *partitioned_child_rels; /* RT Indexes链表 List of RT indexes. */ } RelOptInfo;

二、源码解读

reconsider_outer_join_clauses函数

该函数遍历优化器信息(PlannerInfo)中的外连接子句(left_join_clauses),把条件分发到合适的地方,其中限制条件(Where子句中的条件)分发到RelOptInfo->baserestrictinfo中,连接条件(连接语句中的条件ON XX)分发到joininfo中

/*  * reconsider_outer_join_clauses  *    Re-examine any outer-join clauses that were set aside by  *    distribute_qual_to_rels(), and see if we can derive any  *    EquivalenceClasses from them.  Then, if they were not made  *    redundant, push them out into the regular join-clause lists.  *  * When we have mergejoinable clauses A = B that are outer-join clauses,  * we can't blindly combine them with other clauses A = C to deduce B = C,  * since in fact the "equality" A = B won't necessarily hold above the  * outer join (one of the variables might be NULL instead).  Nonetheless  * there are cases where we can add qual clauses using transitivity.  *  * One case that we look for here is an outer-join clause OUTERVAR = INNERVAR  * for which there is also an equivalence clause OUTERVAR = CONSTANT.  * It is safe and useful to push a clause INNERVAR = CONSTANT into the  * evaluation of the inner (nullable) relation, because any inner rows not  * meeting this condition will not contribute to the outer-join result anyway.  * (Any outer rows they could join to will be eliminated by the pushed-down  * equivalence clause.)  *  * Note that the above rule does not work for full outer joins; nor is it  * very interesting to consider cases where the generated equivalence clause  * would involve relations outside the outer join, since such clauses couldn't  * be pushed into the inner side's scan anyway.  So the restriction to  * outervar = pseudoconstant is not really giving up anything.  *  * For full-join cases, we can only do something useful if it's a FULL JOIN  * USING and a merged column has an equivalence MERGEDVAR = CONSTANT.  * By the time it gets here, the merged column will look like  *      COALESCE(LEFTVAR, RIGHTVAR)  * and we will have a full-join clause LEFTVAR = RIGHTVAR that we can match  * the COALESCE expression to. In this situation we can push LEFTVAR = CONSTANT  * and RIGHTVAR = CONSTANT into the input relations, since any rows not  * meeting these conditions cannot contribute to the join result.  *  * Again, there isn't any traction to be gained by trying to deal with  * clauses comparing a mergedvar to a non-pseudoconstant.  So we can make  * use of the EquivalenceClasses to search for matching variables that were  * equivalenced to constants.  The interesting outer-join clauses were  * accumulated for us by distribute_qual_to_rels.  *  * When we find one of these cases, we implement the changes we want by  * generating a new equivalence clause INNERVAR = CONSTANT (or LEFTVAR, etc)  * and pushing it into the EquivalenceClass structures.  This is because we  * may already know that INNERVAR is equivalenced to some other var(s), and  * we'd like the constant to propagate to them too.  Note that it would be  * unsafe to merge any existing EC for INNERVAR with the OUTERVAR's EC ---  * that could result in propagating constant restrictions from  * INNERVAR to OUTERVAR, which would be very wrong.  *  * It's possible that the INNERVAR is also an OUTERVAR for some other  * outer-join clause, in which case the process can be repeated.  So we repeat  * looping over the lists of clauses until no further deductions can be made.  * Whenever we do make a deduction, we remove the generating clause from the  * lists, since we don't want to make the same deduction twice.  *  * If we don't find any match for a set-aside outer join clause, we must  * throw it back into the regular joinclause processing by passing it to  * distribute_restrictinfo_to_rels().  If we do generate a derived clause,  * however, the outer-join clause is redundant.  We still throw it back,  * because otherwise the join will be seen as a clauseless join and avoided  * during join order searching; but we mark it as redundant to keep from  * messing up the joinrel's size estimate.  (This behavior means that the  * API for this routine is uselessly complex: we could have just put all  * the clauses into the regular processing initially.  We keep it because  * someday we might want to do something else, such as inserting "dummy"  * joinclauses instead of real ones.)  *  * Outer join clauses that are marked outerjoin_delayed are special: this  * condition means that one or both VARs might go to null due to a lower  * outer join.  We can still push a constant through the clause, but only  * if its operator is strict; and we *have to* throw the clause back into  * regular joinclause processing.  By keeping the strict join clause,  * we ensure that any null-extended rows that are mistakenly generated due  * to suppressing rows not matching the constant will be rejected at the  * upper outer join.  (This doesn't work for full-join clauses.)  */ void reconsider_outer_join_clauses(PlannerInfo *root) {     bool        found;     ListCell   *cell;     ListCell   *prev;     ListCell   *next;      /* Outer loop repeats until we find no more deductions */     do     {         found = false;          /* Process the LEFT JOIN clauses */         prev = NULL;         for (cell = list_head(root->left_join_clauses); cell; cell = next)//遍历left_join_clauses         {             RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);              next = lnext(cell);             if (reconsider_outer_join_clause(root, rinfo, true))             {                 found = true;                 /* remove it from the list */                 root->left_join_clauses =                     list_delete_cell(root->left_join_clauses, cell, prev);//如可以,则去掉连接条件(移到约束条件中)                 /* we throw it back anyway (see notes above) */                 /* but the thrown-back clause has no extra selectivity */                 rinfo->norm_selec = 2.0;                 rinfo->outer_selec = 1.0;                 distribute_restrictinfo_to_rels(root, rinfo);//分发到RelOptInfo中             }             else                 prev = cell;         }          /* Process the RIGHT JOIN clauses */         prev = NULL;         for (cell = list_head(root->right_join_clauses); cell; cell = next)//处理右连接         {             RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);              next = lnext(cell);             if (reconsider_outer_join_clause(root, rinfo, false))             {                 found = true;                 /* remove it from the list */                 root->right_join_clauses =                     list_delete_cell(root->right_join_clauses, cell, prev);                 /* we throw it back anyway (see notes above) */                 /* but the thrown-back clause has no extra selectivity */                 rinfo->norm_selec = 2.0;                 rinfo->outer_selec = 1.0;                 distribute_restrictinfo_to_rels(root, rinfo);             }             else                 prev = cell;         }          /* Process the FULL JOIN clauses */         prev = NULL;         for (cell = list_head(root->full_join_clauses); cell; cell = next)//全连接         {             RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);              next = lnext(cell);             if (reconsider_full_join_clause(root, rinfo))             {                 found = true;                 /* remove it from the list */                 root->full_join_clauses =                     list_delete_cell(root->full_join_clauses, cell, prev);                 /* we throw it back anyway (see notes above) */                 /* but the thrown-back clause has no extra selectivity */                 rinfo->norm_selec = 2.0;                 rinfo->outer_selec = 1.0;                 distribute_restrictinfo_to_rels(root, rinfo);             }             else                 prev = cell;         }     } while (found);   //处理连接条件链表中余下的条件     /* Now, any remaining clauses have to be thrown back */     foreach(cell, root->left_join_clauses)     {         RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);          distribute_restrictinfo_to_rels(root, rinfo);     }     foreach(cell, root->right_join_clauses)     {         RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);          distribute_restrictinfo_to_rels(root, rinfo);     }     foreach(cell, root->full_join_clauses)     {         RestrictInfo *rinfo = (RestrictInfo *) lfirst(cell);          distribute_restrictinfo_to_rels(root, rinfo);     } }  /*  * reconsider_outer_join_clauses for a single LEFT/RIGHT JOIN clause  *  * Returns true if we were able to propagate a constant through the clause.  */ static bool reconsider_outer_join_clause(PlannerInfo *root, RestrictInfo *rinfo,                              bool outer_on_left) {     Expr       *outervar,                *innervar;     Oid         opno,                 collation,                 left_type,                 right_type,                 inner_datatype;     Relids      inner_relids,                 inner_nullable_relids;     ListCell   *lc1;      Assert(is_opclause(rinfo->clause));     opno = ((OpExpr *) rinfo->clause)->opno;     collation = ((OpExpr *) rinfo->clause)->inputcollid;      /* If clause is outerjoin_delayed, operator must be strict */     if (rinfo->outerjoin_delayed && !op_strict(opno))         return false;      /* Extract needed info from the clause */     op_input_types(opno, &left_type, &right_type);     if (outer_on_left)     {         outervar = (Expr *) get_leftop(rinfo->clause);         innervar = (Expr *) get_rightop(rinfo->clause);         inner_datatype = right_type;         inner_relids = rinfo->right_relids;     }     else     {         outervar = (Expr *) get_rightop(rinfo->clause);         innervar = (Expr *) get_leftop(rinfo->clause);         inner_datatype = left_type;         inner_relids = rinfo->left_relids;     }     inner_nullable_relids = bms_intersect(inner_relids,                                           rinfo->nullable_relids);      /* Scan EquivalenceClasses for a match to outervar */     foreach(lc1, root->eq_classes)//遍历等价类     {         EquivalenceClass *cur_ec = (EquivalenceClass *) lfirst(lc1);         bool        match;         ListCell   *lc2;          /* Ignore EC unless it contains pseudoconstants */         if (!cur_ec->ec_has_const)             continue;         /* Never match to a volatile EC */         if (cur_ec->ec_has_volatile)             continue;         /* It has to match the outer-join clause as to semantics, too */         if (collation != cur_ec->ec_collation)             continue;         if (!equal(rinfo->mergeopfamilies, cur_ec->ec_opfamilies))             continue;         /* Does it contain a match to outervar? */         match = false;         foreach(lc2, cur_ec->ec_members)         {             EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);              Assert(!cur_em->em_is_child);   /* no children yet */             if (equal(outervar, cur_em->em_expr))             {                 match = true;                 break;             }         }         if (!match)             continue;           /* no match, so ignore this EC */          /*          * Yes it does!  Try to generate a clause INNERVAR = CONSTANT for each          * CONSTANT in the EC.  Note that we must succeed with at least one          * constant before we can decide to throw away the outer-join clause.          */         match = false;         foreach(lc2, cur_ec->ec_members)         {             EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);             Oid         eq_op;             RestrictInfo *newrinfo;              if (!cur_em->em_is_const)                 continue;       /* ignore non-const members */             eq_op = select_equality_operator(cur_ec,                                              inner_datatype,                                              cur_em->em_datatype);             if (!OidIsValid(eq_op))                 continue;       /* can't generate equality */             newrinfo = build_implied_join_equality(eq_op,                                                    cur_ec->ec_collation,                                                    innervar,                                                    cur_em->em_expr,                                                    bms_copy(inner_relids),                                                    bms_copy(inner_nullable_relids),                                                    cur_ec->ec_min_security);             if (process_equivalence(root, &newrinfo, true))                 match = true;         }          /*          * If we were able to equate INNERVAR to any constant, report success.          * Otherwise, fall out of the search loop, since we know the OUTERVAR          * appears in at most one EC.          */         if (match)             return true;         else             break;     }      return false;               /* failed to make any deduction */ }

generate_base_implied_equalities函数

该函数遍历所有的等价类,找出一个隐含的条件然后分发到RelOptInfo中,这样做的目的是为了在连接(join)前过滤元组,减少参与运算的元组数量.

/*  * generate_base_implied_equalities  *    Generate any restriction clauses that we can deduce from equivalence  *    classes.  *  * When an EC contains pseudoconstants, our strategy is to generate  * "member = const1" clauses where const1 is the first constant member, for  * every other member (including other constants).  If we are able to do this  * then we don't need any "var = var" comparisons because we've successfully  * constrained all the vars at their points of creation.  If we fail to  * generate any of these clauses due to lack of cross-type operators, we fall  * back to the "ec_broken" strategy described below.  (XXX if there are  * multiple constants of different types, it's possible that we might succeed  * in forming all the required clauses if we started from a different const  * member; but this seems a sufficiently hokey corner case to not be worth  * spending lots of cycles on.)  *  * For ECs that contain no pseudoconstants, we generate derived clauses  * "member1 = member2" for each pair of members belonging to the same base  * relation (actually, if there are more than two for the same base relation,  * we only need enough clauses to link each to each other).  This provides  * the base case for the recursion: each row emitted by a base relation scan  * will constrain all computable members of the EC to be equal.  As each  * join path is formed, we'll add additional derived clauses on-the-fly  * to maintain this invariant (see generate_join_implied_equalities).  *  * If the opfamilies used by the EC do not provide complete sets of cross-type  * equality operators, it is possible that we will fail to generate a clause  * that must be generated to maintain the invariant.  (An example: given  * "WHERE a.x = b.y AND b.y = a.z", the scheme breaks down if we cannot  * generate "a.x = a.z" as a restriction clause for A.)  In this case we mark  * the EC "ec_broken" and fall back to regurgitating its original source  * RestrictInfos at appropriate times.  We do not try to retract any derived  * clauses already generated from the broken EC, so the resulting plan could  * be poor due to bad selectivity estimates caused by redundant clauses.  But  * the correct solution to that is to fix the opfamilies ...  *  * Equality clauses derived by this function are passed off to  * process_implied_equality (in plan/initsplan.c) to be inserted into the  * restrictinfo datastructures.  Note that this must be called after initial  * scanning of the quals and before Path construction begins.  *  * We make no attempt to avoid generating duplicate RestrictInfos here: we  * don't search ec_sources for matches, nor put the created RestrictInfos  * into ec_derives.  Doing so would require some slightly ugly changes in  * initsplan.c's API, and there's no real advantage, because the clauses  * generated here can't duplicate anything we will generate for joins anyway.  */ void generate_base_implied_equalities(PlannerInfo *root) {     ListCell   *lc;     Index       rti;      foreach(lc, root->eq_classes)//遍历等价类     {         EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc);          Assert(ec->ec_merged == NULL);  /* else shouldn't be in list */         Assert(!ec->ec_broken); /* not yet anyway... */          /* Single-member ECs won't generate any deductions */         if (list_length(ec->ec_members) <= 1)//小于1个成员,无需处理类             continue;          if (ec->ec_has_const)//有常量             generate_base_implied_equalities_const(root, ec);         else//无常量             generate_base_implied_equalities_no_const(root, ec);          /* Recover if we failed to generate required derived clauses */         if (ec->ec_broken)//处理失败个案             generate_base_implied_equalities_broken(root, ec);     }      /*      * This is also a handy place to mark base rels (which should all exist by      * now) with flags showing whether they have pending eclass joins.      */     for (rti = 1; rti < root->simple_rel_array_size; rti++)//设置标记     {         RelOptInfo *brel = root->simple_rel_array[rti];          if (brel == NULL)             continue;          brel->has_eclass_joins = has_relevant_eclass_joinclause(root, brel);     } }  /*  * generate_base_implied_equalities when EC contains pseudoconstant(s)  */ static void generate_base_implied_equalities_const(PlannerInfo *root,                                        EquivalenceClass *ec) {     EquivalenceMember *const_em = NULL;     ListCell   *lc;      /*      * In the trivial case where we just had one "var = const" clause, push      * the original clause back into the main planner machinery.  There is      * nothing to be gained by doing it differently, and we save the effort to      * re-build and re-analyze an equality clause that will be exactly      * equivalent to the old one.      */     if (list_length(ec->ec_members) == 2 &&         list_length(ec->ec_sources) == 1)     {         RestrictInfo *restrictinfo = (RestrictInfo *) linitial(ec->ec_sources);          if (bms_membership(restrictinfo->required_relids) != BMS_MULTIPLE)         {             distribute_restrictinfo_to_rels(root, restrictinfo);             return;         }     }      /*      * Find the constant member to use.  We prefer an actual constant to      * pseudo-constants (such as Params), because the constraint exclusion      * machinery might be able to exclude relations on the basis of generated      * "var = const" equalities, but "var = param" won't work for that.      */     foreach(lc, ec->ec_members)//获取常量Member     {         EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc);          if (cur_em->em_is_const)         {             const_em = cur_em;             if (IsA(cur_em->em_expr, Const))                 break;         }     }     Assert(const_em != NULL);      /* Generate a derived equality against each other member */     foreach(lc, ec->ec_members)     {         EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc);         Oid         eq_op;          Assert(!cur_em->em_is_child);   /* no children yet */         if (cur_em == const_em)             continue;         eq_op = select_equality_operator(ec,                                          cur_em->em_datatype,                                          const_em->em_datatype);         if (!OidIsValid(eq_op))         {             /* failed... */             ec->ec_broken = true;             break;         }         process_implied_equality(root, eq_op, ec->ec_collation,                                  cur_em->em_expr, const_em->em_expr,                                  bms_copy(ec->ec_relids),                                  bms_union(cur_em->em_nullable_relids,                                            const_em->em_nullable_relids),                                  ec->ec_min_security,                                  ec->ec_below_outer_join,                                  cur_em->em_is_const);//下推条件     } }  /*  * generate_base_implied_equalities when EC contains no pseudoconstants  */ static void generate_base_implied_equalities_no_const(PlannerInfo *root,                                           EquivalenceClass *ec) {     EquivalenceMember **prev_ems;     ListCell   *lc;      /*      * We scan the EC members once and track the last-seen member for each      * base relation.  When we see another member of the same base relation,      * we generate "prev_mem = cur_mem".  This results in the minimum number      * of derived clauses, but it's possible that it will fail when a      * different ordering would succeed.  XXX FIXME: use a UNION-FIND      * algorithm similar to the way we build merged ECs.  (Use a list-of-lists      * for each rel.)      */     prev_ems = (EquivalenceMember **)         palloc0(root->simple_rel_array_size * sizeof(EquivalenceMember *));      foreach(lc, ec->ec_members)     {         EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc);         int         relid;          Assert(!cur_em->em_is_child);   /* no children yet */         if (!bms_get_singleton_member(cur_em->em_relids, &relid))             continue;         Assert(relid < root->simple_rel_array_size);          if (prev_ems[relid] != NULL)         {             EquivalenceMember *prev_em = prev_ems[relid];             Oid         eq_op;              eq_op = select_equality_operator(ec,                                              prev_em->em_datatype,                                              cur_em->em_datatype);             if (!OidIsValid(eq_op))             {                 /* failed... */                 ec->ec_broken = true;                 break;             }             process_implied_equality(root, eq_op, ec->ec_collation,                                      prev_em->em_expr, cur_em->em_expr,                                      bms_copy(ec->ec_relids),                                      bms_union(prev_em->em_nullable_relids,                                                cur_em->em_nullable_relids),                                      ec->ec_min_security,                                      ec->ec_below_outer_join,                                      false);         }         prev_ems[relid] = cur_em;     }      pfree(prev_ems);      /*      * We also have to make sure that all the Vars used in the member clauses      * will be available at any join node we might try to reference them at.      * For the moment we force all the Vars to be available at all join nodes      * for this eclass.  Perhaps this could be improved by doing some      * pre-analysis of which members we prefer to join, but it's no worse than      * what happened in the pre-8.3 code.      */     foreach(lc, ec->ec_members)     {         EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc);         List       *vars = pull_var_clause((Node *) cur_em->em_expr,                                            PVC_RECURSE_AGGREGATES |                                            PVC_RECURSE_WINDOWFUNCS |                                            PVC_INCLUDE_PLACEHOLDERS);          add_vars_to_targetlist(root, vars, ec->ec_relids, false);         list_free(vars);     } }  /*  * generate_base_implied_equalities cleanup after failure  *  * What we must do here is push any zero- or one-relation source RestrictInfos  * of the EC back into the main restrictinfo datastructures.  Multi-relation  * clauses will be regurgitated later by generate_join_implied_equalities().  * (We do it this way to maintain continuity with the case that ec_broken  * becomes set only after we've gone up a join level or two.)  However, for  * an EC that contains constants, we can adopt a simpler strategy and just  * throw back all the source RestrictInfos immediately; that works because  * we know that such an EC can't become broken later.  (This rule justifies  * ignoring ec_has_const ECs in generate_join_implied_equalities, even when  * they are broken.)  */ static void generate_base_implied_equalities_broken(PlannerInfo *root,                                         EquivalenceClass *ec) {     ListCell   *lc;      foreach(lc, ec->ec_sources)     {         RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(lc);          if (ec->ec_has_const ||             bms_membership(restrictinfo->required_relids) != BMS_MULTIPLE)             distribute_restrictinfo_to_rels(root, restrictinfo);     } } /*  * process_implied_equality  *    Create a restrictinfo item that says "item1 op item2", and push it  *    into the appropriate lists.  (In practice opno is always a btree  *    equality operator.)  *  * "qualscope" is the nominal syntactic level to impute to the restrictinfo.  * This must contain at least all the rels used in the expressions, but it  * is used only to set the qual application level when both exprs are  * variable-free.  Otherwise the qual is applied at the lowest join level  * that provides all its variables.  *  * "nullable_relids" is the set of relids used in the expressions that are  * potentially nullable below the expressions.  (This has to be supplied by  * caller because this function is used after deconstruct_jointree, so we  * don't have knowledge of where the clause items came from.)  *  * "security_level" is the security level to assign to the new restrictinfo.  *  * "both_const" indicates whether both items are known pseudo-constant;  * in this case it is worth applying eval_const_expressions() in case we  * can produce constant TRUE or constant FALSE.  (Otherwise it's not,  * because the expressions went through eval_const_expressions already.)  *  * Note: this function will copy item1 and item2, but it is caller's  * responsibility to make sure that the Relids parameters are fresh copies  * not shared with other uses.  *  * This is currently used only when an EquivalenceClass is found to  * contain pseudoconstants.  See path/pathkeys.c for more details.  */ void process_implied_equality(PlannerInfo *root,                          Oid opno,                          Oid collation,                          Expr *item1,                          Expr *item2,                          Relids qualscope,                          Relids nullable_relids,                          Index security_level,                          bool below_outer_join,                          bool both_const) {     Expr       *clause;      /*      * Build the new clause.  Copy to ensure it shares no substructure with      * original (this is necessary in case there are subselects in there...)      */     clause = make_opclause(opno,                            BOOLOID, /* opresulttype */                            false,   /* opretset */                            copyObject(item1),                            copyObject(item2),                            InvalidOid,                            collation);//构造条件表达式      /* If both constant, try to reduce to a boolean constant. */     if (both_const)//     {         clause = (Expr *) eval_const_expressions(root, (Node *) clause);          /* If we produced const TRUE, just drop the clause */         if (clause && IsA(clause, Const))         {             Const      *cclause = (Const *) clause;              Assert(cclause->consttype == BOOLOID);             if (!cclause->constisnull && DatumGetBool(cclause->constvalue))                 return;         }     }      /*      * Push the new clause into all the appropriate restrictinfo lists.      */     distribute_qual_to_rels(root, (Node *) clause,                             true, below_outer_join, JOIN_INNER,                             security_level,                             qualscope, NULL, NULL, nullable_relids,                             NULL);//分发条件至RelOptInfo }

三、跟踪分析

测试脚本:

testdb=# explain verbose select t1.dwbh,t2.grbhtestdb-# from t_dwxx t1 left join t_grxx t2 on t1.dwbh = t2.dwbh and t2.dwbh = '1001' testdb-# order by t2.dwbh;                                    QUERY PLAN                                     ----------------------------------------------------------------------------------- Sort  (cost=19.16..19.56 rows=160 width=114)   Output: t1.dwbh, t2.grbh, t2.dwbh   Sort Key: t2.dwbh   ->  Hash Left Join  (cost=1.09..13.30 rows=160 width=114)         Output: t1.dwbh, t2.grbh, t2.dwbh         Hash Cond: ((t1.dwbh)::text = (t2.dwbh)::text)         ->  Seq Scan on public.t_dwxx t1  (cost=0.00..11.60 rows=160 width=38)               Output: t1.dwmc, t1.dwbh, t1.dwdz         ->  Hash  (cost=1.07..1.07 rows=1 width=76)               Output: t2.grbh, t2.dwbh               ->  Seq Scan on public.t_grxx t2  (cost=0.00..1.07 rows=1 width=76)                     Output: t2.grbh, t2.dwbh                     Filter: ((t2.dwbh)::text = '1001'::text)(13 rows)

跟踪分析,启动gdb

(gdb) b planmain.c:161Breakpoint 1 at 0x76958b: file planmain.c, line 161.(gdb) cContinuing.Breakpoint 1, query_planner (root=0x2c92a88, tlist=0x2c5f048, qp_callback=0x76e906 
, qp_extra=0x7fffed6e9c10) at planmain.c:163warning: Source file is more recent than executable.163 reconsider_outer_join_clauses(root);

调用前检查root(PlannerInfo)->simple_rel_array数组的内存结构,可以看到baserestrictinfo和joininfo均为NULL

(gdb) p *root->simple_rel_array[1]$2 = {type = T_RelOptInfo, reloptkind = RELOPT_BASEREL, relids = 0x2c5fdd0, rows = 0, consider_startup = false,   consider_param_startup = false, consider_parallel = false, reltarget = 0x2c5fde8, pathlist = 0x0, ppilist = 0x0,   partial_pathlist = 0x0, cheapest_startup_path = 0x0, cheapest_total_path = 0x0, cheapest_unique_path = 0x0,   cheapest_parameterized_paths = 0x0, direct_lateral_relids = 0x0, lateral_relids = 0x0, relid = 1, reltablespace = 0,   rtekind = RTE_RELATION, min_attr = -7, max_attr = 3, attr_needed = 0x2c5fe38, attr_widths = 0x2c5fec8,   lateral_vars = 0x0, lateral_referencers = 0x0, indexlist = 0x2c60160, statlist = 0x0, pages = 10, tuples = 160,   allvisfrac = 0, subroot = 0x0, subplan_params = 0x0, rel_parallel_workers = -1, serverid = 0, userid = 0,   useridiscurrent = false, fdwroutine = 0x0, fdw_private = 0x0, unique_for_rels = 0x0, non_unique_for_rels = 0x0,   baserestrictinfo = 0x0, baserestrictcost = {startup = 0, per_tuple = 0}, baserestrict_min_security = 4294967295,   joininfo = 0x0, has_eclass_joins = false, top_parent_relids = 0x0, part_scheme = 0x0, nparts = 0, boundinfo = 0x0,   partition_qual = 0x0, part_rels = 0x0, partexprs = 0x0, nullable_partexprs = 0x0, partitioned_child_rels = 0x0}(gdb) p *root->simple_rel_array[2]$3 = {type = T_RelOptInfo, reloptkind = RELOPT_BASEREL, relids = 0x2c60860, rows = 0, consider_startup = false,   consider_param_startup = false, consider_parallel = false, reltarget = 0x2c60878, pathlist = 0x0, ppilist = 0x0,   partial_pathlist = 0x0, cheapest_startup_path = 0x0, cheapest_total_path = 0x0, cheapest_unique_path = 0x0,   cheapest_parameterized_paths = 0x0, direct_lateral_relids = 0x0, lateral_relids = 0x0, relid = 2, reltablespace = 0,   rtekind = RTE_RELATION, min_attr = -7, max_attr = 5, attr_needed = 0x2c608c8, attr_widths = 0x2c60958,   lateral_vars = 0x0, lateral_referencers = 0x0, indexlist = 0x0, statlist = 0x0, pages = 1, tuples = 6, allvisfrac = 0,   subroot = 0x0, subplan_params = 0x0, rel_parallel_workers = -1, serverid = 0, userid = 0, useridiscurrent = false,   fdwroutine = 0x0, fdw_private = 0x0, unique_for_rels = 0x0, non_unique_for_rels = 0x0, baserestrictinfo = 0x0,   baserestrictcost = {startup = 0, per_tuple = 0}, baserestrict_min_security = 4294967295, joininfo = 0x0,   has_eclass_joins = false, top_parent_relids = 0x0, part_scheme = 0x0, nparts = 0, boundinfo = 0x0, partition_qual = 0x0,   part_rels = 0x0, partexprs = 0x0, nullable_partexprs = 0x0, partitioned_child_rels = 0x0}(gdb)

调用reconsider_outer_join_clauses,注意joininfo,填入了相应的数据

(gdb) p *root->simple_rel_array[1]$4 = {type = T_RelOptInfo, reloptkind = RELOPT_BASEREL, relids = 0x2c5fdd0, rows = 0, consider_startup = false,   consider_param_startup = false, consider_parallel = false, reltarget = 0x2c5fde8, pathlist = 0x0, ppilist = 0x0,   partial_pathlist = 0x0, cheapest_startup_path = 0x0, cheapest_total_path = 0x0, cheapest_unique_path = 0x0,   cheapest_parameterized_paths = 0x0, direct_lateral_relids = 0x0, lateral_relids = 0x0, relid = 1, reltablespace = 0,   rtekind = RTE_RELATION, min_attr = -7, max_attr = 3, attr_needed = 0x2c5fe38, attr_widths = 0x2c5fec8,   lateral_vars = 0x0, lateral_referencers = 0x0, indexlist = 0x2c60160, statlist = 0x0, pages = 10, tuples = 160,   allvisfrac = 0, subroot = 0x0, subplan_params = 0x0, rel_parallel_workers = -1, serverid = 0, userid = 0,   useridiscurrent = false, fdwroutine = 0x0, fdw_private = 0x0, unique_for_rels = 0x0, non_unique_for_rels = 0x0,   baserestrictinfo = 0x0, baserestrictcost = {startup = 0, per_tuple = 0}, baserestrict_min_security = 4294967295,   joininfo = 0x2c61780, has_eclass_joins = false, top_parent_relids = 0x0, part_scheme = 0x0, nparts = 0, boundinfo = 0x0,   partition_qual = 0x0, part_rels = 0x0, partexprs = 0x0, nullable_partexprs = 0x0, partitioned_child_rels = 0x0}(gdb) p *root->simple_rel_array[2]$5 = {type = T_RelOptInfo, reloptkind = RELOPT_BASEREL, relids = 0x2c60860, rows = 0, consider_startup = false,   consider_param_startup = false, consider_parallel = false, reltarget = 0x2c60878, pathlist = 0x0, ppilist = 0x0,   partial_pathlist = 0x0, cheapest_startup_path = 0x0, cheapest_total_path = 0x0, cheapest_unique_path = 0x0,   cheapest_parameterized_paths = 0x0, direct_lateral_relids = 0x0, lateral_relids = 0x0, relid = 2, reltablespace = 0,   rtekind = RTE_RELATION, min_attr = -7, max_attr = 5, attr_needed = 0x2c608c8, attr_widths = 0x2c60958,   lateral_vars = 0x0, lateral_referencers = 0x0, indexlist = 0x0, statlist = 0x0, pages = 1, tuples = 6, allvisfrac = 0,   subroot = 0x0, subplan_params = 0x0, rel_parallel_workers = -1, serverid = 0, userid = 0, useridiscurrent = false,   fdwroutine = 0x0, fdw_private = 0x0, unique_for_rels = 0x0, non_unique_for_rels = 0x0, baserestrictinfo = 0x0,   baserestrictcost = {startup = 0, per_tuple = 0}, baserestrict_min_security = 4294967295, joininfo = 0x2c617d0,   has_eclass_joins = false, top_parent_relids = 0x0, part_scheme = 0x0, nparts = 0, boundinfo = 0x0, partition_qual = 0x0,   part_rels = 0x0, partexprs = 0x0, nullable_partexprs = 0x0, partitioned_child_rels = 0x0}

调用generate_base_implied_equalities,注意root->simple_rel_array[2]->baserestrictinfo,条件已下推至限制条件(原为连接条件)

(gdb) p *root->simple_rel_array[2]$7 = {type = T_RelOptInfo, reloptkind = RELOPT_BASEREL, relids = 0x2c60860, rows = 0, consider_startup = false,   consider_param_startup = false, consider_parallel = false, reltarget = 0x2c60878, pathlist = 0x0, ppilist = 0x0,   partial_pathlist = 0x0, cheapest_startup_path = 0x0, cheapest_total_path = 0x0, cheapest_unique_path = 0x0,   cheapest_parameterized_paths = 0x0, direct_lateral_relids = 0x0, lateral_relids = 0x0, relid = 2, reltablespace = 0,   rtekind = RTE_RELATION, min_attr = -7, max_attr = 5, attr_needed = 0x2c608c8, attr_widths = 0x2c60958,   lateral_vars = 0x0, lateral_referencers = 0x0, indexlist = 0x0, statlist = 0x0, pages = 1, tuples = 6, allvisfrac = 0,   subroot = 0x0, subplan_params = 0x0, rel_parallel_workers = -1, serverid = 0, userid = 0, useridiscurrent = false,   fdwroutine = 0x0, fdw_private = 0x0, unique_for_rels = 0x0, non_unique_for_rels = 0x0, baserestrictinfo = 0x2c61820,   baserestrictcost = {startup = 0, per_tuple = 0}, baserestrict_min_security = 0, joininfo = 0x2c617d0,   has_eclass_joins = false, top_parent_relids = 0x0, part_scheme = 0x0, nparts = 0, boundinfo = 0x0, partition_qual = 0x0,   part_rels = 0x0, partexprs = 0x0, nullable_partexprs = 0x0, partitioned_child_rels = 0x0}

详细的数据结构,可自行通过gdb查看

四、参考资料

initsplan.c

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/6906/viewspace-2374866/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/6906/viewspace-2374866/

你可能感兴趣的文章
laravel连接sql server 2008
查看>>
Laravel 操作redis的各种数据类型
查看>>
Laravel框架学习笔记之任务调度(定时任务)
查看>>
laravel 定时任务秒级执行
查看>>
浅析 Laravel 官方文档推荐的 Nginx 配置
查看>>
Swagger在Laravel项目中的使用
查看>>
Laravel 的生命周期
查看>>
CentOS Docker 安装
查看>>
Nginx
查看>>
Navicat远程连接云主机数据库
查看>>
Nginx配置文件nginx.conf中文详解(总结)
查看>>
Mysql出现Table 'performance_schema.session_status' doesn't exist
查看>>
MySQL innert join、left join、right join等理解
查看>>
vivado模块封装ip/edf
查看>>
sdc时序约束
查看>>
Xilinx Jtag Access/svf文件/BSCANE2
查看>>
NoC片上网络
查看>>
开源SoC整理
查看>>
【2020-3-21】Mac安装Homebrew慢,解决办法
查看>>
influxdb 命令行输出时间为 yyyy-MM-dd HH:mm:ss(年月日时分秒)的方法
查看>>