Skip to content

Conversation

XuPeng-SH
Copy link
Contributor

@XuPeng-SH XuPeng-SH commented Jul 2, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #21437

What this PR does / why we need it:

If the ColPos and ColName in pushdown's ColExpr do not match, panic. If ColName is empty, preferentially use ColPos


PR Type

Bug fix


Description

  • Fix column position validation in filter expressions

  • Add panic checks for mismatched column names and positions

  • Update function signatures to include column position parameter

  • Handle empty column names by using position fallback


Changes diagram

flowchart LR
  A["Filter Expression"] --> B["getColDefByName()"]
  B --> C["Validate ColName vs ColPos"]
  C --> D["Panic if Mismatch"]
  E["evalValue()"] --> F["Handle Empty ColName"]
  F --> G["Use ColPos Fallback"]
Loading

Changes walkthrough 📝

Relevant files
Bug fix
expr_filter.go
Add column position parameter to filter operations             

pkg/vm/engine/readutil/expr_filter.go

  • Updated all getColDefByName() calls to include colExpr.Col.ColPos
    parameter
  • Modified function calls for comparison operators (<=, >=, >, <, =,
    etc.)
  • Added column position parameter to null/not-null checks
  • Updated prefix and between operations with position validation
  • +12/-12 
    expr_util.go
    Implement column validation and position handling               

    pkg/vm/engine/readutil/expr_util.go

  • Added fmt import for panic formatting
  • Modified getColDefByName() to accept colPos parameter and validate
    consistency
  • Added panic check when column name doesn't match expected position
  • Updated evalValue() to handle empty column names and validate
    positions
  • Added pkColId parameter to evalValue() function signature
  • +16/-15 
    pk_filter_base.go
    Add primary key column ID to filter evaluations                   

    pkg/vm/engine/readutil/pk_filter_base.go

  • Updated all evalValue() calls to include int32(tblDef.Pkey.PkeyColId)
    parameter
  • Modified comparison operations (>=, <=, >, <, =) with new parameter
  • Updated prefix operations and IN/BETWEEN clauses with column ID
    validation
  • +10/-10 

    Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • Copy link

    qodo-merge-pro bot commented Jul 2, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Panic Behavior

    The code introduces panic calls for validation failures instead of returning errors. This could cause unexpected crashes in production if column name/position mismatches occur due to data corruption or schema changes.

    if name != tableDef.Cols[colPos].Name {
    	panic(fmt.Sprintf("Bad-ColExpr: %s, %d, %d", name, colPos, pos))
    }
    Logic Change

    The evalValue function logic has been significantly restructured, changing how empty column names are handled and validation is performed. The new logic may have different behavior than the original implementation.

    colName := col.Col.Name
    if colName == "" {
    	colName = tblDef.Cols[pkColId].Name
    } else {
    	if col.Col.ColPos != pkColId {
    		panic(fmt.Sprintf("Bad-ColExpr: %s, %d, %d", colName, col.Col.ColPos, pkColId))
    	}
    }
    if !compPkCol(colName, pkName) {
    	return false, 0, nil
    }
    
    if isVec {
    	return true, types.T(tblDef.Cols[pkColId].Typ.Id), [][]byte{val}
    }
    return true, types.T(tblDef.Cols[pkColId].Typ.Id), vals
    Parameter Consistency

    All getColDefByName calls now require colPos parameter but the function signature change may not be consistently applied across the entire codebase, potentially causing compilation issues.

    colDef := getColDefByName(colExpr.Col.Name, colExpr.Col.ColPos, tableDef)
    _, isSorted := isSortedKey(colDef)

    @mergify mergify bot added the kind/bug Something isn't working label Jul 2, 2025
    Copy link

    qodo-merge-pro bot commented Jul 2, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Fix inconsistent column position usage
    Suggestion Impact:The commit addressed the inconsistency issue but took a different approach - instead of changing the return statement, it modified the error handling to use debug logging and kept returning tableDef.Cols[pos]

    code diff:

    -	if name != tableDef.Cols[colPos].Name {
    -		panic(fmt.Sprintf("Bad-ColExpr: %s, %d, %d", name, colPos, pos))
    -	}
    +	common.DoIfDebugEnabled(func() {
    +		if name != tableDef.Cols[colPos].Name {
    +			logutil.Error(
    +				"Bad-ColExpr",
    +				zap.String("col-name", name),
    +				zap.Int32("col-actual-pos", colPos),
    +				zap.Int32("col-expected-pos", pos),
    +				zap.String("col-expr", plan2.FormatExpr(expr)),
    +			)
    +		}
    +	})
     	return tableDef.Cols[pos]

    The function uses colPos for validation but returns tableDef.Cols[pos] instead
    of tableDef.Cols[colPos]. This inconsistency could lead to returning the wrong
    column definition when the position parameters differ.

    pkg/vm/engine/readutil/expr_util.go [72-75]

     if name != tableDef.Cols[colPos].Name {
         panic(fmt.Sprintf("Bad-ColExpr: %s, %d, %d", name, colPos, pos))
     }
    -return tableDef.Cols[pos]
    +return tableDef.Cols[colPos]

    [Suggestion processed]

    Suggestion importance[1-10]: 9

    __

    Why: The suggestion correctly identifies a logical bug where the function validates using colPos but returns a column based on pos, which could be different and lead to incorrect behavior.

    High
    Add bounds check for array access
    Suggestion Impact:The commit addressed the underlying issue by refactoring the code to use colPos calculated from column name lookup instead of directly using pkColId, eliminating the need for bounds checking

    code diff:

     	colName := col.Col.Name
    -	if colName == "" {
    -		colName = tblDef.Cols[pkColId].Name
    -	} else {
    -		if col.Col.ColPos != pkColId {
    -			panic(fmt.Sprintf("Bad-ColExpr: %s, %d, %d", colName, col.Col.ColPos, pkColId))
    -		}
    -	}
    +
    +	common.DoIfDebugEnabled(func() {
    +		if colName == "" {
    +			logutil.Error(
    +				"Bad-ColExpr",
    +				zap.String("col-name", colName),
    +				zap.String("pk-name", pkName),
    +				zap.String("col-expr", plan2.FormatExpr(expr)),
    +			)
    +		}
    +	})
     	if !compPkCol(colName, pkName) {
     		return false, 0, nil
     	}
     
    +	var (
    +		colPos int32
    +		idx    = strings.Index(colName, ".")
    +	)
    +	if idx == -1 {
    +		colPos = tblDef.Name2ColIndex[colName]
    +	} else {
    +		colPos = tblDef.Name2ColIndex[colName[idx+1:]]
    +	}
    +
    +	common.DoIfDebugEnabled(func() {
    +		if colPos != col.Col.ColPos {
    +			logutil.Error(
    +				"Bad-ColExpr",
    +				zap.String("col-name", colName),
    +				zap.Int32("col-actual-pos", col.Col.ColPos),
    +				zap.Int32("col-expected-pos", colPos),
    +				zap.String("col-expr", plan2.FormatExpr(expr)),
    +			)
    +		}
    +	})
    +
     	if isVec {
    -		return true, types.T(tblDef.Cols[pkColId].Typ.Id), [][]byte{val}
    -	}
    -	return true, types.T(tblDef.Cols[pkColId].Typ.Id), vals
    +		return true, types.T(tblDef.Cols[colPos].Typ.Id), [][]byte{val}
    +	}
    +	return true, types.T(tblDef.Cols[colPos].Typ.Id), vals

    The bounds check for pkColId is missing before accessing tblDef.Cols[pkColId].
    This could cause an index out of bounds panic if pkColId is invalid or exceeds
    the slice length.

    pkg/vm/engine/readutil/expr_util.go [107-113]

     if colName == "" {
    +    if pkColId < 0 || int(pkColId) >= len(tblDef.Cols) {
    +        panic(fmt.Sprintf("Invalid pkColId: %d", pkColId))
    +    }
         colName = tblDef.Cols[pkColId].Name
     } else {
         if col.Col.ColPos != pkColId {
             panic(fmt.Sprintf("Bad-ColExpr: %s, %d, %d", colName, col.Col.ColPos, pkColId))
         }
     }

    [Suggestion processed]

    Suggestion importance[1-10]: 8

    __

    Why: The suggestion correctly points out a missing bounds check before accessing tblDef.Cols[pkColId], which could prevent a potential panic from an index out-of-range error.

    Medium
    • Update

    @XuPeng-SH XuPeng-SH changed the title Expr col name pos match panic where the ColExpr's name and pos do not match during pushdown. Jul 2, 2025
    @XuPeng-SH XuPeng-SH changed the title panic where the ColExpr's name and pos do not match during pushdown. panic where the ColExpr's name and pos do not match during pushdown(must fix) Jul 2, 2025
    Copy link
    Contributor

    mergify bot commented Jul 3, 2025

    This pull request has been removed from the queue for the following reason: pull request manually updated.

    The pull request #22089 has been manually updated.

    If you want to requeue this pull request, you can post a @mergifyio requeue comment.

    @mergify mergify bot merged commit 331a330 into matrixorigin:main Jul 3, 2025
    18 of 19 checks passed
    Cabbage74 pushed a commit to Cabbage74/matrixone that referenced this pull request Jul 7, 2025
    If the ColPos and ColName in pushdown's ColExpr do not match, panic. If ColName is empty, preferentially use ColPos
    
    Approved by: @aunjgr, @gouhongshen
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    kind/bug Something isn't working Review effort 3/5 size/M Denotes a PR that changes [100,499] lines
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    4 participants