Skip to content

Conversation

gouhongshen
Copy link
Contributor

@gouhongshen gouhongshen commented Jun 20, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue ##21977 #21979 #21978

What this PR does / why we need it:

Create a database/table with a shallow copy and replace the insert into select in the restore statement.

only copy the metadata.

clone db/table within a account: 
	create table db2.t2 clone db1.t1 {snapshot}(optional);
	create database db2 clone db1 {snapshot}(optional);


clone db/table across acocunts:
	only sys has the privilege
	1. the source data come from another normal account
		1. sys create a snapshot for the account acc1
		2. sys clone the db/table from the snapshot to the account acc2
			create database db2 clone db1 {snapshot} to acc2; (the snapshot is NOT an option)
			
	2. the source data come from the sys
		create database db2 clone db1 {snapshot}(optional) to acc2;

PR Type

Enhancement


Description

• Implement table clone functionality for optimized restore operations
• Replace insert-select with shallow copy for metadata-only operations
• Add new SQL syntax "CREATE TABLE ... CLONE ..."
• Integrate clone operations with snapshot restore workflows


Changes walkthrough 📝

Relevant files
Enhancement
17 files
table_meta_reader.go
Add table metadata reader for clone operations                     
+439/-0 
table_clone.go
Implement core table clone operator logic                               
+274/-0 
snapshot.go
Update restore logic to use clone syntax                                 
+33/-14 
build_table_clone.go
Add query plan builder for clone operations                           
+173/-0 
compile.go
Add compilation support for table clone                                   
+46/-3   
types.go
Optimize transaction write ordering logic                               
+44/-10 
back_exec.go
Add restore detection for clone statements                             
+21/-7   
s3util.go
Extract object stats expansion utility function                   
+36/-17 
build_ddl.go
Support snapshot context in create table                                 
+15/-5   
operator.go
Add table clone operator constructor                                         
+21/-1   
clone.go
Define AST node for clone syntax                                                 
+64/-0   
types.go
Define table clone operator types                                               
+59/-0   
pitr.go
Update PITR to support clone operations                                   
+9/-7     
stmt_kind.go
Add clone statement transaction handling                                 
+8/-1     
snapshot_restore_with_ts.go
Update timestamp restore with clone support                           
+6/-4     
authenticate.go
Add privilege checks for clone operations                               
+4/-0     
query_builder.go
Include table clone in column reference mapping                   
+2/-1     
Miscellaneous
2 files
txn_table.go
Rename function for better clarity                                             
+4/-4     
utils.go
Add debug helper for database validation                                 
+7/-1     
Bug fix
1 files
build_show.go
Fix SQL parsing parameter count                                                   
+1/-1     
Additional files
16 files
mysql_cmd_executor.go +6/-1     
plan.pb.go +804/-797
ddl.go +12/-0   
types.go +1/-0     
keywords.go +1/-0     
mysql_sql.go +8138/-8110
mysql_sql.y +13/-1   
create.go +10/-0   
build.go +2/-0     
types_mock_test.go +1/-1     
types.go +4/-0     
blocks_iter.go +0/-1     
stats.go +1/-1     
txn.go +2/-0     
reader_util.go +13/-0   
plan.proto +4/-0     

Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • Copy link

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🎫 Ticket compliance analysis ✅

    21977 - PR Code Verified

    Compliant requirements:

    • Implement shallow copy for metadata-only operations
    • Replace insert-select operations in restore statements

    Requires further human verification:

    • Optimize restore operations with no copy functionality (requires performance testing)

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Resource Management

    The TableMetaReader has multiple defer cleanup functions but some resources like iter and s3Writer may not be properly closed in all error paths. The collectVisibleInMemRows function has complex resource management that should be verified.

    defer func() {
    	if iter != nil {
    		iter.Close()
    	}
    
    	if rowsBatch != nil {
    		rowsBatch.Clean(mp)
    	}
    
    	if s3Writer != nil {
    		s3Writer.Close(mp)
    	}
    }()
    Error Handling

    The checkObjStatsFmt function returns an internal error for objects that are appendable or not CN created, but this validation logic may be too strict and could cause legitimate clone operations to fail.

    stats := objectio.ObjectStats(col[i].GetByteSlice(area))
    if stats.GetAppendable() || !stats.GetCNCreated() {
    	return moerr.NewInternalErrorNoCtxf("object fmt wrong: %s", stats.FlagString())
    }
    Logic Flow

    The recreateTable function has complex conditional logic for handling restore operations with clone vs insert-select. The error handling for missing tables could mask legitimate errors and the flow between different restore modes needs validation.

    if !isRestoreByCloneSql.MatchString(restoreTableDataByTsFmt) {
    	// create table
    	getLogger(sid).Info(fmt.Sprintf("[%s] start to create table: %v, create table sql: %s", snapshotName, tblInfo.tblName, tblInfo.createSql))
    	if err = bh.Exec(ctx, tblInfo.createSql); err != nil {
    		if strings.Contains(err.Error(), "no such table") {
    			getLogger(sid).Info(fmt.Sprintf("[%s] foreign key table %v referenced table not exists, skip restore", snapshotName, tblInfo.tblName))
    			err = nil
    		}
    		return
    	}
    }
    
    if curAccountId == toAccountId {
    	// insert data
    	insertIntoSql := fmt.Sprintf(restoreTableDataByTsFmt, tblInfo.dbName, tblInfo.tblName, tblInfo.dbName, tblInfo.tblName, snapshotTs)
    	beginTime := time.Now()
    	getLogger(sid).Info(fmt.Sprintf("[%s] start to insert select table: %v, insert sql: %s", snapshotName, tblInfo.tblName, insertIntoSql))
    	if err = bh.Exec(ctx, insertIntoSql); err != nil {
    		if moerr.IsMoErrCode(err, moerr.ErrNoSuchTable) && !strings.Contains(err.Error(), tblInfo.tblName) {
    			err = nil
    		} else {
    			return err
    		}
    	}

    Copy link

    qodo-merge-pro bot commented Jun 20, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    General
    Remove leftover debugging code
    Suggestion Impact:The suggestion was directly implemented - the debugging defer block with unused variables was completely removed and the function signature was cleaned up

    code diff:

    -func databaseIsValid(dbName string, ctx CompilerContext, snapshot *Snapshot) (x string, err error) {
    -	defer func() {
    -		if err != nil {
    -			p := 0
    -			p++
    -		}
    -	}()
    +func databaseIsValid(dbName string, ctx CompilerContext, snapshot *Snapshot) (string, error) {

    The defer function contains debugging code that serves no purpose and should be
    removed. The variables p is declared but never used meaningfully, indicating
    leftover debug code that can cause confusion.

    pkg/sql/plan/utils.go [1990-2000]

    -func databaseIsValid(dbName string, ctx CompilerContext, snapshot *Snapshot) (x string, err error) {
    -    defer func() {
    -        if err != nil {
    -            p := 0
    -            p++
    -        }
    -    }()
    +func databaseIsValid(dbName string, ctx CompilerContext, snapshot *Snapshot) (string, error) {
         connectDBFirst := false
         if len(dbName) == 0 {
             connectDBFirst = true
         }

    [Suggestion processed]

    Suggestion importance[1-10]: 7

    __

    Why: The suggestion correctly identifies and proposes the removal of leftover debugging code. This improves code clarity and maintainability by removing a non-functional and potentially confusing defer block.

    Medium
    Fix function naming and bounds checking

    The function name checkObjStatsFmt is misleading as it suggests formatting but
    actually validates object statistics. Consider renaming to validateObjectStats
    for clarity. Also, add bounds checking before accessing bat.Vecs[idx] to prevent
    potential panic.

    pkg/sql/colexec/table_clone/table_clone.go [168-189]

    -checkObjStatsFmt := func(bat *batch.Batch, isTombstone bool) error {
    +validateObjectStats := func(bat *batch.Batch, isTombstone bool) error {
         idx := 0
         if !isTombstone {
             idx = 1
    +    }
    +
    +    if idx >= len(bat.Vecs) {
    +        return moerr.NewInternalErrorNoCtxf("invalid vector index: %d", idx)
         }
     
         col, area := vector.MustVarlenaRawData(bat.Vecs[idx])
         for i := range col {
             stats := objectio.ObjectStats(col[i].GetByteSlice(area))
             if stats.GetAppendable() || !stats.GetCNCreated() {
                 return moerr.NewInternalErrorNoCtxf("object fmt wrong: %s", stats.FlagString())
             }
         }
     
         return nil
     }
    • Apply / Chat
    Suggestion importance[1-10]: 6

    __

    Why: The suggestion correctly identifies a misleading function name and proposes a clearer one. It also adds a bounds check which, while not strictly necessary with the current implementation, is good defensive programming that improves code robustness.

    Low
    • Update

    @XuPeng-SH XuPeng-SH requested a review from jiangxinmeng1 July 11, 2025 07:30
    @mergify mergify bot merged commit ac23313 into matrixorigin:main Jul 14, 2025
    19 checks passed
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    kind/enhancement Review effort 4/5 size/XXL Denotes a PR that changes 2000+ lines
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    9 participants