@gaoyia gaoyia commented Nov 12, 2025

Fix Array Out-of-Bounds and Optimize MD5 Algorithm Implementation

log-position:

    c = md5ff(c, d, a, b, x[i + 14], 17, -1502002290);
    console.log(x, i + 15);
    b = md5ff(b, c, d, a, x[i + 15], 22, 1236535329);  // ← out-of-bounds read: x has 15 elements, so x[15] is undefined

test-code:

const str = '123456';

const arr = new Uint8Array(str.length);
for (let i = 0; i < str.length; i++) {
  arr[i] = str.charCodeAt(i);
}

output:

Uint32Array(15) [
  875770417, 8402485,  0,
          0,       0,  0,
          0,       0,  0,
          0,       0,  0,
          0,       0, 48
] 15
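
The log shows a 15-word array being read at index 15, one past its end. In JavaScript a typed-array read past the end returns `undefined` instead of throwing, and bitwise operators coerce `undefined` to 0. The sketch below is not the library's code: `padToWords` is a hypothetical helper, written under the assumption that the message is padded to whole 16-word (512-bit) blocks as MD5 specifies, so that index `i + 15` always exists.

```javascript
// Hypothetical helper (not part of uuid): pack bytes into little-endian
// 32-bit words and apply MD5 padding, rounded up to whole 16-word blocks.
function padToWords(bytes) {
  const bitLength = bytes.length * 8;
  // message bits + 1 marker bit + 64 length bits, rounded up to 512-bit blocks
  const blocks = Math.ceil((bitLength + 1 + 64) / 512);
  const words = new Uint32Array(blocks * 16);

  for (let i = 0; i < bytes.length; i++) {
    words[i >> 2] |= bytes[i] << ((i % 4) * 8);
  }
  // Append the 0x80 marker byte directly after the message
  words[bytes.length >> 2] |= 0x80 << ((bytes.length % 4) * 8);
  // Low 32 bits of the bit length; the high word stays 0
  // (assumes the input is shorter than 2^32 bits)
  words[words.length - 2] = bitLength;
  return words;
}

const words = padToWords(new TextEncoder().encode('123456'));
console.log(words.length, words[0], words[1], words[14], words[15]);

// Reading past the end of a typed array does not throw
const short = new Uint32Array(15);
console.log(short[15], short[15] | 0);
```

With 16 words per block the first 15 values match the logged array, and the final word is an explicit 0 rather than an implicit `undefined`.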

gaoyia commented Nov 12, 2025

Node.js test code:

(AI-generated; I was too lazy to translate it 😝)

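// Tests
import crypto from 'node:crypto';
// NOTE: `md5` is the implementation under test and must be imported from the
// module being tested; the original snippet does not show its path.

// Helper: string → UTF-8 Uint8Array
const u8 = (str) => new TextEncoder().encode(str);

// Helper: Uint8Array → lowercase hex string. The original snippet calls this
// function without defining it; this definition is an assumed equivalent.
const uint8ArrayToHex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');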

// Build test inputs from 32-bit integers (simulating big-endian data)
function createBigEndianTestData() {
  // Returns test cases whose data is a Uint8Array holding 32-bit integers
  // written in big-endian byte order
  const testCases = [];

  // Test case 1: a single 32-bit integer
  const buffer1 = new ArrayBuffer(4);
  const view1 = new DataView(buffer1);
  view1.setUint32(0, 0x12345678, false); // big-endian
  testCases.push({
    name: '32-bit integer 0x12345678 (big-endian)',
    data: new Uint8Array(buffer1),
    description: 'single 32-bit integer, big-endian'
  });

  // Test case 2: multiple 32-bit integers
  const buffer2 = new ArrayBuffer(8);
  const view2 = new DataView(buffer2);
  view2.setUint32(0, 0x12345678, false); // big-endian
  view2.setUint32(4, 0x87654321, false); // big-endian
  testCases.push({
    name: 'multiple 32-bit integers (big-endian)',
    data: new Uint8Array(buffer2),
    description: 'multiple 32-bit integers, big-endian'
  });

  // Test case 3: edge case (empty array)
  testCases.push({
    name: 'empty array',
    data: new Uint8Array(0),
    description: 'empty-array edge case'
  });

  return testCases;
}

// Test case array; the `expect` values are generated dynamically
const testCases = [
  { in: '',               description: 'empty string' },
  { in: '123456',         description: 'numeric string' },
  { in: 'abc',            description: 'short alphabetic string' },
  { in: 'message digest', description: 'English phrase' },
  { in: 'a',              description: 'single character' },
  { in: 'abcdefghijklmnopqrstuvwxyz', description: 'alphabet' },
  { in: 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
    description: 'upper- and lowercase letters and digits' },
  { in: '12345678901234567890123456789012345678901234567890123456789012345678901234567890',
    description: 'long numeric string' },
  { in: '\u0000',         description: 'string containing a null character' },
  { in: 'a\u0000b',       description: 'null character in the middle' },
  { in: '\u0000\u0000',   description: 'multiple null characters' },
  { in: ' ',              description: 'space' },
  { in: '  ',             description: 'multiple spaces' },
  { in: '\t',             description: 'tab' },
  { in: '\n',             description: 'newline' },
  { in: '\r\n',           description: 'CRLF' },
  { in: '中文',           description: 'Chinese characters' },
  { in: '测试',           description: 'Chinese word for "test"' },
  { in: '你好世界',       description: 'Chinese sentence' },
  { in: '!@#$%^&*()',     description: 'special symbols' },
  { in: '😀',             description: 'emoji' },
];

// Generate the expected values dynamically (using Node.js crypto)
const allTestCases = testCases.map(testCase => ({
  ...testCase,
  expect: crypto.createHash('md5').update(testCase.in).digest('hex')
}));



console.log('=== MD5 implementation tests ===');
console.log(`${allTestCases.length} test cases in total\n`);

let pass = 0;
let fail = 0;
const failedTests = [];

allTestCases.forEach(({ in: str, expect, description }) => {
  // Node.js native implementation
  const nodeHex = crypto.createHash('md5').update(str).digest('hex');

  // Our implementation
  const u8Arr = u8(str);
  const yourBytes = md5(u8Arr);
  const yourHex = uint8ArrayToHex(yourBytes);

  // Compare the results
  const ok = nodeHex === yourHex && yourHex === expect;

  if (ok) {
    pass++;
    console.log(`✅ ${description}: "${str}"`);
    console.log(`   Expected: ${expect}`);
    console.log(`   Actual:   ${yourHex}`);
  } else {
    fail++;
    failedTests.push({ description, str, nodeHex, yourHex, expect });
    console.log(`❌ ${description}: "${str}"`);
    console.log(`   Node.js:  ${nodeHex}`);
    console.log(`   Ours:     ${yourHex}`);
    console.log(`   Expected: ${expect}`);
  }
  console.log('');
});

console.log('=== Test summary ===');
console.log(`Total:  ${allTestCases.length} test cases`);
console.log(`Passed: ${pass}`);
console.log(`Failed: ${fail}`);

if (fail === 0) {
  console.log('🎉 All test cases passed! The MD5 implementation is correct!');
} else {
  console.log('\n❌ Failed test cases:');
  failedTests.forEach(test => {
    console.log(`\n${test.description}: "${test.str}"`);
    console.log(`  Node.js:  ${test.nodeHex}`);
    console.log(`  Ours:     ${test.yourHex}`);
    console.log(`  Expected: ${test.expect}`);
  });
}

// Empty-string edge case
console.log('\n=== Empty string: detailed test ===');
const emptyInput = u8('');
const emptyResult = md5(emptyInput);
const emptyHex = uint8ArrayToHex(emptyResult);
const emptyNodeHex = crypto.createHash('md5').update('').digest('hex');
console.log(`Empty-string MD5: ${emptyHex}`);
console.log(`Node.js result:   ${emptyNodeHex}`);
console.log(`Match:            ${emptyHex === emptyNodeHex}`);

// Big-endian data tests
console.log('\n=== Big-endian data tests ===');
const bigEndianTestCases = createBigEndianTestData();
let bigEndianPass = 0;
let bigEndianFail = 0;

bigEndianTestCases.forEach(testCase => {
  const nodeHex = crypto.createHash('md5').update(testCase.data).digest('hex');
  const yourHex = uint8ArrayToHex(md5(testCase.data));

  const ok = nodeHex === yourHex;

  if (ok) {
    bigEndianPass++;
    console.log(`✅ ${testCase.name}`);
    console.log(`   MD5: ${yourHex}`);
  } else {
    bigEndianFail++;
    console.log(`❌ ${testCase.name}`);
    console.log(`   Node.js: ${nodeHex}`);
    console.log(`   Ours:    ${yourHex}`);
  }
  console.log('');
});

console.log('=== Big-endian test results ===');
console.log(`Passed: ${bigEndianPass}/${bigEndianTestCases.length}`);
if (bigEndianFail === 0) {
  console.log('🎉 All big-endian test cases passed!');
}
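
One caveat about the harness above: `expect` is produced by the same `crypto.createHash('md5')` call that yields `nodeHex`, so the `yourHex === expect` check adds nothing beyond the Node.js comparison. A minimal sketch of pinning a few fixed vectors from the RFC 1321 test suite instead follows. `checkVectors` and its `hashFn` parameter are hypothetical names, and Node's own MD5 stands in for the implementation under test.

```javascript
// CommonJS form; the ESM equivalent is: import crypto from 'node:crypto'
const crypto = require('node:crypto');

// Known-answer vectors from the RFC 1321 test suite
const RFC1321_VECTORS = [
  ['', 'd41d8cd98f00b204e9800998ecf8427e'],
  ['a', '0cc175b9c0f1b6a831c399e269772661'],
  ['abc', '900150983cd24fb0d6963f7d28e17f72'],
  ['message digest', 'f96b697d7cb7938d525a2f31aaf161d0'],
];

// Hypothetical helper: runs hashFn (Uint8Array -> hex string) over each
// vector and returns the inputs whose digest did not match.
function checkVectors(hashFn) {
  const encoder = new TextEncoder();
  return RFC1321_VECTORS
    .filter(([input, expected]) => hashFn(encoder.encode(input)) !== expected)
    .map(([input]) => input);
}

// Stand-in for the implementation under test (assumption: swap in the real
// md5 plus a hex conversion when running against the PR's code)
const nodeMd5Hex = (bytes) =>
  crypto.createHash('md5').update(bytes).digest('hex');

const failures = checkVectors(nodeMd5Hex);
console.log(failures.length === 0 ? 'all vectors match' : failures);
```

Hard-coded vectors keep the expected values independent of any implementation being exercised in the same run, while the comparison against `node:crypto` still covers the longer and non-ASCII inputs.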

broofa commented Nov 12, 2025

Thanks for taking the time to put this together. I'm currently disinclined to take this for the following reasons:

Regarding the out-of-bounds issue: Please create an issue with an MRE (e.g. a codesandbox or stackblitz project) that shows how calling into the public uuid API - i.e. the v3() method - will trigger this out-of-bounds issue. (If the issue can't be triggered that way, I don't consider it a bug.)

Regarding performance: Do you have a use case where v3() performance is a bottleneck? If so, I'd love to hear it. Otherwise I don't consider performance a priority for this project.

@gaoyia gaoyia closed this Nov 12, 2025