Merged
31 changes: 31 additions & 0 deletions apps/workers/workers/feedWorker.ts
@@ -14,6 +14,21 @@ import logger from "@karakeep/shared/logger";
import { DequeuedJob, getQueueClient } from "@karakeep/shared/queueing";
import { BookmarkTypes } from "@karakeep/shared/types/bookmarks";

/**
* Deterministically maps a feed ID to a minute offset within the hour (0-59).
* This ensures feeds are spread evenly across the hour based on their ID.
*/
function getFeedMinuteOffset(feedId: string): number {
// Simple string hash: hash = hash * 31 + character code
let hash = 0;
for (let i = 0; i < feedId.length; i++) {
hash = (hash << 5) - hash + feedId.charCodeAt(i);
hash = hash & hash; // Convert to 32-bit integer
}
// Return a minute offset between 0 and 59
return Math.abs(hash) % 60;
}
Comment on lines +21 to +30

⚠️ Potential issue | 🟡 Minor

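Bitwise operation on line 26 reads like a no-op.

The expression hash = hash & hash; does truncate hash to a signed 32-bit integer, because JavaScript bitwise operators coerce their operands to 32-bit integers, but the intent is hard to see. To convert to a 32-bit integer as the comment indicates, use the idiomatic hash = hash | 0; instead. Note that hash = hash >>> 0; yields an unsigned value and can change the resulting minute offsets.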

Apply this diff:

-    hash = hash & hash; // Convert to 32-bit integer
+    hash = hash | 0; // Convert to 32-bit integer
🤖 Prompt for AI Agents
In apps/workers/workers/feedWorker.ts around lines 21 to 30, the bitwise
operation `hash = hash & hash;` coerces to a signed 32-bit integer but reads
like a no-op; replace that line with the idiomatic coercion `hash = hash | 0;`.
Use `hash = hash >>> 0;` only if an unsigned 32-bit value is intended, since it
can change the result of the absolute value and modulo 60.
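
A minimal sketch of the suggested coercion, for checking the behavior locally. The function name `feedMinuteOffset` is hypothetical and not part of the PR; it mirrors `getFeedMinuteOffset` with `| 0` applied in place. The constants below it show what each bitwise form does to a value outside the signed 32-bit range.

```typescript
// Hypothetical stand-in for getFeedMinuteOffset, using `| 0` as the
// explicit truncation to a signed 32-bit integer.
function feedMinuteOffset(feedId: string): number {
  let hash = 0;
  for (let i = 0; i < feedId.length; i++) {
    // (hash << 5) - hash equals hash * 31; `| 0` wraps the sum to int32.
    hash = ((hash << 5) - hash + feedId.charCodeAt(i)) | 0;
  }
  // Map the (possibly negative) hash to a minute in the range 0-59.
  return Math.abs(hash) % 60;
}

// A value just past the 32-bit range, to compare the coercions.
const overflowed = 4294967296 + 5;
const viaAnd = overflowed & overflowed; // wraps to signed 32-bit
const viaOr = overflowed | 0; // wraps to signed 32-bit
const viaUnsignedShift = -1 >>> 0; // reinterprets as unsigned 32-bit
```

Both `&` and `|` apply the same signed truncation, so switching to `| 0` is a readability change rather than a behavioral one; `>>> 0` produces a different (unsigned) value for negative hashes.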


export const FeedRefreshingWorker = cron.schedule(
"0 * * * *",
() => {
@@ -30,16 +45,32 @@ export const FeedRefreshingWorker = cron.schedule(
const currentHour = new Date();
currentHour.setMinutes(0, 0, 0);
const hourlyWindow = currentHour.toISOString();
const now = new Date();
const currentMinute = now.getMinutes();

for (const feed of feeds) {
const idempotencyKey = `${feed.id}-${hourlyWindow}`;
const targetMinute = getFeedMinuteOffset(feed.id);

// Calculate delay: if target minute has passed, schedule for next hour
let delayMinutes = targetMinute - currentMinute;
if (delayMinutes < 0) {
delayMinutes += 60;
}
const delayMs = delayMinutes * 60 * 1000;

logger.debug(
`[feed] Scheduling feed ${feed.id} at minute ${targetMinute} (delay: ${delayMinutes} minutes)`,
);

FeedQueue.enqueue(
{
feedId: feed.id,
},
{
idempotencyKey,
groupId: feed.userId,
delayMs,
},
Comment on lines +48 to 74

⚠️ Potential issue | 🔴 Critical

Critical: Idempotency key can allow duplicate jobs across hour boundaries.

When the cron job runs late and a feed's target minute has already passed in the current hour, the feed is scheduled for the next hour (lines 56-59). However, the idempotencyKey is always based on hourlyWindow (the current hour, line 52), not the actual hour when the feed will execute.

Scenario:

  1. Cron runs late at 14:02 (instead of 14:00)
  2. Feed with targetMinute = 1 gets scheduled for 15:01 with idempotencyKey = "feedId-2025-12-06T14:00:00Z"
  3. At 15:00, cron runs again and schedules the same feed for 15:01 with idempotencyKey = "feedId-2025-12-06T15:00:00Z"
  4. Result: Two jobs run at 15:01 for the same feed

Fix: Base the idempotency key on the actual target hour, not the current hour:

 const now = new Date();
 const currentMinute = now.getMinutes();

 for (const feed of feeds) {
-  const idempotencyKey = `${feed.id}-${hourlyWindow}`;
   const targetMinute = getFeedMinuteOffset(feed.id);

   // Calculate delay: if target minute has passed, schedule for next hour
   let delayMinutes = targetMinute - currentMinute;
   if (delayMinutes < 0) {
     delayMinutes += 60;
   }
   const delayMs = delayMinutes * 60 * 1000;

+  // Base idempotency key on the actual hour the feed will run
+  const targetHour = new Date(now.getTime() + delayMs);
+  targetHour.setMinutes(0, 0, 0);
+  const targetHourlyWindow = targetHour.toISOString();
+  const idempotencyKey = `${feed.id}-${targetHourlyWindow}`;
+
   logger.debug(
     `[feed] Scheduling feed ${feed.id} at minute ${targetMinute} (delay: ${delayMinutes} minutes)`,
   );
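
The scenario above can be replayed with a small standalone sketch of the proposed key derivation. The helper `planFeedRun` and the `FeedSchedule` type are hypothetical names introduced for illustration. One deliberate difference from the PR: the sketch uses the UTC accessors (`getUTCMinutes`, `setUTCMinutes`) so that the result does not depend on the machine's timezone, whereas the PR code uses the local-time `getMinutes` and `setMinutes`.

```typescript
// Hypothetical result type for the sketch; not part of the PR.
interface FeedSchedule {
  delayMs: number;
  idempotencyKey: string;
}

// Hypothetical helper: computes the delay to the feed's target minute and
// derives the idempotency key from the hour in which the job will run.
function planFeedRun(
  feedId: string,
  targetMinute: number,
  now: Date,
): FeedSchedule {
  // If the target minute has already passed, wait until the next hour.
  let delayMinutes = targetMinute - now.getUTCMinutes();
  if (delayMinutes < 0) {
    delayMinutes += 60;
  }
  const delayMs = delayMinutes * 60 * 1000;

  // Truncate the execution time (not the current time) to the hour.
  const targetHour = new Date(now.getTime() + delayMs);
  targetHour.setUTCMinutes(0, 0, 0);

  return {
    delayMs,
    idempotencyKey: `${feedId}-${targetHour.toISOString()}`,
  };
}

// Late cron run at 14:02 and the on-time run at 15:00, same feed.
const lateRun = planFeedRun("feedId", 1, new Date("2025-12-06T14:02:00Z"));
const onTimeRun = planFeedRun("feedId", 1, new Date("2025-12-06T15:00:00Z"));
const keysMatch = lateRun.idempotencyKey === onTimeRun.idempotencyKey;
```

With the key tied to the execution hour, both runs target 15:01 and produce the same key, so a queue that deduplicates on `idempotencyKey` would be able to drop the second enqueue.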
🤖 Prompt for AI Agents
In apps/workers/workers/feedWorker.ts around lines 48-74, the idempotencyKey is
built using the current hourlyWindow which can differ from the actual execution
hour when a feed is pushed to the next hour; compute the actual execution time
(e.g. executionTime = new Date(now.getTime() + delayMs)) or derive the
targetHour by adding one hour when delayMinutes < 0, then build the
idempotencyKey from that execution hour (formatted the same way as hourlyWindow)
instead of the current hourlyWindow so jobs scheduled across an hour boundary
share a stable, correct idempotency key.

);
}