-
Notifications
You must be signed in to change notification settings - Fork 49
Add S3 backend storage and update docs #996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add S3 backend storage and update docs #996
Conversation
Co-authored-by: adiswa123 <[email protected]>
Cursor Agent can help with this pull request. Just |
|
WalkthroughAdds S3 as a new storage backend: documentation, examples, and configuration updated; Env type extended; new S3Storage implementation introduced; storage selection updated to prioritize S3 > KV > R2; tests added for S3Storage and selection logic; baseline wrangler comments expanded. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Turborepo Client
participant Worker as Cloudflare Worker
participant Manager as StorageManager
participant S3 as S3Storage
participant KV as KVStorage
participant R2 as R2Storage
Note over Manager: Initialization
Worker->>Manager: new StorageManager(env)
alt S3 credentials present
Manager->>S3: init(accessKeyId, secret, bucket, region)
Note right of Manager: Active storage = S3
else KV configured
Manager->>KV: init(binding)
Note right of Manager: Active storage = KV
else R2 configured
Manager->>R2: init(binding)
Note right of Manager: Active storage = R2
else
Manager-->>Worker: throw "No storage configured"
end
Note over Client,Worker: Example: read artifact
Client->>Worker: GET /artifact/:key
Worker->>Manager: getActiveStorage()
alt S3 active
Worker->>S3: readWithMetadata(key)
S3-->>Worker: { data, metadata }
else KV active
Worker->>KV: readWithMetadata(key)
KV-->>Worker: { data, metadata }
else R2 active
Worker->>R2: readWithMetadata(key)
R2-->>Worker: { data, metadata }
end
Worker-->>Client: 200 data (headers may include metadata)
rect rgba(230,245,255,0.5)
Note over Worker,S3: New/changed flow emphasizes S3 precedence
end
sequenceDiagram
participant Worker as Cloudflare Worker
participant S3 as S3Storage
participant AWS as Amazon S3 API
Note over Worker,S3: Write with optional user metadata
Worker->>S3: write(key, data, { buildId, team })
S3->>AWS: PUT /{bucket}/{key}\nHeaders: x-amz-meta-buildid, x-amz-meta-team
AWS-->>S3: 200 OK
S3-->>Worker: void
Note over Worker,S3: List with pagination
Worker->>S3: listWithMetadata({ prefix, limit, cursor })
S3->>AWS: GET /{bucket}?list-type=2&prefix=...&max-keys=...&continuation-token=...
AWS-->>S3: 200 OK + XML (Contents, NextContinuationToken)
S3-->>Worker: { keys: [{key, metadata}], cursor, truncated }
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
🌩 Deploying with Cloudflare Pages
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (17)
docs/index.md (1)
26-28
: Wording update looks good; keep storage list consistent across docs"R2, KV, or S3" reads clearly and aligns with the new backend. No further changes needed here.
If you want absolute consistency with other pages that lead with S3 (due to precedence elsewhere), consider "S3, KV, or R2" – purely stylistic.
src/index.ts (1)
9-12
: Env type extension for S3 is correct; add brief field docs to prevent misuseThe optional fields are typed appropriately. To reduce developer confusion between secrets vs non-secret config, add inline docs noting that credentials must be set via Wrangler secrets, while bucket/region can be plain vars.
Apply this diff to annotate the fields:
export type Env = { ENVIRONMENT: 'development' | 'production'; R2_STORE?: R2Bucket; KV_STORE?: KVNamespace; - S3_ACCESS_KEY_ID?: string; - S3_SECRET_ACCESS_KEY?: string; - S3_BUCKET_NAME?: string; - S3_REGION?: string; + /** Secret: set via `wrangler secret put S3_ACCESS_KEY_ID` */ + S3_ACCESS_KEY_ID?: string; + /** Secret: set via `wrangler secret put S3_SECRET_ACCESS_KEY` */ + S3_SECRET_ACCESS_KEY?: string; + /** Non-secret config: may be set in vars or secrets */ + S3_BUCKET_NAME?: string; + /** Non-secret config: may be set in vars or secrets; defaults to 'us-east-1' if omitted */ + S3_REGION?: string; TURBO_TOKEN: string; BUCKET_OBJECT_EXPIRATION_HOURS: number; STORAGE_MANAGER: StorageManager; };docs/configuration/project-configuration.md (2)
11-14
: Minor copy edits for bullets (punctuation and parallelism)Tighten the bullets with terminal periods and consistent phrasing.
Apply this diff:
-- **🪣 [R2 Storage](/configuration/r2-storage)**: Cloudflare's object storage with zero egress fees -- **🔑 [KV Storage](/configuration/kv-storage)**: Cloudflare's key-value storage with global distribution -- **☁️ [S3 Storage](/configuration/s3-storage)**: Amazon S3 for maximum compatibility and flexibility +- **🪣 [R2 Storage](/configuration/r2-storage)**: Cloudflare's object storage with zero egress fees. +- **🔑 [KV Storage](/configuration/kv-storage)**: Cloudflare's key-value storage with global distribution. +- **☁️ [S3 Storage](/configuration/s3-storage)**: Amazon S3 for maximum compatibility and flexibility.
15-16
: Clarify precedence scope (selection, not failover/replication)Explicitly state that only one backend is active at a time and there’s no automatic cross-backend failover or replication.
Apply this diff:
-The storage priority order is: S3 > KV > R2. When multiple storage options are configured, the highest priority one will be used. +The storage priority order is: S3 > KV > R2. When multiple storage options are configured, only the highest-priority backend is used for all operations (no automatic cross-backend replication or failover).README.md (2)
30-31
: Minor wording polish to avoid repetitionStreamline the sentence to read more crisply.
Apply this diff:
-- 💿 **Storage Options**: Choose between 🪣 [R2](https://adirishi.github.io/turborepo-remote-cache-cloudflare/configuration/r2-storage), 🔑 [KV](https://adirishi.github.io/turborepo-remote-cache-cloudflare/configuration/kv-storage), or ☁️ [S3](https://adirishi.github.io/turborepo-remote-cache-cloudflare/configuration/s3-storage) storage for your build artifacts. This gives you the flexibility to choose the storage option that best fits your needs. +- 💿 **Storage Options**: Choose between 🪣 [R2](https://adirishi.github.io/turborepo-remote-cache-cloudflare/configuration/r2-storage), 🔑 [KV](https://adirishi.github.io/turborepo-remote-cache-cloudflare/configuration/kv-storage), or ☁️ [S3](https://adirishi.github.io/turborepo-remote-cache-cloudflare/configuration/s3-storage) for your build artifacts, providing the flexibility to pick what best fits your needs.
73-79
: README still mentions “bound R2 bucket” for deletion; update to “configured storage backend”Elsewhere (docs/configuration/project-configuration.md) you’ve generalized deletion to the active backend. Mirror that here for consistency.
Apply this diff:
-This project sets up a [cron trigger](https://developers.cloudflare.com/workers/platform/triggers/cron-triggers/) for Cloudflare workers, which automatically deletes old cache files within the bound R2 bucket. This behavior can be customized: +This project sets up a [cron trigger](https://developers.cloudflare.com/workers/platform/triggers/cron-triggers/) for Cloudflare Workers, which automatically deletes old cache files within the configured storage backend. This behavior can be customized:wrangler.jsonc (1)
13-17
: Differentiate secrets vs non-secret vars in comments to guide usersCredentials should be set as Wrangler secrets, while bucket name and region can be plain vars. Adjust the comments to make this explicit.
Apply this diff:
- // - S3_ACCESS_KEY_ID (AWS access key ID for S3 storage) - // - S3_SECRET_ACCESS_KEY (AWS secret access key for S3 storage) - // - S3_BUCKET_NAME (S3 bucket name for storage) - // - S3_REGION (AWS region for S3 bucket, defaults to us-east-1) - // Run `echo <VALUE> | wrangler secret put <NAME>` for each of these + // Secrets (set via `echo <VALUE> | wrangler secret put <NAME>`): + // - S3_ACCESS_KEY_ID (AWS access key ID for S3 storage) + // - S3_SECRET_ACCESS_KEY (AWS secret access key for S3 storage) + // + // Non-secret config (can be set here in "vars" or via secrets if preferred): + // - S3_BUCKET_NAME (S3 bucket name for storage) + // - S3_REGION (AWS region for the S3 bucket; defaults to us-east-1)docs/configuration/s3-storage.md (2)
84-87
: Clarify that R2/KV can remain configured; S3 already takes precedence.Commenting out R2/KV is optional because the code prioritizes S3 > KV > R2. This change avoids implying that other bindings must be removed.
- // Comment out R2 and KV configurations when using S3 + // Optional: You can keep R2 and KV configured; S3 takes precedence automatically. + // Remove these bindings only if you don't want to provision them for this Worker. // "r2_buckets": [...], // "kv_namespaces": [...]
99-99
: Capitalize “GitHub Actions”.-Once you've updated your Worker script and `wrangler.jsonc` file, deploy your Worker using the Wrangler CLI or your GitHub actions workflow. +Once you've updated your Worker script and `wrangler.jsonc` file, deploy your Worker using the Wrangler CLI or your GitHub Actions workflow.examples/README.md (1)
20-26
: Note the default for S3_REGION to reduce confusion.Call out that S3_REGION is optional and defaults to us-east-1, matching the implementation and docs.
For S3 storage, you'll need to set these secrets: ```bash echo "your-access-key-id" | wrangler secret put S3_ACCESS_KEY_ID echo "your-secret-access-key" | wrangler secret put S3_SECRET_ACCESS_KEY
+Note: S3_REGION is optional; it defaults to "us-east-1" if not provided.
</blockquote></details> <details> <summary>examples/s3-config.jsonc (1)</summary><blockquote> `10-15`: **Add secrets reminder and quote TTL in examples/s3-config.jsonc** Wrangler’s `vars` values are always treated as strings, and sensitive keys should never live in source. • File: examples/s3-config.jsonc (around lines 10–15) • Wrap `BUCKET_OBJECT_EXPIRATION_HOURS` in quotes to satisfy the schema • Add a JSONC comment directing users to use `wrangler secret put` for their S3 credentials Suggested diff: ```diff - "vars": { + // Set S3 credentials via Wrangler secrets; do not include them in "vars": + // wrangler secret put S3_ACCESS_KEY_ID + // wrangler secret put S3_SECRET_ACCESS_KEY + "vars": { "ENVIRONMENT": "production", - "BUCKET_OBJECT_EXPIRATION_HOURS": 720, + "BUCKET_OBJECT_EXPIRATION_HOURS": "720", "S3_BUCKET_NAME": "your-turborepo-cache-bucket", "S3_REGION": "us-east-1", },
tests/storage/storage-manager.test.ts (1)
19-23
: Make tests resilient by explicitly clearing all S3 fields where S3 should not be chosen.Future default envs could set S3 creds, which would break these tests. Clear all S3 fields in the scenarios that should not pick S3.
test('getActiveStorage() returns r2 if available', () => { - const r2OnlyEnv = { ...workerEnv, KV_STORE: undefined }; + const r2OnlyEnv = { + ...workerEnv, + KV_STORE: undefined, + S3_ACCESS_KEY_ID: undefined, + S3_SECRET_ACCESS_KEY: undefined, + S3_BUCKET_NAME: undefined, + }; storageManager = new StorageManager(r2OnlyEnv); expect(storageManager.getActiveStorage()).toBeInstanceOf(R2Storage); }); test('getActiveStorage() returns kv if r2 is not available', () => { - const kvOnlyEnv = { ...workerEnv, R2_STORE: undefined }; + const kvOnlyEnv = { + ...workerEnv, + R2_STORE: undefined, + S3_ACCESS_KEY_ID: undefined, + S3_SECRET_ACCESS_KEY: undefined, + S3_BUCKET_NAME: undefined, + }; storageManager = new StorageManager(kvOnlyEnv); expect(storageManager.getActiveStorage()).toBeInstanceOf(KvStorage); }); test('getActiveStorage() throws if no storage is available', () => { const emptyEnv = { ...workerEnv, R2_STORE: undefined, KV_STORE: undefined, S3_ACCESS_KEY_ID: undefined, + S3_SECRET_ACCESS_KEY: undefined, + S3_BUCKET_NAME: undefined, }; expect(() => new StorageManager(emptyEnv)).toThrowError('No storage provided'); }); test('getActiveStorage() returns kv if both r2 and kv are available', () => { - const r2KvEnv = { ...workerEnv, S3_ACCESS_KEY_ID: undefined }; + const r2KvEnv = { + ...workerEnv, + S3_ACCESS_KEY_ID: undefined, + S3_SECRET_ACCESS_KEY: undefined, + S3_BUCKET_NAME: undefined, + }; storageManager = new StorageManager(r2KvEnv); expect(storageManager.getActiveStorage()).toBeInstanceOf(KvStorage); });Also applies to: 25-29, 31-38, 41-45
tests/storage/s3-storage.test.ts (1)
253-262
: Add coverage for keys containing slashes and special characters.S3 keys commonly include path-like prefixes. Add tests to ensure URLs don’t encode “/” within keys and that spaces/special chars are encoded correctly. This will prevent regressions around path encoding.
test('writes data with key containing slashes and spaces', async () => { mockFetch.mockResolvedValueOnce({ ok: true, status: 200 }); await storage.write('foo/bar baz.txt', 'v'); expect(mockFetch).toHaveBeenCalledWith( 'https://test-bucket.s3.us-east-1.amazonaws.com/foo/bar%20baz.txt', expect.objectContaining({ method: 'PUT' } as RequestInit), ); }); test('reads data with key containing slashes', async () => { mockFetch.mockResolvedValueOnce({ ok: true, status: 200, body: new ReadableStream({ start(c) { c.enqueue(new TextEncoder().encode('ok')); c.close(); } }), headers: new Headers({ 'last-modified': 'Wed, 01 Jan 2024 00:00:00 GMT' }), }); await storage.read('a/b/c'); expect(mockFetch).toHaveBeenCalledWith( 'https://test-bucket.s3.us-east-1.amazonaws.com/a/b/c', expect.objectContaining({ method: 'GET' } as RequestInit), ); });Also applies to: 268-279
src/storage/s3-storage.ts (4)
36-41
: Nit: avoid trailing ampersand in ListObjectsV2 URL when no params.Current construction can yield …/?list-type=2&. Not harmful, but easy to tidy.
- const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/?list-type=2&${params.toString()}`; + const qs = params.toString(); + const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/?list-type=2${qs ? `&${qs}` : ''}`;
145-162
: Metadata extraction is sensible; consider stricter fallback.Minor: using Date.now() as fallback mixes numeric and string overloads. Using new Date() keeps intent crystal-clear.
- createdAt: new Date(headers.get('last-modified') ?? Date.now()), + createdAt: new Date(headers.get('last-modified') ?? new Date()),
164-202
: Regex-based XML parsing is brittle; consider minimal XML parsing.Regex parsing will break if AWS introduces whitespace/layout changes or namespaces. In Workers, DOMParser may not be available; a tiny, dependency-free parser for the few tags you need or fast-xml-parser (small and robust) would be safer. Not blocking given current tests.
If you want, I can draft a minimal, dependency-free parser tailored to the current fields (<Key/LastModified>, , ).
5-5
: Testability: inject AwsClient or relax visibility (optional).Tests had to poke a private field. Prefer DI or protected visibility:
- Option A (DI): add an optional client param.
- Option B (protected): change s3Client to protected so a test subclass can override it.
Example (DI):
constructor( accessKeyId: string, secretAccessKey: string, bucketName: string, region = 'us-east-1', client?: AwsClient ) { this.s3Client = client ?? new AwsClient({ accessKeyId, secretAccessKey, region }); this.bucketName = bucketName; this.region = region; }This removes the need for private-field hacks in tests.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (2)
package.json
is excluded by!**/*.json
pnpm-lock.yaml
is excluded by!**/pnpm-lock.yaml
,!**/*.yaml
📒 Files selected for processing (12)
README.md
(1 hunks)docs/configuration/project-configuration.md
(1 hunks)docs/configuration/s3-storage.md
(1 hunks)docs/index.md
(1 hunks)examples/README.md
(1 hunks)examples/s3-config.jsonc
(1 hunks)src/index.ts
(1 hunks)src/storage/s3-storage.ts
(1 hunks)src/storage/storage-manager.ts
(2 hunks)tests/storage/s3-storage.test.ts
(1 hunks)tests/storage/storage-manager.test.ts
(2 hunks)wrangler.jsonc
(1 hunks)
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/contributing-guidelines.mdc:0-0
Timestamp: 2025-06-24T14:11:47.733Z
Learning: When adding a new storage backend, ensure it implements StorageInterface, is initialized in storage-manager, has corresponding tests, and is configured in wrangler.jsonc.
📚 Learning: 2025-06-24T14:12:11.842Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/project-overview.mdc:0-0
Timestamp: 2025-06-24T14:12:11.842Z
Learning: The storage layer is pluggable, supporting both Cloudflare R2 and KV backends, which allows for flexible deployment and storage strategies.
Applied to files:
docs/index.md
docs/configuration/project-configuration.md
📚 Learning: 2025-06-24T14:12:02.195Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/development-workflow.mdc:0-0
Timestamp: 2025-06-24T14:12:02.195Z
Learning: Environment variables such as TURBO_TOKEN, BUCKET_OBJECT_EXPIRATION_HOURS, and ENVIRONMENT should be managed securely, preferably via Wrangler secrets.
Applied to files:
wrangler.jsonc
src/index.ts
📚 Learning: 2025-06-24T14:11:47.733Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/contributing-guidelines.mdc:0-0
Timestamp: 2025-06-24T14:11:47.733Z
Learning: Always use Wrangler secrets for sensitive environment variables and document which variables are required versus optional.
Applied to files:
wrangler.jsonc
📚 Learning: 2025-06-24T14:12:20.659Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/storage-architecture.mdc:0-0
Timestamp: 2025-06-24T14:12:20.659Z
Learning: Storage backends are configured through Cloudflare Workers environment bindings, specifically R2_STORE for R2, KV_STORE for KV, and BUCKET_OBJECT_EXPIRATION_HOURS for TTL management.
Applied to files:
src/index.ts
src/storage/storage-manager.ts
docs/configuration/project-configuration.md
📚 Learning: 2025-06-24T14:12:20.659Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/storage-architecture.mdc:0-0
Timestamp: 2025-06-24T14:12:20.659Z
Learning: The storage manager (src/storage/storage-manager.ts) is responsible for initializing available storage backends based on environment bindings, prioritizing KV over R2, and providing utility methods for stream conversion. It throws an InvalidStorageError if no storage is configured.
Applied to files:
src/index.ts
src/storage/storage-manager.ts
tests/storage/storage-manager.test.ts
📚 Learning: 2025-06-24T14:12:02.195Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/development-workflow.mdc:0-0
Timestamp: 2025-06-24T14:12:02.195Z
Learning: Cron jobs for scheduled tasks (e.g., cache cleanup) should be defined in both code (e.g., src/crons/deleteOldCache.ts) and configuration (wrangler.jsonc) for clarity and reliability.
Applied to files:
examples/s3-config.jsonc
📚 Learning: 2025-06-24T14:12:20.659Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/storage-architecture.mdc:0-0
Timestamp: 2025-06-24T14:12:20.659Z
Learning: All storage backends in the project must implement a common StorageInterface, which includes methods for listing, reading, writing, and deleting objects, with support for optional metadata. This ensures pluggability and consistency across storage implementations.
Applied to files:
src/storage/storage-manager.ts
src/storage/s3-storage.ts
📚 Learning: 2025-06-24T14:12:20.659Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/storage-architecture.mdc:0-0
Timestamp: 2025-06-24T14:12:20.659Z
Learning: Storage backend selection follows a strict priority: KV is used if available, otherwise R2 is used, and an error is thrown if neither is configured.
Applied to files:
src/storage/storage-manager.ts
📚 Learning: 2025-06-24T14:12:02.195Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/development-workflow.mdc:0-0
Timestamp: 2025-06-24T14:12:02.195Z
Learning: Testing should be performed using Vitest with a Cloudflare Workers-specific test environment, and tests should mirror the src/ directory structure for clarity and maintainability.
Applied to files:
tests/storage/storage-manager.test.ts
📚 Learning: 2025-06-24T14:12:20.659Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/storage-architecture.mdc:0-0
Timestamp: 2025-06-24T14:12:20.659Z
Learning: Cloudflare R2 storage is intended for large artifacts and long-term storage, supports custom metadata and automatic expiration, and is configured via the R2_STORE binding.
Applied to files:
docs/configuration/project-configuration.md
📚 Learning: 2025-06-24T14:12:20.659Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/storage-architecture.mdc:0-0
Timestamp: 2025-06-24T14:12:20.659Z
Learning: Cloudflare KV storage is best suited for smaller artifacts due to its 25MB per value size limitation and offers faster access. It uses BUCKET_OBJECT_EXPIRATION_HOURS for TTL and is configured via the KV_STORE binding.
Applied to files:
docs/configuration/project-configuration.md
📚 Learning: 2025-06-24T14:12:02.195Z
Learnt from: CR
PR: AdiRishi/turborepo-remote-cache-cloudflare#0
File: .cursor/rules/development-workflow.mdc:0-0
Timestamp: 2025-06-24T14:12:02.195Z
Learning: Use R2 and KV bindings for object and key-value storage in Cloudflare Workers, and reference them explicitly in configuration and code.
Applied to files:
docs/configuration/project-configuration.md
🧬 Code Graph Analysis (4)
tests/storage/s3-storage.test.ts (2)
src/storage/s3-storage.ts (1)
S3Storage
(4-213)src/storage/storage-manager.ts (1)
StorageManager
(7-58)
src/storage/storage-manager.ts (2)
src/storage/interface.d.ts (1)
StorageInterface
(1-10)src/storage/s3-storage.ts (1)
S3Storage
(4-213)
tests/storage/storage-manager.test.ts (2)
src/storage/storage-manager.ts (1)
StorageManager
(7-58)src/storage/s3-storage.ts (1)
S3Storage
(4-213)
src/storage/s3-storage.ts (1)
src/storage/interface.d.ts (2)
WritableValue
(12-12)Metadata
(32-37)
🪛 LanguageTool
examples/README.md
[grammar] ~14-~14: There might be a mistake here.
Context: ...le to your project root 2. Rename it to wrangler.jsonc
3. Update the values to match your setup 4....
(QB_NEW_EN)
docs/configuration/s3-storage.md
[grammar] ~43-~43: There might be a mistake here.
Context: ...ck Users
in the left sidebar 3. Click Create user
4. Enter a username and select `Programmati...
(QB_NEW_EN)
[grammar] ~44-~44: There might be a mistake here.
Context: ...te user4. Enter a username and select
Programmatic access5. Click
Next: Permissions6. Click
Atta...
(QB_NEW_EN)
[grammar] ~45-~45: There might be a mistake here.
Context: ...d select Programmatic access
5. Click Next: Permissions
6. Click `Attach existing policies directly...
(QB_NEW_EN)
[grammar] ~46-~46: There might be a mistake here.
Context: ...5. Click
Next: Permissions6. Click
Attach existing policies directly7. Search for and select
AmazonS3FullAcces...
(QB_NEW_EN)
[grammar] ~105-~105: There might be a mistake here.
Context: ...ription | Default | | ---------------------- | -------- | --...
(QB_NEW_EN)
[grammar] ~106-~106: There might be a mistake here.
Context: ...------------------------ | ----------- | | S3_ACCESS_KEY_ID
| Yes | AW...
(QB_NEW_EN)
[grammar] ~107-~107: There might be a mistake here.
Context: ...access key ID | - | | S3_SECRET_ACCESS_KEY
| Yes | AW...
(QB_NEW_EN)
[grammar] ~108-~108: There might be a mistake here.
Context: ...secret access key | - | | S3_BUCKET_NAME
| Yes | S3...
(QB_NEW_EN)
[grammar] ~109-~109: There might be a mistake here.
Context: ...ucket name | - | | S3_REGION
| No | AW...
(QB_NEW_EN)
[grammar] ~113-~113: There might be a mistake here.
Context: ...storage priority order is: S3 > KV > R2. :::
(QB_NEW_EN)
README.md
[grammar] ~26-~26: There might be a mistake here.
Context: ...iguration) ## 🤔 Why should I use this? If you're a Turborepo user, this project ...
(QB_NEW_EN)
[style] ~30-~30: Consider replacing ‘gives’ with a different word to let your writing stand out.
Context: ... storage for your build artifacts. This gives you the flexibility to choose the storage o...
(GIVE_TIME_STYLE)
docs/configuration/project-configuration.md
[grammar] ~12-~12: There might be a mistake here.
Context: ...bject storage with zero egress fees - **🔑 KV Storage...
(QB_NEW_EN)
[grammar] ~13-~13: There might be a mistake here.
Context: ...e storage with global distribution - **☁️ S3 Storage...
(QB_NEW_EN)
[grammar] ~14-~14: There might be a mistake here.
Context: ... maximum compatibility and flexibility The storage priority order is: S3 > KV > R...
(QB_NEW_EN)
🪛 GitHub Check: Check correctness
tests/storage/s3-storage.test.ts
[failure] 29-29:
Conversion of type 'this' to type '{ s3Client: { fetch: Mock; }; }' may be a mistake because neither type sufficiently overlaps with the other. If this was intentional, convert the expression to 'unknown' first.
🪛 GitHub Actions: CI
tests/storage/s3-storage.test.ts
[error] 29-29: tsc --noEmit: TS2352: Conversion of type 'this' to type '{ s3Client: { fetch: Mock; }; }' may be a mistake because neither type sufficiently overlaps with the other. If this was intentional, convert the expression to 'unknown' first.
🔇 Additional comments (7)
docs/configuration/project-configuration.md (1)
19-21
: Deletion behavior wording aligns with new priority modelThe reference to “configured storage backend” is accurate and prevents confusion when S3 or KV is active instead of R2.
README.md (1)
23-25
: IMPORTANT note correctly points to consolidated configuration docsThe link target is appropriate with S3 now supported. No changes required.
src/storage/storage-manager.ts (2)
33-40
: Priority S3 > KV > R2 is implemented cleanly.Selection logic is straightforward and preserves previous fallbacks. The non-null assertion on R2 is safe due to the earlier "No storage provided" guard.
21-28
: ✅ S3 Initialization and Env Type Verified
- Confirmed
Env
type in src/index.ts (lines 9–12) declaresS3_ACCESS_KEY_ID
,S3_SECRET_ACCESS_KEY
,S3_BUCKET_NAME
, andS3_REGION
.- Initialization correctly checks for all required bindings and falls back to
'us-east-1'
whenS3_REGION
is undefined.S3 setup aligns with the docs’ default region. No changes needed.
tests/storage/storage-manager.test.ts (1)
47-67
: Good coverage for S3 precedence.These tests clearly assert that S3 wins when present, including when all backends are configured.
tests/storage/s3-storage.test.ts (1)
37-122
: Solid, comprehensive test coverage.The suite validates list/read/write/delete, pagination, metadata extraction, and error paths with a clean aws4fetch mock. Nice use of StorageManager helpers for stream conversions.
Also applies to: 124-159, 161-209, 211-243, 245-292, 294-354
src/storage/s3-storage.ts (1)
4-22
: Implements StorageInterface cleanly; constructor and client wiring look good.Nice alignment with the common StorageInterface (per our project learnings). Region defaulting and AwsClient setup are straightforward.
7. Search for and select `AmazonS3FullAccess` (or create a custom policy with minimal S3 permissions) | ||
8. Complete the user creation process | ||
9. Save the Access Key ID and Secret Access Key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid recommending AmazonS3FullAccess; prefer a minimal, bucket-scoped policy.
Granting AmazonS3FullAccess is overly permissive. Recommend attaching the minimal policy (shown below) and explicitly discourage FullAccess in production.
Apply this diff to Step 7:
-7. Search for and select `AmazonS3FullAccess` (or create a custom policy with minimal S3 permissions)
+7. Create and attach a custom policy with minimal S3 permissions (recommended). Avoid using `AmazonS3FullAccess` in production.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
7. Search for and select `AmazonS3FullAccess` (or create a custom policy with minimal S3 permissions) | |
8. Complete the user creation process | |
9. Save the Access Key ID and Secret Access Key | |
7. Create and attach a custom policy with minimal S3 permissions (recommended). Avoid using `AmazonS3FullAccess` in production. | |
8. Complete the user creation process | |
9. Save the Access Key ID and Secret Access Key |
🤖 Prompt for AI Agents
docs/configuration/s3-storage.md around lines 47 to 49: the doc currently
recommends selecting AmazonS3FullAccess which is overly permissive; update Step
7 to instruct attaching a minimal, bucket-scoped policy instead (and reference
the policy example shown elsewhere in the doc), explicitly discourage using
AmazonS3FullAccess in production, and replace the line with wording that tells
users to create or attach a custom policy limited to the specific bucket and
required actions (GetObject, PutObject, ListBucket, DeleteObject as needed) and
to save the Access Key ID and Secret Access Key after completing user creation.
const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${encodeURIComponent(key)}`; | ||
const response = await this.s3Client.fetch(url, { method: 'GET' }); | ||
|
||
if (response.status === 404) { | ||
return { data: undefined, metadata: undefined }; | ||
} | ||
|
||
if (!response.ok) { | ||
throw new Error(`S3 read failed: ${response.status} ${response.statusText}`); | ||
} | ||
|
||
const metadata = this.extractMetadataFromHeaders(response.headers); | ||
return { data: response.body ?? undefined, metadata }; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix: don’t encode slashes in S3 object keys (encode segments, not the whole key).
Using encodeURIComponent on the entire key will encode “/” as “%2F”, breaking keys with prefixes (e.g., folder/object). Encode per segment or preserve slashes.
Apply this diff and helper:
- const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${encodeURIComponent(key)}`;
+ const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${this.encodeS3Key(key)}`;
Add this helper inside the class:
private encodeS3Key(key: string): string {
// Preserve path separators; percent-encode each segment
return key.split('/').map(encodeURIComponent).join('/');
}
🤖 Prompt for AI Agents
In src/storage/s3-storage.ts around lines 66 to 79, the code incorrectly uses
encodeURIComponent on the entire S3 key which encodes slashes and breaks
prefixed keys; replace that with a per-segment encoder and update the fetch URL
to use the new helper. Add a private method on the class named encodeS3Key that
splits the key on '/', applies encodeURIComponent to each segment, and rejoins
with '/', then change the URL construction to use this.encodeS3Key(key) instead
of encodeURIComponent(key). Ensure the helper is placed inside the class (near
other private helpers) and used wherever object keys are URL-encoded.
async write(key: string, data: WritableValue, metadata?: Record<string, string>) { | ||
const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${encodeURIComponent(key)}`; | ||
|
||
const headers: Record<string, string> = {}; | ||
if (metadata) { | ||
Object.entries(metadata).forEach(([k, v]) => { | ||
headers[`x-amz-meta-${k}`] = v; | ||
}); | ||
} | ||
|
||
const response = await this.s3Client.fetch(url, { | ||
method: 'PUT', | ||
body: data, | ||
headers, | ||
}); | ||
|
||
if (!response.ok) { | ||
throw new Error(`S3 write failed: ${response.status} ${response.statusText}`); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Same key-encoding issue on write().
Use the same segment-wise encoding to avoid breaking keys containing “/”.
- const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${encodeURIComponent(key)}`;
+ const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${this.encodeS3Key(key)}`;
Note: headers mapping to x-amz-meta-* looks good.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async write(key: string, data: WritableValue, metadata?: Record<string, string>) { | |
const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${encodeURIComponent(key)}`; | |
const headers: Record<string, string> = {}; | |
if (metadata) { | |
Object.entries(metadata).forEach(([k, v]) => { | |
headers[`x-amz-meta-${k}`] = v; | |
}); | |
} | |
const response = await this.s3Client.fetch(url, { | |
method: 'PUT', | |
body: data, | |
headers, | |
}); | |
if (!response.ok) { | |
throw new Error(`S3 write failed: ${response.status} ${response.statusText}`); | |
} | |
} | |
async write(key: string, data: WritableValue, metadata?: Record<string, string>) { | |
const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${this.encodeS3Key(key)}`; | |
const headers: Record<string, string> = {}; | |
if (metadata) { | |
Object.entries(metadata).forEach(([k, v]) => { | |
headers[`x-amz-meta-${k}`] = v; | |
}); | |
} | |
const response = await this.s3Client.fetch(url, { | |
method: 'PUT', | |
body: data, | |
headers, | |
}); | |
if (!response.ok) { | |
throw new Error(`S3 write failed: ${response.status} ${response.statusText}`); | |
} | |
} |
🤖 Prompt for AI Agents
In src/storage/s3-storage.ts around lines 86 to 105, the write() method
currently encodes the entire key with encodeURIComponent which breaks keys
containing "/" — change to encode each path segment separately and rejoin with
"/" so slashes are preserved (e.g., split key on "/", encodeURIComponent each
segment, then join with "/"), then use that segment-encodedKey when building the
S3 URL; leave the headers logic unchanged.
if (keys.length === 1) { | ||
// Single object delete | ||
const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${encodeURIComponent(keys[0])}`; | ||
const response = await this.s3Client.fetch(url, { method: 'DELETE' }); | ||
|
||
if (!response.ok && response.status !== 404) { | ||
throw new Error(`S3 delete failed: ${response.status} ${response.statusText}`); | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Same key-encoding issue on single-key delete().
Preserve “/” in keys during delete.
- const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${encodeURIComponent(keys[0])}`;
+ const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${this.encodeS3Key(keys[0])}`;
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if (keys.length === 1) { | |
// Single object delete | |
const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${encodeURIComponent(keys[0])}`; | |
const response = await this.s3Client.fetch(url, { method: 'DELETE' }); | |
if (!response.ok && response.status !== 404) { | |
throw new Error(`S3 delete failed: ${response.status} ${response.statusText}`); | |
} | |
} else { | |
if (keys.length === 1) { | |
// Single object delete | |
const url = `https://${this.bucketName}.s3.${this.region}.amazonaws.com/${this.encodeS3Key(keys[0])}`; | |
const response = await this.s3Client.fetch(url, { method: 'DELETE' }); | |
if (!response.ok && response.status !== 404) { | |
throw new Error(`S3 delete failed: ${response.status} ${response.statusText}`); | |
} | |
} else { |
🤖 Prompt for AI Agents
In src/storage/s3-storage.ts around lines 110 to 118, the single-object delete
builds the S3 URL with encodeURIComponent(keys[0]) which encodes "/" and breaks
object paths; change the encoding so "/" is preserved (for example, use
encodeURIComponent(key).replace(/%2F/g, '/') or use an encoding helper that
preserves slashes) and apply the same approach as the multi-delete path to
ensure consistent behavior when deleting objects with "/" in their keys.
vi.mock('~/storage/s3-storage', async () => { | ||
const actual = await vi.importActual('~/storage/s3-storage'); | ||
return { | ||
...actual, | ||
S3Storage: class MockS3Storage extends (actual as { S3Storage: typeof S3Storage }).S3Storage { | ||
constructor( | ||
accessKeyId: string, | ||
secretAccessKey: string, | ||
bucketName: string, | ||
region = 'us-east-1' | ||
) { | ||
super(accessKeyId, secretAccessKey, bucketName, region); | ||
// Override the s3Client to use our mock | ||
(this as { s3Client: { fetch: typeof mockFetch } }).s3Client = { | ||
fetch: mockFetch, | ||
}; | ||
} | ||
}, | ||
}; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix CI: remove S3Storage re-mock and private-field override (redundant and type-unsafe).
This block is unnecessary because aws4fetch is already mocked above, so S3Storage will use the mocked AwsClient.fetch. The cast on Line 29 triggers TS2352 and breaks CI. Remove the entire S3Storage re-mock and the private-field override.
Apply this diff to delete the redundant mock:
-// Mock the S3Storage class to use our mock fetch
-vi.mock('~/storage/s3-storage', async () => {
- const actual = await vi.importActual('~/storage/s3-storage');
- return {
- ...actual,
- S3Storage: class MockS3Storage extends (actual as { S3Storage: typeof S3Storage }).S3Storage {
- constructor(
- accessKeyId: string,
- secretAccessKey: string,
- bucketName: string,
- region = 'us-east-1'
- ) {
- super(accessKeyId, secretAccessKey, bucketName, region);
- // Override the s3Client to use our mock
- (this as { s3Client: { fetch: typeof mockFetch } }).s3Client = {
- fetch: mockFetch,
- };
- }
- },
- };
-});
If you insist on keeping this pattern, convert via unknown or any to satisfy TypeScript and bypass privacy (not recommended):
- (this as { s3Client: { fetch: typeof mockFetch } }).s3Client = {
+ (this as unknown as { s3Client: { fetch: typeof mockFetch } }).s3Client = {
fetch: mockFetch,
};
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 GitHub Check: Check correctness
[failure] 29-29:
Conversion of type 'this' to type '{ s3Client: { fetch: Mock; }; }' may be a mistake because neither type sufficiently overlaps with the other. If this was intentional, convert the expression to 'unknown' first.
🪛 GitHub Actions: CI
[error] 29-29: tsc --noEmit: TS2352: Conversion of type 'this' to type '{ s3Client: { fetch: Mock; }; }' may be a mistake because neither type sufficiently overlaps with the other. If this was intentional, convert the expression to 'unknown' first.
🤖 Prompt for AI Agents
In tests/storage/s3-storage.test.ts around lines 16 to 35, the test re-mocks
S3Storage and overrides a private s3Client field which is redundant (aws4fetch
is already mocked) and causes a TypeScript TS2352 cast error; remove the entire
vi.mock block that redefines S3Storage and the private-field assignment so the
test uses the real S3Storage with the already-mocked AwsClient.fetch; if you
must keep this pattern instead of deleting, change the unsafe cast to use
unknown or any to bypass the privacy/type-check, but preferred fix is to delete
the redundant mock block.
afterEach(() => { | ||
vi.restoreAllMocks(); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Restore real timers after each test.
Fake timers can leak across tests and cause subtle failures. Restore timers in afterEach.
afterEach(() => {
vi.restoreAllMocks();
+ vi.useRealTimers();
});
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
afterEach(() => { | |
vi.restoreAllMocks(); | |
}); | |
afterEach(() => { | |
vi.restoreAllMocks(); | |
vi.useRealTimers(); | |
}); |
🤖 Prompt for AI Agents
In tests/storage/s3-storage.test.ts around lines 49 to 51, the afterEach
currently only restores mocks but does not restore timers; update the afterEach
to also restore real timers by calling the Vitest API to switch back to real
timers (vi.useRealTimers()) so fake timers don't leak between tests.
Description
This PR introduces Amazon S3 as a new backend storage option for Turborepo Remote Cache, providing users with greater flexibility in choosing their artifact storage.
The S3 integration follows the existing patterns of R2 and KV storage, implementing the
StorageInterface
and leveragingaws4fetch
for a lightweight S3 client. S3 is set as the highest priority storage option (S3 > KV > R2) when configured.This change includes:
S3Storage
class for S3 operations.StorageManager
to support S3 and manage storage priority.wrangler.jsonc
examples.README.md
and project configuration documentation.Dependencies:
aws4fetch
Fixes # (issue)
How Has This Been Tested?
The following tests were executed to verify the changes:
S3Storage
: A dedicated test suite (tests/storage/s3-storage.test.ts
) was created, covering allStorageInterface
operations (list
,read
,write
,delete
), including single and multi-object deletion, metadata handling, and error cases.aws4fetch
was thoroughly mocked to ensure isolated testing.StorageManager
: The existingtests/storage/storage-manager.test.ts
was updated to confirm that S3 storage takes the correct priority when configured alongside R2 and KV.pnpm test
) to ensure no regressions were introduced. All tests passed successfully.pnpm lint
,pnpm format
) to ensure adherence to code style guidelines.