feat(enqueueLinks): add "allowedSubdomains" option for subdomain filtering in "same-domain" strategy #3098
Overview
This PR introduces a new `enqueueLinks` option called `allowedSubdomains`, which takes a string array of user-defined subdomains and gives users simple yet more precise control over subdomain access. It also includes new documentation and tests to ensure the option works consistently.

- `allowedSubdomains`: the new `enqueueLinks` option, which filters subdomains according to the user's choice.

By default, `allowedSubdomains` is set to `['*']` if not specified.

Note: This option can only be used with the EnqueueStrategy `same-domain`, because that strategy by nature allows any subdomain under the same domain.

Implementation
The enhanced `same-domain` strategy has several modifications that allow users to add specific subdomains into `enqueueStrategyPatterns`:

- Falls back to the default `same-domain` behavior if `allowedSubdomains` is either set to `['*']` or `[]`, granting backwards compatibility.
- Applies `allowedSubdomains` when at least one subdomain is found.
- Adds the base URL (`options.baseUrl`) into `enqueueStrategyPatterns`.
- Takes each `allowedSubdomain` and sets the hostname of the new `filteredSubdomainUrl`.
- Adds `filteredSubdomainUrl` into `enqueueStrategyPatterns` while avoiding a duplicate of the URL origin.
- […] `enqueueStrategyPatterns`.

The major difference from the former `same-domain` algorithm is that the asterisk normally placed in front of the domain is replaced with each allowed subdomain.

Example
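As a rough illustration of the Implementation steps, the sketch below builds the pattern list for a base URL with and without subdomain filtering. It is not taken from the PR diff: `buildSameDomainPatterns` is a hypothetical helper, the patterns are simplified to plain glob strings, and the base URL's hostname is assumed to already be the apex domain.

```typescript
// Hypothetical helper for illustration only; not the code from this PR.
function buildSameDomainPatterns(baseUrl: string, allowedSubdomains: string[] = ['*']): string[] {
    const url = new URL(baseUrl);
    // Assumption: the hostname of baseUrl is already the apex domain (e.g. example.com).
    const apex = url.hostname;

    // Fallback: an empty list, or a list containing '*', keeps the wildcard behavior.
    if (allowedSubdomains.length === 0 || allowedSubdomains.indexOf('*') !== -1) {
        return [`${url.origin.replace(apex, `*.${apex}`)}/**`, `${url.origin}/**`];
    }

    // Start with the base URL's origin, then add one pattern per allowed subdomain.
    const patterns: string[] = [`${url.origin}/**`];
    for (const sub of allowedSubdomains) {
        if (sub === '') continue; // '' stands for the apex, which is already included
        const filteredSubdomainUrl = new URL(url.origin);
        filteredSubdomainUrl.hostname = `${sub}.${apex}`;
        const pattern = `${filteredSubdomainUrl.origin}/**`;
        // Skip duplicates (repeated subdomains, or a match with the base origin).
        if (patterns.indexOf(pattern) === -1) patterns.push(pattern);
    }
    return patterns;
}

// Without filtering: ['https://*.example.com/**', 'https://example.com/**']
const before = buildSameDomainPatterns('https://example.com');

// With filtering:
// ['https://example.com/**', 'https://www.example.com/**', 'https://blog.example.com/**']
const after = buildSameDomainPatterns('https://example.com', ['www', 'blog']);
```

In this sketch the wildcard entry is replaced by one explicit entry per allowed subdomain, which is the difference described above.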
Assume that `allowedSubdomains: ['www', 'blog']` and the base URL is `https://example.com`.

Before (without `allowedSubdomains`):

After (with `allowedSubdomains`):

Use Cases
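The conditions in this section can also be read from the matching side. The following is a minimal sketch, assuming a hypothetical `isHostAllowed` helper that receives the apex domain explicitly; it is not part of the PR and only mirrors the three cases described here.

```typescript
// Hypothetical helper for illustration only; not the code from this PR.
function isHostAllowed(hostname: string, apex: string, allowedSubdomains: string[] = ['*']): boolean {
    const suffix = `.${apex}`;
    const isApex = hostname === apex;
    const isSubdomain = hostname.length > suffix.length && hostname.slice(-suffix.length) === suffix;
    if (!isApex && !isSubdomain) return false; // a different domain altogether

    // [] and any list containing '*' keep the default behavior: every subdomain passes.
    if (allowedSubdomains.length === 0 || allowedSubdomains.indexOf('*') !== -1) return true;

    // With filtering active, the apex itself is always accepted.
    if (isApex) return true;

    // Otherwise the subdomain label must be listed explicitly.
    const sub = hostname.slice(0, hostname.length - suffix.length);
    return allowedSubdomains.indexOf(sub) !== -1;
}

isHostAllowed('example.com', 'example.com', ['']);             // true: apex only
isHostAllowed('www.example.com', 'example.com', ['']);         // false: no subdomain is listed
isHostAllowed('www.example.com', 'example.com', []);           // true: default behavior
isHostAllowed('shop.example.com', 'example.com', ['www', '*']); // true: asterisk accepts anything
```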
Here are the conditions that would be affected based on how `allowedSubdomains` is checked:

- If `allowedSubdomains: ['']`, it should still be treated as subdomain filtering, because it means that no subdomain should be accepted other than the apex (the original URL) itself.
- If `allowedSubdomains: []`, requests should automatically be handled with the default behavior, because the user never specified whether subdomains should be filtered.
- If `allowedSubdomains: ['*']` or `[sub1, sub2, ..., '*']` (includes the asterisk), requests will always be handled with the default behavior, because the asterisk is defined as accepting any subdomain.

Documentation Updates
This PR includes documentation that:

- Describes the `allowedSubdomains` option with a simple definition and use case.

Testing Improvements
This PR also includes new tests:

- Adds new test cases to `enqueue_links.test.ts` to validate the behavior of the `allowedSubdomains` option with various configurations.
- Adds a new HTML fixture (`HTML_WITH_SUBDOMAINS`) to facilitate testing of subdomain filtering.

Contributors
Closes #3099
Alternative solution to #2513