Skip to content

Conversation

@bajtos
Copy link
Member

@bajtos bajtos commented Sep 4, 2018

Introduce a new decorator @param.query.object allowing controller methods to describe a parameter as an object and optionally provide the schema for the accepted values.

Improve parseParams action to correctly parse and coerce object values coming from query strings, supporting the following two flavours:

  • "deepObject" encoding as described by OpenAPI Spec v3:

    GET /api/products?filter[where][name]=Pen&filter[limit]=10
    
  • JSON-based encoding for compatibility with LoopBack 3.x:

    GET /api/products?filter={"where":{"name":"Pen"},"limit":10}
    

See #100

While working on these changes, I discovered and fixed a bug in openapiv3-ts, see metadevpro/openapi3-ts#28.

Checklist

  • npm test passes on your machine
  • New tests added or existing tests modified to cover all changes
  • Code conforms with the style guide
  • API Documentation in code was updated
  • Documentation in /docs/site was updated
  • Affected artifact templates in packages/cli were updated
  • Affected example projects in examples/* were updated

@bajtos bajtos added feature Validation REST Issues related to @loopback/rest package and REST transport in general labels Sep 4, 2018
@bajtos bajtos added this to the September Milestone milestone Sep 4, 2018
@bajtos bajtos self-assigned this Sep 4, 2018
@bajtos
Copy link
Member Author

bajtos commented Sep 4, 2018

A discussion point:

At the moment, we rely on express to parse all incoming query strings and build deep objects from individual key-value pairs, e.g. convert ?color[R]=100&color[G]=200&color[B]=150 to {R: 100, G:200, B:150}. Under the hood, Express uses qs package to do the parsing.

I see few downsides of this solution:

  • We run qs on every incoming request, regardless of whether the operation accepts any object parameters from the query string. This can be a potential vector for DoS or at least for degraded performance, because the client can attach an arbitrary query string value that we don't care about but that will still be processed by qs.
  • By default, qs parses children only up to 5 levels of depth. This can be configured via qs options, which can be configured via express options. I guess we will eventually need to expose this configuration to LB4 app developers.

If we run qs ourselves in parseParams, then we can

  • Run qs only for endpoints expecting objects in the query string.
  • Configure the maximum depth on per-endpoint basis depending on the parameter schema. E.g. if a parameter accepts only top-level properties R, G and B, then it's enough to set depth to 1.

I am wondering if it's worth moving the part parsing deep-object values from the query string away from express and down into our parseParams action. Such move would give us much better control of how a query string is processed. On the other hand, it means slightly more work and complexity in our code base, I am not sure if the benefits outweigh the costs.

@raymondfeng @strongloop/sq-lb-apex @hacksparrow thoughts?

@bajtos bajtos force-pushed the feat/coerce-object-from-query branch from f38be05 to b8aed16 Compare September 4, 2018 12:21
@hacksparrow
Copy link
Contributor

I am in favor of moving query string parsing to parseParams actions. The more control we have over it, the better. Slightly more work and complexity in our code base should be an acceptable trade-off.

It should also solve the "We run qs on every incoming request..." problem, right?

@bajtos
Copy link
Member Author

bajtos commented Sep 4, 2018

It should also solve the "We run qs on every incoming request..." problem, right?

Yes, I updated my comment to make this clear.

Copy link
Contributor

@jannyHou jannyHou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff! And the discussion about relying on qs for parsing or not is very insightful.

+1 for implementing the query parser in our rest server. Express only uses a few lines to call the qs.parse function.

  • "Run qs only for endpoints expecting objects in the query string." seems simple 👍

  • "Configure the maximum depth on per-endpoint basis depending on the parameter schema" sounds a little complicated since we need to detect the depth of a nested object. I would expect users to optimize it on their end if they want to... Since it's hard for me to tell which one takes more time/effort: "detecting the depth of object" vs "parsing an object with the depth set larger than its actual one".

* @param name Parameter name
* @param schema Optional OpenAPI Schema describing the object value.
*/
object: function(
Copy link
Contributor

@jannyHou jannyHou Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we name it as deepObject? Or something could infer that {in: 'query', style: 'deepObject'} is hardcorded for this decorator.

I understand that for most situations the object is provided in query. While just in case people misuse it with parameters got from path(with style matrix or label), thought?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to distinguish between two bits of information provided by @param.query.deepObject:

  • the value has type "object"
  • the value is encoded using deepObject style.

Object values provided in query can use the following encoding styles (see the section "Style values"):

  • deepObject (e.g. ?color[R]=100&color[G]=200&color[B]=150)
  • form + explode: true (e.g. ?color=R,100,G,200,B,150)
  • form + explode: false (e.g. ?R=100&G=200&B=150)

As I see it, @param.query.object is saying that the value is of object type, similar how @param.query.number or @param.query.boolean works. To me, the fact that we allow deepObject style only, is a limitation we can lift later in the future. For example, we can allow callers to specify a different style and explode values by adding an optional third parameter:

@param.query.object('filter', FilterSchema, {style: 'deepObject'});
@param.query.object('filter', FilterSchema, {style: 'form', explode: false});
@param.query.object('filter', FilterSchema, {style: 'form', explode: true});

While just in case people misuse it with parameters got from path(with style matrix or label), thought?

At the moment, object method is defined on param.query only. I agree we should eventually add param.path.object and param.header.object in the future, supporting encoding styles that are appropriate for those sources. To me, it still makes sense to use the same method name object, because the method name is describing the value type, not the encoding style.

Maybe I don't understand your concern, what kind of misuse do you have in mind?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bajtos

To me, the fact that we allow deepObject style only, is a limitation we can lift later in the future.

ok I got it 👍 I think we are on the same page now, previously I thought you are creating a decorator particularly for {style: 'deepObject'}.

And my bad I mixed the usage of param.query.object and param.object.

Let's keep the name as param.query.object

result = request.query[spec.name];
break;
const value = request.query[spec.name];
if (spec.style === 'deepObject' && typeof value === 'string') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bajtos I feel function parseDeepObjectString is doing the parameter coercion, so a better place to have it would be inside coerceParameter, thought?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense 👍

});

afterEach(() => {
if (spy) spy.restore();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, any reason to assert the spy here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am reluctant to assume that all tests are initializing the spy. What if a test does not use the spy at all?

Having said that, maybe I should follow YAGNI here and don't worry about hypothetical future cases.

})
.expect(200);
sinon.assert.calledWithExactly(spy, {
// Notice that numeric and boolean values are converted to strings.
Copy link
Contributor

@jannyHou jannyHou Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little confused why this test loses data type, is it because given the following two signatures:

Signature 1: {filter: '{"where":{"id":1,"name":"Pen", "active": true}}'}
Signature 2:

{
        'filter[where][id]': 1,
        'filter[where][name]': 'Pen',
        'filter[where][active]': true,
      }

qs parses them differently, it preserves the data type for sig#1 while turns everything to string for sig#2?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think qs can handle types. It only produces strings, IIRC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query strings are encoding all values as a string, there is no (standard) way how to convey type information.

Here is the request URL as sent by this test. Please ignore extra whitespace, I added it to make the example easier to read.

GET /object-from-query
  ?filter[where][id]=1
  &filter[where][name]=Pen
  &filter[where][active]=true

When this request is parsed, we get the following request.query object:

{
  filter: {
    id: '1',
    name: 'Pen',
    active: 'true',
  }
}

Because we don't have any schema for the "filter" parameter, we cannot tell whether id should be treated as a number or as a string. There are valid cases where a value looking like a number needs to be treated as a string. MongoDB's Object IDs have taught us that lesson back in LoopBack 2.x days.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Was wondering why the data types are preserved in the test case above but not this one, now I realized the reason...the filter in last test case is provided as a whole string, and therefore the data types are not converted to string when parsing it to a JSON object.

req.query.aparameter = config.rawValue;
break;
case 'header':
case 'cookie':
Copy link
Contributor

@jannyHou jannyHou Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we support the coercion for param from header, see code in getParamFromRequest

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime code may support coercion from header parameters, but there are no unit tests verifying that would call testCoercion with in: header. Adding such tests is out of scope of this pull request IMO, and until such tests are added, there is no need for testCoercion to support in: header.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense~ :shipit:

@virkt25
Copy link
Contributor

virkt25 commented Sep 5, 2018

I am in favor of moving query string parsing to parseParams actions. The more control we have over it, the better. Slightly more work and complexity in our code base should be an acceptable trade-off.

How do we disable the parser for express? AFAIK the query parsing is baked into the framework.

@bajtos
Copy link
Member Author

bajtos commented Sep 6, 2018

How do we disable the parser for express? AFAIK the query parsing is baked into the framework.

I believe the query parser can be disabled as follows:

app.set('query parser fn', false);

See util.compileQueryParser, it's used to determine a query-parser function to be used by the built-in query parsing middleware.

@bajtos bajtos force-pushed the feat/coerce-object-from-query branch from b8aed16 to f0174d1 Compare September 6, 2018 13:17
@bajtos
Copy link
Member Author

bajtos commented Sep 6, 2018

I have addressed the review commits and added documentation.

It turns out that our Todo/TodoList examples and the CRUD Controller template do not leverage object parameters yet. I feel that adding a filter object is a slightly more complex change (how are we going to determine the parameter schema?), I prefer to leave this part out of this pull request and open a new follow-up PR after landing this one.

This pull request is ready for the final review and landing.

@virkt25 @jannyHou @hacksparrow please review

Copy link
Contributor

@b-admike b-admike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! One minor comment, but LGTM otherwise. I like how the tests are organized 👍. I'm okay to update examples in a follow-up PR.

return result;
}

function ensureRequestQueryWasParsed(request: Request) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configure the maximum depth on per-endpoint basis depending on the parameter schema. E.g. if a parameter accepts only top-level properties R, G and B, then it's enough to set depth to 1.

How is the above done here where we're calling qs?

Copy link
Member Author

@bajtos bajtos Sep 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left that out of this initial pull request and use qs default configuration of depth: 5.

Determining the minimum depth is a bit involved, as we have to visit all nested schemas and $ref links. And then there is the schema option additionalProperties: true which basically enables unlimited depth.

Copy link
Contributor

@jannyHou jannyHou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that adding a filter object is a slightly more complex change (how are we going to determine the parameter schema?), I prefer to leave this part out of this pull request and open a new follow-up PR after landing this one.

I agree to create new PRs to update the controller methods.

LGTM! 🚢

*/
object: function(
name: string,
// requires https://github.com/metadevpro/openapi3-ts/pull/28
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: it's released 🚢 .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thank you for the reminder!

@bajtos bajtos force-pushed the feat/coerce-object-from-query branch from f0174d1 to a8aaafa Compare September 7, 2018 07:59
Introduce a new decorator `@param.query.object` allowing controller
methods to describe a parameter as an object and optionally provide
the schema for the accepted values.

Improve `parseParams` action to correctly parse and coerce object values
coming from query strings, supporting the following two flavours:

"deepObject" encoding as described by OpenAPI Spec v3:

    GET /api/products?filter[where][name]=Pen&filter[limit]=10

JSON-based encoding for compatibility with LoopBack 3.x:

    GET /api/products?filter={"where":{"name":"Pen"},"limit":10}
@bajtos bajtos force-pushed the feat/coerce-object-from-query branch from a8aaafa to d8f377a Compare September 7, 2018 11:12
@bajtos bajtos merged commit d095693 into master Sep 7, 2018
@bajtos bajtos deleted the feat/coerce-object-from-query branch September 7, 2018 11:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature REST Issues related to @loopback/rest package and REST transport in general Validation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants