-
Notifications
You must be signed in to change notification settings - Fork 134
propose additional node specific pre-provisioning kernel args #552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||
---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,201 @@ | ||||||||
<!-- | ||||||||
This work is licensed under a Creative Commons Attribution 3.0 | ||||||||
Unported License. | ||||||||
|
||||||||
http://creativecommons.org/licenses/by/3.0/legalcode | ||||||||
--> | ||||||||
|
||||||||
# pre-provisioning kernel argument support | ||||||||
|
||||||||
## Status | ||||||||
|
||||||||
planned | ||||||||
|
||||||||
## Summary | ||||||||
|
||||||||
This document proposes a new BareMetalHost spec attribute intended for | ||||||||
configuring the pre-provisioning image's (usually IPA) kernel arguments. | ||||||||
|
||||||||
## Motivation | ||||||||
|
||||||||
There are situations where a user would need node specific kernel arguments | ||||||||
for the pre-provisioning image. The reason could be the use of a mixed set of | ||||||||
hardware, troubleshooting provisioning or boot issues, selectively configuring | ||||||||
custom logic e.g. enabling IPA plugins or other services embedded in the | ||||||||
pre-provisioning image. | ||||||||
|
||||||||
### Goals | ||||||||
|
||||||||
1. Implement a string type optional spec field named `preprovisioningExtraKernelParams` | ||||||||
2. Extend the generic `Provisioner` interface to support the new optional field | ||||||||
3. Implement support for the `preprovisioningExtraKernelParams` in the Ironic provisioner | ||||||||
4. Combine the values of the `preprovisioningExtraKernelParams` and the | ||||||||
`preprovisioningImage`'s `status.extraKernelParams` field if a | ||||||||
`preprovisioningImage` is refferences by the BMH and teh `status.extraKernelParams` | ||||||||
is not empty | ||||||||
Comment on lines
+32
to
+35
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it a good idea to construct a value of a field from two different sources? I imagine things might go wrong, for example how do we handle a case when prepovisioningImage object referenced by BMH doesn't exist for some reason? Does this mean that a node might have different kernel parameters compared to what is asked via There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dtantsur thoughts ? |
||||||||
5. Introducing a `preprovisioningExtraKernelParams` related status field that | ||||||||
would display the last applied `preprovisioningExtraKernelParams` value. | ||||||||
Let's call this field `status.lastAppliedPreprovisioningExtraKernelParams` | ||||||||
Comment on lines
+36
to
+38
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps something shorter would be easier but I don't have strong opinion on this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have been thinking about this, and our offline discussions and now I am on the opinion that I should separate this status topic to a different proposal/discussion, because right now we don't really provide status for applying provisioning or preprvisioning data to Ironic API so we might need a more comprehensive approach to give feedback about what, user/network/meta data and what preprovisioning data was applied. We might also need a status flag to show if the previously mentioned data references/values have changed since the last LCM operation (inspect,provision,deprovision,clean, service). If I would leave the status topic out from the proposal I think it would be still aligned with our current feedback mechanism (or the lack of it) and then we can improve our feedback mechanism with the followup proposal. |
||||||||
|
||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You also need to extend the PreprovisioningImage object to accept these params. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can keep them separate, I didn't intend to make this new spec to be applied for the "PreprovisioningImage" I thought that already has a spec for this purpose. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That will create a wild inconsistency that will cause users file bugs for me with " There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Btw it's fair if you ask me to follow-up with PPI support myself if you don't have cycles for that. I just want it to be in the design. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have already looked into it, my original impression was that the user is already able to set the Kernel Args via the preprovisioningImage CR's spec but now I understand that it is not possible as on that CR the kernel args are only part of the status and not the spec. I have also checked https://github.com/openshift/image-customization-controller so I am more familiar with the use-case of the preprovisioningImage CR. Based on what I know now It makes sense to include preprovisioningImage in the proposal. |
||||||||
### Non-Goals | ||||||||
|
||||||||
1. To implement any sort validation for the new optional field | ||||||||
2. Providing a default value for the new spec field | ||||||||
3. Implement conditional logic to overwrite default kernel parameters on | ||||||||
per node basis | ||||||||
|
||||||||
## Proposal | ||||||||
|
||||||||
This document proposes a new BareMetalHost spec attribute intended for | ||||||||
configuring the pre-provisioning image's (usually IPA) kernel arguments. | ||||||||
|
||||||||
The proposal involves the introduction of a new spec field thus a modification | ||||||||
to the BMO API. The API change is intended to be backward compatible as the | ||||||||
new `preprovisioningExtraKernelParams` field is optional. | ||||||||
|
||||||||
### Implementation Details/Notes/Constraints | ||||||||
|
||||||||
The `preprovisioningExtraKernelParams` field would be located directly under `spec` | ||||||||
section. The format of the new field would be a list/array of individual | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This has been changes elsewhere to be a string
Suggested change
|
||||||||
strings. The new field would be optional and it would be considered | ||||||||
either a NOOP or a cleaning operation depending on the previous value of the | ||||||||
field in case it would be not present or have no value or would have an empty | ||||||||
array as a value. | ||||||||
|
||||||||
The proposed BMH looks like the following: | ||||||||
|
||||||||
```yaml | ||||||||
apiVersion: metal3.io/v1alpha1 | ||||||||
kind: BareMetalHost | ||||||||
metadata: | ||||||||
name: bm0 | ||||||||
spec: | ||||||||
online: true | ||||||||
bmc: | ||||||||
address: idrac://192.168.122.1:6230/ | ||||||||
credentialsName: bm0-bmc-secret | ||||||||
preprovisioningExtraKernelParams: "--timeout 60000 systemd.journald.forward_to_console=yes ipa-debug=1" | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||
... | ||||||||
... | ||||||||
... | ||||||||
bootMACAddress: 52:54:00:b7:b2:6f | ||||||||
... | ||||||||
... | ||||||||
... | ||||||||
|
||||||||
``` | ||||||||
|
||||||||
The example contains a snippet of a common IPA kernel argument list. | ||||||||
The `preprovisioningExtraKernelParams` string would be received by the `Provisioner` | ||||||||
interface as is. The handling of the list would be up to the particular | ||||||||
implementations of the `Provisioner` interface. | ||||||||
|
||||||||
Given the BMH snippet from above, the kernel parameter list generated by | ||||||||
the `Ironic provisioner` and posted to the Ironic Node API would look like | ||||||||
almost the same with one important addition. The ironic provisioner will | ||||||||
always prepend the `%default%` substring to the `preprovisioningExtraKernelParams`. | ||||||||
|
||||||||
The addition of the "%default%" will cause Ironic to inject the extra | ||||||||
kernel parameters provided via the ironic config file and ironic-image | ||||||||
environment variables. The inclusion of "%default%" is beneficial from | ||||||||
multiple point of view, it makes the API based kernel parameter update logic | ||||||||
consistent with the same feature used for dynamically built `preprovisioningImage`s. | ||||||||
An other benefit is that kernel parameters coming from environment variables | ||||||||
might contain credentials like ssh keys, or other sensitive information | ||||||||
that would cause a security issue to display in a spec field. | ||||||||
|
||||||||
It is possible to implement an annotation or an other spec field to | ||||||||
force BMO to overwrite the node specific kernel arguments. Implementing | ||||||||
logic to conditionally overwrite the parameters might prove useful in some | ||||||||
scenarios but it would completely deviate from how users are applying | ||||||||
kernel parameters both with pre built or with dynamically build pre provisioning | ||||||||
images thus using this feature might cause unexpected issues. | ||||||||
|
||||||||
In this proposal I am proposing the minimal viable feature to add node | ||||||||
specific kernel arguments, in case in the future there will be interest in | ||||||||
fully overwriting kernel parameters then the proposed feature can be extended | ||||||||
with the relevant logic. | ||||||||
|
||||||||
The `status.lastAppliedPreprovisioningExtraKernelParams` would have some limitations. | ||||||||
An Ironic node's or other provisioners relevant API endpoint could be updated | ||||||||
with extra kernel parameters such a way to cause the preprovisioning agent to not | ||||||||
boot properly or behave incorrectly even though the HTTP API patching was | ||||||||
successful. The status in the aforementioned case would only show what was | ||||||||
accepted by the provisioner API last time but would not provide info about | ||||||||
the sensibility of the value and wouldn't even mean that the value is actively | ||||||||
used because an updated `preprovisioningExtraKernelParams` value would be only | ||||||||
used at the next suitable life cycle operation. | ||||||||
|
||||||||
### Risks and Mitigations | ||||||||
|
||||||||
Providing a faulty list of kernel arguments might result in boot issues | ||||||||
or unexpected behavior during the life cycle of the pre-provisioning image. | ||||||||
There is no possibility to point out explicitly when the provisioning image | ||||||||
boot process fails as the provisioner would simply not receive any data so | ||||||||
it would time out, but the same is true in case of network or hardware | ||||||||
issues so user won't be able to distinguish without investigation. | ||||||||
|
||||||||
## Design Details | ||||||||
|
||||||||
Most of the design is very straightforward and easy to combine with the | ||||||||
reconcile logic, the support for node specific pre-provisioning kernel args | ||||||||
also already exist in Ironic, so right after the feature is implemented in BMO | ||||||||
it can be used without waiting for Ironic implementation to arrive. | ||||||||
|
||||||||
The only challenge is the process of updating/cleaning the previously applied | ||||||||
kernel argument list, this cleanup/update process highly depends on the | ||||||||
particular provisioner. | ||||||||
|
||||||||
This document proposes extending the existing "NodeUpdate" functions of the | ||||||||
Ironic provisioner utilized during provisioning, inspection and registration | ||||||||
with updating the kernel parameters of the Ironic driver_info. In case of | ||||||||
servicing and preparing states/operations "NodeUpdate"s have to be added this | ||||||||
based on the user's workflow there might be a few extra node updates based on | ||||||||
how many tines are the servicing or preparing logic is run. | ||||||||
|
||||||||
### Work Items | ||||||||
|
||||||||
- Implement a string type optional spec field named `preprovisioningExtraKernelParams` | ||||||||
- Extend the generic `Provisioner` interface to support the new optional field | ||||||||
- Implement support for the `preprovisioningExtraKernelParams` in the Ironic driver | ||||||||
- Extend the data types used in different operational states to hold | ||||||||
an entry for `preprovisioningExtraKernelParams` | ||||||||
- Implement unit tests, and align existing unit tests as needed | ||||||||
- Add a check to the existing BMO e2e workflow that checks that the redfish | ||||||||
images and the PXE config files to verify that the provisioner image will | ||||||||
receive the correct parameters. Logging into a running IPA could also work | ||||||||
but in e2e tests Metal3 CI is using upstream IPA images that don't allow | ||||||||
users to log in via neither serial or ssh connections. | ||||||||
|
||||||||
### Dependencies | ||||||||
|
||||||||
- Ironic in case of the `Ironic Provisioner` | ||||||||
|
||||||||
### Test Plan | ||||||||
|
||||||||
- Unit tests for the functions | ||||||||
- Integration testing with VMs, most likely as part of the existing | ||||||||
e2e workflow. | ||||||||
|
||||||||
### Upgrade / Downgrade Strategy | ||||||||
|
||||||||
- In case of upgrade user would not need to do anything | ||||||||
- In case of downgrade the user has to remove the field from the BMH | ||||||||
to both make the BMH compatible with the older API and to allow the | ||||||||
particular provisioner in use to return using the defaults arguments list. | ||||||||
|
||||||||
### Version Skew Strategy | ||||||||
|
||||||||
None | ||||||||
|
||||||||
## Drawbacks | ||||||||
|
||||||||
None | ||||||||
|
||||||||
## Alternatives | ||||||||
|
||||||||
None | ||||||||
|
||||||||
## Reference | ||||||||
|
||||||||
None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.