Extend papermill operator to support remote kernels #34840
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
ca6a3ce to c105472
The situation around Papermill is not good; this provider might be suspended in the future (or even soon), see:
Yes, but in the meantime we can accept PRs.
89b463c to a810cb1
I am happy to contribute/maintain papermill.
@Taragolis @eladkal Could you please take a look?
acd40ab to 9594640
- Need to fix Static Checks
- Move generic classes to appropriate places
- Create Connection documentation
- Additional unit tests
Could you explain why it should be a constant value?
We are registering a remote kernel engine (a custom papermill engine) so the operator can work with remote kernels.
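As a rough illustration of what registering a custom engine under a fixed name involves, here is a minimal stand-in written with the standard library only. `EngineRegistry`, `BaseEngine`, `RemoteKernelEngine` and the name `remote_kernel_engine` are hypothetical and are not taken from this PR; papermill's own registry is the `papermill_engines` object in `papermill.engines`, which offers a comparable `register(name, engine)` method.

```python
from typing import Dict, Type

# Hypothetical constant: the engine is looked up by name, so the string used
# when registering and the string used when executing must be identical.
REMOTE_KERNEL_ENGINE = "remote_kernel_engine"


class BaseEngine:
    """Stand-in for an engine base class (an assumption, not papermill's API)."""

    @classmethod
    def execute_notebook(cls, notebook: str) -> str:
        raise NotImplementedError


class RemoteKernelEngine(BaseEngine):
    """Hypothetical engine that would run cells on an externally managed kernel."""

    @classmethod
    def execute_notebook(cls, notebook: str) -> str:
        return f"executed {notebook} on a remote kernel"


class EngineRegistry:
    """Simplified model of a name -> engine-class registry."""

    def __init__(self) -> None:
        self._engines: Dict[str, Type[BaseEngine]] = {}

    def register(self, name: str, engine: Type[BaseEngine]) -> None:
        self._engines[name] = engine

    def get_engine(self, name: str) -> Type[BaseEngine]:
        try:
            return self._engines[name]
        except KeyError:
            raise KeyError(f"No engine named {name!r} is registered") from None


registry = EngineRegistry()
registry.register(REMOTE_KERNEL_ENGINE, RemoteKernelEngine)
result = registry.get_engine(REMOTE_KERNEL_ENGINE).execute_notebook("report.ipynb")
```

Keeping the name in a single module-level constant means the registration site and the lookup site cannot drift apart through a typo.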
Why is there no implementation here?
The operator does not manage the lifecycle of the kernel; it only connects to a remote kernel if one is configured via the hook.
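The split described in this reply can be sketched as follows, assuming a hypothetical `RemoteKernelConnection` class (not the class from this PR). The lifecycle methods are intentionally empty because another system owns the kernel; the only real work is producing connection details. The address and port are arbitrary illustrative values.

```python
from dataclasses import dataclass


@dataclass
class RemoteKernelConnection:
    """Hypothetical connect-only handle for a kernel that another system runs."""

    ip: str
    shell_port: int  # illustrative; a real Jupyter kernel exposes several ports

    def start_kernel(self) -> None:
        # Intentionally empty: the kernel is started externally.
        pass

    def shutdown_kernel(self) -> None:
        # Intentionally empty: this object does not own the kernel.
        pass

    def is_alive(self) -> bool:
        # Assumption: liveness is the external system's responsibility.
        return True

    def connection_url(self) -> str:
        return f"tcp://{self.ip}:{self.shell_port}"


conn = RemoteKernelConnection(ip="10.0.0.5", shell_port=5555)
conn.start_kernel()  # no-op
url = conn.connection_url()
conn.shutdown_kernel()  # no-op
```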
05823c1 to 125ddbf
@Taragolis Thanks for your comments and suggestions. I have updated my PR to address your feedback. Please take a look.
I think you still need to fix the Static Checks; most of them can be fixed by running the pre-commit hooks.
In addition, I think some fixes might be required in the documentation formatting. I've added my assumptions, but it is better to run
breeze build-docs papermill, look at which errors occur, and try to fix them. Some additional useful links which might help to set up a local development environment:
If you have any problems, you can always ask for help in the Slack channel #development-first-pr-support
        
          
docs/apache-airflow-providers-papermill/connections/jupyter_kernel.rst (two outdated review threads, both resolved)
047ab5b to 14ca569
@Taragolis Updated and verified building docs. Thanks
90f52a6 to dcf69aa
dcf69aa to 6364081
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.
This PR adds support for running the papermill operator against kernels that are managed externally by other systems. This is useful for running the operator in cloud environments and also helps with running Spark or Scala notebooks.
It extends papermill with a new engine, using entry_points as described here
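To see which engines are exposed through entry points in a given environment, a small standard-library sketch like the one below can help. The helper name `list_engine_entry_points` is hypothetical, and the group name `papermill.engine` is an assumption based on papermill's extension documentation rather than on anything stated in this PR. The result is simply empty when no package registers an engine in that group.

```python
import sys
from importlib import metadata


def list_engine_entry_points(group: str = "papermill.engine") -> list:
    """Return sorted (name, target) pairs for entry points registered under `group`."""
    if sys.version_info >= (3, 10):
        # Python 3.10+ supports selecting a group directly.
        eps = metadata.entry_points(group=group)
    else:
        # Python 3.8/3.9 return a mapping of group name -> entry points.
        eps = metadata.entry_points().get(group, [])
    return sorted((ep.name, ep.value) for ep in eps)


engines = list_engine_entry_points()
for name, target in engines:
    print(f"{name} -> {target}")
```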
It adds unit tests and a system test that runs in CI environments.
Validated using the steps below in the breeze environment: