Merged
57 commits
b35ffbd
Dask and Pandas repositories now use default branches named 'main'.
lafiona Aug 31, 2022
ed1bbf2
Replace 'master' with 'default' in comment about Travis CI default be…
kevingurney Aug 31, 2022
53271cd
Remove mention of "master" from help text for --arrow-branch crossbow…
kevingurney Aug 31, 2022
9cad4f8
Pandas repository uses 'main' as the default branch.
lafiona Aug 31, 2022
f405a02
Add base_branch property to Release object and modify commits_to_pick…
lafiona Aug 31, 2022
393c5de
Modify 'archery' command line interface to reference the mainline dev…
lafiona Sep 1, 2022
8bc7abe
Dynamically compute the default branch name for archery and crossbow …
lafiona Sep 8, 2022
80a9bb4
Performed python linting
lafiona Sep 8, 2022
3fc0a12
remove duplicate code
lafiona Sep 13, 2022
4a44b57
Print debugging info for default_branch_name Repo class function
lafiona Sep 13, 2022
f0ec4cb
Print more debugging info
lafiona Sep 13, 2022
47b1734
Remove resolve_refish print command
lafiona Sep 13, 2022
61a184e
Add new line between reference object details
lafiona Sep 13, 2022
7d62f4f
Print branches
lafiona Sep 13, 2022
dbee344
Use environment variable, DEFAULT_BRANCH, that is set in the yml file.
lafiona Sep 14, 2022
6d353c0
Remove string concatenation, types incompatible
lafiona Sep 14, 2022
8dc8409
Enable both CI workflows and local repository workflows for getting d…
lafiona Sep 14, 2022
058226c
Enable both CI workflows and local repository workflows for getting d…
lafiona Sep 14, 2022
e6ecbd4
Add DEFAULT_BRANCH environment variable to archery.yml test step for …
lafiona Sep 14, 2022
9fbf6a9
Set workflow-wide environment variable, DEFAULT_BRANCH, for archery.yml
lafiona Sep 14, 2022
07d2486
Print reason for skipping tests
lafiona Sep 14, 2022
0227965
Add 'enable-integration' flag to ensure crossbowcli tests run
lafiona Sep 14, 2022
9c00e3d
Factor out GitFixup step DEFAULT_BRANCH value
lafiona Sep 14, 2022
64dbf18
Run python linting
lafiona Sep 15, 2022
b6173dd
Address bare except and line lengths
lafiona Sep 19, 2022
5bdf348
Add context to error message when obtaining default branch name.
lafiona Sep 19, 2022
d9c2902
add debugging print statement in archery/archery/release/core.py comm…
lafiona Sep 20, 2022
da9ee2f
Debugging statements for DefaultBranchName constructor
lafiona Sep 20, 2022
5d32d6c
Remove base_branch property of Release, instead add default_branch_pr…
lafiona Sep 20, 2022
bf4fe84
Use separate function for computing default branch.
lafiona Sep 20, 2022
9d4e46e
Refactor the default branch code to be calculated within Release class
lafiona Sep 20, 2022
a53db90
Add DEFAULT_BRANCH environment variable to Execute Docker Build step …
lafiona Sep 21, 2022
61cce18
In integration.yml, merge edits from default branch and current feature
lafiona Sep 21, 2022
1ccd756
Add DEFAULT_BRANCH env var for archery docker run command in .travis.yml
lafiona Sep 22, 2022
07d1760
Use git command to get default branch name in .travis.yml
lafiona Sep 23, 2022
db35f60
Fix integration.yml merge
lafiona Sep 23, 2022
ddd6ae8
Set and export the DEFAULT_BRANCH env var for the archery command.
lafiona Sep 23, 2022
0db9218
Remove computation for default branch name from module loading step i…
lafiona Oct 12, 2022
3476ef5
Removing error if default branch cannot be determined, default to
lafiona Oct 12, 2022
2bd3535
Alphabetize the standard library imports in dev/archery/archery/relea…
lafiona Oct 12, 2022
df341b6
Remove error in the case that the default branch name could not be de…
lafiona Oct 12, 2022
a291583
Rename DEFAULT_BRANCH env var to ARCHERY_DEFAULT_BRANCH
lafiona Oct 12, 2022
694106e
Reuse arrow Repo object for getting the default branch name, if needed
lafiona Oct 13, 2022
4b7288d
Update the dask and pandas install scripts to use default branch comp…
lafiona Oct 13, 2022
1cee9a3
Change the flag for indicating upstream development version of Pandas…
lafiona Oct 14, 2022
19fb6c3
Run python linting
lafiona Oct 14, 2022
402a054
Update Dask and Pandas version flag in tasks.yml and dev/archery/arch…
lafiona Oct 19, 2022
4551b59
Update .github/workflows/integration.yml
lafiona Oct 21, 2022
362c9b8
Update dev/archery/archery/release/core.py error message to include s…
lafiona Oct 21, 2022
4be1f79
Update dev/archery/archery/crossbow/core.py to add space in error mes…
lafiona Oct 21, 2022
404a122
Remove () for accessing computed property, default_branch_name
lafiona Oct 21, 2022
5fd786a
Update .github/workflows/integration.yml indentation
lafiona Oct 24, 2022
7bfd4e2
Update .github/workflows/integration.yml
lafiona Oct 25, 2022
37f29e4
Update dev/archery/archery/docker/cli.py
lafiona Oct 25, 2022
edaf2c0
Update docs/source/developers/continuous_integration/docker.rst
lafiona Oct 25, 2022
fb3d142
Update docs/source/developers/continuous_integration/docker.rst
lafiona Oct 25, 2022
374f540
Factor out repo object set up lines from try block
lafiona Oct 25, 2022
7 changes: 4 additions & 3 deletions .github/workflows/archery.yml
@@ -31,6 +31,9 @@ on:
       - 'dev/tasks/**'
       - 'docker-compose.yml'
 
+env:
+  ARCHERY_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
+
 concurrency:
   group: ${{ github.repository }}-${{ github.head_ref || github.sha }}-${{ github.workflow }}
   cancel-in-progress: true
@@ -52,9 +55,7 @@ jobs:
           fetch-depth: 0
       - name: Git Fixup
         shell: bash
-        run: |
-          DEFAULT_BRANCH=${{ github.event.repository.default_branch }}
-          git branch $DEFAULT_BRANCH origin/$DEFAULT_BRANCH || true
+        run: git branch $ARCHERY_DEFAULT_BRANCH origin/$ARCHERY_DEFAULT_BRANCH || true
       - name: Setup Python
         uses: actions/setup-python@v4
         with:
6 changes: 5 additions & 1 deletion .github/workflows/integration.yml
@@ -85,7 +85,11 @@ jobs:
         env:
           ARCHERY_DOCKER_USER: ${{ secrets.DOCKERHUB_USER }}
           ARCHERY_DOCKER_PASSWORD: ${{ secrets.DOCKERHUB_TOKEN }}
-        run: archery docker run -e ARCHERY_INTEGRATION_WITH_RUST=1 conda-integration
+        run: >
+          archery docker run \
+            -e ARCHERY_DEFAULT_BRANCH=${{ github.event.repository.default_branch }} \
+            -e ARCHERY_INTEGRATION_WITH_RUST=1 \
+            conda-integration
       - name: Docker Push
         if: success() && github.event_name == 'push' && github.repository == 'apache/arrow'
         env:
1 change: 1 addition & 0 deletions .travis.yml
@@ -184,6 +184,7 @@ install:
   - sudo -H pip3 install -e dev/archery[docker]
 
 script:
+  - export ARCHERY_DEFAULT_BRANCH=$(git rev-parse --abbrev-ref origin/HEAD | sed s@origin/@@)
   - |
     archery docker run \
       ${DOCKER_RUN_ARGS} \
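The added `script` line derives the branch name by asking git what `origin/HEAD` abbreviates to (typically `origin/main` or `origin/master`) and then removing the `origin/` prefix with `sed`. A rough Python equivalent is sketched below for readers less familiar with the shell idiom. The function names are hypothetical and are not part of archery, and the git call only succeeds in a clone that actually has an `origin/HEAD` reference.

```python
import subprocess


def strip_origin_prefix(abbrev_ref):
    """Mirror `sed s@origin/@@`: drop the first "origin/" occurrence."""
    return abbrev_ref.strip().replace("origin/", "", 1)


def default_branch_from_git(repo_dir="."):
    """Ask git which branch origin/HEAD points at (needs a git checkout)."""
    out = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "origin/HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    return strip_origin_prefix(out)
```

Keeping the string handling in its own function makes it checkable without a repository; only `default_branch_from_git` touches git.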
2 changes: 1 addition & 1 deletion ci/scripts/install_dask.sh
@@ -26,7 +26,7 @@ fi
 
 dask=$1
 
-if [ "${dask}" = "master" ]; then
+if [ "${dask}" = "upstream_devel" ]; then
   pip install https://github.com/dask/dask/archive/main.tar.gz#egg=dask[dataframe]
 elif [ "${dask}" = "latest" ]; then
   pip install dask[dataframe]
2 changes: 1 addition & 1 deletion ci/scripts/install_pandas.sh
@@ -35,7 +35,7 @@ else
   pip install numpy==${numpy}
 fi
 
-if [ "${pandas}" = "master" ]; then
+if [ "${pandas}" = "upstream_devel" ]; then
   pip install git+https://github.com/pandas-dev/pandas.git --no-build-isolation
 elif [ "${pandas}" = "nightly" ]; then
   pip install --extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple --pre pandas
13 changes: 8 additions & 5 deletions dev/archery/archery/cli.py
@@ -529,7 +529,7 @@ def benchmark_run(ctx, rev_or_path, src, preserve, output, cmake_extras,
               help="Hide counters field in diff report.")
 @click.argument("contender", metavar="[<contender>",
                 default=ArrowSources.WORKSPACE, required=False)
-@click.argument("baseline", metavar="[<baseline>]]", default="origin/master",
+@click.argument("baseline", metavar="[<baseline>]]", default="origin/HEAD",
                 required=False)
 @click.pass_context
 def benchmark_diff(ctx, src, preserve, output, language, cmake_extras,
@@ -542,7 +542,8 @@ def benchmark_diff(ctx, src, preserve, output, language, cmake_extras,
 
     The caller can optionally specify both the contender and the baseline. If
     unspecified, the contender will default to the current workspace (like git)
-    and the baseline will default to master.
+    and the baseline will default to the mainline development branch (i.e.
+    default git branch).
 
     Each target (contender or baseline) can either be a git revision
     (commit, tag, special values like HEAD) or a cmake build directory. This
@@ -559,16 +560,18 @@ def benchmark_diff(ctx, src, preserve, output, language, cmake_extras,
     Examples:
 
     \b
-    # Compare workspace (contender) with master (baseline)
+    # Compare workspace (contender) against the mainline development branch
+    # (baseline)
     \b
     archery benchmark diff
 
     \b
-    # Compare master (contender) with latest version (baseline)
+    # Compare the mainline development branch (contender) against the latest
+    # version (baseline)
     \b
     export LAST=$(git tag -l "apache-arrow-[0-9]*" | sort -rV | head -1)
     \b
-    archery benchmark diff master "$LAST"
+    archery benchmark diff <default-branch> "$LAST"
 
     \b
     # Compare g++7 (contender) with clang++-8 (baseline) builds
12 changes: 9 additions & 3 deletions dev/archery/archery/crossbow/cli.py
@@ -96,7 +96,7 @@ def check_config(obj, config_path):
                    'locally. Examples: https://github.com/apache/arrow or '
                    'https://github.com/kszucs/arrow.')
 @click.option('--arrow-branch', '-b', default=None,
-              help='Give the branch name explicitly, e.g. master, ARROW-1949.')
+              help='Give the branch name explicitly, e.g. ARROW-1949.')
 @click.option('--arrow-sha', '-t', default=None,
               help='Set commit SHA or Tag name explicitly, e.g. f67a515, '
                    'apache-arrow-0.11.1.')
@@ -157,7 +157,7 @@ def submit(obj, tasks, groups, params, job_prefix, config_path, arrow_version,
 
 
 @crossbow.command()
-@click.option('--base-branch', default="master",
+@click.option('--base-branch', default=None,
               help='Set base branch for the PR.')
 @click.option('--create-pr', is_flag=True, default=False,
               help='Create GitHub Pull Request')
@@ -192,6 +192,12 @@ def verify_release_candidate(obj, base_branch, create_pr,
 
     # Redefine Arrow repo to use the correct arrow remote.
     arrow = Repo(path=obj['arrow'].path, remote_url=remote)
+
+    # Default value for base_branch is the repository's default branch name
+    if base_branch is None:
+        # Get the default branch name from the repository
+        base_branch = arrow.default_branch_name
+
     response = arrow.github_pr(title=pr_title, head=head_branch,
                                base=base_branch, body=pr_body,
                                github_token=obj['queue'].github_token,
@@ -225,7 +231,7 @@ def verify_release_candidate(obj, base_branch, create_pr,
                    'locally. Examples: https://github.com/apache/arrow or '
                    'https://github.com/kszucs/arrow.')
 @click.option('--arrow-branch', '-b', default=None,
-              help='Give the branch name explicitly, e.g. master, ARROW-1949.')
+              help='Give the branch name explicitly, e.g. ARROW-1949.')
 @click.option('--arrow-sha', '-t', default=None,
               help='Set commit SHA or Tag name explicitly, e.g. f67a515, '
                    'apache-arrow-0.11.1.')
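Changing `--base-branch` from `default="master"` to `default=None` defers the choice until the command body runs, when a `Repo` object is available to ask. The same pattern is sketched below with the standard library's `argparse` rather than click, purely so the example has no third-party dependency. `FakeRepo` and `parse_base_branch` are hypothetical stand-ins, not archery code.

```python
import argparse


class FakeRepo:
    """Hypothetical stand-in for a repository object."""
    default_branch_name = "main"


def parse_base_branch(argv, repo):
    parser = argparse.ArgumentParser()
    # None means "not given"; the real default is resolved after parsing.
    parser.add_argument("--base-branch", default=None,
                        help="Set base branch for the PR.")
    args = parser.parse_args(argv)
    if args.base_branch is None:
        return repo.default_branch_name
    return args.base_branch
```

An explicit `--base-branch` still wins; the repository is consulted only when the option is omitted.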
38 changes: 34 additions & 4 deletions dev/archery/archery/crossbow/core.py
@@ -28,6 +28,7 @@
 from io import StringIO
 from pathlib import Path
 from datetime import date
+import warnings
 
 import jinja2
 from ruamel.yaml import YAML
@@ -133,7 +134,7 @@ def format_all(items, pattern):
 
 # configurations for setting up branch skipping
 # - appveyor has a feature to skip builds without an appveyor.yml
-# - travis reads from the master branch and applies the rules
+# - travis reads from the default branch and applies the rules
 # - circle requires the configuration to be present on all branches, even ones
 # that are configured to be skipped
 # - azure skips branches without azure-pipelines.yml by default
@@ -361,6 +362,29 @@ def signature(self):
         return pygit2.Signature(self.user_name, self.user_email,
                                 int(time.time()))
 
+    @property
+    def default_branch_name(self):
+        default_branch_name = os.getenv("ARCHERY_DEFAULT_BRANCH")
+
+        if default_branch_name is None:
+            try:
+                ref_obj = self.repo.references["refs/remotes/origin/HEAD"]
+                target_name = ref_obj.target
+                target_name_tokenized = target_name.split("/")
+                default_branch_name = target_name_tokenized[-1]
+            except KeyError:
+                # TODO: ARROW-18011 to track changing the hard coded default
+                # value from "master" to "main".
+                default_branch_name = "master"
+                warnings.warn('Unable to determine default branch name: '
+                              'ARCHERY_DEFAULT_BRANCH environment variable is '
+                              'not set. Git repository does not contain a '
+                              '\'refs/remotes/origin/HEAD\' reference. '
+                              'Setting the default branch name to ' +
+                              default_branch_name, RuntimeWarning)
+
+        return default_branch_name
+
     def create_tree(self, files):
         builder = self.repo.TreeBuilder()
 
@@ -382,7 +406,7 @@ def create_commit(self, files, parents=None, message='',
         if parents is None:
             # by default use the main branch as the base of the new branch
             # required to reuse github actions cache across crossbow tasks
-            commit, _ = self.repo.resolve_refish("master")
+            commit, _ = self.repo.resolve_refish(self.default_branch_name)
             parents = [commit.id]
         tree_id = self.create_tree(files)
 
@@ -546,8 +570,10 @@ def github_overwrite_release_assets(self, tag_name, target_commitish,
                         'Unsupported upload method {}'.format(method)
                     )
 
-    def github_pr(self, title, head=None, base="master", body=None,
+    def github_pr(self, title, head=None, base=None, body=None,
                   github_token=None, create=False):
+        # Default value for base is the default_branch_name
+        base = self.default_branch_name if base is None else base
         github_token = github_token or self.github_token
         repo = self.as_github_repo(github_token=github_token)
         if create:
@@ -1289,11 +1315,15 @@ def validate(self):
                     'is: `{}`'.format(task_name, str(e))
                 )
 
+        # Get the default branch name from the repository
+        arrow_source_dir = ArrowSources.find()
+        repo = Repo(arrow_source_dir.path)
+
         # validate that the defined tasks are renderable, in order to do that
         # define the required object with dummy data
         target = Target(
             head='e279a7e06e61c14868ca7d71dea795420aea6539',
-            branch='master',
+            branch=repo.default_branch_name,
             remote='https://github.com/apache/arrow',
             version='1.0.0dev123',
             r_version='0.13.0.100000123',
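The lookup order that `default_branch_name` follows in this file (the `ARCHERY_DEFAULT_BRANCH` environment variable first, then the `refs/remotes/origin/HEAD` symbolic reference, then a hard-coded `"master"` with a warning) can be sketched without pygit2. This is an illustration under stated assumptions, not the code in the PR: `resolve_default_branch`, its parameters, and `FALLBACK_BRANCH` are hypothetical names, and a plain dict stands in for `repo.references`. One deliberate difference from the diff: the sketch strips the `refs/remotes/origin/` prefix instead of taking the last `/`-separated token, so a default branch whose name contains a slash would be returned whole.

```python
import warnings

FALLBACK_BRANCH = "master"  # hypothetical constant mirroring the fallback
ORIGIN_PREFIX = "refs/remotes/origin/"


def resolve_default_branch(environ, references):
    """Return the default branch name using the lookup order from the diff.

    environ: mapping of environment variables (e.g. os.environ).
    references: mapping of reference name -> symbolic target; a stand-in
        for the repository's reference table (assumption for illustration).
    """
    # 1. An explicit override always wins.
    name = environ.get("ARCHERY_DEFAULT_BRANCH")
    if name is not None:
        return name

    # 2. Otherwise follow origin/HEAD to the branch it points at.
    try:
        target = references[ORIGIN_PREFIX + "HEAD"]
    except KeyError:
        # 3. No origin/HEAD: fall back and tell the caller why.
        warnings.warn(
            "Unable to determine default branch name; "
            f"falling back to {FALLBACK_BRANCH!r}",
            RuntimeWarning,
        )
        return FALLBACK_BRANCH

    # Strip only the remote prefix so names containing "/" stay intact.
    if target.startswith(ORIGIN_PREFIX):
        return target[len(ORIGIN_PREFIX):]
    return target.rsplit("/", 1)[-1]


if __name__ == "__main__":
    refs = {"refs/remotes/origin/HEAD": "refs/remotes/origin/main"}
    print(resolve_default_branch({}, refs))
    print(resolve_default_branch({"ARCHERY_DEFAULT_BRANCH": "dev"}, refs))
```

Passing the environment and reference table in as arguments keeps the function free of global state, which makes each of the three branches easy to exercise in isolation.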
3 changes: 2 additions & 1 deletion dev/archery/archery/docker/cli.py
@@ -217,7 +217,8 @@ def docker_run(obj, image, command, *, env, user, force_pull, force_build,
     PYTHON=3.8 archery docker run conda-python
 
     # disable the cache only for the leaf image
-    PANDAS=master archery docker run --no-leaf-cache conda-python-pandas
+    PANDAS=upstream_devel archery docker run --no-leaf-cache \
+        conda-python-pandas
 
     # entirely skip building the image
     archery docker run --no-pull --no-build conda-python
8 changes: 4 additions & 4 deletions dev/archery/archery/docker/tests/test_docker.py
@@ -259,12 +259,12 @@ def test_arrow_example_validation_passes(arrow_compose_path):
 def test_compose_default_params_and_env(arrow_compose_path):
     compose = DockerCompose(arrow_compose_path, params=dict(
         UBUNTU='18.04',
-        DASK='master'
+        DASK='upstream_devel'
     ))
     assert compose.config.dotenv == arrow_compose_env
     assert compose.config.params == {
         'UBUNTU': '18.04',
-        'DASK': 'master',
+        'DASK': 'upstream_devel',
     }
 
 
@@ -492,7 +492,7 @@ def test_compose_push(arrow_compose_path):
 def test_compose_error(arrow_compose_path):
     compose = DockerCompose(arrow_compose_path, params=dict(
         PYTHON='3.8',
-        PANDAS='master'
+        PANDAS='upstream_devel'
     ))
 
     error = subprocess.CalledProcessError(99, [])
@@ -503,7 +503,7 @@ def test_compose_error(arrow_compose_path):
     exception_message = str(exc.value)
     assert "exited with a non-zero exit code 99" in exception_message
     assert "PANDAS: latest" in exception_message
-    assert "export PANDAS=master" in exception_message
+    assert "export PANDAS=upstream_devel" in exception_message
 
 
 def test_image_with_gpu(arrow_compose_path):
49 changes: 45 additions & 4 deletions dev/archery/archery/release/core.py
@@ -18,8 +18,9 @@
 from abc import abstractmethod
 from collections import defaultdict
 import functools
-import re
+import os
 import pathlib
+import re
 import shelve
 import warnings
 
@@ -361,6 +362,46 @@ def commits(self):
         commit_range = f"{lower}..{upper}"
         return list(map(Commit, self.repo.iter_commits(commit_range)))
 
+    @cached_property
+    def default_branch(self):
+        default_branch_name = os.getenv("ARCHERY_DEFAULT_BRANCH")
+
+        if default_branch_name is None:
+            # Set up repo object
+            arrow = ArrowSources.find()
+            repo = Repo(arrow.path)
+            origin = repo.remotes["origin"]
+            origin_refs = origin.refs
+
+            try:
+                # Get git.RemoteReference object to origin/HEAD
+                # If the reference does not exist, a KeyError will be thrown
+                origin_head = origin_refs["HEAD"]
+
+                # Get git.RemoteReference object to origin/default-branch-name
+                origin_head_reference = origin_head.reference
+
+                # Get string value of remote head reference, should return
+                # "origin/main" or "origin/master"
+                origin_head_name = origin_head_reference.name
+                origin_head_name_tokenized = origin_head_name.split("/")
+
+                # The last token is the default branch name
+                default_branch_name = origin_head_name_tokenized[-1]
+            except KeyError:
+                # Use a hard-coded default value to set default_branch_name
+                # TODO: ARROW-18011 to track changing the hard coded default
+                # value from "master" to "main".
+                default_branch_name = "master"
+                warnings.warn('Unable to determine default branch name: '
+                              'ARCHERY_DEFAULT_BRANCH environment variable is '
+                              'not set. Git repository does not contain a '
+                              '\'refs/remotes/origin/HEAD\' reference. '
+                              'Setting the default branch name to ' +
+                              default_branch_name, RuntimeWarning)
+
+        return default_branch_name
+
     def curate(self, minimal=False):
         # handle commits with parquet issue key specially and query them from
         # jira and add it to the issues
@@ -422,9 +463,9 @@ def changelog(self):
         return JiraChangelog(release=self, categories=categories)
 
     def commits_to_pick(self, exclude_already_applied=True):
-        # collect commits applied on the main branch since the root of the
+        # collect commits applied on the default branch since the root of the
         # maintenance branch (the previous major release)
-        commit_range = f"{self.previous.tag}..master"
+        commit_range = f"{self.previous.tag}..{self.default_branch}"
 
         # keeping the original order of the commits helps to minimize the merge
         # conflicts during cherry-picks
@@ -476,7 +517,7 @@ def branch(self):
 
     @property
     def base_branch(self):
-        return "master"
+        return self.default_branch
 
     @cached_property
     def siblings(self):
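Because `default_branch` is declared with `cached_property`, the lookup runs once per `Release` instance and later reads reuse the stored value. The sketch below illustrates that behaviour with `functools.cached_property` from the standard library (Python 3.8+); whether archery's `cached_property` is that exact decorator is not visible in this diff, so treat it as an assumption. `FakeRelease` is a hypothetical class, and the `"master"` default passed to `os.getenv` stands in for the git lookup.

```python
import os
from functools import cached_property


class FakeRelease:
    """Hypothetical, minimal stand-in for a release object."""

    def __init__(self, previous_tag):
        self.previous_tag = previous_tag

    @cached_property
    def default_branch(self):
        # Evaluated on first access only; the result is stored on the instance.
        return os.getenv("ARCHERY_DEFAULT_BRANCH", "master")

    def commit_range(self):
        # Same shape as the range built in commits_to_pick().
        return f"{self.previous_tag}..{self.default_branch}"


if __name__ == "__main__":
    os.environ["ARCHERY_DEFAULT_BRANCH"] = "main"
    release = FakeRelease("apache-arrow-9.0.0")
    print(release.commit_range())
    # Changing the variable afterwards does not affect this instance.
    os.environ["ARCHERY_DEFAULT_BRANCH"] = "other"
    print(release.commit_range())
```

One consequence worth keeping in mind: an environment change made after the first access is only seen by instances created later.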
4 changes: 2 additions & 2 deletions dev/tasks/tasks.yml
@@ -1492,7 +1492,7 @@ tasks:
                                                                      ("3.7", "latest", "latest", False),
                                                                      ("3.8", "latest", "latest", False),
                                                                      ("3.8", "nightly", "nightly", False),
-                                                                     ("3.9", "master", "nightly", False)] %}
+                                                                     ("3.9", "upstream_devel", "nightly", False)] %}
   test-conda-python-{{ python_version }}-pandas-{{ pandas_version }}:
     ci: github
     template: docker-tests/github.linux.yml
@@ -1512,7 +1512,7 @@ tasks:
       image: conda-python-pandas
 {% endfor %}
 
-{% for dask_version in ["latest", "master"] %}
+{% for dask_version in ["latest", "upstream_devel"] %}
   test-conda-python-3.9-dask-{{ dask_version }}:
     ci: github
     template: docker-tests/github.linux.yml
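The Jinja loops in `tasks.yml` expand into one task per version value, so renaming `master` to `upstream_devel` also renames the generated tasks. A small sketch of that expansion for the dask loop follows, written in plain Python rather than Jinja; it is not how crossbow renders the file, and the function name is hypothetical.

```python
def dask_task_names(versions=("latest", "upstream_devel")):
    """Expand the dask loop's task-name template for each version value."""
    return [f"test-conda-python-3.9-dask-{version}" for version in versions]


if __name__ == "__main__":
    for name in dask_task_names():
        print(name)
```

Anything that refers to the old `...-dask-master` task name by string would need the same rename.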