@kou
Member

@kou kou commented Mar 28, 2025

Rationale for this change

We want to stop using apache.jfrog.io as much as possible. See also: #40760

We can use https://github.com/apache/arrow/releases as an alternative.

What changes are included in this PR?

If we use GitHub Release, we can automate uploading by GitHub Actions.

But GitHub Release doesn't support a directory structure, so we need to put all artifacts directly under https://github.com/apache/arrow/releases/download/apache-arrow-X.Y.Z/ .

For example:

apache.jfrog.io:

libarrow/bin/darwin-arm64-openssl-1.1/arrow-X.Y.Z.zip
libarrow/bin/darwin-arm64-openssl-3.0/arrow-X.Y.Z.zip
libarrow/bin/darwin-x86_64-openssl-1.1/arrow-X.Y.Z.zip
libarrow/bin/darwin-x86_64-openssl-3.0/arrow-X.Y.Z.zip
libarrow/bin/linux-openssl-1.0/arrow-X.Y.Z.zip
libarrow/bin/linux-openssl-1.1/arrow-X.Y.Z.zip
libarrow/bin/linux-openssl-3.0/arrow-X.Y.Z.zip
libarrow/bin/windows/arrow-X.Y.Z.zip

GitHub Actions:

r-libarrow-darwin-arm64-openssl-1.1-X.Y.Z.zip
r-libarrow-darwin-arm64-openssl-3.0-X.Y.Z.zip
r-libarrow-darwin-x86_64-openssl-1.1-X.Y.Z.zip
r-libarrow-darwin-x86_64-openssl-3.0-X.Y.Z.zip
r-libarrow-linux-x86_64-openssl-1.0-X.Y.Z.zip
r-libarrow-linux-x86_64-openssl-1.1-X.Y.Z.zip
r-libarrow-linux-x86_64-openssl-3.0-X.Y.Z.zip
r-libarrow-windows-x86_64-X.Y.Z.zip
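
To make the renaming above concrete, here is a minimal Python sketch that maps an old apache.jfrog.io relative path to a flat release asset name and download URL. It is an illustration only, not the logic in r/tools/nixlibs.R: the names `old_path_to_asset_name` and `asset_url` are hypothetical, and the sketch assumes that the old Linux and Windows paths (which carry no architecture) correspond to x86_64, as the new names in the list suggest.

```python
import re

# Base URL taken from the PR description; the tag is assumed to be
# "apache-arrow-<version>".
RELEASE_BASE = "https://github.com/apache/arrow/releases/download"

# Old layout: libarrow/bin/<platform>/arrow-<version>.zip
_OLD_PATH = re.compile(
    r"^libarrow/bin/(?P<platform>[^/]+)/arrow-(?P<version>[^/]+)\.zip$"
)


def _split_old_path(old_path: str) -> tuple[str, str]:
    """Return (platform, version) with the architecture made explicit."""
    match = _OLD_PATH.match(old_path)
    if match is None:
        raise ValueError(f"unrecognized path: {old_path}")
    platform = match["platform"]
    # Assumption: old Linux/Windows paths without an architecture are x86_64.
    if platform == "windows":
        platform = "windows-x86_64"
    elif platform.startswith("linux-"):
        platform = "linux-x86_64-" + platform[len("linux-"):]
    return platform, match["version"]


def old_path_to_asset_name(old_path: str) -> str:
    """Map an old directory-based path to a flat asset file name."""
    platform, version = _split_old_path(old_path)
    return f"r-libarrow-{platform}-{version}.zip"


def asset_url(old_path: str) -> str:
    """Build the flat GitHub Release download URL for an old path."""
    _, version = _split_old_path(old_path)
    name = old_path_to_asset_name(old_path)
    return f"{RELEASE_BASE}/apache-arrow-{version}/{name}"
```

For example, `old_path_to_asset_name("libarrow/bin/linux-openssl-3.0/arrow-19.0.1.zip")` yields a name with `linux-x86_64` spelled out, which is the only structural difference beyond flattening the directories.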

Are these changes tested?

No.

Are there any user-facing changes?

Yes. A custom arrow_repo no longer works because the URL pattern has changed.

@github-actions

⚠️ GitHub issue #45921 has been automatically assigned in GitHub to PR creator.

@kou
Member Author

kou commented Mar 28, 2025

@jonkeane @amoeba @assignUser We want to stop using apache.jfrog.io because it's unstable. See also: #40760
FYI: apache.jfrog.io will not be down this year: https://issues.apache.org/jira/browse/INFRA-26324

I think that we can use GitHub Release or https://repo1.maven.org/maven2/org/apache/arrow as an alternative to apache.jfrog.io. This PR is for the GitHub Release path.

GitHub Release will be easier to maintain, but we can't use a "directory" in GitHub Release. The pre-built R binaries use directories. For example:

https://apache.jfrog.io/ui/native/arrow/r/19.0.1/

libarrow/bin/darwin-arm64-openssl-1.1/arrow-19.0.1.zip
libarrow/bin/darwin-arm64-openssl-3.0/arrow-19.0.1.zip
libarrow/bin/darwin-x86_64-openssl-1.1/arrow-19.0.1.zip
libarrow/bin/darwin-x86_64-openssl-3.0/arrow-19.0.1.zip
libarrow/bin/linux-openssl-1.0/arrow-19.0.1.zip
libarrow/bin/linux-openssl-1.1/arrow-19.0.1.zip
libarrow/bin/linux-openssl-3.0/arrow-19.0.1.zip
libarrow/bin/windows/arrow-19.0.1.zip

They will be the following with GitHub Release:

(We can remove the r-lib__libarrow__bin__ prefix and the __arrow-19.0.1.zip suffix.)

If we use GitHub Release, we need to change r/tools/nixlibs.R like 30587cb#diff-935746c34b16289a07b0d9bf7642dbd268b18059b6187f7cdec7c464be47a3de . (We can create https://github.com/apache/arrow/releases/download/X.Y.Z/... URLs for existing releases by copying existing artifacts in apache.jfrog.io to GitHub Release.)

This changes the binary URL pattern, so it is a backward-incompatible change for existing users of arrow.repo. (I'm not sure how many users use it.)

What do you think about this approach?

Should we use https://repo1.maven.org/maven2/org/apache/arrow instead of GitHub Release? We can use directories with https://repo1.maven.org/maven2/org/apache/arrow , which is why our new APT/Yum repositories use it rather than GitHub Release.

@amoeba
Member

amoeba commented Mar 28, 2025

Thanks @kou. I think GitHub Releases should work fine though others may be aware of other issues. Do I have it right that release assets aren't subject to rate limiting?

@kou
Member Author

kou commented Mar 28, 2025

Right.

GitHub documents the API rate limits explicitly: https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28
But GitHub doesn't explicitly mention a rate limit for GitHub Release.

@assignUser
Member

I am ok with this in principle, but it might cause issues with CRAN, as they are known to dislike GitHub for not being a 'reliable' source. Not that I agree; I think it's a pretty ridiculous statement, but whatever. I don't think that they have any blocks in place, but we should be aware anyway.

There are a number of packages that download stuff from GitHub, so there's that: https://github.com/search?q=org%3Acran+%2Fgithub.com%5C%2F.*%5C%2Freleases%2F+language%3AR&type=code&l=R

@kou
Member Author

kou commented Mar 29, 2025

Thanks for sharing the note.

This is only for the pre-built C++ binaries, not the source archive. The source archive URL isn't changed. I thought that we don't use the pre-built C++ binaries on CRAN. We use bundled C++ on CRAN instead. Is that correct?

@jonkeane
Member

I thought that we don't use the pre-built C++ binaries on CRAN. We use bundled C++ on CRAN instead. Is that correct?

We use bundled C++ for macOS and Linux on CRAN, but for Windows we do use pre-built binaries. @assignUser has been working on getting arrow stuff ported to MXE so that we can be in compliance with that process for CRAN, but IIUC that's not done yet. We might be ok to have a GitHub link to the binary there (I don't believe that they specifically check the URL / block GitHub, just that if we talk about it loudly they will complain that that's not acceptable; maybe that "just" makes it so we need to speed up the MXE work?)

@kou
Member Author

kou commented Mar 30, 2025

Oh, sorry. I misunderstood...

@assignUser
Member

This changes the binary URL pattern, so it is a backward-incompatible change for existing users of arrow.repo. (I'm not sure how many users use it.)

We are not planning to remove the previous binaries from jfrog, right? So I don't have an issue with that, and I don't think we need to create past releases as GitHub releases or in Maven.

We might be ok to have a GitHub link to the binary there (I don't believe that they specifically check the URL / block GitHub, just that if we talk about it loudly they will complain that that's not acceptable

Agreed, I think we can try it out.

(We can remove the r-lib__libarrow__bin__ prefix and the __arrow-19.0.1.zip suffix.)

I think we should mark their use for R somehow in the file name, but I agree that we can shorten the name a lot.

@kou
Member Author

kou commented Mar 30, 2025

We are not planning to remove the previous binaries from jfrog, right?

Right. We must not remove the previous binaries, to keep backward compatibility.

(We can remove the r-lib__libarrow__bin__ prefix and the __arrow-19.0.1.zip suffix.)

I think we should mark their use for R somehow in the file name, but I agree that we can shorten the name a lot.

How about r-libarrow-darwin-arm64-openssl-1.1-X.Y.Z.zip?

@kou kou force-pushed the release-r-github-release branch 3 times, most recently from 40737ae to 30fd756 Compare April 4, 2025 05:48
@eitsupi
Contributor

eitsupi commented May 5, 2025

GitHub Actions:

r-lib__libarrow__bin__darwin-arm64-openssl-1.1__arrow-X.Y.Z.zip
r-lib__libarrow__bin__darwin-arm64-openssl-3.0__arrow-X.Y.Z.zip
r-lib__libarrow__bin__darwin-x86_64-openssl-1.1__arrow-X.Y.Z.zip
r-lib__libarrow__bin__darwin-x86_64-openssl-3.0__arrow-X.Y.Z.zip
r-lib__libarrow__bin__linux-openssl-1.0__arrow-X.Y.Z.zip
r-lib__libarrow__bin__linux-openssl-1.1__arrow-X.Y.Z.zip
r-lib__libarrow__bin__linux-openssl-3.0__arrow-X.Y.Z.zip
r-lib__libarrow__bin__windows__arrow-X.Y.Z.zip

Related to #36193: if you do change the URLs, I think it would be useful to include the CPU architecture in the Linux and Windows ones as well, so that they are ready for arm64.

@amoeba
Member

amoeba commented Jun 30, 2025

Hi @kou, I think that URL pattern looks good. Do we want to try this for the 21 release? cc @assignUser

@kou
Member Author

kou commented Jul 1, 2025

Ah, sorry. I haven't completed this yet.

Let's try this in the 22 release. I want to try automated release signing for source archive in the 21 release.

@amoeba
Member

amoeba commented Jul 1, 2025

Sounds good. Thanks for your work on this (and that).

@kou kou force-pushed the release-r-github-release branch from 30fd756 to c8fe76a Compare September 8, 2025 03:03
@kou
Member Author

kou commented Sep 8, 2025

I'm restarting this for 22.0.0. I'll use the following filenames:

r-libarrow-darwin-arm64-openssl-1.1-X.Y.Z.zip
r-libarrow-darwin-arm64-openssl-3.0-X.Y.Z.zip
r-libarrow-darwin-x86_64-openssl-1.1-X.Y.Z.zip
r-libarrow-darwin-x86_64-openssl-3.0-X.Y.Z.zip
r-libarrow-linux-x86_64-openssl-1.0-X.Y.Z.zip
r-libarrow-linux-x86_64-openssl-1.1-X.Y.Z.zip
r-libarrow-linux-x86_64-openssl-3.0-X.Y.Z.zip
r-libarrow-windows-x86_64-X.Y.Z.zip
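
Because every name in this list now carries the OS and CPU architecture, the components can be recovered from the file name alone. The following Python sketch shows one way a consumer might parse such a name. It is hypothetical: `AssetName` and `parse_asset_name` are illustrative names, and the accepted OS and architecture values are only those that appear in the list above.

```python
import re
from typing import NamedTuple, Optional


class AssetName(NamedTuple):
    os_name: str
    arch: str
    openssl: Optional[str]  # None for the Windows asset, which has no OpenSSL part
    version: str


# Pattern: r-libarrow-<os>-<arch>[-openssl-<major.minor>]-<X.Y.Z>.zip
_ASSET = re.compile(
    r"^r-libarrow-(?P<os>darwin|linux|windows)-(?P<arch>arm64|x86_64)"
    r"(?:-openssl-(?P<openssl>\d+\.\d+))?-(?P<version>\d+\.\d+\.\d+)\.zip$"
)


def parse_asset_name(name: str) -> AssetName:
    """Split a flat asset file name into its components."""
    match = _ASSET.match(name)
    if match is None:
        raise ValueError(f"unexpected asset name: {name}")
    return AssetName(match["os"], match["arch"], match["openssl"], match["version"])
```

A strict pattern like this rejects malformed names (for example, one with a doubled hyphen before the version), which may be useful as a sanity check in an upload job.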

@kou kou force-pushed the release-r-github-release branch from c8fe76a to 08dee0a Compare September 8, 2025 08:02
@kou kou force-pushed the release-r-github-release branch from 08dee0a to c919967 Compare September 8, 2025 08:06
@kou
Member Author

kou commented Sep 8, 2025

@github-actions crossbow submit r-binary-packages

@amoeba
Member

amoeba commented Sep 19, 2025

Oh, right. I was downloading the Actions artifact which wraps the file up in a zip. Sorry for the mistake.

Yeah, removing the inner folder would be good I think.

@kou
Member Author

kou commented Sep 19, 2025

Hmm. arrow-21.0.0.zip in https://packages.apache.org/ui/native/arrow/r/21.0.0/libarrow/bin/windows/ also has the top-level directory:

$ unzip -l arrow-21.0.0.zip | head
Archive:  arrow-21.0.0.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2025-07-11 18:04   arrow-21.0.0/
        0  2025-07-11 16:58   arrow-21.0.0/include/
        0  2025-07-11 16:58   arrow-21.0.0/include/arrow/
        0  2025-07-11 16:58   arrow-21.0.0/include/arrow/acero/
     5985  2025-07-11 16:58   arrow-21.0.0/include/arrow/acero/accumulation_queue.h
     2201  2025-07-11 16:58   arrow-21.0.0/include/arrow/acero/aggregate_node.h
     1151  2025-07-11 16:58   arrow-21.0.0/include/arrow/acero/api.h

Where did you download arrow-21.0.0.zip?

@amoeba
Member

amoeba commented Sep 19, 2025

Odd. Maybe there is just some platform difference and everything is fine.

# I get the URL from the R package by asking it to install with the binaries
$ wget https://apache.jfrog.io/artifactory/arrow/r/21.0.0/libarrow/bin/darwin-arm64-openssl-3.0/arrow-21.0.0.zip

$ unzip -l arrow-21.0.0.zip | head
Archive:  arrow-21.0.0.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  07-11-2025 00:57   lib/
        0  07-11-2025 00:57   lib/pkgconfig/
     1195  07-11-2025 00:55   lib/pkgconfig/arrow-dataset.pc
     1006  07-11-2025 00:55   lib/pkgconfig/arrow-csv.pc
     1176  07-11-2025 00:55   lib/pkgconfig/parquet.pc
     1008  07-11-2025 00:55   lib/pkgconfig/arrow-json.pc
     1096  07-11-2025 00:55   lib/pkgconfig/arrow-compute.pc

Edit: It looks like all the packages on https://packages.apache.org/ui/native/arrow/r/21.0.0/libarrow/bin/windows/ are as you say so I'm comparing the wrong packages. Maybe everything is fine?

@kou
Member Author

kou commented Sep 19, 2025

Maybe everything is fine?

I think so.

FYI: I don't know why, but only the Windows binaries use a different directory structure:

arrow/r/tools/nixlibs.R

Lines 993 to 1000 in 479662e

# configure.win uses a different libarrow dir and the zip is already nested
if (on_windows) {
  lib_dir <- "windows"
  dst_dir <- lib_dir
} else {
  lib_dir <- "libarrow"
  dst_dir <- file.path(lib_dir, arrow_versioned)
}

I can unify them (either all binaries have the top-level directory or none of them do) but a follow-up task is better.
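
The layout difference visible in the two `unzip -l` listings earlier in this thread can also be checked programmatically. Below is a small Python sketch using only the standard `zipfile` module. It is not part of the PR: `single_top_level_dir` and `make_zip` are hypothetical helper names, and the archives in the usage note are synthetic stand-ins rather than the real release zips.

```python
import io
import zipfile
from typing import Iterable, Optional


def make_zip(names: Iterable[str]) -> io.BytesIO:
    """Build an in-memory zip with empty entries (for demonstration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "")
    buffer.seek(0)
    return buffer


def single_top_level_dir(zip_file) -> Optional[str]:
    """Return the archive's sole top-level directory, or None if it has none.

    `zip_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_file) as archive:
        names = [name for name in archive.namelist() if name]
    # The first path component of every entry.
    roots = {name.split("/", 1)[0] for name in names}
    # Nested layout: exactly one root, and every entry lives under it.
    if len(roots) == 1 and all("/" in name for name in names):
        return roots.pop()
    return None
```

With a synthetic archive shaped like the Windows listing, `single_top_level_dir(make_zip(["arrow-21.0.0/include/arrow/api.h"]))` returns the nested directory name, while an archive with separate `lib/` and `include/` roots returns `None`.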

@amoeba
Member

amoeba commented Sep 19, 2025

but a follow-up task is better.

Absolutely. Thanks again for the work on this.

@kou
Member Author

kou commented Sep 22, 2025

I'll merge this this week if nobody objects.

Member

@raulcd raulcd left a comment

From the limited knowledge I have on the R binaries / package, this seems reasonable to me.
Thanks @kou for working on this!

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting committer review Awaiting committer review labels Sep 23, 2025
Member

@jonkeane jonkeane left a comment

Thanks for this. As we said at the start, there is a small chance that this will bonk us with CRAN and we will need to do something else for the Windows builds. But I'm ok to try it. Our Windows builds on CRAN have been pretty stable recently, so I'm not super worried that this will be the flag that goes up next release.

Thanks, again!

Comment on lines +264 to +265
name: r-libarrow-darwin-{{ arch }}-openssl-{{ openssl_version }}
path: repo/libarrow
Member

So much cleaner, thank you!

@kou
Member Author

kou commented Sep 23, 2025

@github-actions crossbow submit r-binary-packages

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting merge Awaiting merge awaiting changes Awaiting changes labels Sep 23, 2025
@github-actions

Revision: 32d6c41

Submitted crossbow builds: ursacomputing/crossbow @ actions-5df309f890

Task Status
r-binary-packages GitHub Actions

@kou kou merged commit 19e3f90 into apache:main Sep 26, 2025
19 checks passed
@kou kou removed the awaiting change review Awaiting change review label Sep 26, 2025
@kou kou deleted the release-r-github-release branch September 26, 2025 13:34
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 19e3f90.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them.

thisisnic pushed a commit that referenced this pull request Oct 7, 2025
### Rationale for this change

#45964 changed paths of pre-built Apache Arrow C++ binaries for R. But we forgot to update the nightly upload job.

### What changes are included in this PR?

Update paths in the nightly upload job.

### Are these changes tested?

No...

### Are there any user-facing changes?

Yes.
* GitHub Issue: #47704

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Nic Crane <[email protected]>
raulcd pushed a commit that referenced this pull request Oct 8, 2025
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Oct 15, 2025
…pache#45964)

Lead-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Oct 15, 2025
…he#47727)
