
Shallow Kernel Git trees via GitHub Actions



Hello, I've finally gotten around to writing the GitHub Actions-based exporter for shallow Linux kernel git trees.

It's very simple: a single shell script and a GHA workflow.

It prepares a shallow bundle (and a ready-to-go .tar too) every day on a schedule and publishes them to GitHub Releases.

Please take a look: https://github.com/rpardini/armbian-git-shallow - there's a README that shows how to use the bundles outside of Armbian.

If this is a good idea, I'd move it to armbian/kernel-git-shallow in GitHub.

And then I'd start using those in armbian-next first as a test ground.

 

 

I'm mentioning a few people I know are interested in the subject, @Igor and @going, but of course everyone is welcome to pitch in.

Thanks

 

(PS: No idea why I'm required to tag a board for this post)


1 hour ago, rpardini said:

(PS: No idea why I'm required to tag a board for this post)

Tagging is kind of stupid. The lesser of two evils was to simply enforce tagging in general. However, I can disable it entirely for particular forums. I guess this could be one of those.


Thanks for the attention guys. I guess I did not explain this correctly...

 

The shallow_kernel_tree.sh script runs on GitHub Actions (never on any user's machine). It downloads a lot of stuff (including a 2.4GB bundle, indeed), updates it, massages it into a shallow tree, and publishes the output.

The output is here: https://github.com/rpardini/armbian-git-shallow/releases/tag/latest 

Each kernel version gets its own shallow bundle (a single file) of around 250MB.

 

That output would then (later) be used by armbian/build to seed a kernel tree, downloading from GH CDNs via HTTPS, instead of pulling everything from kernel.org's (or a mirror's) git server.

After seeding, pulling from the git server is very fast and a small download (a few megabytes).

That way people or CI servers building Armbian will only download those 250MB plus a few more.

 

Combining all this gives us a small, shallow git tree that has the tags we need and can be kept updated cheaply, without ever doing a shallow fetch/clone against the git server, and thus without putting that load on it.
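
To illustrate the seed-then-fetch flow, here is a minimal sketch that runs entirely against throwaway local repositories. Everything in it is a stand-in and an assumption for the demo: the `upstream` repo plays the git server, `seed.bundle` plays a downloaded release asset, and the bundle is a full (not shallow) one to keep the example small. The real bundle names and the exact steps for the shallow bundles are in the README of the repository linked above.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"

# Helper for the demo: create an empty commit with a throwaway identity.
demo_commit() {
  git -C "$1" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "$2"
}

# Stand-in for the upstream git server (a toy repo, not kernel.org).
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/main
demo_commit "$work/upstream" "release v1"
git -C "$work/upstream" tag v1

# Stand-in for the published release asset: a bundle file with all refs.
git -C "$work/upstream" bundle create "$work/seed.bundle" --all 2>/dev/null

# Upstream moves on after the bundle was published.
demo_commit "$work/upstream" "fix after v1"

# Consumer side: seed the tree from the bundle file (no git server involved)...
git clone -q "$work/seed.bundle" "$work/tree"
seeded_commits="$(git -C "$work/tree" rev-list --count HEAD)"

# ...then point origin at the server and fetch only what is missing.
git -C "$work/tree" remote set-url origin "$work/upstream"
git -C "$work/tree" fetch -q origin
fetched_commits="$(git -C "$work/tree" rev-list --count origin/main)"

echo "commits after seeding: $seeded_commits, after fetch: $fetched_commits"
```

The point of the design is that the bulk of the objects arrive as a plain file download, and the git server only has to send whatever was committed after the bundle was built.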

 


15 hours ago, rpardini said:

That output would then (later) be used by armbian/build to seed a kernel tree, downloading from GH CDNs via HTTPS, instead of pulling everything from kernel.org's (or a mirror's) git server.

After seeding, pulling from the git server is very fast and a small download (a few megabytes).

That way people or CI servers building Armbian will only download those 250MB plus a few more.

I can confidently call this the most correct part of the algorithm.

 

15 hours ago, rpardini said:

You've already done everything! Well done!

This is something I am happy to store in a local folder, with the ability to extract from the archive into the target folder cache/sources/linux-mainline.

This approach will save a lot of free space inside the virtual machine.

 

But then the build system does not need to create these archives itself. It becomes simple periodic manual work: update this storage once a month.


21 hours ago, going said:

But then the build system does not need to create these archives itself. It becomes simple periodic manual work: update this storage once a month.

 

Yes. The GitHub Actions workflow is set up to run every day at 3am UTC: https://github.com/rpardini/armbian-git-shallow/blob/main/.github/workflows/main-latest.yml#L6

If you look at the GHA logs, it is already working: it runs every time I push a commit, and also every day at 3am by itself: https://github.com/rpardini/armbian-git-shallow/actions

👍

 

-----

 

Apart from that, I am still investigating some vendor kernels (HardKernel, but also others), especially in the 4.19 series. If I pre-seed the local copy with the Linux 4.19 shallow bundle and then pull from that vendor's kernel tree, it still downloads a lot (1.5GB).

I suspect that is because the vendor started their kernel git tree from somewhere _way_ before 4.19-rc1, so there is a mismatch during git negotiation for the fetch, and it ends up pulling a lot.

@going I think you also regularly pull from the "megous" repository; that would be interesting to test too.
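
One way to check this kind of suspicion on a small scale is the sketch below. It is only a toy model of one possible explanation, and all repo names are hypothetical: `upstream` stands in for mainline, `vendor` forks from a commit older than the tag, and `seed` is a depth-1 shallow clone at the tag. Because the seed has no commit in common with the vendor branch, fetching that branch has to bring in the history from before the shallow boundary as well.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"

# Helper for the demo: create an empty commit with a throwaway identity.
demo_commit() {
  git -C "$1" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "$2"
}

# Toy "mainline": two old commits, then a tagged release.
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/main
demo_commit "$work/upstream" "old-1"
demo_commit "$work/upstream" "old-2"
demo_commit "$work/upstream" "release"
git -C "$work/upstream" tag v1

# Toy "vendor" tree: forks from old-1, i.e. from before the release tag.
git clone -q "$work/upstream" "$work/vendor"
git -C "$work/vendor" reset -q --hard HEAD~2
demo_commit "$work/vendor" "vendor-1"

# Shallow seed at the tag; file:// is needed so --depth is honoured locally.
git clone -q --depth 1 --branch v1 "file://$work/upstream" "$work/seed" 2>/dev/null
seed_commits="$(git -C "$work/seed" rev-list --count HEAD)"

# Fetch the vendor branch into the seed and count the history that arrived.
git -C "$work/seed" fetch -q "file://$work/vendor" main
vendor_commits="$(git -C "$work/seed" rev-list --count FETCH_HEAD)"

echo "seed: $seed_commits commit(s); vendor branch history fetched: $vendor_commits commit(s)"
```

In this toy case the whole vendor history is downloaded, including `old-1`, which the shallow seed had cut away. Whether the real vendor trees behave this way for the same reason would still need to be confirmed.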

 

------

 

Another possible problem is developers/users in China who can't reach github.com for downloads.

For this case we could publish, in addition to GH Releases, to some other FTP or rsync server that we already have. @Igor?

 

 


2 hours ago, rpardini said:

For this case we could publish, in addition to GH Releases, to some other FTP or rsync server that we already have. @Igor?


Yeah, this way, just into some other folder:
https://github.com/armbian/scripts/blob/master/.github/workflows/pack-debian.yml#L230-L241

You need to move the repository under /armbian for the key to work.


On 26.05.2022 at 12:29, rpardini said:

I suspect that is because the vendor started their kernel git tree from somewhere _way_ before 4.19-rc1, so there is a mismatch during git negotiation for the fetch, and it ends up pulling a lot.

@going I think you also regularly pull from the "megous" repository; that would be interesting to test too.

Today I have no problems with this.

The existing algorithm for the sunxi build line works well for repositories "https://github.com/", "https://source.denx.de/", "https://git.kernel.org/".

But it doesn't work for "https://kernel.googlesource.com/".


Pardon me for asking silly questions, but why doesn't Armbian just maintain one master clone of the upstream git repository, with branches set up to track all upstream branches for the boards, and then just fetch from that master clone locally? I am trying to create a patch based on the Collabora WIP patches for the V4L2 HEVC work, and dealing with the build script re-fetching the mainline kernel from the master repository is just too much. So I have cloned the tree myself into my home dir, and changed MAINLINE_KERNEL_SOURCE in lib/configuration.sh to point at the $HOME/linux-source/.git directory. Armbian then takes over and does the full clone locally, and I can tweak/rebase my local copy as needed, without needing to change OFFLINE_WORK or deal with refetching different current/edge versions.


@hege In a sense, all participants in the Armbian project can be called free artists, and the saying applies to all of us: "he is an artist, and this is how he sees it." I implemented the "OFFLINE_WORK" key a long time ago. Its main purpose was to make sure the build system did not send requests to distant countries and then wait for slow responses, which greatly slowed down the build process.
It also does not update local repositories, but only returns them to their original state.
Today, on my local host, I have set up a git daemon, and all requests from the VM are forwarded there. This key is now much less relevant, but it remains in the build system and I hope it will be useful to someone.
