git-annex.git
9 months ago safer git sha object filename
Joey Hess [Tue, 4 Mar 2025 18:54:13 +0000 (14:54 -0400)]
safer git sha object filename

Rather than use the filename provided by INPUT, which could come from user
input, and so could be something that looks like a dashed parameter,
use a .git/object/<sha> filename.

This avoids user input passing through INPUT and back out, with the file
path then passed to a command, which could do something unexpected with
a dashed parameter, or other special parameter.

Added a note in the design about being careful of passing user input to
commands. They still have to be careful of that in general, just not in
this case.
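
To illustrate the hazard, here is a minimal Python sketch (not git-annex code). The helper names `looks_like_option` and `sha_object_path` are hypothetical, and the `.git/object/<sha>` layout is simply copied from the message above. The point is only that a filename derived from a hex sha can never start with a dash, while a user-provided one can.

```python
import os

def looks_like_option(filename: str) -> bool:
    """True if a command could mistake this filename for a dashed parameter."""
    return filename.startswith("-")

def sha_object_path(sha: str) -> str:
    """Build a filename from a hex sha only, ignoring any user-provided name.

    The directory layout is taken from the commit message and is otherwise
    an assumption made for illustration.
    """
    if not sha or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError("not a hex sha")
    return os.path.join(".git", "object", sha)

# A filename that came from user input and would be parsed as an option.
user_name = "--output=/etc/passwd"

# A filename derived only from the sha of the content.
safe = sha_object_path("0123abcd")
```

Here `looks_like_option(user_name)` is true, while `looks_like_option(safe)` is false for any valid sha.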

9 months ago cycle detection
Joey Hess [Tue, 4 Mar 2025 18:06:55 +0000 (14:06 -0400)]
cycle detection

9 months ago improve error message when unable to get an input file
Joey Hess [Tue, 4 Mar 2025 17:13:18 +0000 (13:13 -0400)]
improve error message when unable to get an input file

In this case, the compute program is run the same as if addcomputed --fast
were used, so it should succeed, without outputting a computed file.

computeInputsUnavailable is in ComputeState for simplicity, but it is
not serialized with the rest of the ComputeState.

9 months ago update location log after getting input file from remote
Joey Hess [Tue, 4 Mar 2025 16:51:38 +0000 (12:51 -0400)]
update location log after getting input file from remote

9 months ago better wording
Joey Hess [Tue, 4 Mar 2025 16:43:50 +0000 (12:43 -0400)]
better wording

Avoids this contradiction:

(Auto enabling special remote foo...)

  Not enabling compute special remote c2 because [..]

9 months ago compute remote: get input files from other remotes
Joey Hess [Tue, 4 Mar 2025 15:06:58 +0000 (11:06 -0400)]
compute remote: get input files from other remotes

This needed some refactoring to avoid cycles, since Remote.Compute
cannot import Remote.List. Instead, it uses Annex.remotes. Which must be
populated by something else, but we know it has been, because something
is using Remote.Compute, which it must have found in the remote list,
which populates that.

In Remote.Compute, keyPossibilities' is called with all loggedLocations,
without the trustExclude DeadTrusted that keyLocations does. There is
another cycle there. This may be a problem if a dead repository is still
a remote.

This is missing cycle prevention, and it's certainly possible to make 2
files in the compute remote co-depend on one another. Hopefully not in a
real world situation, but an attacker could certainly do it. Cycle
prevention will need to be added to this.
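
The co-dependency problem is a cycle in the graph of which files are inputs to which computed files. As a minimal sketch of how such a cycle can be detected (written in Python for illustration; this is not git-annex's implementation, and `find_cycle` and `inputs_of` are hypothetical names):

```python
def find_cycle(inputs_of, start):
    """Return a list of files forming a cycle reachable from start, or None.

    inputs_of maps a computed file to the list of files it needs as inputs.
    Files that are not computed simply have no entry.
    """
    path = []        # files on the current depth-first path, in order
    on_path = set()  # same files, for fast membership tests
    done = set()     # files already fully explored without finding a cycle

    def visit(f):
        if f in on_path:
            # f is needed (indirectly) to compute itself.
            return path[path.index(f):] + [f]
        if f in done:
            return None
        on_path.add(f)
        path.append(f)
        for dep in inputs_of.get(f, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(f)
        done.add(f)
        return None

    return visit(start)
```

With `{"a": ["b"], "b": ["a"]}` and a start of `"a"`, this reports the cycle `["a", "b", "a"]`. A graph without co-dependent files yields `None`.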

9 months ago move showOutput into compute remote
Joey Hess [Tue, 4 Mar 2025 14:02:33 +0000 (10:02 -0400)]
move showOutput into compute remote

9 months ago rename config to annex.security.allowed-compute-programs
Joey Hess [Mon, 3 Mar 2025 20:07:04 +0000 (16:07 -0400)]
rename config to annex.security.allowed-compute-programs

And require it for enable as well as autoenable.

It seemed asking for trouble for `git-annex enable foo` to use whatever
compute program is stored in the git config, without verifying that the
user wants that program to be used.

Note that it would be good to allow `git-annex enable foo program=...`
to be used without the program being in the git config. Not implemented yet
though.

9 months ago autoenable security for compute special remote
Joey Hess [Mon, 3 Mar 2025 19:47:09 +0000 (15:47 -0400)]
autoenable security for compute special remote

Added annex.security.autoenable-compute-programs and only allow
autoenabling special remotes that use compute programs on that list.

The reason this is needed is a user might have some compute programs
that are less safe to use than others. They might want to use an unsafe
one only with one repository, where they are the only committer or other
committers are trusted. They might be ok with others being used by any
repository, and if so they can add them to the list.

Another reason would be a user who has installed a compute program by
accident. Eg, it might be included with git-annex at some point, or
pulled in by some dependency. That user doesn't necessarily want that
compute program to be used in an autoenabled special remote.

9 months ago recompute: display one of the changed files
Joey Hess [Mon, 3 Mar 2025 19:12:19 +0000 (15:12 -0400)]
recompute: display one of the changed files

9 months ago avoid recomputing every time on git inputs
Joey Hess [Mon, 3 Mar 2025 18:56:49 +0000 (14:56 -0400)]
avoid recomputing every time on git inputs

9 months ago support git files as input to computations
Joey Hess [Mon, 3 Mar 2025 15:59:04 +0000 (11:59 -0400)]
support git files as input to computations

Using GIT keys, like those used when exporting git files to special
remotes. Except here the GIT key refers to a file checked into the git
repo.

Note that, since the compute remote uses catObject to get the content,
a symlink that is checked into git does not get followed. This is important
for security, because following a symlink and adding the content to the
repo as an annex object would allow exfiltrating content from outside
the repository.

Instead, the behavior with a symlink is to run the computation on the
symlink target. This may turn out to be confusing, and it might be worth
addcomputed checking if the file in git is a symlink and erroring out.
Or it could follow symlinks as long as the destination is a file in the
repository.
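
To make the distinction concrete: git records a symlink as a blob whose content is the link's target path. A small Python sketch (illustrative only, not git-annex code; the file names are made up, and it needs a platform that supports symlinks) contrasts that blob-like content with what following the link would read:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    # Stand-in for a file the repository should not be able to exfiltrate.
    secret = os.path.join(d, "secret")
    with open(secret, "w") as f:
        f.write("private data")

    # A symlink pointing at it, as might be checked into git.
    link = os.path.join(d, "link")
    os.symlink(secret, link)

    # What is recorded for a symlink: just the target path.
    blob_like = os.readlink(link)

    # What following the symlink would expose instead.
    with open(link) as f:
        followed = f.read()
```

Here `blob_like` is the path of `secret`, while `followed` is its content. A computation fed the former never sees the latter.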

9 months ago factor out Annex.GitShaKey
Joey Hess [Mon, 3 Mar 2025 15:08:36 +0000 (11:08 -0400)]
factor out Annex.GitShaKey

9 months ago record VURL key hashes in addcomputed and recompute
Joey Hess [Mon, 3 Mar 2025 14:57:56 +0000 (10:57 -0400)]
record VURL key hashes in addcomputed and recompute

9 months ago record VURL key hashes when getting from compute remote
Joey Hess [Thu, 27 Feb 2025 20:19:41 +0000 (16:19 -0400)]
record VURL key hashes when getting from compute remote

Like when getting from the web special remote, when the output of the
computation has changed, record the new hash of the content as an
equivalent key for the VURL key.

Still needs to be done for addcomputed and recompute.

9 months ago fix build
Joey Hess [Thu, 27 Feb 2025 20:18:04 +0000 (16:18 -0400)]
fix build

9 months ago refactor
Joey Hess [Thu, 27 Feb 2025 20:17:42 +0000 (16:17 -0400)]
refactor

9 months ago many recompute improvements
Joey Hess [Thu, 27 Feb 2025 19:12:29 +0000 (15:12 -0400)]
many recompute improvements

I've lost track of them all, but it includes:

* Using the same key backend as was used in the original computation.
* Fixing bug that prevented updating the source file key in the compute
  state
* Handling --reproducible and --unreproducible.
* recompute --original of a file using VURL, when the result is
  different, but the key remains the same, makes the object file
  be updated with the new content
* Detecting some other ways the program behavior can change, just for
  completeness.
* Also adds --backend to addcomputed.

9 months ago refactoring
Joey Hess [Thu, 27 Feb 2025 18:54:03 +0000 (14:54 -0400)]
refactoring

9 months ago fix recompute of renamed files
Joey Hess [Thu, 27 Feb 2025 15:10:44 +0000 (11:10 -0400)]
fix recompute of renamed files

When a computed file has been renamed, a recompute needs to write to the
new filename.

I decided to remove --others because it's not clear what it should do in
the face of renames. Should it update only other files that have not
been renamed? Or update files that use the old key to the new key
anywhere in the tree? Or write the other files to the cwd, ignoring
renames? Since --others is just a way to save on compute time, adding
this complexity at this point seems like a bad idea. May revisit later.

Added temporary TODO-compute file

9 months ago todo
Joey Hess [Wed, 26 Feb 2025 19:59:47 +0000 (15:59 -0400)]
todo

9 months ago recompute closer to working properly
Joey Hess [Wed, 26 Feb 2025 19:51:31 +0000 (15:51 -0400)]
recompute closer to working properly

Proper behavior without --others implemented.

And eliminated most of the code duplication through refactoring.

Also, changed it to not stage recomputed files. This way, git diff will
show files that have differences.

9 months ago refactor
Joey Hess [Wed, 26 Feb 2025 18:05:37 +0000 (14:05 -0400)]
refactor

9 months ago started git-annex recompute
Joey Hess [Wed, 26 Feb 2025 15:25:32 +0000 (11:25 -0400)]
started git-annex recompute

The perform action of this still needs work to do the right thing.
In particular, it currently behaves as if --others was always set.
And, it duplicates a lot of code from addcomputed.

9 months ago showOutput
Joey Hess [Wed, 26 Feb 2025 13:47:56 +0000 (09:47 -0400)]
showOutput

When the compute program displays output, eg usage, that output needs to
start on its own line.

9 months ago addcomputed inherits extra initremote parameters
Joey Hess [Wed, 26 Feb 2025 13:45:35 +0000 (09:45 -0400)]
addcomputed inherits extra initremote parameters

This is limited because the remote config is a field/value map. So order
is not preserved, and when 2 parameters have the same field name, only
the last one will be passed.
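
A minimal Python sketch of that limitation, assuming parameters of the form `field=value` (the parameter names `passes` and `mode` are invented, and `to_config_map` is a hypothetical helper, not part of git-annex). Python dicts happen to remember insertion order, so the sketch sorts the fields to mimic a map whose order carries no meaning:

```python
def to_config_map(params):
    """Collapse "field=value" parameters into a field/value map.

    A later parameter with the same field name overwrites an earlier one.
    """
    config = {}
    for p in params:
        field, _, value = p.partition("=")
        config[field] = value
    return config

# Parameters as they might have been given, in order, with one repeated field.
params = ["passes=2", "mode=fast", "passes=3"]
config = to_config_map(params)

# What can be reconstructed from the map: original order and "passes=2" are gone.
passed = [f"{field}={value}" for field, value in sorted(config.items())]
```

After this runs, `passed` is `["mode=fast", "passes=3"]`.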

9 months ago todo
Joey Hess [Tue, 25 Feb 2025 22:45:55 +0000 (18:45 -0400)]
todo

9 months ago add compute remote uuid to compute state url
Joey Hess [Tue, 25 Feb 2025 22:44:40 +0000 (18:44 -0400)]
add compute remote uuid to compute state url

Otherwise, two different compute remotes that happen to take the same
input would use the same compute state url. Which seems wrong.

9 months ago wording
Joey Hess [Tue, 25 Feb 2025 21:26:28 +0000 (17:26 -0400)]
wording

9 months ago update demo program
Joey Hess [Tue, 25 Feb 2025 21:23:38 +0000 (17:23 -0400)]
update demo program

needed a mkdir

9 months ago use compute program REPRODUCIBLE by default
Joey Hess [Tue, 25 Feb 2025 21:10:41 +0000 (17:10 -0400)]
use compute program REPRODUCIBLE by default

9 months ago ingest when --unreproducible is used without --fast
Joey Hess [Tue, 25 Feb 2025 21:00:00 +0000 (17:00 -0400)]
ingest when --unreproducible is used without --fast

9 months ago addcomputed --fast and --unreproducible working
Joey Hess [Tue, 25 Feb 2025 20:36:22 +0000 (16:36 -0400)]
addcomputed --fast and --unreproducible working

For these, use VURL and URL keys, with an "annex-compute:" URI prefix.

These URL keys will look something like this:

URL--annex-compute&cbar4,63pconvert,3-f4d3d72cf3f16ac9c3e9a8012bde4462

Generally it's too long so most of it gets md5summed. It's a little
ugly, but it's what fell out of the existing URL key generation
machinery. I did consider special casing to eg
"URL--annex-compute&c4d3d72cf3f16ac9c3e9a8012bde4462". But it seems at
least possibly useful that the name of the file that was computed is
visible and perhaps one or two words of the git-annex compute command
parameters.

Note that two different output files from the same computation will get
the same URL key. And these keys should remain stable.

9 months ago add git-annex addcomputed
Joey Hess [Tue, 25 Feb 2025 19:45:14 +0000 (15:45 -0400)]
add git-annex addcomputed

Working pretty well. Mostly. But:

* Does not yet support inputs that are non-annexed files checked into git
* --fast is currently broken (will need something like VURL keys)
* --unreproducible still uses a checksumming backend, so drop and get
  again will likely fail (needs probably to use an URL key or something
  like one)

The compute special remote seems to work pretty well too. Eg,
getting from it works, and dropping content that is present in it works.

9 months ago handle computations in subdirs of the git repository
Joey Hess [Tue, 25 Feb 2025 19:08:38 +0000 (15:08 -0400)]
handle computations in subdirs of the git repository

Eg, a computation might be run in "foo/" and refer to "../bar" as an
input or output.

So, the subdir is part of the computation state.

Also, prevent input or output of files that are outside the git
repository. Of course, the program can access any file on disk if it
wants to; this is just a guard against mistakes. And it may also be
useful if the program communicates with something less trusted than it,
eg a container image, so input/output files communicated by that are not
the source of security problems.
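
The guard can be sketched in Python as follows. This is an illustration under stated assumptions, not the git-annex code: `resolve_in_repo` is a hypothetical helper, the check is purely lexical (it does not consider symlinks), and `/repo` is a made-up repository location.

```python
import os

def resolve_in_repo(repo_top, subdir, path):
    """Resolve path relative to subdir of the repository.

    Returns the path relative to the top of the repository, or raises
    ValueError if it would refer to something outside the repository.
    """
    top = os.path.abspath(repo_top)
    full = os.path.normpath(os.path.join(top, subdir, path))
    if os.path.commonpath([top, full]) != top:
        raise ValueError(f"{path} is outside the repository")
    return os.path.relpath(full, top)

# A computation run in "foo/" that refers to "../bar" stays inside the repo.
inside = resolve_in_repo("/repo", "foo", "../bar")
```

Here `inside` is `bar`, while a path such as `../../etc/passwd` from the same subdir raises `ValueError`.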

9 months ago add field desc
Joey Hess [Mon, 24 Feb 2025 20:39:55 +0000 (16:39 -0400)]
add field desc

9 months ago update for new interface
Joey Hess [Mon, 24 Feb 2025 20:15:04 +0000 (16:15 -0400)]
update for new interface

9 months ago reimplement using new compute program interface
Joey Hess [Mon, 24 Feb 2025 19:48:42 +0000 (15:48 -0400)]
reimplement using new compute program interface

9 months ago support addcomputed --fast
Joey Hess [Mon, 24 Feb 2025 17:48:46 +0000 (13:48 -0400)]
support addcomputed --fast

This complicates the interface but it's still simpler to understand than
the old interface.

9 months ago new compute program interface
Joey Hess [Mon, 24 Feb 2025 16:41:25 +0000 (12:41 -0400)]
new compute program interface

This is much more flexible, and also simpler to understand.

9 months ago update
Joey Hess [Fri, 21 Feb 2025 19:09:46 +0000 (15:09 -0400)]
update

9 months ago compute special remote mostly implemented
Joey Hess [Fri, 21 Feb 2025 19:02:53 +0000 (15:02 -0400)]
compute special remote mostly implemented

Except for some of the hard parts: progress displays, incremental
verification, and getting inputs before running a computation.

Untested! In order to test this, git-annex addcomputed needs to be
implemented.

9 months ago remove unused adjustedBranchRefresh associated file parameter
Joey Hess [Fri, 21 Feb 2025 18:51:02 +0000 (14:51 -0400)]
remove unused adjustedBranchRefresh associated file parameter

9 months ago wip
Joey Hess [Thu, 20 Feb 2025 17:29:05 +0000 (13:29 -0400)]
wip

9 months ago update
Joey Hess [Thu, 20 Feb 2025 17:27:59 +0000 (13:27 -0400)]
update

9 months ago wip
Joey Hess [Thu, 20 Feb 2025 17:27:47 +0000 (13:27 -0400)]
wip

9 months ago update
Joey Hess [Wed, 19 Feb 2025 20:03:34 +0000 (16:03 -0400)]
update

9 months ago comments
Joey Hess [Wed, 19 Feb 2025 19:14:52 +0000 (15:14 -0400)]
comments

9 months ago documentation for compute remote and associated commands
Joey Hess [Wed, 19 Feb 2025 18:29:18 +0000 (14:29 -0400)]
documentation for compute remote and associated commands

None of this is implemented yet.

9 months ago add REPRODUCIBLE
Joey Hess [Wed, 19 Feb 2025 18:16:36 +0000 (14:16 -0400)]
add REPRODUCIBLE

9 months ago optional and required inputs and some other changes
Joey Hess [Wed, 19 Feb 2025 16:32:35 +0000 (12:32 -0400)]
optional and required inputs and some other changes

10 months ago improved draft design
Joey Hess [Tue, 18 Feb 2025 19:46:47 +0000 (15:46 -0400)]
improved draft design

10 months ago improve apiurl description
Joey Hess [Tue, 18 Feb 2025 18:46:10 +0000 (14:46 -0400)]
improve apiurl description

10 months ago git-lfs apiurl parameter
Joey Hess [Tue, 18 Feb 2025 18:11:11 +0000 (14:11 -0400)]
git-lfs apiurl parameter

git-lfs: Added an optional apiurl parameter.

This needs version 1.2.5 of the haskell git-lfs library to be used.
stack.yaml updated to use that.

Note that git-annex enableremote can be used to add apiurl= to an existing
git-lfs special remote. To allow unsetting the apiurl and instead use
the probed url, support enableremote with apiurl set to an empty string.

Sponsored-by: Luke T. Shumaker

10 months ago Added a comment: Faced same issue for long time
sharad [Mon, 17 Feb 2025 19:30:28 +0000 (19:30 +0000)]
Added a comment: Faced same issue for long time

10 months ago OsPath build fix
Joey Hess [Mon, 17 Feb 2025 18:56:56 +0000 (14:56 -0400)]
OsPath build fix

10 months ago OsPath build fix
Joey Hess [Mon, 17 Feb 2025 18:46:43 +0000 (14:46 -0400)]
OsPath build fix

10 months ago OSX build fix
Joey Hess [Mon, 17 Feb 2025 18:06:06 +0000 (14:06 -0400)]
OSX build fix

10 months ago OSX build fixes
Joey Hess [Mon, 17 Feb 2025 18:05:19 +0000 (14:05 -0400)]
OSX build fixes

10 months ago OSX build fixes
Joey Hess [Mon, 17 Feb 2025 18:04:08 +0000 (14:04 -0400)]
OSX build fixes

10 months ago OSX build fix
Joey Hess [Mon, 17 Feb 2025 18:01:54 +0000 (14:01 -0400)]
OSX build fix

10 months ago OSX build fixes
Joey Hess [Mon, 17 Feb 2025 17:59:52 +0000 (13:59 -0400)]
OSX build fixes

10 months ago Merge branch 'ospath'
Joey Hess [Mon, 17 Feb 2025 15:58:20 +0000 (11:58 -0400)]
Merge branch 'ospath'

10 months ago Added a comment
datamanager [Sat, 15 Feb 2025 21:46:33 +0000 (21:46 +0000)]
Added a comment

10 months ago (no commit message)
puck [Sat, 15 Feb 2025 10:36:03 +0000 (10:36 +0000)]

10 months ago OsPath conversion for OSXMkLibs
Joey Hess [Fri, 14 Feb 2025 20:53:00 +0000 (16:53 -0400)]
OsPath conversion for OSXMkLibs

10 months ago Merge branch 'master' into ospath
Joey Hess [Fri, 14 Feb 2025 20:28:43 +0000 (16:28 -0400)]
Merge branch 'master' into ospath

10 months ago Merge branch 'master' of ssh://git-annex.branchable.com
Joey Hess [Fri, 14 Feb 2025 19:41:23 +0000 (15:41 -0400)]
Merge branch 'master' of ssh://git-annex.branchable.com

10 months ago further fix OSX packaging program builds
Joey Hess [Fri, 14 Feb 2025 19:40:48 +0000 (15:40 -0400)]
further fix OSX packaging program builds

Broken by commit e5be81f8d4bf7f6cef5ac4ff0b059efbdf6055ea

10 months ago more details on my issues
anarcat [Fri, 14 Feb 2025 17:54:24 +0000 (17:54 +0000)]
more details on my issues

10 months ago Added a comment: similar topic
anarcat [Fri, 14 Feb 2025 17:51:29 +0000 (17:51 +0000)]
Added a comment: similar topic

10 months ago Added a comment: similar topic
anarcat [Fri, 14 Feb 2025 17:47:02 +0000 (17:47 +0000)]
Added a comment: similar topic

10 months ago draft
Joey Hess [Thu, 13 Feb 2025 20:12:07 +0000 (16:12 -0400)]
draft

10 months ago comment
Joey Hess [Thu, 13 Feb 2025 17:51:21 +0000 (13:51 -0400)]
comment

10 months ago comment
Joey Hess [Thu, 13 Feb 2025 17:01:15 +0000 (13:01 -0400)]
comment

10 months ago OsPath conversion of DistributionUpdate
Joey Hess [Wed, 12 Feb 2025 17:27:34 +0000 (13:27 -0400)]
OsPath conversion of DistributionUpdate

10 months ago push down OsPath into CopyFile
Joey Hess [Wed, 12 Feb 2025 17:11:27 +0000 (13:11 -0400)]
push down OsPath into CopyFile

10 months ago stop exporting RawFilePath
Joey Hess [Wed, 12 Feb 2025 16:59:30 +0000 (12:59 -0400)]
stop exporting RawFilePath

10 months ago avoid head warnings with recent ghc versions
Joey Hess [Wed, 12 Feb 2025 16:43:03 +0000 (12:43 -0400)]
avoid head warnings with recent ghc versions

10 months ago remove the git-union-merge command
Joey Hess [Wed, 12 Feb 2025 16:37:36 +0000 (12:37 -0400)]
remove the git-union-merge command

This has never been built and shipped as part of git-annex,
and including it as a pedagogical example in
the source code doesn't have much benefit. The program was not currently
buildable after recent OsPath changes.

Of course, Git/UnionMerge.hs is still available and can be used.

10 months ago fix description of ParallelBuild
Joey Hess [Wed, 12 Feb 2025 16:32:22 +0000 (12:32 -0400)]
fix description of ParallelBuild

10 months ago Revert "stack.yaml: temporarily build with older ghc"
Joey Hess [Tue, 11 Feb 2025 20:57:32 +0000 (16:57 -0400)]
Revert "stack.yaml: temporarily build with older ghc"

This reverts commit 2f9a384e48cb4407e6b5b70d1db6efa593654f0e.

10 months ago Merge branch 'master' into ospath
Joey Hess [Tue, 11 Feb 2025 20:56:17 +0000 (16:56 -0400)]
Merge branch 'master' into ospath

10 months ago fix windows and OSX packaging program builds
Joey Hess [Tue, 11 Feb 2025 20:53:01 +0000 (16:53 -0400)]
fix windows and OSX packaging program builds

Broken by commit e5be81f8d4bf7f6cef5ac4ff0b059efbdf6055ea

10 months ago Merge branch 'ospathwin2' into ospath
Joey Hess [Tue, 11 Feb 2025 20:46:01 +0000 (16:46 -0400)]
Merge branch 'ospathwin2' into ospath

10 months ago fix convertToWindowsNativeNameSpace bug
Joey Hess [Wed, 12 Feb 2025 04:37:40 +0000 (20:37 -0800)]
fix convertToWindowsNativeNameSpace bug

This fixes a test suite failure. The OsPath conversion caused that function
to be used in more places, including addurl, which exposed an existing bug.

10 months ago avoid build warning on windows
Joey Hess [Tue, 11 Feb 2025 20:30:47 +0000 (16:30 -0400)]
avoid build warning on windows

10 months ago OsPath transition Windows build fixes
Joey Hess [Wed, 12 Feb 2025 03:23:02 +0000 (19:23 -0800)]
OsPath transition Windows build fixes

This gets it building on Windows again, with 1 test suite failure
(addurl).

Sponsored-by: Kevin Mueller

10 months ago fix comment
Joey Hess [Tue, 11 Feb 2025 18:07:01 +0000 (14:07 -0400)]
fix comment

10 months ago improved OsPath conversion
Joey Hess [Tue, 11 Feb 2025 18:05:56 +0000 (14:05 -0400)]
improved OsPath conversion

10 months ago more OsPath conversion
Joey Hess [Tue, 11 Feb 2025 18:03:20 +0000 (14:03 -0400)]
more OsPath conversion

this avoids 1 copy

10 months ago more OsPath conversion
Joey Hess [Tue, 11 Feb 2025 18:00:01 +0000 (14:00 -0400)]
more OsPath conversion

10 months ago use to/fromOsPath
Joey Hess [Tue, 11 Feb 2025 17:54:17 +0000 (13:54 -0400)]
use to/fromOsPath

Just to reduce the number of from/toRawFilePath calls, which I would
like to minimize.

In this build path, the two are the same though.

10 months ago remove unused functions from Utility.RawFilePath
Joey Hess [Tue, 11 Feb 2025 17:49:17 +0000 (13:49 -0400)]
remove unused functions from Utility.RawFilePath

10 months ago replace removeLink with removeFile
Joey Hess [Tue, 11 Feb 2025 17:41:26 +0000 (13:41 -0400)]
replace removeLink with removeFile

same reasoning as in commit 5cc8d9d03b53f2e43d51e4f612f423178519e824

10 months ago update todo
Joey Hess [Tue, 11 Feb 2025 17:01:13 +0000 (13:01 -0400)]
update todo

10 months ago replace R.doesPathExist with doesPathExist
Joey Hess [Tue, 11 Feb 2025 16:46:14 +0000 (12:46 -0400)]
replace R.doesPathExist with doesPathExist

Equivalent, just avoids some ugliness.

10 months ago test suite now passes after OsPath conversion
Joey Hess [Tue, 11 Feb 2025 16:37:09 +0000 (12:37 -0400)]
test suite now passes after OsPath conversion

The test suite was failing because of a bug in the Database/* modules.
I had replaced doesPathExist with doesDirectoryExist, but it was
checking the database file.

I have audited commit f1ba21d698c908ad84c08bce24fbbc376190fe83 for
other changes to doesPathExist, and checked that doesDirectoryExist and
doesFileExist were used correctly.

The only change I found is in youtubeDl', where it used to return
directories that might have been created by youtube-dl. But it was
supposed to return media files, so changing it to use doesFileExist is
actually an improvement. Although only of theoretical benefit.

Note that it would actually be possible to keep using doesPathExist,
there is a version of that for OsPath as well. But the rest of these
changes seem safe.
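
The bug comes down to three checks that answer differently for a regular file. A Python sketch using the analogous `os.path` functions (an analogy to the Haskell `doesPathExist`, `doesDirectoryExist` and `doesFileExist`, not the git-annex code; the file name `db` is made up):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    # Stand-in for a database file, which is a regular file.
    db = os.path.join(d, "db")
    open(db, "w").close()

    checks = {
        "exists": os.path.exists(db),  # true for files and directories
        "isdir": os.path.isdir(db),    # true only for directories
        "isfile": os.path.isfile(db),  # true only for regular files
    }
```

For the file, `exists` and `isfile` are true but `isdir` is false, so swapping a path-exists check for a directory-exists check silently changes the answer.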

Sponsored-by: Nicholas Golder-Manning

10 months ago OsPath conversion of linuxstandalone builder
Joey Hess [Tue, 11 Feb 2025 16:12:27 +0000 (12:12 -0400)]
OsPath conversion of linuxstandalone builder

Sponsored-by: Joshua Antonishen

10 months ago Merge branch 'master' of ssh://git-annex.branchable.com
Joey Hess [Mon, 10 Feb 2025 21:23:31 +0000 (17:23 -0400)]
Merge branch 'master' of ssh://git-annex.branchable.com