With Fossil source control, does the "fossil clone" command have the exact same effect as if you just copy the repository file manually?

Could I just put the repository file on a usb stick and transfer it to another computer instead of calling fossil clone?

No, it is not the same.
Differences are:
User account control is per-repository.
Artifacts such as private branches and their contents are not pulled.
Everything that fossil scrub --verily --private would suppress is typically not pulled (see fossil help scrub).
The autosync URL is set automatically when you clone, linking the copy back to its origin; with a plain file copy you can set it later with fossil remote-url.
That said, if you use the repositories for your personal use on both sides, there is no problem doing a simple file copy. To merge your repositories, do not copy the most recent file over the oldest; instead, set the sync URL and do a sync between them. It is quicker and safer, because it merges the changes made on both sides and prevents losing any work.
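For illustration, a rough sketch (the URL and file names here are made up). Cloning records the remote automatically:
fossil clone https://example.com/project project.fossil
A plain file copy records nothing, so you would set the remote yourself from an open checkout:
fossil remote-url https://example.com/project
And to merge two local copies, you can sync one repository against the other, assuming your Fossil build accepts file: URLs or local paths as sync targets:
fossil sync file:///E:/flash/project.fossil -R C:/work/project.fossil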

Related

Best way to s3 sync changed files in a folder only

I have a job that clones a repo then s3 syncs changes files over to an s3 bucket. I'd like to sync only changed files. Since the repo is cloned first, the files always have a new timestamp so s3 sync will always upload them. I thought about using "--size-only", but my understanding is that this can potentially miss files that have legitimately changed. What's the best way to go about this?
There are no out-of-the-box answers that will sync changed files if the mtime cannot be counted on. As you point out, if a file does not change in size, the "--size-only" flag will cause aws s3 sync to skip it even when its contents have changed. To my mind there are two basic paths; the solution you use will depend on your exact needs.
Take advantage of Git
First off, you could use the fact that the files are stored in git to help restore the modified times. git itself will not store this metadata; the maintainers' philosophy is that doing so is a bad idea. I won't argue for or against this, but there are two basic ways around it:
You could store this metadata in git. There are multiple approaches to doing this; one such approach is metastore, a tool installed alongside git that stores the metadata and applies it later. This does require adding a tool for all users of your git repo, which may or may not be acceptable.
Another option is to attempt to recreate the mtime from metadata that's already in git. For instance, git-restore-mtime does this by using the timestamp of the most recent commit that modified the file. This would require running an external tool before running the sync command, but it shouldn't require any other workflow changes.
Using either of these options would allow a basic aws sync command to work, since the timestamps would be consistent from one run to another.
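As a rough sketch of the second option (the repository URL and bucket name are made up, and git-restore-mtime may be packaged differently on your system):
git clone https://example.com/repo.git
cd repo
git restore-mtime
aws s3 sync . s3://my-bucket/repo --exclude ".git/*"
With the mtimes rebuilt from commit history, repeated runs should only upload files whose size or timestamp actually changed.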
Do your own thing
Fundamentally, you want to upload files that have changed. aws sync attempts to use file size and modification timestamps to detect changes, but if you wanted to, you could write a script or program to enumerate all files you want to upload, and upload them along with a small bit of extra metadata including something like a sha256 hash. Then on future runs, you can enumerate the files in S3 using list-objects and use head-object on each object in turn to get the metadata to see if the hash has changed.
Alternatively, you could use the "etag" of each object in S3, as that is returned in the list-objects call. As I understand it, the etag formula isn't officially documented and is subject to change; that said, it is well known, and you can find implementations of it here on Stack Overflow and elsewhere. You could calculate the etag for your local files, then see if the remote files differ and need to be updated. That would save you having to do the head-object call on each object as you check for changes.
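As a rough sketch of this do-it-yourself route (bucket, key, and hash are invented, and remember the etag only equals the MD5 of the content for single-part uploads), record your own hash when uploading:
aws s3 cp path/to/file s3://my-bucket/path/to/file --metadata content-sha256=<hash-of-file>
then on later runs compare it against a freshly computed one:
aws s3api head-object --bucket my-bucket --key path/to/file --query '[ETag, Metadata]'
sha256sum path/to/file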

Can I sync two bazel-remote-cache's using rsync

I have a build pipeline that builds and tests changes before they are merged to the main line. Once that happens, it would be great if the Bazel actions from that build are available to developers. Unfortunately, the build pipeline runs in the cloud and uses an in-cloud cache, but the developers use an on-premises cache.
I am using https://github.com/buchgr/bazel-remote
Does anyone know if I can just rsync the artifacts from the data directory of the cloud cache to the developers' cache in order to give them access to the pre-built artifacts? Normally, I would just try it out, but I'm concerned about subtle issues that might poison the cache or negatively affect the hit rate, so I'm hoping to hear from someone who understands the code before I go digging.
You can rsync the cache directory contents and use them from another location, but this won't work with a running bazel-remote; the items will be ignored until bazel-remote is restarted.
Another option would be to use the http_proxy configuration file setting to automatically put/get cache items to/from another bazel-remote instance. An example configuration file was recently added to README.md in the bazel-remote git repository.
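As a rough sketch of the rsync route (the hostname, paths, and service name are made up, and the data-directory layout depends on how bazel-remote is configured), run on the on-premises cache host:
rsync -av cloud-host:/srv/bazel-remote/data/ /srv/bazel-remote/data/
systemctl restart bazel-remote
The restart is the important part, since a running instance will not notice items that were added behind its back.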

Backing up symlinks using AWS s3 sync

I'm attempting to backup our system using the aws s3 sync command, however this will either backup the entire directory behind a symlink (default behaviour), or not backup the symlink at all.
I'd like some way of backing up the symlink so it can be restored from S3 if need be.
I don't want to archive the entire directory first, else I'll lose the ability to only backup the changed files.
My current thought is to scan the dir for symlinks, and create metadata files containing the symlink's target, which, after restore, could be read to rebuild the symlink, but I'm not quite sure how to do this.
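Something like the following is what I have in mind, as a rough sketch (paths and bucket name are made up, and it assumes GNU find):
find /srv/data -type l -printf '%p\t%l\n' > /srv/data/.symlink-manifest
aws s3 sync /srv/data s3://my-bucket/backup
and then, after restoring from S3, recreate the links from the manifest:
while IFS=$'\t' read -r link target; do ln -sfn "$target" "$link"; done < /srv/data/.symlink-manifest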
Any advice would be very welcome. Thanks in advance.
As it stands, S3 has no standard way to represent a symlink. Note that you could decide on a custom representation and store that in the metadata of an empty S3 object, but you would be on your own. AFAIK, aws s3 doesn't do that.
Now, for the purpose of backing up to S3 (and Glacier), you may want to take a look at OpenDedup. It uses the same type of rolling checksum as rsync to minimize the actual storage used (and the bandwidth).
I've been doing a lot of custom cp -rl and rsync scripts to back up my own system to local drives, but was always frustrated by the unnecessary extra storage caused by the many duplicate files I may have. Imagine what happens in those simple schemes when you rename a directory (mv dirA dirB): the next backup typically stores a brand new copy of that dir.
With OpenDedup (and other similar systems, such as bup, zpaq, etc.), the content is stored uniquely (thanks to the rolling checksum approach). I like that.
Right now, Amazon S3 does not support symbolic links. The AWS CLI will follow them when uploading from the local disk to S3: according to the AWS documentation, the contents referenced by the symlink are copied or synced under the name of the symlink.
The rsync command does have options for symbolic links. With --links it recreates the symlink itself at the destination, whereas --copy-links copies whatever the link points to instead. So if your symlinks use absolute paths (my/absolute/path), the recreated link will still point at that path on whatever box the files are restored to; if they use relative paths (../../path), the link will point at that relative location within the restored tree.
Rsync is therefore a way to preserve your symlinks for use after restoring the files back to your local box.
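As a small sketch (paths invented), copying a tree while recreating the symlinks rather than following them would look like:
rsync -a --links /srv/data/ /mnt/restore/data/
(-a already implies --links; it is spelled out only to contrast it with --copy-links.)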
Another method would be to use an AWS S3 sync or backup service, such as NetApp’s Cloud Sync, which would catalog your data with each operation. Each service provider offers different features, so how symbolic links would be handled depends on the vendor chosen.

Using Fossil SCM with Master Repositories on a Flash Drive?

I am new to Software Configuration Management systems, but am now interested in using Fossil. I have been reviewing the documentation on-and-off for a few days, and have played with the program a little, but I am still unsure how to most appropriately use it to meet my needs, so I would appreciate any advice anyone would like to offer on the following use scenario.
I am working exclusively in Windows environments. I am a sole developer, often working on a number of relatively small projects at a time. For the time being at least, I do not expect to make much use of forking and branching capabilities – I like to think my code development generally progresses fairly linearly. But I regularly need to access and update my code at a number of usually standalone PCs - that is, they are never networked to each other and often do not even have internet access.
I am hoping that Fossil will assist me in two ways, keeping track of milestones in my codebases including providing the ability to easily restore a previous version for testing purposes, and also making it as simple as possible for me to ensure I always have all versions of the code for every project accessible to me when I sit down to work at any particular PC.
To achieve the second objective, I expect to make a point of always carrying a USB Flash Drive with me as I move from PC to PC. I expect this Flash Drive should contain a number of repository files, one for each project I am concerned with. I expect when I sit down at any particular PC I should be able to extract from this Flash Drive whichever version of whichever project I need to access. Similarly, when I “finish” working at this PC if I wish to retain any changes I have made I expect I should “commit” these changes back to relevant repository on the Flash Drive in some way. But the most appropriate way to do all this is unclear to me.
I understand Fossil is generally intended to work with a local copy of a project’s repository on each machine’s local hard disk, and with a master repository accessed remotely when required via a network or internet connection. In my case, it seems to me the master repository would be the relevant repository file on my Flash Drive, but when my Flash Drive is plugged into the machine I am working on, the files on it are effectively local, not remote.
So, when I sit down to work at a PC, should I copy the repository file for the project I need to work on onto the PC’s local hard drive, then open the version of the code I need to access from this copy of the repository, or should I just open the project repository directly from my Flash Drive? Additionally, if I should copy the repository onto the local hard disk, should I simply copy the repository file using the operating system, or should I use Fossil to clone it to the local hard disk (I do not really understand the difference here)?
Then, when I finish working at the PC, if I wish to incorporate any changes I have made back into the repository on my Flash Drive, should I update this directly into the repository on my Flash Drive, or into a copy of the repository on the PC’s local hard disk? If the latter, should I then simply copy the updated repository file onto my Flash Drive (overwriting the previous repository file), or should I “pull” or “push” the changes into the repository file on the Flash Drive – can I even do this, when the hard-disk-based repository and the Flash-Drive-based repository files are effectively both local files on the same PC? I guess I'm getting a bit confused here…
A possible additional complicating factor in the “right” way to do all this is that typically, when I finish working at a PC I will not want to leave a copy of the source code or the repository on the PC (i.e., the customer’s hardware). I understand deleting the local copies of the repositories undermines the redundancy and backup benefits of using a Distributed SCM system, but I guess I will address this by keeping copies of the repositories on my own PCs and ensuring I backup the repository files on the Flash Drive itself reliably.
So any thoughts, experience or advice on the most appropriate way to use Fossil in the above scenario would be most welcome, thank you.
Hope this is still relevant :)
I would suggest the following process:
On your USB drive:
mkdir fossil - to keep your fossil repo files
mkdir src - to keep your project files.
Go to the fossil folder and create repos for your projects A and B
cd fossil
fossil init a.fossil
fossil init b.fossil
Use .fossil extensions as this will simplify work with repos later.
Create a fossil_server.cmd batch file to start fossil as a server.
SET REPO_PATH=X:\fossil
SET FOSSIL_CMD=Path_to_fossil_exe/fossil.exe
start %FOSSIL_CMD% server %REPO_PATH% --repolist --localhost --port 8089
Start fossil_server.cmd, open a browser and go to localhost:8089
You will see a page with your repos, so you can configure them, write wiki pages/tickets and so on.
Go to the src folder
mkdir a
mkdir b
cd a
fossil open ../../fossil/a.fossil
cd ../b
fossil open ../../fossil/b.fossil
So now you have working checkouts of your projects in src/a and src/b.
Add new files to A/B projects and do
cd src/a
fossil addremove
REM to add new files to the repository
fossil commit
REM to commit changes.
Now you can add or modify files in your projects, commit them, and roll back when needed.
Just use:
fossil commit --tag new_tag
to add an easy-to-understand tag to your commit.
More details: https://fossil-scm.org/home/doc/trunk/www/quickstart.wiki

Common header and footer in fossil

I have a master remote repository, and two local repositories.
However, it appears that changes to header/footer/css do not propagate throughout the repos.
How can I achieve this?
Fossil treats the configuration of each repository's web interface (whatever you put in the header/footer/css etc.) as something unrelated to the versioning of your files. This makes sense when you are working with repositories that you are not the admin of: you will always be in control of the UI on your local machine (where you are the admin by default) and can make it look and behave whichever way you want, even if you do not have privileges to make the same changes to the central repo.
To propagate changes to the configuration (including the header/footer/css etc.) you can use the fossil configuration command. Just type fossil configuration --help to see how you can export, import and synchronize your configuration across repositories.
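For example, as a rough sketch (the repository paths and URL are made up; the "skin" configuration area covers the header, footer, and CSS), export the look from the repository you want to copy:
fossil configuration export skin skin-settings.txt -R path/to/master.fossil
import it into another local repository:
fossil configuration import skin-settings.txt -R path/to/local.fossil
or pull it straight from the remote:
fossil configuration pull skin https://example.com/myproject -R path/to/local.fossil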