Git deleting things mysteriously (edit: actually django-storages)

Git deleting things mysteriously (edit: actually django-storages) - django

The issue: sometimes, but not every time, Git deletes the static directory of a repo. We're not sure what triggers it, but it appears to happen when either merging between branches or sometimes even just checking out branches. It does this without asking, and eats tracked files.
The background:
I have a (private) project which has a few branches, 'release', 'develop', multiple feature lines.
There are two of us (me and #stevejalim) working on the repo. This problem happens to both of us.
I am using purely the command line for my git commands; Steve is using a mixture of the command line and Git Tower.
It's a Django project with a static directory. We may have git rmed the static directory at some point in the past, or put it in .gitignore, but not recently. And the head of our develop branch doesn't have static in .gitignore and has files in static tracked.
This happens so infrequently that we're not sure if it's something we're doing, or an intermittent problem, a bug with Git, or a corrupted tree
It might happen only when merging another branch back into develop. But branches are always branched from develop and back into develop. But we're not sure.
We are using git-flow, but the issue happens when using non-git-flow commands, too.
As examples of when this can strike:
1) Steve had a develop branch that was clean (no changes to commit or stage) and stable. He cut a new release with git flow release start|finish and in the process (possibly the back-merge from master to develop), the entire /static/ tree got deleted.
2) Steve repaired the delete by discarding the changes (to essentially undelete the file). But then, Steve simply switched from master back to develop and the /static/ dir got zapped again (this was with Git Tower)
3) Sometimes just merging from a feature branch to develop as an interim merge can trigger it. It does seem to happen most often when cutting a new release, though
Could it be related to how we repair the zapping of the /static/ dir? What is the best way to bulk-undelete things that have been deleted? Neither discarding local changes or a hard reset to HEAD seems to cure things. Might a rebase help us?
UPDATE We've just experienced this again simply with a git add . - no changing branches, no merging. Does that help diagnosis at all?
Here is the content of Steve's .git/config:
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
ignorecase = true
[remote "origin"]
fetch = +refs/heads/*:refs/remotes/origin/*
url = git#github.com:foobarbazbam/bar.git
[branch "master"]
remote = origin
merge = refs/heads/master
[gitflow "branch"]
master = master
develop = develop
[gitflow "prefix"]
feature = feature/
release = release/
hotfix = hotfix/
support = support/
versiontag =
[difftool "tower"]
cmd = \"/Applications/Tower.app/Contents/Resources/CompareScripts/kaleidoscope.sh\" \"$LOCAL\" \"$REMOTE\"
Here is the content of .gitignore:
.DS_Store
*.pyc
*.log
*.log.*
*.bak
*~
settings_local.py
/build/
/static_collected/*
/static/uploads/*
/static/theme_files/*
/static/picture/*
pip-log.txt
*.tmproj
*.dot
*.db
*.sublime-project
*.sublime-workspace
/docs/_*

Okay ladies and gents I have a public apology to make. Git was not to blame. I will leave this question here as a lesson to other people who may pass the same way.
We were using the django-storages backend (a 'plugin' to enable Django to store files on Amazon S3 transparently). This has a test called HashPathStorageTest. The tear-down this test deletes settings.MEDIA_ROOT, which was set to ./static. This is faulty, in my opinion. It has no business blanket-deleting files that it didn't create.
We were running our tests, like good citizens, before checking in. Most of the time we ran only tests for our code, but occasionally we ran tests for the whole project (including 3rd party plugins). This was producing the behaviour in the question. Because we ran test and git things together, it wasn't easy to pin down which command was doing the deleting (and the deleted files only showed up when we ran git status).
So problem solved. Again, sorry for casting aspersions on the good name of Git!

This would appear to be git deciding to merge to a branch which A) has the folder deleted B) it thinks is more recent than the branch you are merging to. Might there be a machine which has american date type and one which has european date type? Or does any of your testing involve resetting machine date time?
I would
a) check the regions of all machines included in your developement.
b) choose a fixed point in your development, copy the files out and create a new reprository and go from there.

Ok so this just happened - I'm adding it a separate answer because its so silly. We have a repositiory for our SQL scripts and a new developer accidentally edited .gitignore to contain *.sql - needless to say changes stopped being propagated on that branch - and fortunately it was caught prior to any signifficant loss of development time. However its not inconcievable that this could produce the issues above - so have you checked your git directive files? If the folder is ignored (and .gitignore is in there as well - i.e ignored) it could dissapear and reappear almost by magic.

Related

Is it safe to clone a git repository into its own .git directory?

I recently put my website under Git version control. I'm using git-ftp to upload files, and I've set it up so that the master branch is the content that should be on the server, and the live branch is the content that actually is on the server. I need to make this distinction since I sometimes develop where I don't have an Internet connection, and therefore can't upload right away. Therefore, in order to upload, I do: git checkout live;git merge master;git ftp push; I'd like to automate this process and not lose the rest of working directory state. In the future I'm also going to set up automatic backups of several files which will be ftp'ed down and committed to the live branch.
In order to do all of this, I'm going to need another clone that will only be used by automated processes, and will always have a clean working directory. The obvious place to put it, since it will never need to be looked at, is inside the .git directory of the main repository. That way, I don't have a clone floating off somewhere else in my filesystem, it stays hidden, and it's obvious that it's part of the internals and shouldn't be poked about with. If this works well, I may even submit a patch to fix git-ftp issue #180 with this method.
However, I'm not sure if it's a good idea to put random things inside the .git folder--I can't find any documentation anywhere on what the side effects may be.

A .git directory seems to ignore completely any extra elements, so adding a git repo directly in .git should work.
That wouldn't work if you added your nested repo in a sub-directory like, for example, .git/refs (see Git Internals - Git References)
Note that you could also have that git repo anywhere in your current git repo, as a directory named, for instance, .ftprepo (a name beginning with a dot, which should be hidden by default)

I found the answer by mistake while looking for something else!
According to https://www.kernel.org/pub/software/scm/git/docs/#_file_directory_structure, "Higher level SCMs may provide and manage additional information in the $GIT_DIR". This appears to mean that additional info can be stored in .git without causing problems, including, presumably, another clone of the repository.

What's wrong with this user ignore file for Mercurial?

A little retrospective now that I've settled into Mercurial. Forget forget files combined with hg remove. It's crazy and bass-ackwards. You can use hg remove once you've established that something is in a forget file that isn't forgetting because the item in question was tracked before the original repo was created. Note that hg remove effectively clears tracked status but it also schedules the file for deletion in anything that gets changes from your repo. If ignored, however the tracking deactivation still happens but that delete-me change set won't ever reach another repo and for some reason will never delete in yours which IMO is counter-intuitive. It is a very sure sign that somebody and I don't know these guys, is UNWILLING TO COMPROMISE ON DUH DESIGN PROBLEMS. The important thing to understand is that you don't determine what's important, Mercurial does. Except when you're merging on a pull of course. It's entirely reasonable then. But I digress...
Ignore-file/remove is a good combo for already-tracked but very specific files you want forgotten but if you're dealing with a larger quantity of built files determined with broader patterns it's not worth the risk. Just go with a double-repo and pull -u from the remote repo to your syncing repo and then pull -u commits from your working repo and merge in a repo whose sole purpose is to merge changes and pass them on in a place where your not-quite tracked or untracked files (behavior is different when pulling rather than pushing of course because hey, why be consistent?) won't cause frustration. Trust me. The idea that you should have to have two repos just to get 'er done offends for good reason AND THAT SO MANY OF US ARE DOING IT should suggest a serioush !##$ing design problem, but it's much less painful than all the other awful things that will make you regret seeking a sensible alternative.
And use hg help. It's actually Mercurial's best feature and often better than the internet (which I don't fault for confusion on the matter of all things hg) for getting answers to everything that is confusing and counter-intuitive in this VCS.
/retrospective
# switch to regexp syntax.
syntax: regexp
#Config Files
#.Net
^somecompany\.Net[\\/]MasterSolution[\\/]SomeSolution[\\/]SomeApp[\\/]app\.config
^somecompany\.Net[\\/]MasterSolution[\\/]SomeSolution[\\/]SomeApp_test[\\/]App\.config
#and more of the same following
And in my mercurial.ini at the root of my user directory
[ui]
username = ereppen
merge = bcomp
ignore = C:\<path to user ignore file>\.hgignore-config
Context:
I wrote an auto-config utility in node. I just want changes to the files it changes to get ignored. We have two teams and both aren't on the same page with making this a universal thing so it needs to be user-specific for now.
The config file is in place and pointed at by my ini file. I clone. I run the config utility and change the files and stat reveals a list of every single file with an M next to it. I thought it was the utf-8 thing and explicitly set the file to utf-16 little endian. I don't think I'm doing with the regEx that any modern flavor of regEx worth actually calling regEx wouldn't support.

The .hgignore file has no effect for files that are tracked. Its function is to stop you from seeing files you want ignored listed as "untracked". If you're seeing "M" then they're already added (you got them with the clone) so .hgignore does nothing.
The usual way config files that differ from machine to machine are handled is to put a app.config.sample in source control, have app.config in .hgignore and have people do a copy when they're making their config edits.
Alternately if your config files allow for includes and overrides you end them with include app-local.config and override any settings in a app-local.config which you don't add and do include in .hgignore.

Should these auxiliary files be under Git version control?

I decided to start using the Git version control system for my C++ project. I'm new to version control. For the trunk things are simple, I just commit all the project versions I have. I kept each version as a separate folder because I knew I'd very soon use Git. But I encountered a problem with my branches.
At some stage of the development, I decided there's one class I want to develop in a branch. Without version control, I had to use make a "manual" branch. I copied the most recent header file and source file of that class to a separate folder and started working there. I made several versions there to work with simultaneously. One version was the first prototype of the class according to the plan (for which I made the "branch"). Then I added another file, in which I copied the first one but removed things that seemed to not be necessary. This way I have 2 versions, one with all my ideas and features, the the other one just with what I really use in my code, without what's not in use at the moment.
But then I added more. As development went on, I decided it may be a good idea to make that class a template. So I added a third version, which is just like the second one, but now some functionality implemented using polymorphism is implemented using a template. And I can't tell yet which version is the best, as it's too early to tell, so I want to have all 3 together.
Then I made another special file: A copy of the third version header file, in which each line can be marked or not marked. Marked means I use that specific method or I'm sure it's going to be in use very soon, otherwise the line isn't marked.
Then, some time later, I started a new branch. And for that branch I needed a new version of that class developed in the first branch. So I just copied one of the versions to the new branch's folder and started working there. Now again I had some kind of auxiliary file: I had 2 files, one from which I delete class methods I use, and one into which I write new methods I need to have.
Now I want to start using Git and I wonder: For all the project's text files, plans, diagrams, etc., it obvious - I keep them outside the Git repo. Whenever collaborative editing is needed I can set up a wiki or something like that. But for all those copies of the same header file, and for those auxiliary "marked" files, what do I do with them? I mean, it's fine by me to have them all in a branch, but what happens when I merge a branch into the trunk? I don't want to have all these copies and versions and lists, just the one final class file I've made.
On one hand, these are C++ source files used while coding. On the other hand they're not part of the pure source code of the software package, they just help me while I work but will never get compiled because in the end there's just the final version of the class which I chose to merge, and all other aux files, lists, etc. are kept just for reference.
What would be the best thing to do?
Thanks for reading my long story :)
EDIT: It's a local repo on my personal computer

Always keep documentation in same repository as source code. If you do not, your documentation will rot. It is becouse documentation is written agains some version of your software, so it has to develop the same way as software develops.
If your documentation is automaticaly generated or compiled into another format, commit only source data, makefile and configuration of generator, just like you do with source code.

What you describe is the normal use of branches: You have your master branch ("official", if it where) and a branch to develop a new feature (it doesn't really have to live in a separate directory, if I understand you correctly). Periodically you synchronize the feature branch with the master, either by rebasing it on the master or merging its changes in. In its turn, you can well have subordinate branches in which you try out approaches to develop the feature, handled with respect to the feature branch just like that one respect to the master. But in that case you have to be careful whenever you rebase.
You should keep any data that isn't easy to recreate in the repository, be it source code, documentation or even design sketches. Stuff that can be recreated (object code, automatically formated documentation, ...) should be kept out (any change there will create a difference to be checked in). Your repository (particularly not published branches) is your own workspace, it can be all the messy you like.
Take a look at the book mentioned at the git homepage.

Well, that’s clearly documentation and not source code, so you should separate it from your source code. As your documentation seems to be branch dependent, you should still check it into the repo, but in a separate doc directory.
About the merging: How a merge works is up to you in the end. Git just has a default merge strategy which is what most people want most of the time. But if you say that a merge into the main branch should just bring the code and not the docu, then that’s fine. Just merge that way:
git merge mybranch --no-commit
rm -rf **docu-dir**
git add -A
git commit

Git tells me my changes haven't been committed, even though they have been?

I was trying to clean up my directory by deleting no longer relevant files. Since then, whenever I try to push, I get the error "Your local changes to the following files would be overwritten by merge: [list of some files I tried deleting]. Please commit or stash them before you can merge" which is odd because all my other file deletions were committed successfully.
EDIT: I renamed my local copy and cloned the repo again, and discovered that the changes I made /were/ going through, but for some reason the web app itself hasn't been reflecting those changes. I tried clearing my browser's cache and viewing the website on other browsers, but those changes just don't appear anywhere except in the files I cloned, which baffles me completely.
Is there any explanation for why the web app is not displaying the most recently updated files?
I still get this error, if it sheds any light:
"Your local changes to the following file would be overwritten by merge:
static/css/newcss.css
Please, commit your changes or stash them before you can merge.
Despite the fact that I have not altered that file in any way since cloning it.
(Also note, I'm using Django for this web app)
EDIT EDIT: So I guess to close this question, the updating of the site was just taking a long time to process (like, two weeks long).

When i use GIT, i usually get this kind of strange errors.
Have you try this: Git telling me to pull, then commit, then pull?

Are you certain you committed your changes? Run this:
git diff
If something comes up, it means there are changes in your working directory that don't match your last commit, e.g., that is the change that needs to be committed.
Try committing the file that changed:
git add myfile.txt
git commit myfile.txt -m "Committing my file"

Avoiding unneccessry recompilations using "branchy" development model

I'm using Mercurial for development of quite a large C++ project which takes about 30 minutes to get built from the scratch(while incremental builds are very quick).
I'm usually trying to implement each new feature in the new branch(using "hg clone") and I may have several new features developed during the day and it's quickly getting very boring to wait for the new feature branch to get built.
Are there any recipes to somehow re-use object files from other already built branches?
P.S. in git there are named branches within the same repository which make re-usage of the existing object files possible for the build system, however I prefer the simpler Mercurial separate branches model...

I suggest using ccache as a way to speed up compilation of (mostly) the same code tree. The way it works is as following:
You define a place to be used as the cache (and the maximum cache size) by using the CCACHE_DIR environment variable
Your compiler should be set to ccache ${CC} or ccache ${CXX}
ccache takes the output of ${CC} -E and the compilation flags and uses that as a base for its hash. As long as the compiler flags, source file and the headers are all unchanged, the object file will be taken from cache, saving valuable compilation time.
Note that this method speeds up compilation of any source file that eventually produces the same hash. If you share source files across projects, ccache will handle them as well.
If you already use distcc and wish to use it with ccache, set the CCACHE_PREFIX environment variable to distcc.
Using ccache sped up our source tree compilation around tenfold.

A simple way to speed up your builds could be to use a local "build directory" on your disk. This way you can checkout into this directory and start the build. The first time it will take the full time, but after that it will (hopefully) only rebuild the files where the source code changed.

My Localbranch extension was designed partly around this use case. It uses a single working directory, but I think it's simpler than git. It's essentially a mechanism for maintaining multiple repository clones under one working directory, where only one is active at a given time.

Woops, I missed your P.S. where you don't like having multiple named branches in the same repo and that you prefer separate clones.. sorry about that.
I too have somewhat large C++ projects and the clone-per-feature workflow didn't work for me very well. Firstly, I had to close down my Vim session and then reopen (many of the same) files once I've created the clone. Secondly, like you said, a lot of code must be recompiled unnecessarily. Thirdly, I have to keep track of where I've pushed to and pulled from - gets confusing when you start a new feature and then get sidetracked onto a new one. Before you know it you have many clones and not sure which ones need to be pushed back to your main.
You definitely don't want to use named branches (as I'm sure you know) to handle this as they are quite permanent.
What you need are bookmarks: https://www.mercurial-scm.org/wiki/BookmarksExtension
Bookmarks allow you to create lightweight (and otherwise anonymous) branches per feature by facilitating the naming of heads in your repo. These heads would normally be unnamed and you would have to look at the output of 'hg log' or use some graphical tool to find the revision numbers for the tip of your feature-branch. With bookmarks you can name them descriptive names like 'my-cool-feature' or 'bugfix-392'.
If you like the idea of bookmarks, I'd also recommend my own extension called 'tasks': http://bitbucket.org/alu/hgtasks. This extension works like bookmarks but adds some more functionality. It allows you created feature-branches (now called tasks) and suppress the pushing of incomplete tasks. This is handy when you have a few feature-branches at once. You may not be ready to push your 'my-cool-feature' task, but 'bugfix-392' is ready to go. Because tasks track a set of changesets (and not just one 'tip' changeset) there are some things you can do with tasks that you can't with bookmarks. See an example workflow here: http://x.zpuppet.org/2009/03/09/mercurial-tasks-extension/.

Mercurial also has local named branches, see the hg branch command.
If you insist on using hg clone to do branchy development, I guess you could try creating a folder link (shortcut under windows) in your repo to a shared obj folder. This will work with hg clone, but I'm not sure your build tool will pick it up.
Otherwise, you probably keep all your repos in one folder - just put your obj folder there (it shouldn't be under source control anyways, imo). Use relative paths to refer to it.

A word of warning: many .o symbol tables (or equivalent) contain the full path name of the source file. If that other file changes (or if the path is not visible from the new directory) you may encounter weirdness when debugging.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js