Unofficial empeg BBS

#367875 - 18/11/2016 00:12 Git question. Any Git experts here?
tfabris
carpal tunnel

Registered: 20/12/1999
Posts: 31563
Loc: Seattle, WA
This is another one where I can't seem to find the answer on Google. My issue goes like this...

Background: Until recently, we used TFS for our version control. We recently switched to Git. In the past, if I wanted to see who was responsible for a particular code edit, I would load up that code in Visual Studio, right-click on it, and select "Source Control/Annotate".

It would show a list, in the left column, of the most recent person to modify each line of code.

Now that we are on Git, it does the very same thing: pressing "Source Control/Annotate" still obtains the most recent person to have modified each line of code. So far so good. Now I want to do the same thing at the command line for automation purposes. I'm using "Git Blame" which is the same thing as "Git Annotate" with slight formatting differences. For the purposes of discussing my issue, both of these commands produce the same output, including the same issue.

The issue is:
- The resulting list of names in Visual Studio is different from the list of names in Git Blame. I like the one in Visual Studio better, and I want to make Git Blame work the way Visual Studio does.

Here is a more detailed description:

- There are the normal modifications to the files, line by line, day by day. Dick changes line 4 in February. Jane changes line 5 in March. Etc. History is there, all is well. The names show up correctly in both Git Blame and Visual Studio, and the output is the same.

- Then in April, Bob comes along and re-encodes the entire file from UTF-16 down to UTF-8.

- This is where things diverge. Visual Studio still correctly shows that Dick was the last person to change line 4, and Jane was the last person to change line 5. However, Git Blame says the entire file is now Bob's: he has changed every line in it.

- For similar situations in the past, sometimes using the parameter "-w" (ignore whitespace) to Git Blame helps with this kind of issue a little bit. But for this particular case (the UTF-16->UTF-8 conversion) it's not helping.

- I have tried every parameter to Git Blame that is shown in the Git Blame docs, and nothing changes it. Doing Git Annotate instead doesn't change it. Every time I still have the same issue.

- I understand that Git Blame is being technically correct here, but it's not being "nice". Blame output is only useful to me if it shows who actually made functional changes to the code, not just who reformatted it. Visual Studio does just fine at this, showing me exactly what I need to see, so I know that somehow this must be doable. I need Git Blame to act the same way in order for it to become a successful part of my automation flow.

- I assumed that Visual Studio was just running Git Blame under the hood when you pressed Annotate. It's not. It's apparently using LibGit2 (https://libgit2.github.com/) and some custom code. Whatever Visual Studio is doing, that's what I want to do. I don't know what it is, though, and I don't know how to find out.

So my questions:
- Anyone know what Visual Studio is doing to get me the "nicer" version of the results?
- Anyone know what tricky command line stuff I can use in Git that'll get me the same results?

Thanks!
_________________________
Tony Fabris

#367891 - 21/11/2016 20:50 Re: Git question. Any Git experts here? [Re: tfabris]
canuckInOR
carpal tunnel

Registered: 13/02/2002
Posts: 3212
Loc: Portland, OR
Originally Posted By: tfabris
So my questions:
- Anyone know what Visual Studio is doing to get me the "nicer" version of the results?
- Anyone know what tricky command line stuff I can use in Git that'll get me the same results?

That has to be some custom stuff on Visual Studio's end. The conversion from UTF-16 to UTF-8 threw away half the bytes in the file, so from Git's perspective, Bob did change every line in the file.

My guess is that VS is getting the list of blamed commits and walking back through the blame history. You could do something similar. Get the blame data via git blame -p, which outputs a header line in the form <commit> <orig line #> <final line #> [# lines in this group]. The first time a commit is shown, a summary of the commit is given, including the previous commit. The actual line of code follows that, indented by a tab.
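
For illustration, here is a minimal sketch of reading that porcelain output in Python. The function names parse_blame_porcelain and blame are made up for this example, and the parser assumes the layout described above (a header line, optional key/value metadata lines, then the tab-indented line of code). Treat it as a starting point rather than a complete parser.

```python
import re
import subprocess

# Header line: <commit> <orig line #> <final line #> [# lines in this group]
HEADER = re.compile(r"^([0-9a-f]{40,64}) (\d+) (\d+)(?: (\d+))?$")


def parse_blame_porcelain(text):
    """Return (lines, commits) parsed from `git blame -p` output.

    lines   -- one dict per line of the blamed file
    commits -- commit id -> metadata (author, previous, filename, ...)
    """
    commits = {}
    lines = []
    current = None  # (commit, orig_line, final_line) from the latest header
    for raw in text.split("\n"):
        if not raw:
            continue
        if raw.startswith("\t"):
            # The actual line of code, indented by one tab.
            sha, orig, final = current
            lines.append({"commit": sha, "orig_line": orig,
                          "final_line": final, "code": raw[1:]})
            continue
        match = HEADER.match(raw)
        if match:
            current = (match.group(1), int(match.group(2)), int(match.group(3)))
            commits.setdefault(match.group(1), {})
            continue
        if current is not None:
            # Metadata such as "author Dick" or "previous <commit> <file>".
            key, _, value = raw.partition(" ")
            commits[current[0]][key] = value
    return lines, commits


def blame(path, rev="HEAD"):
    """Hypothetical helper: run `git blame -p` on one file and parse it."""
    result = subprocess.run(["git", "blame", "-p", rev, "--", path],
                            check=True, capture_output=True, text=True)
    return parse_blame_porcelain(result.stdout)
```

The "previous" entry in the metadata is what lets you step back: it names the commit to run the next blame against.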

Then, for every commit in that list, do a second blame, and for each of the lines of code blamed on that commit, normalize the line of code for each commit (compress whitespace to a single space; convert to UTF-8) and see if they are equal. If they are, replace the first blame data with the second blame data -- and repeat your way down the history stack.
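
The normalize-and-compare step might look something like the sketch below. The names decode_blob, normalize, and cosmetic_only are invented for this example, and it assumes the UTF-16 revisions start with a byte-order mark; files without one would need a different detection scheme.

```python
import codecs
import re


def decode_blob(data):
    """Decode raw file bytes to text.

    Assumption: UTF-16 content carries a byte-order mark; anything else
    is treated as UTF-8 (with an optional BOM).
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")
    return data.decode("utf-8-sig", errors="replace")


def normalize(line):
    """Compress each run of whitespace to a single space and trim the ends."""
    return re.sub(r"\s+", " ", line).strip()


def cosmetic_only(old_line, new_line):
    """True if two versions of a line differ only in whitespace."""
    return normalize(old_line) == normalize(new_line)
```

When cosmetic_only returns True for a line that the newer commit was blamed for, the older blame entry is the one to keep for that line.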

Careful you don't blow your 6-day budget, though. :)

#367892 - 21/11/2016 22:08 Re: Git question. Any Git experts here? [Re: canuckInOR]
tfabris
If that's what it's doing under the hood...

BLECH. No way I want to do that. Certainly that's not within my budget.
_________________________
Tony Fabris

#367893 - 21/11/2016 22:11 Re: Git question. Any Git experts here? [Re: tfabris]
tfabris
And honestly...

It seems to me that the procedure you just described is exactly how the "-w" parameter to Git Blame should be working anyway. I don't know why it's failing for me now in this case. I could swear it used to work.
_________________________
Tony Fabris

#367902 - 24/11/2016 01:27 Re: Git question. Any Git experts here? [Re: tfabris]
tfabris
Heh. Okay, here's some interesting stuff just revealed.

Visual Studio 2017 is in the release candidate stage. My boss has a copy running, and they have revamped the Annotate command. Now when you right-click, instead of "Annotate" it says something like "(Blame) Annotate". And looking at it in Process Explorer, they are no longer using the LibGit2 library; instead they are calling "git.exe" directly from the command line.

But... and here's the kicker... they still get the same "good" result.

How? Just like you said, they're walking the tree themselves. Process Monitor shows them doing a "cat-file" over and over again to retrieve the file contents for each one of the checked-in revisions of that file. They're diffing each revision by hand, ignoring whitespace and bitness as they go.
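
A rough sketch of that kind of walk, for anyone who wants to try it: this is not Visual Studio's actual code, only a guess at the general approach. The names blame_revisions, file_history, and git are made up, and it assumes the file was never renamed and that the UTF-16 revisions carry a byte-order mark. It lists the revisions with "git rev-list", fetches each one with "git cat-file -p", and diffs normalized lines with Python's difflib.

```python
import codecs
import difflib
import re
import subprocess


def normalize(line):
    """Compress whitespace so cosmetic edits compare as equal."""
    return re.sub(r"\s+", " ", line).strip()


def blame_revisions(revisions):
    """revisions: iterable of (label, list_of_lines), oldest first.

    Returns one label per line of the newest revision: the revision that
    last changed that line in a way that survives normalization.
    """
    owners, prev_norm = [], []
    for label, lines in revisions:
        cur_norm = [normalize(line) for line in lines]
        new_owners = [label] * len(cur_norm)
        matcher = difflib.SequenceMatcher(a=prev_norm, b=cur_norm,
                                          autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged after normalization: keep the earlier owner.
                new_owners[j1:j2] = owners[i1:i2]
        owners, prev_norm = new_owners, cur_norm
    return owners


def git(*args):
    """Run a git command and return its raw stdout bytes."""
    return subprocess.run(["git", *args], check=True,
                          capture_output=True).stdout


def file_history(path):
    """Yield (commit, lines) for every commit touching path, oldest first."""
    for rev in git("rev-list", "--reverse", "HEAD", "--", path).decode().split():
        try:
            data = git("cat-file", "-p", f"{rev}:{path}")
        except subprocess.CalledProcessError:
            continue  # file absent in this commit (e.g. it was deleted)
        if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
            text = data.decode("utf-16")
        else:
            text = data.decode("utf-8-sig", errors="replace")
        yield rev, text.splitlines()
```

Calling blame_revisions(file_history("some/file.cs")) inside a repository would return one commit id per line of the current file. It treats history as a straight line, so merges are not handled the way git blame handles them.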

In other words, to get those good results, they had to re-invent the wheel and do an end-run around Git Blame entirely.

Sigh.
_________________________
Tony Fabris

#367903 - 24/11/2016 08:26 Re: Git question. Any Git experts here? [Re: tfabris]
Roger
carpal tunnel

Registered: 18/01/2000
Posts: 5680
Loc: London, UK
Originally Posted By: tfabris
In other words, to get those good results, they had to re-invent the wheel and do an end-run around Git Blame entirely.


Of course, they could have submitted their changes to git for merging upstream...

Sigh.
_________________________
-- roger
