Bitbucket Server: How to obtain real commit hashes from an effective diff hash?

Synthead November 16, 2017

When querying the API for commit comments, I sometimes get anchors that have a diffType of "EFFECTIVE" with a toHash that appears to be generated somehow.  Is there a way that I can use this data to get a list of commits used for this "effective" diff?

Here's an example from the "activities" endpoint:

{"fromHash"=>"87fabe1ef09821868e789b5bde5b5cfb20c901fa",
"toHash"=>"da2f8b463cd5b28854958dea27f8f5e71884f445",
"line"=>7,
"lineType"=>"ADDED",
"fileType"=>"TO",
"path"=>"Rakefile",
"diffType"=>"EFFECTIVE",
"orphaned"=>true}

 

1 answer

0 votes
Juan Palacios
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
May 17, 2018

Hi @Synthead

Thanks for reaching out. I'm not 100% sure what information you are after. If you are looking for a list of commits in the pull request, then the pull request commits endpoint (or the PullRequestService.getCommits method in Java) could give you what you are after.

The EFFECTIVE diff type is the diff that pull requests display by default. To produce this diff, the fromHash is the HEAD of the PullRequest.getToRef. The toHash is calculated by merging the PullRequest.getFromRef and PullRequest.getToRef. In essence the effective diff shows users "how the target branch will change if the PR was merged as is". You can find more information about the diff in this post.

Hope this is the information you were looking for.

Regards

Juan Palacios

Tyler Mann September 15, 2018

This EFFECTIVE diff type does seem to make it very difficult to use the API and tell what commit a comment is on since there doesn't seem to be a way for the caller to relate an effective hash to an existing hash in the repo (or a hash returned by the commits API). Even if the toHash is calculated by merging the fromRef into the toRef, it is a constantly changing value and even if I do merge a pull request through bitbucket UI this hash will change since the commit timestamp will change.

Is there any way through the API to convert an effective toHash to its corresponding commit toHash, or list all commits with their effective hashes? Or alternatively is there any way to list all comments with their anchors expressed as real commit hashes instead of effective hashes?

Juan Palacios
Atlassian Team
September 16, 2018

Hi @Tyler Mann,

Thanks for your feedback. Allow me to break down your comment to more clearly address your concerns.

This EFFECTIVE diff type does seem to make it very difficult to use the API and tell what commit a comment is on since there doesn't seem to be a way for the caller to relate an effective hash to an existing hash in the repo (or a hash returned by the commits API)

Comments are not anchored at a commit. They are anchored at a diff. The hashes in the anchor tell us which diff the path and line refer to.

Even if the toHash is calculated by merging the fromRef into the toRef, it is a constantly changing value and even if I do merge a pull request through bitbucket UI this hash will change since the commit timestamp will change.

That is correct. However comment threads anchored at an effective diff are processed on every update to the source or the target branch to update their anchor to the new diff. We call this comment drifting. Whenever a comment thread can't be drifted (e.g.: an update has removed the anchor file/line from the diff) it is marked as outdated, it no longer shows up on the diff, but can be seen in the activity stream.

Is there any way through the API to convert an effective toHash to its corresponding commit toHash, or list all commits with their effective hashes? Or alternatively is there any way to list all comments with their anchors expressed as real commit hashes instead of effective hashes?

There are three types of anchors available in the system.

  • EFFECTIVE: These anchors reference the effective diff described in my previous comment. The fromHash is the HEAD of the target branch and the toHash is the calculated merge commit hash. Whenever the pull request is rescoped (i.e.: the source or the target branch are updated) these anchors are processed so that we can update them to reference the new hashes (drifting their path and line if necessary). At any given time when retrieving these anchors you can be sure that the commits used to produce the merge hash are the HEADs of the pull request's source and target branch.
  • COMMIT: These anchors reference the diff between a commit and its first parent. They are used when looking at the diff for a commit either in the Commit page or when selecting a commit in the pull request diff drop down menu. These anchors are never drifted.
  • ITERATIVE: These anchors reference a diff for a commit range. The fromHash is an ancestor of the toHash. They are used in iterative review diffs. These diffs are produced when a reviewer comes back to a pull request they've marked as "Needs work" in the past after changes have been added to it. In this case we display for the reviewer a diff which shows them "what's new" by calculating a diff from the old HEAD of the source branch to the new one. These anchors are never drifted.
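To make the three anchor types concrete, here is a minimal Python sketch that groups anchors by their diffType and flags the ones that are drifted on rescope. The anchor dictionaries mirror the fields shown in the original question; the function names are hypothetical and are not part of any Bitbucket API.

```python
from typing import Any, Dict, Iterable, List

# Per the description above, only EFFECTIVE anchors are drifted;
# COMMIT and ITERATIVE anchors are never drifted.
DRIFTED_DIFF_TYPES = {"EFFECTIVE"}


def group_anchors_by_diff_type(
    anchors: Iterable[Dict[str, Any]],
) -> Dict[str, List[Dict[str, Any]]]:
    """Group comment anchors by their diffType field.

    Anchors without a diffType are collected under "UNKNOWN"
    (a label chosen for this sketch only).
    """
    groups: Dict[str, List[Dict[str, Any]]] = {}
    for anchor in anchors:
        diff_type = anchor.get("diffType", "UNKNOWN")
        groups.setdefault(diff_type, []).append(anchor)
    return groups


def is_drifted_on_rescope(anchor: Dict[str, Any]) -> bool:
    """Return True if the anchor's hashes may be rewritten when the PR is rescoped."""
    return anchor.get("diffType") in DRIFTED_DIFF_TYPES
```

An integration that stores hashes long-term would probably want to treat the two groups differently, since only the non-drifted anchors keep stable hashes.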

Hope this helps clarify how our Comment API works.

Regards

Juan Palacios

Tyler Mann September 17, 2018

Hi Juan,

Thanks so much for the detailed explanation and quick response. This definitely makes sense to me and helps clarify how your API works. Unfortunately, I still have the same issue: I need to know what line a comment is on at a known commit hash and path (this is how Bitbucket Cloud works as well; it gives you a real commit hash to reference the comment's location). However, I think I have a hacky workaround for now, which at least lets me convert the EFFECTIVE comments to the hash of the tip of the branch.

My workaround is to call something like the <pullRequest>/changes API to get the current toHash, which seems to be the same one used for EFFECTIVE comments that are still visible. Then I call the API to get the pull request details and take the FromRef.LatestCommit. Then I call the /changes API one more time to make sure that the toHash hasn't changed. If it hasn't, EFFECTIVE comments that carry the hash seen from the /changes API can be mapped to the real SHA retrieved from the pull request's FromRef.LatestCommit.
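A rough Python sketch of this workaround follows. It assumes the two API calls are wrapped in caller-supplied functions: `get_changes_to_hash` and `get_source_head` are hypothetical callables standing in for the /changes request and the pull request details request, and no HTTP details are implied.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def snapshot_effective_mapping(
    get_changes_to_hash: Callable[[], str],
    get_source_head: Callable[[], str],
) -> Optional[Tuple[str, str]]:
    """Return (effective_to_hash, source_head), or None if the PR changed mid-read.

    The /changes hash is read before and after the source head so that a
    rescope happening between the calls is detected rather than silently
    producing a wrong mapping.
    """
    before = get_changes_to_hash()
    source_head = get_source_head()
    after = get_changes_to_hash()
    if before != after:
        return None  # PR was rescoped in between; the caller may retry
    return before, source_head


def map_anchor_to_commit(
    anchor: Dict[str, Any], snapshot: Tuple[str, str]
) -> Optional[str]:
    """Map an EFFECTIVE anchor to the source head if it carries the current hash."""
    effective_hash, source_head = snapshot
    if anchor.get("diffType") == "EFFECTIVE" and anchor.get("toHash") == effective_hash:
        return source_head
    return None  # other diff types, or an anchor left on an older effective hash
```

As noted in the post, anchors whose toHash no longer matches the current effective hash are left unmapped by this approach.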

I am basically looking to see if there is an easier way to get this mapping, or an existing commit and path/line for a comment. Although the comments are on the diff, as you mentioned, you likely also have this kind of mapping internally in order to do the drifting you describe.

Thanks so much for your help and quick response!

Tyler

Juan Palacios
Atlassian Team
September 17, 2018

Hi @Tyler Mann

Glad my comment helped clarify how the API works.

Regarding your concerns, would it be possible for you to provide a little bit of context? I think I'll be able to provide a better solution if I understand what it is you are building.

Cheers

Juan Palacios

Tyler Mann September 17, 2018

Hi @Juan Palacios,

Yes, essentially I am building an integration that renders the diff of a pull request with comments overlaid, and that posts comments back to the pull request. You can think of it as similar to the "Diff" tab for viewing a pull request on Bitbucket Server.

The modeling that we use references a comment's location using a commit hash, file path, and line number. If we have those 3 data points then we can tell where a comment was posted and do the "drifting" you mentioned ourselves using other git/commit data since we can inspect the commit at the specific hash. This works for us for all other providers (github, gitlab, and bitbucket cloud) as they all in some way reference a comment in a way that can be directly tied back to a commit hash, file path, and line number. The thing I am having trouble with is that the effective hashes are not something I can directly understand or calculate on my side and relate to any other git data/commits to tell where the comment should be located.

If there were some way to list the comment anchors with a commit hash that backs the EFFECTIVE diff and exists in the repo, that would be useful for me. For example, a query parameter such as `?diffType=COMMIT` that would translate these anchors into ones referencing commit hashes.

Or alternatively, just a way to list the effective diffs that have existed for this pull request, with both the effective hash and the head hashes that were used to compute them.

If it's not possible, then that is okay. I do have the workaround I mentioned above, which works, but it is awkward to have to call 3+ APIs to get one piece of information, and it will only work for comments that are not orphaned yet.

Thanks!
Tyler

Juan Palacios
Atlassian Team
September 18, 2018

Hi @Tyler Mann,

Thanks for providing the extra context. Let me see if I can provide some information to help you out.

The modeling that we use references a comment's location using a commit hash, file path, and line number.

This doesn't seem like it will work for iterative diff comments where the fromHash in the diff can be any ancestor of the toHash.

The thing I am having trouble with is that the effective hashes are not something I can directly understand or calculate on my side and relate to any other git data/commits to tell where the comment should be located.

Technically you should be able to change the refspec configuration in your local repository to fetch the pull request refs which would bring the effective diff objects into your local copy allowing you to work with the hash the same way you would any other commit. To do so you'll need to add the following:

fetch = +refs/pull-requests/*:refs/remotes/origin-pr/*

NOTE: I set the target to origin-pr to avoid overlapping with someone naming a branch with the "pull-requests/" prefix.
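For anyone scripting this setup, the sketch below builds the `git config` invocation that appends the refspec above to a remote's fetch configuration. The helper name and the default remote name `origin` are assumptions made for illustration; the refspec itself is copied from the post.

```python
from typing import List

# Refspec quoted in the post above; the origin-pr target keeps PR refs
# separate from ordinary remote-tracking branches.
PR_REFSPEC = "+refs/pull-requests/*:refs/remotes/origin-pr/*"


def add_pr_refspec_command(remote: str = "origin") -> List[str]:
    """Build the argv for adding the pull request refspec to a remote.

    `git config --add` appends a value instead of replacing the existing
    fetch refspec, so regular branch fetching keeps working.
    """
    return ["git", "config", "--add", f"remote.{remote}.fetch", PR_REFSPEC]
```

The returned list can be passed to `subprocess.run(..., check=True)` inside the local clone, followed by an ordinary `git fetch` of that remote.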

Finally, please consider the following:

  • Effective diff merges are produced using the HEAD of the pull request's source and target branches, which you should be able to get from the comment itself. In the Java API, use comment.getThread().getCommentable() (the Commentable is either the CommitDiscussion or the PullRequest, and you can use commentable.accept(CommentableVisitor) to run type-specific logic). In the REST API, the getComments response has the pullRequest field.
  • In some extraordinary circumstances the system can fail to calculate an effective diff (e.g., if we are out of disk space and git is unable to write the new objects). In these scenarios Bitbucket Server falls back to the common ancestor strategy: it calculates the merge-base between the branches and displays a diff from the merge-base to the HEAD of the source branch. When this happens we still drift the comments, so the hashes in a comment anchor may not be from an effective diff.
  • Effective diffs can produce conflicts. Bitbucket Server has some pretty intricate logic to deal with all possible conflicts. It means though, that in a content conflict for instance, the diff may include conflict markers.

Hope this helps you get started.

Cheers

Juan Palacios

Tyler Mann September 18, 2018

Hi @Juan Palacios,

The modeling that we use references a comment's location using a commit hash, file path, and line number.

This doesn't seem like it will work for iterative diff comments where the fromHash in the diff can be any ancestor of the toHash.

This technique does actually work well from our use so far. Essentially we are tracking comments similar to the way git blame works. If you have one point of reference of a line/path/hash then you can see the blame where that line was added/edited and use that to drift the comment to other commits regardless of if the fromHash or toHash is changing.

Technically you should be able to change the refspec configuration in your local repository to fetch the pull request refs which would bring the effective diff objects into your local copy allowing you to work with the hash the same way you would any other commit. 

Awesome, thanks for this! I will definitely check it out.

Effective diff merges are produced using the HEAD of the pull request's source and target branches, which you should be able to get from the comment itself. In the Java API, use comment.getThread().getCommentable() (the Commentable is either the CommitDiscussion or the PullRequest, and you can use commentable.accept(CommentableVisitor) to run type-specific logic). In the REST API, the getComments response has the pullRequest field.

Ah thanks yes this could also work, we are using the REST API. Was detouring from using this getComment API at first because it seems to require a path to be specified which could make it need to be called many times even if there are no comments. Was instead using the getActivities API, but will keep it in mind as another option to play around with.

In some extraordinary circumstances the system can fail to calculate an effective diff (e.g., if we are out of disk space and git is unable to write the new objects). In these scenarios Bitbucket Server falls back to the common ancestor strategy: it calculates the merge-base between the branches and displays a diff from the merge-base to the HEAD of the source branch. When this happens we still drift the comments, so the hashes in a comment anchor may not be from an effective diff.

Effective diffs can produce conflicts. Bitbucket Server has some pretty intricate logic to deal with all possible conflicts. It means though, that in a content conflict for instance, the diff may include conflict markers.

Some other really great information here, will keep this in mind for testing.

Thanks for all the help here and walking me through all of this. I feel like I have a much better understanding of how things are working now.

Cheers,

Tyler

Juan Palacios
Atlassian Team
September 18, 2018

Glad I could be of service @Tyler Mann!

I'd point out one more thing:

Was detouring from using this getComment API at first because it seems to require a path

If you use the getComments REST API you can get all comments for a pull request (in pages) and filter them by diff type (e.g.: EFFECTIVE if you don't want to work with COMMIT and ITERATIVE comments).

Good luck!

Juan

Tyler Mann September 19, 2018

Thanks @Juan Palacios

If you use the getComments REST API you can get all comments for a pull request (in pages) and filter them by diff type (e.g.: EFFECTIVE if you don't want to work with COMMIT and ITERATIVE comments).

Yes, if I try to call this API without a `?path=` query parameter, I get a validation error saying 'The path query parameter is required when retrieving comments.' It is a little surprising that it is required, since it wouldn't appear so from the documentation. If you know of any way around this, it would be great to be able to page through all of the comments on the pull request. Calling this once for every file in the pull request seems somewhat tricky, which is why I have been using the /activities API and filtering to COMMENTED activities.

Juan Palacios
Atlassian Team
September 19, 2018

Hi @Tyler Mann

There's currently no way around the path parameter. The way it's intended to work is, you'd go through the /changes API to get the diff-tree and then you'd request the comments for each file as you display its diff.
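The intended flow described here (walk the /changes results, then request comments per file) could be sketched in Python roughly as follows. Both callables are hypothetical wrappers around the REST calls, and the assumption that each comment carries an `anchor` dictionary with a `diffType` field is based on the anchor sample in the original question rather than on a confirmed response schema. Filtering is done client-side here for simplicity.

```python
from typing import Any, Callable, Dict, Iterable, List


def collect_comments_by_path(
    list_changed_paths: Callable[[], Iterable[str]],
    get_comments_for_path: Callable[[str], Iterable[Dict[str, Any]]],
    diff_type: str = "EFFECTIVE",
) -> Dict[str, List[Dict[str, Any]]]:
    """Fetch comments one changed file at a time and keep those of one diff type.

    list_changed_paths stands in for paging through the /changes API;
    get_comments_for_path stands in for the comments API called with
    its required path parameter.
    """
    result: Dict[str, List[Dict[str, Any]]] = {}
    for path in list_changed_paths():
        matching = [
            comment
            for comment in get_comments_for_path(path)
            if comment.get("anchor", {}).get("diffType") == diff_type
        ]
        if matching:  # skip files without matching comments
            result[path] = matching
    return result
```

This costs one comments request per changed file, which is the trade-off discussed in the replies around this post.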

Do you have to access all comments in the pull request at once? If so, could you elaborate as to why?

Cheers

Juan

Tyler Mann September 20, 2018

Gotcha, that makes sense: use the changes API first. There is no hard requirement on needing all of the comments at once, but it is the way our APIs are modeled, since this is how the other providers we have worked with model their APIs (Bitbucket Cloud, GitHub, GitLab). Not a huge deal though, since you can use the changes API as you mentioned and then only call the API for each file that there are comments on.

Thanks for all the help, think I have things in a better state for now :)

Tyler
