Reading the contents of a file in a pre-commit hook filter

Bjarte Stien Karlsen October 28, 2017

Hello, 

I am trying to create a pre-commit hook that will gather up all changes in a push and send them off to a remote server for validation. 

The code is here https://github.com/Skatteetaten/bitbucket-precommit-hook-boober-validate/blob/master/src/main/java/no/skatteetaten/aurora/bitbucket/hook/BooberVerifyer.java

The version of bitbucket server it is going to run on is 5.4.0

When I run this with atlas-debug -u 6.3.7 i get the following error message on a push:

[INFO] com.atlassian.bitbucket.content.NoSuchPathException: The path "foo.json" does not exist at revision "3c22f6cdefdeaef5694dcbd83c3dc6e75703f301"
...
[INFO] at no.skatteetaten.aurora.bitbucket.hook.BooberVerifyer.lambda$null$2(BooberVerifyer.java:42)

If you look in my repository in git that i push from:

> git show 3c22f6cdefdeaef5694dcbd83c3dc6e75703f301 

commit 3c22f6cdefdeaef5694dcbd83c3dc6e75703f301 (HEAD -> master)
Author: Bjarte Stien Karlsen <bjarte.karlsen@skatteetaten.no>
Date: Sun Oct 29 00:58:46 2017 +0200

foo

diff --git a/foo.json b/foo.json
index b343dfb..1b67eb3 100644
--- a/foo.json
+++ b/foo.json
@@ -1,3 +1,3 @@
{
- "foo" : "ban"
+ "foo" : "baz"
}


What am I doing wrong? Or is there another easier way of doing this?

The hook is basically going to verify that only json files can be added and they have to be valid. It would also be nice if It could show a hint in a PR if that PR is valid.

 

2 answers

1 vote
Juan Palacios
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
May 17, 2018

EDIT: This answer is incorrect. Please move on to the next answer.

 

Hi @Bjarte Stien Karlsen,

Thanks for reaching out. I think the problem in your code is in how you are calling the content service. The second argument you are providing is "changesreq.getUntilId()", however the API) requires the `objectId`.

You should be able to get this information from the "Change.getContentId" instance ("change" in your code). However please read the javadoc thoroughly so that you properly handle deleted files (you'll need to check the change type)

I'm not sure how the pre-commit is supposed to work but I'd point out a couple of things:

  • Be careful when loading content into memory. Pushes can be _very_ large so if you load them all into memory you could cause an "OutOfMemoryError"
  • Consider that pre-receive hooks will hold the client until they all complete before returning. This means people pushing to the repository will have to wait for all these validations to complete. Long running validations are usually not a good idea.

An alternative could be to turn this into a merge check. You could enforce that changes to master happen via pull requests using Branch Restrictions. Then the remote server is not pushed changes, instead it receives webhook calls about pull requests being opened and updated and it fetches from the repository, evaluates the contents "lazily" and produces a result which it can either store locally (for querying from the merge check) or send it back to a Bitbucket Server plugin to store using AO. This approach would take care of your second requirement which is showing in the PR that the changes cannot be introduced to "master".

Anyways, best of luck with your integrations

Cheers

Juan Palacios

liborp May 28, 2018

Hi ,

I've tried adjust the example of pre-commit hook plugin as you suggested and instead of using 

changesreq.getUntilId()

I used 

change.getContentId()

when I want to get content of committed file by ContentService , source (line 42) here

https://github.com/Skatteetaten/bitbucket-precommit-hook-boober-validate/blob/master/src/main/java/no/skatteetaten/aurora/bitbucket/hook/BooberVerifyer.java

 

But I'm getting error 

[INFO] 2018-05-28 16:50:55,858 WARN [threadpool:thread-3] admin @1496WNVx1010x4x0 127.0.0.1 "POST /scm/project_1/rep_1.git/git-receive-pack HTTP/1.1" c.a.s.i.h.r.DefaultRepositoryHookService [PROJECT_1/rep_1[1]] Error calling no.skatteetaten.aurora.bitbucket.hook.BooberVerifyer.preUpdate (no.skatteetaten.aurora.bitbucket.boober-verifyer:boober-verifyer)
[INFO] com.atlassian.bitbucket.content.NoSuchPathException: The path "test3.java" does not exist at revision "294ef9ef5cea533cb363fb2a9dec2bb12e6397b9"
[INFO] at com.atlassian.bitbucket.scm.git.command.GitCommandExitHandler.newNoSuchPathException(GitCommandExitHandler.java:190)
[INFO] at com.atlassian.bitbucket.scm.git.command.GitCommandExitHandler.evaluateStdErr(GitCommandExitHandler.java:91)
[INFO] at com.atlassian.bitbucket.scm.git.command.GitCommandExitHandler.onError(GitCommandExitHandler.java:197)
[INFO] at com.atlassian.bitbucket.scm.DefaultCommandExitHandler.onExit(DefaultCommandExitHandler.java:31)

The path is correct as I have really pushed test3.java file.

Any suggestion please?

I need to get content of pushed file to run check for code style. And I'm really struggling with it, please help :-)

Cheers

Libor

Juan Palacios
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
May 29, 2018

Hi @liborp,

I'll have to apologise for my previous answer. Event thought the parameter is named objectId the expected value is actually the "the identifier for a point in the repository. Can be a commit ID, branch or tag" so what you really want to use is refChange.getToHash(). 

I realise this is what was originally there in the code. The reason is not working is actually a bit more intricate.

Starting with git 2.11 the objects in a push are placed into a quarantine directory until all pre-receive hooks run successfully. If any hook rejects the push the quarantine directory is discarded. If the push is accepted the objects are copied across to the repository.

Bitbucket Server 4.13 added support for quarantine environments allowing users to run git commands to inspect the contents of a push and find all the necessary objects (repository objects + quarantine objects). Unfortunately the way it works is tied to the thread the hook is running on.

In the pre-receive hook you are attempting to build you are nesting two git commands. The callback provided to `streamChanges` is running on a different thread, as a result, when you call `ContentService.streamFile` the quarantine environment is not applied and the objects are not available.

Now the good news is there might be a workaround available for you. the ScmService provides access to the ScmBulkContentCommandFactory which can produce the information you are looking for with a single command via the contents(@Nonnull BulkContentCommandParameters parameters, @Nonnull BulkContentCallback callback) method.

I've tested this locally and it works as expected

request.getRefChanges().stream()
.filter(change -> change.getRef().getId().equalsIgnoreCase("refs/heads/master"))
.forEach(refChange -> scmService.getBulkContentCommandFactory(request.getRepository())
.contents(new BulkContentCommandParameters.Builder(refChange.getToHash())
.sinceCommitId(refChange.getFromHash())
.build(), new BulkContentCallback() {
@Override
public void onFile(@Nonnull BulkFile file, @Nonnull InputStream content) {
log.info("Streaming file {} with content id {}", file.getPath(),
file.getContentId());
}
}).call());
return RepositoryHookResult.accepted();

NOTE: I used a BulkContentCallback anonymous class here for clarity but you should be able to use a lambda

I hope this helps you build the plugin you are working on. However I must once again point out that it sounds like a very expensive (time and resource wise) plugin you are building. The PreRepositoryHooks are called whenever a user OR the system attempts to update a ref. This means the hook you are building will potentially keep users waiting a long time for a push to complete, but it will also keep the system waiting for it as well (potentially triggering time-outs and causing errors). It is also possible for ref changes to get very large (e.g.: on rebases) so loading their contents into memory can be a massive memory hog.

I recommend you are very careful moving forwards and you consider alternatives to offload the heavy lifting. One idea that comes to mind is setting up a PR workflow enforced via branch permissions. You can combine it with a green build requirement to merge the PR. You can then add a task to the build which runs this check style.

Regards

Juan Palacios

Julius Davies _bit-booster_com_
Rising Star
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
May 29, 2018

Amazing followup and explanation @Juan Palacios !

To OP:  if you're wanting to do static analysis of code before allowing merges, the Sonar plugin for Stash/Bitbucket is a pretty interesting way to go about that.  Requires quite a bit of setup (including a separate running Sonar instance), but once you have it working I think it can be very helpful.

(p.s. I toyed with posting an answer to this question a couple days ago, but quickly realized I had no idea why things were not working for OP.  Your explanation makes a lot of sense.)

Bjarte Stien Karlsen May 29, 2018

Thanks for letting me know. 

What I am trying to do is to send the contents of the files to a remove service for validation. The hook is intended to run to validate some custom files using a separate logic.

liborp May 31, 2018

@Juan Palacios many thanks for such deep and full explain. I will try your suggest.

@Julius Davies _bit-booster_com_ yes I stopped working on this plugin and we are thinking to use Sonar plugin, thank you

gil February 1, 2019

Hi Juan - Looking at your code above, where does the scmService variable come from?  Thanks.

gil February 1, 2019

I just figured this out.  scmService can be passed in from the constructor.

0 votes
Bjarte Stien Karlsen May 29, 2018

@Juan Palacios thanks for answering. This does not have priority for me at work right now but I will look into it. We ended up just using a local pre-push commit hook that we instrument with out tooling. 

I would prefer this solution though 

Suggest an answer

Log in or Sign up to answer
TAGS
AUG Leaders

Atlassian Community Events