Stages and artifact passing

Jörg Godau February 5, 2012

Hi all,

I'm trying to set up Bamboo for our organisation, so that we can evaluate it and hopefully purchase it, but I'm stumped...

Problem: Splitting a build into Stages and passing the right artifacts

Ideas from: https://answers.atlassian.com/questions/19562/plans-stages-jobs-best-practices

I imagine my Plan to look something like this:

  • Stage
    • Job
      • Task

  • SCM Stage
    • Checkout
      • Checkout
  • Build Stage (creates jar artifact)
    • Compile
      • maven clean
      • maven install
  • Unit Tests Stage (consumes jar artifact)
    • JUnit
      • maven test

But this really doesn't work. The Build stage doesn't have any sources, so it can't compile.

So I combined SCM and Build into one Stage, which works and produces the Jar but now the JUnit Stage fails because it can't find any pom (again no sources).

I see all this information about how great Bamboo is, how you should split builds into Stages, and so on, but PLEASE, I need a working example of how this can be implemented.

Cheers

Jack...

6 answers

1 accepted

2 votes
Answer accepted
Jörg Godau February 6, 2012

Atlassian, please read this!

I really feel like I'm sitting in a brand new car, with lots of brochures telling me how fast it is, how well it handles, etc. Everyone tells me it's great and can get me from A to B super fast - but the handbrake is on and no one seems to have spotted it.

Your Jira team uses Bamboo to get their monster project built and tested in short order - why not post the configuration and the "custom maven plugin" mentioned by Luis?

*Please note my unanswered question/comment on that blog post from last year!

For everyone else - the "answer" (kind of)

I've spent three days cracking my skull against this, and now have something that "works" but is not ideal, more on that later.

Create a Plan, add an "SCM Stage", in that add an "SCM Job", and in that add a "Source Code Checkout" Task to do the checkout. Actually, the checkout task will be there by default (which is useful this one time, and really annoying all the other times you create Jobs, as you need to delete it everywhere else).

Optional: add any other Tasks that do pre-compile stuff (dos2unix or whatever weird things you might need) after the Checkout Task.

Go to the artifacts tab for the Job and click "create definition":

Name: all (call it what you like)

Location: . (just a period)

Copy pattern: **/* (just * didn't grab the subdirectories)

Shared: true

Add this artifact as an Artifact Dependency to any Stage that requires access to the checkout from SCM (probably all Stages).

What's not ideal about this?

  1. We have some big projects, hundreds of MB in SCM. A Plan with 10 Stages takes up 10 times that space, plus all the space for test results, real artifacts, etc.
  2. It would probably be possible to define smaller artifacts, with enough thought, so that the space requirement wouldn't be as big. But nearly every part of the project requires the sources in some form during the build process, so the time I could invest to slim this down would cost more than the disk space I'd gain.
  3. I need to work out some way to clean up these "all" artifacts - preferably in Bamboo.

What's annoying about this?

  1. This seems like something easy, but it's not clearly described anywhere.
  2. Manually having to add an artifact to every Stage is annoying.
  3. Having a default SCM Task in every new Job makes people think this is the "right thing to do", which it isn't!!
  4. Manually having to delete the useless Source Code Checkout from every Job is annoying - can't new Jobs just be empty, new, and clean?

UPDATE

It seems (see Piotr's comment below) that a checkout in each Job, fetching just updates rather than the full source, is the fastest and most space-efficient way to go, because Bamboo "locks" onto a particular set of sources from SCM. So given:

  • Developer commits CodeChange1
  • Bamboo CompileStage checks out and starts building
  • Developer commits CodeChange2
  • Bamboo TestStage starts testing (does a checkout and gets the same code as the CompileStage!) so the run is not affected by CodeChange2

Well done Atlassian. It's a great feature, perhaps it just needs to be a little bit more obvious in the documentation...

Cheers

Jack...

Jörg Godau February 6, 2012

Something just occurred to me. The SCM Tasks keep track of changes, and only check out the updates - so the source is stored somewhere centrally anyway, yes? If that's the case, why can't it just be accessed, instead of needing to create dodgy workarounds duplicating vast tracts of data?

SGD
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
February 7, 2012

Thanks for the feedback, Jack. I'll make sure the product manager and development team lead for Bamboo see it. (I totally agree about SCM tasks appearing by default when you create a new job, btw.) With regard to managing all the data (logs, build artifacts, build results), go to Administration > Plans > Build Expiry to configure how long to keep these things. There's more info in the documentation, too.

ReneR
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
February 7, 2012

Something just occurred to me. The SCM Tasks keep track of changes, and only check out the updates - so the source is stored somewhere centrally anyway, yes?

No. The server gets a list of changes from the upstream source code repository; the agent is the one that does the checkout, directly from the upstream source repo service. All the server knows about are lists of repo revision numbers/hashes.

You're essentially asking that Bamboo act as a source repo proxy service. I'm not sure that's the way to go but that's more a question for Sarah.

The paradigm of "every Job needs to check out its own source" has been at the heart of every CI implementation out there since the beginning. But this was before pipelining and chaining. In a world where you have multiple parallel stages, with multiple jobs and multiple tasks, then yes, perhaps CI tools can provide some efficiencies by caching the source code checkout that corresponds to a particular revision of the repo. But that means your CI server is going to end up storing possibly as many copies of the source repo as there are distinct revisions that you are building, so that's not really a strategy that minimizes disk space.

By placing the checkout ONLY on the agent, you make the space requirement a flat, transient one: all agents need only as much disk space as your biggest build requires -- assuming you clean the build working directory after each build. Having the server check out physical copies of every revision of your repo centrally and then pass them around to the agents on the wire is just going to increase the load and throughput required of all connections into the Bamboo server. That's a job that's traditionally best handled by whatever service is serving up your source code repo.

Jörg Godau February 7, 2012

Hi Sarah,

is it possible to selectively clean things from the builds? E.g. we need to keep the generated EAR and the Clover results, but don't want to keep all of the copies of the checkout "all" that we've been passing between stages.

Hi Rene,

ok, that makes sense. I guess no matter how one does it:

  • multiple checkouts lead to a load hit on the SCM servers and a time hit on the builds
  • a single checkout with artifact passing, the way I've done it, means a time hit passing large code bases to Agents
  • disk space is hit no matter what, unless one can find a clever way to check out selectively for each Stage, but I think that's risky: missing one file, or not adding a new file to the list, could cause strange failures in the build

I do think it would be something for Bamboo to look at - having a checkout per Plan option, or a Checkout per Stage, instead of forcing it down into each Job. I realise that's probably hard with different Jobs running on different Agents, on multiple Servers.

The biggest problem with Checkouts in each Job, is that you can end up with different code at different times:

  • DeveloperA checks in some code
  • Bamboo picks up the change, does the checkout, and starts building
  • DeveloperB checks in another change that affects the same area
  • Bamboo finishes the compile stage, starts the test Stage with a fresh checkout
  • Build Fails because the changes made by Developers A and B cause an error

This is something that my central checkout and artifact passing workaround solves: you're running one codebase throughout the whole Plan, so any failures are a direct result of check-ins that led to that build, making it easier to track them down in a huge project.

Cheers

Jack...

PiotrA
Rising Star
February 7, 2012

The biggest problem with Checkouts in each Job, is that you can end up with different code at different times: (...)

Er, no. Bamboo is designed to 'lock on' to a particular SCM revision at the beginning of Plan execution, and later it uses that remembered revision in each Job's checkout task. It seems to me that you think the Checkout task updates to the tip/head of the SCM repository *at the moment* of Job execution - no, it works differently. Checkout tasks update the source code to the revision 'seen' at the start of the whole build, so even if DeveloperB pushes more changes to the upstream repo in the meanwhile, it won't affect the checked-out sources during the "test Stage" from your example above.

Does what I'm saying make sense?

cheers,

PS

Jörg Godau February 7, 2012

Hi Piotr,

that makes sense and is cool!

This is making me rethink everything I've done!

A checkout in each Job, then, isn't as fatal as I had assumed. In fact it's probably much more efficient to just have the checkout run at the start of every Job, the way it comes out of the box!

If one doesn't do a complete checkout each time, then the hits on the SCM servers for fetching changes are probably going to be minimal, hence much faster and more space-efficient than copying the whole checkout tree over.

Atlassian - I take it all back (almost).

It turns out I've just spent three days removing the handbrake, and installing an anchor from the 1700s, mainly because I didn't understand how the handbrake worked!

Cheers

Jack...

ernestm February 23, 2016

This comment thread was very helpful, because after going through all the Bamboo docs I couldn't find anything about the revision "lock-on", and I was similarly concerned that multiple stages and jobs might get different code. Maybe an addition to https://confluence.atlassian.com/bamboo/checking-out-code-289277060.html mentioning this important attribute?

PavicZ May 23, 2016

I'd agree that the documentation needs to be updated to mention this point. I was under the same impression that each Source Code Checkout task updates to the head rather than a fixed revision for the entire build. It does take quite a bit of digging to find the mention in this thread...

Brad Riching October 6, 2016

This thread was very helpful in understanding how to architect our flow; however, the only way I found it was by stumbling upon it.

The principle of maintaining the exact checkout from the start of the build NEEDS to be more prominent in the getting-started tutorials. Otherwise, cursory evaluations of your product based solely on the information presented in the quick starts cast a HUGE amount of doubt on the plausibility of the whole thing working at all. I was deflected for days, researching hacks on how to get around sharing workspaces, or setting everything up in one job, in one stage.

My initial concerns were exactly the same as the other developers': a spurious commit in the midst of different stages of an executing plan would potentially be fatal, which it clearly is NOT. This was initially a non-starter for me, but now it has caused me to rethink my whole design. Fix this one thing in the quick starts, and you will make it a TON easier for people to latch on to your seemingly very well designed product.

3 votes
ReneR
Rising Star
February 6, 2012

If you're talking about this:

http://blogs.atlassian.com/2011/07/pipelining_the_build_for_fun_and_profit/

What you are missing is creating a test-runner artifact. Bamboo can't do that for you; you have to do it in Maven.

The test-runner.zip artifact needs:

* the test code

* any dependency jars

* the app jars and webapp configuration (if applicable) so that you can start an app against which to run the tests

* a POM file to run the tests

You can generate it using the maven-assembly-plugin, or the maven-groovy-plugin with a bit of Groovy to collect all the bits you need into a zip file when the build finishes.
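For reference, here's a minimal sketch of what such an assembly descriptor might look like. The file location (src/assembly/test-runner.xml) and the directory names are illustrative assumptions, not something prescribed by Bamboo or the blog post:

```xml
<!-- Hypothetical src/assembly/test-runner.xml -->
<assembly>
    <id>test-runner</id>
    <formats>
        <format>zip</format>
    </formats>
    <includeBaseDirectory>false</includeBaseDirectory>
    <fileSets>
        <!-- the compiled test code -->
        <fileSet>
            <directory>${project.build.testOutputDirectory}</directory>
            <outputDirectory>test-classes</outputDirectory>
        </fileSet>
        <!-- the standalone POM used to run the tests; assumed to live
             in src/test/runner in this sketch -->
        <fileSet>
            <directory>src/test/runner</directory>
            <outputDirectory>/</outputDirectory>
            <includes>
                <include>pom.xml</include>
            </includes>
        </fileSet>
    </fileSets>
    <!-- the app and dependency jars needed at test time -->
    <dependencySets>
        <dependencySet>
            <outputDirectory>lib</outputDirectory>
            <scope>test</scope>
        </dependencySet>
    </dependencySets>
</assembly>
```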

Then your stages become:

Stage 1 - Prepare (generate test-runner.zip), share it. I.e. 'mvn -P create-test-runner deploy'

Stage 2 - Unzip test-runner.zip, run 'mvn run-the-tests'

Note: Stage 1 will need a checkout task; Stage 2 won't.

The sample POM for the test-runner should look like:

<project>
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.yourorg.yourgroup</groupId>
    <artifactId>tests-runner</artifactId>
    <version>1.0</version>
    <name>Test Runner</name>

    <properties><!-- properties will be added here --></properties>

    <dependencies><!-- dependencies will be added here --></dependencies>

    <build>
        <defaultGoal>test</defaultGoal>
        <plugins>
            <plugin>
                <artifactId>maven-surefire-plugin</artifactId>
                <configuration>
                    <!-- surefire config ... -->
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

I would recommend first working with your project's POM to create the self-contained test-runner, with its embedded POM. Once you've got that working locally, it's trivial to do it in Bamboo many times over.
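As a sketch of the 'mvn -P create-test-runner deploy' wiring mentioned above, the profile in your main project POM might look roughly like this (the profile id comes from the command above; the descriptor path and execution id are assumptions):

```xml
<!-- In the main project's pom.xml: a profile that builds the
     test-runner zip during the package phase -->
<profiles>
    <profile>
        <id>create-test-runner</id>
        <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-assembly-plugin</artifactId>
                    <configuration>
                        <descriptors>
                            <!-- the assembly descriptor that collects test
                                 classes, dependency jars, and the runner POM -->
                            <descriptor>src/assembly/test-runner.xml</descriptor>
                        </descriptors>
                    </configuration>
                    <executions>
                        <execution>
                            <id>make-test-runner</id>
                            <phase>package</phase>
                            <goals>
                                <goal>single</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </profile>
</profiles>
```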

If you get stuck, I believe our sales@atlassian.com folks are happy to help customers who are evaluating. You might even raise a support issue for some technical help and mention that you're evaluating; if support doesn't help you outright, they'll route you to sales engineering. Or post here again with your progress, what you've done, and any POMs you have written, so you can get additional, specific help.

2 votes
Przemek Bruski
Atlassian Team
February 5, 2012

You need to merge your SCM and Build Stages. And you'll probably also need a checkout in the Test stage too.

Remember that each Job is separate and does not see anything that happened in other Jobs, unless you explicitly share it via artifacts.

If you're just starting with Bamboo, stick to a single stage/Job and let it grow from there.

Jörg Godau February 5, 2012

Hi, thanks for the reply.

By "Job" do you mean a Job under a stage or a Stage itself?

I did combine the SCM and Build stages - though I feel this goes against what Stages are for on a logical level. That works fine: the SCM & Build Stage completes and produces a Jar, which I share as an artifact.

The Test stage still fails, because even though the Jar is found, it cannot find any pom.xml (no sources).

It's a huge waste of time to check out code in every stage. It goes against what is being talked about

here https://answers.atlassian.com/questions/19562/plans-stages-jobs-best-practices

and here http://youtu.be/AHX7dE9KRhQ

How do the Jira guys do it where they only have one Checkout and one Compile?

We have large projects (hundreds of thousands of LoC, thousands of files); if I do an SCM checkout at the start of every stage, it will waste a lot of time.

If I wanted to crush everything into one stage and one job with a list of tasks, then I could stay with what we have and wouldn't need to buy Bamboo...

I'm just starting with Bamboo, and for that reason I need a proper example of how to set up a real-world project.

Atlassian: Please help me sell this to my management!

Cheers

Jack...

Przemek Bruski
Atlassian Team
February 5, 2012

The first question you have to ask yourself is: what can be parallelised in your build process? These things should go into separate Jobs. If you don't need to parallelise, you probably don't need more Jobs.

If the Jobs need some common stuff to work (like a compiled result), you should have a Job in a prior stage that prepares this. If you don't want to check out stuff in Stage 2, you have to make sure that the artifacts created in Stage 1 are self-contained - i.e. they can be run without checking out anything extra.

Jörg Godau February 5, 2012

That's what I'm trying to do!

  • SCM should be a prior stage for everything else!
    • (Is it somehow possible to have SCM as a Stage and copy "all" of it as an artifact for later stages?)
  • Then In the Build Stage, several Jobs for different kinds of builds (augmented for clover, normal for later deployment etc...)
  • Then in the Test Stage, several Jobs for batches of Tests
  • etc...

"If you don't want to check out stuff in Stage 2, you have to make sure that the artifacts created in Stage 1 are self-contained" - how?? Where are there proper examples for reasonably sized projects?

Surely I'm not the only one who wants to save time on builds and not check out on every stage. From the video link, the Jira developers manage this, but they don't give a detailed example I can follow.

We have big projects, we know we need to run stuff in parallel, and we don't want to repeat actions that waste time in the build cycle. I need Atlassian's help if I'm going to convince management that Bamboo is the way to go.

If I go to management and say "we'll get Bamboo and can run Tests in parallel, but to make that work we have to checkout the code 27 or more times per build" they will say that's crazy!

I don't want to seem rude, but I need some real answers, not generic comments.

Thanks

Jack...

0 votes
Larry Wilson June 11, 2019

I designed all of this outside Bamboo, since I'm used to applying several tools through a proprietary intermediate.

BUT, the key is in the Bamboo environment variables and namespaces.

A Manifest identifies each Component and how to Deliver and Provision the Package into a single Directory (PACKAGE)

Delivery consists of PREP BUILD Topology, CLEAN, BUILD, PACKAGE STAGING, PACKAGE CREATION, REPORT

A CLEAN stage consists of staging raw sources for all Components in a common layout in a unique directory Path on THE Build Host and the Path is maintained throughout the Plan as an environment variable.

The BUILD stage consists of processing raw sources into work products (production or debug) which may mean no change, compilation, simple rendering, ...

The work products can then be staged for unit or function testing by developers...

The PACKAGE STAGING stage is the process of transferring the interim work products into the correct form for PACKAGING, for example... a chroot staging under ".../stage/opt/product/..." for creating a binary self extractor that will place the product under "/opt/product/..."

The PACKAGE CREATION stage, copies the Product Package and whatever tools needed for Provisioning into a .../PACKAGE/... directory.

If multiple Components exist in the manifest, each Delivery is performed (serial where dependencies exist, or concurrent where possible); in the end the PACKAGE directory holds ALL Packages for the Product, with the means for Provisioning as well as the Manifest that declares order et al.

A Bamboo BUILD Plan can run the above for any BASELINE (Per component Branch Model: Reference, Project, Task, Release Candidate, and Generally Available). The PACKAGE directory is the Bamboo "Artifact" that can be passed to a DEPLOYMENT plan. The DEPLOYMENT Plan then applies Provisioning.

Provisioning consists of PREP Production Topology, DEPLOY, OPERATE, MONITOR, SHUTDOWN, REPORT.

The DEPLOY stage pulls the PACKAGE down, deploys and configures while maintaining the ability to rollback on error.

The OPERATION stage dispatches the Operational services from the Product

The MONITOR stage reviews Health Monitors for assessment and acceptance of initial dispatches.

The SHUTDOWN Stage analyzes the MONITORED data and optionally applies SHUTDOWN and Rollback when catastrophic anomaly is identified.

The REPORT stage sends notification of the Provisioning Status.

If multiple Components exist in the manifest, each Provisioning is performed (serial where dependency exists, or concurrent where possible).

PREP is performed to validate, verify, and optionally create Hosts based on a Topology Model, which identifies Network (Routes and Domains) and Host layout declared as Machine, Platform, Authentication, and Roles associated to each Host.

Both Delivery and Provisioning Tasks are optionally identified by Role, in whole or in part, which limits WHERE each Task is performed.

BUILD Plans are applied to Deliver the Package to an Archive with or without Test Cycles, ...

  1. Delivery only, to supply a Package into a Corporate Archive
  2. Delivery and Provisioning (using separate Topology Models) to perform the controlled placement of the Package to populate a Test Bed.
  3. Delivery and Provisioning of Product Components and automated Test Cycle Components for Continuous Integration or Continuous System Integration

DEPLOYMENT plans can be applied for:

  1. Delivery and Provisioning with CI and CSI Test Cycles.
    1. for Continuous Deployment
    2. for seeding a QA System Test and Integration Test Environment
  2. Provisioning of a Package from BUILD Plan Artifacts or a Package Archive

 

But, again, the key is the environment variables, and producing Namespaces for BAMBOO and the Product for each BUILD that can be passed along from the initial stage in BUILD through to the last stage in DEPLOYMENT.

0 votes
SGD
Atlassian Team
February 6, 2012

There are screenshots of the artifact configurations toward the top of the blog, in the "Sharing is Caring" section. And here are a couple of screenshots of the configs for the job that produces the artifact, and for the downstream job that consumes it. I'm afraid my real-world examples aren't drastically different from what is in the online documentation, but hopefully they will help you.

The artifact is generated by this job:

...and is consumed by this job. Note that there is no "checkout" task contained in this job. The deploy script executed here (deployToQA.sh) grabs the latest .jar file from ~/atlassian-cahce-api/target (which is where I told Bamboo to place the artifact when it is produced by the upstream job), then proceeds with the rest of the deploy steps.

Jörg Godau February 6, 2012

Thanks for the update.

Jack..

Deleted user October 11, 2017

@Jörg Godau @everyone else I still don't get it. How can you check out the code and build in the same stage? Tasks within the same stage run in parallel; the build might happen before the checkout is complete. So the two must live in different stages.

Can someone please post a link to simple instructions how to use checked out code in later stages?

Jörg Godau October 27, 2017

@[deleted] 

The units of work, from biggest to smallest, are:

  1. Stages (each stage runs after the previous stage)
  2. Jobs (all jobs in one Stage can run in parallel, if there are enough agents)
  3. Tasks (each Task in a job runs after the previous task)

 

 

Each Job has a "Source Code Checkout" task by default. Just leave it there and put your other tasks in after it to compile.

Later Stages do their own checkout (they may be on a different agent) and will _always_ check out the same code as the first one. So it all just works.

Cheers

Jack...

Lars Niestrad January 10, 2018

Okay, I'm very disappointed by now. Several users in this thread have asked for a working example (and in other threads too, btw). HOW do you, Atlassian, design your build plans and deployment projects for the common aspects of a normal Maven-based Java project?

We want to: compile, unit test, deploy to at least one environment, run integration tests against it, run a SonarQube analysis, and publish the tested artifact to an artifact server (Artifactory/Nexus).

This is the very basic build process for most projects today. How do you implement it the way Bamboo should be used and the experts recommend?

PLEASE provide us a complete and running example.

Chris Nollstadt June 7, 2018

Just started working with Bamboo and hit this. Maybe I'm missing something, but honestly I find it absurd that jobs running in parallel can't share a common working folder with source code. In my case I have multiple builds that can all compile in parallel (ExtJS, .NET, dacpac database projects) from the same source checkout, followed by a sequential packaging stage that uses that same source working folder with the new build outputs. I would think this would be a simple no-brainer option.

Grzegorz Lewandowski
Atlassian Team
June 7, 2018

@Chris Nollstadt, I suppose you use local agents only. In that particular case, the only benefits of using multiple jobs are running them in parallel (in case that's actually a benefit and your hardware can handle the whole load) and build composition/organisation.

Jobs are meant to be run on different remote agents, which will usually be different machines. There's no point in sharing a directory between machines.

If you want your build to be compiled in parallel, it's up to you to configure the build scripts accordingly.

0 votes
SGD
Atlassian Team
February 6, 2012

Hi Jack-

Just wanted to make sure you've seen the Bamboo documentation regarding artifacts. The pages on configuring the artifacts produced by a Job and configuring artifact sharing within your Plan might be helpful. Also, if you check out this blog post on setting up a continuous delivery pipeline, there are some screenshots of the artifact stuff.

I hope this helps a little!

Jörg Godau February 6, 2012

Hi Sarah,

thanks for the pointers. The configuration pages are, quite frankly, useless; there's a lack of real-world examples.

Your blog post was good, especially the part about "You Can Checkout Any Time You Like (but don’t do it more than once)" - that is exactly what I'm talking about. It's a pity you don't include the configuration for the artifacts that makes this happen.

Cheers

Jack...
