Keep on getting resultOrError: FAILED in "Not uploading caches/artifacts"

Hi Team,

So I have started experimenting with runners. My use case is GPU-based, i.e. I need to run some GPU-bound tests when a PR is created.

My bitbucket-pipelines.yml looks like this:

 

image: dspd/rtq:env

pipelines:
  default:
    - step:
        name: CUDA test
        runs-on:
          - 'self.hosted'
          - 'linux'
        script:
          - python test.py
          - echo "Cuda test run!"

dspd/rtq:env is a Docker image built on top of NVIDIA's CUDA image.

test.py looks like this:


import torch

assert torch.cuda.is_available() == True, "Cuda is not installed properly!"

The build fails, and the runner logs contain two FAILED statements:

[2021-04-16 13:55:40,535] Not uploading caches. (numberOfCaches: 0, resultOrError: FAILED)
[2021-04-16 13:55:40,536] Not uploading artifacts. (numberOfArtifacts: 0, resultOrError: FAILED)

 

To test, I changed

assert torch.cuda.is_available() == True, "Cuda is not installed properly!"

to

assert True == True, "Cuda is not installed properly!"

Then this step runs successfully:

[2021-04-16 13:52:42,323] Not uploading caches. (numberOfCaches: 0, resultOrError: PASSED)
[2021-04-16 13:52:42,324] Not uploading artifacts. (numberOfArtifacts: 0, resultOrError: PASSED)

Can anyone help me out in debugging this? The number of caches and artifacts is 0 in both cases. Is there any way to get more detailed logs?
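One way to get more detail is to log what the step container can actually see before the assert runs. Below is a minimal, hypothetical pre-check (stdlib only, not from Bitbucket's tooling) that could run ahead of test.py so the pipeline log shows why CUDA is unavailable rather than just a failed assert:

```python
# gpu_precheck.py - hypothetical diagnostic to run before test.py so the
# pipeline log records whether the container was given GPU access at all.
import os
import shutil

def gpu_evidence():
    """Collect basic evidence of GPU visibility inside the container."""
    # NVIDIA device nodes only appear in /dev when the GPU was passed through.
    nodes = sorted(d for d in os.listdir("/dev") if d.startswith("nvidia"))
    smi = shutil.which("nvidia-smi")  # NVIDIA driver utility, if on PATH
    return nodes, smi

if __name__ == "__main__":
    nodes, smi = gpu_evidence()
    print("NVIDIA device nodes in /dev:", nodes or "none (no GPU passed to container)")
    print("nvidia-smi on PATH:", smi or "not found")
```

Running this as an extra `script` line before `python test.py` would make the difference between "driver missing" and "device not passed through" visible in the build log.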

 

Thanks

3 comments

lassian Atlassian Team Apr 18, 2021

Hi Rochak,

Your build is failing because we don't currently allow access to host devices such as GPUs; we would have to provide a feature to enable this. To access a host device such as the GPU, it has to be passed in the Docker container-create request, and by default it is not.

Regarding your cache and artifact logs: from what I can see above, your step has no caches or artifacts defined, which is why it says it is not uploading any. Those are trace logs so that support can easily diagnose what is happening while a step runs.

Do you expect caches or artifacts to be uploaded? If so, you will need to add them to your step in your bitbucket-pipelines.yml.

Kind Regards,
Nathan Burrell
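For reference, caches and artifacts are declared per step in bitbucket-pipelines.yml. A minimal sketch (the `reports/**` path is a hypothetical example, not from the thread):

```yaml
- step:
    name: CUDA test
    caches:
      - pip            # built-in pip cache key
    artifacts:
      - reports/**     # hypothetical output directory to keep after the step
    script:
      - python test.py
```

With these defined, the "Not uploading caches/artifacts" trace lines would report non-zero counts.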

Hi @lassian, thanks for the reply. Can you help me out with accessing the GPU?

Here's what I have achieved so far.

- My host machine can access the GPU just fine; nvidia-smi works.

- My base Docker image from bitbucket-pipelines.yml can also access the GPU on its own; nvidia-smi works when I run the image with the --gpus all flag.

 

What extra do I need to add or pass to make the runner instance access it? I am already passing the --gpus all flag to the docker run command that brings the runner online. What else do I need to do to get CUDA working during the runner's runtime? Basically, all my merge checks are GPU-bound, so being able to access the GPU is critical for my use case.
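Since the runner itself creates the step container (so a --gpus flag on the runner's own docker run does not automatically propagate to it), one thing a merge-check script can do is fail fast with an explicit reason. A hedged sketch, stdlib only; the script name and messages are illustrative:

```python
# smi_report.py - hypothetical helper: run nvidia-smi if present and echo its
# output into the pipeline log, so a missing GPU fails fast with a clear reason.
import shutil
import subprocess
import sys

def smi_report():
    """Return (ok, message) describing GPU reachability from this container."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        # The binary comes with the driver; its absence suggests the step
        # container was created without the host's GPU devices.
        return False, "nvidia-smi not on PATH: container likely has no GPU access"
    proc = subprocess.run([smi], capture_output=True, text=True)
    if proc.returncode != 0:
        return False, "nvidia-smi failed: " + proc.stderr.strip()
    return True, proc.stdout

if __name__ == "__main__":
    ok, message = smi_report()
    print(message)
    sys.exit(0 if ok else 1)
```

Run as the first `script` line, this turns the opaque step failure into a log line naming the actual obstacle.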

 

Also, I understood the part about caches and artifacts.

Again, thanks.

lassian Atlassian Team Apr 19, 2021

Hi Rochak,

As I mentioned above, we don't currently support adding extra devices to the container-create request we make, so there is no way to attach additional host devices; we would have to add this as a feature.

We will take this feedback into consideration for a future release.

Kind Regards,

Nathan Burrell
