Hi Team,
I have started experimenting with runners. My use case is GPU-based, i.e. I need to run some GPU-bound tests when a PR is created.
My bitbucket-pipelines.yml looks like this:
image: dspd/rtq:env
pipelines:
  default:
    - step:
        name: CUDA test
        runs-on:
          - 'self.hosted'
          - 'linux'
        script:
          - python test.py
          - echo "Cuda test run!"
dspd/rtq:env is a Docker image built on NVIDIA's CUDA image.
test.py looks like this:
import torch
assert torch.cuda.is_available() == True, "Cuda is not installed properly!"
The build fails, and the runner logs contain 2 FAILED statements:
[2021-04-16 13:55:40,535] Not uploading caches. (numberOfCaches: 0, resultOrError: FAILED)
[2021-04-16 13:55:40,536] Not uploading artifacts. (numberOfArtifacts: 0, resultOrError: FAILED)
To test, I changed
assert torch.cuda.is_available() == True, "Cuda is not installed properly!"
to
assert True == True, "Cuda is not installed properly!"
With that change, the step runs successfully:
[2021-04-16 13:52:42,323] Not uploading caches. (numberOfCaches: 0, resultOrError: PASSED)
[2021-04-16 13:52:42,324] Not uploading artifacts. (numberOfArtifacts: 0, resultOrError: PASSED)
Can anyone help me debug this? The number of caches and artifacts is 0 in both cases. Is there any way to get more detailed logs?
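To get more detail than a bare assert, I could replace test.py with something more verbose. The following is only a minimal sketch: the function name collect_gpu_diagnostics and the specific checks (the /dev/nvidia0 device node, nvidia-smi on PATH, the CUDA_VISIBLE_DEVICES variable) are my own assumptions about what is worth looking at, not anything the runner itself requires. It prints what the container can see instead of failing on the first line.

```python
import os
import shutil
import subprocess


def collect_gpu_diagnostics():
    """Gather basic facts about GPU visibility inside the build container."""
    info = {
        # The device node is normally present only if the GPU was passed
        # through to the container (assumption: first GPU is /dev/nvidia0).
        "dev_nvidia0_exists": os.path.exists("/dev/nvidia0"),
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }

    # If nvidia-smi is available, ask it to list the GPUs it can see.
    if info["nvidia_smi_path"]:
        try:
            result = subprocess.run(
                [info["nvidia_smi_path"], "-L"],
                capture_output=True,
                text=True,
                timeout=30,
            )
            info["nvidia_smi_output"] = (result.stdout or result.stderr).strip()
        except (OSError, subprocess.SubprocessError) as exc:
            info["nvidia_smi_output"] = f"failed to run: {exc}"

    # Report what PyTorch itself thinks, without crashing if it is missing.
    try:
        import torch
    except Exception as exc:
        info["torch_error"] = str(exc)
    else:
        info["torch_version"] = torch.__version__
        info["torch_built_with_cuda"] = torch.version.cuda  # None on CPU-only builds
        info["torch_cuda_available"] = torch.cuda.is_available()
        info["torch_device_count"] = torch.cuda.device_count()

    return info


if __name__ == "__main__":
    for key, value in collect_gpu_diagnostics().items():
        print(f"{key}: {value}")
```

Running this as the first script line of the step should show in the build log whether the container sees a GPU device at all, or whether only the PyTorch check is failing.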
Thanks
Rochak Saini