
xvfb-run broken on elastic bamboo Ubuntu AMIs

ernestm November 19, 2015

I'm running Xvfb on an Elastic Bamboo Ubuntu instance (AMI ami-bba0fede, Ubuntu 15.04), as explained in the docs.  It works if I just set DISPLAY= and run Xvfb directly.

But I'd like to use the xvfb-run wrapper script (/usr/bin/xvfb-run, part of the Xvfb package) instead - it picks a free display, so you can run multiple tests on the same system without them conflicting. If you just hardcode a display number then you can only run one at a time.
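For anyone wondering how the "picks a free display" part works: as far as I can tell, with -a the script starts at display 99 and skips any number that already has an X lock file (/tmp/.X<n>-lock). Here is a minimal sketch of that idea, not the actual xvfb-run code. The function name and the optional lock-directory argument are made up for illustration.

```shell
#!/bin/sh
# Sketch only: print the first display number >= $1 that has no X lock file.
# The second argument (lock directory) is a hypothetical convenience for
# testing; real X servers put their lock files in /tmp.
find_free_display() {
    n=$1
    lock_dir=${2:-/tmp}
    while [ -e "$lock_dir/.X$n-lock" ]; do
        n=$((n + 1))
    done
    echo "$n"
}

# Example: first free display at or above 99 on this machine.
find_free_display 99
```

Two test jobs started at the same time would each get their own number this way, which is the whole point of using the wrapper instead of a hardcoded DISPLAY.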

This works on Ubuntu in my local virtualbox but hangs on the Atlassian-provided AMI (both when run from EB and when I pull it up manually).  It does work on a stock Canonical Ubuntu AMI (ami-3ad5af50), so something specific to the Elastic Bamboo instances must be breaking it.

This works:

export DISPLAY=:99
Xvfb :99 -ac
pybot --nostatusrc --variablefile variables/vars.py -e TBD -e broken -x xunit -d reports/ tests/
<test output here>

This works on virtualbox but hangs on an instance created from the EB AMI (both by Bamboo and also if I just manually start one off the same AMI):

xvfb-run --server-args="-screen 0 1024x768x24" -a -e xvfb.log pybot --nostatusrc --variablefile variables/vars.py -e TBD -e broken -x xunit -d reports/ tests/
<hangs>

I added a set -x to the /usr/bin/xvfb-run script to see what's going on. It hangs inside xvfb-run while it's trying to find a free display, before it gets to my code.

+ XVFB_RUN_TMPDIR=/tmp/xvfb-run.DJcsEj
+ tempfile -n /tmp/xvfb-run.DJcsEj/Xauthority
+ AUTHFILE=/tmp/xvfb-run.DJcsEj/Xauthority
+ mcookie
+ MCOOKIE=e1587c6161fa85ccc22010271b252b73
+ tries=10
+ [ 10 -gt 0 ]
+ tries=9
+ XAUTHORITY=/tmp/xvfb-run.DJcsEj/Xauthority xauth source -
+ trap : USR1
+ XVFBPID=2861
+ wait
+ trap USR1
+ exec Xvfb :99 -screen 0 1024x768x24 -nolisten tcp -auth /tmp/xvfb-run.DJcsEj/Xauthority
<hangs forever>

This is before my test code runs. It's basically a loop in xvfb-run that tries to find a free display:

...
# Start Xvfb.
MCOOKIE=$(mcookie)
tries=10
while [ $tries -gt 0 ]; do
    tries=$(( $tries - 1 ))
    XAUTHORITY=$AUTHFILE xauth source - << EOF >>"$ERRORFILE" 2>&1
add :$SERVERNUM $XAUTHPROTO $MCOOKIE
EOF
    # handle SIGUSR1 so Xvfb knows to send a signal when it's ready to accept
    # connections
    trap : USR1
    (trap '' USR1; exec Xvfb ":$SERVERNUM" $XVFBARGS $LISTENTCP -auth $AUTHFILE >>"$ERRORFILE" 2>&1) &
    XVFBPID=$!
 
    wait || :
    if kill -0 $XVFBPID 2>/dev/null; then
        break
    elif [ -n "$AUTONUM" ]; then
        # The display is in use so try another one (if '-a' was specified).
        SERVERNUM=$((SERVERNUM + 1))
        SERVERNUM=$(find_free_servernum)
        continue
    fi
    error "Xvfb failed to start" >&2
    XVFBPID=
    exit 1
done
...

It hangs on that trap/exec line. My bash-fu isn't good enough to figure out why.  Nothing appears in any log including the xvfb-run log if I configure one. 

Is anyone else running xvfb-run successfully, or know what this issue might be?

 

Addition: I've learned that Xvfb sends a SIGUSR1 back to its parent process once it is ready to accept connections, which is what the parent shell is wait-ing on.  But that signal is either not getting sent or not getting received.
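To make that handshake concrete, here is a small sketch that reproduces it without Xvfb. The fake_server string is a stand-in I made up: it waits a second, signals its parent, and keeps running, which is roughly what a ready X server does. The function and variable names are hypothetical too.

```shell
#!/bin/sh
# Stand-in for Xvfb (hypothetical): "start up", signal the parent, keep running.
fake_server='sleep 1; kill -USR1 "$PPID"; exec sleep 30'

handshake_demo() {
    got_usr1=0
    # A handler must be installed so that SIGUSR1 interrupts wait
    # instead of killing this shell.
    trap 'got_usr1=1' USR1
    # SIGUSR1 is set to ignored in the child before exec, mirroring the
    # (trap '' USR1; exec Xvfb ...) line in xvfb-run.
    (trap '' USR1; exec sh -c "$fake_server" >/dev/null 2>&1) &
    server_pid=$!
    wait || :    # returns early once the trapped signal arrives
    trap - USR1
    if [ "$got_usr1" -eq 1 ] && kill -0 "$server_pid" 2>/dev/null; then
        HANDSHAKE_RESULT=ready
    else
        HANDSHAKE_RESULT=failed
    fi
    # Clean up the stand-in server.
    kill "$server_pid" 2>/dev/null || :
    wait "$server_pid" 2>/dev/null || :
}

handshake_demo
echo "handshake: $HANDSHAKE_RESULT"
```

If the signal never arrives and the child never exits, that bare wait blocks forever, which matches the hang in the trace above.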

 

Solution: (why the heck can't you post an answer to your own question here?)

I found the issue.  Atlassian adds a /usr/local/bin/Xvfb wrapper script that turns on TCP listening.  Having another shell between xvfb-run and Xvfb keeps the SIGUSR1 from getting back to xvfb-run.  It also seems like a security hole to turn on TCP listening without you knowing, but eh.  If anyone else is trying to use xvfb-run, just deleting their wrapper script will do the trick.
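Here is a sketch of why an extra shell in the middle matters. The wrapper's real contents aren't shown here, so the wrappers below are stand-ins, and fake_server is a made-up substitute for Xvfb that signals its parent after a second. The assumption is that the wrapper starts Xvfb as a child process rather than exec-ing it, so the SIGUSR1 goes to the wrapper (which inherited "ignore") and never reaches the shell that is waiting.

```shell
#!/bin/sh
# Stand-in for Xvfb (hypothetical): signal the parent after 1s, then linger.
fake_server='sleep 1; kill -USR1 "$PPID"; exec sleep 1'

# Launch "$@" the way xvfb-run launches Xvfb; print 1 if SIGUSR1 arrived.
try_launch() {
    got_usr1=0
    trap 'got_usr1=1' USR1
    (trap '' USR1; exec "$@" >/dev/null 2>&1) &
    child=$!
    wait || :    # ends on SIGUSR1, or when the child exits
    trap - USR1
    kill "$child" 2>/dev/null || :
    wait "$child" 2>/dev/null || :
    echo "$got_usr1"
}

# 1. No wrapper: the server's parent is the waiting shell.
direct=$(try_launch sh -c "$fake_server")
# 2. Wrapper runs the server as a child: the signal stops at the wrapper.
wrapped=$(try_launch sh -c 'sh -c "$1"; exit $?' wrapper "$fake_server")
# 3. Wrapper uses exec: the server replaces the wrapper, so the parent
#    is the waiting shell again.
exec_wrapped=$(try_launch sh -c 'exec sh -c "$1"' wrapper "$fake_server")

echo "direct=$direct wrapped=$wrapped exec_wrapped=$exec_wrapped"
```

In the demo the stand-in server exits on its own, so the wrapped case just reports no signal. A real Xvfb keeps running, so xvfb-run's wait never returns. To see which Xvfb gets picked up on a given instance, `command -v Xvfb` should show whether something in /usr/local/bin is shadowing /usr/bin/Xvfb.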

2 answers

1 accepted

1 vote
Answer accepted
Earl McCutcheon
Atlassian Team
January 1, 2016

Update for anyone else who runs into this: the bug is being tracked here:

0 votes
Alex Soto November 19, 2015

Don't have an answer for you, but I just want to let you know that in my experience, the images get updated without notice.  I've had my company's build service taken out by Atlassian image updates multiple times.  If you can wait, file a support request.  Otherwise, it's on you to figure out and fix.

ernestm November 20, 2015

Right, I have logged a support ticket and am also working on figuring it out and fixing it myself, hence asking here. I wonder if it works for people who have built their own Elastic Bamboo instance from the instructions, as opposed to using the stock one?
