My service start/stop script stops the application process but doesn't release sockets held by a child process. Help?!

Sam Caldwell Atlassian Team Aug 26, 2016

Background

The question I am posing tonight comes straight from the real problems that cross my desk here at Atlassian. I figured I would share it with the community to save someone else the pain of running into it.

(1) There is a start/stop script for an application. It starts the application process under a service user account, and a stop command stops that process when needed.

(2) Recently, however, when the application was stopped and started without a corresponding shutdown and startup of the underlying machine, it could not be started again after being stopped.

(3) Log analysis revealed the following:

SEVERE: Failed to initialize end point associated with ProtocolHandler ["http-bio-127.0.0.1-8080"]

java.net.BindException: Address already in use /127.0.0.1:8080

(4) After stopping the application, netstat showed: 

tcp6       0      0 127.0.0.1:8080          :::*          LISTEN      23681/java      
tcp6       0      0 127.0.0.1:9080          :::*          LISTEN      23681/java      
tcp6       0      0 127.0.0.1:8005          :::*          LISTEN      23681/java      
tcp6       0      0 :::8009                 :::*          LISTEN      23681/java

(5) The catch is that, before the stop, the PID_FILE for the application process contained a different process ID than the one netstat reports:

#cat $PID_FILE
23680

(6) After the stop command, process ID 23680 had been stopped successfully.

(7) If process id 23681 is killed manually, the application will start without issue.
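The comparison in steps (4) and (5) can be scripted. Below is a minimal sketch, assuming Linux net-tools netstat output in the column layout shown above; listener_pids is a hypothetical helper name, not part of the original init script.

```shell
# listener_pids PORT
# Hypothetical helper: reads `netstat -tlnp` style lines on stdin and prints
# the PID of every process listening on PORT.
listener_pids() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, addr, ":")     # local address; the port is the last piece
            if (addr[n] == port) {
                split($7, owner, "/")    # "23681/java" -> "23681"
                print owner[1]
            }
        }' | sort -u
}
```

Running something like `netstat -tlnp | listener_pids 8080` (as root, so that PIDs owned by other users are visible) and comparing the result with `cat "$PID_FILE"` would show the mismatch directly: 23681 holding the port versus 23680 in the PID file.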

Problem Statement

The application in question does not shut down properly, leaving a child process running as a daemon with TCP sockets open. This prevents the application from being restarted without manually terminating that process or restarting the underlying machine.

Answer accepted
Sam Caldwell Atlassian Team Aug 26, 2016

Root Cause:

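The init script that starts/stops the application terminates the parent process with "kill -9 <PID>" (SIGKILL) rather than the more graceful "kill <PID>" (SIGTERM).

SIGKILL cannot be caught or handled, so the application never gets a chance to clean up after itself, and the child process holding the sockets is left running.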

Solution:

Change the init script to use 

kill $(cat $PID_FILE)

rather than 

kill -9 $(cat $PID_FILE)
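
One thing to keep in mind with a plain kill is that the command returns immediately, while the application may still be shutting down. A minimal sketch of a stop routine that sends SIGTERM, waits for the process to exit, and only falls back to SIGKILL after a timeout is shown below. The function name stop_app and the 30-second default are assumptions for illustration and are not taken from the original init script.

```shell
# stop_app PID_FILE [TIMEOUT_SECONDS]
# Hypothetical stop routine: graceful SIGTERM first, SIGKILL only as a last resort.
stop_app() {
    pid_file=$1
    timeout=${2:-30}    # assumed default; tune to the application's shutdown time

    [ -f "$pid_file" ] || { echo "no PID file at $pid_file" >&2; return 1; }
    pid=$(cat "$pid_file")

    # Plain kill sends SIGTERM, which lets the JVM run its shutdown hooks.
    # If the process is already gone, just remove the stale PID file.
    kill "$pid" 2>/dev/null || { rm -f "$pid_file"; return 0; }

    # Poll once per second until the process exits or the timeout elapses.
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "PID $pid still running after ${timeout}s; sending SIGKILL" >&2
            kill -9 "$pid" 2>/dev/null
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    rm -f "$pid_file"
}
```

A stop action could then call `stop_app "$PID_FILE"`. Note that if the SIGKILL fallback ever fires, the orphaned child process described in the question can reappear, so the warning it prints is worth watching for.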