Opened 11 years ago

Closed 10 years ago

#1796 closed enhancement (fixed)

Improve support for jobs running on external servers

Reported by: Nicklas Nordborg
Owned by: everyone
Priority: critical
Milestone: BASE 3.3
Component: core
Version:
Keywords:
Cc:

Description

The current job system in BASE is built mostly around the plug-in system, which is managed by the BASE server with the help of job agents. It is also possible to set a job to type=OTHER, but this is of limited use because functionality such as progress reporting, error handling, and aborting is not easy to implement for such jobs.

I think this needs to be improved so that it is easier to manage and control jobs submitted to external managers. I am currently not sure exactly what is needed, but we have a current use case that submits jobs to the Open Grid Scheduler (see http://gridscheduler.sourceforge.net/).

I am thinking that we need some kind of extension point that BASE can use internally to get information about jobs on the external system. We probably need some more columns in the database so we know which extension to query for information.

Change History (8)

comment:1 by Nicklas Nordborg, 11 years ago

(In [6432]) References #1796: Improve support for jobs running on external servers

Added an extension point (net.sf.basedb.core.signal.job) and ExtensionSignalTransporter that uses this extension point to send signals to jobs. The web interface has been updated to send 'ABORT' and 'STATUS' signals also to jobs in the 'WAITING' state.

The 'WAITING' state is somewhat redefined so that external jobs can use this state to indicate that the job has actually been queued. This means that jobs are no longer put in the 'WAITING' state as soon as parameters have been added; Job.setScheduled() must be called first. This change probably affects only a lot of the test code (must check this), not regular usage with the request/response API.

Added externalId property to job so that external jobs have a place to store an identifier.
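As a rough illustration of the behaviour described in this comment, the sketch below models a job that stays UNCONFIGURED until it is explicitly scheduled, carries an external identifier, and receives signals only once it is WAITING or EXECUTING. Everything in it (ExternalJobSketch, SketchJob, JobSignalHandler, sendSignal, the no-argument setScheduled() signature, and the sample values) is a hypothetical stand-in and not the BASE API; only the state names, the signal names, and the externalId property come from the comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model for illustration; these are not the BASE classes.
class ExternalJobSketch {

    enum State { UNCONFIGURED, WAITING, EXECUTING, DONE, ABORTED }

    /** Stand-in for an extension registered at the job signal extension point. */
    interface JobSignalHandler {
        void handle(SketchJob job, String signal);
    }

    static class SketchJob {
        private State state = State.UNCONFIGURED;
        private String externalId;
        private final List<String> parameters = new ArrayList<>();

        /** Adding parameters does not move the job to WAITING. */
        void addParameter(String parameter) {
            parameters.add(parameter);
        }

        /** Store the identifier handed out by the external system. */
        void setExternalId(String externalId) {
            this.externalId = externalId;
        }

        /** Explicit transition: the job has actually been queued. */
        void setScheduled() {
            if (state != State.UNCONFIGURED) {
                throw new IllegalStateException("Cannot schedule a job in state " + state);
            }
            state = State.WAITING;
        }

        void abort() { state = State.ABORTED; }
        State getState() { return state; }
        String getExternalId() { return externalId; }
        int parameterCount() { return parameters.size(); }
    }

    /** Deliver a signal to jobs that are WAITING or EXECUTING; ignore the rest. */
    static boolean sendSignal(SketchJob job, String signal, JobSignalHandler handler) {
        State s = job.getState();
        if (s != State.WAITING && s != State.EXECUTING) {
            return false;
        }
        handler.handle(job, signal);
        return true;
    }

    public static void main(String[] args) {
        // A made-up handler that only understands ABORT.
        JobSignalHandler handler = (job, signal) -> {
            if ("ABORT".equals(signal)) {
                job.abort();
            }
        };
        SketchJob job = new SketchJob();
        job.addParameter("input=sample.txt");      // still UNCONFIGURED
        System.out.println(job.getState());
        System.out.println(sendSignal(job, "ABORT", handler)); // not delivered yet
        job.setExternalId("made-up-id-42");        // made-up identifier
        job.setScheduled();                        // now WAITING
        System.out.println(sendSignal(job, "ABORT", handler)); // delivered
        System.out.println(job.getState());
    }
}
```

In this sketch the signal is refused while the job is UNCONFIGURED and accepted once setScheduled() has been called, which is the distinction the redefined 'WAITING' state is meant to capture.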

comment:2 by Nicklas Nordborg, 11 years ago

(In [6433]) References #1796: Improve support for jobs running on external servers

Fixed issues with jobs in UNCONFIGURED state in the test code.

comment:3 by Nicklas Nordborg, 11 years ago

(In [6436]) References #1796: Improve support for jobs running on external servers

Added variants of the 'start' and 'done*' methods that take dates, since the actual date the job was started/ended on the external server may differ from the current system date that was used by default.
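The idea can be sketched as a pair of overloads where the no-argument form falls back to the current system time and the other accepts the time reported by the external server. The class JobTimestampSketch, its method signatures, the use of java.time.Instant, and the timestamps are all assumptions made for this illustration; the actual method signatures in BASE are not shown in this ticket.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Objects;

// Hypothetical sketch; not the BASE Job class.
class JobTimestampSketch {
    private Instant started;
    private Instant ended;

    /** Default variant: use the current system time. */
    void start() {
        start(Instant.now());
    }

    /** Variant taking the time the job actually started on the external server. */
    void start(Instant actualStart) {
        this.started = Objects.requireNonNull(actualStart, "actualStart");
    }

    /** Default variant: use the current system time. */
    void doneOk() {
        doneOk(Instant.now());
    }

    /** Variant taking the time the job actually ended on the external server. */
    void doneOk(Instant actualEnd) {
        Objects.requireNonNull(actualEnd, "actualEnd");
        if (started == null) {
            throw new IllegalStateException("Job has not been started");
        }
        if (actualEnd.isBefore(started)) {
            throw new IllegalArgumentException("End time is before start time");
        }
        this.ended = actualEnd;
    }

    /** Running time based on the recorded start and end, not on the polling time. */
    Duration runningTime() {
        if (started == null || ended == null) {
            throw new IllegalStateException("Job has not finished");
        }
        return Duration.between(started, ended);
    }

    public static void main(String[] args) {
        JobTimestampSketch job = new JobTimestampSketch();
        // Made-up timestamps, as if reported by the external scheduler.
        job.start(Instant.parse("2013-03-01T10:00:00Z"));
        job.doneOk(Instant.parse("2013-03-01T10:45:00Z"));
        System.out.println(job.runningTime().toMinutes()); // prints 45
    }
}
```

With explicit dates, the recorded running time reflects what happened on the external server even if BASE only learns about the result some time later.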

comment:4 by Nicklas Nordborg, 11 years ago

(In [6438]) References #1796: Improve support for jobs running on external servers

Requests for status updates to externally executing jobs need to be given WRITE permission for the job; otherwise the status update may fail depending on which user is logged in.

comment:5 by Nicklas Nordborg, 11 years ago

(In [6439]) References #1796: Improve support for jobs running on external servers

Added a function to the internal job queue that also sends a request for status update to external jobs. Maybe this should be in its own class, but that feels like overkill. It would be better to rename InternalJobQueue to something else...

Also added an 'external ID' column to the jobs list page and to the view page.

comment:6 by Nicklas Nordborg, 11 years ago

(In [6448]) References #1796: Improve support for jobs running on external servers

Added the possibility to use a PluginSessionControl with a job but no plug-in. This makes it possible to set the active project to the project for the job and also ensures that change history logging works as expected (otherwise changes were always listed as made by the root user account).

comment:7 by Nicklas Nordborg, 11 years ago

(In [6453]) References #1796: Improve support for jobs running on external servers

Allow priority to be set on unconfigured jobs.

comment:8 by Nicklas Nordborg, 10 years ago

Resolution: fixed
Status: new → closed