ResourceScheduler Examples

PAGE IS STILL IN CONSTRUCTION

ResourceScheduler.js is a Javascript library for tasks scheduling. Useful for sue-cases of tasks that require a lot of resource like CPU, network bandwidth, etc. The library is espacially useful for tasks which can be prioritized.

Creating a scheduler

To begin the work with the ResourceScheduler.js library, include the following script in your page:

<script src="http://MaMazav.github.io/cdn/resource-scheduler.dev.js"></script>

The library contains two types of scheduler available: LifoScheduler and PriorityScheduler. The most simplest scheduler to use is LifoScheduler. It just schedule the most recent job in queue when it has an available resource. The LifoScheduler constructor accepts two mandatory arguments:

createResource - a non-parametric function that returns a resource. On the beginning of the execution, when new jobs are to be scheduled but resources has not been allocated yet, the LifoScheduler will use this function to create the resources.
jobsLimit - count of available resources (i.e. maximum amount of parallel jobs).

The LifoScheduler is useful if jobs do not have special priority over each other, and you just want to limit the amount of parallel jobs.

var scheduler = new LifoScheduler(
     function createResource() {
       return new DbConnection();
     },
     /*jobsLimit=*/4);

The PriorityScheduler is a more complicated scheduler. It schedules the job accoding to some priority defined by the user. Its constructor accepts three mandatory arguments:

createResource - as in LifoScheduler.
jobsLimit - as in LifoScheduler.
prioritizer - an object contains getPriority method. We will demonstrate it in the prioritizer section.

In addition, a fourth optional argument can contain more options. They will be discussed in the PriorityScheduler options section.

Enqueue job to scheduler

After creating a scheduler, we can request the scheduler to schedule a job. Enqueueing a job is performed by calling the function enqueueJob with three arguments which declare the job:

jobFunc - a function which actually performs the job. The function should accept two arguments - a resource (created by the createResource function) and the jobContext passed as the second argument.
jobContext - an object declaring the specific job. The job context is used only by the jobFunc - the scheduler doesn't care about its content. If the jobFunc has enough information information to execute the job and it doesn't need the job context, this argument may be even ignored.
jobAbortedFunc - a function that will be called if the job has been aborted before scheduling. This argument is also not mandatory, if the job may not be aborted.

The secheduler holds the jobFunc and jobContext until the jobFunc is executed. Right after executing the function, the scheduler has no information about the executed job until either jobDone or tryYield functions are called.

scheduler.enqueueJob(
     function jobFunc(dbConnection, jobContext) {
       var result = dbConnection.executeQuery(jobContext.sql);
       jobContext.callback(result);
     },
     /*jobContext=*/{
       sql: 'Select Name from Employees where salary > 5000',
       table: 'Employees',
       callback: function(employees) {
         console.log('Got ' + employees.length + ' employees');
       }
   );

Finishing a job and release resource

Once the job is done, the jobFunc is expected to return the resource back to the scheduler so the scheduler know it is released and more jobs may be scheduled. It is done by calling the jobDone function.

The jobDone function accepts also a second argument of the jobContext. However, in the usual scenario it is not a mandatory argument; the jobContext is used only when showLog option is true in the PriorityScheduler constructor and is used only to log the priority of the job.

After calling the jobDone function, the resource should not be used anymore as it may be passed to another job for its use.

scheduler.enqueueJob(
     function jobFunc(dbConnection, jobContext) {
       var result = dbConnection.executeQuery(jobContext.sql);
       scheduler.jobDone(dbConnection, jobContext);
       jobContext.callback(result);
     },
     /*jobContext=*/{
       sql: 'Select Name from Owners where stocks >= 51',
       table: 'Owners',
       callback: function(owners) {
         console.log('Got ' + owners.length + ' owners');
       }
   );

Yielding a resource

A job may decide to yield the resource in favour of higher priority jobs. To do that, simply call the tryYield function. The possible responses of the scheduler are:

The scheduler may decide to not yield the resource but let it continue running. In such situation the tryYield function will return false (This is the only possible response of the LifoScheduler).
The scheduler may decide to yield the job. Then the jobYieldedFunc callback will be executed, and the job is re-enqueued with the jobContinueFunc and jobAbortedFunc for later scheduling. After the jobYieldedFunc called, the resource should not be used anymore. Notice that the later jobContinueFunc function may be passed a different resource. The new resource is the only one that should be used.
The scheduler may decide to abort the job (see Using prioritizer to abort waiting jobs). Then the jobAbortedFunc callback will be executed. After the jobAbortedFunc called, the resource should not be used anymore.

Notice that even if the jobYieldedFunc was called, the later call to jobContinueFunc (or jobAbortedFunc in case of abort) may be called before the tryYield function has been returned. It may happen if other jobs scheduled during the tryYield function has been finished immediately. For that reason, it is not recommended to perform any operation related to the job after the tryYield function. Instead pass those operations to the end of the jobYieldedFunc callback.

The tryYield function accepts five arguments: the four mentioned callbacks and the resource passed in the jobFunc:

jobContinueFunc - a callback to execute when the function is rescheduled (after yielded). Its argument are identical to the jobFunc function passed to the enqueueJob function.
jobContext - the context of the job.
jobAbortedFunc - a callback to execute if the job is aborted. Identical to the jobAbortedFunc passed to the enqueueJob function.
jobYieldedFunc - a callback to execute if the job is yielded. It is passed only a single argument - the jobContext.
resource - the resource originally passed to the jobFunc function that was passed to the enqueueJob function.

If releasing of the resource on abort or yield takes some time, you may use the shouldYieldOrAbort function. The function will perform a check if tryYield should be called. If the function returns false, then calling tryYield would have no effect. If the function returns true, you should start to release the resource and call tryYield when the resource really released. Notice however that until you call the tryYield, the situation might be changed and tryYield would have act differently.

shouldYieldOrAbort accepts a single argument, the context of the job.

Implementing prioritizer for PriorityScheduler

If the various jobs have different priorities, one can use the PriorityScheduler. The constructor of the PriorityScheduler accepts a third argument, prioritizer, which is an object contains a getPriority method. This method accepts the job context as an argument and returns the job's priority, which is a non-negative number. As higher the number, as higher the job priority. The priority of a given job may change over time. Notice that a large number of priorities is not recommended, as it may degrade performance.

For example, the following prioritizer may match to the jobs demonstrated in the above snippets. The PriorityScheduler which accepts the following prioritizer will schedule queries from table 'Owners' before queries from table 'Employees':

var scheduler = new PriorityScheduler(
     function createResource() {
       return new DbConnection();
     },
     /*jobsLimit=*/4,
     /*prioritizer=*/ { getPriority: function(jobContext) {
        switch (jobContext.table) {
          case 'Employees':
            return 1;
          case 'Owners':
            return 2;
          default:
            return 0;
        }
     });

PriorityScheduler's Scheduling algorithm

First, it is important to understand that the PriorityScheduler does not guarantee to schedule high priority jobs first. The scheduler should improve performance, thus it is limited to perform short and efficient operations regularly. For the scenario we used the PriorityScheduler, we found it useful enough to compare the priorities of only the last enqueued jobs and schedule one of them (the number of the new enqueued jobs considered is controlled by the option argument numNewJobs). If you have a different scenario you are invited to contribute support of another behaviour. Older jobs are scheduled only after the new ones has been scheduled, or after a "rerank" is performed (see below).

Second, if the priority of jobs change over time, then the probability that highest priority job will be scheduled first is decreased. For performance reason the PriorityScheduler sorts the jobs when enqueued and do not recalculate priorities of jobs that their priorities were already calculated as low (that's true for old jobs only; for the new jobs, as described in the previous paragraph, the priority is recalculated each time that a resource is freed).

To partially overcome some problems raised because of the above mentioned behaviour, the PriorityScheduler performs "rerank" of priorities after some jobs have been scheduled. The "rerank" recalculates priorities of all jobs in the system, and takes the highest priority jobs to be the "newest jobs" so they will be scheduled first. The time to perform "rerank" is determined by the number of scheduled jobs and may be controlled by the numJobsBeforeRerankOldPriorities argument.

Finally, the PriorityScheduler may not schedule low priority jobs even if a resource is free and no higher priority jobs exist. It may happen if the programmer chose that higher priority jobs will always be guaranteed free resources. In such situation, lower priority job will be scheduled only if after scheduling there will still be enough free resources waiting for potential higher priority jobs. This option is controlled by the and options. This mechanism is not active by default (Values of zero in those options implicitly cause it to be inactive).

PriorityScheduler options

The fourth argument of the PriorityScheduler constructor, is an option object that may contain the following properties:

showLog - Boolean. Determines if to show log to the console. Defaults to false.
schedulerName - String. A scheduler name to show in the log messages (effective only if showLog = true).
numNewJobs - Number. Determines the number of the new jobs which are to be scheduled first (see PriorityScheduler's Scheduling algorithm section). Defaults to 20.
numJobsBeforeRerankOldPriorities - Number. Determines how many jobs will be scheduled before a "rerank" will be performed (see PriorityScheduler's Scheduling algorithm section). Defaults to 20.
resourcesGuaranteedForHighPriority - Number. Determines the amount of resources to guarantee for potential high priority jobs (see PriorityScheduler's Scheduling algorithm section). Defaults to 0.
highPriorityToGuaranteeResource - Number. The minimal priority to guarantee resources for. Defaults to 0.

Using prioritizer to abort waiting jobs

To cause a job to be aborted, the prioritizer just need to return a negative value in the getPriority function when accepting the requested job.

A job which has already been scheduled may not be aborted. Only jobs that are waiting for scheduling may be aborted. Jobs which have been yielded and waiting for rescheduling may also be aborted.

Demo

The following demo shows the basic usage of the PriorityScheduler.

Let's begin with setting the environment for the job scheduling:

In the above code, the PriorityScheduler constructor was passed the createResource and prioritizer arguments which are defined below. The createResource function defines a stub for a Database connection. The prioritizer defines that the priority of the job is defined by number of queries remaining for the job (using data that will be bookkeeped into the jobContext):

Now that we have the environment we can implement the job itself. In our example, the job will perform some queries to the Database. The following code performs a single query:

And a simple implementation of a job can use the above code:

Finally, if we run the above code, we get that the code does not preserve the priorities we defined. That's because after a job was scheduled it will hold the resource until performing all queries, although its priority degrades when the job performs more and more queries. To overcome this problem, we may use the tryYield() function:

To run the abvoe demo code, set the parameters as you wish and execute the demo.

newJobEveryMs
maxNumQueriesPerJob
jobsLimit
numNewJobs

numJobsBeforeRerankOldPriorities
resourcesGuaranteedForHighPriority
highPriorityToGuaranteeResource
useYield	Use yield