Heap and Priority Queue using heapq module in Python
Heaps are widely used tree-like data structures in which the parent nodes satisfy any one of the criteria given below.
- The value of the parent node in each level is less than or equal to its children’s values – min-heap.
- The value of the parent node in each level higher than or equal to its children’s values – max-heap.
The heaps are complete binary trees and are used in the implementation of the priority queues. The min-heaps play a vital role in scheduling jobs, scheduling emails or in assigning the resources to tasks based on the priority.
These are abstract data types and are a special form of queues. The elements in the queue have priorities assigned to them. Based on the priorities, the first element in the priority queue will be the one with the highest priority. The basic operations associated with these priority queues are listed below:
- is_empty: To check whether the queue is empty.
- insert : To insert an element along with its priority. The element will be placed in the order of its priority only.
- pop : To pop the element with the highest priority. The first element will be the element with the highest priority.
The priority queues can be used for all scheduling kind of processes. The programmer can decide whether the largest number is considered as the highest priority or the lowest number will be considered as the highest priority. If two elements have the same priority, then they appear in the order in which they appear in the queue.
heapq module in Python
Heapq module is an implementation of heap queue algorithm (priority queue algorithm) in which the property of min-heap is preserved. The module takes up a list of items and rearranges it such that they satisfy the following criteria of min-heap:
- The parent node in index ‘i’ is less than or equal to its children.
- The left child of a node in index ‘i’ is in index ‘(2*i) + 1’.
- The right child of a node in index ‘i’ is in index ‘(2*i) + 2’.
Priority queues using heapq module
The priority queue is implemented in Python as a list of tuples where the tuple contains the priority as the first element and the value as the next element.
Example : [ (1, 2), (2, 3), (4, 5), (6,7)]
consider (1,2) :
- Priority : 1
- Value/element : 2
Consider a simple priority queue implementation for scheduling the presentations of students based on their roll number. Here roll number decides the priority of the student to present. Since it is a min-heap, roll number 1 is considered to be of the highest priority.
The order of presentation is : 1 : Anish 2 : cathy 3 : Moana 5 : Rina 4 : Lucy
Now let us implement a simple scheduler that assigns the jobs to the processor. The priority queue is used by the scheduler to decide which task has to be performed. Apart from the tasks, there will be interrupts approaching the scheduler. So the scheduler has to decide whether to execute the interrupt or the existing task. If the interrupt has a higher priority, it is executed first otherwise, once all the jobs are completed, the interrupt will be serviced. To implement this the heapq module is used. The approach is given below.
- The tasks to be executed are assigned with priorities. The element that has ‘1’ as priority is considered to be the most important task.
- All the tasks are in a priority queue and are maintained with the min-heap property.
- The tasks are serviced and while in progress, just a message gets printed as an execution log stating which task is in progress.
- The interrupts along with their priorities approach the scheduler.
- The interrupts are pushed into the priority queue preserving the min-heap property.
- The task/interrupt with the highest priority will be serviced first and it is always the first element in the queue.
- Once a task.interrupt is serviced, it is popped out from heap queue.