A good mapping does not depend on which of the following factors?
1. knowledge of task sizes
2. the size of data associated with tasks
3. characteristics of inter-task interactions
4. task overhead
All nodes collect _____ messages corresponding to the √p nodes of their respective rows.
1. √p
2. p
3. p+1
4. p-1
All-to-all personalized communication can be used in ____
1. Fourier transform
2. matrix transpose
3. sample sort
4. all of the above
All-to-one communication (reduction) is the dual of ______ broadcast.
1. all-to-all
2. one-to-all
3. one-to-one
4. all-to-one
The cost of all-to-all broadcast on a mesh is
1. 2ts(sqrt(p) + 1) + twm(p - 1)
2. 2tw(sqrt(p) + 1) + tsm(p - 1)
3. 2tw(sqrt(p) - 1) + tsm(p - 1)
4. 2ts(sqrt(p) - 1) + twm(p - 1)
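For reference, the mesh figure can be checked with the standard two-phase analysis: first an all-to-all broadcast along rows of sqrt(p) nodes with messages of size m, then along columns with the consolidated messages of size m*sqrt(p):

    T = (ts + tw*m)(sqrt(p) - 1) + (ts + tw*m*sqrt(p))(sqrt(p) - 1)
      = 2ts(sqrt(p) - 1) + tw*m*(p - 1)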
The goal of a good algorithm is to implement commonly used _____ patterns.
1. communication
2. interaction
3. parallel
4. regular
If we port an algorithm to a higher-dimensional network, it would cause
1. error
2. contention
3. recursion
4. none
In ____, tasks are defined before starting the execution of the algorithm.
1. dynamic task
2. static task
3. regular task
4. one way task
In broadcast and reduction on a balanced binary tree, reduction is done in ______
1. recursive order
2. straight order
3. vertical order
4. parallel order
In one-to-all broadcast there is
1. divide and conquer type algorithm
2. sorting type algorithm
3. searching type algorithm
4. simple algorithm
In the scatter operation on a hypercube, in each step the size of the messages communicated is ____
1. tripled
2. halved
3. doubled
4. no change
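As a worked check (p-node hypercube, m words destined for each node): the source starts with p*m words, and in each of the log p steps a node forwards half of its current buffer to a neighbour, so the transmitted sizes shrink as p*m/2, p*m/4, ..., m, giving a total cost of

    T = ts*log p + tw*m*(p - 1)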
In the scatter operation, a ____ node sends a message to every other node.
1. single
2. double
3. triple
4. none
In the second phase of all-to-all broadcast on a 2D mesh, the message size is ___
1. m
2. p*sqrt(m)
3. p
4. m*sqrt(p)
In this decomposition, problem decomposition goes hand in hand with its execution.
1. data decomposition
2. recursive decomposition
3. exploratory decomposition
4. speculative decomposition
It is not possible to port ____ to a higher-dimensional network.
1. algorithm
2. hypercube
3. both
4. none
Reduction can be used to find the sum, product, maximum, or minimum of _____ of numbers.
1. tuple
2. list
3. sets
4. all of the above
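For context, reduction combines one contribution per process with an associative operator (sum, product, max, min). A minimal MPI sketch in C, assuming an MPI installation (compile with mpicc, run with mpirun):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int local = rank + 1;   /* each process contributes one number */
        int sum = 0, max = 0;

        /* all-to-one reduction: combined results land on rank 0 */
        MPI_Reduce(&local, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        MPI_Reduce(&local, &max, 1, MPI_INT, MPI_MAX, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum = %d, max = %d\n", sum, max);
        MPI_Finalize();
        return 0;
    }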
Subsets of processes participate in ______ interaction.
1. global
2. local
3. wide
4. variable
Threads that are blocked together and executed in sets of 32 threads are called a
1. thread block
2. 32 thread
3. 32 block
4. unit block
When is the topological sort of a graph unique?
1. when there exists a Hamiltonian path in the graph
2. in the presence of multiple nodes with indegree 0
3. in the presence of a single node with indegree 0
4. in the presence of a single node with outdegree 0
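The Hamiltonian-path option is the standard characterisation: a DAG's topological order is unique exactly when every pair of consecutive vertices in the order is joined by an edge, so the order itself forms a Hamiltonian path. A small C sketch of that test (has_edge is an assumed adjacency lookup for the DAG):

    #include <stdbool.h>

    bool has_edge(int u, int v);   /* assumed: adjacency test for the DAG */

    /* order[] is a valid topological order of an n-vertex DAG; it is the
       unique one iff consecutive vertices are always joined by an edge,
       i.e. the order forms a Hamiltonian path */
    bool topo_order_is_unique(const int order[], int n) {
        for (int i = 0; i + 1 < n; i++)
            if (!has_edge(order[i], order[i + 1]))
                return false;   /* another valid order exists */
        return true;
    }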
Which of the following is not a form of parallelism supported by CUDA?
1. vector parallelism - floating point computations are executed in parallel on wide vector units
2. thread level task parallelism - different threads execute different tasks
3. block and grid level parallelism - different blocks or grids execute different tasks
4. data parallelism - different threads and blocks process different parts of data in memory
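For context, option 4 is the everyday style on GPUs: a minimal CUDA sketch (kernel name and launch sizes are illustrative) in which every thread of every block handles a different array element:

    #include <cuda_runtime.h>

    /* data parallelism: each thread processes a different element */
    __global__ void scale(float *x, float a, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)        /* guard threads past the end of the array */
            x[i] *= a;
    }

    int main(void) {
        const int n = 1 << 20;
        float *d_x;
        cudaMalloc(&d_x, n * sizeof(float));
        scale<<<(n + 255) / 256, 256>>>(d_x, 2.0f, n);  /* 256 threads/block */
        cudaDeviceSynchronize();
        cudaFree(d_x);
        return 0;
    }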
Which of the following is not a granularity type?
1. coarse grain
2. large grain
3. medium grain
4. fine grain
Which of the following is not an example of exploratory decomposition?
1. n-queens problem
2. 15-puzzle problem
3. tic-tac-toe
4. quick sort
____ can be performed in an identical fashion by inverting the process.
1. recursive doubling
2. reduction
3. broadcast
4. none of these
Accumulating results and sending them with the same pattern is...
1. broadcast
2. naive approach
3. recursive doubling
4. reduction
All processes participate in a single ______ interaction operation.
1. global
2. local
3. wide
4. variable
All processes that have the data can send it again; this is
1. recursive doubling
2. naive approach
3. reduction
4. all of the above
The all-to-all broadcast algorithm for the 2D mesh is based on the
1. linear array algorithm
2. ring algorithm
3. both
4. none
Blocking optimization is used to improve temporal locality and reduce
1. hit miss
2. misses
3. hit rate
4. cache misses
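For context, blocking (tiling) restructures loops so that a small tile of data is reused while it still resides in cache, cutting cache misses. A minimal C sketch for matrix multiplication, with an assumed tile size TILE tuned to the cache:

    #define N 512
    #define TILE 64   /* tile size, chosen to fit the cache */

    /* C += A*B, tiled so each TILE x TILE block stays cache-resident */
    void matmul_blocked(const double A[N][N], const double B[N][N],
                        double C[N][N]) {
        for (int ii = 0; ii < N; ii += TILE)
            for (int kk = 0; kk < N; kk += TILE)
                for (int jj = 0; jj < N; jj += TILE)
                    for (int i = ii; i < ii + TILE; i++)
                        for (int k = kk; k < kk + TILE; k++)
                            for (int j = jj; j < jj + TILE; j++)
                                C[i][j] += A[i][k] * B[k][j];
    }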
Broadcast and reduction operations on a mesh are performed
1. along the rows
2. along the columns
3. both rows and columns concurrently
4. none of these
Communication between two directly linked nodes is
1. cut-through routing
2. store-and-forward routing
3. nearest neighbour communication
4. none
The cost of all-to-all broadcast on a ring is
1. (ts + twm)(p - 1)
2. (ts - twm)(p + 1)
3. (tw + tsm)(p - 1)
4. (tw - tsm)(p + 1)
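As a quick check: the ring algorithm runs for p - 1 steps, and in every step each node passes a message of size m to its neighbour at cost ts + tw*m, so

    T = (ts + tw*m)(p - 1)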
In CUDA, the 'unifying theme' of every form of parallelism is the
1. cda thread
2. pta thread
3. cuda thread
4. cud thread
Each node first sends to one of its neighbours the data it needs to ____
1. broadcast
2. identify
3. verify
4. none
The efficiency of a data-parallel algorithm depends on the
1. efficient implementation of the algorithm
2. efficient implementation of the operation
3. both
4. none
Every node has to know when to communicate, that is,
1. call the procedure
2. call for broadcast
3. call for communication
4. call the congestion
Every node on the linear array has the data and broadcasts on the columns with the linear array algorithm in _____
1. parallel
2. vertical
3. horizontal
4. all
For the sake of simplicity, the number of nodes is assumed to be a power of
1. 1
2. 2
3. 3
4. 4
A generalization of broadcast in which each processor is
1. source as well as destination
2. only source
3. only destination
4. none
Group communication operations are built using which primitives?
1. one to all
2. all to all
3. point to point
4. none of these
if "X" is the message to broadcast it initially resides at the source node
1.1
2.2
3.8
4.0
In a balanced binary tree, the number of processing nodes is equal to the
1. leaves
2. number of elements
3. branch
4. none
In an eight-node ring, node ____ is the source of broadcast.
1. 1
2. 2
3. 8
4. 0
In all-to-all broadcast on a hypercube, the size of the message to be transmitted at the next step is ____ by concatenating the received message with the current data.
1. doubled
2. tripled
3. halved
4. no change
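As a worked check: the hypercube all-to-all broadcast takes log p steps, and the message doubles each step (m, 2m, 4m, ...), so

    T = sum over i = 1..log p of (ts + 2^(i-1)*tw*m) = ts*log p + tw*m*(p - 1)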
In collective communication operations, collective means
1. involve group of processors
2. involve group of algorithms
3. involve group of variables
4. none of these
In a task dependency graph, the longest directed path between any pair of start and finish nodes is called
1. total work
2. critical path
3. task path
4. task path
In the first phase of all-to-all broadcast on a 2D mesh, the message size is ___
1. p
2. m*sqrt(p)
3. m
4. p*sqrt(m)
A code known as a grid, which runs on the GPU, consists of a set of
1. 32 thread
2. unit block
3. 32 block
4. thread block
The logical operators used in the algorithm are
1. xor
2. and
3. both
4. none
Nodes with zero in the i least significant bits participate in _______
1. algorithm
2. broadcast
3. communication
4. searching
One-to-all broadcast uses
1. recursive doubling
2. simple algorithm
3. both
4. none
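A sketch of the recursive doubling approach on a p = 2^d node hypercube with source 0, following the standard mask-based formulation (in step i, only nodes whose i least significant bits are zero take part, which is what the question above on zero least significant bits refers to). MPI is used for the point-to-point transfers; buffer and function names are illustrative:

    #include <mpi.h>

    /* one-to-all broadcast of one int on a hypercube of p = 2^d nodes,
       source node 0; in step i only nodes whose i least significant
       bits are zero communicate */
    void hypercube_bcast(int *x, int d, int my_id) {
        int mask = (1 << d) - 1;
        for (int i = d - 1; i >= 0; i--) {
            mask ^= (1 << i);                  /* clear bit i of the mask */
            if ((my_id & mask) == 0) {         /* i LSBs of my_id are zero */
                int partner = my_id ^ (1 << i);
                if ((my_id & (1 << i)) == 0)   /* lower node sends ...    */
                    MPI_Send(x, 1, MPI_INT, partner, 0, MPI_COMM_WORLD);
                else                           /* ... upper node receives */
                    MPI_Recv(x, 1, MPI_INT, partner, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
            }
        }
    }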
One-to-All Personalized Communication operation is commonly called ___
1. gather operation
2. concatenation
3. scatter operation
4. none
Only connections between single pairs of nodes are used at a time; this is
1. good utilization
2. poor utilization
3. massive utilization
4. medium utilization
Renaming relative to the source is done by _____ with the source.
1. xor
2. xnor
3. and
4. nand
A communication pattern similar to all-to-all broadcast, except in the _____
1. reverse order
2. parallel order
3. straight order
4. vertical order
The source ____ is a bottleneck.
1. process
2. algorithm
3. list
4. tuple
A task dependency graph is ____
1. directed
2. undirected
3. directed acyclic
4. undirected acyclic
The ____ do not snoop the messages going through them.
1. nodes
2. variables
3. tuple
4. list
The algorithm terminates in _____ steps
1. p
2. p+1
3. p+2
4. p-1
The all-to-all broadcast on a hypercube needs ____ steps
1. p
2. sqrt(p) - 1
3. log p
4. none
The dual of all-to-all broadcast is
1. all-to-all reduction
2. all-to-one reduction
3. both
4. none
The dual of the scatter operation is the
1. concatenation
2. gather operation
3. both
4. none
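A minimal MPI sketch of the pair (buffer names are illustrative; assumes at most 64 processes): the root scatters one distinct element to each process, and the gather collects the pieces back in the same pattern, inverted:

    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, p;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &p);

        int send[64], recv[64], mine;   /* assumes p <= 64 */
        if (rank == 0)
            for (int i = 0; i < p; i++) send[i] = i * i;

        /* scatter: root sends a distinct element to every process */
        MPI_Scatter(send, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);
        mine += 1;   /* local work on this process's piece */
        /* gather: the exact inverse, collecting the pieces at the root */
        MPI_Gather(&mine, 1, MPI_INT, recv, 1, MPI_INT, 0, MPI_COMM_WORLD);

        MPI_Finalize();
        return 0;
    }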
The gather operation is exactly the inverse of the _____
1. scatter operation
2. recursion operation
3. execution
4. none
The procedure is distributed and requires only point-to-point _______
1. synchronization
2. communication
3. both
4. none
The processors compute the ______ product of the vector element and the local matrix.
1. local
2. global
3. both
4. none
The second communication phase is a columnwise ______ broadcast of the consolidated data.
1. all-to-all
2. one-to-all
3. all-to-one
4. point-to-point
The style of parallelism supported on GPUs is best described as
1. misd - multiple instruction single data
2. simt - single instruction multiple thread
3. sisd - single instruction single data
4. mimd - multiple instruction multiple data
Using different links every time and forwarding in parallel again is
1. better for congestion
2. better for reduction
3. better for communication
4. better for algorithm
What is a high-performance multi-core processor that can be used to accelerate a wide variety of applications using parallel computing?
1. cpu
2. dsp
3. gpu
4. clu
Which is also called "Total Exchange"?
1. all-to-all broadcast
2. all-to-all personalized communication
3. all-to-one reduction
4. none
Which is known as Broadcast?
1. one-to-one
2. one-to-all
3. all-to-all
4. all-to-one
Which is known as Reduction?
1. all-to-one
2. all-to-all
3. one-to-one
4. one-to-all
Which of the following correctly describes a GPU kernel?
1. a kernel may contain a mix of host and gpu code
2. all thread blocks involved in the same computation use the same kernel
3. a kernel is part of the gpu's internal micro-operating system, allowing it to act as an independent host
4. a kernel may contain only host code
Which of the following is an example of data decomposition?
1. matrix multiplication
2. merge sort
3. quick sort
4. 15-puzzle
Which of the following is not a parallel algorithm model?
1. data parallel model
2. task graph model
3. task model
4. work pool model
Which of the following is not an array distribution method of data partitioning?
1. block
2. cyclic
3. block cyclic
4. chunk
Which problems can be handled by recursive decomposition?
1. backtracking
2. greedy method
3. divide and conquer problems
4. branch and bound
The simplest way to send p-1 messages from the source to the other p-1 processors is
1. algorithm
2. communication
3. concurrency
4. receiver