Ab Initio can process data in a parallel runtime environment.
Ab Initio provides three kinds of parallelism:
• Component Parallelism
• Pipeline Parallelism
• Data Parallelism
Data Parallelism
Data is divided into partitions that are processed at the same time,
often on different servers.
Data parallelism occurs when a graph separates
data into multiple divisions, allowing multiple copies of program components to
operate on the data in all the divisions simultaneously.
This is the most common kind of parallelism: you
partition your data so that it is processed faster. For example, if you
have 1000 records and divide them evenly among 8 computers, each computer
processes 125 records.
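The 1000-records-over-8-partitions example can be sketched in Python. This is only an analogy, not Ab Initio code: the round-robin split, the `transform` function, and the use of a thread pool in place of separate servers are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_round_robin(records, n_partitions):
    """Deal records out to n_partitions lists, one record at a time."""
    partitions = [[] for _ in range(n_partitions)]
    for i, record in enumerate(records):
        partitions[i % n_partitions].append(record)
    return partitions


def transform(partition):
    """Hypothetical stand-in for a program component.

    The same logic runs on every partition.
    """
    return [record * 2 for record in partition]


def run_data_parallel(records, n_partitions=8):
    """Partition the records, then run one copy of transform per partition."""
    partitions = partition_round_robin(records, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        # pool.map returns the results in partition order.
        return list(pool.map(transform, partitions))


records = list(range(1000))
results = run_data_parallel(records)
partition_sizes = [len(p) for p in results]
```

Each of the 8 workers receives 125 records and applies the same transformation, which mirrors multiple copies of one component operating on separate divisions of the data.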
Pipeline Parallelism
Pipeline parallelism occurs when several
connected program components on the same branch of a graph execute
simultaneously. Pipeline parallelism does not occur across a sort
component, because a sort must read all of its input before it can write
its first output record.
A graph with multiple components running
simultaneously on the same data uses pipeline parallelism. Each component in
the pipeline continuously reads from upstream components, processes data, and
writes to downstream components. Since a downstream component can process
records previously written by an upstream component, both components can
operate in parallel.
NOTE: To limit the number of components running
simultaneously, set phases in the graph.
For example, one component may already have read 10 records
from the input file while the next component has processed only 6 of
them so far. This is pipeline parallelism: a component does not
wait for all of the data to arrive, but starts processing records as
they come through the pipe.
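A rough Python sketch of this behaviour uses three threads connected by bounded queues. The stage names (`reader`, `reformat`, `writer`), the queue size, and the uppercase transformation are hypothetical and chosen only for illustration; Ab Initio itself connects components through flows, not Python queues.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the record stream


def reader(records, out_q):
    """Upstream stage: emits records one at a time."""
    for record in records:
        out_q.put(record)
    out_q.put(SENTINEL)


def reformat(in_q, out_q):
    """Middle stage: processes each record as soon as it arrives."""
    while True:
        record = in_q.get()
        if record is SENTINEL:
            out_q.put(SENTINEL)
            break
        out_q.put(record.upper())


def writer(in_q, sink):
    """Downstream stage: collects the processed records."""
    while True:
        record = in_q.get()
        if record is SENTINEL:
            break
        sink.append(record)


def run_pipeline(records):
    # Small bounded queues: no stage can hold the whole data set at once.
    q1 = queue.Queue(maxsize=4)
    q2 = queue.Queue(maxsize=4)
    sink = []
    stages = [
        threading.Thread(target=reader, args=(records, q1)),
        threading.Thread(target=reformat, args=(q1, q2)),
        threading.Thread(target=writer, args=(q2, sink)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return sink


output = run_pipeline([f"rec{i}" for i in range(10)])
```

Because the queues hold at most 4 records, the reader cannot finish all 10 records before the middle stage starts consuming them, so the stages necessarily overlap. A stage that had to collect every record before emitting anything, such as a sort, would remove that overlap.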
Component Parallelism
In component parallelism, two
or more components process records in parallel.
A graph with multiple processes running
simultaneously on separate data uses component parallelism.
This kind of parallelism depends on the layout of your
graph: it occurs when two components are not connected to each other
and process their data in parallel. For example, if you have 2 input
files and sort each of them in a separate flow, the 2 sort components
run under component parallelism.
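The two-input-files example might look like this in Python. It is an illustrative analogy only: the in-memory lists stand in for the 2 input files, `sort_flow` is a hypothetical stand-in for a sort component, and threads stand in for the separate processes a real graph would run.

```python
from concurrent.futures import ThreadPoolExecutor


def sort_flow(records):
    """Hypothetical stand-in for a sort component on one flow."""
    return sorted(records)


# Two unrelated inputs, standing in for the 2 input files.
input_a = [5, 3, 9, 1]
input_b = ["pear", "apple", "fig"]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Neither flow depends on the other, so both sorts can run at once.
    future_a = pool.submit(sort_flow, input_a)
    future_b = pool.submit(sort_flow, input_b)
    sorted_a = future_a.result()
    sorted_b = future_b.result()
```

Unlike the data-parallel case, the two workers here handle different data sets on different branches, with no partitioning of a single input.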