Loop-level parallelism is a form of parallelism in software programming that is concerned with extracting parallel tasks from loops. The opportunity for loop-level parallelism often arises in computing programs where data is stored in random access data structures. Where a sequential program will iterate over the data structure and operate on indices one at a time, a program exploiting loop-level parallelism will use multiple threads or processes which operate on some or all of the indices at the same time. Such parallelism provides a speedup to overall execution time of the program, typically in line with Amdahl's law.

Description

For simple loops, where each iteration is independent of the others, loop-level parallelism can be embarrassingly parallel, as parallelizing only requires assigning a process to handle each iteration. However, many algorithms are designed to run sequentially, and fail when parallel processes race due to dependence within the code. Sequential algorithms are sometimes applicable to parallel contexts with slight modification. Usually, though, they require process synchronization. Synchronization can be either implicit, via message passing, or explicit, via synchronization primitives like semaphores.

Example

Consider the following code operating on a list L of length n.

for (int i = 0; i < n; ++i) {
    S1: L[i] += 10;
}

Each iteration of the loop takes the value from the current index of L, and increments it by 10. If statement S1 takes T time to execute, then the loop takes time n * T to execute sequentially, ignoring time taken by loop constructs. Now, consider a system with p processors where p > n. If n threads run in parallel, the time to execute all n steps is reduced to T.
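
For instance, if n = 100 and T = 1 ms, the sequential loop takes 100 ms, while 100 parallel threads complete in roughly 1 ms, ignoring scheduling overhead.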

Less straightforward cases produce inconsistent, i.e. non-serializable, outcomes. Consider the following loop operating on the same list L.

for (int i = 1; i < n; ++i) {
    S1: L[i] = L[i-1] + 10;
}

Each iteration sets the current index to the value of the previous index plus ten. When run sequentially, each iteration is guaranteed that the previous iteration already holds the correct value. With multiple threads, process scheduling and other considerations mean that the execution order cannot guarantee an iteration will execute only after its dependence is met; it may well execute before, leading to unexpected results. Serializability can be restored by adding synchronization to preserve the dependence on previous iterations.

Dependencies in code

There are several types of dependences that can be found within code.[1][2]

Type                   | Notation  | Description
True (flow) dependence | S1 ->T S2 | S1 writes to a location that S2 later reads.
Anti-dependence        | S1 ->A S2 | S1 reads from a location that S2 later writes.
Output dependence      | S1 ->O S2 | S1 and S2 write to the same location.
Input dependence       | S1 ->I S2 | S1 and S2 read from the same location.

In order to preserve the sequential behaviour of a loop when run in parallel, true dependence must be preserved. Anti-dependence and output dependence can be dealt with by giving each process its own copy of variables (known as privatization).[1]
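
As an illustration of privatization, consider a scratch variable shared by all iterations; this loop, its arrays, and the variable tmp are hypothetical examples, not taken from the sources above. The shared variable creates anti- and output dependences across iterations, and a private copy removes them.

int tmp;
for (int i = 0; i < n; ++i) {
    tmp = a[i];        // S1: every iteration writes the shared tmp, creating
                       // output (write-write) and anti (read-write) dependences
                       // between iterations
    b[i] = tmp * 2;    // S2: true dependence S1 ->T S2 within each iteration
}

// After privatization, tmp is local to each iteration, so only the
// true dependence inside each iteration remains and the loop is parallelizable:
for (int i = 0; i < n; ++i) {
    int tmp = a[i];    // each iteration (process) gets its own copy of tmp
    b[i] = tmp * 2;
}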

Example of true dependence

S1: int a, b;
S2: a = 2;
S3: b = a + 40;

S2 ->T S3, meaning that there is a true (flow) dependence from S2 to S3, because S2 writes to the variable a, which S3 later reads.

Example of anti-dependence

S1: int a, b = 40;
S2: a = b - 38;
S3: b = -1;

S2 ->A S3, meaning that there is an anti-dependence from S2 to S3, because S2 reads from the variable b before S3 writes to it.

Example of output-dependence

S1: int a, b = 40;
S2: a = b - 38;
S3: a = 2;

S2 ->O S3, meaning that there is an output dependence from S2 to S3, because both statements write to the variable a.

Example of input-dependence

S1: int a, b, c = 2;
S2: a = c - 1;
S3: b = c + 1;

S2 ->I S3, meaning that there is an input dependence from S2 to S3, because S2 and S3 both read from the variable c.

Dependence in loops

Loop-carried vs loop-independent dependence

Loops can have two types of dependence:

  • Loop-carried dependence
  • Loop-independent dependence

In loop-independent dependence, dependences exist only within a single iteration; there is no dependence between iterations. Each iteration may be treated as a block and performed in parallel without other synchronization efforts.

In the following example code, used for swapping the values of two arrays of length n, there is a loop-independent dependence S1 ->T S3.

for (int i = 0; i < n; ++i) {
    S1: tmp = a[i];
    S2: a[i] = b[i];
    S3: b[i] = tmp;
}

In loop-carried dependence, statements in one iteration of a loop depend on statements in another iteration. Loop-carried dependence uses a modified version of the dependence notation seen earlier.

The following is an example of loop-carried dependence, where S1[i] ->T S1[i + 1]; here i indicates the current iteration and i + 1 indicates the next one.

for (int i = 1; i < n; ++i) {
    S1: a[i] = a[i-1] + 1;
}

Loop-carried dependence graph

A loop-carried dependence graph shows the loop-carried dependences between iterations graphically. Each iteration is a node in the graph, and directed edges show the true, anti, and output dependences between iterations.
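
For instance, for the earlier loop a[i] = a[i-1] + 1 with dependence S1[i] ->T S1[i+1], the graph is a chain of iteration nodes linked by true-dependence edges, sketched here for the first few iterations:

i=1 --T--> i=2 --T--> i=3 --T--> ... --T--> i=n-1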

Types

There are a variety of methodologies for parallelizing loops.

  • DISTRIBUTED loop
  • DOALL parallelism
  • DOACROSS parallelism
  • HELIX[3]
  • DOPIPE parallelism

Each implementation varies slightly in how threads synchronize, if at all. In addition, parallel tasks must somehow be mapped to processes. These tasks can be allocated either statically or dynamically. Research has shown that some dynamic allocation algorithms achieve better load balancing than static allocation.[4]
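
For example, OpenMP (used here only as one illustrative mechanism; the survey in [4] compares allocation schemes more broadly) selects between the two allocation policies with its schedule clause. A minimal sketch, where work() stands for a hypothetical per-iteration task of varying cost:

void work(int i);   // hypothetical per-iteration task

// Static allocation: iterations are divided among threads before the loop runs.
#pragma omp parallel for schedule(static)
for (int i = 0; i < n; ++i)
    work(i);

// Dynamic allocation: idle threads grab the next chunk of 16 iterations at run
// time, which tends to balance load better when iteration costs vary.
#pragma omp parallel for schedule(dynamic, 16)
for (int i = 0; i < n; ++i)
    work(i);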

The process of parallelizing a sequential program can be broken down into the following discrete steps.[1] Each concrete loop-parallelization technique below implicitly performs them.

Step          | Description
Decomposition | The program is broken down into tasks, the smallest exploitable units of concurrency.
Assignment    | Tasks are assigned to processes.
Orchestration | Data access, communication, and synchronization of processes are organized.
Mapping       | Processes are bound to processors.

DISTRIBUTED loop

When a loop has a loop-carried dependence, one way to parallelize it is to distribute the loop into several different loops. Statements that are not dependent on each other are separated so that these distributed loops can be executed in parallel. For example, consider the following code.

for (int i = 1; i < n; ++i) {
    S1: a[i] = a[i-1] + b[i];
    S2: c[i] += d[i];
}

The loop has a loop-carried dependence S1[i] ->T S1[i+1], but S1 and S2 have no dependence on each other, so we can rewrite the code as follows.

loop1: for (int i = 1; i < n; ++i) {
    S1: a[i] = a[i-1] + b[i];
}
loop2: for (int i = 1; i < n; ++i) {
    S2: c[i] += d[i];
}

Note that now loop1 and loop2 can be executed in parallel. Instead of a single instruction being performed in parallel on different data, as in data-level parallelism, here different loops perform different tasks on different data. Say the execution times of S1 and S2 are T_S1 and T_S2. Then the sequential form of the above code takes n * (T_S1 + T_S2) to execute, while splitting the two statements into two loops that run concurrently reduces the execution time to n * max(T_S1, T_S2). We call this type of parallelism either function or task parallelism.
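
As a sketch of how the two distributed loops could actually run as concurrent tasks (using OpenMP sections purely as one possible mechanism):

#pragma omp parallel sections
{
    #pragma omp section
    {
        // loop1: still sequential within itself due to its loop-carried dependence
        for (int i = 1; i < n; ++i)
            a[i] = a[i-1] + b[i];
    }
    #pragma omp section
    {
        // loop2: runs concurrently with loop1
        for (int i = 1; i < n; ++i)
            c[i] += d[i];
    }
}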

DOALL parallelism

DOALL parallelism exists when statements within a loop can be executed independently (situations where there is no loop-carried dependence).[1] For example, the following code does not read from the array a, and does not update the arrays b, c. No iterations have a dependence on any other iteration.

for (int i = 0; i < n; ++i) {
    S1: a[i] = b[i] + c[i];
}

Say the execution time of one iteration of S1 is T_S1. Then the sequential form of the above code takes n * T_S1 to execute. Because all iterations are independent, speed-up may be achieved by executing them all in parallel, giving an execution time of T_S1, the time taken for one iteration of the sequential execution.

The following example, using simplified pseudocode, shows how a loop might be parallelized to execute each iteration independently.

begin_parallelism();              // fork: each iteration below becomes its own task
for (int i = 0; i < n; ++i) {
    S1: a[i] = b[i] + c[i];
    end_parallelism();            // marks the end of this iteration's task
}
block();                          // wait until every forked task has completed
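
Concretely, a compiler supporting OpenMP (one common way to express DOALL loops, assumed here for illustration; compile with -fopenmp) can distribute the independent iterations with a single directive:

void add_arrays(int *a, const int *b, const int *c, int n)
{
    // Every iteration is independent, so the runtime may split the
    // iteration space across threads with no synchronization at all.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        a[i] = b[i] + c[i];
    }
}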

DOACROSS parallelism

DOACROSS Parallelism exists where iterations of a loop are parallelized by extracting calculations that can be performed independently and running them simultaneously.[5]

Synchronization exists to enforce loop-carried dependence.

Consider the following synchronous loop with dependence S1[i] ->T S1[i+1].

for (int i = 1; i < n; ++i) {
    a[i] = a[i-1] + b[i] + 1;
}

Each loop iteration performs two actions:

  • Calculate a[i-1] + b[i] + 1
  • Assign the value to a[i]

Calculating the value a[i-1] + b[i] + 1 and then performing the assignment can be decomposed into two lines (statements S1 and S2):

S1: int tmp = b[i] + 1;
S2: a[i] = a[i-1] + tmp;

The first line, int tmp = b[i] + 1;, has no loop-carried dependence. The loop can then be parallelized by computing tmp in parallel and synchronizing only the assignment to a[i].

post(0);
for (int i = 1; i < n; ++i) {
    S1: int tmp = b[i] + 1;
    wait(i-1);
    S2: a[i] = a[i-1] + tmp;
    post(i);
}

Say the execution times of S1 and S2 are T_S1 and T_S2. Then the sequential form of the above code takes n * (T_S1 + T_S2) to execute. With DOACROSS parallelism, speed-up may be achieved by executing iterations in a pipelined fashion, giving an execution time of T_S1 + n * T_S2.
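
One way to realize post and wait is a shared progress counter, sketched below in C11; the counter done and the spin-wait loop are illustrative assumptions, not part of the pseudocode above. Each thread can run S1 for its iterations in parallel, and only S2 is serialized:

#include <stdatomic.h>

atomic_int done;    // index of the last finalized element of a[]; starts at 0,
                    // which plays the role of post(0)

// Body for iteration i, run by whichever thread the iteration is assigned to.
void doacross_body(int i, int *a, const int *b)
{
    int tmp = b[i] + 1;                   // S1: independent, overlaps across threads
    while (atomic_load(&done) < i - 1)    // wait(i-1): spin until a[i-1] is final
        ;
    a[i] = a[i-1] + tmp;                  // S2: the serialized portion
    atomic_store(&done, i);               // post(i): let iteration i+1 proceed
}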

DOPIPE parallelism

DOPIPE Parallelism implements pipelined parallelism for loop-carried dependence where a loop iteration is distributed over multiple, synchronized loops.[1] The goal of DOPIPE is to act like an assembly line, where one stage is started as soon as there is sufficient data available for it from the previous stage.[6]

Consider the following synchronous code with dependence S1[i] ->T S1[i+1].

for (int i = 1; i < n; ++i) {
    S1: a[i] = a[i-1] + b[i];
    S2: c[i] += a[i];
}

S1 must be executed sequentially, but S2 has no loop-carried dependence. S2 could be executed in parallel using DOALL parallelism after performing all calculations needed by S1 in series, but the resulting speedup is limited. A better approach is to parallelize so that the S2 corresponding to each S1 executes as soon as that S1 has finished.

Implementing pipelined parallelism results in the following set of loops, where the second loop may execute for an index as soon as the first loop has finished its corresponding index.

for (int i = 1; i < n; ++i) {
    S1: a[i] = a[i-1] + b[i];
    post(i);
}

for (int i = 1; i < n; ++i) {
    wait(i);
    S2: c[i] += a[i];
}

Say the execution times of S1 and S2 are T_S1 and T_S2. Then the sequential form of the above code takes n * (T_S1 + T_S2) to execute. With DOPIPE parallelism, speed-up may be achieved by executing iterations in a pipelined fashion, giving an execution time of n * T_S1 + (n/p) * T_S2, where p is the number of processors running in parallel.
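
The post and wait pair can likewise be sketched for DOPIPE with two threads and a shared progress counter (again an illustrative assumption rather than a prescribed implementation): one thread produces a[] while the other consumes it, trailing one index behind.

#include <stdatomic.h>

atomic_int ready;   // highest index i for which a[i] has been produced; starts at 0

void stage1(int *a, const int *b, int n)   // runs on the first thread
{
    for (int i = 1; i < n; ++i) {
        a[i] = a[i-1] + b[i];              // S1
        atomic_store(&ready, i);           // post(i)
    }
}

void stage2(int *c, const int *a, int n)   // runs on the second thread
{
    for (int i = 1; i < n; ++i) {
        while (atomic_load(&ready) < i)    // wait(i): spin until a[i] exists
            ;
        c[i] += a[i];                      // S2
    }
}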

References

  1. Solihin, Yan (2016). Fundamentals of Parallel Architecture. Boca Raton, FL: CRC Press. ISBN 978-1-4822-1118-4.
  2. Goff, Gina (1991). "Practical dependence testing". Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation - PLDI '91. pp. 15–29. doi:10.1145/113445.113448. ISBN 0897914287. S2CID 2357293.
  3. Murphy, Niall. "Discovering and exploiting parallelism in DOACROSS loops" (PDF). University of Cambridge. Retrieved 10 September 2016.
  4. Kavi, Krishna. "Parallelization of DOALL and DOACROSS Loops-a Survey".
  5. Unnikrishnan, Priya (2012). "A Practical Approach to DOACROSS Parallelization". Euro-Par 2012 Parallel Processing. Lecture Notes in Computer Science, vol. 7484. pp. 219–231. doi:10.1007/978-3-642-32820-6_23. ISBN 978-3-642-32819-0. S2CID 18571258.
  6. "DoPipe: An Effective Approach to Parallelize Simulation" (PDF). Intel. Retrieved 13 September 2016.