parallel processing - Parallelization of an OpenMP nested do loop
I have a nested loop in an OpenMP Fortran 77 code that I am unable to parallelize (the code gives a segmentation fault error when I run it). I have a very similar nested loop in a different subroutine of the same code that runs in parallel with no issues. Here is the nested loop that I am having problems with:
      do n=1,num_p
c$omp parallel do default(shared), private(l,i1,i2,j1,j2,k1,k2
c$omp& ,i,j,k,i_t,j_t,i_ddf,j_ddf,ddf_dum)
        do l=1,n_l(n)
          call del_fn(l,n)
          i1=p_iw(l,n)
          i2=p_ie(l,n)
          j1=p_js(l,n)
          j2=p_jn(l,n)
          k1=p_kb(l,n)
          k2=p_kt(l,n)
          do i=i1,i2
            i_ddf=i-i1+1
            if(i .lt. 1) then
              i_t=nx+i
            elseif (i .gt. nx) then
              i_t=i-nx
            else
              i_t=i
            endif
            do j=j1,j2
              j_ddf=j-j1+1
              if(j .lt.1) then
                j_t=ny+j
              elseif(j .gt. ny) then
                j_t=j-ny
              else
                j_t=j
              endif
              do k=k1,k2
                ddf(l,n,i_ddf,j_ddf,k-k1+1) = ddf_dum(i_t,j_t,k)
              enddo
            enddo
          enddo
        enddo
c$omp end parallel do
      enddo
I have narrowed the problem down to ddf_dum(i_t,j_t,k). When this term is turned off (say, replaced by 0.d0), the code runs fine.
On the other hand, I have a very similar nested loop that runs in parallel with no issues. That loop is shown below. Can anyone please identify what I am missing here?
      do n=1,1
c$omp parallel do default(shared), private(l,i1,i2,j1,j2,k1,k2
c$omp& ,i,j,k,i_f,j_f,i_ddf,j_ddf)
        do l=1,n_l(n)
          i1=p_iw(l,n)
          i2=p_ie(l,n)
          j1=p_js(l,n)
          j2=p_jn(l,n)
          k1=p_kb(l,n)
          k2=p_kt(l,n)
          u_forcing(l,n)= (u_p(l,n)-up_tilde(l,n))/dt
          v_forcing(l,n)= (v_p(l,n)-vp_tilde(l,n))/dt
          w_forcing(l,n)= (w_p(l,n)-wp_tilde(l,n))/dt
          do i=i1,i2
            i_ddf=i-i1+1
            if(i .lt. 1) then
              i_f=nx+i
            elseif (i .gt. nx) then
              i_f=i-nx
            else
              i_f=i
            endif
            do j=j1,j2
              j_ddf=j-j1+1
              if(j .lt.1) then
                j_f=ny+j
              elseif(j .gt. ny) then
                j_f=j-ny
              else
                j_f=j
              endif
              do k=k1,k2
                forcing_x(i_f,j_f,k)=forcing_x(i_f,j_f,k)+u_forcing(l,n)
     &          *ddf_n(l,n,i_ddf,j_ddf,k-k1+1)*dv_l(l,n)
                forcing_y(i_f,j_f,k)=forcing_y(i_f,j_f,k)+v_forcing(l,n)
     &          *ddf_n(l,n,i_ddf,j_ddf,k-k1+1)*dv_l(l,n)
                forcing_z(i_f,j_f,k)=forcing_z(i_f,j_f,k)+w_forcing(l,n)
     &          *ddf_n(l,n,i_ddf,j_ddf,k-k1+1)*dv_l(l,n)
              enddo
            enddo
          enddo
        enddo
c$omp end parallel do
      enddo
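As you noted, the problem is ddf_dum. It should be a shared variable, not private, because it is only being read, never written to. You are getting a segfault because listing it in the private clause gives each thread its own uninitialized copy of the array instead of the data that was filled in before the parallel region; for a large array, those per-thread copies can also exhaust the thread stack. Removing ddf_dum from the private list should fix it.

A rule of thumb you could have used to find this mistake yourself: variables that appear only on the right-hand side of equal signs within a parallel region should be shared.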
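To illustrate the rule of thumb, here is a minimal, self-contained sketch in the same fixed-form style. It is not the original code: the array names src and dst and the sizes nx, ny, nz are made up for the example. The array that is only read (src) is left shared, and only the loop indices are private.

```fortran
      program shared_read_only
      implicit none
c     Hypothetical sizes and array names, chosen for illustration only.
      integer nx, ny, nz
      parameter (nx=8, ny=8, nz=4)
      double precision src(nx,ny,nz), dst(nx,ny,nz)
      integer i, j, k, nbad

c     Fill the read-only source array serially, before the region.
      do k=1,nz
        do j=1,ny
          do i=1,nx
            src(i,j,k) = dble(i + 10*j + 100*k)
          enddo
        enddo
      enddo

c     src is only read inside the region, so it is left shared.
c     Only the loop indices are listed as private.
c$omp parallel do default(shared), private(i,j,k)
      do k=1,nz
        do j=1,ny
          do i=1,nx
            dst(i,j,k) = src(i,j,k)
          enddo
        enddo
      enddo
c$omp end parallel do

c     Serial check that every element was copied correctly.
      nbad = 0
      do k=1,nz
        do j=1,ny
          do i=1,nx
            if (dst(i,j,k) .ne. src(i,j,k)) nbad = nbad + 1
          enddo
        enddo
      enddo
      print *, 'mismatched elements: ', nbad
      end
```

Assuming gfortran, this can be built with something like `gfortran -fopenmp shared_read_only.f`, and it should report zero mismatched elements. Adding src to the private list would instead hand each thread an uninitialized copy, so the values read inside the loop would be undefined.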