Skip to content

Instantly share code, notes, and snippets.

@ivan-pi
Last active November 6, 2023 10:30
Show Gist options
  • Save ivan-pi/a9c905065af75362c786c2032ac48c56 to your computer and use it in GitHub Desktop.
Save ivan-pi/a9c905065af75362c786c2032ac48c56 to your computer and use it in GitHub Desktop.
Banded system of linear equations solved using "gbsv" driver from LAPACK (Intel MKL)
!LINK="${MKLROOT}/lib/intel64/libmkl_lapack95_lp64.a -Wl,--start-group ${MKLROOT}/lib/intel64/libmkl_intel_lp64.a ${MKLROOT}/lib/intel64/libmkl_sequential.a ${MKLROOT}/lib/intel64/libmkl_core.a -Wl,--end-group -lpthread -lm -ldl"
!INC="-I${MKLROOT}/include/intel64/lp64/ -I${MKLROOT}/include/"
!ifort -O3 -warn all -o test_banded $INC test_banded.f90 $LINK
program test_banded
use lapack95, only: gbtrf, gbtrs, gbsv
use f95_precision, only: wp => dp
implicit none
integer, parameter :: kl = 2, ku = 1
integer, parameter :: n = 21
real(wp) :: ab(2*kl+ku+1,n)
real(wp) :: b(n)
integer :: ipiv(n), info
! Setup reaction-diffusion equation
call setup(n,kl,ku,ab,b)
! General banded matrix driver
call gbsv(ab,b,kl,ipiv,info=info)
print *, "# factor and solve info = ", info
print *, b
contains
subroutine setup(n,kl,ku,ab,b)
integer, intent(in) :: n, kl, ku
real(wp), intent(out) :: ab(2*kl+ku+1,n)
real(wp), intent(out) :: b(n)
real(wp) :: h, diff, reac
integer :: i, ir
b = 0.0_wp
b(1) = 1.0_wp ! Dirichlet boundary condition
! b(n) = 0.0_wp ! zero-flux boundary condition satisfied during initialization
! stepsize, diffusivity and reaction coefficient
h = 1.0_wp/real(n-1,wp)
diff = 1.0_wp
reac = 1.0_wp
ir = kl+ku ! number of super diagonals is 3, also the row of the first actual super-diagonal
ab(ir:,:) = 0._wp ! set matrix to zero
do i = 2, n - 1
ab(ir,i+1) = diff
ab(ir+1,i) = -2._wp*diff - h**2*reac ! D [1 -2 1] u = h^2 k u
ab(ir+2,i-1) = diff
end do
ab(ir+1,1) = 1.0 ! Dirichlet boundary condition
! Zero-flux boundary condition on right side
ab(ir+1,n) = 1.5_wp
ab(ir+2,n-1) = -2._wp
ab(ir+3,n-2) = 0.5_wp
end subroutine
end program
@ivan-pi
Copy link
Author

ivan-pi commented Nov 6, 2023

If stored as a sparse matrix we would have:

  • 1 nonzero for the Dirichlet condition at the surface,
  • 3 nonzeros for the N-2 bulk nodes,
  • 3 nonzeros for the reflection condition at the center (assuming a one-sided difference is used)

This gives 1 + 3 + 3*(N-2) nonzeros in total.

For this discretization we could also use a tridiagonal solver by either

  • using a ghost node for the zero-flux boundary condition, or
  • by eliminating the extra value off of the main three diagonals on the fly.

@ivan-pi
Copy link
Author

ivan-pi commented Nov 6, 2023

For the sparse matrix, the assembly in triplet (coordinate) format would look as follows:

! Dirichlet condition at the surface
a(1) = 1.0
cnr(1) = 1
rnr(1) = 1

! Interior nodes
k = 2
do i = 2, N-1
  a(k:k+2) = [diff, -2*diff - h**2*reac, diff]
  cnr(k:k+2) = [i-1,i,i+1] ! column numbers
  rnr(k:k+2) = i           ! row numbers 
  k = k + 3
end do

! Neumann boundary
a(k:k+2) = [0.5d0,-2.0d0,1.5d0]
cnr(k:k+2) = [N-2,N-1,N]
rnr(k:k+2) = N

For CSR storage it would be the following,

integer :: n, nnz, i, iaa, iab
integer, allocatable :: ia(:), ja(:)
real(dp), allocatable :: a(:)

n = ...
nnz = 1 + 3 + 3*(n-2)

allocate(ia(n+1),ja(nnz),a(nnz))

! Dirichlet condition at the surface
ia(1:2) = [1,2]
a(1) = 1.0
ja(1) = 1

! Interior nodes
do i = 2, N-1
  ia(i+1) = ia(i) + 3   ! row pointer
  iaa = ia(i)
  iab = ia(i+1) - 1
  a(iaa:iab) = [diff, -2*diff - h**2*reac, diff]
  ja(iaa:iab) = [i-1,i,i+1]  ! column indices
end do

! Neumann boundary
ia(N+1) = ia(N) + 3
iaa = ia(N)
iab = ia(N+1)-1
a(iaa:iab) = [0.5d0,-2.0d0,1.5d0]
ja(iaa:iab) = [N-2,N-1,N]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment