CPR-Matrix
**********

This Classic Parallel R-matrix (CPR) suite of codes should not be mixed
with any others EXCEPT the serial suite available here, with which it is
(or should be) fully interchangeable. Thus, if a parallel version of a
serial code is not present here, simply use the serial version.

Since this Parallel R-matrix suite is based on the classic serial
R-matrix suite, its input options (and usage) differ little from the
serial case. In this PWRITEUP file we discuss points specific to
CPR-matrix.

N. R. Badnell 20/10/14

ooooooooooooooooooooooooooooooooooooo

***All input files are as per the serial case, except as detailed below***

INNER REGION CODES
******************

EXCHANGE: LS-coupling/Breit-Pauli
*********

PSTG1:
------
No new input. Blocks of bound-continuum and continuum-continuum
integrals are distributed over the available processors. This is mainly
of note for RMPS calculations; a small number of processors should
suffice (e.g. 10).

Note, if you follow PSTG1 with the serial STG2 code then the number of
processors used in PSTG1 (NPROCSTG1) must be given in NAMELIST STG2A.

PSTG2:
------
No new input. Groups of symmetries are distributed over the available
NPROC processors. The maximum number of processors usable is INAST -
one symmetry per processor. For efficiency, INAST/NPROC should be
(close to) an integer. It is best to use the MINST,MAXST,MINLT,MAXLT
specification (INAST.lt.0) and then use MAXLT+1 processors. Then, all
spins and parities for a given L are on the same processor and each L
is on a different processor - the best load balance.

Note, PSTG2 writes a new file called sizeH.dat which is read by PSTG3 -
do not delete it!

PSTG2.5
-------
By default, reads the dstgjk file that will be used by pstgjk. It MUST
use the same number of processors as there are Jp symmetries, i.e. only
one Jp symmetry per processor is allowed. It creates STG2HJXXX files of
the H(LS) symmetries which contribute to each of the Jp symmetries.

PSTGJK:
------
No new input.
Run the same Jp symmetries (hence number of processors) as for the
pstg2.5 run. Once the RECUPHXXX files have been computed, the STGHJXXX
files can be deleted. On a small cluster, several pstg2.5/pstgjk runs
will be needed. It is best to run through pstg3r each time and archive
the H.DAT files. (These files, relabelled HXX.DAT, can be merged via
hmerge.f to form a single H.DAT file for outer region processing.)

***WARNING: if new RECUPHXXX files overwrite those from an old run and
the old run had more symmetries, pstg3r will read the old ones again as
well - so delete them all first.

Note, PSTGJK writes a new file called sizeBP.dat which is read by PSTG3
- do not delete it!

PSTG3:
------
A new namelist, MATRIXDAT, follows the serial STG3A and STG3B
namelists. PSTG3 uses ScaLAPACK to carry out the matrix diagonalization
distributed over the available NPROC processors (NOT by symmetry). The
matrix is divided up over an NPROW by NPCOL processor grid, where
NPROW*NPCOL=NPROC. Ideally, the number of processors used should be an
integer square, so that NPROW=NPCOL. If NPROW and/or NPCOL are not
specified in the namelist then the code attempts a suitable assignment
based on NPROC. This may fail if NPROC is not an integer square. A
prime number for NPROC is not a good idea.

Note, the MATRIXDAT namelist must come BEFORE any observed energies
that have been tagged for reading by NAST in namelist STG3B. (***This
ordering is subject to change.)

PSTG3_SPLIT (Advanced)
-----------
Diagonalize multiple H concurrently (CPB), if you have the memory...
No details.

NON-EXCHANGE:
*************
Use the exchange codes with their non-exchange switch set (LNOEX=-1).
(The historic explicit non-exchange codes did not scale well.)

PHOTOIONIZATION (CPB)
***************
Switch on dipoles in the input dstgn (RAD='YES') for n=1,2,jk (not 3),
as in the serial case.

PSTG1
-----
The same code as for "non-dipole" operation.

PSTG2_DIP
---------
No new input. Runs one dipole pair of symmetries per processor.
So, nproc MUST match the number of pairs - the code exits if not.
Produces a STG2HXXX and a STGDXXX file for each dipole pair. In LS,
there is a single initial and final symmetry per dipole. In jK, the
code looks ahead and dstgjk is read. Then, all LS-dipole pairs that
contribute to the single jK-dipole are assembled in the STGDXXX file
and the H-symmetries in STGHXXX.

PSTGJK_DIP
----------
No new input. Runs one (jK) dipole pair of symmetries per processor.
Reads the appropriate STG2HXXX and STGDXXX files. So, nproc is the same
as in PSTG2_DIP; conversely, nproc for PSTG2_DIP is the number of
jK-dipole pairs.

PSTG3_DIP
---------
No new input. Diagonalizes H as in the parallel non-dipole run but
writes e-vectors to symmetryXX files, one for each dipole pair (but
with non-sequential numbering), in addition to the usual H.DAT.

PSTGD_DIP
---------
No input. Combines the STG2DXXX and symmetryxx files to produce the
final DXX files. It processes one dipole pair at a time; the
parallelization is in the matrix multiplication: V^T*D*V. For large
cases, set nproc ~ MAXC (the number of continuum basis orbitals).

PSTGD_DIP_SPLIT (Advanced)
---------------
Process dipole pairs concurrently (CPB), if you have the
memory/processors... No details.

OUTER REGION CODES
******************

PSTGB:
-----
No new input. Distributes groups of symmetries over the available
processors.

PSTGF:
-----
No new input. Groups of energies are distributed over the available
processors. The number of energies on each processor is MXEP=MXE/NPROC,
so if this is not an integer then the energy range covered will fall a
little short. The energies are not distributed sequentially over
processors, since low energies (most channels closed) are much faster
than high energies (most channels open). Rather, each processor has a
set of energies which spans the entire energy range - in effect (and in
practice) a large EINCRP given by EINCR*NPROC.

*** See note below about interpolation.
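In other words, for a uniform mesh, processor p (numbered from 0) gets the
energies E0 + (p + k*NPROC)*EINCR for k = 0, ..., MXEP-1, so every
processor's subset spans the full range. A minimal sketch of this
round-robin distribution (illustrative only - the variable names follow
this text, not the actual PSTGF source):

```python
def distribute_energies(e0, eincr, mxe, nproc):
    """Split an MXE-point uniform mesh over NPROC processors, PSTGF-style:
    each processor's set spans the whole range with an effective
    increment EINCRP = EINCR*NPROC (round-robin, not contiguous blocks)."""
    mxep = mxe // nproc  # energies per processor; any remainder is dropped,
                         # so the range falls a little short if MXE % NPROC != 0
    return [[e0 + (p + k * nproc) * eincr for k in range(mxep)]
            for p in range(nproc)]

# 10 energies over 4 processors: only MXEP*NPROC = 2*4 = 8 are computed,
# so the covered range stops short of the nominal top energy.
groups = distribute_energies(0.0, 0.1, 10, 4)
```

Note how each group (e.g. [0.0, 0.4] on processor 0, [0.3, 0.7] on
processor 3) mixes slow low energies with fast high ones, which is the
point of the interleaving.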
If k/smtls.dat files are being written (by IPRKM=4) for PSTGICF then
they are split by energy group: k/smtls.dat.001, k/smtls.dat.002, etc.
If the k/smtls files are split by symmetry as well (NOT the default -
opposite to the serial case) then they are labelled k/smtls.001.002,
etc., where the first triplet denotes the symmetry and the second the
energy group.

PSTGICF:
-------
No new input. By default, the energy mesh is that used and passed
through by PSTGF. Thus, PSTGICF must use the same number of processors
as the PSTGF run.

Both PSTGF and PSTGICF produce OMEGA(U).001, OMEGA(U).002, etc. files,
one for each energy group, i.e. as many as the number of processors
used in the run. A single OMEGA(U) file can be reconstructed using the
serial utility code omgmrgp.f, which requires no input for standard
operation. All other processing operations follow as in the serial
case.

Energy interpolation of the k/s-matrices (within PSTGF or in PSTGICF)
operates rather differently in parallel, due to the need to load
balance while avoiding I/O and message passing of said matrices. As
such, it is beyond the scope of this introduction - see the PUPDATES
entry for 02/07/05 or the comments at the start of the PSTGF/PSTGICF
codes.

PSTGFDAMP, PSTGBF0DAMP, PSTGICFDAMP
-----------------------------------
Parallel operation is as per PSTGF or PSTGICF. Generate the dipole
matrices using the PSTGN_DIP suite.

((((((((()))))))))
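A conceptual footnote on merging the OMEGA(U).001, OMEGA(U).002, etc.
files: since each energy group spans the whole range with stride NPROC,
reconstructing the single sequential mesh amounts to interleaving the
per-group meshes back into ascending order. A toy sketch of that
principle only (NOT the actual omgmrgp.f, which of course operates on
the OMEGA file format):

```python
import heapq

def merge_energy_groups(groups):
    """Interleave per-processor (per-group) energy lists, each already
    ascending, back into one ascending mesh - the inverse of the PSTGF
    round-robin distribution over processors."""
    return list(heapq.merge(*groups))

# Two groups as an NPROC=2 PSTGF run would produce them:
merged = merge_energy_groups([[0.0, 0.2, 0.4], [0.1, 0.3, 0.5]])
# merged is the single sequential mesh 0.0, 0.1, ..., 0.5
```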