technical modifications from HIRLAM

Contains:

Suggested changes for cy40t1
Rimvydas Jasinskas, LHMI, Ulf Andrae, SMHI
2013-09-24


(extract)

Updated 2014-03-17 (point 8)

  • Introduction
    The latest version of the HARMONIE code, based on the CY40 version of the IFS/ARPEGE code, has gone through a first sanity check for some basic Fortran and C standard violations.
    The IFS/ARPEGE code has a well-defined coding standard described in
    http://www.cnrm.meteo.fr/gmapdoc//IMG/pdf/coding-rules.pdf. Some of the rules there are very specific to this particular project, whereas others are more generally applicable and are considered
    good coding practice in general. These latter rules are mainly defined in section four of the document, but there are examples elsewhere as well. For example, it is stated that the code should be in free format,
    that implicit none is mandatory and that the tab character is not allowed. It is clear from the suggested changes below that there are more than enough examples of deviations from these rules in the code. The
    coding rules do not specify how to define and undefine CPP flags, and several of the changes deal with dangerous preprocessor constructions that may lead to unpredictable and unwanted results on
    different platforms. It is unclear to us whether the other groups apply similar rules to their codes.
    The HARMONIE code is an example of a project where we utilize the knowledge of several groups in Europe, all with their different specialties. We lean on SURFEX for the surface modeling; from the
    MESO-NH group we use several packages for physical parameterization; and the IFS/ARPEGE code is
    the backbone of the system, with all the different components for e.g. numerics and data assimilation.
    The coordination of these different code bases is a challenge, but we hope that it is a win-win situation
    for everyone in the end. We understand that we should not try to impose coding rules on other projects.
    Still, we believe that we all have the same goal: to arrive at a system that is fast, portable and
    manageable.
  1. CPP directives
    This change deals with cpp features. All of them are quite safe under normal compilation, but optimized
    builds are a different matter (compare safe programming practices with cpp for C codes). The use of cpp in
    Fortran is not defined by the Fortran standard; it is therefore vendor specific and unstable, with no
    guarantee that it will be supported or adapted to Fortran syntax (line continuation with ’&’, comments
    with ’!’). Also, cpp was designed for case-sensitive languages, whereas Fortran is case sensitive only in
    string comparisons. Combined with a missing implicit none, this can create errors that are very hard to
    detect, and it already has in a few places.
    Undefs are required to keep macro names from leaking into other blocks under aggressive inlining (since cpp
    in Fortran is only a vendor extension, the cpp expansion step is not necessarily performed in the first stages).
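    A minimal sketch of the define/undef discipline described above. The macro name DEMO_NLEV and the program name are hypothetical and only serve the illustration; the sketch assumes a compiler that runs cpp on files with an upper-case .F90 suffix.

```fortran
! Hypothetical example file: cpp_scope_demo.F90
program cpp_scope_demo
  implicit none            ! an unexpanded or mistyped name becomes a compile error
  integer :: ilev

! Upper-case, prefixed macro name: cpp matches case-sensitively, so the
! macro cannot silently replace an ordinary lower-case Fortran identifier.
#define DEMO_NLEV 3
  ilev = DEMO_NLEV
  print *, 'levels =', ilev
! Drop the macro as soon as the block that needs it ends.
#undef DEMO_NLEV

#if defined(DEMO_NLEV)
  print *, 'macro leaked into the next block'
#else
  print *, 'macro is no longer defined'
#endif
end program cpp_scope_demo
```

    Keeping each define paired with an undef in the same block limits the region in which a name collision can occur at all.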
  2. SURFEX CPP flags
    Mainly a clean-up of the SURFEX defines. The SURFEX CPP flags should be changed to
    SFX_ARO, SFX_ASC, SFX_BIN, SFX_FA, SFX_LFI, SFX_OL, SFX_TXT.
    This gives at least readable debug information with 1:1 identical source code and avoids conflicts
    with the fa/lfi structure names. The SFX_ prefix was chosen in analogy with the SFX_NETCDF define.
  3. SURFEX cleaning
    Feature proposal:
    Mainly replace outdated and very dirty [if/endif]++ and if/[elseif]++/else/endif chains with cleaner and
    safer "select case" structures. This is more scalable and, combined with cpp defines, preserves standard
    Fortran syntax. It also provides a clear default action and error control. Moreover, it suggests how to keep
    such structures uniform across different units.
    From a low-level point of view this counts as a compile-time optimization, since the compiler does not have
    to guess about the flow: inlining and branching improve. At assembler level it allows easy and fast creation
    of jump tables, even for strings. If the compiler is clever enough, it can later partly fuse these jump tables,
    since they are of constant type and placed in .rodata data sections (easier code placement).
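    The pattern can be sketched as follows. The routine name open_surf_file, the argument hprogram and the selector strings are assumptions made for this illustration only; the SFX_ flags are the ones proposed in point 2.

```fortran
! Hypothetical example file: select_case_demo.F90 (needs cpp)
program select_case_demo
  implicit none
  call open_surf_file('ASCII')
  call open_surf_file('XYZ')
contains
  subroutine open_surf_file(hprogram)
    character(len=*), intent(in) :: hprogram
    ! Each backend is one case branch guarded by its own cpp flag, so the
    ! construct stays valid Fortran whichever flags are defined.
    select case (trim(hprogram))
#ifdef SFX_ASC
    case ('ASCII')
      print *, 'using the ASCII backend'
#endif
#ifdef SFX_FA
    case ('FA')
      print *, 'using the FA backend'
#endif
    case default
      ! Single, explicit place for error control.
      print *, 'unsupported file type: ', trim(hprogram)
    end select
  end subroutine open_surf_file
end program select_case_demo
```

    With no flags defined, only the default branch remains and the unit still compiles, which an if/elseif chain split by cpp guards does not guarantee.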
  4. Ascii and unicode symbols
    Such symbols are not defined by the Fortran standard and are vendor specific; better to avoid them if possible.
    A lot of time has been wasted on understanding what " \206 " is in string hex dumps. The problem with extended
    ASCII symbols is that some editors can upconvert them to their UTF-8/16 variants. Then not only is the Fortran
    runtime forced to work in full unicode mode, but this can also lead to incompatible, non-portable output files,
    and a multibyte encoding could still be interpreted as a 7-bit char stream, creating offsets in records.
    No cases in the actual code (too much I/O), but inlined debug information containing parts of the source code shows some mangling.
  5. Detab
    Reason: the tab character *is not* part of the Fortran standard character set. Its behavior is not portable and
    depends on the system and environment. Errors caused by tabs are very hard to detect and usually show up when
    changing compiler vendors or even versions. Some optimizations could be disabled just because compiler regexes were
    not exten
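    As an illustration of what detabbing a source line involves, here is a small sketch that expands tabs to spaces. The function name detab and the tab stop of 8 columns are assumptions; this is not the tool used for the actual change.

```fortran
program detab_demo
  implicit none
  character(len=1), parameter :: tab = achar(9)
  character(len=:), allocatable :: fixed

  fixed = detab('x' // tab // '= 1', 8)
  print '(3a)', '[', fixed, ']'
contains
  ! Replace every tab by the number of blanks needed to reach the next
  ! tab stop, so the visual layout of the line is preserved.
  function detab(line, tabstop) result(out)
    character(len=*), intent(in) :: line
    integer, intent(in) :: tabstop
    character(len=:), allocatable :: out
    integer :: i, npad

    out = ''
    do i = 1, len(line)
      if (line(i:i) == tab) then
        npad = tabstop - mod(len(out), tabstop)
        out = out // repeat(' ', npad)
      else
        out = out // line(i:i)
      end if
    end do
  end function detab
end program detab_demo
```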
  6. Free format and explicit kind declarations
    Conversion to a universal free/fixed form (valid as long as everything after column 72 is discarded); the files
    should be renamed to F90. Remark: stabfunc2.h and impnone.h have been inlined and can thus be removed. (Manual
    indentation was performed in some places for readability; other packs were prepared in a similar way.)
    For portability reasons and for easier interfacing to other systems, explicit kind declarations for all reals
    and integers are preferable.
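    A short sketch of explicit kind declarations. The module name kinds_demo, the kind names ik and rk, and the requested ranges are assumptions chosen for the example, not the project's actual kind parameters.

```fortran
module kinds_demo
  implicit none
  ! Request kinds by required range and precision instead of relying on
  ! compiler defaults or promotion flags.
  integer, parameter :: ik = selected_int_kind(9)        ! at least 9 decimal digits
  integer, parameter :: rk = selected_real_kind(13, 300) ! 13 digits, exponent range 300
end module kinds_demo

program explicit_kind_demo
  use kinds_demo, only: ik, rk
  implicit none
  integer(kind=ik) :: nlev
  real(kind=rk)    :: zdt

  ! Literals carry the kind too, so no silent default-kind conversion occurs.
  nlev = 65_ik
  zdt  = 60.0_rk
  print *, nlev, zdt
end program explicit_kind_demo
```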
  7. New suggestions concerning the use of modules to avoid generating extra interface blocks
    For each subroutine we currently generate an extra interface module and make use of it in all calling routines. If the routines were put in modules directly, the interface would be included implicitly. To be consistent with the current "USE MODI_" statements we suggest renaming files from this_sub.F90 to modi_this_sub.F90, with the following layout:


module modi_this_sub
contains
subroutine this_sub
...
end subroutine this_sub
end module modi_this_sub

Further, we suggest renaming files so that module and subroutine names are consistent, e.g. init_io_surfn.F90 -> modi_init_io_surf_n.F90

These changes would:

  • Cut the list of files to compile by half.
  • Simplify the makefile logic.
  • Make dependency analysis simpler.
  • Provide better diagnostics.

The suggested change would come with some cost:

  • An increased recompilation cost.
  • All subroutines will lose global visibility. This is the aim of the suggested "remove global state" change anyway, but callers will then need an explicit use modi_NAME to avoid an undefined reference to NAME.
  • Some current explicit INTERFACE blocks have to be recoded.
  • Various build systems have to adapt to the new convention.
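
The proposed layout and the explicit use statement it requires can be sketched as below. The argument list of this_sub and the program caller_demo are hypothetical and exist only to make the example self-contained.

```fortran
module modi_this_sub
  implicit none
  private
  public :: this_sub
contains
  ! The routine lives inside the module, so its interface is explicit for
  ! every caller without a separately generated interface file.
  subroutine this_sub(kval, pres)
    integer, intent(in)  :: kval
    real,    intent(out) :: pres
    pres = 2.0 * real(kval)
  end subroutine this_sub
end module modi_this_sub

program caller_demo
  ! Without this use statement the call below would reference an external
  ! this_sub that no longer exists, giving an undefined reference at link time.
  use modi_this_sub, only: this_sub
  implicit none
  real :: zres

  call this_sub(21, zres)
  print *, zres
end program caller_demo
```

Because the interface is now checked at compile time, a mismatch in argument type or count is reported by the compiler instead of surfacing at run time.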

These changes will be included in SURFEX V8 by Rimvydas Jasinskas, in a mainly automatic way.

They could be included in SURFEX V8 at the end of the whole merging, in last-but-one position, just before the CERFACS contribution.

To make this inclusion easier, Rimvydas will test his changes (affecting a large part of the code) on a previous version before the final inclusion in V8.