Developer guidelines

Developers of and contributors to FEASST may consider the following guidelines to simplify collaboration.

Branch policies

Branches in Git are extremely useful for working on different features within the same code base and then merging them into a final product or release.

https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging

Using and naming branches consistently may simplify this process. The policies below are loosely based on the following blog post.

http://nvie.com/posts/a-successful-git-branching-model/

Main branch

This is the release branch. This branch may only merge with develop and hotfix branches.

Develop branch

This is the development branch.

Tested and complete implementations of new features are added here.

  • Code must compile and all gtests must pass

  • Consider feature compatibility with the SWIG Python interface

  • When merging into this branch, use --squash to add the entire feature with a single commit. Alternatively, you could also rebase your private branch.

Hotfix branch

Hotfix branches are for implementing bug fixes to the main branch. They are deleted after merging with main and develop.

Feature branches

These branches are for the development of new features. These branch names must begin with characters which identify the main developer of the feature (e.g., hwh/feature). The main developer sets the branch policy.

Dead (dead_*) branches

These branches may record an incomplete attempt of a feature that may be relevant in the future. These branch names must begin with the characters “dead”.

  • No rules. Code may not compile. HINT: rename branches with “git branch -m <newname>”

Pull requests

To create a pull request, fork the usnistgov repo, create a new branch with your changes, and add the pull request, as detailed in https://opensource.com/article/19/7/create-pull-request-github

To incorporate the pull request into feasst:

  • git fetch usnistgov pull/ID/head:BRANCHNAME

  • git checkout BRANCHNAME

  • [make local changes]

  • git push usnistgov BRANCHNAME

Tools

GTEST: Unittest C++

The main branch requires GTEST coverage for all cpp plugins in plugin/name/test.

cmake -DUSE_GTEST=ON ..
make unittest -j12
./bin/unittest
  • use --gtest_filter=*Name* to run specific tests

  • use ./bin/unittest |& more to pipe both stdout and stderr through more.

  • use --gtest_shuffle to randomize the order of the tests

  • use --gtest_random_seed=SEED to reproduce a specific order.

GDB or LLDB: Debugging

gdb (or lldb on macOS) is especially useful for identifying segfaults via backtraces. Compiling with the -g flag includes the debugging symbols so that the gdb output reports correct line numbers.

In bash

gdb [program executable name]
r [flags]

or

gdb --batch --command=../dev/test.gdb ./bin/unittest

gdb can also be used with python as

export PYTHONPATH=$PYTHONPATH:~/feasst/build/
gdb python
r [python script] [optional flags]

Valgrind: Find memory leaks

Valgrind helps to detect memory management bugs.

http://valgrind.org/

For example, to run Valgrind on a particular test and write the output to a text file

valgrind ./unittest --gtest_filter=MC.* > out.txt 2>&1
  • For uninitialized value errors, try --track-origins=yes

  • For leaks, try --leak-check=full --show-leak-kinds=all

  • Don’t use the profiler for leak checks. OMP causes reported “leaks” that are O.K.

  • To suppress false positives (e.g., gomp or gsl), use --gen-suppressions=all to generate suppression files

GCOV and LCOV: Test coverage

GCC supports coverage testing with gcov, and lcov can be used to visualize the results.

  • Code: currently implemented with Travis CI and CodeCov and available online. See .travis.yml for an example of how to use lcov

  • Use GCOV with CMake: cmake -DUSE_GCOV . Note: this disables optimization, so don’t use it for production simulations.

  • make coverage

  • Open coverage/index.html in your browser.

  • Go into “src” and ignore the external library coverage.

CCACHE: Speed up compilation time

Something as trivial as changing a comment in a header file can lead to a massive recompile of the entire source. ccache remembers the results of your previous compile, so recompilation in this case is nearly instant.

Document

Setup

  • pip install sphinx breathe

  • run doxygen with GENERATE_XML

  • run sphinx-quickstart, enable autodoc

  • add something like the following to your sphinx index.rst:

.. doxygenclass:: Nutshell
   :project: nutshell
   :members:

add the following to your sphinx conf.py

extensions = [ "breathe", "nbsphinx" ]
breathe_projects = {"FEASST":"../xml"}
breathe_domain_by_extension = {"h" : "cc"}

pip install sphinx_rtd_theme nbsphinx

run sphinx: make html

Sphinx/Breathe/Doxygen notes

  • Link from rst file to C++ function: :cpp:func:`link <feasst::className::function()>`

  • Link from rst file to C++ class: :cpp:class:`link <feasst::className>`

  • Link from rst file to fst file: :doc:`/tutorial/asdf` [note, / references root]

  • Link from rst file to ipynb file : `Tutorial <tutorial/tutorial.html>`_

  • Link from C++ to C++: className::function()

  • Link from C++ source to rst file: <a href="tutorial/asdf.html">test</a>

  • For math in C++ comments:

    \f$ latex code here \f$
    
  • For tables, see monte_carlo/include/trial_compute_add.h
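
As a rough illustration of the math markup above, a documented function might look like the following sketch. The function name and its arguments are hypothetical and are not taken from FEASST; the formula is the standard Lennard-Jones potential, used here only because it is a familiar expression to typeset.

```cpp
#include <cmath>

/**
  Hypothetical example of a Doxygen comment with inline math.

  Returns the Lennard-Jones energy,
  \f$ U(r) = 4\epsilon[(\sigma/r)^{12} - (\sigma/r)^6] \f$,
  for a separation distance \f$r\f$.
 */
inline double lennard_jones(const double r,
                            const double epsilon,
                            const double sigma) {
  const double sr6 = std::pow(sigma/r, 6);
  return 4.*epsilon*(sr6*sr6 - sr6);
}
```

The text between each pair of \f$ markers is passed to LaTeX by Doxygen, so ordinary LaTeX math syntax applies inside the comment.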

Pip notes

dev/tools/pip_install.sh

Anaconda cloud notes

https://docs.anaconda.com/anaconda-cloud/user-guide/getting-started/

  • conda build purge

  • still haven’t gotten this to work because of overlinking

Style

Naming

  • ClassNames are mixed case with starting upper case letter

  • member_names are lower case with underscores

  • private_member_names_ end with an underscore

  • function_names are also lower case with underscores

  • boolean names use the form is_[accepted, etc.]

  • MACROS and CONSTANTS are all upper case.

  • Avoid MACROS and CONSTANTS.

  • use “and”, “or” instead of “&&”, “||” (HWH: change this to follow Google?)
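
The naming rules above could be combined as in the following minimal sketch. Every name in it (example, MAX_SITES, TrialCounter and its members) is hypothetical and chosen only to show the pattern.

```cpp
// Hypothetical names throughout; only the naming pattern follows the list above.
namespace example {

// MACROS and CONSTANTS are all upper case, but are best avoided.
constexpr int MAX_SITES = 8;

// ClassNames are mixed case, starting with an upper case letter.
class TrialCounter {
 public:
  // function_names are lower case with underscores.
  // Boolean names use the is_[...] form.
  void record_attempt(const bool is_accepted) {
    ++num_attempts_;
    // "and" is used in place of "&&".
    if (is_accepted and num_accepted_ < num_attempts_) {
      ++num_accepted_;
    }
  }

  int num_attempts() const { return num_attempts_; }
  int num_accepted() const { return num_accepted_; }

 private:
  // private_member_names_ end with an underscore.
  int num_attempts_ = 0;
  int num_accepted_ = 0;
};

}  // namespace example
```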

Functions

  • Use return values. Argument ordering: input (value or constant reference), then output (pointer only)

  • Overloaded functions: if all overloads can be documented in a single comment, that is good

  • No default parameters on virtual functions
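
A small sketch of the function rules above, assuming hypothetical names (sum_and_scale, Shape, Cube). It shows inputs before outputs, outputs passed by pointer, and a virtual function declared without a default parameter.

```cpp
#include <vector>

// Hypothetical function: inputs come first (value or constant reference),
// then the output (pointer only). The main result is the return value.
inline double sum_and_scale(const std::vector<double>& values,  // input
                            const double factor,                // input
                            std::vector<double>* scaled) {      // output
  double sum = 0.;
  scaled->clear();
  for (const double value : values) {
    scaled->push_back(factor*value);
    sum += value;
  }
  return sum;
}

// Hypothetical base class.
class Shape {
 public:
  // No default parameter: a default on a virtual function is chosen from the
  // static type of the caller, so base and derived defaults could disagree.
  virtual double volume(const double scale) const = 0;
  virtual ~Shape() {}
};

// Hypothetical derived class.
class Cube : public Shape {
 public:
  double volume(const double scale) const override {
    return scale*scale*scale;
  }
};
```

Passing the output by pointer makes the call site read as sum_and_scale(values, 2., &scaled), so the argument that is modified is visible to the caller.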

Classes

  • Nearly all data members should be private. Limit protected members

  • member_name() returns const member

  • set_member_name(member_name) sets member

  • For setters with multiple arguments, the first arguments are the vector indices, in the same order as x[0] = 3…

  • getptr_member_name returns constant pointer (optimization only)
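
The accessor conventions above might look like the following sketch. Histogram and its members are hypothetical, and the getptr_ accessor is left out because it is intended for optimization only.

```cpp
#include <vector>

// Hypothetical class; only the accessor naming follows the list above.
class Histogram {
 public:
  explicit Histogram(const int num_bins) : counts_(num_bins, 0.) {}

  // member_name() returns the member as const.
  const std::vector<double>& counts() const { return counts_; }
  double width() const { return width_; }

  // set_member_name(member_name) sets the member.
  void set_width(const double width) { width_ = width; }

  // With several arguments, the vector index comes first,
  // mirroring the order of counts[bin] = count.
  void set_count(const int bin, const double count) { counts_[bin] = count; }

 private:
  // Data members are private.
  std::vector<double> counts_;
  double width_ = 1.;
};
```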

Loops and if

  • use of “for (auto element : container) { … }” is dangerous

  • for simple loops over containers, state the element type explicitly, e.g., “for (const Type& element : container)”

  • for loops where you need the index, use: for (int index = 0; index < static_cast<int>(container.size()); ++index)
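
One possible reason for the warning about auto, offered here as an assumption rather than the authors' stated rationale, is that "auto element" takes each element by value, so changes are silently made to a copy. The sketch below uses hypothetical function names to contrast this with an explicit reference type and with the indexed loop.

```cpp
#include <vector>

// With "auto" by value, each element is a copy, so the change is discarded.
inline void scale_by_copy(std::vector<double>* values, const double factor) {
  for (auto value : *values) {
    value *= factor;  // modifies only the temporary copy
  }
}

// With an explicit reference type, the container itself is updated.
inline void scale_in_place(std::vector<double>* values, const double factor) {
  for (double& value : *values) {
    value *= factor;
  }
}

// When the index is needed, size() is cast to int so that the comparison
// does not mix signed and unsigned integers.
inline int first_negative(const std::vector<double>& values) {
  for (int index = 0; index < static_cast<int>(values.size()); ++index) {
    if (values[index] < 0.) {
      return index;
    }
  }
  return -1;  // hypothetical sentinel: no negative element was found
}
```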

Auto

  • only use auto when the type is clear such as auto var = std::make_shared<..>.

Arguments

  • All arguments are provided as strings and converted to the expected type.

  • Check that all arguments are used so that a typo is caught (similar to implicit none in Fortran).

  • Argument defaults need to be set and clearly commented.

  • If no default, it is a required argument.
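
The argument rules above can be sketched with a map of strings. ArgMap, required_double, optional_int and check_all_used are hypothetical helpers written for this illustration and are not the FEASST implementation.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an argument list: every value is a string.
typedef std::map<std::string, std::string> ArgMap;

// Convert a required argument to double and remove it from the list.
// There is no default, so a missing key is an error.
inline double required_double(const std::string& key, ArgMap* args) {
  const ArgMap::iterator iter = args->find(key);
  if (iter == args->end()) {
    throw std::invalid_argument("missing required argument: " + key);
  }
  const double value = std::stod(iter->second);
  args->erase(iter);
  return value;
}

// Convert an optional argument to int, or return the default if absent.
inline int optional_int(const std::string& key,
                        const int default_value,
                        ArgMap* args) {
  const ArgMap::iterator iter = args->find(key);
  if (iter == args->end()) {
    return default_value;
  }
  const int value = std::stoi(iter->second);
  args->erase(iter);
  return value;
}

// Anything still in the list was never read, which usually means a typo.
inline void check_all_used(const ArgMap& args) {
  if (!args.empty()) {
    throw std::invalid_argument("unused argument: " + args.begin()->first);
  }
}
```

Removing each argument as it is read is one simple way to detect unused entries: a misspelled key such as "betta" is never consumed and is reported by check_all_used.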

Serialization

  • guided by https://isocpp.org/wiki/faq/serialization

  • For inheritance hierarchy, a static deserialize_map is used to relate class name to template.

  • Each object serializes a version that can be used for checks and backwards compatibility.

  • utils_io.h contains many function templates for serialization.

  • In particular, feasst_deserialize_fstdr() needs to be fixed.

  • Don’t forget to serialize (private) member data in new implementations.

  • To compare differences between two serializations, paste each into a file and replace the spaces with line breaks using “s/ /\r/g”

File output

  • comma-separated values (CSV) are the preferred format (e.g., comma delimiter)

For quick reference

  • line counts [ find . -name '*.cpp' -o -name '*.h' | xargs wc -l | sort -n ]

  • tutorial errors [ find . -name 'tutorial_failures.txt' | xargs cat ]

  • clean docs before running depend.py again [ for dir in `ls --color=never -d *`; do rm $dir/doc/*rst; done ]

  • find difference in serialization string: [ diff -u f1 f2 |colordiff | perl /usr/share/doc/git/contrib/diff-highlight/diff-highlight | more ]

To Do List

  • implement gibbs ensemble

  • find a better way for two different classes to take from the same argument list and still maintain unused checks.

  • Make utils:lj,spce,etc derived classes of System ?

  • benchmark feasst vs simple hardcoded LJ simulations. Create benchmarking profile to compare among versions

  • ideal gas as the first tutorial/testcase

  • specify units in LMP data files?

  • fix dependency linkers required by clang/cmake on macOS but not g++ on ubuntu

  • consider optimization of Ewald: init ewald storage on particle types, precompute property index.

  • when selecting from cpdf, use lnp instead of p?

  • insert optimization: update cell list of sites when added, but of domain only when finalized.

  • If using argtype for a custom object, consider single-string constructors. E.g., for position in cylinder.h, use {"point0", "0 0 0"}

  • Python debug script: easy for user to run gdb/valgrind

  • Toggle more debug levels, and localized to certain files/plugins, etc

  • force precompute when reinitializing system, criteria, etc in MonteCarlo

  • MonteCarlo subclass Simulation

  • swig python wrap arguments std::istream and std::ostream for serialization

  • add citations to tutorials (reweighting, etc) and also citation suggestions for MC objects

  • VisitModels may prefer to update select properties (e.g., cell, eik)

  • Jupyter notebook output should go to cells, not terminal that runs jupyter.

  • lint file_[xyz,lmp]

  • regrow but within near existing, for ‘free dof, e.g. azimuthal in angle, sphere in bond, etc’

  • put cell list in finalize-heavy paradigm, update_positions updates cell of selection, finalize updates entire cell list. linked list

  • config could use revert,finalize to update cell list only on finalization, and maybe not have to exclude from cell properties (why exclude?). same with ewald

  • Refactor arguments so that they can be checked for usage (especially in Trials)

  • Rename TrialSelect->SelectTrial, TrialCompute->ComputeTrial. Rename Compute->Decide?.

  • Somehow, trial_growth_expanded.h doesn’t include debug.h but can compile with ASSERT

  • Speed up RNG by maintaining int_distribution like dis_double

  • Document utils lj, spce, rpm in tutorials

  • Add a FAQ for common compile errors: “no known conversion from brace-enclosed initializer list to const argtype&” often means that a parameter was not converted to a string.

  • Make a CachedRandom and CachedPotential for prefetch and avoid if statements that could slow down serial simulations.

  • remove tutorial/fh.py

  • Tuner->Tune

  • Analyze/ModifyFactory optimization: use steps_per in factory to limit number of checks

  • implement timer for profiles (with hierarchies by class… tried this, but it’s too slow. Time only infrequently?)

  • implement a timer to auto-balance trial weights based on cpu time.

  • More documentation/tutorial on how to create your own plugins, classes, etc

  • add orientation argument to shapes with internal coordinate transformation

  • System should track current energy of every potential for analysis (Criteria running energies may contain a part from each potential to simplify debugging).

  • Consider using new state instead of old state in acceptance derivations

  • Sort selection_of_all, or impose sorting in Select::add_particles. Currently, this leads to issues.

  • Rename data and xyz files, document them more clearly (second line in xyz, and error if data not read correctly).

  • Make ModelTwoBodyTable that tabulates interaction from min(hs)-max(rc) distance for each distinct pair of site types, and can easily be added as optimized Potential

  • Rename plugin chain->config_bias ?

  • maybe mc.add(criteria) is preferable to mc.set. Same with sys?

  • in optimizing where config only updates when trial finalized, how to build off new perturbed config in CB?