Grand canonical ensemble transition-matrix Monte Carlo

In this example, flat histogram methods are employed for a small macrostate range of 0 to 5 particles. To begin, the system is initialized with the minimum number of particles by setting Metropolis acceptance criteria with conditions favorable for adding particles. The Metropolis criteria are then replaced with the flat histogram criteria. At this point, the typical analyses from the previous tutorials are added. In addition, checkpoint files, the criteria status, and the average energy of each macrostate are also output. Finally, the simulation is run until the requested number of iterations of the flat histogram algorithm is complete.
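Because the minimum macrostate in this example is 0 particles, that initialization step is trivial and does not appear explicitly in the code below. For a larger minimum, a minimal sketch of the idea, written as a hypothetical helper named seek_minimum and assuming fst.MakeMetropolis and MonteCarlo.attempt behave as in other FEASST tutorials, might look like the following:

def seek_minimum(monte_carlo, minimum_num_particles, beta):
    """Sketch: fill the system to the minimum macrostate under Metropolis criteria
    with a favorable (large) chemical potential, before the flat histogram criteria
    replace the Metropolis criteria."""
    monte_carlo.set(fst.MakeThermoParams(fst.args(
        {"beta": str(beta), "chemical_potential": "10"})))  # a large mu favors insertions
    monte_carlo.set(fst.MakeMetropolis())
    while monte_carlo.configuration().num_particles() < minimum_num_particles:
        monte_carlo.attempt(1)  # attempt one trial at a time until the minimum is reached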

A small macrostate range allows the simulation to run quickly with good sampling, which makes it an ideal starting point for testing. To begin, read the published SRSW reference values from file for comparison.

[1]:
import pandas as pd
import feasst as fst

srsw_file = fst.install_dir() + "/plugin/flat_histogram/test/data/stat150.csv"
ln_prob_srsw = fst.LnProbability(pd.read_csv(srsw_file)["lnPI"].values[:6])
ln_prob_srsw.normalize() # normalize to account for a smaller macrostate range
df = pd.DataFrame(data=ln_prob_srsw.values(), columns=["ln_prob_srsw"])
df['ln_prob_srsw_std'] = 0.04
df['u_srsw'] = pd.read_csv(srsw_file)["energy"]
df['u_srsw_std'] = pd.read_csv(srsw_file)["energystd"]
df
[1]:
ln_prob_srsw ln_prob_srsw_std u_srsw u_srsw_std
0 -18.707570 0.04 -2.312265e-10 6.689238e-10
1 -14.037373 0.04 -6.057402e-04 6.709198e-10
2 -10.050312 0.04 -3.057422e-02 9.649147e-06
3 -6.458921 0.04 -8.992832e-02 1.387472e-04
4 -3.145637 0.04 -1.784571e-01 3.315245e-05
5 -0.045677 0.04 -2.961920e-01 1.348791e-05
[2]:
import unittest
import pyfeasst

def run_sample_lj_tm_mc(checkpoint_file_name):
    monte_carlo = fst.MonteCarlo()
    monte_carlo.set(fst.lennard_jones(fst.args({"cubic_box_length": "8"})))
    monte_carlo.set(fst.MakeThermoParams(fst.args({"beta": str(1./1.5), "chemical_potential": "-2.352321"})))
    monte_carlo.set(fst.MakeFlatHistogram(
        fst.MakeMacrostateNumParticles(fst.Histogram(fst.args({"width": "1", "min": "0", "max": "5"}))),
        fst.MakeTransitionMatrix(fst.args({"min_sweeps": "50"}))))
    monte_carlo.add(fst.MakeTrialTranslate(fst.args({"weight": "0.25", "tunable_param": "1."})))
    monte_carlo.add(fst.MakeTrialTransfer(fst.args({"weight": "1", "particle_type": "0"})))
    monte_carlo.add(fst.MakeLogAndMovie(fst.args({"steps_per": str(1e5), "file_name": "lj"})))
    monte_carlo.add(fst.MakeCheckEnergyAndTune(fst.args({"steps_per": str(1e5), "tolerance": str(1e-8)})))
    monte_carlo.add(fst.MakeCriteriaUpdater(fst.args({"steps_per": str(1e5)})))
    monte_carlo.add(fst.MakeCriteriaWriter(fst.args({"steps_per": str(1e5), "file_name": "lj_fh.txt"})))
    monte_carlo.add(fst.MakeEnergy(fst.args({"file_name": "lj_en.txt", "steps_per_update": "1",
        "steps_per_write": str(1e5), "multistate": "true"})))
    monte_carlo.set(fst.MakeCheckpoint(fst.args({"file_name": checkpoint_file_name, "num_hours": "0.0001"})))
    monte_carlo.run_until_complete()

class TestFlatHistogramLJ(unittest.TestCase):
    """Test flat histogram grand canonical ensemble Monte Carlo simulations"""
    def test_serial_5max(self):
        """Compare the free energies and potential energies with the NIST SRSW
        https://www.nist.gov/programs-projects/nist-standard-reference-simulation-website
        https://mmlapps.nist.gov/srs/LJ_PURE/eostmmc.htm
        """
        # To emulate post-processing, obtain monte_carlo from checkpoint file
        checkpoint_file_name = 'checkpoint.txt'
        run_sample_lj_tm_mc(checkpoint_file_name)
        monte_carlo = fst.MonteCarlo().deserialize(pyfeasst.read_checkpoint(checkpoint_file_name))

        # To compare with previous values, make a deep copy of the FlatHistogram derived class
        criteria = fst.FlatHistogram(monte_carlo.criteria())
        print('lnpi energy')
        for macro in range(criteria.num_states()):
            self.assertAlmostEqual(
                df["ln_prob_srsw"][macro],
                criteria.bias().ln_prob().value(macro),
                delta=df["ln_prob_srsw_std"][macro])
            energy_analyzer = monte_carlo.analyze(monte_carlo.num_analyzers() - 1)
            energy_accumulator = energy_analyzer.analyze(macro).accumulator()
            stdev = (df["u_srsw_std"][macro]**2 + energy_accumulator.block_stdev()**2)**(1./2.)
            #print(criteria.bias().ln_prob().value(macro), energy_accumulator.average())
            self.assertAlmostEqual(
                df["u_srsw"][macro],
                energy_accumulator.average(),
                delta=5*stdev)
[3]:
# Note: any line starting with % is only to be used with ipynb
%time
unittest.main(argv=[''], verbosity=2, exit=False)
test_serial_5max (__main__.TestFlatHistogramLJ)
Compare the free energies and potential energies with the NIST SRSW ...
lnpi energy
CPU times: user 8.97 s, sys: 1.53 s, total: 10.5 s
Wall time: 10.5 s
ok

----------------------------------------------------------------------
Ran 1 test in 10.458s

OK
[3]:
<unittest.main.TestProgram at 0x7f41d3758070>

A number of files should also have been created. If the flat histogram method is sampling perfectly, the simulation performs a random walk along the macrostates. For larger macrostate ranges, or for more difficult sampling cases, monitoring the macrostate can help you determine what conditions are preventing convergence. For example, the macrostate (the state column of the log file) as a function of the number of trial attempts may be obtained as follows:

[4]:
pd.read_csv("lj.txt", header=0).dropna(axis='columns')
[4]:
volume p0 beta state energy attempt TrialTranslate tunable TrialAdd TrialRemove
0 512 4 0.666667 4 -0.148161 100000 0.943354 1 0.0444889 0.0441788
1 512 5 0.666667 5 -0.0307635 200000 0.968788 1.05 0.538184 0.53477
2 512 5 0.666667 5 -0.0537466 300000 0.970127 1.1025 0.811549 0.956009
3 512 1 0.666667 1 -0.00060574 400000 0.967404 1.15763 0.812098 0.971658
4 512 4 0.666667 4 -0.00969184 500000 0.968405 1.21551 0.818027 0.97253
... ... ... ... ... ... ... ... ... ... ...
465 512 0 0.666667 0 -4.12506e-15 4700000 0.956725 3.92013 0.818701 0.97249
466 512 3 0.666667 3 -0.00545166 4800000 0.956511 3.92013 0.817026 0.973856
467 512 2 0.666667 2 -0.00966056 4900000 0.959734 3.92013 0.810314 0.97335
468 512 3 0.666667 3 -0.0857703 5000000 0.961697 3.92013 0.81506 0.97419
469 512 0 0.666667 0 -3.21954e-15 5100000 0.95887 3.92013 0.808285 0.973467

470 rows × 10 columns
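To visualize this random walk, the state column may be plotted against the attempt column, for example with matplotlib (a sketch; matplotlib is assumed to be available but is not otherwise used in this tutorial):

import matplotlib.pyplot as plt

# plot the sampled macrostate (state column) against the number of trial attempts
log = pd.read_csv("lj.txt", header=0).dropna(axis='columns')
plt.plot(log['attempt'], log['state'], marker='.', linestyle='none')
plt.xlabel('number of attempts')
plt.ylabel('state')
plt.show()

A healthy simulation wanders repeatedly across the full range of states rather than becoming stuck near one end of the range.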

Note that states are indexed by integer values starting from 0 (i.e., 0, 1, 2, …, criteria.num_states() - 1). The state and macrostate happen to coincide here because the minimum macrostate is 0 and the macrostate is the integer number of particles. If the minimum macrostate were 1, however, state 0 would correspond to macrostate 1.0. Obtain the macrostate value of an arbitrary state as follows.

[5]:
monte_carlo = fst.MonteCarlo().deserialize(pyfeasst.read_checkpoint("checkpoint.txt"))
criteria = fst.FlatHistogram(monte_carlo.criteria())
print('state macrostate')
for state in range(criteria.num_states()):
    print(state, criteria.macrostate().value(state))
state macrostate
0 0.0
1 1.0
2 2.0
3 3.0
4 4.0
5 5.0

Many simulation parameters may be obtained from the checkpoint file to automate your analysis.

[6]:
print('volume', monte_carlo.configuration().domain().volume())
print('beta', monte_carlo.thermo_params().beta())
print('beta_mu', monte_carlo.thermo_params().beta_mu())
print('macro_min', criteria.macrostate().value(0))  # monte_carlo.criteria() doesn't know the macrostate; use the copy of the derived class
print('macro_max', criteria.macrostate().value(criteria.num_states() - 1))
print('macro_max', criteria.macrostate().histogram().center_of_last_bin())
volume 512.0
beta 0.6666666666666666
beta_mu -1.5682139999999998
macro_min 0.0
macro_max 5.0
macro_max 5.0

The energy of each macrostate may also be compared with the published values in the NIST SRSW.

[7]:
en = pd.read_csv("lj_en.txt").rename(columns={"average": "u", "block_stdev": "u_std"})
pd.concat([pd.DataFrame(en[["u", "u_std"]]), df[["u_srsw", "u_srsw_std"]]], axis=1)
[7]:
u u_std u_srsw u_srsw_std
0 1.040058e-16 1.611033e-16 -2.312265e-10 6.689238e-10
1 -6.057400e-04 0.000000e+00 -6.057402e-04 6.709198e-10
2 -3.085999e-02 3.288205e-04 -3.057422e-02 9.649147e-06
3 -8.994438e-02 6.247232e-04 -8.992832e-02 1.387472e-04
4 -1.783068e-01 5.512742e-04 -1.784571e-01 3.315245e-05
5 -2.973719e-01 2.316644e-03 -2.961920e-01 1.348791e-05

You may also compare the natural logarithm of the macrostate probability distribution with the SRSW values.

[8]:
pd.concat([df["ln_prob_srsw"], pd.read_csv("lj_fh.txt", header=1)['ln_prob']], axis=1)
[8]:
ln_prob_srsw ln_prob
0 -18.707570 -18.702804
1 -14.037373 -14.035393
2 -10.050312 -10.047641
3 -6.458921 -6.457822
4 -3.145637 -3.145087
5 -0.045677 -0.045704

The macrostate probability distribution depends upon the choice of the chemical potential, but it can be reweighted to other chemical potentials. The collection matrix may be accessed directly as follows:

[9]:
tm = fst.TransitionMatrix(criteria.bias())
print(tm.collection().matrix())
((0.0, 490611.0, 326738.0), (3268.237449174398, 520559.0335592011, 346173.7289886871), (6150.134671388728, 498387.2676497138, 329032.59768084437), (8994.715736100925, 492255.9315743939, 324274.3526889978), (11893.55579635053, 495457.33304284315, 324504.11115981545), (16200.462783328334, 548042.2720933373, 357457.2651219352))
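As noted above, the distribution can be reweighted to a different chemical potential. The following numpy sketch reconstructs ln_prob from the collection matrix via detailed balance, which should closely reproduce the ln_prob column written by CriteriaWriter, and then reweights it; beta_mu_new is an arbitrary illustrative value that is not part of this tutorial:

import numpy as np

# rebuild ln_prob from the collection matrix via detailed balance:
# P(N) T(N -> N+1) = P(N+1) T(N+1 -> N)
col = np.array(tm.collection().matrix())
prob_up = col[:, 2] / col.sum(axis=1)    # estimated T(N -> N + 1)
prob_down = col[:, 0] / col.sum(axis=1)  # estimated T(N -> N - 1)
ln_prob = np.zeros(len(col))
for state in range(1, len(col)):
    ln_prob[state] = ln_prob[state - 1] + np.log(prob_up[state - 1] / prob_down[state])
ln_prob -= np.log(np.exp(ln_prob).sum())  # normalize so the probabilities sum to 1
print('ln_prob from collection matrix:', ln_prob)

# reweight: ln P'(N) = ln P(N) + N (beta mu' - beta mu), then renormalize
beta_mu_new = -3.  # hypothetical target value, for illustration only
num_particles = np.arange(len(ln_prob))
ln_prob_rw = ln_prob + num_particles * (beta_mu_new - monte_carlo.thermo_params().beta_mu())
ln_prob_rw -= np.log(np.exp(ln_prob_rw).sum())
print('reweighted ln_prob:', ln_prob_rw)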

Did this tutorial work as expected? Did you find any inconsistencies or have any comments? Please contact us. Any feedback is appreciated!