ZenoWrapper

zenowrapper.main.ZenoWrapper

class zenowrapper.main.ZenoWrapper(atom_group: AtomGroup, type_radii: dict[str, float] | None = None, n_walks: int | None = 100000, min_n_walks: int | None = None, n_interior_samples: int | None = 10000, min_n_interior_samples: int | None = None, max_rsd_capacitance: float | None = None, max_rsd_polarizability: float | None = None, max_rsd_volume: float | None = None, max_run_time: float | None = None, num_threads: int = 1, temperature: float | None = None, size_scaling_factor: float = 1.0, launch_radius: float | None = None, skin_thickness: float | None = None, mass: float | None = None, viscosity: float | None = None, buoyancy_factor: float | None = None, temperature_units: str = 'K', viscosity_units: str = 'p', length_units: str = 'A', mass_units: str = 'g', seed: int = -1, verbose: bool = False, **kwargs)[source]

Bases: AnalysisBase

Compute hydrodynamic and geometric properties via ZENO Monte Carlo methods.

This analysis class wraps the NIST ZENO software to calculate physical properties of molecular structures from MD trajectories. It uses two Monte Carlo algorithms: Walk-on-Spheres (exterior) for electrical/hydrodynamic properties and Interior Sampling for geometric properties.

The electrical properties are converted to hydrodynamic properties via the electrostatic-hydrodynamic analogy. All properties include rigorous statistical uncertainties from variance estimation.

Two-Level Parallelization

ZenoWrapper supports parallelization at two independent levels:

  1. Frame-level parallelism (MDAnalysis): Distribute trajectory frames across multiple Python processes using the backend parameter in run(). Supported backends: 'serial', 'multiprocessing', 'dask'.

  2. Within-frame parallelism (ZENO C++): Parallelize Monte Carlo walks within each frame using the num_threads parameter. ZENO uses C++ std::thread for efficient shared-memory parallelism.

These can be combined for maximum performance. For example, with 16 CPU cores: run(backend='multiprocessing', n_workers=4) with num_threads=4 uses all 16 cores (4 processes × 4 threads per process).

Note

Frame-level parallelization requires trajectory readers that support random access and pickling (most file-based readers). Streaming readers like IMDReader only support serial analysis with within-frame threading.

Parameters:
  • atom_group (MDAnalysis.core.groups.AtomGroup) – Atoms to analyze (e.g., protein, polymer, nanoparticle)

  • type_radii (dict) – Mapping of atom types to VdW radii in length_units, e.g., {'C': 1.7, 'N': 1.55, 'O': 1.52}. Required for all atom types present.

  • n_walks (int, optional) – Number of random walks for exterior calculation (default: 100000). Higher values improve accuracy but increase computation time.

  • min_n_walks (int, optional) – Minimum walks before convergence check. If None, runs exactly n_walks.

  • n_interior_samples (int, optional) – Number of sample points for interior calculation (default: 10000). Higher values improve volume/gyration accuracy.

  • min_n_interior_samples (int, optional) – Minimum samples before convergence check. If None, runs exactly n_interior_samples.

  • max_rsd_capacitance (float, optional) – Target relative standard deviation for capacitance (e.g., 0.01 for 1%). Computation stops early if reached.

  • max_rsd_polarizability (float, optional) – Target relative standard deviation for polarizability.

  • max_rsd_volume (float, optional) – Target relative standard deviation for volume.

  • max_run_time (float, optional) – Maximum computation time in seconds per frame.

  • temperature (float, optional) – Temperature for diffusion coefficient calculation. Required if diffusion coefficient is needed.

  • size_scaling_factor (float, optional) – Scale all radii by this factor (default: 1.0)

  • launch_radius (float, optional) – Radius of sphere enclosing molecule. Auto-computed if None.

  • skin_thickness (float, optional) – Cutoff distance for Walk-on-Spheres termination. Auto-set if None.

  • mass (float, optional) – Molecular mass for intrinsic viscosity with mass units and sedimentation coefficient. Specify in mass_units.

  • viscosity (float, optional) – Solvent viscosity for friction/diffusion/sedimentation coefficients. Specify in viscosity_units (e.g., 0.01 poise for water at 20°C).

  • buoyancy_factor (float, optional) – Buoyancy factor for sedimentation coefficient (typically 0.72-0.73 for proteins in water).

  • temperature_units (str, optional) – ‘K’ (Kelvin) or ‘C’ (Celsius), default ‘K’

  • viscosity_units (str, optional) – ‘p’ (poise) or ‘cp’ (centipoise), default ‘p’

  • length_units (str, optional) – ‘m’, ‘cm’, ‘nm’, ‘A’ (Angstrom), or ‘L’ (arbitrary), default ‘A’

  • mass_units (str, optional) – ‘Da’, ‘kDa’, ‘g’, or ‘kg’, default ‘g’

  • seed (int, optional) – Random seed for reproducibility. Default -1 (random seed).

  • verbose (bool, optional) – Print progress information (default: False)

  • **kwargs – Additional arguments passed to MDAnalysis.analysis.base.AnalysisBase

Variables:
  • universe (MDAnalysis.core.universe.Universe) – Universe containing the trajectory

  • atom_group (MDAnalysis.core.groups.AtomGroup) – Atoms being analyzed

  • type_radii (numpy.ndarray) – Array of radii for each atom in atom_group

  • results (MDAnalysis.analysis.base.Results) –

    Container for computed properties. Each property is a Property object with .values and .variance arrays indexed by frame. Available properties depend on input parameters:

    Always computed:
    • capacitance : Molecular capacitance (length units)

    • electric_polarizability_tensor : 3×3 polarizability tensor (length³)

    • electric_polarizability_eigenvalues : Eigenvalues of tensor (length³)

    • electric_polarizability : Mean polarizability (length³)

    • intrinsic_conductivity : Dimensionless conductivity

    • volume : Molecular volume (length³)

    • gyration_tensor : 3×3 gyration tensor (length²)

    • gyration_eigenvalues : Eigenvalues of gyration tensor (length²)

    • capacitance_same_volume_sphere : Equivalent sphere radius (length)

    • hydrodynamic_radius : Effective radius in solution (length)

    • viscometric_radius : Radius from viscosity (length)

    • intrinsic_viscosity : Dimensionless intrinsic viscosity

    • prefactor_polarizability2intrinsic_viscosity : Shape factor

    Requires viscosity:
    • friction_coefficient : Resistance to motion (mass/time)

    Requires temperature and viscosity:
    • diffusion_coefficient : Translational diffusion (length²/time)

    Requires mass, buoyancy_factor, and viscosity:
    • sedimentation_coefficient : Sedimentation rate (time)

    Requires mass:
    • intrinsic_viscosity_mass : Intrinsic viscosity (length³/mass)

  • n_frames (int) – Number of frames analyzed

  • times (numpy.ndarray) – Timestamps of analyzed frames

  • frames (numpy.ndarray) – Indices of analyzed frames

Notes

Parallelization Strategy

For optimal performance, consider your workload characteristics:

  • Many frames (>100), fast computation: Use frame-level parallelism (backend='multiprocessing') with num_threads=1 or small values

  • Few frames (<20), expensive computation: Use within-frame parallelism with higher num_threads and backend='serial'

  • Balanced workload: Use hybrid approach, e.g., n_workers=4 with num_threads=4 on a 16-core machine

Memory usage scales with n_workers (each worker loads a copy of the trajectory), while num_threads shares memory within each worker.

Examples

Basic usage computing hydrodynamic radius:

>>> import MDAnalysis as mda
>>> from zenowrapper import ZenoWrapper
>>> u = mda.Universe('protein.pdb', 'traj.dcd')
>>> type_radii = {'C': 1.7, 'N': 1.55, 'O': 1.52, 'S': 1.8}
>>> zeno = ZenoWrapper(u.select_atoms('protein'), type_radii=type_radii)
>>> zeno.run(start=0, stop=100, step=10)
>>> print(zeno.results.hydrodynamic_radius.values.mean())

Computing diffusion coefficient:

>>> zeno = ZenoWrapper(
...     u.atoms,
...     type_radii=type_radii,
...     temperature=298.15,  # K
...     viscosity=0.01,      # poise (water at 20°C)
...     temperature_units='K',
...     viscosity_units='p'
... )
>>> zeno.run()
>>> D = zeno.results.diffusion_coefficient.values
>>> D_mean = D.mean()
>>> D_std = np.sqrt(zeno.results.diffusion_coefficient.variance.mean())

Using frame-level parallelization for long trajectories:

>>> # Process 1000 frames using 8 workers, single-threaded per frame
>>> zeno = ZenoWrapper(u.atoms, type_radii=type_radii, num_threads=1)
>>> zeno.run(backend='multiprocessing', n_workers=8)

Using within-frame parallelization for expensive computation:

>>> # Process few frames with 16-threaded ZENO per frame
>>> zeno = ZenoWrapper(
...     u.atoms,
...     type_radii=type_radii,
...     n_walks=10000000,  # 10M walks = expensive!
...     num_threads=16
... )
>>> zeno.run(backend='serial')

Using hybrid two-level parallelization:

>>> # Balance: 4 workers × 4 threads = 16 cores total
>>> zeno = ZenoWrapper(u.atoms, type_radii=type_radii, num_threads=4)
>>> zeno.run(backend='multiprocessing', n_workers=4)

See also

MDAnalysis.analysis.base.AnalysisBase

Base class with run() method

Property

Container for results with values and uncertainties

Notes

  • Computation time scales linearly with n_walks and n_interior_samples

  • Statistical uncertainty decreases as 1/sqrt(N) for N samples

  • The electrostatic-hydrodynamic analogy relates capacitance C to hydrodynamic radius: R_h = C (exact for spheres, approximate for other shapes)

  • All properties include variance estimates via propagation of uncertainties

  • Setting a fixed seed ensures reproducible results

References

classmethod get_supported_backends()[source]

Return the supported parallel backends for this analysis.

ZenoWrapper supports all standard MDAnalysis parallel backends since each frame’s analysis is completely independent and results can be concatenated across frames.

Returns:

Supported backend names: (‘serial’, ‘multiprocessing’, ‘dask’)

Return type:

tuple of str

See also

Parallel analysis

MDAnalysis parallel analysis documentation

Notes

Even when using parallel backends, each worker will use num_threads for within-frame ZENO parallelization, giving two-level parallelism.

Added in version 3.0.0.

property parallelizable

Boolean mark showing that a given class can be parallelizable with split-apply-combine procedure. Namely, if we can safely distribute _single_frame() to multiple workers and then combine them with a proper _conclude() call. If set to False, no backends except for serial are supported.

Note

If you want to check parallelizability of the whole class, without explicitly creating an instance of the class, see _analysis_algorithm_is_parallelizable. Note that you setting it to other value will break things if the algorithm behind the analysis is not trivially parallelizable.

Returns:

if a given AnalysisBase subclass instance is parallelizable with split-apply-combine, or not

Return type:

bool

Added in version 2.8.0.

run(start: int = None, stop: int = None, step: int = None, frames: Iterable = None, verbose: bool = None, n_workers: int = None, n_parts: int = None, backend: str | BackendBase = None, *, unsupported_backend: bool = False, progressbar_kwargs=None)

Perform the calculation

Parameters:
  • start (int, optional) – start frame of analysis

  • stop (int, optional) – stop frame of analysis

  • step (int, optional) – number of frames to skip between each analysed frame

  • frames (array_like, optional) – array of integers or booleans to slice trajectory; frames can only be used instead of start, stop, and step. Setting both frames and at least one of start, stop, step to a non-default value will raise a ValueError.

    Added in version 2.2.0.

  • verbose (bool, optional) – Turn on verbosity

  • progressbar_kwargs (dict, optional) – ProgressBar keywords with custom parameters regarding progress bar position, etc; see MDAnalysis.lib.log.ProgressBar for full list. Available only for backend='serial'

  • backend (Union[str, BackendBase], optional) – By default, performs calculations in a serial fashion. Otherwise, user can choose a backend: str is matched to a builtin backend (one of serial, multiprocessing and dask), or a MDAnalysis.analysis.results.BackendBase subclass.

    Added in version 2.8.0.

  • n_workers (int) – positive integer with number of workers (processes, in case of built-in backends) to split the work between

    Added in version 2.8.0.

  • n_parts (int, optional) – number of parts to split computations across. Can be more than number of workers.

    Added in version 2.8.0.

  • unsupported_backend (bool, optional) – if you want to run your custom backend on a parallelizable class that has not been tested by developers, by default False

    Added in version 2.8.0.

Changed in version 2.2.0: Added ability to analyze arbitrary frames by passing a list of frame indices in the frames keyword argument.

Changed in version 2.5.0: Add progressbar_kwargs parameter, allowing to modify description, position etc of tqdm progressbars

Changed in version 2.8.0: Introduced backend, n_workers, n_parts and unsupported_backend keywords, and refactored the method logic to support parallelizable execution.