Data model

OVITO organizes the information it processes into data objects, each representing a specific fragment of a dataset. For example, a dataset may be composed of a SimulationCell object holding the box dimensions and boundary conditions, a Particles object storing information associated with the particles, and a Bonds sub-object storing the list of bonds between particles. For each type of data object you will find a corresponding Python class in the ovito.data module. All of them derive from one common base class: DataObject.

Data objects can contain other data objects, forming a nested structure with parent-child relationships. For example, the Particles object is a container, which manages a number of Property objects, each being an array of property values associated with the particles. Furthermore, the Particles object can also contain a Bonds object, which in turn is a container for the Property objects storing the per-bond property values:

[Figure: data_objects.svg, the hierarchy of nested data objects]

At the topmost level of this hierarchy of nested objects is always the DataCollection class. It is the fundamental unit representing a complete dataset that was loaded from one or more input simulation files, and which then gets processed by the modifier steps of a data pipeline. Modifiers may alter individual data objects within the DataCollection, add new data objects to the top-level collection, or create additional sub-objects in nested container objects.

When you call the Pipeline.compute() method, you receive back a DataCollection holding the computation results of the pipeline. The DataCollection class provides various property fields for accessing the different kinds of sub-objects it contains.

Note that a DataCollection object represents a single animation frame, not an entire simulation trajectory. In OVITO’s data model, a simulation trajectory is therefore represented as a series of DataCollection instances. A data pipeline operates on and produces only one DataCollection at a time, i.e., it works on a frame-by-frame basis.
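The frame-by-frame model can be sketched as a small generator that requests one DataCollection per frame. This is a minimal sketch, assuming the pipeline's source reports the trajectory length through a num_frames attribute and that compute() accepts a frame index; iter_frames is a hypothetical helper name and not part of OVITO.

```python
def iter_frames(pipeline):
    """Yield (frame_index, data) pairs, evaluating the pipeline one frame at a time.

    Assumption: `pipeline.source.num_frames` gives the number of trajectory
    frames and `pipeline.compute(frame)` returns the data for that frame.
    """
    for frame in range(pipeline.source.num_frames):
        # Every call produces a separate result object for exactly one frame.
        yield frame, pipeline.compute(frame)
```

With a pipeline object at hand, a loop such as `for frame, data in iter_frames(pipeline): print(frame, data.particles.count)` would then visit the trajectory one DataCollection at a time.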

Particles

The Particles data object, which is accessible via the DataCollection.particles field, holds all particle- or molecule-related data. OVITO uses a property-centered representation of particles, where information is stored as a set of uniform memory arrays, all of the same length. Each array represents one particle property such as position, type, mass, color, etc., and holds the values for all N particles in the system. A property data array is an instance of the Property data object class, which OVITO uses not only for storing particle properties but also for bond properties, voxel grid properties, and more.

Thus, a system of particles is nothing more than a loose collection of Property objects held together by a container, the Particles object, which is a specialization of the generic PropertyContainer base class. Each particle property has a unique name that identifies its meaning. OVITO defines a set of standard property names, which have a specific meaning to the program and a prescribed data format. The Position standard property, for example, holds the XYZ coordinates of all particles and is mandatory. Other standard properties, such as Color or Mass, are optional and may or may not be present in a Particles container. Furthermore, Property objects with non-standard names are supported, representing user-defined particle properties.

[Figure: particles_object.svg, a Particles container and its Property arrays]

The Particles container object mimics the programming interface of a Python dictionary, which lets you look up properties by name. To find out which properties are present, you can query the dictionary for its keys:

>>> data = pipeline.compute()
>>> list(data.particles.keys())
['Particle Identifier', 'Particle Type', 'Position', 'Color']

Individual particle properties can be looked up by their name:

>>> color_property = data.particles['Color']

Some standard properties can also be accessed through convenient getter fields, for example the Particles.colors field:

>>> color_property = data.particles.colors

The Particles class is a sub-class of the generic PropertyContainer base class. OVITO defines several property container types, such as the Bonds, DataTable and VoxelGrid types, which all work like the Particles type. They all have in common that they represent an array of uniform data elements, which may be associated with an arbitrary set of properties.
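To make the property-centered layout concrete, here is a toy model written in plain Python. It is a sketch of the idea only and not OVITO's implementation: ToyPropertyContainer and add_property are hypothetical names, and plain lists stand in for the property arrays. The point it illustrates is the invariant described above: every property holds exactly one value per data element, and properties are looked up by name.

```python
class ToyPropertyContainer:
    """Toy stand-in for a property container; NOT OVITO's implementation."""

    def __init__(self, count):
        self.count = count      # number of data elements (e.g. particles)
        self._properties = {}   # property name -> list of per-element values

    def add_property(self, name, values):
        values = list(values)
        # All property arrays must have the same length as the container.
        if len(values) != self.count:
            raise ValueError(
                f"property {name!r} has {len(values)} values, expected {self.count}"
            )
        self._properties[name] = values

    def keys(self):
        return self._properties.keys()

    def __getitem__(self, name):
        return self._properties[name]


particles = ToyPropertyContainer(count=3)
particles.add_property("Position", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
particles.add_property("Mass", [1.0, 1.0, 16.0])
print(list(particles.keys()))  # ['Position', 'Mass']
```

Adding a property whose length differs from count raises a ValueError in this toy model, mirroring the rule that all arrays in a container share the same length.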

Property objects

A PropertyContainer manages a variable set of Property objects, each Property storing the values of one particular property for all data elements in an array. A Property object behaves much like a standard NumPy array:

>>> coordinates = data.particles.positions

>>> print(coordinates[...])
[[ 73.24230194  -5.77583981  -0.87618297]
 [-49.00170135 -35.47610092 -27.92519951]
 [-50.36349869 -39.02569962 -25.61310005]
 ...,
 [ 42.71210098  59.44919968  38.6432991 ]
 [ 42.9917984   63.53770065  36.33330154]
 [ 44.17670059  61.49860001  37.5401001 ]]

Property arrays can be one-dimensional (in case of scalar properties) or two-dimensional (in case of vector properties). The size of the first array dimension is always equal to the number of data elements (e.g. particles) stored in the parent PropertyContainer. The container reports the current number of elements via its count attribute:

>>> data.particles.count  # Number of particles
28655
>>> data.particles['Mass'].shape   # 1-dim. array
(28655,)
>>> data.particles['Color'].shape  # 2-dim. array
(28655, 3)
>>> data.particles['Color'].dtype  # Data type of property array
float64

OVITO currently supports three different numeric data types for property arrays: float64, int32 and int64. For built-in standard properties the data type and the dimensionality are prescribed by OVITO. For user-defined properties they can be chosen by the user when creating a new property.
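The shape and data type queries shown above can be collected into one summary per container. The following is a minimal sketch that relies only on the dictionary-style keys() lookup and the NumPy-style shape and dtype attributes described in this section; describe_properties is a hypothetical helper name, not an OVITO function.

```python
def describe_properties(container):
    """Return {name: (shape, dtype_name)} for every property in a container.

    Works with any dictionary-like object whose values expose NumPy-style
    `shape` and `dtype` attributes.
    """
    summary = {}
    for name in container.keys():
        prop = container[name]
        # 1-dim. shape -> scalar property, 2-dim. shape -> vector property.
        summary[name] = (tuple(prop.shape), str(prop.dtype))
    return summary
```

Called as describe_properties(data.particles), this would be expected to map each property name to a pair such as a one-dimensional shape for Mass and a two-dimensional shape for Color.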

Global attributes

Global attributes are simple tokens of information associated with a DataCollection as a whole, organized as key-value pairs in the Python dictionary DataCollection.attributes. File readers automatically generate certain global attributes at the source of a data pipeline to associate the imported dataset with relevant information such as the current simulation timestep number or the name of the input file. In the graphical user interface of OVITO you can inspect the current set of global attributes by opening the Data Inspector panel.

Modifiers in a data pipeline may associate a DataCollection with additional attributes to report their computation results. For example, the ClusterAnalysisModifier outputs the attribute named ClusterAnalysis.cluster_count, which reflects the total number of particle clusters found by the clustering algorithm at the current timestep. OVITO provides functions to export such attributes to an output text file, or to embed them in rendered images and animations as a dynamic TextLabelOverlay.

Please refer to the DataCollection.attributes documentation for a more extensive overview of how to work with global attributes.
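Because global attributes are plain key-value pairs, listing them takes only a few lines. The sketch below assumes the attributes object supports items() like a regular Python dictionary; format_attributes is a hypothetical helper name, and the example values are stand-ins for illustration rather than output from a real pipeline.

```python
def format_attributes(attributes):
    """Return one 'name = value' line per global attribute, sorted by name."""
    return [f"{name} = {value!r}" for name, value in sorted(attributes.items())]


# Stand-in values for illustration only; a real pipeline supplies its own.
example = {"Timestep": 5000, "ClusterAnalysis.cluster_count": 12}
for line in format_attributes(example):
    print(line)
```

In a script, the same helper could be applied to data.attributes after calling Pipeline.compute().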

Data tables

Tabulated data is represented in OVITO by DataTable objects, which are a specialized type of PropertyContainer. A DataTable consists of a variable number of rows and columns. Each column is an instance of the Property class. Data tables are typically generated dynamically by modifiers performing computations, for example the HistogramModifier or the CommonNeighborAnalysisModifier, to store their results. In the graphical user interface of OVITO, data tables are rendered as graphs or charts (line, scatter, histogram plots), found in the data inspector panel.

In Python, all data tables generated by the modifiers in the current pipeline can be accessed from the DataCollection.tables dictionary returned by the pipeline. Each table has a unique identifier string, which serves as lookup key in that dictionary. Furthermore, OVITO provides export functions for writing data tables to an output text file. Please see the DataTable class for further details.

Surface meshes

OVITO can import or generate surface mesh data structures for visualization purposes and other applications. For instance, the ConstructSurfaceModifier can be inserted into a data pipeline to construct a triangulated surface mesh representing the spatial region filled with particles. The output of this modifier is a SurfaceMesh object, which holds the vertices and faces of the mesh. See also the corresponding section in the user manual.

In Python, surface meshes generated by the modifiers in the current pipeline can be accessed from the DataCollection.surfaces dictionary returned by the pipeline. Each surface mesh has a unique identifier string, which serves as lookup key in that dictionary. Please see the SurfaceMesh class for further details.
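Both DataCollection.tables and DataCollection.surfaces are keyed by identifier strings, so a mistyped identifier is a common source of errors. The following sketch wraps the lookup and reports the identifiers that are actually present. It assumes the dictionaries raise KeyError for unknown keys, as a regular Python dictionary does; get_by_identifier is a hypothetical helper name, not part of OVITO.

```python
def get_by_identifier(objects, identifier):
    """Look up a data object by identifier, listing valid identifiers on failure."""
    try:
        return objects[identifier]
    except KeyError:
        # Show what is available so a mistyped identifier is easy to spot.
        available = ", ".join(sorted(objects.keys())) or "<none>"
        raise KeyError(
            f"no object {identifier!r}; available identifiers: {available}"
        ) from None
```

For example, get_by_identifier(data.tables, some_identifier) would either return the requested DataTable or fail with a message naming the tables the pipeline did produce.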
