ModECI Model Description Format (MDF)
Click here for the full MDF documentation
Note: MDF is still in development! See the open issues related to the specification or go here to get in contact regarding MDF. The MDF format was first proposed following a meeting organised at Princeton in July 2019 by Russ Poldrack of the Center for Reproducible Neuroscience (CRN) at Stanford and the Brain Imaging Data Standard (BIDS) initiative. For more on the previous work in this area, see here.
Overview
MDF is an open source, community-supported standard and associated library of tools for expressing computational models in a form that allows them to be exchanged between diverse programming languages and execution environments. The overarching aim is to provide a common format for models across computational neuroscience, cognitive science and machine learning.
It consists of a specification for expressing models in serialized formats (currently JSON, YAML and BSON representations are supported, though others such as HDF5 are planned) and a set of Python tools for implementing a model described using MDF. The serialized formats can be used when importing a model into a supported target environment to execute it; and, conversely, when exporting a model built in a supported environment so that it can be re-used in other environments.
The MDF Python API can be used to create or load an MDF model for inspection and validation. It also includes a basic execution engine for simulating models in the format. However, this is not intended to provide an efficient, general-purpose simulation environment, nor is MDF intended as a programming language. Rather, the primary purpose of the Python API is to facilitate and validate the exchange of models between existing environments that serve different communities. Accordingly, these Python tools include bi-directional support for importing to and exporting from widely used programming environments in a range of disciplines, and for easily extending these to other environments.
Development
The implementation and dissemination of the MDF language and associated tools are being carried out by the Model Exchange and Convergence Initiative (ModECI), which has been supported by the NSF Convergence Accelerator Program (Track D: AI-Driven Innovation via Data and Model Sharing), as a publicly accessible open-source project. The initial design has been informed by a series of workshops involving developers of key software environments and other stakeholders in machine learning, cognitive science and neuroscience. Future workshops will address broadening of support to other domains in basic and applied science and technology development (e.g., population biology, medical informatics, structural and environmental monitoring, and complex systems control). Environments for which support is currently being developed include PyTorch, ONNX, WebGME, NeuroML, PsyNeuLink, and ACT-R.
Fig 1: Some of the current and planned formats which MDF will interact with.
Successful interfacing of MDF to existing disciplinary standards (such as ONNX in machine learning, and NeuroML in neuroscience) as well as general-purpose simulation environments (such as WebGME) will permit bridging between these environments, and translation to the broader set of environments supported by those standards (such as TensorFlow and Keras in the case of ONNX, and The Virtual Brain and SONATA in the case of NeuroML). Initial investigations have also taken place, in collaboration with projects in the NSF Accelerator Track C (Quantum Technology), to use MDF for facilitating the implementation of computational models on quantum hardware.
The core elements of the MDF standard
Models: The highest-level construct in MDF is a model, which consists of one or more graphs and model attributes. The former describe the operational features of the model (its structure and execution), while the latter provide additional information (metadata) useful for executing, evaluating, testing or visualizing it.
Graphs: A graph specifies the structure and process flow of a model. The most fundamental element of a graph is a node, which specifies some unit of computation in terms of its parameters and functions. Nodes are connected to other nodes via directed edges, which, in the absence of additional conditions, define the computational flow of the model.
Nodes: These define the core elements of computation in a graph, which receive and transmit information via their input and output ports. In general, ports represent points of contact between a node and the edges that connect it to other nodes.
Output Ports: An output port is the starting point of the data transmission process. After processing the information in a node, an output port is used to begin the transmission of information to the next node through edges.
Edges: These transmit information from the output port of one node to the input port of another, collectively defining a graph’s topology. Edges may contain weights that can operate on the information they carry.
Input Ports: An input port is the endpoint of the data transmission process. It receives the information transmitted through an edge and inputs it to the next node for further processing.
Conditions: These are a core and distinctive element of the MDF specification, complementing other computational graph-based formats by providing a high-level set of descriptors for specifying conditional execution of nodes. This allows models with relatively complex execution requirements (e.g., containing cycles, branches, and/or temporal dependencies) to be expressed as graphs in a sufficiently abstract form that facilitates exchange among high-level modeling environments, without requiring that they be “lowered” to and then recovered from more elaborate procedural descriptions.
Parameters: Attributes that determine the configuration and operation of nodes and edges can be defined in MDF using parameters. In the case of parameters specifying large data structures (e.g., weight matrices), arrays in widely used formats (e.g. numpy arrays, TensorFlow tensors) can be used, and serialisation in portable binary formats (e.g. BSON) is supported. Parameters can either be fixed values, which don’t change when the node is executed, or can change over time (stateful parameters).
Functions: Each function yields a single value, evaluated from the values on input ports and from other functions and parameters. The key distinction from parameters is that a function is always stateless.
Model metadata: Metadata can be added to the model, graph, nodes and many of their sub-elements to provide additional information about that element. While the metadata should not be essential to the mathematical description of the behavior/structure of the element, it could be useful for human interpretability of its function/purpose, or used when it is mapped to a specific application for simulation/visualization. Metadata can be added to the top level model to specify contact information, citations, acknowledgements, pointers to sample data and benchmark results, and environments in which the specified model was originally implemented and any that have been validated to support its execution.
Fig 2: A simple graph with 3 nodes and 2 edges expressed in MDF.
Fig 3: This graph illustrates the ability to specify behavior that extends beyond the directed flow through the graph. Here, Node 1 generates a random number and transmits that number to Node 2. Node 2 will only run if the number it receives from Node 1 is greater than 10.
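The behaviour described for Fig 3 can be sketched in plain Python. This is only an illustration of the idea of a condition gating a node; it does not use the MDF library, and the function names, the 0 to 20 range of the random number and the doubling done by Node 2 are all assumptions made for the example.

```python
import random


def node_1(rng):
    """Node 1: generate a random number (the 0-20 range is an assumption)."""
    return rng.uniform(0, 20)


def node_2(value):
    """Node 2: a stand-in computation, applied only when its condition is met."""
    return value * 2


def run_graph(rng, threshold=10):
    """Run Node 1, then run Node 2 only if Node 1's output exceeds the threshold."""
    sent = node_1(rng)
    # The condition gates Node 2; the edge alone would always trigger it.
    received = node_2(sent) if sent > threshold else None
    return sent, received


sent, received = run_graph(random.Random(0))
print(sent, received)
```

Passing the random number generator in as an argument keeps the sketch deterministic under test; in MDF itself the gating would be expressed declaratively as a condition rather than as an `if` statement.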
Installation
Requirements
Requires Python >= 3.7
Quick start
pip install modeci-mdf
For more detailed installation instructions see here.
For guidelines on contributing to the development of MDF, see here.
Examples
To get started, follow the simple example in a Jupyter notebook here
Multiple examples of serialized MDF files, the Python scripts used to generate them, as well as mappings to target environments can be found here.
Quick Start Guide to MDF
This is a quick guide to the various parts of the ModECI Model Description Format (MDF) specification, API and examples.
Specification of MDF language
The specification for the language, including the core types Graph, Node, Edge etc. is available here.
Installation of Python API
There is a prototype implementation of an API (Application Programming Interface) in Python which can be used to build models in the MDF format, as well as save (serialize) the models in JSON, YAML and other formats. It also has an Execution Engine which can be used to execute/evaluate the models.
Use pip to install the latest version of MDF (plus dependencies) from PyPI:
pip install modeci_mdf
More details, and importantly, how to set up a virtual environment for the package, can be found here.
Examples of MDF
Simple examples
Some basic examples of models in MDF format which illustrate how a model can be 1) created using the Python API, 2) saved to JSON and YAML, 3) exported to graphical form and 4) executed to evaluate all parameters, can be found here.
A step-by-step guide to using MDF
This Jupyter notebook provides a step-by-step guide to creating, saving and executing an MDF model in Python.
More complex examples
An example of a simple Spiking Neuronal Network (SNN) can be found here.
Multiple examples of Convolutional Neural Network (CNN) models can be found in the PyTorch to MDF documentation.
An example of a Recurrent Neural Network (RNN) in MDF can be found here.
Export/import formats
Serialization formats
Whenever a model is exchanged between different environments, it is usually a serialized form of the model that is exported/imported. Python scripts can be used to generate MDF models (e.g. this), but the models themselves are saved in a standardized form, in either the text-based JSON or YAML formats or the binary BSON format.
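As a rough illustration of the export/import round trip, the sketch below serializes a small model description to JSON and reads it back using only the Python standard library. It does not use the MDF API; the nesting of the dictionary and the ids (`Simple`, `simple_example`, `input_node`, and so on) are assumptions modelled on the field names in the specification, not a guaranteed rendering of a real MDF file.

```python
import json

# Hypothetical model description: the layout is an illustrative assumption
# built from field names in the specification (format, graphs, nodes, edges).
model = {
    "Simple": {
        "format": "ModECI MDF v0.4",
        "graphs": {
            "simple_example": {
                "nodes": {
                    "input_node": {"parameters": {"input_level": {"value": 0.5}}},
                    "processing_node": {"parameters": {"slope": {"value": 2.0}}},
                },
                "edges": {
                    "input_edge": {
                        "sender": "input_node",
                        "receiver": "processing_node",
                        "sender_port": "out_port",
                        "receiver_port": "input_port1",
                    }
                },
            }
        },
    }
}

text = json.dumps(model, indent=4)  # export: Python objects -> JSON text
restored = json.loads(text)         # import: JSON text -> Python objects
print(text)
```

Because the serialized text carries the whole description, any environment that can parse JSON can in principle reconstruct the same structure.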
Currently supported environments
PyTorch
Models can be created in PyTorch and exported into MDF format, or MDF models can be converted to code which executes natively in PyTorch. See here for more details.
ONNX
ONNX (Open Neural Network Exchange) is an important format for exchanging models between machine learning environments. It is used in the MDF function ontology, and models in ONNX format can be exported to MDF. See here for more details. Converting MDF->ONNX is best enabled currently by converting the model to PyTorch and from there to ONNX.
NeuroML
Examples of converting MDF to/from NeuroML2/LEMS can be found here.
PsyNeuLink
An outline of interactions between PsyNeuLink and MDF can be found here.
Planned environments to support
ACT-R
We have started some preliminary interactions between ACT-R and MDF. See here for more details.
BIDS
The MDF format was first proposed following a meeting organised at Princeton in July 2019 by Russ Poldrack of the Center for Reproducible Neuroscience (CRN) at Stanford and the Brain Imaging Data Standard (BIDS) initiative. While the prototype Python API and MDF specification have been developed independently of the BIDS initiative (which focusses on exchange of neuroimaging data), there is interest in that community to allow MDF to be used as a way to encode models of neuronal activity, which can be embedded in BIDS datasets. The BIDS Extension Proposal Computational Models is a potential avenue for this.
Background to the ModECI Initiative
See here for details about the Model Exchange and Convergence Initiative (ModECI).
Paper introducing MDF
The background to the ModECI project, the motivation for developing the Model Description Format, and the initial Python implementation of the language have been described in a NeuroView article in the Neuron journal:
Integrating model development across computational neuroscience, cognitive science, and machine learning Padraig Gleeson, Sharon Crook, David Turner, Katherine Mantel, Mayank Raunak, Ted Willke and Jonathan D. Cohen, April 25, 2023 DOI: https://doi.org/10.1016/j.neuron.2023.03.037
Neuroscience, cognitive science, and computer science are increasingly benefiting through their interactions. This could be accelerated by direct sharing of computational models across disparate modeling software used in each. We describe a Model Description Format designed to meet this challenge.
The paper will be freely downloadable from here in April 2024. If you do not have access to this via your institution, please download the preprint of the paper here.
Installation
Requirements
Python >=3.7 is required. Support on Python 3.11 is limited, see this issue.
Installation using pip
Use pip to install the latest version of MDF (plus dependencies) from PyPI:
pip install modeci_mdf
Installation from source
To install the MDF package from source and run it locally:
1) Create a virtual environment (e.g. called mdf-env)
pip install virtualenv
virtualenv mdf-env
2) Activate the virtual environment
source mdf-env/bin/activate
3) Clone this repository
git clone https://github.com/ModECI/MDF.git
4) Change to the directory
cd MDF
5) Install the package
pip install .
Alternatively, to install MDF plus all of the modules required for the export/import interfaces (e.g. PsyNeuLink, NeuroML):
pip install .[all]
Additional dependencies
To generate graph images in MDF you need the graphviz Python package together with Graphviz itself, which provides the dot executable.
pip install graphviz
To render the generated DOT source code, you also need to install Graphviz (download page, installation procedure for Windows).
Make sure that the directory containing the dot executable is on your system’s PATH (sometimes done by the installer; setting PATH on Linux, Mac, and Windows).
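One quick way to check the PATH requirement above is to ask Python where it finds `dot`. This is a small convenience sketch using the standard library's `shutil.which`; the helper name `find_dot` is hypothetical and not part of MDF.

```python
import shutil


def find_dot():
    """Return the full path of the Graphviz `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")


dot_path = find_dot()
if dot_path is None:
    print("dot not found: install Graphviz and add its bin directory to PATH")
else:
    print(f"dot found at {dot_path}")
```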
Generating ModECI MDF documentation offline
The ModECI MDF documentation can be found online here. If you are working on or changing the MDF documentation, it is good practice to check that it builds as expected before pushing to the GitHub repository. The following walkthrough shows how to generate the ModECI MDF documentation offline.
Requirements
Python 3.10 is recommended for generating the MDF documentation offline; if that does not work, use Python 3.9. The steps are the same apart from creating the virtual environment.
The documentation is generated using Sphinx. Make is also required. For Windows installation of Make, see here. For Mac installation of Make, see here.
1) Create a virtual environment with python
# install virtualenv
pip install virtualenv
# create a virtual environment for python 3.9 (or use python3.10 and venv310 for python 3.10)
python3.9 -m virtualenv venv39
# activate it on Windows:
venv39\Scripts\activate
# activate it on Mac/Linux:
# source venv39/bin/activate
# the python 3.10 equivalent:
python3.10 -m virtualenv venv310
venv310\Scripts\activate
# source venv310/bin/activate
2) Clone MDF repository from GitHub into your local machine
git clone https://github.com/ModECI/MDF.git
3) Change into the MDF directory
cd MDF
4) Install the MDF package, with all optional dependencies, into the virtual environment
pip install .[all]
5) Change directory into sphinx folder
# for windows
cd docs\sphinx
# for Mac/Linux
cd docs/sphinx
6) Create offline documentation in sphinx folder
# To allow a fresh start when making the documentation
make clean
# To make the documentation
make html
7) Change directory into html folder and run the documentation offline
# for Windows go into build\html folder and double click on the index.html file, or:
cd build\html
index.html
# for Mac, go into build/html folder and double click on the index.html file or:
cd build/html
open index.html
The documentation will open in your default browser. Alternatively, right-click the file and open it in a browser of your choice.
Contribution Guidelines
This documentation contains a set of guidelines to help new users and potential contributors to MDF.
Before Contributing
Before opening pull requests, make sure that you read these guidelines. If you have any doubt on this contributing guide, please feel free to reach out on our discussion forum.
Making Contributions
Install the MDF package and all its dependencies: https://github.com/ModECI/MDF (see here for full details).
Try running locally the standard MDF examples: https://github.com/ModECI/MDF/tree/main/examples/MDF
Run the following notebook, altering the network elements along the way to build your own model: https://github.com/ModECI/MDF/blob/main/examples/SimpleExample.ipynb.
Read the documentation on the elements of the MDF specification: https://mdf.readthedocs.io
The project uses an issue tracker to keep information about bugs to fix, project features to implement, documentation to write, and more. Potential contributors can find newcomer-friendly issues by looking for the following issue tag in the project issue tracker: good first issue.
Steps to Contribute
Following are the steps to guide you to making your own fork of the MDF repository, making changes and submitting them as contributions:
Step 1
Create and activate a virtual environment for MDF as outlined in the main installation guide.
Step 2
Fork the MDF repository on the GitHub website, and then go to your terminal and clone it on your machine.
git clone https://github.com/<your_fork_name>/MDF
Step 3
Add an upstream link to main branch in your cloned repo.
git remote add upstream https://github.com/ModECI/MDF.git
Step 4
Keep your cloned repo up to date by pulling from upstream (this will also avoid any merge conflicts while committing new changes)
git pull upstream main
Step 5
Create your feature branch. Note: it is useful to give this a name relevant to the issue being addressed, e.g. feat/my_new_feature or bugfix/123 (to fix issue #123).
git checkout -b <feature-name>
Step 6
Make your changes! Run the tests in test_all.sh to make sure all tests are passing locally.
Step 7
Format your code. We use a standard format (black) for all our code, as this minimises the changes between commits, especially when people have different coding styles. Install pre-commit using pip install pre-commit and type pre-commit run --all-files at the top level MDF directory to format the code. This will change all the relevant files to the correct formatting before you commit. This formatting is checked by our GitHub Actions tests, which will fail if the code is not correctly formatted.
Step 8
Commit your changes. Note: if you have run test_all.sh, many of the image files will have been regenerated (and may show as changed even though they are identical). Don’t commit these unless you know there is an actual change.
git commit -m "A meaningful, concise commit message"
Step 9
Push the changes to your fork
git push origin <feature-name>
Step 10
Create a PR on GitHub.
Don’t just hit the create a pull request button; write a detailed PR message explaining what you are contributing and why.
Put the number of a relevant issue (e.g. #123) in a commit message for the pull request, and it will show up in the issue itself, which makes it easier for developers to review your PR based on the issue.
Resources
Markdown : Markdown is a lightweight markup language like HTML, with plain text formatting syntax.
Git : Git is a distributed version-control system for tracking changes in source code during software development. It is designed for coordinating work among programmers, but it can be used to track changes in any set of files.
Need more help?
You can refer to the following articles on basics of Git and Github, in case you are stuck:
ModECI contributors
This page lists names and GitHub profiles of contributors to the various ModECI repositories, listed in no particular order. This page is generated periodically, most recently on 2023-03-06.
Padraig Gleeson (@pgleeson)
David Turner (@davidt0x)
Katherine Mantel (@kmantel)
Ivy (@Ivy8127)
(@mraunak)
Shanka Subhra Mondal (@Shanka123)
Onabajo Monsurat (@Monsurat-Onabajo)
Parikshit Singh Rathore (@parikshit14)
Patrick Stock (@patrickstock)
Jeremy Lee (@jeremyrl7)
Raghavendra Pradyumna Pothukuchi (@rpradyumna)
Marble Kusanele Mpofu (@kusanele)
Somya Agrawal (@somyagr)
(@jdcpni)
Riya Saxena (@29riyasaxena)
Megha Bose (@Megha-Bose)
Pranav Gokhale (@singular-value)
Esraa Abdelmaksoud (@esraa-abdelmaksoud)
Shivani Rana (@shivani6320)
Matteo Cantarelli (@tarelli)
Brian Broll (@brollb)
Repositories
Specification of ModECI v0.4
Note: the ModECI MDF specification is still in development! See here for ongoing discussions.
Model
The top level construct in MDF is Model, which may contain multiple Graph objects and model attribute(s)
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| id | str | A unique identifier for this Model |
| format | str | Information on the version of MDF used in this file |
| generating_application | str | Information on what application generated/saved this file |
| onnx_opset_version | Union[str, NoneType] | The ONNX opset used for any ONNX functions in this model. |
Allowed children
| Allowed child | Data Type | Description |
|---|---|---|
| graphs | Graph | The collection of graphs that make up the MDF model. |
Graph
A directed graph consisting of Nodes (with Parameters and Functions evaluated internally) connected via Edges.
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| id | str | A unique identifier for this Graph |
| parameters | Union[Any, NoneType] | Dictionary of global parameters for the Graph |
| conditions | Union[ConditionSet, NoneType] | The ConditionSet stored as dictionary for scheduling of the Graph |
Allowed children
| Allowed child | Data Type | Description |
|---|---|---|
| nodes | Node | One or more Node(s) present in the graph |
| edges | Edge | Zero or more Edge(s) present in the graph |
Node
A self contained unit of evaluation receiving input from other nodes on InputPort(s). The values from these are processed via a number of Function(s) and one or more final values are calculated on the OutputPort(s)
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| id | str | A unique identifier for the node. |
Allowed children
| Allowed child | Data Type | Description |
|---|---|---|
| input_ports | InputPort | Dictionary of the InputPort objects in the Node |
| functions | Function | The Function(s) for computation in the node |
| parameters | Parameter | Dictionary of Parameter(s) for the node |
| output_ports | OutputPort | The OutputPort(s) containing evaluated quantities from the node |
InputPort
The InputPort is an attribute of a Node which allows external information to be input to the Node
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| id | str | The unique (for this Node) id of the input port. |
| shape | Union[Tuple[int, …], NoneType] | The shape of the input port. This uses the same syntax as numpy ndarray shapes (e.g., numpy.zeros(shape) would produce an array with the correct shape). |
| type | Union[str, NoneType] | The data type of the input received at a port. |
Function
A single value which is evaluated as a function of values on InputPort(s) and other Functions
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| id | str | The unique (for this Node) id of the function, which will be used in other Functions and the OutputPorts for its value |
| function | Union[str, NoneType] | Which of the built-in MDF functions (linear, etc.) is used. See supported functions: https://mdf.readthedocs.io/en/latest/api/MDF_function_specifications.html |
| args | Union[Any, NoneType] | Dictionary of values for each of the arguments for the Function, e.g. if the built-in function is linear(slope), the args here could be {"slope": 3} or {"slope": "input_port_0 + 2"} |
| value | Union[EvaluableExpression, List, Dict, ndarray, int, float, str, NoneType] | If the function is a value expression, this attribute will contain the expression and the function and args attributes will be None. |
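
The way args can mix constants and string expressions is easiest to see with a small evaluation sketch. The code below is not the MDF execution engine: `linear` here is a hypothetical stand-in whose `variable0` and `intercept` arguments are assumptions, and `evaluate_args` is an illustrative helper that resolves string expressions against the current input port values.

```python
def linear(variable0, slope, intercept=0.0):
    """Hypothetical stand-in for a built-in linear function."""
    return variable0 * slope + intercept


def evaluate_args(args, namespace):
    """Resolve args: numbers pass through, strings are evaluated over `namespace`."""
    resolved = {}
    for name, raw in args.items():
        if isinstance(raw, str):
            # eval with empty builtins: fine for a sketch, not for untrusted input
            resolved[name] = eval(raw, {"__builtins__": {}}, dict(namespace))
        else:
            resolved[name] = raw
    return resolved


inputs = {"input_port_0": 4}
args = {"variable0": "input_port_0", "slope": "input_port_0 + 2"}
result = linear(**evaluate_args(args, inputs))
print(result)
```

The function keeps no state between calls: given the same input port values it always produces the same result, which is the property that distinguishes a Function from a stateful Parameter.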
Parameter
A parameter of the Node, which can: 1) be a specific fixed value (a constant (int/float) or an array); 2) be a string expression for the value referencing other named Parameter(s), which may be stateful (i.e. can change value over multiple executions of the Node); 3) be evaluated by a built-in function with args; or 4) change from a default_initial_value with a time_derivative.
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| id | str | |
| value | Union[EvaluableExpression, List, Dict, ndarray, int, float, str, NoneType] | The next value of the parameter, in terms of the inputs, functions and PREVIOUS parameter values |
| default_initial_value | Union[EvaluableExpression, List, Dict, ndarray, int, float, str, NoneType] | The initial value of the parameter, only used when parameter is stateful. |
| time_derivative | Union[str, NoneType] | How the parameter changes with time, i.e. ds/dt. Units of time are seconds. |
| function | Union[str, NoneType] | Which of the built-in MDF functions (linear, etc.) this uses (see the supported functions link under Function above). |
| args | Union[Any, NoneType] | Dictionary of values for each of the arguments for the function of the parameter, e.g. if the built-in function is linear(slope), the args here could be {"slope": 3} or {"slope": "input_port_0 + 2"} |
Allowed children
| Allowed child | Data Type | Description |
|---|---|---|
| conditions | ParameterCondition | Parameter specific conditions |
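
The two stateful forms of a Parameter can be sketched as follows. This is an illustration only, not the MDF engine: the helper names are hypothetical, the expressions `count + increment` and `-level / 2` are made up for the example, and the use of a forward-Euler step for time_derivative is an assumption about how such a rate could be integrated.

```python
def step_value(expression, state, inputs):
    """Next value from a `value`-style expression that may reference the previous state."""
    names = {**inputs, **state}
    return eval(expression, {"__builtins__": {}}, names)


def step_derivative(expression, current, state, inputs, dt):
    """Forward-Euler update from a `time_derivative`-style expression (dt in seconds)."""
    names = {**inputs, **state}
    return current + dt * eval(expression, {"__builtins__": {}}, names)


# Stateful counter: value = "count + increment", default_initial_value = 0
state = {"count": 0}
for _ in range(3):
    state["count"] = step_value("count + increment", state, {"increment": 2})

# Rate-defined parameter: time_derivative = "-level / 2", default_initial_value = 8.0
level = {"level": 8.0}
level["level"] = step_derivative("-level / 2", level["level"], level, {}, dt=0.5)

print(state["count"], level["level"])
```

In both cases the default_initial_value supplies the starting point, and each execution computes the next value from the previous one.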
ParameterCondition
A condition to test on a Node’s parameters, which if true, sets the value of this Parameter
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| id | str | A unique identifier for the ParameterCondition |
| test | Union[EvaluableExpression, List, Dict, ndarray, int, float, str, NoneType] | The boolean expression to evaluate |
| value | Union[EvaluableExpression, List, Dict, ndarray, int, float, str, NoneType] | The new value of the Parameter if the test is true |
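
A ParameterCondition pairs a test with a replacement value. The sketch below shows one plausible reading, using a threshold reset in the style of an integrate-and-fire neuron; it is not MDF code. The names `v`, `v_thresh` and `v_reset` are hypothetical, and applying the first condition whose test is true is an assumption of this example.

```python
def apply_conditions(current, conditions, names):
    """Return the value of the first condition whose test is true, else the current value."""
    for cond in conditions:
        if eval(cond["test"], {"__builtins__": {}}, dict(names)):
            return eval(str(cond["value"]), {"__builtins__": {}}, dict(names))
    return current


# Hypothetical reset rule: when v reaches v_thresh, set v to v_reset.
conditions = [{"id": "reset", "test": "v >= v_thresh", "value": "v_reset"}]
names = {"v": -48.0, "v_thresh": -50.0, "v_reset": -70.0}

new_v = apply_conditions(names["v"], conditions, names)
print(new_v)
```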
OutputPort
The OutputPort is an attribute of a Node which exports information to another Node connected by an Edge
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| id | str | Unique identifier for the output port. |
| value | Union[str, NoneType] | The value of the OutputPort in terms of the InputPort, Function values, and Parameter values. |
| shape | Union[Tuple[int, …], NoneType] | The shape of the output port. This uses the same syntax as numpy ndarray shapes (e.g., numpy.zeros(shape) would produce an array with the correct shape). |
| type | Union[str, NoneType] | The data type of the output sent by a port. |
Edge
An Edge is an attribute of a Graph that transmits computational results from a sender’s OutputPort to a receiver’s InputPort.
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| id | str | A unique string identifier for this edge. |
| sender | str | The id of the Node which is the source of the edge. |
| receiver | str | The id of the Node which is the target of the edge. |
| sender_port | str | The id of the OutputPort on the sender Node, whose value should be sent to the receiver_port |
| receiver_port | str | The id of the InputPort on the receiver Node |
| parameters | Union[Any, NoneType] | Dictionary of parameters for the edge. |
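
How the Edge fields fit together can be sketched with a plain dictionary. This is an illustration rather than the MDF implementation: the `transmit` helper is hypothetical, and treating a `weight` entry in parameters as a multiplier on the transmitted value is an assumption of the example.

```python
def transmit(edge, output_values):
    """Copy the sender port's value to the receiver port, scaled by an optional weight."""
    value = output_values[(edge["sender"], edge["sender_port"])]
    weight = (edge.get("parameters") or {}).get("weight", 1)
    return (edge["receiver"], edge["receiver_port"]), value * weight


edge = {
    "id": "edge_A_B",
    "sender": "A",
    "receiver": "B",
    "sender_port": "out_port",
    "receiver_port": "in_port",
    "parameters": {"weight": 0.5},
}

target, delivered = transmit(edge, {("A", "out_port"): 3.0})
print(target, delivered)
```

The sender and receiver ids select the Nodes, while sender_port and receiver_port select which OutputPort value is read and which InputPort receives it.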
Condition
A set of descriptors which specifies conditional execution of Nodes to meet complex execution requirements.
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| type | str | The type of Condition from the library |
| kwargs | Union[Any, NoneType] | The dictionary of keyword arguments needed to evaluate the Condition |
ConditionSet
Specifies the non-default pattern of execution of Nodes
Allowed parameters
| Allowed field | Data Type | Description |
|---|---|---|
| metadata | Union[Any, NoneType] | Optional metadata field, an arbitrary dictionary of string keys and JSON serializable values. |
| node_specific | Union[Condition, NoneType] | A dictionary mapping nodes to any non-default run conditions |
| termination | Union[Condition, NoneType] | A dictionary mapping time scales of model execution to conditions indicating when they end |
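
The role of node_specific conditions can be illustrated with a toy scheduler. This is a sketch of the concept only and not the scheduler MDF uses: the `run` and `every_n_calls` helpers are hypothetical, a fixed number of passes stands in for a termination condition, and nodes without an entry are assumed to run on every pass.

```python
def run(nodes, node_specific, max_passes):
    """Execute nodes in order for `max_passes` passes, skipping nodes whose condition is unmet."""
    calls = {n: 0 for n in nodes}
    trace = []
    for _ in range(max_passes):
        for node in nodes:
            cond = node_specific.get(node)
            if cond is None or cond(calls):  # no entry means default: always run
                calls[node] += 1
                trace.append(node)
    return trace


def every_n_calls(dependency, n):
    """Condition that holds once `dependency` has run a non-zero multiple of n times."""
    return lambda calls: calls[dependency] > 0 and calls[dependency] % n == 0


# B runs only after every second execution of A.
trace = run(["A", "B"], {"B": every_n_calls("A", 2)}, max_passes=4)
print(trace)
```

The edges of a graph say where values flow, while a mapping like this says when each node is allowed to execute.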
MDF Examples
Examples of Python, JSON and YAML files to illustrate the structure and usage of MDF.
Simple | ABCD | Arrays | States | Conditions | Parameters and Functions
Simple example
Python Source | JSON | YAML
A simple example with 2 Nodes connected by an Edge:

With more detail on Nodes (showing Input Ports (green), Parameters (blue) and Output Ports (red)) and Edges:

ABCD
Python Source | JSON | YAML
Another simple example with more Nodes.


Arrays
Python Source | JSON | YAML
An example using arrays for Parameters and weights on Edges.

States
Python Source | JSON | YAML
An example with Nodes containing persistent States.

Conditions
Python Source | JSON | YAML
A simple 3-Node graph with scheduling Conditions. For more examples of conditions see here.

Parameters and Functions
Python Source | JSON | YAML
A simple Node with a number of different types of Parameters (in blue; fixed and stateful) and Functions (in purple; can be built in or ONNX based).

More examples
Further examples are under development in this directory, including a Recurrent Neural Network (RNN) and an Integrate and Fire (IaF) neuron model.
Interactions between MDF and ACT-R
This directory contains examples of ACT-R models converted to MDF. The ACT-R models count.lisp and addition.lisp are based on the ACT-R tutorial.
The scripts count.py and addition.py can be run to create the MDF .json and .yaml files for the given example and execute it using the MDF scheduler.
The below graph represents the basic structure of all ACT-R models in MDF:
There are also more detailed graphs count.png and addition.png for each example.
Count Model
ACT-R | JSON | YAML | Python Script | Graph
The count model counts from a start value to an end value. The script count.py first reads the original ACT-R model file count.lisp, generates an MDF representation using the MDF ACT-R interface, and outputs the JSON and YAML files. It then executes the MDF model using the MDF scheduler and outputs the final goal set by the model once execution is finished. The final goal has the form:
{'name': 'first-goal', 'ISA': 'count-from', 'start': 'two', 'end': 'four', 'count': 'four'}
In this example, the start value is two, the end value is four, and the final value of count is four, indicating that the model counted from two to four. The start and end values can be modified in line 17 of count.lisp, which sets the initial goal of the model:
(first-goal ISA count-from start two end four)
The script can use any values specified in count.lisp, so the model can be modified and run multiple times in order to test different values. The count graph represents this example.
Addition Model
ACT-R | JSON | YAML | Python Script | Graph
The addition model computes the sum of two numbers. The script addition.py functions identically to the previous example, but uses the addition model instead. The final goal has the form:
{'name': 'second-goal', 'ISA': 'add', 'arg1': 'five', 'arg2': 'two', 'sum': 'seven', 'count': 'nil'}
In this case, the first argument is five, the second argument is two, and the model has calculated the sum, seven. Like the previous example, the arguments can be modified in line 23 of addition.lisp in order to test different values:
(second-goal ISA add arg1 five arg2 two)
The addition graph represents this example.
Interactions between NeuroML and MDF
1) Converting NeuroML to MDF
1.1) Simple ABCD model
Summary: A model is created in NeuroML (using cell dynamics specified in LEMS and a network in NeuroMLlite) and converted to the equivalent model in MDF, which runs with the reference MDF execution engine.
1.1.1) ABCD - NeuroMLlite version
ABCD.py is a script using the NeuroMLlite package to create a simple network with 4 connected elements. The network built can be seen below (this can be generated with `python ABCD.py -graph2`):
1.1.2) ABCD - NeuroML2 version
A version of the network in NeuroML 2 can be generated with `python ABCD.py -nml`, or generated and executed with jNeuroML with `python ABCD.py -jnml`.
This will produce the NeuroML file: ABCD.net.nml (note though this is not valid, as not all the elements included are pure NeuroML). A LEMS Simulation file is generated for running the model in jNeuroML or pyNeuroML: LEMS_SimABCD.xml
The definitions of the components used for A, B, etc. can be found in PNL.xml. This is a set of definitions of component types based on those present in PsyNeuLink. A graph depicting the definitions of the network elements can be generated with `pynml LEMS_SimABCD.xml -lems-graph`:
1.1.3) ABCD - MDF version
A version of the network in MDF can be generated from the NeuroMLlite definition with `python ABCD.py -mdf`, producing ABCD.mdf.yaml and ABCD.mdf.json.
A graph of the structure of the MDF model can be generated with `python -m modeci_mdf.interfaces.graphviz.exporter ABCD.mdf.yaml 1` (left below), or with more detail with `python -m modeci_mdf.interfaces.graphviz.exporter ABCD.mdf.yaml 3` (right below).
1.2) FitzHugh Nagumo cell models
1.2.1) FN - NeuroML version
A version of the FitzHugh Nagumo neuron model has been created using NeuroMLlite (FN.py) which generated LEMS (LEMS_SimFN.xml) which can simulate the NeuroML model (FN.net.nml).
A graphical representation of the LEMS is below:
It can be run with:
python FN.py -jnml # Generate and run the LEMS file from the NeuroMLlite description
pynml LEMS_SimFN.xml # Run the LEMS file using pyNeuroML
1.2.2) FN - MDF version
The NeuroMLlite version can also be used to generate MDF for the model:
python FN.py -mdf # Generate the MDF serializations (JSON and YAML) from the NeuroMLlite description
These can be seen here: FN.mdf.json, FN.mdf.yaml, and a graphical version generated with:
python -m modeci_mdf.interfaces.graphviz.importer FN.mdf.yaml 3 # Generate graph from MDF version
1.2.3) FN - Execute model using MDF
A script has been created (FNrun.py) where the model is loaded, run using the standard MDF execution engine, and plotted:
python FNrun.py # Load FN model and run with MDF scheduler
Adding the option `-multi` to the Python script for running the FN example modifies the graph to add an input node with an array of values, meaning multiple instances of the FN neuron will be simulated:
python FN.py -multi
1.3) Izhikevich cell models
A version of the Izhikevich spiking neuron model has been created in NeuroML and can be exported to MDF and executed with the standard execution engine.
1.3.1) Izhikevich - NeuroML version
The single cell model has been created using NeuroMLlite (Izhikevich.py) which generated LEMS (LEMS_SimIzhikevichTest.xml) which can simulate the NeuroML model (IzhikevichTest.net.nml).
A graphical representation of the LEMS is below:
It can be run with:
python Izhikevich.py -jnml # Generate and run the LEMS file from the NeuroMLlite description
pynml LEMS_SimIzhikevichTest.xml # Run the LEMS file using pyNeuroML
1.3.2) Izhikevich - MDF version
The NeuroMLlite version can also be used to generate MDF for the model:
python Izhikevich.py -mdf # Generate the MDF serializations (JSON and YAML) from the NeuroMLlite description
These can be seen here: IzhikevichTest.mdf.json, IzhikevichTest.mdf.yaml, and a graphical version generated with:
python -m modeci_mdf.interfaces.graphviz.importer IzhikevichTest.mdf.yaml 3 # Generate graph from MDF version
1.3.3) Izhikevich - Execute model using MDF
A script has been created (Izh_run.py) where the model is loaded, run using the standard MDF execution engine, and plotted:
python Izh_run.py # Load Izh model and run with MDF scheduler
2) Converting MDF to NeuroML/LEMS
It is also possible to convert MDF models into equivalents in NeuroML/LEMS:
cd ../MDF # convert some of the examples in the examples/MDF directory
python -m modeci_mdf.interfaces.neuroml.exporter Simple.json -run
python -m modeci_mdf.interfaces.neuroml.exporter ABCD.json -run
python -m modeci_mdf.interfaces.neuroml.exporter States.json -run
ONNX MDF Converter
ONNX (Open Neural Network Exchange) is an important format for exchanging models between machine learning environments. It is used in the MDF function ontology, and models in ONNX format can be exported to MDF. Converting MDF->ONNX is best enabled currently by converting the model to PyTorch and from there to ONNX.
ONNX to MDF
AB Sequential Model - 2 nodes
Python source | JSON | YAML
This is an example of a PyTorch model with 2 nodes. First, the script saves the PyTorch model as ONNX and then converts this to MDF. The graphical view of the generated MDF is shown below.
ABC Sequential Model with Loop
Python source | JSON | YAML
Note: Example still in development!
This is an example of a PyTorch model that is implemented in `onnx_mdf/examples/simple_abc.py`. The model code is very simple:
    import torch

    class A(torch.nn.Module):
        def forward(self, x):
            return x + 1

    @torch.jit.script
    def loop_b(x, y):
        for i in range(int(y)):
            x = x / 10
        return x

    class B(torch.nn.Module):
        def forward(self, x, y):
            return loop_b(x, y)

    class C(torch.nn.Module):
        def forward(self, x):
            return x * 100

    class ABC(torch.nn.Module):
        def __init__(self):
            super(ABC, self).__init__()
            self.A = A()
            self.B = B()
            self.C = C()

        def forward(self, x, B_loop_count):
            return self.C(self.B(self.A(x), B_loop_count))
This implements a PyTorch model with three modules. The modules process the input sequentially, and the inner `B` module has a loop construct.
It is exported to ONNX via a combination of tracing and scripting.
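As a rough illustration of the data flow only, the three stages can be mirrored in plain Python without PyTorch. The function names below (`stage_a`, `stage_b`, `stage_c`, `abc`) are hypothetical stand-ins for the modules above and are not part of the example script; the sketch does not perform any tracing, scripting or ONNX export.

```python
import math


def stage_a(x):
    # Mirrors module A: add one to the input.
    return x + 1


def stage_b(x, loop_count):
    # Mirrors module B: divide by 10 once per loop iteration.
    for _ in range(int(loop_count)):
        x = x / 10
    return x


def stage_c(x):
    # Mirrors module C: scale the result by 100.
    return x * 100


def abc(x, b_loop_count):
    # The stages run sequentially, as in ABC.forward.
    return stage_c(stage_b(stage_a(x), b_loop_count))


if __name__ == "__main__":
    print(abc(9.0, 2))
```

With an input of 9.0 and two loop iterations, the value passes through 10.0, then 1.0, then 0.1, before the final scaling by 100.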
ABCD Branching Conditional Model
Python source | JSON | YAML
Note: Example still in development!
This is an example of a PyTorch model that has four components (A, B, C, D). We loop over the whole model for 10 iterations. A is executed only on the first iteration, B is executed every iteration, C is executed every 5 times B is executed, and D is executed every 10 times B is executed. A, B, C, and D are each simple stateless linear functions. This type of conditional execution specification is common in PsyNeuLink. The PyTorch code for the model is fairly straightforward:
    import torch

    class Linear(torch.nn.Module):
        def __init__(self, slope=1.0, intercept=0.0):
            super(Linear, self).__init__()
            self.slope = slope
            self.intercept = intercept

        def forward(self, x):
            return self.slope*x + self.intercept

    class ABCD(torch.nn.Module):
        def __init__(self, A, B, C, D):
            super(ABCD, self).__init__()
            self.A = A
            self.B = B
            self.C = C
            self.D = D

        def forward(self, x):
            # Since we are implementing conditions that reference the number of calls
            # to A and B, we need to keep track of this.
            num_A_calls = 0
            num_B_calls = 0
            # We need to initialize outputs, torchscript jit complains if c and d
            # are not defined in the FALSE branches of our conditionals.
            a = torch.zeros_like(x)
            b = torch.zeros_like(x)
            c = torch.zeros_like(x)
            d = torch.zeros_like(x)
            for i in range(10):
                # A: pnl.AtNCalls(A, 0),
                if num_A_calls == 0:
                    a = self.A(x)
                    num_A_calls = num_A_calls + 1
                # B: pnl.Always()
                b = self.B(a)
                num_B_calls = num_B_calls + 1
                # C: pnl.EveryNCalls(B, 5),
                if num_B_calls % 5 == 0:
                    c = self.C(b)
                # D: pnl.EveryNCalls(B, 10)
                if num_B_calls % 10 == 0:
                    d = self.D(b)
            return c, d
The ONNX IR representation of this model is shown below. The small computation sub-graphs contained in the if and else body attributes are not shown. These are either a simple multiplication and addition or an identity.
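To make the scheduling pattern easier to check by hand, the sketch below reproduces only the call-counting logic of the loop in plain Python. The helper name `count_calls` is hypothetical and not part of the example; it does no numerical work and simply tallies how often each component would run.

```python
def count_calls(iterations=10):
    # Tally of how many times each component would be executed.
    calls = {"A": 0, "B": 0, "C": 0, "D": 0}
    for _ in range(iterations):
        # A runs only while it has not been called yet (first iteration).
        if calls["A"] == 0:
            calls["A"] += 1
        # B runs on every iteration.
        calls["B"] += 1
        # C runs after every 5th call to B.
        if calls["B"] % 5 == 0:
            calls["C"] += 1
        # D runs after every 10th call to B.
        if calls["B"] % 10 == 0:
            calls["D"] += 1
    return calls


if __name__ == "__main__":
    print(count_calls())  # {'A': 1, 'B': 10, 'C': 2, 'D': 1}
```

Over the 10 iterations used in the model, A runs once, B ten times, C twice and D once.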
Interactions between PsyNeuLink and MDF
Simple
ABCD
Python source | JSON | Reconstructed source
An example with four Nodes, as in other environments.
SimpleLinear
SimpleLinear-conditional
Python source | JSON | Reconstructed source
A three-Node example with Conditions.
SimpleLinear-timing
Python source | JSON | Reconstructed source
The same model as in SimpleLinear-conditional with Conditions for timeline scheduling. Note: these conditions are still not fully implemented by the scheduler.
Nested
Nested without scheduling
Python source | JSON | Reconstructed source
A model with several Nodes in two Graphs, one of which contains the other.
Nested with scheduling
Python source | JSON | Reconstructed source
A model similar to the one in Nested without scheduling, but with Conditions.
SimpleFN
Python source | JSON | Reconstructed source
An example with a single Node using the PsyNeuLink implementation of the FitzHugh–Nagumo model.
SimpleFN-timing
Python source | JSON | Reconstructed source
The same model as in SimpleFN with Conditions for timeline scheduling. Note: these conditions are still not fully implemented by the scheduler.
SimpleFN-conditional
Python source | JSON | Reconstructed source
The same model as in SimpleFN, with scheduling Conditions that mimic the behavior in SimpleFN-timing.
Stroop
Python source | JSON | Reconstructed source
A model representing the Stroop effect with conflict monitoring that uses Conditions.
PyTorch and MDF
Models can be created in PyTorch and exported into MDF format, or MDF models can be converted to code which executes natively in PyTorch.
MDF to PyTorch
To export an MDF model to PyTorch, provide an MDF model as an input to the `mdf_to_pytorch()` function. The output of `mdf_to_pytorch` is a PyTorch model.
mdf_to_pytorch(
mdf_model: model in MDF format
eval_models: Set Evaluation of model to True or False
version: MDF version
model_input: input file name
)
It returns a dictionary where key = `model name` and value = `PyTorch model object`.
A test script demonstrating conversion of MDF model to PyTorch is at MDF_to_PyTorch.py. This converts multiple MDF models to their respective PyTorch models. The converted models are available in folder: MDF_PyTorch.
Examples
Below are some working examples of this functionality.
1) Simple ABCD example
We convert one of the sample MDF examples ABCD.json:
This is converted to PyTorch and can be seen here: ABCD_pytorch.py.
The PyTorch model is further converted to ONNX ABCD.onnx. An image of the contents of the ONNX model (visualized using NETRON) is below.
2) Multi-Layer Perceptron MDF to PyTorch Conversion:
To run an example where a simple Multi-Layer Perceptron (MLP) is created using the MDF specification and executed using sample digit-recognition data, run:
python mlp_pure_mdf.py
A graph of the network can be created with `python mlp_pure_mdf.py -graph`:
The network can be run against images from the MNIST database with `python mlp_pure_mdf.py -run`, and produces 98% accuracy. The image below shows the results of 300 images:
PyTorch to MDF
The current implementation of our PyTorch to MDF conversion functionality is built on top of the TorchScript infrastructure provided by PyTorch. PyTorch models that can be translated to TorchScript (via `torch.jit.script` or `torch.jit.trace`) should then be able to be converted to their MDF representation automatically. Below are several working examples of this functionality.
To perform a PyTorch to MDF conversion, provide a PyTorch model as an input to the `pytorch_to_mdf()` function, which is available in importer.py. The output of `pytorch_to_mdf()` is an MDF model.
pytorch_to_mdf(
model: The model to translate into MDF.
args: The input arguments for this model. If a nn.Module is passed then the model will be traced with these
inputs. If a ScriptModule is passed, they are still needed to determine input shapes.
trace: Force the use of tracing to compile the model. The default is to use torch.jit.script
use_onnx_ops: Use ONNX ops when possible, fallback to ATEN ops when not available. Default is True. If False,
use only ATEN ops.
)
Returns a translated MDF model.
Examples of usage
1) Simple PyTorch To MDF
This is a simple fully-connected neural network model example that takes an input image of 224 * 224 * 3 and produces two classes as the output. To run an example of converting a model written in PyTorch to its MDF representation, simply run:
python simple_pytorch_to_mdf.py
Code is present in simple_pytorch_to_mdf.py. The graph representation of the ONNX model can be generated with:
python simple_pytorch_to_mdf.py -graph-onnx
NOTE: This command will run the NETRON python server on the local host where we can export the graph as svg/png
The graph representation of the MDF model can be generated with:
python simple_pytorch_to_mdf.py -graph
Graphical export from MDF level 1:
Graphical export from MDF level 3:
To visualize the PyTorch model:
python simple_pytorch_to_mdf.py -graph-torch
The MDF for this model is then written to simple_pytorch_to_mdf.json. The model is then executed via the MDF scheduler and the results are compared to the native execution in PyTorch.
2) Inception Blocks Model
To run an example of converting an InceptionV3-like model written in PyTorch to its MDF representation, simply run:
python inception.py
Code is present in inception.py. This will define the model in PyTorch, invoke the TorchScript tracing compiler, and convert the underlying IR representation of the model to MDF. The MDF for this model is then written to inception.json. The model is then executed via the MDF scheduler and the results are compared to the native execution in PyTorch.
The graph representation of the MDF model can be generated with:
python inception.py -graph
Interactions between MDF and Quantum computing technologies
Starting summer 2021, we will develop tools for interfacing between MDF and quantum computers. This interface is motivated by expectations that quantum hardware will provide speedups for solving Ising-type MDF problems. We will address both gate-based and annealing-based quantum computers:
for gate-based quantum computers, we will bridge from MDF to OpenQASM, the leading quantum Intermediate Representation.
for annealing-based quantum computers, we will target platforms such as D-Wave Ocean.
Our work will be agnostic to the exact quantum algorithm/solver used, though we will provide sample implementations using Variational Quantum Eigensolver (VQE) and Quantum Approximate Optimization Algorithm.
As a first step, we have begun developing implementations targeting quantum hardware for the key computations in several cognitive models, as listed below. Next, we will extend MDF so that quantum implementations such as the ones we develop can be expressed in it.
Tasks | Models | Key computations | Quantum algorithms |
---|---|---|---|
Two alternative forced choice | Quantum walk | Evolution, Projection | Unitary evolution, Hamiltonian simulation |
Multiple alternative models | Potential wells | Eigenstates and values | Variational methods (e.g., subspace and deflation) |
Bistable perception | Quantum walk | Evolution, projection | Unitary evolution, Hamiltonian simulation |
Control | Leaky Competing Integrator | Optimization | Quantum annealing |
Parameter estimation | Data fitting | Optimization | Quantum annealing |
MDF in WebGME
This contains a tool for converting the MDF specification into JSON compatible with JSON importer. This allows us to programmatically create a metamodel and, as a result, use WebGME as a design environment for MDF.
Quick Start
Starting WebGME app
First, install the mdf_gme following:
Second, start mongodb locally by running the `mongod` executable in your mongodb installation (you may need to create a `data` directory or set `--dbpath`).
Then, run `webgme start` from the project root. Finally, navigate to `http://localhost:8888` to start using mdf_gme!
Loading the spec into WebGME
First, install dependencies with `npm install`. Then convert the MDF specification using
node spec_to_gme.js path/to/MDF/spec.json
Finally, import the JSON into WebGME just like the examples (suffixed with “_meta”)!
Loading instances to and from WebGME importable JSON and MDF
node bin/instance_converter path/to/MDForGME/instance.json
Specification of standard functions in ModECI v0.4
Note: the ModECI MDF specification is still in development! See here for ongoing discussions. These functions are defined in Python API module modeci_mdf.functions.
Non-ONNX Functions
ONNX Functions
MatMul
Matrix multiplication (work in progress...)
MatMul(A, B) = A @ B
Python version: A @ B
Relu
Rectified linear function (work in progress...)
Relu(A) = A * (A > 0)
Python version: A * (A > 0)
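The two definitions above can be tried out on plain Python lists. The following is only a sketch: the helper names `matmul` and `relu` are hypothetical, the code does not use the `modeci_mdf.functions` module, and it assumes 2-D nested lists for `matmul` and a flat list for `relu`.

```python
def matmul(a, b):
    # a is an n x k nested list, b is a k x m nested list.
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for row in a
    ]


def relu(values):
    # Same expression as the definition above, applied per element:
    # (v > 0) is a bool, which multiplies as 0 or 1.
    return [v * (v > 0) for v in values]


if __name__ == "__main__":
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    print(relu([-1, 0, 2]))
```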
change_goal
Modifies the current goal buffer using the given pattern.
change_goal(pattern, curr_goal) = actr.change_goal(pattern,curr_goal)
Python version: actr.change_goal(pattern,curr_goal)
check_termination
Function used to check if no production was selected.
check_termination(production) = actr.check_termination(production)
Python version: actr.check_termination(production)
chunk_to_string
Converts a chunk dictionary to a string format.
chunk_to_string(chunk) = actr.chunk_to_string(chunk)
Python version: actr.chunk_to_string(chunk)
conflict_resolution_function
ACT-R conflict resolution function. Currently selects a production at random from the already matched productions, since utility values and learning are not implemented yet.
conflict_resolution_function(productions) = actr.conflict_resolution_function(productions)
Python version: actr.conflict_resolution_function(productions)
cos
Cosine function
cos(variable0, scale) = scale * cos(variable0)
Python version: scale * numpy.cos(variable0)
cosh
Hyperbolic cosine function
cosh(variable0, scale) = scale * cosh(variable0)
Python version: scale * numpy.cosh(variable0)
drift_diffusion_integrator
Integrates the drift diffusion model for a single trial using an implementation of the Euler-Maruyama method. This is a proof of concept implementation and is not optimized for speed.
drift_diffusion_integrator(starting_point, non_decision_time, drift_rate, threshold, noise, dt) = ddm.drift_diffusion_integrator(starting_point,non_decision_time,drift_rate,threshold,noise,dt)
Python version: ddm.drift_diffusion_integrator(starting_point,non_decision_time,drift_rate,threshold,noise,dt)
exponential
Exponential function
exponential(variable0, scale, rate, bias, offset) = scale * exp((rate * variable0) + bias) + offset
Python version: scale * numpy.exp((rate * variable0) + bias) + offset
linear
A linear function, calculated from a slope and an intercept
linear(variable0, slope, intercept) = (variable0 * slope + intercept)
Python version: (variable0 * slope + intercept)
logistic
Logistic function
logistic(variable0, gain, bias, offset) = 1/(1 + exp(-1*gain*(variable0 + bias) + offset))
Python version: 1/(1 + numpy.exp(-1*gain*(variable0 + bias) + offset))
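For scalar inputs, the `linear`, `exponential` and `logistic` formulas listed above can be written with the standard `math` module instead of numpy. This is a minimal sketch that transcribes the formulas as given; it does not call the MDF library, and the default argument values are assumptions made for convenience.

```python
import math


def linear(variable0, slope=1.0, intercept=0.0):
    return variable0 * slope + intercept


def exponential(variable0, scale=1.0, rate=1.0, bias=0.0, offset=0.0):
    return scale * math.exp((rate * variable0) + bias) + offset


def logistic(variable0, gain=1.0, bias=0.0, offset=0.0):
    # Note that, as in the formula above, offset sits inside the exponent.
    return 1 / (1 + math.exp(-1 * gain * (variable0 + bias) + offset))


if __name__ == "__main__":
    print(linear(2.0, slope=3.0, intercept=1.0))
    print(exponential(0.0, scale=2.0, offset=1.0))
    print(logistic(0.0))
```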
match_production
Returns True if the production's left hand side matches the given context and adds the matching bindings to the production.
match_production(production, context) = actr.match_production(production,context)
Python version: actr.match_production(production,context)
Abs
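Absolute takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the absolute value, y = abs(x), is applied to the tensor elementwise.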
Python version: onnx_ops.abs(X)
Acos
Calculates the arccosine (inverse of cosine) of the given input tensor, element-wise.
Python version: onnx_ops.acos(input)
Acosh
Calculates the hyperbolic arccosine of the given input tensor element-wise.
Python version: onnx_ops.acosh(input)
Add
Performs element-wise binary addition (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
(Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
Python version: onnx_ops.add(A, B)
And
Returns the tensor resulted from performing the `and` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.and(A, B)
ArgMax
Computes the indices of the max elements of the input tensor's element along the provided axis. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. If select_last_index is True (default False), the index of the last occurrence of the max is selected if the max appears more than once in the input. Otherwise the index of the first occurrence is selected. The type of the output tensor is integer.
Python version: onnx_ops.argmax(data, axis, keepdims, select_last_index)
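The effect of `select_last_index` is easiest to see on a 1-D input with a repeated maximum. The sketch below is a plain-Python illustration for flat lists only; the helper `argmax_1d` is hypothetical and ignores the `axis` and `keepdims` attributes.

```python
def argmax_1d(data, select_last_index=False):
    # Collect every position at which the maximum occurs.
    best = max(data)
    positions = [i for i, value in enumerate(data) if value == best]
    # Pick the last or the first occurrence, depending on the flag.
    return positions[-1] if select_last_index else positions[0]


if __name__ == "__main__":
    values = [1, 3, 3, 2]
    print(argmax_1d(values))                          # 1
    print(argmax_1d(values, select_last_index=True))  # 2
```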
ArgMin
Computes the indices of the min elements of the input tensor's element along the provided axis. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. If select_last_index is True (default False), the index of the last occurrence of the min is selected if the min appears more than once in the input. Otherwise the index of the first occurrence is selected. The type of the output tensor is integer.
Python version: onnx_ops.argmin(data, axis, keepdims, select_last_index)
Asin
Calculates the arcsine (inverse of sine) of the given input tensor, element-wise.
Python version: onnx_ops.asin(input)
Asinh
Calculates the hyperbolic arcsine of the given input tensor element-wise.
Python version: onnx_ops.asinh(input)
Atan
Calculates the arctangent (inverse of tangent) of the given input tensor, element-wise.
Python version: onnx_ops.atan(input)
Atanh
Calculates the hyperbolic arctangent of the given input tensor element-wise.
Python version: onnx_ops.atanh(input)
AveragePool
AveragePool consumes an input tensor X and applies average pooling across
the tensor according to kernel sizes, stride sizes, and pad lengths.
average pooling consisting of computing the average on all values of a
subset of the input tensor according to the kernel size and downsampling the
data into the output tensor Y for further processing. The output spatial shape will be following:
```
output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)
```
or
```
output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)
```
if ceil_mode is enabled
* pad_shape[i] is sum of pads along axis i
`auto_pad` is a DEPRECATED attribute. If you are using them currently, the output spatial shape will be following:
VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
And pad shape will be following if `SAME_UPPER` or `SAME_LOWER`:
pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
The output of each pooling window is divided by the number of elements (exclude pad when attribute count_include_pad is zero).
Python version: onnx_ops.averagepool(X, auto_pad, ceil_mode, count_include_pad, kernel_shape, pads, strides)
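The explicit-padding output shape formula above can be checked for a single spatial axis with a few lines of Python. This is a sketch of the arithmetic only, not of the pooling itself; the helper name `pooled_size` and its argument names are hypothetical, and `pad_total` stands for `pad_shape[i]`, the sum of the pads along the axis.

```python
import math


def pooled_size(input_size, kernel, stride=1, pad_total=0, dilation=1, ceil_mode=False):
    # Extent covered by the (possibly dilated) kernel along this axis.
    effective_kernel = (kernel - 1) * dilation + 1
    raw = (input_size + pad_total - effective_kernel) / stride + 1
    # ceil_mode selects ceil instead of floor, as in the two formulas above.
    return math.ceil(raw) if ceil_mode else math.floor(raw)


if __name__ == "__main__":
    print(pooled_size(32, kernel=3, stride=2))
    print(pooled_size(32, kernel=3, stride=2, ceil_mode=True))
```

For an axis of length 32 with a kernel of 3 and a stride of 2, the unrounded value is 15.5, so the two modes give different sizes.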
BatchNormalization
Carries out batch normalization as described in the paper
https://arxiv.org/abs/1502.03167. Depending on the mode it is being run,
There are five required inputs 'X', 'scale', 'B', 'input_mean' and
'input_var'.
Note that 'input_mean' and 'input_var' are expected to be the estimated
statistics in inference mode (training_mode=False, default),
and the running statistics in training mode (training_mode=True).
There are multiple cases for the number of outputs, which we list below:
Output case #1: Y, running_mean, running_var (training_mode=True)
Output case #2: Y (training_mode=False)
When training_mode=False, extra outputs are invalid. The outputs are updated as follows when training_mode=True:
running_mean = input_mean * momentum + current_mean * (1 - momentum)
running_var = input_var * momentum + current_var * (1 - momentum)
Y = (X - current_mean) / sqrt(current_var + epsilon) * scale + B
where:
current_mean = ReduceMean(X, axis=all_except_channel_index)
current_var = ReduceVar(X, axis=all_except_channel_index)
Notice that `ReduceVar` refers to the population variance, and it equals `sum(sqrd(x_i - x_avg)) / N` where `N` is the population size (this formula does not use sample size `N - 1`).
The computation of ReduceMean and ReduceVar uses float to avoid overflow for float16 inputs.
When training_mode=False:
Y = (X - input_mean) / sqrt(input_var + epsilon) * scale + B
For previous (deprecated) non-spatial cases, implementors are suggested to flatten the input shape to (N x C * D1 * D2 * … * Dn) before a BatchNormalization Op. This operator has optional inputs/outputs. See the doc for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument’s name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
Python version: onnx_ops.batchnormalization(X, scale, B, input_mean, input_var, epsilon, momentum, training_mode)
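The training-mode update equations above can be followed on a single channel represented as a flat list of floats. This is a simplified sketch under that assumption: the helper `batchnorm_training` is hypothetical, works on one channel only, and does not call `onnx_ops`. The default values chosen for `epsilon` and `momentum` are assumptions of this sketch.

```python
import math


def batchnorm_training(x, scale, bias, input_mean, input_var, epsilon=1e-5, momentum=0.9):
    n = len(x)
    # Batch statistics; the variance is the population variance (divide by N).
    current_mean = sum(x) / n
    current_var = sum((value - current_mean) ** 2 for value in x) / n
    # Normalise with the batch statistics, then apply scale and bias.
    y = [(value - current_mean) / math.sqrt(current_var + epsilon) * scale + bias for value in x]
    # Blend the incoming statistics with the batch statistics.
    running_mean = input_mean * momentum + current_mean * (1 - momentum)
    running_var = input_var * momentum + current_var * (1 - momentum)
    return y, running_mean, running_var


if __name__ == "__main__":
    y, mean, var = batchnorm_training([1.0, 2.0, 3.0, 4.0], 1.0, 0.0, input_mean=0.0, input_var=1.0)
    print(y, mean, var)
```

For the input `[1, 2, 3, 4]` the batch mean is 2.5 and the population variance is 1.25, so with a momentum of 0.9 the running statistics move a tenth of the way from the incoming values toward those.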
Bernoulli
Draws binary random numbers (0 or 1) from a Bernoulli distribution. The input tensor should be a tensor containing probabilities p (a value in the range [0,1]) to be used for drawing the binary random number, where an output of 1 is produced with probability p and an output of 0 is produced with probability (1-p).
This operator is non-deterministic and may not produce the same values in different implementations (even if a seed is specified).
Python version: onnx_ops.bernoulli(input, dtype, seed)
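The sampling rule described above (output 1 with probability p, otherwise 0) can be sketched with the standard `random` module. The helper `bernoulli_draw` is hypothetical and works on a flat list of probabilities; it is not the `onnx_ops` implementation and will not reproduce that implementation's random values for a given seed.

```python
import random


def bernoulli_draw(probabilities, seed=None):
    # A dedicated generator keeps the draws reproducible for a given seed.
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so p = 0 never yields 1 and p = 1 always does.
    return [1 if rng.random() < p else 0 for p in probabilities]


if __name__ == "__main__":
    print(bernoulli_draw([0.0, 0.5, 1.0], seed=0))
```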
BitShift
Bitwise shift operator performs element-wise operation. For each input element, if the attribute "direction" is "RIGHT", this operator moves its binary representation toward the right side so that the input value is effectively decreased. If the attribute "direction" is "LEFT", bits of the binary representation move toward the left side, which results in an increase of its actual value. The input X is the tensor to be shifted and another input Y specifies the amounts of shifting. For example, if "direction" is "RIGHT", X is [1, 4], and Y is [1, 1], the corresponding output Z would be [0, 2]. If "direction" is "LEFT" with X=[1, 2] and Y=[1, 2], the corresponding output Z would be [2, 8].
Because this operator supports Numpy-style broadcasting, X’s and Y’s shapes are not necessarily identical. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.bitshift(X, Y, direction)
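The two worked examples in the description can be reproduced with Python's built-in shift operators. The sketch below is for illustration: the helper name `bitshift_lists` is hypothetical, it handles equal-length flat lists of non-negative integers, and it does not implement broadcasting.

```python
def bitshift_lists(x, y, direction):
    if len(x) != len(y):
        raise ValueError("this sketch requires inputs of equal length")
    if direction == "RIGHT":
        # Shifting right by n divides by 2**n, discarding the remainder.
        return [value >> amount for value, amount in zip(x, y)]
    if direction == "LEFT":
        # Shifting left by n multiplies by 2**n.
        return [value << amount for value, amount in zip(x, y)]
    raise ValueError("direction must be 'LEFT' or 'RIGHT'")


if __name__ == "__main__":
    print(bitshift_lists([1, 4], [1, 1], "RIGHT"))  # [0, 2]
    print(bitshift_lists([1, 2], [1, 2], "LEFT"))   # [2, 8]
```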
Cast
The operator casts the elements of a given input tensor to a data type specified by the 'to' argument and returns an output tensor of the same size in the converted type. The 'to' argument must be one of the data types specified in the 'DataType' enum field in the TensorProto message.
Casting from string tensor in plain (e.g., “3.14” and “1000”) and scientific numeric representations (e.g., “1e-5” and “1E8”) to float types is supported. For example, converting string “100.5” to an integer may yield result 100. There are some string literals reserved for special floating-point values; “+INF” (and “INF”), “-INF”, and “NaN” are positive infinity, negative infinity, and not-a-number, respectively. Any string which can exactly match “+INF” in a case-insensitive way would be mapped to positive infinite. Similarly, this case-insensitive rule is applied to “INF” and “NaN”. When casting from numeric tensors to string tensors, plain floating-point representation (such as “314.15926”) would be used. Converting non-numerical-literal string such as “Hello World!” is an undefined behavior. Cases of converting string representing floating-point arithmetic value, such as “2.718”, to INT is an undefined behavior.
Conversion from a numerical type to any numerical type is always allowed. User must be aware of precision loss and value change caused by range difference between two types. For example, a 64-bit float 3.1415926459 may be rounded to a 32-bit float 3.141592. Similarly, converting an integer 36 to Boolean may produce 1 because we truncate bits which can’t be stored in the targeted type.
In more detail, the conversion among numerical types should follow these rules:
Casting from floating point to:
floating point: +/- infinity if OOR (out of range).
fixed point: undefined if OOR.
bool: +/- 0.0 to False; all else to True.
Casting from fixed point to:
floating point: +/- infinity if OOR. (+ infinity in the case of uint)
fixed point: when OOR, discard higher bits and reinterpret (with respect to two’s complement representation for signed types). For example, 200 (int16) -> -56 (int8).
bool: zero to False; nonzero to True.
Casting from bool to:
floating point: `{1.0, 0.0}`.
fixed point: `{1, 0}`.
bool: no change.
Python version: onnx_ops.cast(input, to)
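The out-of-range rule for fixed point to fixed point casts (discard the higher bits and reinterpret as two's complement) and the float to bool rule can be checked with integer arithmetic. This is a sketch of those two rules only; the helper names `wrap_to_signed` and `float_to_bool` are hypothetical and are not part of `onnx_ops`.

```python
def wrap_to_signed(value, bits):
    # Keep only the lowest `bits` bits and reinterpret them as a signed
    # two's complement number.
    modulus = 1 << bits
    half = modulus >> 1
    return (value + half) % modulus - half


def float_to_bool(value):
    # +0.0 and -0.0 map to False; every other value maps to True.
    return value != 0.0


if __name__ == "__main__":
    print(wrap_to_signed(200, 8))  # -56, matching the int16 -> int8 example
    print(float_to_bool(-0.0))     # False
```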
CastLike
The operator casts the elements of a given input tensor (the first input) to the same data type as the elements of the second input tensor. See documentation of the Cast operator for further details.
Python version: onnx_ops.castlike(input, target_type)
Ceil
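Ceil takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the ceil function, y = ceil(x), is applied to the tensor elementwise.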
Python version: onnx_ops.ceil(X)
Celu
Continuously Differentiable Exponential Linear Units:
Perform the linear unit element-wise on the input tensor X using formula:
max(0,x) + min(0,alpha*(exp(x/alpha)-1))
Python version: onnx_ops.celu(X, alpha)
Clip
Clip operator limits the given input within an interval. The interval is specified by the inputs 'min' and 'max'. They default to numeric_limits::lowest() and numeric_limits::max(), respectively.
Python version: onnx_ops.clip(input, min, max)
Compress
Selects slices from an input tensor along a given axis where condition evaluates to True for each axis index. In case axis is not provided, input is flattened before elements are selected. Compress behaves like numpy.compress: https://docs.scipy.org/doc/numpy/reference/generated/numpy.compress.html
Python version: onnx_ops.compress(input, condition, axis)
Concat
Concatenate a list of tensors into a single tensor. All input tensors must have the same shape, except for the dimension size of the axis to concatenate on.
Python version: onnx_ops.concat(inputs, axis)
ConcatFromSequence
Concatenate a sequence of tensors into a single tensor. All input tensors must have the same shape, except for the dimension size of the axis to concatenate on. By default 'new_axis' is 0, the behavior is similar to numpy.concatenate. When 'new_axis' is 1, the behavior is similar to numpy.stack.
Python version: onnx_ops.concatfromsequence(input_sequence, axis, new_axis)
Constant
This operator produces a constant tensor. Exactly one of the provided attributes, either value, sparse_value, or value_* must be specified.
Python version: onnx_ops.constant(sparse_value, value, value_float, value_floats, value_int, value_ints, value_string, value_strings)
ConstantOfShape
Generate a tensor with given value and shape.
Python version: onnx_ops.constantofshape(input, value)
Conv
The convolution operator consumes an input tensor and a filter, and computes the output.
Python version: onnx_ops.conv(X, W, B, auto_pad, dilations, group, kernel_shape, pads, strides)
ConvInteger
The integer convolution operator consumes an input tensor, its zero-point, a filter, and its zero-point, and computes the output. The multiplication MUST never overflow. The accumulation may overflow if and only if it is performed in 32 bits.
Python version: onnx_ops.convinteger(x, w, x_zero_point, w_zero_point, auto_pad, dilations, group, kernel_shape, pads, strides)
ConvTranspose
The convolution transpose operator consumes an input tensor and a filter, and computes the output.
If the pads parameter is provided the shape of the output is calculated via the following equation:
output_shape[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - pads[start_i] - pads[end_i]
output_shape can also be explicitly specified in which case pads values are auto generated using these equations:
total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - output_shape[i]
If (auto_pads == SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)
Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2).
Python version: onnx_ops.convtranspose(X, W, B, auto_pad, dilations, group, kernel_shape, output_padding, output_shape, pads, strides)
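The output-shape equation for the case where pads are given can be evaluated directly. The sketch below is plain Python; the helper name is hypothetical, and it assumes `pads` is laid out as all start values followed by all end values.

```python
def conv_transpose_output_shape(input_size, kernel_shape, strides, pads,
                                dilations=None, output_padding=None):
    n = len(input_size)
    dilations = dilations or [1] * n
    output_padding = output_padding or [0] * n
    # Assumed pads layout: [start_0, ..., start_{n-1}, end_0, ..., end_{n-1}]
    return [
        strides[i] * (input_size[i] - 1) + output_padding[i]
        + ((kernel_shape[i] - 1) * dilations[i] + 1)
        - pads[i] - pads[i + n]
        for i in range(n)
    ]

# 3x3 input, 3x3 kernel, stride 2, one pixel of padding on every side.
shape = conv_transpose_output_shape([3, 3], [3, 3], strides=[2, 2], pads=[1, 1, 1, 1])
```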
Cos
Calculates the cosine of the given input tensor, element-wise.
Python version: onnx_ops.cos(input)
Cosh
Calculates the hyperbolic cosine of the given input tensor element-wise.
Python version: onnx_ops.cosh(input)
CumSum
Performs cumulative sum of the input elements along the given axis. By default, it will do the sum inclusively meaning the first element is copied as is. Through an `exclusive` attribute, this behavior can change to exclude the first element. It can also perform summation in the opposite direction of the axis. For that, set `reverse` attribute to 1.
Example:
input_x = [1, 2, 3]
axis=0
output = [1, 3, 6]
exclusive=1
output = [0, 1, 3]
exclusive=0
reverse=1
output = [6, 5, 3]
exclusive=1
reverse=1
output = [5, 3, 0]
Python version: onnx_ops.cumsum(x, axis, exclusive, reverse)
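The four combinations of `exclusive` and `reverse` from the example can be reproduced with a short NumPy sketch. The helper below is an illustrative assumption, not the `onnx_ops` code.

```python
import numpy as np

def cumsum(x, axis=0, exclusive=0, reverse=0):
    x = np.asarray(x)
    if reverse:
        x = np.flip(x, axis=axis)       # sum from the far end of the axis
    out = np.cumsum(x, axis=axis)
    if exclusive:
        out = out - x                   # drop each element's own contribution
    if reverse:
        out = np.flip(out, axis=axis)   # restore the original ordering
    return out

inclusive = cumsum([1, 2, 3])
exclusive = cumsum([1, 2, 3], exclusive=1)
reversed_sum = cumsum([1, 2, 3], reverse=1)
exclusive_reversed = cumsum([1, 2, 3], exclusive=1, reverse=1)
```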
DepthToSpace
DepthToSpace rearranges (permutes) data from depth into blocks of spatial data.
This is the reverse transformation of SpaceToDepth. More specifically, this op outputs a copy of
the input tensor where values from the depth dimension are moved in spatial blocks to the height
and width dimensions. By default, `mode` = `DCR`.
In the DCR mode, elements along the depth dimension from the input tensor are rearranged in the
following order: depth, column, and then row. The output y is computed from the input x as below:
b, c, h, w = x.shape
tmp = np.reshape(x, [b, blocksize, blocksize, c // (blocksize**2), h, w])
tmp = np.transpose(tmp, [0, 3, 4, 1, 5, 2])
y = np.reshape(tmp, [b, c // (blocksize**2), h * blocksize, w * blocksize])
In the CRD mode, elements along the depth dimension from the input tensor are rearranged in the following order: column, row, and the depth. The output y is computed from the input x as below:
b, c, h, w = x.shape
tmp = np.reshape(x, [b, c // (blocksize ** 2), blocksize, blocksize, h, w])
tmp = np.transpose(tmp, [0, 1, 4, 2, 5, 3])
y = np.reshape(tmp, [b, c // (blocksize ** 2), h * blocksize, w * blocksize])
Python version: onnx_ops.depthtospace(input, blocksize, mode)
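The two reshape/transpose recipes above can be wrapped in one function to compare the modes on a tiny input. The wrapper name `depth_to_space` is an assumption for this sketch; the body is the NumPy code from the description.

```python
import numpy as np

def depth_to_space(x, blocksize, mode="DCR"):
    b, c, h, w = x.shape
    if mode == "DCR":
        tmp = np.reshape(x, [b, blocksize, blocksize, c // (blocksize**2), h, w])
        tmp = np.transpose(tmp, [0, 3, 4, 1, 5, 2])
    else:  # "CRD"
        tmp = np.reshape(x, [b, c // (blocksize**2), blocksize, blocksize, h, w])
        tmp = np.transpose(tmp, [0, 1, 4, 2, 5, 3])
    return np.reshape(tmp, [b, c // (blocksize**2), h * blocksize, w * blocksize])

# Eight channels of a single pixel become two channels of a 2x2 block.
x = np.arange(8).reshape(1, 8, 1, 1)
y_dcr = depth_to_space(x, 2, "DCR")
y_crd = depth_to_space(x, 2, "CRD")
```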
DequantizeLinear
The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point` must have same shape, and can be either a scalar for per-tensor / per layer quantization, or a 1-D tensor for per-axis quantization. `x_zero_point` and `x` must have same type. `x` and `y` must have same shape. In the case of dequantizing int32, there's no zero point (zero point is supposed to be 0).
Python version: onnx_ops.dequantizelinear(x, x_scale, x_zero_point, axis)
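A per-tensor version of the dequantization formula can be sketched in NumPy as follows. The helper is hypothetical and handles only scalar `x_scale` and `x_zero_point`; per-axis quantization is left out.

```python
import numpy as np

def dequantize_linear(x, x_scale, x_zero_point=0):
    # Convert to float first so uint8/int8 subtraction cannot wrap around.
    x = np.asarray(x).astype(np.float32)
    return (x - np.float32(x_zero_point)) * np.float32(x_scale)

deq = dequantize_linear(np.array([0, 3, 128, 255], dtype=np.uint8),
                        x_scale=2.0, x_zero_point=128)
```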
Det
Det calculates the determinant of a square matrix or of batches of square matrices. Det takes one input tensor of shape `[*, M, M]`, where `*` is zero or more batch dimensions, and the inner-most 2 dimensions form square matrices. The output is a tensor of shape `[*]`, containing the determinants of all input submatrices. For example, when the input is 2-D, the output is a scalar (its shape is empty: `[]`).
Python version: onnx_ops.det(X)
Div
Performs element-wise binary division (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
(Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
Python version: onnx_ops.div(A, B)
Dropout
Dropout takes an input floating-point tensor, an optional input ratio (floating-point scalar) and an optional input training_mode (boolean scalar). It produces two tensor outputs, output (floating-point tensor) and mask (optional `Tensor<bool>`).
Python version: onnx_ops.dropout(data, ratio, training_mode, seed)
DynamicQuantizeLinear
A Function to fuse calculation for Scale, Zero Point and FP32->8Bit conversion of FP32 Input data.
Outputs Scale, ZeroPoint and Quantized Input for a given FP32 Input.
Scale is calculated as:
```
y_scale = (max(x) - min(x))/(qmax - qmin)
```
where qmax and qmin are the max and min values of the quantization range, i.e. [0, 255] in the case of uint8. The data range is adjusted to include 0.
Zero point is calculated as:
intermediate_zero_point = qmin - min(x)/y_scale
y_zero_point = cast(round(saturate(intermediate_zero_point)))
where qmax and qmin are the max and min values of the quantization range, i.e. [0, 255] in the case of uint8.
For saturation, it saturates to [0, 255] if it’s uint8, or [-127, 127] if it’s int8. Right now only uint8 is supported.
Rounding is to nearest, with ties to even.
Data quantization formula is:
y = saturate (round (x / y_scale) + y_zero_point)
For saturation, it saturates to [0, 255] if it’s uint8, or [-127, 127] if it’s int8. Right now only uint8 is supported.
Rounding is to nearest, with ties to even.
Python version: onnx_ops.dynamicquantizelinear(x)
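The three formulas above can be combined into a small NumPy sketch for the uint8 case. The helper name is hypothetical, and the sketch assumes the input is not all zeros (otherwise `y_scale` would be 0).

```python
import numpy as np

def dynamic_quantize_linear(x, qmin=0, qmax=255):
    x = np.asarray(x, dtype=np.float32)
    # Adjust the data range so that it always includes 0.
    x_min = min(0.0, float(x.min()))
    x_max = max(0.0, float(x.max()))
    y_scale = (x_max - x_min) / (qmax - qmin)
    intermediate_zero_point = qmin - x_min / y_scale
    # np.round rounds to nearest, ties to even; np.clip plays the role of saturate.
    y_zero_point = int(np.round(np.clip(intermediate_zero_point, qmin, qmax)))
    y = np.clip(np.round(x / y_scale) + y_zero_point, qmin, qmax).astype(np.uint8)
    return y, y_scale, y_zero_point

y, y_scale, y_zero_point = dynamic_quantize_linear([0.0, 2.0, -3.0, 1.0])
```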
Einsum
An einsum of the form `term1, term2 -> output-term` produces an output tensor using the following equation
output[output-term] = reduce-sum( input1[term1] * input2[term2] )
where the reduce-sum performs a summation over all the indices occurring in the input terms (term1, term2) that do not occur in the output-term.
The Einsum operator evaluates algebraic tensor operations on a sequence of tensors, using the Einstein summation convention. The equation string contains a comma-separated sequence of lower case letters. Each term corresponds to an operand tensor, and the characters within the terms correspond to operands dimensions.
This sequence may be followed by “->” to separate the left and right hand side of the equation. If the equation contains “->” followed by the right-hand side, the explicit (not classical) form of the Einstein summation is performed, and the right-hand side indices indicate output tensor dimensions. In other cases, output indices are (implicitly) set to the alphabetically sorted sequence of indices appearing exactly once in the equation.
When a dimension character is repeated in the left-hand side, it represents summation along the dimension.
The equation may contain an ellipsis ("...") to enable broadcasting. An ellipsis must indicate a fixed number of dimensions. Specifically, every occurrence of an ellipsis in the equation must represent the same number of dimensions. The right-hand side may contain exactly one ellipsis. In implicit mode, the ellipsis dimensions are set to the beginning of the output. The equation string may contain the space (U+0020) character.
Python version: onnx_ops.einsum(Inputs, equation)
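The explicit form, the implicit form, a repeated index and the ellipsis can each be tried with `numpy.einsum`, which uses the same equation-string convention. This is a NumPy illustration only; whether every NumPy equation is accepted by the ONNX operator is not checked here.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)

matmul = np.einsum("ij,jk->ik", a, b)   # explicit form: j is summed over
implicit = np.einsum("ij,jk", a, b)     # implicit form: output indices are "ik"
trace = np.einsum("ii", np.eye(3))      # repeated index: summation along the diagonal
batched = np.einsum("...ij->...ji", np.zeros((5, 2, 3)))  # ellipsis covers the batch dimension
```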
Elu
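Elu takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the function `f(x) = alpha * (exp(x) - 1.) for x < 0`, `f(x) = x for x >= 0`, is applied to the tensor elementwise.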
Python version: onnx_ops.elu(X, alpha)
Equal
Returns the tensor resulting from performing the `equal` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.equal(A, B)
Erf
Computes the error function of the given input tensor element-wise.
Python version: onnx_ops.erf(input)
Exp
Calculates the exponential of the given input tensor, element-wise.
Python version: onnx_ops.exp(input)
Expand
Broadcast the input tensor following the given shape and the broadcast rule. The broadcast rule is similar to numpy.array(input) * numpy.ones(shape): dimensions are right-aligned, and two corresponding dimensions must either have the same value or one of them must be equal to 1. This operator is also similar to numpy.broadcast_to(input, shape), but the major difference is that numpy.broadcast_to() does not allow shape to be smaller than input.size(). It is possible that output.shape is not equal to shape, when some dimensions in shape are equal to 1, or when shape.ndim < input.shape.ndim.
Python version: onnx_ops.expand(input, shape)
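The `input * ones(shape)` rule from the description can be tried directly in NumPy. The helper `expand` is a hypothetical name for this sketch; the second call shows a case where the output shape differs from the requested shape.

```python
import numpy as np

def expand(x, shape):
    x = np.asarray(x)
    # Multiplying by ones of the target shape applies the two-way broadcast rule.
    return x * np.ones(shape, dtype=x.dtype)

x = np.array([[1], [2], [3]])   # shape (3, 1)
wide = expand(x, [3, 4])        # broadcast to shape (3, 4)
kept = expand(x, [1, 1])        # shape stays (3, 1), not (1, 1)
```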
EyeLike
Generate a 2D tensor (matrix) with ones on the diagonal and zeros everywhere else. Only 2D tensors are supported, i.e. input T1 must be of rank 2. The shape of the output tensor is the same as the input tensor. The data type can be specified by the 'dtype' argument. If 'dtype' is not specified, then the type of input tensor is used. By default, the main diagonal is populated with ones, but attribute 'k' can be used to populate upper or lower diagonals. The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the TensorProto message and be valid as an output type.
Python version: onnx_ops.eyelike(input, dtype, k)
Flatten
Flattens the input tensor into a 2D matrix. If input tensor has shape (d_0, d_1, ... d_n) then the output will have shape (d_0 X d_1 ... d_(axis-1), d_axis X d_(axis+1) ... X dn).
Python version: onnx_ops.flatten(input, axis)
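The output shape rule can be sketched with a NumPy reshape. The helper below is illustrative only, and the default `axis=1` is an assumption of this example.

```python
import numpy as np

def flatten(x, axis=1):
    x = np.asarray(x)
    # Product of the dimensions before `axis`; the empty product is 1 when axis == 0.
    outer = int(np.prod(x.shape[:axis]))
    return x.reshape(outer, -1)

flat = flatten(np.zeros((2, 3, 4, 5)), axis=2)      # shape (6, 20)
flat_all = flatten(np.zeros((2, 3, 4, 5)), axis=0)  # shape (1, 120)
```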
Floor
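Floor takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the floor function, y = floor(x), is applied to the tensor elementwise.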
Python version: onnx_ops.floor(X)
GRU
Computes a one-layer GRU. This operator is usually supported via some custom implementation such as CuDNN.
Notations:
X - input tensor
z - update gate
r - reset gate
h - hidden gate
t - time step (t-1 means previous time step)
W[zrh] - W parameter weight matrix for update, reset, and hidden gates
R[zrh] - R recurrence weight matrix for update, reset, and hidden gates
Wb[zrh] - W bias vectors for update, reset, and hidden gates
Rb[zrh] - R bias vectors for update, reset, and hidden gates
WB[zrh] - W parameter weight matrix for backward update, reset, and hidden gates
RB[zrh] - R recurrence weight matrix for backward update, reset, and hidden gates
WBb[zrh] - W bias vectors for backward update, reset, and hidden gates
RBb[zrh] - R bias vectors for backward update, reset, and hidden gates
H - Hidden state
num_directions - 2 if direction == bidirectional else 1
Activation functions:
Relu(x) - max(0, x)
Tanh(x) - (1 - e^{-2x})/(1 + e^{-2x})
Sigmoid(x) - 1/(1 + e^{-x})
NOTE: Below are optional
Affine(x) - alpha * x + beta
LeakyRelu(x) - x if x >= 0 else alpha * x
ThresholdedRelu(x) - x if x >= alpha else 0
ScaledTanh(x) - alpha * Tanh(beta * x)
HardSigmoid(x) - min(max(alpha * x + beta, 0), 1)
Elu(x) - x if x >= 0 else alpha * (e^x - 1)
Softsign(x) - x/(1 + |x|)
Softplus(x) - log(1 + e^x)
Equations (Default: f=Sigmoid, g=Tanh):
zt = f(Xt*(Wz^T) + Ht-1*(Rz^T) + Wbz + Rbz)
rt = f(Xt*(Wr^T) + Ht-1*(Rr^T) + Wbr + Rbr)
ht = g(Xt*(Wh^T) + (rt (.) Ht-1)*(Rh^T) + Rbh + Wbh) # default, when linear_before_reset = 0
ht = g(Xt*(Wh^T) + (rt (.) (Ht-1*(Rh^T) + Rbh)) + Wbh) # when linear_before_reset != 0
Ht = (1 - zt) (.) ht + zt (.) Ht-1
This operator has optional inputs/outputs. See the doc for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument’s name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
Python version: onnx_ops.gru(X, W, R, B, sequence_lens, initial_h, activation_alpha, activation_beta, activations, clip, direction, hidden_size, layout, linear_before_reset)
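The GRU equations for one time step in a single direction can be sketched in NumPy with the default activations. The function `gru_step` and the per-gate dictionaries are assumptions made for readability; the actual operator packs the gate weights into the `W`, `R` and `B` tensors and also handles sequences, clipping and both directions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(Xt, H_prev, W, R, Wb, Rb, linear_before_reset=0):
    """One GRU time step; W, R, Wb, Rb are dicts keyed by gate: 'z', 'r', 'h'."""
    zt = sigmoid(Xt @ W["z"].T + H_prev @ R["z"].T + Wb["z"] + Rb["z"])
    rt = sigmoid(Xt @ W["r"].T + H_prev @ R["r"].T + Wb["r"] + Rb["r"])
    if linear_before_reset:
        ht = np.tanh(Xt @ W["h"].T + rt * (H_prev @ R["h"].T + Rb["h"]) + Wb["h"])
    else:
        ht = np.tanh(Xt @ W["h"].T + (rt * H_prev) @ R["h"].T + Rb["h"] + Wb["h"])
    return (1.0 - zt) * ht + zt * H_prev

# With all-zero parameters: zt = 0.5 and ht = 0, so Ht is half of the previous state.
W0 = {g: np.zeros((4, 3)) for g in "zrh"}   # hidden_size=4, input_size=3
R0 = {g: np.zeros((4, 4)) for g in "zrh"}
b0 = {g: np.zeros(4) for g in "zrh"}
Ht = gru_step(np.ones((2, 3)), np.ones((2, 4)), W0, R0, b0, b0)
```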
Gather
Given `data` tensor of rank r >= 1, and `indices` tensor of rank q, gather entries of the axis dimension of `data` (by default outer-most one as axis=0) indexed by `indices`, and concatenates them in an output tensor of rank q + (r - 1).
If axis = 0, let k = indices[i_{0}, ..., i_{q-1}]; then output[i_{0}, ..., i_{q-1}, j_{0}, ..., j_{r-2}] = input[k, j_{0}, ..., j_{r-2}]:
data = [
[1.0, 1.2],
[2.3, 3.4],
[4.5, 5.7],
]
indices = [
[0, 1],
[1, 2],
]
output = [
[
[1.0, 1.2],
[2.3, 3.4],
],
[
[2.3, 3.4],
[4.5, 5.7],
],
]
If axis = 1, let k = indices[i_{0}, ..., i_{q-1}]; then output[j_{0}, i_{0}, ..., i_{q-1}, j_{1}, ..., j_{r-2}] = input[j_{0}, k, j_{1}, ..., j_{r-2}]:
data = [
[1.0, 1.2, 1.9],
[2.3, 3.4, 3.9],
[4.5, 5.7, 5.9],
]
indices = [
[0, 2],
]
axis = 1,
output = [
[[1.0, 1.9]],
[[2.3, 3.9]],
[[4.5, 5.9]],
]
Python version: onnx_ops.gather(data, indices, axis)
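For the shapes discussed above, `numpy.take` produces an output of rank q + (r - 1) in the same way. The sketch below uses NumPy only and is not the `onnx_ops` call.

```python
import numpy as np

data = np.array([[1.0, 1.2, 1.9],
                 [2.3, 3.4, 3.9],
                 [4.5, 5.7, 5.9]])

# axis=0: every index selects a whole row; output rank is 2 + (2 - 1) = 3.
rows = np.take(data, np.array([[0, 1], [1, 2]]), axis=0)   # shape (2, 2, 3)
# axis=1: every index selects a column entry within each row.
cols = np.take(data, np.array([[0, 2]]), axis=1)           # shape (3, 1, 2)
```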
GatherElements
GatherElements takes two inputs `data` and `indices` of the same rank r >= 1 and an optional attribute `axis` that identifies an axis of `data` (by default, the outer-most axis, that is axis 0). It is an indexing operation that produces its output by indexing into the input data tensor at index positions determined by elements of the `indices` tensor. Its output shape is the same as the shape of `indices` and consists of one value (gathered from the `data`) for each element in `indices`.
For instance, in the 3-D case (r = 3), the output produced is determined by the following equations:
out[i][j][k] = input[index[i][j][k]][j][k] if axis = 0,
out[i][j][k] = input[i][index[i][j][k]][k] if axis = 1,
out[i][j][k] = input[i][j][index[i][j][k]] if axis = 2,
This operator is also the inverse of ScatterElements. It is similar to Torch’s gather operation.
Example 1:
data = [
[1, 2],
[3, 4],
]
indices = [
[0, 0],
[1, 0],
]
axis = 1
output = [
[1, 1],
[4, 3],
]
Example 2:
data = [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
]
indices = [
[1, 2, 0],
[2, 0, 0],
]
axis = 0
output = [
[4, 8, 3],
[7, 2, 3],
]
Python version: onnx_ops.gatherelements(data, indices, axis)
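Both examples can be reproduced with `numpy.take_along_axis`, which follows the same per-element indexing equations for these inputs. This is a NumPy sketch, not the `onnx_ops` implementation.

```python
import numpy as np

# Example 1: out[i][j] = data[i][indices[i][j]] for axis = 1
data1 = np.array([[1, 2], [3, 4]])
indices1 = np.array([[0, 0], [1, 0]])
out1 = np.take_along_axis(data1, indices1, axis=1)

# Example 2: out[i][j] = data[indices[i][j]][j] for axis = 0
data2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
indices2 = np.array([[1, 2, 0], [2, 0, 0]])
out2 = np.take_along_axis(data2, indices2, axis=0)
```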
GatherND
Given `data` tensor of rank `r` >= 1, `indices` tensor of rank `q` >= 1, and `batch_dims` integer `b`, this operator gathers slices of `data` into an output tensor of rank `q + r - indices_shape[-1] - 1 - b`.
`indices` is a q-dimensional integer tensor, best thought of as a `(q-1)`-dimensional tensor of index-tuples into `data`, where each element defines a slice of `data`.
`batch_dims` (denoted as `b`) is an integer indicating the number of batch dimensions, i.e. the leading `b` dimensions of the `data` tensor and the `indices` tensor represent the batches, and the gather starts from the `b+1` dimension.
Some salient points about the inputs’ rank and shape:
- r >= 1 and q >= 1 are to be honored. There is no dependency condition to be met between ranks `r` and `q`.
- The first `b` dimensions of the shape of the `indices` tensor and the `data` tensor must be equal.
- b < min(q, r) is to be honored.
- `indices_shape[-1]` should have a value between 1 (inclusive) and rank `r-b` (inclusive).
- All values in `indices` are expected to be within bounds [-s, s-1] along an axis of size `s` (i.e. `-data_shape[i] <= indices[...,i] <= data_shape[i] - 1`). It is an error if any of the index values are out of bounds.
The output is computed as follows:
The output tensor is obtained by mapping each index-tuple in the `indices` tensor to the corresponding slice of the input `data`.
- If `indices_shape[-1] > r-b` => error condition.
- If `indices_shape[-1] == r-b`, since the rank of `indices` is `q`, `indices` can be thought of as `N` `(q-b-1)`-dimensional tensors containing 1-D tensors of dimension `r-b`, where `N` is an integer equal to the product of 1 and all the elements in the batch dimensions of the indices_shape. Let us think of each such `r-b` ranked tensor as `indices_slice`. Each scalar value corresponding to `data[0:b-1,indices_slice]` is filled into the corresponding location of the `(q-b-1)`-dimensional tensor to form the `output` tensor (Example 1 below).
- If `indices_shape[-1] < r-b`, since the rank of `indices` is `q`, `indices` can be thought of as `N` `(q-b-1)`-dimensional tensors containing 1-D tensors of dimension `< r-b`. Let us think of each such tensor as `indices_slice`. Each tensor slice corresponding to `data[0:b-1, indices_slice , :]` is filled into the corresponding location of the `(q-b-1)`-dimensional tensor to form the `output` tensor (Examples 2, 3, 4 and 5 below).
This operator is the inverse of `ScatterND`.
Example 1
batch_dims = 0
data = [[0,1],[2,3]] # data_shape = [2, 2]
indices = [[0,0],[1,1]] # indices_shape = [2, 2]
output = [0,3] # output_shape = [2]
Example 2
batch_dims = 0
data = [[0,1],[2,3]] # data_shape = [2, 2]
indices = [[1],[0]] # indices_shape = [2, 1]
output = [[2,3],[0,1]] # output_shape = [2, 2]
Example 3
batch_dims = 0
data = [[[0,1],[2,3]],[[4,5],[6,7]]] # data_shape = [2, 2, 2]
indices = [[0,1],[1,0]] # indices_shape = [2, 2]
output = [[2,3],[4,5]] # output_shape = [2, 2]
Example 4
batch_dims = 0
data = [[[0,1],[2,3]],[[4,5],[6,7]]] # data_shape = [2, 2, 2]
indices = [[[0,1]],[[1,0]]] # indices_shape = [2, 1, 2]
output = [[[2,3]],[[4,5]]] # output_shape = [2, 1, 2]
Example 5
batch_dims = 1
data = [[[0,1],[2,3]],[[4,5],[6,7]]] # data_shape = [2, 2, 2]
indices = [[1],[0]] # indices_shape = [2, 1]
output = [[2,3],[4,5]] # output_shape = [2, 2]
Python version: onnx_ops.gathernd(data, indices, batch_dims)
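The five examples can be reproduced with NumPy advanced indexing. The helper `gather_nd` is a hypothetical sketch that performs no bounds or shape validation; it treats the last axis of `indices` as index-tuples and peels off batch dimensions recursively.

```python
import numpy as np

def gather_nd(data, indices, batch_dims=0):
    data, indices = np.asarray(data), np.asarray(indices)
    if batch_dims == 0:
        # Each index-tuple along the last axis selects one slice of data.
        return data[tuple(np.moveaxis(indices, -1, 0))]
    # Peel off one batch dimension at a time and recurse.
    return np.stack([gather_nd(d, i, batch_dims - 1) for d, i in zip(data, indices)])

data3 = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
ex1 = gather_nd([[0, 1], [2, 3]], [[0, 0], [1, 1]])
ex2 = gather_nd([[0, 1], [2, 3]], [[1], [0]])
ex3 = gather_nd(data3, [[0, 1], [1, 0]])
ex4 = gather_nd(data3, [[[0, 1]], [[1, 0]]])
ex5 = gather_nd(data3, [[1], [0]], batch_dims=1)
```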
Gemm
General Matrix multiplication:
https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms#Level_3
A’ = transpose(A) if transA else A
B’ = transpose(B) if transB else B
Compute Y = alpha * A’ * B’ + beta * C, where input tensor A has shape (M, K) or (K, M), input tensor B has shape (K, N) or (N, K), input tensor C is broadcastable to shape (M, N), and output tensor Y has shape (M, N). A will be transposed before doing the computation if attribute transA is non-zero, same for B and transB. This operator supports unidirectional broadcasting (tensor C should be unidirectional broadcastable to tensor A * B); for more details please check the doc. This operator has optional inputs/outputs. See the doc for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument’s name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
Python version: onnx_ops.gemm(A, B, C, alpha, beta, transA, transB)
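The computation Y = alpha * A’ * B’ + beta * C can be sketched in NumPy as follows. The helper is illustrative; the default values of `alpha` and `beta` are assumptions of this example.

```python
import numpy as np

def gemm(A, B, C=None, alpha=1.0, beta=1.0, transA=0, transB=0):
    A_ = A.T if transA else A
    B_ = B.T if transB else B
    Y = alpha * (A_ @ B_)
    if C is not None:
        Y = Y + beta * C   # C is broadcast to shape (M, N)
    return Y

A = np.ones((3, 2))        # (K, M), transposed to (M, K) = (2, 3)
B = np.ones((3, 4))        # (K, N)
C = np.arange(4.0)         # broadcast along the rows of Y
Y = gemm(A, B, C, alpha=0.5, beta=2.0, transA=1)
```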
GlobalAveragePool
GlobalAveragePool consumes an input tensor X and applies average pooling across the values in the same channel. This is equivalent to AveragePool with kernel size equal to the spatial dimension of input tensor.
Python version: onnx_ops.globalaveragepool(X)
GlobalLpPool
GlobalLpPool consumes an input tensor X and applies Lp pooling across the values in the same channel. This is equivalent to LpPool with kernel size equal to the spatial dimension of input tensor.
Python version: onnx_ops.globallppool(X, p)
GlobalMaxPool
GlobalMaxPool consumes an input tensor X and applies max pooling across the values in the same channel. This is equivalent to MaxPool with kernel size equal to the spatial dimension of input tensor.
Python version: onnx_ops.globalmaxpool(X)
Greater
Returns the tensor resulting from performing the `greater` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.greater(A, B)
GreaterOrEqual
Returns the tensor resulting from performing the `greater_equal` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.greaterorequal(A, B)
HardSigmoid
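HardSigmoid takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the HardSigmoid function, y = max(0, min(1, alpha * x + beta)), is applied to the tensor elementwise.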
Python version: onnx_ops.hardsigmoid(X, alpha, beta)
HardSwish
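HardSwish takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the HardSwish function, y = x * max(0, min(1, alpha * x + beta)) = x * HardSigmoid<alpha, beta>(x), where alpha = 1/6 and beta = 0.5, is applied to the tensor elementwise.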
Python version: onnx_ops.hardswish(X)
Hardmax
The operator computes the hardmax values for the given input:
Hardmax(element in input, axis) = 1 if the element is the first maximum value along the specified axis, 0 otherwise
The “axis” attribute indicates the dimension along which Hardmax will be performed. The output tensor has the same shape and contains the Hardmax values of the corresponding input.
Python version: onnx_ops.hardmax(input, axis)
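The "first maximum" rule can be sketched with `numpy.argmax`, which also returns the first occurrence when there are ties. The helper name and the default `axis=-1` are assumptions of this example.

```python
import numpy as np

def hardmax(x, axis=-1):
    x = np.asarray(x)
    # argmax returns the index of the first maximum along the axis.
    first_max = np.expand_dims(np.argmax(x, axis=axis), axis)
    out = np.zeros_like(x)
    np.put_along_axis(out, first_max, 1, axis=axis)
    return out

# The tie in the first row is resolved in favour of the earlier position.
hm = hardmax([[1, 3, 3], [2, 0, 1]])
```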
Identity
Identity operator
Python version: onnx_ops.identity(input)
InstanceNormalization
Carries out instance normalization as described in the paper https://arxiv.org/abs/1607.08022.
y = scale * (x - mean) / sqrt(variance + epsilon) + B, where mean and variance are computed per instance per channel.
Python version: onnx_ops.instancenormalization(input, scale, B, epsilon)
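The normalization formula can be sketched in NumPy for an input laid out as (N, C, spatial dimensions...). The helper name, the layout and the default `epsilon` are assumptions of this example.

```python
import numpy as np

def instance_normalization(x, scale, B, epsilon=1e-5):
    spatial = tuple(range(2, x.ndim))             # reduce over spatial axes only
    mean = x.mean(axis=spatial, keepdims=True)    # one mean per instance and channel
    var = x.var(axis=spatial, keepdims=True)
    shape = (1, -1) + (1,) * (x.ndim - 2)         # broadcast per-channel parameters
    return scale.reshape(shape) * (x - mean) / np.sqrt(var + epsilon) + B.reshape(shape)

x_in = np.random.default_rng(0).normal(size=(2, 3, 4, 4))
y_in = instance_normalization(x_in, scale=np.ones(3), B=np.zeros(3))
```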
IsInf
Map infinity to true and other values to false.
Python version: onnx_ops.isinf(X, detect_negative, detect_positive)
IsNaN
Returns which elements of the input are NaN.
Python version: onnx_ops.isnan(X)
LRN
Local Response Normalization proposed in the [AlexNet paper](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf). It normalizes over local input regions. The local region is defined across the channels. For an element `X[n, c, d1, ..., dk]` in a tensor of shape `(N x C x D1 x D2, ..., Dk)`, its region is `{X[n, i, d1, ..., dk] | max(0, c - floor((size - 1) / 2)) <= i <= min(C - 1, c + ceil((size - 1) / 2))}`.
square_sum[n, c, d1, ..., dk] = sum(X[n, i, d1, ..., dk] ^ 2), where max(0, c - floor((size - 1) / 2)) <= i <= min(C - 1, c + ceil((size - 1) / 2)).
Y[n, c, d1, ..., dk] = X[n, c, d1, ..., dk] / (bias + alpha / size * square_sum[n, c, d1, ..., dk] ) ^ beta
Python version: onnx_ops.lrn(X, alpha, beta, bias, size)
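The channel window and the normalization formula can be sketched with a simple loop over channels. The helper below is illustrative, not optimized, and its default parameter values are assumptions of this example.

```python
import math
import numpy as np

def lrn(X, size, alpha=1e-4, beta=0.75, bias=1.0):
    C = X.shape[1]
    square_sum = np.zeros_like(X, dtype=float)
    for c in range(C):
        # Channel window around c, clipped to the valid range [0, C - 1].
        lo = max(0, c - math.floor((size - 1) / 2))
        hi = min(C - 1, c + math.ceil((size - 1) / 2))
        square_sum[:, c] = np.sum(X[:, lo:hi + 1] ** 2, axis=1)
    return X / (bias + alpha / size * square_sum) ** beta

# Four channels of ones: edge channels see 2 neighbours, inner channels see 3.
lrn_out = lrn(np.ones((1, 4, 1, 1)), size=3, alpha=3.0, beta=1.0, bias=1.0)
```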
LSTM
Computes a one-layer LSTM. This operator is usually supported via some custom implementation such as CuDNN.
Notations:
X - input tensor
i - input gate
o - output gate
f - forget gate
c - cell gate
t - time step (t-1 means previous time step)
W[iofc] - W parameter weight matrix for input, output, forget, and cell gates
R[iofc] - R recurrence weight matrix for input, output, forget, and cell gates
Wb[iofc] - W bias vectors for input, output, forget, and cell gates
Rb[iofc] - R bias vectors for input, output, forget, and cell gates
P[iof] - P peephole weight vector for input, output, and forget gates
WB[iofc] - W parameter weight matrix for backward input, output, forget, and cell gates
RB[iofc] - R recurrence weight matrix for backward input, output, forget, and cell gates
WBb[iofc] - W bias vectors for backward input, output, forget, and cell gates
RBb[iofc] - R bias vectors for backward input, output, forget, and cell gates
PB[iof] - P peephole weight vector for backward input, output, and forget gates
H - Hidden state
num_directions - 2 if direction == bidirectional else 1
Activation functions:
Relu(x) - max(0, x)
Tanh(x) - (1 - e^{-2x})/(1 + e^{-2x})
Sigmoid(x) - 1/(1 + e^{-x})
NOTE: Below are optional
Affine(x) - alpha*x + beta
LeakyRelu(x) - x if x >= 0 else alpha * x
ThresholdedRelu(x) - x if x >= alpha else 0
ScaledTanh(x) - alpha * Tanh(beta * x)
HardSigmoid(x) - min(max(alpha*x + beta, 0), 1)
Elu(x) - x if x >= 0 else alpha*(e^x - 1)
Softsign(x) - x/(1 + |x|)
Softplus(x) - log(1 + e^x)
Equations (Default: f=Sigmoid, g=Tanh, h=Tanh):
it = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Pi (.) Ct-1 + Wbi + Rbi)
ft = f(Xt*(Wf^T) + Ht-1*(Rf^T) + Pf (.) Ct-1 + Wbf + Rbf)
ct = g(Xt*(Wc^T) + Ht-1*(Rc^T) + Wbc + Rbc)
Ct = ft (.) Ct-1 + it (.) ct
ot = f(Xt*(Wo^T) + Ht-1*(Ro^T) + Po (.) Ct + Wbo + Rbo)
Ht = ot (.) h(Ct)
This operator has optional inputs/outputs. See the doc for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument’s name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
Python version: onnx_ops.lstm(X, W, R, B, sequence_lens, initial_h, initial_c, P, activation_alpha, activation_beta, activations, clip, direction, hidden_size, input_forget, layout)
LeakyRelu
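LeakyRelu takes input data (Tensor<T>) and an argument alpha, and produces one output data (Tensor<T>) where the function `f(x) = alpha * x for x < 0`, `f(x) = x for x >= 0`, is applied to the data tensor elementwise.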
Python version: onnx_ops.leakyrelu(X, alpha)
Less
Returns the tensor resulting from performing the `less` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.less(A, B)
LessOrEqual
Returns the tensor resulting from performing the `less_equal` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.lessorequal(A, B)
Log
Calculates the natural log of the given input tensor, element-wise.
Python version: onnx_ops.log(input)
LogSoftmax
The operator computes the log of softmax values for the given input:
LogSoftmax(input, axis) = Log(Softmax(input, axis=axis))
The “axis” attribute indicates the dimension along which LogSoftmax will be performed. The output tensor has the same shape and contains the LogSoftmax values of the corresponding input.
Python version: onnx_ops.logsoftmax(input, axis)
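The definition Log(Softmax(input, axis)) can be sketched in NumPy. Subtracting the maximum first is a common numerical-stability trick that leaves the result unchanged; it is an addition of this sketch and not something the description above prescribes.

```python
import numpy as np

def log_softmax(x, axis=-1):
    x = np.asarray(x, dtype=float)
    shifted = x - x.max(axis=axis, keepdims=True)   # avoids overflow in exp
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

log_probs = log_softmax([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]])
```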
LpNormalization
Given a matrix, apply Lp-normalization along the provided axis.
Python version: onnx_ops.lpnormalization(input, axis, p)
LpPool
LpPool consumes an input tensor X and applies Lp pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Lp pooling consists of computing the Lp norm on all values of a subset of the input tensor according to the kernel size and downsampling the data into the output tensor Y for further processing.
Python version: onnx_ops.lppool(X, auto_pad, kernel_shape, p, pads, strides)
MatMul
Matrix product that behaves like numpy.matmul: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html
Python version: onnx_ops.matmul(A, B)
MatMulInteger
Matrix product that behaves like numpy.matmul: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html. The multiplication MUST never overflow. The accumulation may overflow if and only if it is performed in 32 bits.
Python version: onnx_ops.matmulinteger(A, B, a_zero_point, b_zero_point)
Max
Element-wise max of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.max(data_0)
MaxPool
MaxPool consumes an input tensor X and applies max pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Max pooling consists of computing the max on all values of a subset of the input tensor according to the kernel size and downsampling the data into the output tensor Y for further processing. The output spatial shape will be the following:
```
output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)
```
or, if ceil_mode is enabled:
```
output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i] + 1)
```
`pad_shape[i]` is the sum of pads along axis `i`.
auto_pad is a DEPRECATED attribute. If you are using it currently, the output spatial shape will be the following:
VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
And the pad shape will be the following if SAME_UPPER or SAME_LOWER:
pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
The output of each pooling window is the maximum of its elements, excluding padding.
Python version: onnx_ops.maxpool(X, auto_pad, ceil_mode, dilations, kernel_shape, pads, storage_order, strides)
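The explicit-padding output shape formulas (floor, or ceil when ceil_mode is enabled) can be evaluated with a few lines of plain Python. The helper name is hypothetical, and it assumes `pads` lists all begin values followed by all end values.

```python
import math

def maxpool_output_shape(input_shape, kernel_shape, strides, pads,
                         dilations=None, ceil_mode=0):
    n = len(input_shape)
    dilations = dilations or [1] * n
    rounding = math.ceil if ceil_mode else math.floor
    out = []
    for i in range(n):
        pad_total = pads[i] + pads[i + n]   # assumed layout: begins, then ends
        effective_kernel = (kernel_shape[i] - 1) * dilations[i] + 1
        out.append(rounding((input_shape[i] + pad_total - effective_kernel) / strides[i] + 1))
    return out

floor_shape = maxpool_output_shape([32, 32], [3, 3], [2, 2], [0, 0, 0, 0])
ceil_shape = maxpool_output_shape([32, 32], [3, 3], [2, 2], [0, 0, 0, 0], ceil_mode=1)
```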
MaxRoiPool
ROI max pool consumes an input tensor X and region of interests (RoIs) to apply max pooling across each RoI, to produce output 4-D tensor of shape (num_rois, channels, pooled_shape[0], pooled_shape[1]).
Python version: onnx_ops.maxroipool(X, rois, pooled_shape, spatial_scale)
MaxUnpool
MaxUnpool essentially computes the partial inverse of the MaxPool op. The input information to this op is typically the output information from a MaxPool op. The first input tensor X is the tensor that needs to be unpooled, which is typically the pooled tensor (first output) from MaxPool. The second input tensor, I, contains the indices to the (locally maximal) elements corresponding to the elements in the first input tensor X. Input tensor I is typically the second output of the MaxPool op. The third (optional) input is a tensor that specifies the output size of the unpooling operation.
MaxUnpool is intended to do ‘partial’ inverse of the MaxPool op. ‘Partial’ because all the non-maximal values from the original input to MaxPool are set to zero in the output of the MaxUnpool op. Pooling the result of an unpooling operation should give back the original input to the unpooling op.
MaxUnpool can produce the same output size for several input sizes, which makes unpooling op ambiguous. The third input argument, output_size, is meant to disambiguate the op and produce output tensor of known/predictable size.
In addition to the inputs, MaxUnpool takes three attributes, namely kernel_shape, strides, and pads, which define the exact unpooling op. The attributes typically have the same values as the corresponding pooling op that the unpooling op is trying to invert.
Python version: onnx_ops.maxunpool(X, I, output_shape, kernel_shape, pads, strides)
Mean
Element-wise mean of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.mean(data_0)
MeanVarianceNormalization
A MeanVarianceNormalization Function: Perform mean variance normalization on the input tensor X using formula: `(X-EX)/sqrt(E(X-EX)^2)`
Python version: onnx_ops.meanvariancenormalization(X, axes)
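The formula `(X-EX)/sqrt(E(X-EX)^2)` can be sketched in NumPy as follows. The helper name and the default `axes=(0, 2, 3)` are assumptions of this example, and no epsilon is added, so the input must not be constant along the reduced axes.

```python
import numpy as np

def mean_variance_normalization(X, axes=(0, 2, 3)):
    mean = X.mean(axis=axes, keepdims=True)                          # EX
    std = np.sqrt(((X - mean) ** 2).mean(axis=axes, keepdims=True))  # sqrt(E(X-EX)^2)
    return (X - mean) / std

x_mvn = np.arange(24, dtype=float).reshape(2, 3, 2, 2)
y_mvn = mean_variance_normalization(x_mvn)
```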
Min
Element-wise min of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.min(data_0)
Mod
Performs element-wise binary modulus (with Numpy-style broadcasting support). The sign of the remainder is the same as that of the Divisor.
Mod operator can also behave like C fmod() or numpy.fmod. In this case, the sign of the remainder however, will be the same as the Dividend (in contrast to integer mod). To force a behavior like numpy.fmod() an ‘fmod’ Attribute is provided. This attribute is set to 0 by default causing the behavior to be like integer mod. Setting this attribute to 1 causes the remainder to be calculated similar to that of numpy.fmod().
If the input type is floating point, then the fmod attribute must be set to 1.
In case of dividend being zero, the results will be platform dependent.
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
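The difference between the two modes can be seen with NumPy's own `np.mod` and `np.fmod`. The wrapper name `onnx_style_mod` below is hypothetical and only mirrors the description above; it is not the onnx_ops implementation.

```python
import numpy as np

def onnx_style_mod(A, B, fmod=0):
    """Hypothetical wrapper: fmod=0 follows the divisor's sign, fmod=1 the dividend's."""
    A = np.asarray(A)
    B = np.asarray(B)
    if fmod:
        return np.fmod(A, B)  # C-style remainder, sign of the dividend
    return np.mod(A, B)       # integer-mod style, sign of the divisor

integer_style = onnx_style_mod([-7, 7], [3, -3])       # [2, -2]
c_style = onnx_style_mod([-7, 7], [3, -3], fmod=1)     # [-1, 1]
```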
Python version: onnx_ops.mod(A, B, fmod)
Mul
Performs element-wise binary multiplication (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
(Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
Python version: onnx_ops.mul(A, B)
Multinomial
Generate a tensor of samples from a multinomial distribution according to the probabilities of each of the possible outcomes.
Python version: onnx_ops.multinomial(input, dtype, sample_size, seed)
Neg
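Neg takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the sign of each element is flipped, y = -x, applied to the tensor elementwise.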
Python version: onnx_ops.neg(X)
NegativeLogLikelihoodLoss
A NegativeLogLikelihoodLoss operator computes (weighted) negative log likelihood loss.
Its "input" tensor has the shape of (N, C, d1, d2, ..., dk) where k >= 0.
The "input" tensor contains log-probabilities for input[n, :, d_1, d_2,..., d_k] being in a class of [0, C).
The operator's "target" input tensor has the shape of (N, d1, d2, ..., dk). It encodes class labels (one of C classes)
or it may contain a special value (indicated by an attribute ignore_index) for N x d1 x d2 x ... x dk samples.
The loss value for input[n, :, d_1, d_2,...d_k] being classified as class c = target[n][d_1][d_2]...[d_k] is computed as:
loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k].
When an optional “weight” is provided, the sample loss is calculated as:
loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k] * weight[c].
loss is zero for the case when target-value equals ignore_index.
loss[n][d_1][d_2]...[d_k] = 0, when target[n][d_1][d_2]...[d_k] = ignore_index
If “reduction” attribute is set to “none”, the operator’s output will be the above loss with shape (N, d1, d2, …, dk). If “reduction” attribute is set to “mean” (the default attribute value), the output loss is (weight) averaged:
mean(loss), if "weight" is not provided,
or if weight is provided,
sum(loss) / sum(weight[target[n][d_1][d_2]...[d_k]]), for all samples.
If “reduction” attribute is set to “sum”, the output is a scalar: sum(loss).
See also https://pytorch.org/docs/stable/nn.html#torch.nn.NLLLoss.
Example 1:
# negative log likelihood loss, "none" reduction
import numpy as np
N, C, d1 = 2, 3, 2
input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
         [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
target = [[2, 1], [0, 2]]
loss = np.zeros((N, d1))
for n in range(N):
    for d_1 in range(d1):
        c = target[n][d_1]
        loss[n][d_1] = -input[n][c][d_1]
# print(loss)
# [[-3. -2.]
#  [-0. -2.]]
Example 2:
# weighted negative log likelihood loss, sum reduction
import numpy as np
N, C, d1 = 2, 3, 2
input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
         [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
target = [[2, 1], [0, 2]]
weight = [0.2, 0.3, 0.1]
loss = np.zeros((N, d1))
for n in range(N):
    for d_1 in range(d1):
        c = target[n][d_1]
        loss[n][d_1] = -input[n][c][d_1] * weight[c]
loss = np.sum(loss)
# print(loss)
# -1.1
Example 3:
# weighted negative log likelihood loss, mean reduction
import numpy as np
N, C, d1 = 2, 3, 2
input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
         [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
target = [[2, 1], [0, 2]]
weight = [0.2, 0.3, 0.1]
loss = np.zeros((N, d1))
weight_total = 0
for n in range(N):
    for d_1 in range(d1):
        c = target[n][d_1]
        loss[n][d_1] = -input[n][c][d_1] * weight[c]
        weight_total = weight_total + weight[c]
loss = np.sum(loss) / weight_total
# print(loss)
# -1.57
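The three examples can be folded into one vectorized sketch that also covers ignore_index. The function name `nll_loss` and its keyword defaults are hypothetical. One assumption is labeled in the code: when no weight is given, ignored samples are excluded from the denominator of the "mean" reduction.

```python
import numpy as np

def nll_loss(inp, target, weight=None, ignore_index=None, reduction="mean"):
    """Hypothetical sketch of the loss for input (N, C, d1..dk), target (N, d1..dk)."""
    inp = np.asarray(inp, dtype=np.float64)
    target = np.asarray(target)
    num_classes = inp.shape[1]
    w = np.ones(num_classes) if weight is None else np.asarray(weight, dtype=np.float64)
    # mask is False wherever the target equals ignore_index
    if ignore_index is None:
        mask = np.ones(target.shape, dtype=bool)
    else:
        mask = target != ignore_index
    safe_target = np.where(mask, target, 0)  # any valid class; masked out below
    # pick input[n, c, d_1, ..., d_k] with c = target[n, d_1, ..., d_k]
    picked = np.take_along_axis(inp, np.expand_dims(safe_target, 1), axis=1).squeeze(1)
    sample_weight = w[safe_target] * mask
    loss = -picked * sample_weight
    if reduction == "none":
        return loss
    if reduction == "sum":
        return loss.sum()
    # Assumption: ignored samples contribute zero weight to the average.
    return loss.sum() / sample_weight.sum()
```

Called with the data from the examples above, this should reproduce their three results.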
Python version: onnx_ops.negativeloglikelihoodloss(input, target, weight, ignore_index, reduction)
NonMaxSuppression
Filter out boxes that have high intersection-over-union (IOU) overlap with previously selected boxes. Bounding boxes with score less than score_threshold are removed. Bounding box format is indicated by attribute center_point_box. Note that this algorithm is agnostic to where the origin is in the coordinate system and more generally is invariant to orthogonal transformations and translations of the coordinate system; thus translations or reflections of the coordinate system result in the same boxes being selected by the algorithm. The selected_indices output is a set of integers indexing into the input collection of bounding boxes representing the selected boxes. The bounding box coordinates corresponding to the selected indices can then be obtained using the Gather or GatherND operation.
Python version: onnx_ops.nonmaxsuppression(boxes, scores, max_output_boxes_per_class, iou_threshold, score_threshold, center_point_box)
NonZero
Returns the indices of the elements that are non-zero (in row-major order - by dimension). NonZero behaves similar to numpy.nonzero: https://docs.scipy.org/doc/numpy/reference/generated/numpy.nonzero.html, but for scalar input, NonZero produces output shape (0, N) instead of (1, N), which is different from Numpy's behavior.
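A small NumPy sketch of the non-scalar case: `np.nonzero` returns one index array per dimension, and stacking them gives a tensor of shape (rank, number_of_nonzero_elements). The helper name `onnx_style_nonzero` is hypothetical, and the scalar-input special case mentioned above is not handled here.

```python
import numpy as np

def onnx_style_nonzero(x):
    """Hypothetical helper: stack np.nonzero's per-dimension index arrays."""
    x = np.asarray(x)
    return np.array(np.nonzero(x), dtype=np.int64)

result = onnx_style_nonzero([[1, 0], [1, 1]])
# rows are dimensions, columns are the non-zero elements in row-major order:
# [[0, 1, 1],
#  [0, 0, 1]]
```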
Python version: onnx_ops.nonzero(X)
Not
Returns the negation of the input tensor element-wise.
Python version: onnx_ops.not(X)
OneHot
Produces a one-hot tensor based on inputs.
The locations represented by the index values in the 'indices' input tensor will have 'on_value'
and the other locations will have 'off_value' in the output tensor, where 'on_value' and 'off_value'
are specified as part of required input argument 'values', which is a two-element tensor of format
[off_value, on_value]. The rank of the output tensor will be one greater than the rank of the
input tensor. The additional dimension is for one-hot representation. The additional dimension will
be inserted at the position specified by 'axis'. If 'axis' is not specified then the additional
dimension will be inserted as the innermost dimension, i.e. axis=-1. The size of the additional
dimension is specified by required scalar input 'depth'. The type of the output tensor is the same
as the type of the 'values' input. Any entries in the 'indices' input tensor with values outside
the range [-depth, depth-1] will result in one-hot representation with all 'off_value' values in the
output tensor.
when axis = 0:
output[input[i, j, k], i, j, k] = 1 for all i, j, k and 0 otherwise.
when axis = -1:
output[i, j, k, input[i, j, k]] = 1 for all i, j, k and 0 otherwise.
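The rules above (negative indices counted from the back, out-of-range indices mapped to all 'off_value', and the new dimension placed at 'axis') can be sketched in NumPy. The function name `one_hot` is hypothetical, not the onnx_ops implementation.

```python
import numpy as np

def one_hot(indices, depth, values, axis=-1):
    """Hypothetical sketch; values is the pair [off_value, on_value]."""
    indices = np.asarray(indices, dtype=np.int64)
    off_value, on_value = values
    # indices in [-depth, -1] wrap around; anything else out of range matches nothing
    wrapped = np.where(indices < 0, indices + depth, indices)
    hot = wrapped[..., np.newaxis] == np.arange(depth)
    out = np.where(hot, on_value, off_value)
    # the one-hot dimension is built last, then moved to the requested axis
    return np.moveaxis(out, -1, axis)

innermost = one_hot([0, 2, -1, 5], depth=3, values=(0, 1))
outermost = one_hot([0, 2, -1, 5], depth=3, values=(0, 1), axis=0)
```

Here the index 5 lies outside [-3, 2], so its row contains only the off value.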
Python version: onnx_ops.onehot(indices, depth, values, axis)
Optional
Constructs an optional-type value containing either an empty optional of a certain type specified by the attribute, or a non-empty value containing the input element.
Python version: onnx_ops.optional(input, type)
OptionalGetElement
Outputs the element in the optional-type input. It is an error if the input value does not have an element and the behavior is undefined in this case.
Python version: onnx_ops.optionalgetelement(input)
OptionalHasElement
Returns true if the optional-type input contains an element. If it is an empty optional-type, this op returns false.
Python version: onnx_ops.optionalhaselement(input)
Or
Returns the tensor resulting from performing the `or` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.or(A, B)
PRelu
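PRelu takes input data (Tensor<T>) and a slope tensor as input, and produces one output data (Tensor<T>) where the function f(x) = slope * x for x < 0, f(x) = x for x >= 0, is applied to the data tensor elementwise.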
Python version: onnx_ops.prelu(X, slope)
Pad
Given a tensor containing the data to be padded (`data`), a tensor containing the number of start and end pad values for axis (`pads`), (optionally) a `mode`, and (optionally) `constant_value`, a padded tensor (`output`) is generated.
The three supported modes are (similar to corresponding modes supported by numpy.pad):
constant (default) - pads with a given constant value as specified by constant_value (which defaults to 0, empty string, or False)
reflect - pads with the reflection of the vector mirrored on the first and last values of the vector along each axis
edge - pads with the edge values of array
Example 1 (constant mode):
Insert 0 pads to the beginning of the second dimension.
data = [
    [1.0, 1.2],
    [2.3, 3.4],
    [4.5, 5.7],
]
pads = [0, 2, 0, 0]
mode = 'constant'
constant_value = 0.0
output = [
    [0.0, 0.0, 1.0, 1.2],
    [0.0, 0.0, 2.3, 3.4],
    [0.0, 0.0, 4.5, 5.7],
]
Example 2 (reflect mode):
data = [
    [1.0, 1.2],
    [2.3, 3.4],
    [4.5, 5.7],
]
pads = [0, 2, 0, 0]
mode = 'reflect'
output = [
    [1.0, 1.2, 1.0, 1.2],
    [2.3, 3.4, 2.3, 3.4],
    [4.5, 5.7, 4.5, 5.7],
]
Example 3 (edge mode):
data = [
    [1.0, 1.2],
    [2.3, 3.4],
    [4.5, 5.7],
]
pads = [0, 2, 0, 0]
mode = 'edge'
output = [
    [1.0, 1.0, 1.0, 1.2],
    [2.3, 2.3, 2.3, 3.4],
    [4.5, 4.5, 4.5, 5.7],
]
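Since the modes mirror numpy.pad, the examples can be reproduced by translating `pads` into numpy's per-axis (before, after) pairs. The sketch below assumes `pads` lists all begin values first and then all end values, which is how the examples read; the helper name `onnx_style_pad` is hypothetical.

```python
import numpy as np

def onnx_style_pad(data, pads, mode="constant", constant_value=0.0):
    """Hypothetical helper: pads = [x1_begin, x2_begin, ..., x1_end, x2_end, ...]."""
    data = np.asarray(data)
    rank = data.ndim
    # pair the begin and end pad count of every axis for numpy.pad
    pad_width = [(pads[i], pads[i + rank]) for i in range(rank)]
    if mode == "constant":
        return np.pad(data, pad_width, mode="constant", constant_values=constant_value)
    return np.pad(data, pad_width, mode=mode)

data = [[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]]
constant_out = onnx_style_pad(data, [0, 2, 0, 0], mode="constant", constant_value=0.0)
edge_out = onnx_style_pad(data, [0, 2, 0, 0], mode="edge")
```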
Python version: onnx_ops.pad(data, pads, constant_value, mode)
Pow
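Pow takes input data (Tensor<T>) and an exponent tensor, and produces one output data (Tensor<T>) where the function f(x) = x^exponent is applied to the data tensor elementwise.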
Python version: onnx_ops.pow(X, Y)
QLinearConv
The convolution operator consumes a quantized input tensor, its scale and zero point, a quantized filter, its scale and zero point, and output's scale and zero point, and computes the quantized output. Each scale and zero-point pair must have same shape. It means they must be either scalars (per tensor) or 1-D tensors (per output channel). Each input or output and its related zero point must have same type. When bias is present it must be quantized using scale = input scale * weight scale and zero point as 0.
Python version: onnx_ops.qlinearconv(x, x_scale, x_zero_point, w, w_scale, w_zero_point, y_scale, y_zero_point, B, auto_pad, dilations, group, kernel_shape, pads, strides)
QLinearMatMul
Matrix product that behaves like numpy.matmul: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html. It consumes two quantized input tensors, their scales and zero points, scale and zero point of output, and computes the quantized output. The quantization formula is y = saturate((x / y_scale) + y_zero_point). For (x / y_scale), it is rounding to nearest ties to even. Refer to https://en.wikipedia.org/wiki/Rounding for details. Scale and zero point must have same shape. They must be either scalar (per tensor) or N-D tensor (per row for 'a' and per column for 'b'). Scalar refers to per tensor quantization whereas N-D refers to per row or per column quantization. If the input is 2D of shape [M, K] then zero point and scale tensor may be an M element vector [v_1, v_2, ..., v_M] for per row quantization and K element vector of shape [v_1, v_2, ..., v_K] for per column quantization. If the input is N-D tensor with shape [D1, D2, M, K] then zero point and scale tensor may have shape [D1, D2, M, 1] for per row quantization and shape [D1, D2, 1, K] for per column quantization. Production must never overflow, and accumulation may overflow if and only if in 32 bits.
Python version: onnx_ops.qlinearmatmul(a, a_scale, a_zero_point, b, b_scale, b_zero_point, y_scale, y_zero_point)
QuantizeLinear
The linear quantization operator. It consumes a high precision tensor, a scale, and a zero point to compute the low precision / quantized tensor. The scale factor and zero point must have the same shape, and can be either a scalar for per-tensor / per-layer quantization, or a 1-D tensor for per-axis quantization. The quantization formula is y = saturate((x / y_scale) + y_zero_point). For saturation, it saturates to [0, 255] if it's uint8, or [-128, 127] if it's int8. (x / y_scale) is rounded to the nearest integer, with ties rounded to the nearest even value. Refer to https://en.wikipedia.org/wiki/Rounding for details. 'y_zero_point' and 'y' must have the same type.
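A per-tensor version of the formula can be sketched with NumPy, where `np.rint` rounds ties to even and `np.clip` performs the saturation. The function name `quantize_linear` and its `dtype` parameter are hypothetical, and per-axis quantization is not covered.

```python
import numpy as np

def quantize_linear(x, y_scale, y_zero_point=0, dtype=np.uint8):
    """Hypothetical per-tensor sketch: y = saturate(round(x / y_scale) + y_zero_point)."""
    limits = np.iinfo(dtype)
    scaled = np.rint(np.asarray(x, dtype=np.float64) / y_scale)  # ties go to even
    return np.clip(scaled + y_zero_point, limits.min, limits.max).astype(dtype)

as_uint8 = quantize_linear([0.0, 2.5, 3.5, 1000.0, -5.0], y_scale=1.0)
as_int8 = quantize_linear([-300.0, 1.0], y_scale=2.0, dtype=np.int8)
```

In the first call 2.5 and 3.5 round to 2 and 4, while 1000.0 and -5.0 saturate to 255 and 0.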
Python version: onnx_ops.quantizelinear(x, y_scale, y_zero_point, axis)
RNN
Computes a one-layer simple RNN. This operator is usually supported via some custom implementation such as CuDNN.
Notations:
X - input tensor
i - input gate
t - time step (t-1 means previous time step)
Wi - W parameter weight matrix for input gate
Ri - R recurrence weight matrix for input gate
Wbi - W parameter bias vector for input gate
Rbi - R parameter bias vector for input gate
WBi - W parameter weight matrix for backward input gate
RBi - R recurrence weight matrix for backward input gate
WBbi - WR bias vectors for backward input gate
RBbi - RR bias vectors for backward input gate
H - Hidden state
num_directions - 2 if direction == bidirectional else 1
Activation functions:
Relu(x) - max(0, x)
Tanh(x) - (1 - e^{-2x})/(1 + e^{-2x})
Sigmoid(x) - 1/(1 + e^{-x})
NOTE: Below are optional
Affine(x) - alpha*x + beta
LeakyRelu(x) - x if x >= 0 else alpha * x
ThresholdedRelu(x) - x if x >= alpha else 0
ScaledTanh(x) - alpha*Tanh(beta*x)
HardSigmoid(x) - min(max(alpha*x + beta, 0), 1)
Elu(x) - x if x >= 0 else alpha*(e^x - 1)
Softsign(x) - x/(1 + |x|)
Softplus(x) - log(1 + e^x)
Equations (Default: f=Tanh):
Ht = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi)
This operator has optional inputs/outputs. See the doc for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument’s name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
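The recurrence Ht = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi) with the default f = Tanh can be sketched for a single forward direction as below. The function name `simple_rnn_forward` and the tensor layouts in the comments are assumptions for the illustration; sequence_lens, clip and the backward direction are left out.

```python
import numpy as np

def simple_rnn_forward(X, Wi, Ri, Wbi, Rbi, H0):
    """Hypothetical single-direction sketch.

    Assumed shapes: X (seq_length, batch, input_size), Wi (hidden, input_size),
    Ri (hidden, hidden), Wbi and Rbi (hidden,), H0 (batch, hidden).
    """
    H = H0
    outputs = []
    for Xt in X:  # iterate over time steps
        H = np.tanh(Xt @ Wi.T + H @ Ri.T + Wbi + Rbi)
        outputs.append(H)
    return np.stack(outputs), H  # all hidden states, last hidden state

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 2, 4))
Wi = rng.normal(size=(5, 4))
Ri = rng.normal(size=(5, 5))
Wbi = np.zeros(5)
Rbi = np.zeros(5)
Y, Y_h = simple_rnn_forward(X, Wi, Ri, Wbi, Rbi, np.zeros((2, 5)))
```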
Python version: onnx_ops.rnn(X, W, R, B, sequence_lens, initial_h, activation_alpha, activation_beta, activations, clip, direction, hidden_size, layout)
RandomNormal
Generate a tensor with random values drawn from a normal distribution. The shape of the tensor is specified by the `shape` argument and the parameters of the normal distribution are specified by `mean` and `scale`.
The data type is specified by the ‘dtype’ argument. The ‘dtype’ argument must be one of the data types specified in the ‘DataType’ enum field in the TensorProto message.
Python version: onnx_ops.randomnormal(dtype, mean, scale, seed, shape)
RandomNormalLike
Generate a tensor with random values drawn from a normal distribution. The shape of the output tensor is copied from the shape of the input tensor, and the parameters of the normal distribution are specified by `mean` and `scale`.
The data type is specified by the ‘dtype’ argument, or copied from the input tensor if not provided. The ‘dtype’ argument must be one of the data types specified in the ‘DataType’ enum field in the TensorProto message, and be valid as an output type.
Python version: onnx_ops.randomnormallike(input, dtype, mean, scale, seed)
RandomUniform
Generate a tensor with random values drawn from a uniform distribution. The shape of the tensor is specified by the `shape` argument and the range by `low` and `high`.
The data type is specified by the ‘dtype’ argument. The ‘dtype’ argument must be one of the data types specified in the ‘DataType’ enum field in the TensorProto message.
Python version: onnx_ops.randomuniform(dtype, high, low, seed, shape)
RandomUniformLike
Generate a tensor with random values drawn from a uniform distribution. The shape of the output tensor is copied from the shape of the input tensor, and the parameters of the uniform distribution are specified by `low` and `high`.
The data type is specified by the ‘dtype’ argument, or copied from the input tensor if not provided. The ‘dtype’ argument must be one of the data types specified in the ‘DataType’ enum field in the TensorProto message and be valid as an output type.
Python version: onnx_ops.randomuniformlike(input, dtype, high, low, seed)
Range
Generate a tensor containing a sequence of numbers that begins at `start` and extends by increments of `delta` up to `limit` (exclusive).
The number of elements in the output of range is computed as below:
number_of_elements = max( ceil( (limit - start) / delta ) , 0 )
The pseudocode determining the contents of the output is shown below:
for(int i=0; i<number_of_elements; ++i) {
    output[i] = start + (i * delta);
}
Example 1
Inputs: start = 3, limit = 9, delta = 3
Output: [3, 6]
Example 2
Inputs: start = 10, limit = 4, delta = -2
Output: [10, 8, 6]
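The element-count formula and the pseudocode translate directly to Python. The function name `onnx_style_range` is hypothetical; it returns a plain list rather than a tensor.

```python
import math

def onnx_style_range(start, limit, delta):
    """Hypothetical sketch of the element count and contents described above."""
    number_of_elements = max(math.ceil((limit - start) / delta), 0)
    return [start + i * delta for i in range(number_of_elements)]

example_1 = onnx_style_range(3, 9, 3)     # [3, 6]
example_2 = onnx_style_range(10, 4, -2)   # [10, 8, 6]
```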
Python version: onnx_ops.range(start, limit, delta)
Reciprocal
Reciprocal takes one input data (Tensor
Python version: onnx_ops.reciprocal(X)
ReduceL1
Computes the L1 norm of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
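The effect of keepdims, which applies to all the Reduce* operators listed here, can be illustrated with a NumPy sketch of the L1 norm. The function name `reduce_l1` is hypothetical; the default keepdims=1 follows the note above that numpy's own default differs.

```python
import numpy as np

def reduce_l1(data, axes=None, keepdims=1):
    """Hypothetical sketch: sum of absolute values along the given axes."""
    data = np.asarray(data)
    axis = None if axes is None else tuple(axes)
    return np.sum(np.abs(data), axis=axis, keepdims=bool(keepdims))

data = [[1, -2], [-3, 4]]
kept = reduce_l1(data, axes=[1])                 # shape (2, 1): rank preserved
pruned = reduce_l1(data, axes=[1], keepdims=0)   # shape (2,): reduced axis removed
```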
Python version: onnx_ops.reducel1(data, axes, keepdims)
ReduceL2
Computes the L2 norm of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
Python version: onnx_ops.reducel2(data, axes, keepdims)
ReduceLogSum
Computes the log sum of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
Python version: onnx_ops.reducelogsum(data, axes, keepdims)
ReduceLogSumExp
Computes the log sum exponent of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
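A naive log(sum(exp(x))) overflows for large inputs, so a sketch usually subtracts the maximum first. The version below is one such illustration in NumPy; the function name `reduce_log_sum_exp` is hypothetical and the max-subtraction is a choice of this example, not something the operator description prescribes.

```python
import numpy as np

def reduce_log_sum_exp(data, axes=None, keepdims=1):
    """Hypothetical sketch: log(sum(exp(x))) along axes, shifted by the max for stability."""
    data = np.asarray(data, dtype=np.float64)
    axis = None if axes is None else tuple(axes)
    shift = np.max(data, axis=axis, keepdims=True)
    out = shift + np.log(np.sum(np.exp(data - shift), axis=axis, keepdims=True))
    if not keepdims:
        out = np.squeeze(out, axis=axis)  # prune the reduced dimensions
    return out

values = [[0.0, 0.0], [1000.0, 1000.0]]
kept = reduce_log_sum_exp(values, axes=[1])
pruned = reduce_log_sum_exp(values, axes=[1], keepdims=0)
```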
Python version: onnx_ops.reducelogsumexp(data, axes, keepdims)
ReduceMax
Computes the max of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
Python version: onnx_ops.reducemax(data, axes, keepdims)
ReduceMean
Computes the mean of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
Python version: onnx_ops.reducemean(data, axes, keepdims)
ReduceMin
Computes the min of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
Python version: onnx_ops.reducemin(data, axes, keepdims)
ReduceProd
Computes the product of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
Python version: onnx_ops.reduceprod(data, axes, keepdims)
ReduceSum
Computes the sum of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
Python version: onnx_ops.reducesum(data, axes, keepdims, noop_with_empty_axes)
ReduceSumSquare
Computes the sum square of the input tensor's elements along the provided axes. The resulting tensor has the same rank as the input if keepdims equals 1. If keepdims equals 0, then the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are valid.
The above behavior is similar to numpy, with the exception that numpy defaults keepdims to False instead of True.
Python version: onnx_ops.reducesumsquare(data, axes, keepdims)
Relu
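Relu takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the rectified linear function, y = max(0, x), is applied to the tensor elementwise.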
Python version: onnx_ops.relu(X)
Reshape
Reshape the input tensor similar to numpy.reshape. First input is the data tensor, second input is a shape tensor which specifies the output shape. It outputs the reshaped tensor. At most one dimension of the new shape can be -1. In this case, the value is inferred from the size of the tensor and the remaining dimensions. A dimension could also be 0, in which case the actual dimension value is unchanged (i.e. taken from the input tensor). If 'allowzero' is set, and the new shape includes 0, the dimension will be set explicitly to zero (i.e. not taken from input tensor). Shape (second input) could be an empty shape, which means converting to a scalar. The input tensor's shape and the output tensor's shape are required to have the same number of elements.
If the attribute ‘allowzero’ is set, it is invalid for the specified shape to contain both a zero value and -1, as the value of the dimension corresponding to -1 cannot be determined uniquely.
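The rules for 0 and -1 in the shape input can be sketched as a small shape-resolution helper. The name `resolve_reshape` is hypothetical, and the sketch assumes that a -1 is not combined with a zero-sized dimension (which would make the inferred size undefined).

```python
import math

def resolve_reshape(input_shape, shape, allowzero=0):
    """Hypothetical sketch: compute the concrete output shape for Reshape."""
    out = list(shape)
    if allowzero and 0 in out and -1 in out:
        raise ValueError("shape may not contain both 0 and -1 when allowzero is set")
    if not allowzero:
        # a 0 copies the dimension at the same position from the input shape
        out = [input_shape[i] if d == 0 else d for i, d in enumerate(out)]
    if -1 in out:
        # infer the single unknown dimension from the total element count
        known = math.prod(d for d in out if d != -1)
        out[out.index(-1)] = math.prod(input_shape) // known
    return out

copied_and_inferred = resolve_reshape((2, 3, 4), (0, -1))        # [2, 12]
explicit_zero = resolve_reshape((0, 3), (3, 0), allowzero=1)     # [3, 0]
```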
Python version: onnx_ops.reshape(data, shape, allowzero)
Resize
Resize the input tensor. In general, it calculates every value in the output tensor as a weighted average of neighborhood (a.k.a. sampling locations) in the input tensor. Each dimension value of the output tensor is: output_dimension = floor(input_dimension * (roi_end - roi_start) * scale) if input "sizes" is not specified.
Python version: onnx_ops.resize(X, roi, scales, sizes, coordinate_transformation_mode, cubic_coeff_a, exclude_outside, extrapolation_value, mode, nearest_mode)
ReverseSequence
Reverse batch of sequences having different lengths specified by `sequence_lens`.
For each slice i iterating on batch axis, the operator reverses the first sequence_lens[i] elements on time axis, and copies elements whose indices are beyond sequence_lens[i] to the output. So the output slice i contains the reversed sequence in its first sequence_lens[i] elements, followed by the original values copied for the remaining elements.
Example 1:
input = [[0.0, 4.0, 8.0, 12.0],
         [1.0, 5.0, 9.0, 13.0],
         [2.0, 6.0, 10.0, 14.0],
         [3.0, 7.0, 11.0, 15.0]]
sequence_lens = [4, 3, 2, 1]
time_axis = 0
batch_axis = 1
output = [[3.0, 6.0, 9.0, 12.0],
          [2.0, 5.0, 8.0, 13.0],
          [1.0, 4.0, 10.0, 14.0],
          [0.0, 7.0, 11.0, 15.0]]
Example 2:
input = [[0.0, 1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0, 7.0],
         [8.0, 9.0, 10.0, 11.0],
         [12.0, 13.0, 14.0, 15.0]]
sequence_lens = [1, 2, 3, 4]
time_axis = 1
batch_axis = 0
output = [[0.0, 1.0, 2.0, 3.0],
          [5.0, 4.0, 6.0, 7.0],
          [10.0, 9.0, 8.0, 11.0],
          [15.0, 14.0, 13.0, 12.0]]
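Both examples can be reproduced with a short NumPy sketch that views the tensor as (batch, time, ...) and reverses only the leading part of each sequence. The function name `reverse_sequence` is hypothetical.

```python
import numpy as np

def reverse_sequence(x, sequence_lens, batch_axis, time_axis):
    """Hypothetical sketch: reverse the first sequence_lens[i] time steps of batch slice i."""
    x = np.asarray(x)
    out = x.copy()
    # views with layout (batch, time, ...); writing to out_view updates out
    x_view = np.moveaxis(x, (batch_axis, time_axis), (0, 1))
    out_view = np.moveaxis(out, (batch_axis, time_axis), (0, 1))
    for i, length in enumerate(sequence_lens):
        out_view[i, :length] = x_view[i, :length][::-1]
    return out

x1 = [[0.0, 4.0, 8.0, 12.0], [1.0, 5.0, 9.0, 13.0],
      [2.0, 6.0, 10.0, 14.0], [3.0, 7.0, 11.0, 15.0]]
out1 = reverse_sequence(x1, [4, 3, 2, 1], batch_axis=1, time_axis=0)

x2 = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0],
      [8.0, 9.0, 10.0, 11.0], [12.0, 13.0, 14.0, 15.0]]
out2 = reverse_sequence(x2, [1, 2, 3, 4], batch_axis=0, time_axis=1)
```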
Python version: onnx_ops.reversesequence(input, sequence_lens, batch_axis, time_axis)
RoiAlign
Region of Interest (RoI) align operation described in the [Mask R-CNN paper](https://arxiv.org/abs/1703.06870). RoiAlign consumes an input tensor X and region of interests (rois) to apply pooling across each RoI; it produces a 4-D tensor of shape (num_rois, C, output_height, output_width).
RoiAlign is proposed to avoid the misalignment by removing quantizations while converting from original image into feature map and from feature map into RoI feature; in each ROI bin, the values of the sampled locations are computed directly through bilinear interpolation.
Python version: onnx_ops.roialign(X, rois, batch_indices, mode, output_height, output_width, sampling_ratio, spatial_scale)
Round
Round takes one input Tensor and rounds the values, element-wise, meaning it finds the nearest integer for each value. In case of halves, the rule is to round them to the nearest even integer. If input x is integral, +0, -0, NaN, or infinite, x itself is returned. The output tensor has the same shape and type as the input.
Examples:
round([0.9]) = [1.0]
round([2.5]) = [2.0]
round([2.3]) = [2.0]
round([1.5]) = [2.0]
round([-4.5]) = [-4.0]
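NumPy's `np.round` uses the same round-half-to-even rule, so the examples above can be checked with it. This is only a comparison against NumPy, not the onnx_ops implementation.

```python
import numpy as np

values = np.array([0.9, 2.5, 2.3, 1.5, -4.5])
rounded = np.round(values)  # halves go to the nearest even integer
# rounded holds 1.0, 2.0, 2.0, 2.0, -4.0
```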
Python version: onnx_ops.round(X)
Scatter
This operator is deprecated. Please use ScatterElements, which provides the same functionality.
Scatter takes three inputs `data`, `updates`, and `indices` of the same rank r >= 1 and an optional attribute axis that identifies an axis of `data` (by default, the outer-most axis, that is axis 0). The output of the operation is produced by creating a copy of the input `data`, and then updating its value to values specified by `updates` at specific index positions specified by `indices`. Its output shape is the same as the shape of `data`.
For each entry in `updates`, the target index in `data` is obtained by combining the corresponding entry in `indices` with the index of the entry itself: the index-value for dimension = axis is obtained from the value of the corresponding entry in `indices` and the index-value for dimension != axis is obtained from the index of the entry itself.
For instance, in a 2-D tensor case, the update corresponding to the [i][j] entry is performed as below:
output[indices[i][j]][j] = updates[i][j] if axis = 0,
output[i][indices[i][j]] = updates[i][j] if axis = 1,
This operator is the inverse of GatherElements. It is similar to Torch’s Scatter operation.
Example 1:
data = [
[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0],
]
indices = [
[1, 0, 2],
[0, 2, 1],
]
updates = [
[1.0, 1.1, 1.2],
[2.0, 2.1, 2.2],
]
output = [
[2.0, 1.1, 0.0],
[1.0, 0.0, 2.2],
[0.0, 2.1, 1.2],
]
Example 2:
data = [[1.0, 2.0, 3.0, 4.0, 5.0]]
indices = [[1, 3]]
updates = [[1.1, 2.1]]
axis = 1
output = [[1.0, 1.1, 3.0, 2.1, 5.0]]
Python version: onnx_ops.scatter(data, indices, updates, axis)
ScatterElements
ScatterElements takes three inputs `data`, `updates`, and `indices` of the same rank r >= 1 and an optional attribute axis that identifies an axis of `data` (by default, the outer-most axis, that is axis 0). The output of the operation is produced by creating a copy of the input `data`, and then updating its value to values specified by `updates` at specific index positions specified by `indices`. Its output shape is the same as the shape of `data`.
For each entry in `updates`, the target index in `data` is obtained by combining the corresponding entry in `indices` with the index of the entry itself: the index-value for dimension = axis is obtained from the value of the corresponding entry in `indices` and the index-value for dimension != axis is obtained from the index of the entry itself.
For instance, in a 2-D tensor case, the update corresponding to the [i][j] entry is performed as below:
output[indices[i][j]][j] = updates[i][j] if axis = 0,
output[i][indices[i][j]] = updates[i][j] if axis = 1,
This operator is the inverse of GatherElements. It is similar to Torch’s Scatter operation.
Example 1:
data = [
[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0],
]
indices = [
[1, 0, 2],
[0, 2, 1],
]
updates = [
[1.0, 1.1, 1.2],
[2.0, 2.1, 2.2],
]
output = [
[2.0, 1.1, 0.0],
[1.0, 0.0, 2.2],
[0.0, 2.1, 1.2],
]
Example 2:
data = [[1.0, 2.0, 3.0, 4.0, 5.0]]
indices = [[1, 3]]
updates = [[1.1, 2.1]]
axis = 1
output = [[1.0, 1.1, 3.0, 2.1, 5.0]]
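The index rule generalizes the 2-D case to any rank: for every position in `indices`, the coordinate along `axis` is replaced by the index value. A NumPy sketch under that reading follows; the function name `scatter_elements` is hypothetical.

```python
import numpy as np

def scatter_elements(data, indices, updates, axis=0):
    """Hypothetical sketch of the update rule described above."""
    data = np.asarray(data)
    indices = np.asarray(indices)
    updates = np.asarray(updates)
    output = data.copy()
    for idx in np.ndindex(indices.shape):
        target = list(idx)
        target[axis] = indices[idx]  # only the coordinate along axis comes from indices
        output[tuple(target)] = updates[idx]
    return output

out1 = scatter_elements(np.zeros((3, 3)), [[1, 0, 2], [0, 2, 1]],
                        [[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]])
out2 = scatter_elements([[1.0, 2.0, 3.0, 4.0, 5.0]], [[1, 3]], [[1.1, 2.1]], axis=1)
```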
Python version: onnx_ops.scatterelements(data, indices, updates, axis)
ScatterND
ScatterND takes three inputs `data` tensor of rank r >= 1, `indices` tensor of rank q >= 1, and `updates` tensor of rank q + r - indices.shape[-1] - 1. The output of the operation is produced by creating a copy of the input `data`, and then updating its value to values specified by `updates` at specific index positions specified by `indices`. Its output shape is the same as the shape of `data`. Note that `indices` should not have duplicate entries. That is, two or more `updates` for the same index-location is not supported.
`indices` is an integer tensor. Let k denote indices.shape[-1], the last dimension in the shape of `indices`. `indices` is treated as a (q-1)-dimensional tensor of k-tuples, where each k-tuple is a partial-index into `data`. Hence, k can be a value at most the rank of `data`. When k equals rank(data), each update entry specifies an update to a single element of the tensor. When k is less than rank(data) each update entry specifies an update to a slice of the tensor. Index values are allowed to be negative, as per the usual convention for counting backwards from the end, but are expected in the valid range.
`updates` is treated as a (q-1)-dimensional tensor of replacement-slice-values. Thus, the first (q-1) dimensions of updates.shape must match the first (q-1) dimensions of indices.shape. The remaining dimensions of `updates` correspond to the dimensions of the replacement-slice-values. Each replacement-slice-value is a (r-k) dimensional tensor, corresponding to the trailing (r-k) dimensions of `data`. Thus, the shape of `updates` must equal indices.shape[0:q-1] ++ data.shape[k:r-1], where ++ denotes the concatenation of shapes.
The `output` is calculated via the following equation:
output = np.copy(data)
update_indices = indices.shape[:-1]
for idx in np.ndindex(update_indices):
    # each k-tuple is used as a (partial) index into data
    output[tuple(indices[idx])] = updates[idx]
The order of iteration in the above loop is not specified. In particular, indices should not have duplicate entries: that is, if idx1 != idx2, then indices[idx1] != indices[idx2]. This ensures that the output value does not depend on the iteration order.
This operator is the inverse of GatherND.
Example 1:
data = [1, 2, 3, 4, 5, 6, 7, 8]
indices = [[4], [3], [1], [7]]
updates = [9, 10, 11, 12]
output = [1, 11, 3, 10, 9, 6, 7, 12]
Example 2:
data = [[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]]
indices = [[0], [2]]
updates = [[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]]
output = [[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
[[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]]
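Both examples can be checked with a runnable NumPy sketch of the loop above, in which each k-tuple from `indices` is converted to a tuple so that it addresses a single element or slice. The function name `scatter_nd` is hypothetical.

```python
import numpy as np

def scatter_nd(data, indices, updates):
    """Hypothetical sketch; assumes indices contains no duplicate entries."""
    data = np.asarray(data)
    indices = np.asarray(indices)
    updates = np.asarray(updates)
    output = data.copy()
    for idx in np.ndindex(indices.shape[:-1]):
        output[tuple(indices[idx])] = updates[idx]
    return output

out1 = scatter_nd([1, 2, 3, 4, 5, 6, 7, 8], [[4], [3], [1], [7]], [9, 10, 11, 12])

data2 = np.arange(4 * 4 * 4).reshape(4, 4, 4)
updates2 = -np.ones((2, 4, 4), dtype=data2.dtype)
out2 = scatter_nd(data2, [[0], [2]], updates2)  # replaces slices 0 and 2
```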
Python version: onnx_ops.scatternd(data, indices, updates)
Selu
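Selu takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the scaled exponential linear unit function, y = gamma * (alpha * e^x - alpha) for x <= 0, y = gamma * x for x > 0, is applied to the tensor elementwise.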
Python version: onnx_ops.selu(X, alpha, gamma)
SequenceAt
Outputs a tensor copy from the tensor at 'position' in 'input_sequence'. Accepted range for 'position' is in `[-n, n - 1]`, where `n` is the number of tensors in 'input_sequence'. Negative value means counting positions from the back.
Python version: onnx_ops.sequenceat(input_sequence, position)
SequenceConstruct
Construct a tensor sequence containing 'inputs' tensors. All tensors in 'inputs' must have the same data type.
Python version: onnx_ops.sequenceconstruct(inputs)
SequenceEmpty
Construct an empty tensor sequence, with given data type.
Python version: onnx_ops.sequenceempty(dtype)
SequenceErase
Outputs a tensor sequence that removes the tensor at 'position' from 'input_sequence'. Accepted range for 'position' is in `[-n, n - 1]`, where `n` is the number of tensors in 'input_sequence'. Negative value means counting positions from the back. 'position' is optional, by default it erases the last tensor from 'input_sequence'.
Python version: onnx_ops.sequenceerase(input_sequence, position)
SequenceInsert
Outputs a tensor sequence that inserts 'tensor' into 'input_sequence' at 'position'. 'tensor' must have the same data type as 'input_sequence'. Accepted range for 'position' is in `[-n, n]`, where `n` is the number of tensors in 'input_sequence'. Negative value means counting positions from the back. 'position' is optional, by default it inserts 'tensor' to the back of 'input_sequence'.
Python version: onnx_ops.sequenceinsert(input_sequence, tensor, position)
SequenceLength
Produces a scalar (tensor of empty shape) containing the number of tensors in 'input_sequence'.
Python version: onnx_ops.sequencelength(input_sequence)
Shape
Takes a tensor as input and outputs a 1-D int64 tensor containing the shape of the input tensor. Optional attributes start and end can be used to compute a slice of the input tensor's shape. If start axis is omitted, the slice starts from axis 0. The end axis, if specified, is exclusive (and the returned value will not include the size of that axis). If the end axis is omitted, the axes up to the last one will be included. Negative axes indicate counting back from the last axis. Note that out-of-range axes are clamped to the range [0, r], where r is the rank of the input tensor (after adding r in the case of a negative axis). Thus, specifying any end value > r is equivalent to specifying an end value of r, and specifying any start value < -r is equivalent to specifying a start value of 0.
Examples:
Input tensor with shape: [2, 3, 4]
No attributes specified.
Output: [2, 3, 4]
Input tensor with shape: [2, 3, 4]
start: -1
Output: [4]
Input tensor with shape: [2, 3, 4]
end: -1
Output: [2, 3]
Input tensor with shape: [2, 3, 4]
start: 1
end: 2
Output: [3]
Python version: onnx_ops.shape(data, end, start)
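The start/end handling described for Shape can be sketched in a few lines of plain Python. The function name shape_slice is hypothetical, and the sketch clamps to [0, r] so that an end value of r stays expressible, in line with the statement that any end value greater than r behaves like r.

```python
def shape_slice(shape, start=0, end=None):
    """Return the part of `shape` selected by the start/end attributes."""
    r = len(shape)
    if end is None:  # omitted end: include axes up to the last one
        end = r
    # Negative axes count back from the last axis.
    if start < 0:
        start += r
    if end < 0:
        end += r
    # Out-of-range values are clamped rather than rejected.
    start = min(max(start, 0), r)
    end = min(max(end, 0), r)
    return list(shape[start:end])
```

Applied to the shape [2, 3, 4], this reproduces the four examples listed above.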
Shrink
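Shrink takes one input data (Tensor<numeric>) and produces one Tensor output, having the same datatype and shape as the input. It has two attributes, lambd and bias. The formula of this operator is: if x < -lambd, y = x + bias; if x > lambd, y = x - bias; otherwise, y = 0.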
Python version: onnx_ops.shrink(input, bias, lambd)
Sigmoid
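Sigmoid takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the sigmoid function, y = 1 / (1 + exp(-x)), is applied to the tensor elementwise.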
Python version: onnx_ops.sigmoid(X)
Sign
Calculate the sign of the given input tensor element-wise. If input > 0, output 1. If input < 0, output -1. If input == 0, output 0.
Python version: onnx_ops.sign(input)
Sin
Calculates the sine of the given input tensor, element-wise.
Python version: onnx_ops.sin(input)
Sinh
Calculates the hyperbolic sine of the given input tensor element-wise.
Python version: onnx_ops.sinh(input)
Size
Takes a tensor as input and outputs an int64 scalar equal to the total number of elements of the input tensor.
Python version: onnx_ops.size(data)
Slice
Produces a slice of the input tensor along multiple axes. Similar to numpy: https://numpy.org/doc/stable/user/basics.indexing.html?highlight=slice#slicing-and-striding
Slice uses the starts, ends, axes and steps inputs to select a sub-tensor of its input data tensor.
An effective start[i], end[i], and step[i] must be computed for each i in [0, ... r-1] where r = rank(input) as follows:
If axes are omitted, they are set to [0, ..., r-1].
If steps are omitted, they are set to [1, ..., 1] of length len(starts).
The effective values are initialized as start[i] = 0, end[i] = dims[i] where dims are the dimensions of input and step[i] = 1.
All negative elements of axes are made non-negative by adding r to them, where r = rank(input).
All negative values in starts[i] and ends[i] have dims[axes[i]] added to them, where dims are the dimensions of input. Then start[axes[i]] is the adjusted starts[i] clamped into the range [0, dims[axes[i]]] for positive stepping and [0, dims[axes[i]]-1] for negative stepping.
The clamping for the adjusted ends[i] depends on the sign of steps[i] and must accommodate copying 0 through dims[axes[i]] elements, so for positive stepping end[axes[i]] is clamped to [0, dims[axes[i]]], while for negative stepping it is clamped to [-1, dims[axes[i]]-1].
Finally, step[axes[i]] = steps[i].
For slicing to the end of a dimension with unknown size, it is recommended to pass in INT_MAX when slicing forward and INT_MIN when slicing backward.
Example 1:
data = [
[1, 2, 3, 4],
[5, 6, 7, 8],
]
axes = [0, 1]
starts = [1, 0]
ends = [2, 3]
steps = [1, 2]
result = [
[5, 7],
]
Example 2:
data = [
[1, 2, 3, 4],
[5, 6, 7, 8],
]
starts = [0, 1]
ends = [-1, 1000]
result = [
[2, 3, 4],
]
Python version: onnx_ops.slice(data, starts, ends, axes, steps)
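The computation of effective start, end and step values can be sketched on nested Python lists. This is an illustrative sketch only: the function names are hypothetical, the data is assumed to be a rectangular nested list, and omitted axes default to the first len(starts) axes (an assumption that coincides with [0, ..., r-1] when starts has one entry per axis).

```python
def dims_of(data):
    """Dimensions of a rectangular nested list."""
    dims = []
    while isinstance(data, list):
        dims.append(len(data))
        data = data[0] if data else None
    return dims


def effective_indices(dims, starts, ends, axes=None, steps=None):
    """Per-axis index ranges after the adjust-and-clamp rules are applied."""
    r = len(dims)
    if axes is None:
        axes = list(range(len(starts)))  # assumption, see lead-in
    if steps is None:
        steps = [1] * len(starts)
    indices = [range(d) for d in dims]  # start=0, end=dims[i], step=1
    for start, end, axis, step in zip(starts, ends, axes, steps):
        if step == 0:
            raise ValueError("step must be non-zero")
        if axis < 0:
            axis += r
        d = dims[axis]
        if start < 0:
            start += d
        if end < 0:
            end += d
        if step > 0:
            start = min(max(start, 0), d)
            end = min(max(end, 0), d)
        else:
            start = min(max(start, 0), d - 1)
            end = min(max(end, -1), d - 1)
        indices[axis] = range(start, end, step)
    return indices


def take(data, indices):
    """Select the given indices along each successive axis."""
    if not indices:
        return data
    return [take(data[i], indices[1:]) for i in indices[0]]


def slice_nested(data, starts, ends, axes=None, steps=None):
    dims = dims_of(data)
    return take(data, effective_indices(dims, starts, ends, axes, steps))
```

Using range objects instead of Python slice objects keeps the clamped end of -1 for negative stepping meaningful; a Python slice would reinterpret -1 as the last element.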
Softmax
The operator computes the normalized exponential values for the given input:
Softmax(input, axis) = Exp(input) / ReduceSum(Exp(input), axis=axis, keepdims=1)
The “axis” attribute indicates the dimension along which Softmax will be performed. The output tensor has the same shape and contains the Softmax values of the corresponding input.
Python version: onnx_ops.softmax(input, axis)
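As an illustration of the formula above, the following sketch computes Softmax on a 2-D nested list using only the standard library. The function names are hypothetical; subtracting the maximum before exponentiating is a common numerical-stability step that leaves the result unchanged and is not part of the stated definition.

```python
import math


def softmax_1d(values):
    """Exp(v) / Sum(Exp(v)) for a flat list of numbers."""
    m = max(values)  # shift for numerical stability; result is unchanged
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_2d(rows, axis=-1):
    """Softmax over a 2-D nested list along axis 0 or axis 1 (-1)."""
    if axis in (1, -1):
        return [softmax_1d(row) for row in rows]
    if axis in (0, -2):
        # Normalize each column, then transpose back to row layout.
        cols = [softmax_1d(list(col)) for col in zip(*rows)]
        return [list(row) for row in zip(*cols)]
    raise ValueError("axis out of range for a 2-D input")
```

The output has the same shape as the input, and the values along the chosen axis sum to 1.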
SoftmaxCrossEntropyLoss
Loss function that measures the softmax cross entropy between 'scores' and 'labels'.
This operator first computes a loss tensor whose shape is identical to the labels input.
If the input is 2-D with shape (N, C), the loss tensor may be an N-element vector L = (l_1, l_2, ..., l_N).
If the input is N-D tensor with shape (N, C, D1, D2, ..., Dk), the loss tensor L may have (N, D1, D2, ..., Dk) as its shape and L[i][j_1][j_2]...[j_k] denotes a scalar element in L.
After L is available, this operator can optionally apply a reduction.
shape(scores): (N, C) where C is the number of classes, or (N, C, D1, D2, ..., Dk), with K >= 1 in case of K-dimensional loss.
shape(labels): (N) where each value is 0 <= labels[i] <= C-1, or (N, D1, D2, ..., Dk), with K >= 1 in case of K-dimensional loss.
The loss for one sample, l_i, can be calculated as follows:
l[i][d1][d2]...[dk] = -y[i][c][d1][d2]...[dk], where i is the index of the sample and c is the class index given by its label,
or
l[i][d1][d2]...[dk] = -y[i][c][d1][d2]...[dk] * weights[c], if 'weights' is provided.
The loss is zero when the label value equals ignore_index:
l[i][d1][d2]...[dk] = 0, when labels[i][d1][d2]...[dk] = ignore_index
where:
p = Softmax(scores)
y = Log(p)
c = labels[i][d1][d2]...[dk]
Finally, L is optionally reduced:
If reduction = ‘none’, the output is L with shape (N, D1, D2, …, Dk).
If reduction = ‘sum’, the output is scalar: Sum(L).
If reduction = ‘mean’, the output is scalar: ReduceMean(L), or if weight is provided: ReduceSum(L) / ReduceSum(W), where tensor W is of shape (N, D1, D2, ..., Dk) and W[i][d1][d2]...[dk] = weights[labels[i][d1][d2]...[dk]].
Python version: onnx_ops.softmaxcrossentropyloss(scores, labels, weights, ignore_index, reduction)
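For the 2-D case, where scores has shape (N, C) and labels has shape (N), the steps above can be sketched in plain Python. The function name is hypothetical, and one detail is an assumption rather than something stated in the text: in the weighted mean, samples whose label equals ignore_index are given weight 0 in W.

```python
import math


def softmax_cross_entropy_loss(scores, labels, weights=None,
                               ignore_index=None, reduction="mean"):
    """Softmax cross entropy for scores of shape (N, C), labels of shape (N)."""
    losses, sample_weights = [], []
    for row, c in zip(scores, labels):
        if ignore_index is not None and c == ignore_index:
            losses.append(0.0)          # loss is zero for ignored labels
            sample_weights.append(0.0)  # assumption: ignored samples get W = 0
            continue
        # y = Log(Softmax(row)), evaluated only at the labelled class c.
        m = max(row)
        log_sum = m + math.log(sum(math.exp(v - m) for v in row))
        y_c = row[c] - log_sum
        w = 1.0 if weights is None else weights[c]
        losses.append(-y_c * w)
        sample_weights.append(w)
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        if weights is None:
            return sum(losses) / len(losses)       # ReduceMean(L)
        return sum(losses) / sum(sample_weights)   # ReduceSum(L) / ReduceSum(W)
    raise ValueError("reduction must be 'none', 'sum' or 'mean'")
```

With two equal scores and no weights, each sample contributes a loss of ln(2), which makes a convenient sanity check.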
Softplus
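Softplus takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the softplus function, y = ln(exp(x) + 1), is applied to the tensor elementwise.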
Python version: onnx_ops.softplus(X)
Softsign
Calculates the softsign (x/(1+|x|)) of the given input tensor element-wise.
Python version: onnx_ops.softsign(input)
SpaceToDepth
SpaceToDepth rearranges blocks of spatial data into depth. More specifically, this op outputs a copy of the input tensor where values from the height and width dimensions are moved to the depth dimension.
Python version: onnx_ops.spacetodepth(input, blocksize)
Split
Split a tensor into a list of tensors, along the specified 'axis'. Lengths of the parts can be specified using input 'split'. Otherwise, the tensor is split to equal sized parts.
Python version: onnx_ops.split(input, split, axis)
SplitToSequence
Split a tensor into a sequence of tensors, along the specified 'axis'. Lengths of the parts can be specified using the optional argument 'split'. If the argument 'split' is not specified, a default scalar value of 1 is used as the value of 'split'. 'split' must contain only positive numbers. 'split' is either a scalar (tensor of empty shape), or a 1-D tensor. If 'split' is a scalar, then 'input' will be split into chunks all of size 'split' if possible. The last chunk alone may be smaller than 'split' if the 'input' size along the given axis 'axis' is not divisible by 'split'. If 'split' is a 1-dimensional tensor, the input tensor is split into 'size(split)' chunks, with lengths of the parts on 'axis' specified in 'split'. In this scenario, the sum of entries in 'split' must be equal to the dimension size of input tensor on 'axis'.
Python version: onnx_ops.splittosequence(input, split, axis, keepdims)
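The two forms of 'split' can be sketched on a flat Python list, which corresponds to splitting a 1-D tensor along axis 0. The function name is hypothetical and the 'keepdims' argument is not modelled here.

```python
def split_to_sequence(values, split=1):
    """Split a flat list into chunks using a scalar or list-valued `split`."""
    if isinstance(split, int):
        if split <= 0:
            raise ValueError("'split' must be positive")
        # Scalar: equal chunks; only the last one may be shorter.
        return [values[i:i + split] for i in range(0, len(values), split)]
    if any(s <= 0 for s in split):
        raise ValueError("'split' must contain only positive numbers")
    if sum(split) != len(values):
        raise ValueError("entries of 'split' must sum to the axis size")
    chunks, pos = [], 0
    for s in split:
        chunks.append(values[pos:pos + s])
        pos += s
    return chunks
```

For a five-element input, a scalar split of 2 yields chunks of sizes 2, 2 and 1, while split = [2, 3] yields exactly two chunks.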
Sqrt
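Square root takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the square root, y = x^0.5, is applied to the tensor elementwise. If x is negative, then it will return NaN.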
Python version: onnx_ops.sqrt(X)
Squeeze
Remove single-dimensional entries from the shape of a tensor. Takes an input `axes` with a list of axes to squeeze. If `axes` is not provided, all the single dimensions will be removed from the shape. If an axis is selected with shape entry not equal to one, an error is raised.
Python version: onnx_ops.squeeze(data, axes)
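The effect of Squeeze on a tensor's shape can be sketched without any tensor library. The function name squeeze_shape is hypothetical, and accepting negative axes (counted from the last axis) is an assumption not stated in the description above.

```python
def squeeze_shape(shape, axes=None):
    """Return `shape` with the selected size-1 dimensions removed."""
    r = len(shape)
    if axes is None:
        # No axes given: drop every single-dimensional entry.
        return [d for d in shape if d != 1]
    normalized = {a + r if a < 0 else a for a in axes}
    for a in normalized:
        if shape[a] != 1:
            raise ValueError(f"cannot squeeze axis {a} with size {shape[a]}")
    return [d for i, d in enumerate(shape) if i not in normalized]
```

For example, squeezing shape [1, 3, 1, 5] with no axes gives [3, 5], while axes=[0] gives [3, 1, 5].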
StringNormalizer
StringNormalization performs string operations for basic cleaning. This operator has only one input (denoted by X) and only one output (denoted by Y). This operator first examines the elements in X, and removes elements specified in the "stopwords" attribute. After removing stop words, the intermediate result can be further lowercased, uppercased, or just returned, depending on the "case_change_action" attribute. This operator only accepts [C]- and [1, C]-tensors. If all elements in X are dropped, the output will be the empty value of string tensor with shape [1] if input shape is [C] and shape [1, 1] if input shape is [1, C].
Python version: onnx_ops.stringnormalizer(X, case_change_action, is_case_sensitive, locale, stopwords)
Sub
Performs element-wise binary subtraction (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
(Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
Python version: onnx_ops.sub(A, B)
Sum
Element-wise sum of each of the input tensors (with Numpy-style broadcasting support). All inputs and outputs must have the same data type. This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.sum(data_0)
Tan
Calculates the tangent of the given input tensor, element-wise.
Python version: onnx_ops.tan(input)
Tanh
Calculates the hyperbolic tangent of the given input tensor element-wise.
Python version: onnx_ops.tanh(input)
TfIdfVectorizer
This transform extracts n-grams from the input sequence and saves them as a vector. Input can be either a 1-D or 2-D tensor. For 1-D input, output is the n-gram representation of that input. For 2-D input, the output is also a 2-D tensor whose i-th row is the n-gram representation of the i-th input row. More specifically, if input shape is [C], the corresponding output shape would be [max(ngram_indexes) + 1]. If input shape is [N, C], this operator produces a [N, max(ngram_indexes) + 1]-tensor.
In contrast to standard n-gram extraction, here, the indexes of extracting an n-gram from the original sequence are not necessarily consecutive numbers. The discontinuity between indexes is controlled by the number of skips. If the number of skips is 2, we should skip two tokens when scanning through the original sequence. Let’s consider an example. Assume that input sequence is [94, 17, 36, 12, 28] and the number of skips is 2. The associated 2-grams are [94, 12] and [17, 28] respectively indexed by [0, 3] and [1, 4]. If the number of skips becomes 0, the 2-grams generated are [94, 17], [17, 36], [36, 12], [12, 28] indexed by [0, 1], [1, 2], [2, 3], [3, 4], respectively.
The output vector (denoted by Y) stores the count of each n-gram; Y[ngram_indexes[i]] indicates the times that the i-th n-gram is found. The attribute ngram_indexes is used to determine the mapping between index i and the corresponding n-gram’s output coordinate. If pool_int64s is [94, 17, 17, 36], ngram_indexes is [1, 0], ngram_counts=[0, 0], then the Y[0] (first element in Y) and Y[1] (second element in Y) are the counts of [17, 36] and [94, 17], respectively. An n-gram which cannot be found in pool_strings/pool_int64s should be ignored and has no effect on the output. Note that we may consider all skips up to S when generating the n-grams.
The examples used above are true if mode is “TF”. If mode is “IDF”, all the counts larger than 1 would be truncated to 1 and the i-th element in weights would be used to scale (by multiplication) the count of the i-th n-gram in pool. If mode is “TFIDF”, this operator first computes the counts of all n-grams and then scales them by the associated values in the weights attribute.
Only one of pool_strings and pool_int64s can be set. If pool_int64s is set, the input should be an integer tensor. If pool_strings is set, the input must be a string tensor.
Python version: onnx_ops.tfidfvectorizer(X, max_gram_length, max_skip_count, min_gram_length, mode, ngram_counts, ngram_indexes, pool_int64s, pool_strings, weights)
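The skip-gram extraction and the "TF" counting described above can be sketched as follows. The function names are hypothetical, the sketch handles a single n-gram length with one fixed skip value (it does not enumerate all skips up to a maximum), and it assumes the pool stores its n-grams back to back.

```python
def skip_ngrams(tokens, n, skip):
    """Return (n-gram, source indexes) pairs taken with a fixed skip."""
    stride = skip + 1
    span = (n - 1) * stride  # distance between first and last index
    grams = []
    for start in range(len(tokens) - span):
        idx = [start + j * stride for j in range(n)]
        grams.append(([tokens[i] for i in idx], idx))
    return grams


def tf_counts(tokens, pool, n, ngram_indexes, skip=0):
    """Count pool n-grams found in `tokens` (mode "TF")."""
    # Assumption: the pool holds consecutive n-grams of length n.
    pool_grams = [pool[i:i + n] for i in range(0, len(pool), n)]
    y = [0] * (max(ngram_indexes) + 1)
    for gram, _ in skip_ngrams(tokens, n, skip):
        if gram in pool_grams:  # n-grams outside the pool are ignored
            y[ngram_indexes[pool_grams.index(gram)]] += 1
    return y
```

With the sequence [94, 17, 36, 12, 28], skip_ngrams reproduces the 2-grams and index pairs given in the text for skip values 2 and 0.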
ThresholdedRelu
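ThresholdedRelu takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where the rectified linear function, y = x for x > alpha, y = 0 otherwise, is applied to the tensor elementwise.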
Python version: onnx_ops.thresholdedrelu(X, alpha)
Tile
Constructs a tensor by tiling a given tensor. This is the same as the function `tile` in Numpy, but without broadcasting. For example, given A = [[1, 2], [3, 4]] and B = [1, 2], tile(A, B) = [[1, 2, 1, 2], [3, 4, 3, 4]].
Python version: onnx_ops.tile(input, repeats)
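A minimal sketch of tiling for the 2-D case, using nested lists; the function name tile_2d is hypothetical and higher ranks are not covered.

```python
def tile_2d(a, repeats):
    """Tile a 2-D nested list: repeats[0] times along rows, repeats[1] along columns."""
    row_reps, col_reps = repeats
    widened = [row * col_reps for row in a]  # repeat each row's contents
    return [list(row) for _ in range(row_reps) for row in widened]
```

Because there is no broadcasting, repeats needs exactly one entry per input dimension, here two.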
TopK
Retrieve the top-K largest or smallest elements along a specified axis. Given an input tensor of shape [a_1, a_2, ..., a_n, r] and integer argument k, return two outputs:
Value tensor of shape [a_1, a_2, …, a_{axis-1}, k, a_{axis+1}, … a_n] which contains the values of the top k elements along the specified axis.
Index tensor of shape [a_1, a_2, …, a_{axis-1}, k, a_{axis+1}, … a_n] which contains the indices of the top k elements (original indices from the input tensor).
If “largest” is 1 (the default value) then the k largest elements are returned. If “sorted” is 1 (the default value) then the resulting k elements will be sorted. If “sorted” is 0, the order of the returned ‘Values’ and ‘Indices’ is undefined.
Given two equivalent values, this operator uses the indices along the axis as a tiebreaker. That is, the element with the lower index will appear first.
Python version: onnx_ops.topk(X, K, axis, largest, sorted)
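The selection and tie-breaking rule can be sketched for a 1-D input in plain Python. The function name topk_1d is hypothetical, and the sketch always returns sorted results (the "sorted" = 1 behaviour).

```python
def topk_1d(values, k, largest=True):
    """Return (values, indices) of the k largest or smallest elements."""
    # Sort indices by value; the index itself breaks ties, so the
    # element with the lower index appears first among equal values.
    order = sorted(
        range(len(values)),
        key=lambda i: (-values[i] if largest else values[i], i),
    )
    idx = order[:k]
    return [values[i] for i in idx], idx
```

For the input [3, 1, 3, 2] and k = 2, the two 3s are returned with indices [0, 2], showing the lower index first.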
Transpose
Transpose the input tensor similar to numpy.transpose. For example, when perm=(1, 0, 2), given an input tensor of shape (1, 2, 3), the output shape will be (2, 1, 3).
Python version: onnx_ops.transpose(data, perm)
Trilu
Given a 2-D matrix or batches of 2-D matrices, returns the upper or lower triangular part of the tensor(s). The attribute "upper" determines whether the upper or lower part is retained. If set to true, the upper triangular matrix is retained. Lower triangular matrix is retained otherwise. Default value for the "upper" attribute is true. Trilu takes one input tensor of shape [*, N, M], where * is zero or more batch dimensions. The upper triangular part consists of the elements on and above the given diagonal (k). The lower triangular part consists of elements on and below the diagonal. All other elements in the matrix are set to zero. If k = 0, the triangular part on and above/below the main diagonal is retained. If upper is set to true, a positive k retains the upper triangular matrix excluding the main diagonal and (k-1) diagonals above it. A negative k value retains the main diagonal and |k| diagonals below it. If upper is set to false, a positive k retains the lower triangular matrix including the main diagonal and k diagonals above it. A negative k value excludes the main diagonal and (|k|-1) diagonals below it.
Python version: onnx_ops.trilu(input, k, upper)
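The role of k and "upper" can be sketched for a single 2-D matrix stored as a nested list. The function name mirrors the operator but is a hypothetical stand-in, and batch dimensions are not handled.

```python
def trilu(matrix, k=0, upper=True):
    """Keep the upper (or lower) triangular part relative to diagonal k."""
    def keep(i, j):
        # Diagonal k consists of the elements with j - i == k.
        return (j - i >= k) if upper else (j - i <= k)

    return [
        [v if keep(i, j) else 0 for j, v in enumerate(row)]
        for i, row in enumerate(matrix)
    ]
```

With upper set to true and k = 1 the main diagonal is zeroed out; with upper set to false and k = 1 the first diagonal above the main one is retained.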
Unique
Find the unique elements of a tensor. When an optional attribute 'axis' is provided, unique subtensors sliced along the 'axis' are returned. Otherwise the input tensor is flattened and unique values of the flattened tensor are returned.
This operator returns the unique values or sliced unique subtensors of the input tensor and three optional outputs. The first output tensor ‘Y’ contains all unique values or subtensors of the input. The second optional output tensor ‘indices’ contains the indices of the first occurrence of each ‘Y’ element in ‘X’. The third optional output tensor ‘inverse_indices’ contains, for each element of ‘X’, its corresponding index in ‘Y’. The fourth optional output tensor ‘counts’ contains the count of each element of ‘Y’ in the input.
Outputs are either sorted in ascending order or optionally in the order of the first occurrence of the values in the input.
https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html
Example 1:
input_X = [2, 1, 1, 3, 4, 3]
attribute_sorted = 0
attribute_axis = None
output_Y = [2, 1, 3, 4]
output_indices = [0, 1, 3, 4]
output_inverse_indices = [0, 1, 1, 2, 3, 2]
output_counts = [1, 2, 2, 1]
Example 2:
input_X = [[1, 3], [2, 3]]
attribute_sorted = 1
attribute_axis = None
output_Y = [1, 2, 3]
output_indices = [0, 2, 1]
output_inverse_indices = [0, 2, 1, 2]
output_counts = [1, 1, 2]
Example 3:
input_X = [[1, 0, 0], [1, 0, 0], [2, 3, 4]]
attribute_sorted = 1
attribute_axis = 0
output_Y = [[1, 0, 0], [2, 3, 4]]
output_indices = [0, 2]
output_inverse_indices = [0, 0, 1]
output_counts = [2, 1]
Example 4:
input_x = [[[1., 1.], [0., 1.], [2., 1.], [0., 1.]],
[[1., 1.], [0., 1.], [2., 1.], [0., 1.]]]
attribute_sorted = 1
attribute_axis = 1
intermediate data are presented below for better understanding: there are 4 subtensors sliced along axis 1 of input_x (shape = (2, 4, 2)):
A: [[1, 1], [1, 1]],
[[0, 1], [0, 1]],
[[2, 1], [2, 1]],
[[0, 1], [0, 1]].
there are 3 unique subtensors:
[[1, 1], [1, 1]],
[[0, 1], [0, 1]],
[[2, 1], [2, 1]].
sorted unique subtensors:
B: [[0, 1], [0, 1]],
[[1, 1], [1, 1]],
[[2, 1], [2, 1]].
output_Y is constructed from B:
[[[0. 1.], [1. 1.], [2. 1.]],
[[0. 1.], [1. 1.], [2. 1.]]]
output_indices is to map from B to A:
[1, 0, 2]
output_inverse_indices is to map from A to B:
[1, 0, 2, 0]
output_counts:
[2, 1, 1]
Python version: onnx_ops.unique(X, axis, sorted)
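For the case where no axis is given and the input is flattened, the four outputs can be sketched with dictionaries. The function name unique_flat is hypothetical, and the input is assumed to be an already flattened list.

```python
def unique_flat(values, sorted_output=True):
    """Return (Y, indices, inverse_indices, counts) for a flat list."""
    first_index, counts = {}, {}
    for i, v in enumerate(values):
        first_index.setdefault(v, i)       # remember first occurrence only
        counts[v] = counts.get(v, 0) + 1
    # Dicts preserve insertion order, i.e. order of first occurrence.
    y = sorted(first_index) if sorted_output else list(first_index)
    position = {v: p for p, v in enumerate(y)}
    indices = [first_index[v] for v in y]
    inverse_indices = [position[v] for v in values]
    return y, indices, inverse_indices, [counts[v] for v in y]
```

Called on the inputs of Example 1 and on the flattened input of Example 2, this returns the outputs listed in those examples.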
Unsqueeze
Insert single-dimensional entries to the shape of an input tensor (`data`). Takes one required input `axes` - which contains a list of dimension indices and this operator will insert a dimension of value `1` into the corresponding index of the output tensor (`expanded`).
For example, given an input tensor (data) of shape [3, 4, 5], Unsqueeze(data, axes=[0, 4]) outputs a tensor (expanded) containing the same data as data but with shape [1, 3, 4, 5, 1].
The input axes should not contain any duplicate entries. It is an error if it contains duplicates.
The rank of the output tensor (output_rank) is the rank of the input tensor (data) plus the number of values in axes.
Each value in axes should be within the (inclusive) range [-output_rank, output_rank - 1].
The order of values in axes does not matter.
Python version: onnx_ops.unsqueeze(data, axes)
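The rules for axes can be sketched as a computation on shapes alone. The function name unsqueeze_shape is hypothetical; it checks the range and duplicate constraints described above and returns the shape of the expanded tensor.

```python
def unsqueeze_shape(shape, axes):
    """Return the output shape after inserting size-1 dims at `axes`."""
    output_rank = len(shape) + len(axes)
    normalized = []
    for a in axes:
        if not -output_rank <= a <= output_rank - 1:
            raise ValueError(f"axis {a} out of range")
        normalized.append(a + output_rank if a < 0 else a)
    if len(set(normalized)) != len(normalized):
        raise ValueError("axes must not contain duplicates")
    remaining = iter(shape)
    # Positions named in axes get size 1; the rest take the input
    # dimensions in their original order.
    return [1 if i in normalized else next(remaining)
            for i in range(output_rank)]
```

Negative axes are resolved against the output rank, so axes=[0, -1] and axes=[0, 4] describe the same result for a rank-3 input.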
Upsample
Upsample the input tensor. Each dimension value of the output tensor is: output_dimension = floor(input_dimension * scale).
Python version: onnx_ops.upsample(X, scales, mode)
Where
Return elements, either from X or Y, depending on condition. Where behaves like numpy.where (https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html) with three parameters.
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.where(condition, X, Y)
Xor
Returns the tensor resulting from performing the `xor` logical operation elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).
This operator supports multidirectional (i.e., Numpy-style) broadcasting; for more details please check the doc.
Python version: onnx_ops.xor(A, B)
pattern_matching_function
Returns the productions that match the given goal and retrieval buffers.
pattern_matching_function(productions, goal, retrieval) = actr.pattern_matching_function(productions,goal,retrieval)
Python version: actr.pattern_matching_function(productions,goal,retrieval)
pattern_to_string
Converts a pattern dictionary to a string format.
pattern_to_string(chunk) = actr.pattern_to_string(chunk)
Python version: actr.pattern_to_string(chunk)
retrieve_chunk
Retrieve a chunk from declarative memory given a pattern.
retrieve_chunk(pattern, dm_chunks, types) = actr.retrieve_chunk(pattern,dm_chunks,types)
Python version: actr.retrieve_chunk(pattern,dm_chunks,types)
sin
Sine function
sin(variable0, scale) = scale * sin(variable0)
Python version: scale * numpy.sin(variable0)
sinh
Hyperbolic sine function
sinh(variable0, scale) = scale * sinh(variable0)
Python version: scale * numpy.sinh(variable0)
tan
Tangent function
tan(variable0, scale) = scale * tan(variable0)
Python version: scale * numpy.tan(variable0)
tanh
Hyperbolic tangent function
tanh(variable0, scale) = scale * tanh(variable0)
Python version: scale * numpy.tanh(variable0)
update_buffer
Returns a pattern to update the given buffer with.
update_buffer(production, buffer) = actr.update_buffer(production,buffer)
Python version: actr.update_buffer(production,buffer)
update_goal
Returns a pattern to update the goal buffer with.
update_goal(production) = actr.update_goal(production)
Python version: actr.update_goal(production)
update_retrieval
Returns a pattern to update the retrieval buffer with.
update_retrieval(production) = actr.update_retrieval(production)
Python version: actr.update_retrieval(production)
modeci_mdf
MDF is intended to be an open source, community-supported standard and associated library of tools for expressing computational models in a form that allows them to be exchanged between diverse programming languages and execution environments. The MDF Python API can be used to create or load an MDF model for inspection and validation. It also includes a basic execution engine for simulating models in the format. However, this is not intended as a general purpose simulation environment, nor is MDF intended as a programming language. Rather, the primary purpose of the Python API is to facilitate and validate the exchange of models between existing environments that serve different communities. Accordingly, these Python tools include bi-directional support for importing to and exporting from widely-used programming environments in a range of disciplines, and for easily extending these to other environments.
The reference implementation of the MDF execution engine; allows for executing
Specifies and implements the MDF function ontology; a collection of builtin functions that can be used in MDF
Implementations of importers and exporters for supported environments; fulfilling the hub and spoke model of MDF by allowing exchange between different modeling environments via MDF.
The main object-oriented implementation of the MDF schema, with each core component of the MDF specification implemented as a
Useful utility functions for dealing with MDF objects.