# Gloo documentation

Documentation is split by domain. This file contains a general
overview of these domains and how they interact.

## Index

* [Overview](readme.md) -- this file
* [Rendezvous](rendezvous.md) -- creating a `gloo::Context`
* [Algorithms](algorithms.md) -- index of collective algorithms and their semantics and complexity
* [Transport details](transport.md) -- the transport API and its implementations
* [CUDA integration](cuda.md) -- integration of CUDA aware Gloo algorithms with existing CUDA code
* [Latency optimization](latency.md) -- tips and tricks to improve performance

## Overview

Gloo algorithms are collective algorithms, meaning they can run in
parallel across two or more processes/machines. To execute across
multiple machines, the participating processes first need to find
each other. We call this _rendezvous_ and it is the first thing to
address when integrating Gloo into your code base. See
[`rendezvous.md`](./rendezvous.md) for more information.

Once rendezvous completes, participating machines have set up
connections to one another, either in a full mesh (every machine has a
bidirectional communication channel to every other machine), or some
subset. The required connectivity between machines depends on the type
of algorithm that is used. For example, a ring algorithm only needs
communication channels to a machine's neighbors.

Every participating process knows the number of participating
processes and its own _rank_ (its 0-based index) within the list of
participating processes. This state, as well as the state needed to
store the persistent communication channels, is stored in an instance
of the `gloo::Context` class. Gloo does not maintain global state or
thread-local state. This means that you can set up as many contexts as
needed, and introduce as much parallelism as needed by your
application.

## Anything else?

If you find particular documentation is missing, please consider
[contributing](../CONTRIBUTING.md).
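
To make the connectivity discussion in the overview more concrete, here is a minimal sketch of how a process could derive its ring neighbors from its rank and the number of participants, and how the channel count of a ring compares to a full mesh. It does not use any Gloo API: `RingNeighbors`, `ringNeighbors`, `fullMeshChannels`, and `ringChannels` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical helper type for illustration; not part of Gloo.
struct RingNeighbors {
  int previous;  // rank of the preceding process in the ring
  int next;      // rank of the following process in the ring
};

// Neighbors of `rank` in a ring of `size` processes.
// Ranks are 0-based and wrap around at the ends.
inline RingNeighbors ringNeighbors(int rank, int size) {
  assert(size > 0 && rank >= 0 && rank < size);
  return RingNeighbors{(rank - 1 + size) % size, (rank + 1) % size};
}

// Bidirectional channels in a full mesh: one per unordered pair of processes.
inline int fullMeshChannels(int size) {
  assert(size > 0);
  return size * (size - 1) / 2;
}

// Bidirectional channels in a ring: one per adjacent pair.
// With one process there are none; with two, both neighbors are the same peer.
inline int ringChannels(int size) {
  assert(size > 0);
  if (size == 1) return 0;
  if (size == 2) return 1;
  return size;
}
```

With 4 processes, rank 0 has neighbors 3 and 1, the ring needs 4 channels, and the full mesh needs 6. The gap widens as the number of processes grows, which is why the required connectivity depends on the algorithm.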