Overview of Puppet's Architecture
Puppet usually runs in an agent/master (client/server) architecture, in which the Puppet agent and Puppet master applications work together to configure systems. It can also run in a self-contained architecture with the Puppet apply application.
Note: Two Stages for Configuration Management
Puppet configures systems in two main stages:
- Compile a catalog
- Apply the catalog
What is a Catalog?
A catalog is a document that describes the desired system state for one specific computer. It lists all of the resources that need to be managed, as well as any dependencies between those resources.
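As a rough illustration of "resources plus dependencies", the sketch below models a catalog as a plain Python dictionary and derives an order in which the resources could be applied. The dictionary layout, the node name, and the `application_order` helper are assumptions made for this example; real Puppet catalogs carry more detail and are not produced this way.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical, simplified catalog for one node.
catalog = {
    "node": "web01.example.com",
    "resources": {
        "Package[ntp]": {"ensure": "installed"},
        "File[/etc/ntp.conf]": {"ensure": "file"},
        "Service[ntpd]": {"ensure": "running"},
    },
    # Each key depends on the resources listed as its value.
    "dependencies": {
        "File[/etc/ntp.conf]": ["Package[ntp]"],
        "Service[ntpd]": ["File[/etc/ntp.conf]"],
    },
}


def application_order(catalog):
    """Return resource names ordered so that dependencies come first."""
    graph = {name: set() for name in catalog["resources"]}
    for name, deps in catalog["dependencies"].items():
        graph[name].update(deps)
    return list(TopologicalSorter(graph).static_order())


order = application_order(catalog)
print(order)
```

In this example the package comes before the file that configures it, and the file before the service that reads it.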
The Agent/Master Architecture
Puppet usually runs in an agent/master architecture, where a Puppet master server controls important configuration info and managed agent nodes request only their own configuration catalogs.
In this architecture, managed nodes run the Puppet agent application, usually as a background service. One or more servers run the Puppet master application, usually as a Rack application managed by a web server (like Apache with Passenger).
Periodically, Puppet agent will send facts to the Puppet master and request a catalog. The master will compile and return that node’s catalog, using several sources of information it has access to.
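The request described above can be pictured with a toy compile step: the master holds the configuration data for every node, but returns only the catalog for the node that asked. Everything in this sketch (`SITE_CONFIG`, `compile_catalog`, the node names, and the resources) is hypothetical and stands in for the master's real data sources and compiler.

```python
# Hypothetical site-wide data that lives only on the master.
SITE_CONFIG = {
    "web01.example.com": ["Package[nginx]", "Service[nginx]"],
    "db01.example.com": ["Package[postgresql]", "Service[postgresql]"],
}


def compile_catalog(certname, facts, site_config=SITE_CONFIG):
    """Build a catalog for one node from its facts and the site data."""
    resources = list(site_config.get(certname, []))
    # Facts sent by the agent can influence what ends up in the catalog.
    if facts.get("osfamily") == "RedHat":
        resources.append("Package[epel-release]")
    return {"node": certname, "resources": resources}


web_catalog = compile_catalog("web01.example.com", {"osfamily": "RedHat"})
db_catalog = compile_catalog("db01.example.com", {"osfamily": "Debian"})
print(web_catalog)
```

Note that `web_catalog` contains nothing about the database node, even though the master knows about both.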
Once it receives a catalog, Puppet agent will apply it by checking each resource the catalog describes. If it finds any resources that are not in their desired state, it will make any changes necessary to correct them. (Or, in no-op mode, it will report on what changes would have been needed.)
After applying the catalog, the agent will submit a report to the Puppet master.
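The check-then-correct loop, including no-op mode and the resulting report, can be sketched as follows. This is a minimal model under stated assumptions: resource states are plain dictionaries, `apply_catalog` is a hypothetical helper, and "making a change" is just an in-memory update rather than a real system action.

```python
def apply_catalog(desired, actual, noop=False):
    """Compare each resource to its desired state and build a report.

    desired and actual map resource names to state dictionaries.
    In no-op mode nothing is modified; the report only records
    which resources would have been changed.
    """
    report = []
    for name, want in desired.items():
        if actual.get(name) == want:
            report.append((name, "in sync"))
        elif noop:
            report.append((name, "would change"))
        else:
            actual[name] = dict(want)  # stand-in for the real corrective action
            report.append((name, "changed"))
    return report


desired = {
    "Package[ntp]": {"ensure": "installed"},
    "Service[ntpd]": {"ensure": "running"},
}
actual = {
    "Package[ntp]": {"ensure": "installed"},
    "Service[ntpd]": {"ensure": "stopped"},
}

noop_report = apply_catalog(desired, actual, noop=True)  # reports only
real_report = apply_catalog(desired, actual)             # corrects drift
second_report = apply_catalog(desired, actual)           # nothing left to do
print(noop_report, real_report, second_report)
```

Running the same catalog a second time reports every resource as in sync, which reflects the idea that only resources outside their desired state are touched.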
About the Puppet Services
- Puppet Agent on *nix Systems
- Puppet Agent on Windows Systems
- The Rack Puppet Master
- The WEBrick Puppet Master
Communications and Security
Puppet agent nodes and Puppet masters communicate via HTTPS with client verification.
The Puppet master provides an HTTP interface, with various endpoints available. When requesting or submitting anything to the master, the agent will make an HTTPS request to one of those endpoints.
For details, see:
- A walkthrough of Puppet’s HTTPS communications
- The Puppet master’s HTTP API
- The Puppet master’s auth.conf file
Client-verified HTTPS means each master or agent must have an identifying SSL certificate, and each examines its counterpart’s certificate to decide whether to allow an exchange of information.
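To make "client-verified" concrete, here is a generic sketch using Python's standard `ssl` module. It is not Puppet's own code. The function names and the optional file path parameters are assumptions for the example; the key point is that the server side is told to require a certificate from the client, while the client side verifies the server as usual.

```python
import ssl


def build_server_context(cert=None, key=None, ca=None):
    """TLS context for a server that demands a certificate from each client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # reject clients without a valid cert
    if cert and key:
        ctx.load_cert_chain(certfile=cert, keyfile=key)  # server's own identity
    if ca:
        ctx.load_verify_locations(cafile=ca)  # CA used to check client certs
    return ctx


def build_client_context(cert=None, key=None, ca=None):
    """TLS context for a client that verifies the server and presents a cert."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # verifies the server by default
    if cert and key:
        ctx.load_cert_chain(certfile=cert, keyfile=key)  # client's own identity
    if ca:
        ctx.load_verify_locations(cafile=ca)  # CA used to check the server cert
    return ctx


# No files are loaded here; pass real paths to use the contexts on a socket.
server_ctx = build_server_context()
client_ctx = build_client_context()
print(server_ctx.verify_mode, client_ctx.verify_mode)
```

Both sides end up with `CERT_REQUIRED`, so a connection succeeds only when each party presents a certificate the other trusts.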
Puppet includes a built-in certificate authority (CA) for managing certificates. Agents can automatically request certificates via the master’s HTTP API, users can use the puppet cert command to inspect requests and sign new certificates, and agents can then download the signed certificates.
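The request, sign, and download sequence can be modeled with a toy in-memory class. `ToyCA` and its methods are invented for illustration and do not correspond to Puppet's CA implementation or to the options of the puppet cert command; the strings stand in for real certificate requests and certificates.

```python
class ToyCA:
    """In-memory stand-in for a certificate authority workflow."""

    def __init__(self):
        self.pending = {}  # certname -> submitted request
        self.signed = {}   # certname -> issued certificate

    def submit_request(self, certname, csr):
        """An agent submits a certificate request."""
        self.pending[certname] = csr

    def list_requests(self):
        """An administrator inspects outstanding requests."""
        return sorted(self.pending)

    def sign(self, certname):
        """An administrator approves a request and issues a certificate."""
        csr = self.pending.pop(certname)
        self.signed[certname] = f"signed({csr})"

    def download(self, certname):
        """An agent fetches its certificate, or None if not yet signed."""
        return self.signed.get(certname)


ca = ToyCA()
ca.submit_request("web01.example.com", "csr-for-web01")
before = ca.download("web01.example.com")  # nothing to download yet
waiting = ca.list_requests()
ca.sign("web01.example.com")
cert = ca.download("web01.example.com")
print(before, waiting, cert)
```

Until the request is signed, the agent has no certificate and so cannot take part in client-verified exchanges.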
For general info about SSL, see our background reference on SSL and HTTPS.
The Stand-Alone Architecture
Puppet can run in a stand-alone architecture, where each managed server has its own complete copy of your configuration info and compiles its own catalog.
In this architecture, managed nodes run the Puppet apply application, usually as a scheduled task or cron job. (You can also run it on demand for initial configuration of a server or for smaller configuration tasks.)
Like the Puppet master application, Puppet apply needs access to several sources of configuration data, which it uses to compile a catalog for the node it is managing.
After Puppet apply compiles the catalog, it immediately applies it by checking each resource the catalog describes. If it finds any resources that are not in their desired state, it will make any changes necessary to correct them. (Or, in no-op mode, it will report on what changes would have been needed.)
After applying the catalog, Puppet apply will store a report on disk. It can also be configured to send reports to a central service.
About the Puppet Apply Application
Note: Differences Between Agent/Master and Puppet Apply
In general, Puppet apply can do the same things as the combination of Puppet agent and Puppet master, but there are several trade-offs around security and the ease of certain tasks.
If you don’t have a preference, you should default to an agent/master architecture. If you have questions, considering these trade-offs will help you make your decision.
- Principle of least privilege. In agent/master Puppet, each agent only gets its own configuration, and is unable to see how other nodes are configured. With Puppet apply, it’s impractical to do this, so every node has access to complete knowledge about how your site is configured. Depending on how you’re configuring your systems, this can raise the risk of horizontal privilege escalation.
- Ease of centralized reporting and inventory. Agents send reports to the Puppet master by default, and the master can be configured with any number of report handlers to pass these on to other services. You can also connect the master to PuppetDB, a powerful tool for querying inventory and activity data. Puppet apply nodes handle their own information, so if you’re using PuppetDB or sending reports to another service, each node needs to be configured and authorized to connect to it.
- Ease of updating configurations. Only the Puppet master server(s) have the Puppet modules, main manifests, and other data necessary for compiling catalogs. This means that when you need to update your systems’ configurations, you only need to update content on one (or a few) servers. In a decentralized Puppet apply deployment, you’ll need to sync new configuration code and data to every node.
- CPU and memory usage on managed machines. Since Puppet agent doesn’t compile its own catalogs, it uses fewer resources on the machines it manages, leaving them with more capacity for their designated tasks.
- Need for a dedicated master server. The Puppet master takes on the performance load of compiling all catalogs, and it should usually be a dedicated machine with a fast processor, lots of RAM, and a fast disk. Not everybody wants to (or is able to) allocate that, and Puppet apply can get around the need for it.
- Need for good network connectivity. Agents need to be able to reach the Puppet master at a reliable hostname in order to configure themselves. If a system lives in a degraded or isolated network environment, you may want it to be more self-sufficient.
- Security overhead. Agents and masters use HTTPS to secure their communications and authenticate each other, and every system involved needs an SSL certificate. Puppet includes a built-in CA to easily manage certificates, but it’s even easier to not manage them at all. (Of course, you’ll still need to manage security somehow, since you’re probably using Rsync or a similar tool to update Puppet content on every node.)