This guide applies to Vault versions 1.7 and above and Consul versions 1.8 and above.
This guide describes recommended best practices for infrastructure architects and operators to follow when deploying Vault with the Consul storage backend in a production environment.
It includes general guidance as well as specific recommendations for popular cloud platforms.
For production deployments, Integrated Storage, which is native to Vault, is now recommended instead of using Consul for Vault storage.
Kubernetes users: If you are deploying Vault on Kubernetes, see the Vault on Kubernetes reference architecture.
The following diagram shows the recommended architecture for deploying a single Vault cluster with Consul storage, using the Enterprise versions of Vault and Consul:
In this architecture, the primary availability risk is to the storage layer. With six nodes in the Consul cluster distributed across three availability zones, which are configured as redundancy zones, this architecture can withstand the loss of nodes within the cluster or the loss of an entire availability zone. Because Vault uses only one active node, the Vault cluster requires only three members to support the loss of two nodes or an entire availability zone.
If deploying across three availability zones is not possible, the same architecture can be used across two availability zones or one, at the expense of significant reliability risk if an availability zone suffers an outage.
Additional resiliency is possible by implementing a multi-cluster architecture, which enables additional options for performance and disaster recovery. See the Multi-Cluster Architecture Guide for more information.
The following diagram shows the recommended architecture for deploying a single Vault cluster with Consul storage, using the open source versions of Vault and Consul:
In this architecture, the primary availability risk is to the storage layer. With five nodes in the Consul cluster distributed across three availability zones, this architecture can withstand the loss of two nodes from the cluster or the loss of an entire availability zone. Because Vault uses only one active node, the Vault cluster requires only three members to support the loss of two nodes or an entire availability zone.
If deploying across three availability zones is not possible, the same architecture can be used across two availability zones or one, at the expense of significant reliability risk if an availability zone suffers an outage.
For Vault Enterprise customers, additional resiliency is possible by implementing a multi-cluster architecture, which enables additional options for performance and disaster recovery. See the Multi-Cluster Architecture Guide for more information.
It is important to use a dedicated Consul cluster for Vault storage, separate from any Consul cluster used for other purposes, in order to minimize resource contention on the storage layer. This will likely require non-default ports for the storage cluster's network connectivity. In this architecture, ports 7300 and 7301 are used instead of the default ports 8300 and 8301.
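As a minimal sketch of the non-default port choice, the snippet below renders a Consul agent configuration fragment as JSON. It assumes Consul's `ports` configuration block with its `server` (server RPC) and `serf_lan` (LAN gossip) keys. The helper name `render_ports_config` and the idea of generating the fragment from Python are illustrative assumptions, not part of this guide.

```python
import json

# Default Consul ports, and the non-default values used in this architecture.
DEFAULT_PORTS = {"server": 8300, "serf_lan": 8301}
STORAGE_CLUSTER_PORTS = {"server": 7300, "serf_lan": 7301}


def render_ports_config(ports):
    """Return a JSON fragment for a Consul agent `ports` block (hypothetical helper)."""
    return json.dumps({"ports": ports}, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Print the fragment for the dedicated Vault storage cluster.
    print(render_ports_config(STORAGE_CLUSTER_PORTS))
```

Every Consul agent that joins the storage cluster, including the agents on the Vault servers, would need the same port values.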
System requirements
This section contains specific recommendations for hardware capacity, network requirements, and additional infrastructure considerations. Each customer's operations staff can use them as a starting point and adapt them to the particular requirements of their deployment.
Warning: All specifications described in this document are minimum recommendations, without reservations for vertical scaling, redundancy, or other SRE needs, and without accounting for your user volumes or their use cases in all scenarios. All resource requirements are directly proportional to the operations performed by the Vault cluster and to the end users' level of usage.
Note: To match your requirements and maximize the stability of your Vault instances, it is important to run load tests, retain their results, and continue to monitor resource usage as well as all metrics reported by Vault's telemetry.
Hardware sizing for Vault servers
Sizing recommendations are divided into two common cluster sizes.
Small clusters are appropriate for most initial production deployments, or for development and test environments.
Large clusters are production environments with a consistently high workload. This can mean a large number of transactions, a large number of secrets, or a combination of both.
Size | CPU | Memory | Disk capacity | Disk IO | Disk throughput |
---|---|---|---|---|---|
Small | 2-4 cores | 8-16 GB RAM | 100+ GB | 3000+ IOPS | 75+ MB/s |
Large | 4-8 cores | 32-64 GB RAM | 200+ GB | 3000+ IOPS | 125 MB/s |
For each cluster size, the following table gives recommended hardware specifications for each major cloud infrastructure provider.
Provider | Size | Instance/VM types | Disk volume specifications |
---|---|---|---|
AWS | Small | m5.large, m5.xlarge | 100+ GB gp3, 3000 IOPS, 125 MB/s |
AWS | Large | m5.2xlarge, m5.4xlarge | 200+ GB gp3, 5000 IOPS, 125 MB/s |
Azure | Small | Standard_D2s_v3, Standard_D4s_v3 | 1024 GB* Premium_LRS |
Azure | Large | Standard_D8s_v3, Standard_D16s_v3 | 1024 GB* Premium_LRS |
GCP | Small | n2-standard-2, n2-standard-4 | 500 GB* pd-balanced |
GCP | Large | n2-standard-8, n2-standard-16 | 1000 GB* pd-ssd |
Note: For the GCP and Azure recommendations, the listed disk sizes are larger than the recommended minimum size. For the recommended disk types, available IOPS increases with disk capacity, and the listed sizes are required to provision the required IOPS.
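The relationship in the note above can be sketched as a small calculation: if a disk type provides a fixed number of IOPS per GB of capacity, the minimum disk size follows from the IOPS target. The per-GB rate is an input here, not a value taken from this guide. The figure of 6 IOPS per GB used in the example is an illustrative assumption, and provider limits change, so check the current provider documentation before sizing.

```python
import math


def min_disk_gb_for_iops(target_iops, iops_per_gb):
    """Smallest disk size (GB) whose capacity-scaled IOPS meets the target."""
    if target_iops <= 0 or iops_per_gb <= 0:
        raise ValueError("target_iops and iops_per_gb must be positive")
    return math.ceil(target_iops / iops_per_gb)


if __name__ == "__main__":
    # Assumed rate of 6 IOPS per GB (illustrative only).
    print(min_disk_gb_for_iops(3000, 6))
```

Under that assumed rate, a 3000 IOPS target needs a disk of at least 500 GB, which is larger than the 100+ GB minimum capacity in the sizing table.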
Note: For predictable performance on cloud providers, it is recommended to avoid "burstable" CPU and storage options (such as the AWS t2 and t3 instance types), whose performance can degrade rapidly under continuous load.
Hardware sizing for Consul servers
Size | CPU | Memory | Disk capacity | Disk IO | Disk throughput |
---|---|---|---|---|---|
Small | 2-4 cores | 8-16 GB RAM | 100+ GB | 3000+ IOPS | 75+ MB/s |
Large | 4-8 cores | 32-64 GB RAM | 200+ GB | 10000+ IOPS | 250+ MB/s |
For each cluster size, the following table gives recommended hardware specifications for each major cloud infrastructure provider.
Provider | Size | Instance/VM types | Disk volume specifications |
---|---|---|---|
AWS | Small | m5.large, m5.xlarge | 100+ GB gp3, 3000 IOPS, 125 MB/s |
AWS | Large | m5.2xlarge, m5.4xlarge | 200+ GB gp3, 10000 IOPS, 250 MB/s |
Azure | Small | Standard_D2s_v3, Standard_D4s_v3 | 1024 GB* Premium_LRS |
Azure | Large | Standard_D8s_v3, Standard_D16s_v3 | 1024 GB* Premium_LRS |
GCP | Small | n2-standard-2, n2-standard-4 | 500 GB* pd-balanced |
GCP | Large | n2-standard-8, n2-standard-16 | 1000 GB* pd-ssd |
Note: For the GCP and Azure recommendations, the listed disk sizes are larger than the recommended minimum size. For the recommended disk types, available IOPS increases with disk capacity, and the listed sizes are required to provision the required IOPS.
Hardware considerations
In general, CPU and storage requirements depend on the customer's exact usage profile, and resources should be sized according to that usage data.
HashiCorp strongly recommends configuring Vault with audit logging enabled. Because of the additional storage I/O that audit logging generates, audit logs should be written to a separate disk.
Network latency and bandwidth
For cluster members to stay properly synchronized, network latency between availability zones must be less than eight milliseconds (8 ms).
The amount of network bandwidth used by Vault and Consul depends entirely on the specific customer's usage patterns. In many cases, even a high request volume does not translate into a large amount of network bandwidth consumption. However, all data written to Vault is replicated to all members of the Consul cluster. In addition, a multi-cluster deployment requires that Vault data be transferred between replicas in order to provide performance and DR replication.
Network connectivity
The following table describes the network connectivity requirements for Vault cluster nodes. Also account for connectivity from the Vault servers to external systems such as configuration management, security, and backup and restore systems.
Source | Destination | Port | Protocol | Direction | Purpose |
---|---|---|---|---|---|
Client machines | Load balancer | 443 | tcp | incoming | Request distribution |
Load balancer | Vault servers | 8200 | tcp | incoming | Vault API |
Vault servers | Vault servers | 8200 | tcp | bidirectional | Cluster bootstrapping |
Vault servers | Vault servers | 8201 | tcp | bidirectional | Raft, replication, request forwarding |
Vault servers | External systems | various | various | various | External APIs |
Consul and Vault servers | Consul servers | 7300* | tcp | incoming | Consul server RPC |
Consul and Vault servers | Consul and Vault servers | 7301* | tcp, udp | bidirectional | Consul LAN gossip |
Note: The ports for RPC and gossip traffic differ from the default values in this architecture.
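The TCP rows of the table above can be spot-checked with a short script run from a machine that should have access. This is a minimal sketch using only the Python standard library. The port map and helper names are illustrative assumptions, and the check covers TCP only, so the UDP side of LAN gossip on port 7301 has to be verified separately.

```python
import socket

# TCP ports from the connectivity table (names are illustrative labels).
REQUIRED_TCP_PORTS = {
    "vault_api": 8200,
    "vault_cluster": 8201,
    "consul_rpc": 7300,
    "consul_lan_gossip": 7301,
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_host(host, ports=REQUIRED_TCP_PORTS):
    """Map each named port to whether it is reachable on the given host."""
    return {name: tcp_port_open(host, port) for name, port in ports.items()}
```

For example, `check_host("vault-1.example.internal")` (a hypothetical hostname) returns a dictionary of reachability results that can be compared against the expected rows of the table.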
Network traffic encryption
All network traffic involving Vault must be encrypted on every segment. Standard HTTPS TLS encryption can be used from client machines to the load balancer and from the load balancer to the Vault servers.
For communication between Vault servers (on port 8201 by default), which is used for request forwarding, Vault automatically negotiates an mTLS connection when new servers first join the cluster via the API address port (8200 by default).
For communication between the Consul agents on the Vault and Consul clusters, it is recommended to configure gossip encryption, which is covered in the Deployment Guide.
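Gossip encryption relies on a shared symmetric key that every agent in the cluster is configured with. As a hedged sketch, the snippet below generates such a key in Python, assuming a 32-byte key encoded as base64. In practice the `consul keygen` command serves this purpose, and the Deployment Guide remains the authoritative procedure. The function name is hypothetical.

```python
import base64
import secrets


def generate_gossip_key(num_bytes=32):
    """Return a base64-encoded random key of num_bytes bytes (assumed key size: 32)."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # The resulting value would be set as the agents' gossip encryption key.
    print(generate_gossip_key())
```

The same key must be distributed to all Consul agents in the storage cluster, including the agents running on the Vault servers, and should be handled as a secret.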
Load balancer recommendations
For the highest level of reliability and stability, it is recommended to use a load balancing technology to distribute requests across your Vault cluster nodes. Each major cloud platform offers good options for managed load balancer services, or there are a number of self-hosted options as well as service discovery systems such as Consul.
If you terminate TLS at your load balancer, also use TLS for the connection from the load balancer to Vault, to minimize the exposure of secrets in transit on your network.
To monitor the health of Vault cluster nodes, configure the load balancer to poll the /v1/sys/health API endpoint to detect the status of each node and direct traffic accordingly. See the sys/health API documentation for specific details about query options and response codes and their meanings.
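To illustrate how a health check can interpret the endpoint, the sketch below maps the default `/v1/sys/health` status codes to node states and applies one possible routing policy. The policy of sending traffic only to the active node (status 200) is an assumption made for this example, and the function names are hypothetical. Query parameters can change the returned codes, so confirm the details against the sys/health API documentation.

```python
import urllib.error
import urllib.request

# Default status codes returned by /v1/sys/health.
HEALTH_STATES = {
    200: "initialized, unsealed, active",
    429: "unsealed, standby",
    472: "disaster recovery secondary, active",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}


def should_route(status_code):
    """Assumed policy: send client traffic only to the active node."""
    return status_code == 200


def probe(base_url, timeout=2.0):
    """Return the HTTP status of GET <base_url>/v1/sys/health, or None if unreachable."""
    url = base_url.rstrip("/") + "/v1/sys/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx codes such as 429 or 503 are raised as HTTPError.
        return err.code
    except (urllib.error.URLError, OSError):
        return None
```

A managed load balancer would normally implement this check natively. The code only shows the decision that the health check encodes.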
Scaling considerations
In a cloud-based environment, it is recommended to use a managed scaling service (such as Auto Scaling groups on AWS) to keep your Vault and Consul clusters populated with healthy instances. However, it is important to make sure that the managed scaling group does not replace all Consul instances too quickly, which would risk data loss.
There are two factors to consider when scaling the performance of your Vault cluster. Adding members to the Vault cluster does not increase performance for activities that trigger a storage write. For Vault Enterprise customers, however, adding performance standby nodes can provide horizontal scalability for read requests within a Vault cluster.
When deploying a Vault cluster, it is important to consider and design for the specific requirements of several failure scenarios:
Node failure
In a high availability cluster using Consul storage, all data is stored in the Consul cluster, so the failure of a Vault node does not risk any data loss. To determine Vault cluster leadership, one of the Vault servers obtains a lock in the Consul data store to become the active Vault node.
If the leader is lost at any time, another Vault node takes its place as cluster leader. To allow for the loss of two Vault nodes, the minimum recommended Vault cluster size is three.
Consul achieves replication and leadership through the use of its consensus and gossip protocols. Because a leader is elected by consensus, a quorum of servers must remain available. To allow for the loss of two nodes from the Consul cluster, the recommended minimum size of the Consul cluster is five nodes.
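The cluster sizes above follow from majority-quorum arithmetic: a consensus cluster of N voting servers needs floor(N/2) + 1 of them available, so it tolerates N minus that number of failures. A minimal sketch of the calculation (the function names are illustrative):

```python
def quorum(servers):
    """Votes needed for a majority among `servers` voting members."""
    return servers // 2 + 1


def failure_tolerance(servers):
    """Number of voting servers that can fail while a quorum remains."""
    return servers - quorum(servers)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, quorum(n), failure_tolerance(n))
```

For five servers the quorum is three, so two servers can be lost. An even-sized cluster gains no extra tolerance over the next smaller odd size, for example four servers still tolerate only one failure. This applies to the Consul storage cluster. Vault itself needs only one surviving node to become active, which is why three Vault nodes tolerate the loss of two.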
Availability zone failure
When Vault and Consul cluster members are deployed in the recommended architecture across three availability zones, the overall architecture can tolerate the loss of a single availability zone.
In cases where deployment across three zones is not possible, the failure of an availability zone can cause the Vault cluster to become unavailable. In a deployment across two availability zones, for example, the failure of one zone would have a 50% chance of causing the Consul cluster to lose its Raft quorum and be unable to serve requests.
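The 50% figure can be reproduced with a short calculation. The sketch below assumes that each availability zone is equally likely to be the one that fails and that every Consul server is a voting member. The function names and the node placements are illustrative.

```python
def survives_zone_loss(zone_sizes, lost_zone):
    """True if the remaining voters still form a majority of the original cluster."""
    total = sum(zone_sizes)
    remaining = total - zone_sizes[lost_zone]
    return remaining >= total // 2 + 1


def zone_outage_survival(zone_sizes):
    """Fraction of single-zone outages after which quorum is still intact."""
    outcomes = [survives_zone_loss(zone_sizes, i) for i in range(len(zone_sizes))]
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    print(zone_outage_survival([2, 2, 1]))  # five nodes across three zones
    print(zone_outage_survival([3, 2]))     # five nodes across two zones
```

With five nodes placed 2/2/1 across three zones, quorum survives every single-zone outage. With five nodes placed 3/2 across two zones, quorum is lost whenever the three-node zone fails, which is one of the two possible outages.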
Cluster/region failure
In the event of a failure of an entire region or cluster, Vault Enterprise provides multi-cluster options for recovery. See the Multi-Cluster Architecture Guide for more information.
External token storage
The Tokenization Transform feature reached general availability in Vault 1.7. This feature introduces additional considerations.
The tokenization feature requires an external data store to facilitate the mapping of tokens to cryptographic values. The disaster recovery provisions for this external store must meet the same requirements that you have for Vault itself. To guarantee data consistency, backups of the external data store must be kept in sync with Vault's.
Vault cluster
A Vault cluster is a set of Vault processes that together run a Vault service. These Vault processes can run on physical or virtual servers, or in containers.
Availability zone
An availability zone is a single network failure domain that hosts part of a Vault cluster. Examples of availability zones include:
- An isolated data center
- An isolated cage in a data center, if it is isolated from other cages by all other means (power, network, etc.)
- An "availability zone" in AWS or Azure; a "zone" in GCP
Region
A region is a collection of one or more availability zones on a low-latency network. Regions are typically separated by significant distances. A region could host one or more Vault clusters, but a single Vault cluster would not be spread across multiple regions because of network latency issues.
Autoscaling
Autoscaling is the process of automatically scaling computational resources based on service activity. Autoscaling can be horizontal, which means adding more machines to the pool of resources, or vertical, which means increasing the capacity of existing machines.
Each major cloud provider offers a managed autoscaling service:
Cloud | Managed autoscaling service |
---|---|
AWS | Auto Scaling Groups |
Azure | Virtual Machine Scale Sets |
GCP | Managed Instance Groups |
Load balancer
A load balancer is a system that distributes network requests across multiple servers. It can be a managed service from a cloud provider, a physical network appliance, a piece of software, or a service discovery platform such as Consul.
Each major cloud provider offers one or more managed load balancing services:
Cloud | Layer | Managed load balancing service |
---|---|---|
AWS | Layer 4 | Network Load Balancer |
AWS | Layer 7 | Application Load Balancer |
Azure | Layer 4 | Azure Load Balancer |
Azure | Layer 7 | Azure Application Gateway |
GCP | Layer 4/7 | Cloud Load Balancing |
Additional references
Vault architecture internals
Consul documentation
Consul internals
Vault Multi-Cluster Architecture Guide
Vault Deployment Guide
Hardening