Difference between revisions of "LPIC-306 Objectives V3.0"
Latest revision as of 11:44, 21 February 2023
Introduction
The description of the entire LPIC-3 program is listed here.
Version Information
These objectives are for version 3.0.
This exam results from a split of version 2.0 of the exam 304.
There is also a summary and detailed information on the changes from version 2.0 of exam 304 to 3.0 of these objectives.
The version 2.x objectives can be found here.
Translations of Objectives
The following translations of the objectives are available on this wiki:
- English
- Japanese
Objectives
Topic 361: High Availability Cluster Management
361.1 High Availability Concepts and Theory (weight: 6)
Weight | 6 |
Description | Candidates should understand the properties and design approaches of high availability clusters. |
Key Knowledge Areas:
- Understand the goals of High Availability and Site Reliability Engineering
- Understand common cluster architectures
- Understand recovery and cluster reorganization mechanisms
- Design an appropriate cluster architecture for a given purpose
- Understand application aspects of high availability
- Understand operational considerations of high availability
Partial list of the used files, terms and utilities:
- Active/Passive Cluster
- Active/Active Cluster
- Failover Cluster
- Load Balanced Cluster
- Shared-Nothing Cluster
- Shared-Disk Cluster
- Cluster resources
- Cluster services
- Quorum
- Fencing (Node and Resource Level Fencing)
- Split brain
- Redundancy
- Mean Time Between Failures (MTBF)
- Mean Time To Repair (MTTR)
- Service Level Agreement (SLA)
- Disaster Recovery
- State Handling
- Replication
- Session handling
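The MTBF, MTTR and quorum terms listed above lend themselves to a small worked calculation. The following is a minimal sketch, assuming the common steady-state definition of availability as MTBF / (MTBF + MTTR) and a simple majority quorum; the function names are illustrative and do not come from any cluster tool.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a service is up."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_hours(avail: float, hours_per_year: float = 8760.0) -> float:
    """Expected downtime per year for a given availability (365-day year assumed)."""
    return (1.0 - avail) * hours_per_year


def quorum_votes(total_votes: int) -> int:
    """Smallest strict majority of the votes in a cluster."""
    if total_votes < 1:
        raise ValueError("a cluster needs at least one vote")
    return total_votes // 2 + 1


if __name__ == "__main__":
    a = availability(mtbf_hours=999.0, mttr_hours=1.0)
    print(f"availability: {a:.4f}")
    print(f"downtime per year: {annual_downtime_hours(a):.2f} h")
    print(f"votes needed in a 5-node cluster: {quorum_votes(5)}")
```

Comparing the resulting downtime with the figure promised in an SLA is one way to judge whether a planned architecture is sufficient. The majority rule also shows why a two-node cluster cannot keep quorum after losing one node without additional measures.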
361.2 Load Balanced Clusters (weight: 8)
Weight | 8 |
Description | Candidates should know how to install, configure, maintain and troubleshoot LVS. This includes the configuration and use of keepalived and ldirectord. Candidates should further be able to install, configure, maintain and troubleshoot HAProxy. |
Key Knowledge Areas:
- Understand the concepts of LVS / IPVS
- Understand the basics of VRRP
- Configure keepalived
- Configure ldirectord
- Configure backend server networking
- Understand HAProxy
- Configure HAProxy
Partial list of the used files, terms and utilities:
- ipvsadm
- syncd
- LVS Forwarding (NAT, Direct Routing, Tunneling, Local Node)
- connection scheduling algorithms
- keepalived configuration file
- ldirectord configuration file
- genhash
- HAProxy configuration file
- load balancing algorithms
- ACLs
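To illustrate what a connection scheduling algorithm does, the following is a simplified sketch of weighted least-connection selection: each new connection goes to the backend with the lowest ratio of active connections to weight. It is an assumption-laden model for study purposes, not the IPVS or HAProxy implementation, and the `Backend` class and `pick_wlc` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    weight: int      # relative capacity; 0 means "do not schedule"
    active: int = 0  # currently active connections


def pick_wlc(backends: list) -> Backend:
    """Return the backend with the fewest active connections per unit of weight."""
    candidates = [b for b in backends if b.weight > 0]
    if not candidates:
        raise ValueError("no backend with a positive weight")
    # min() keeps the first candidate on ties, so list order breaks ties.
    return min(candidates, key=lambda b: b.active / b.weight)


if __name__ == "__main__":
    pool = [Backend("a", weight=2), Backend("b", weight=1)]
    for _ in range(6):
        chosen = pick_wlc(pool)
        chosen.active += 1
    for b in pool:
        print(b.name, b.active)
```

In this model a backend with twice the weight ends up with roughly twice the connections, which is the behaviour weights are meant to express.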
361.3 Failover Clusters (weight: 8)
Weight | 8 |
Description | Candidates should have experience in the installation, configuration, maintenance and troubleshooting of a Pacemaker cluster. This includes the use of Corosync. The focus is on Pacemaker 2.x for Corosync 2.x. |
Key Knowledge Areas:
- Understand the architecture and components of Pacemaker (CIB, CRMd, PEngine, LRMd, DC, STONITHd)
- Manage Pacemaker cluster configurations
- Understand Pacemaker resource classes (OCF, LSB, Systemd, Service, STONITH, Nagios)
- Manage Pacemaker resources
- Manage resource rules and constraints (location, order, colocation)
- Manage advanced resource features (templates, groups, clone resources, multi-state resources)
- Obtain node information and manage node health
- Manage quorum and fencing in a Pacemaker cluster
- Configure the Split Brain Detector on shared storage
- Manage Pacemaker using pcs
- Manage Pacemaker using crmsh
- Configure and manage Corosync in conjunction with Pacemaker
- Awareness of Pacemaker ACLs
- Awareness of other cluster engines (OpenAIS, Heartbeat, CMAN)
Partial list of the used files, terms and utilities:
- pcs
- crm
- crm_mon
- crm_verify
- crm_simulate
- crm_shadow
- crm_resource
- crm_attribute
- crm_node
- crm_standby
- cibadmin
- corosync.conf
- authkey
- corosync-cfgtool
- corosync-cmapctl
- corosync-quorumtool
- stonith_admin
- stonith
- ocf:pacemaker:ping
- ocf:pacemaker:NodeUtilization
- ocf:pacemaker:SysInfo
- ocf:pacemaker:HealthCPU
- ocf:pacemaker:HealthSMART
- sbd
Topic 362: High Availability Cluster Storage
362.1 DRBD (weight: 6)
Weight | 6 |
Description | Candidates are expected to have the experience and knowledge to install, configure, maintain and troubleshoot DRBD devices. This includes integration with Pacemaker. DRBD configuration of version 9.0.x is covered. |
Key Knowledge Areas:
- Understand the DRBD architecture
- Understand DRBD resources, states and replication modes
- Configure DRBD disks and devices
- Configure DRBD networking connections and meshes
- Configure DRBD automatic recovery and error handling
- Configure DRBD quorum and handlers for split brain and fencing
- Manage DRBD using drbdadm
- Understand the principles of drbdsetup and drbdmeta
- Restore and verify the integrity of a DRBD device after an outage
- Integrate DRBD with Pacemaker
- Understand the architecture and features of LINSTOR
Partial list of the used files, terms and utilities:
- Protocol A, B and C
- Primary, Secondary
- Three-way replication
- drbd kernel module
- drbdadm
- drbdmon
- drbdsetup
- drbdmeta
- /etc/drbd.conf
- /etc/drbd.d/
- /proc/drbd
362.2 Cluster Storage Access (weight: 3)
Weight | 3 |
Description | Candidates should be able to connect a Linux node to remote block storage. This includes understanding common SAN technologies and architectures, including management of iSCSI, as well as configuring multipathing for high availability and using LVM on clustered storage. |
Key Knowledge Areas:
- Understand the concepts of Storage Area Networks
- Understand the concepts of Fibre Channel, including Fibre Channel Topologies
- Understand and manage iSCSI targets and initiators
- Understand and configure Device Mapper Multipath I/O (DM-MPIO)
- Understand the concept of a Distributed Lock Manager (DLM)
- Understand and manage clustered LVM
- Manage DLM and LVM with Pacemaker
Partial list of the used files, terms and utilities:
- tgtadm
- targets.conf
- iscsiadm
- iscsid.conf
- /etc/multipath.conf
- multipath
- kpartx
- pvmove
- vgchange
- lvchange
362.3 Clustered File Systems (weight: 4)
Weight | 4 |
Description | Candidates should be able to install, maintain and troubleshoot GFS2 and OCFS2 filesystems. This includes awareness of other clustered filesystems available on Linux. |
Key Knowledge Areas:
- Understand the principles of cluster file systems and distributed file systems
- Understand the Distributed Lock Manager
- Create, maintain and troubleshoot GFS2 file systems in a cluster
- Create, maintain and troubleshoot OCFS2 file systems in a cluster
- Awareness of the O2CB cluster stack
- Awareness of other commonly used clustered file systems, such as AFS and Lustre
Partial list of the used files, terms and utilities:
- mkfs.gfs2
- mount.gfs2
- fsck.gfs2
- gfs2_grow
- gfs2_edit
- gfs2_jadd
- mkfs.ocfs2
- mount.ocfs2
- fsck.ocfs2
- tunefs.ocfs2
- mounted.ocfs2
- o2info
- o2image
Topic 363: High Availability Distributed Storage
363.1 GlusterFS Storage Clusters (weight: 5)
Weight | 5 |
Description | Candidates should be able to manage and maintain a GlusterFS storage cluster. |
Key Knowledge Areas:
- Understand the architecture and components of GlusterFS
- Manage GlusterFS peers, trusted storage pools, bricks and volumes
- Mount and use an existing GlusterFS
- Configure high availability aspects of GlusterFS
- Scale up a GlusterFS cluster
- Replace failed bricks
- Recover GlusterFS from a physical media failure
- Restore and verify the integrity of a GlusterFS cluster after an outage
- Awareness of GNFS
Partial list of the used files, terms and utilities:
- gluster (including relevant subcommands)
363.2 Ceph Storage Clusters (weight: 8)
Weight | 8 |
Description | Candidates should be able to manage and maintain a Ceph cluster. This includes the configuration of RGW, RBD devices and CephFS. |
Key Knowledge Areas:
- Understand the architecture and components of Ceph
- Manage OSD, MGR, MON and MDS
- Understand and manage placement groups and pools
- Understand storage backends (FileStore and BlueStore)
- Initialize a Ceph cluster
- Create and manage RADOS Block Devices
- Create and manage CephFS volumes, including snapshots
- Mount and use an existing CephFS
- Understand and adjust CRUSH maps
- Configure high availability aspects of Ceph
- Scale up a Ceph cluster
- Restore and verify the integrity of a Ceph cluster after an outage
- Understand key concepts of Ceph updates, including update order, tunables and features
Partial list of the used files, terms and utilities:
- ceph-deploy (including relevant subcommands)
- ceph.conf
- ceph (including relevant subcommands)
- rados (including relevant subcommands)
- rbd (including relevant subcommands)
- cephfs (including relevant subcommands)
- ceph-volume (including relevant subcommands)
- ceph-authtool
- ceph-bluestore-tool
- crushtool
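For the placement group objective above, a frequently quoted rule of thumb sizes the total PG count as roughly (number of OSDs × 100) / replica count, rounded up to the next power of two. The sketch below encodes only that heuristic. The target of 100 PGs per OSD is an assumption, the function name is hypothetical, and newer Ceph releases can adjust PG counts automatically, so treat the result as a starting point rather than a recommendation.

```python
def suggested_pg_count(num_osds: int, replicas: int, target_pgs_per_osd: int = 100) -> int:
    """Rule-of-thumb total PG count, rounded up to a power of two."""
    if num_osds < 1 or replicas < 1 or target_pgs_per_osd < 1:
        raise ValueError("all arguments must be positive integers")
    raw = num_osds * target_pgs_per_osd / replicas
    power = 1
    while power < raw:
        power *= 2
    return power


if __name__ == "__main__":
    # Ten OSDs with three replicas: 10 * 100 / 3 is about 333, next power of two is 512.
    print(suggested_pg_count(num_osds=10, replicas=3))
```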
Topic 364: Single Node High Availability
364.1 Hardware and Resource High Availability (weight: 2)
Weight | 2 |
Description | Candidates should be able to monitor a local node for potential hardware failures and resource shortages. |
Key Knowledge Areas:
- Understand and monitor S.M.A.R.T. values using smartmontools, including triggering frequent disk checks
- Configure system shutdown at specific UPS events
- Configure monit for alerts in case of resource exhaustion
Partial list of the used files, terms and utilities:
- smartctl
- /etc/smartd.conf
- smartd
- nvme-cli
- apcupsd
- apctest
- monit
364.2 Advanced RAID (weight: 2)
Weight | 2 |
Description | Candidates should be able to manage software RAID devices on Linux. This includes advanced features such as partitionable RAIDs and RAID containers as well as recovering RAID arrays after a failure. |
Key Knowledge Areas:
- Manage RAID devices using various RAID levels, including hot spare disks, partitionable RAIDs and RAID containers
- Add and remove devices from an existing RAID
- Change the RAID level of an existing device
- Recover a RAID device after a failure
- Understand various metadata formats and RAID geometries
- Understand availability and performance properties of various RAID levels
- Configure mdadm monitoring and reporting
Partial list of the used files, terms and utilities:
- mdadm
- /proc/mdstat
- /proc/sys/dev/raid/*
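The availability properties of the common RAID levels can be summarised with simple arithmetic. The following is a minimal sketch, assuming equally sized member devices, no hot spares, and a RAID 10 layout with two copies of each block; the function name and the returned tuple are illustrative and unrelated to mdadm's own output.

```python
def raid_properties(level: str, devices: int, device_size: float) -> tuple:
    """Return (usable capacity, device failures that are always survivable)."""
    if devices < 2 or device_size <= 0:
        raise ValueError("need at least two devices of positive size")
    formulas = {
        "raid0": (devices * device_size, 0),         # striping, no redundancy
        "raid1": (device_size, devices - 1),         # every device is a full mirror
        "raid5": ((devices - 1) * device_size, 1),   # one device's worth of parity
        "raid6": ((devices - 2) * device_size, 2),   # two devices' worth of parity
        "raid10": (devices * device_size / 2, 1),    # two copies of each block
    }
    if level not in formulas:
        raise ValueError(f"unsupported level: {level}")
    capacity, tolerated = formulas[level]
    if capacity <= 0:
        raise ValueError(f"{level} needs more than {devices} devices")
    return capacity, tolerated


if __name__ == "__main__":
    for level in ("raid0", "raid1", "raid5", "raid6", "raid10"):
        capacity, tolerated = raid_properties(level, devices=4, device_size=2.0)
        print(f"{level}: {capacity} TB usable, survives {tolerated} failure(s)")
```

The second value is the guaranteed minimum. A RAID 10 array, for example, may survive additional failures when the failed devices do not hold copies of the same data.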
364.3 Advanced LVM (weight: 3)
Weight | 3 |
Description | Candidates should be able to configure LVM volumes. This includes managing LVM snapshots, pools and RAIDs. |
Key Knowledge Areas:
- Understand and manage LVM, including linear and striped volumes
- Extend, grow, shrink and move LVM volumes
- Understand and manage LVM snapshots
- Understand and manage LVM thin and thick pools
- Understand and manage LVM RAIDs
Partial list of the used files, terms and utilities:
- /etc/lvm/lvm.conf
- pvcreate
- pvdisplay
- pvmove
- pvremove
- pvresize
- vgcreate
- vgdisplay
- vgreduce
- lvconvert
- lvcreate
- lvdisplay
- lvextend
- lvreduce
- lvresize
364.4 Network High Availability (weight: 5)
Weight | 5 |
Description | Candidates should be able to configure redundant networking connections and manage VLANs. Furthermore, candidates should have a basic understanding of BGP. |
Key Knowledge Areas:
- Understand and configure bonded network interfaces
- Understand network bond modes and algorithms (active-backup, balance-tlb, balance-alb, 802.3ad, balance-rr, balance-xor, broadcast)
- Configure switches for high availability, including RSTP
- Configure VLANs on regular and bonded network interfaces
- Persist bonding and VLAN configuration
- Understand the principle of autonomous systems and BGP to manage external redundant uplinks
- Awareness of traffic shaping and control capabilities of Linux
Partial list of the used files, terms and utilities:
- bonding.ko (including relevant module options)
- /etc/network/interfaces
- /etc/sysconfig/network-scripts/ifcfg-*
- /etc/systemd/network/*.network
- /etc/systemd/network/*.netdev
- nmcli
- /sys/class/net/bonding_masters
- /sys/class/net/bond*/bonding/miimon
- /sys/class/net/bond*/bonding/slaves
- ifenslave
- ip
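The sysfs entries listed above expose the bonding state as plain text, which makes them easy to inspect from a script. The following is a minimal sketch, assuming that `bonding_masters` and each bond's `bonding/slaves` file contain whitespace-separated interface names; the helper names `parse_sysfs_list` and `read_bonds` are hypothetical.

```python
from __future__ import annotations

from pathlib import Path


def parse_sysfs_list(text: str) -> list[str]:
    """Split a whitespace-separated sysfs value into a list of names."""
    return text.split()


def read_bonds(sysfs_root: Path = Path("/sys/class/net")) -> dict[str, list[str]]:
    """Map each bond interface to its member interfaces.

    Returns an empty dict when the bonding module is not loaded and the
    bonding_masters file therefore does not exist.
    """
    masters = sysfs_root / "bonding_masters"
    if not masters.exists():
        return {}
    bonds = parse_sysfs_list(masters.read_text())
    return {
        bond: parse_sysfs_list((sysfs_root / bond / "bonding" / "slaves").read_text())
        for bond in bonds
    }


if __name__ == "__main__":
    for bond, members in read_bonds().items():
        print(f"{bond}: {', '.join(members) or 'no members'}")
```

Passing a different `sysfs_root` allows the function to be tried against a directory tree of sample files instead of a live system.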