TiDB Operator - Automatic Failover
What Is Automatic Failover?
If a node in a TiDB cluster fails and stops working, TiDB is expected to add a replacement node automatically so that the cluster stays available.
Automatic Failover Test
For this test, I will kill one of the two tidb-server processes and check whether it recovers automatically.
First, check the tidb-server processes.
[kimkc@localhost ~]$ ps -ef | grep tidb-server | grep -v grep
kimkc 14506 14429 0 11:53 pts/0 00:00:22 /home/kimkc/.tiup/components/tidb/v6.1.0/tidb-server -P 4000 --store=tikv --host=127.0.0.1 --status=10080 --path=127.0.0.1:2379,127.0.0.1:2382,127.0.0.1:2384 --log-file=/home/kimkc/.tiup/data/TLr4XQE/tidb-0/tidb.log
kimkc 14508 14429 0 11:53 pts/0 00:00:26 /home/kimkc/.tiup/components/tidb/v6.1.0/tidb-server -P 4001 --store=tikv --host=127.0.0.1 --status=10081 --path=127.0.0.1:2379,127.0.0.1:2382,127.0.0.1:2384 --log-file=/home/kimkc/.tiup/data/TLr4XQE/tidb-1/tidb.log
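In the output above, the second column is the PID and the -P argument is the port each tidb-server listens on. As a minimal sketch, the PID for a given port could be extracted like this. The helper name pid_for_port is hypothetical, and it assumes the `ps -ef` column layout shown above.

```shell
# Hypothetical helper: read `ps -ef` lines on stdin and print the PID
# (second column) of each tidb-server whose -P argument equals the port.
pid_for_port() {
  awk -v port="$1" '
    /tidb-server/ {
      for (i = 1; i < NF; i++)
        if ($i == "-P" && $(i + 1) == port) print $2
    }'
}

# Example usage against the live process list:
# ps -ef | grep tidb-server | grep -v grep | pid_for_port 4001
```

This avoids reading the PID off the screen by hand when several tidb-server processes are running.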
Forcefully terminate the tidb-server process listening on port 4001 (PID 14508), then list the processes again.
[kimkc@localhost ~]$ kill -9 14508
[kimkc@localhost ~]$ ps -ef | grep tidb-server | grep -v grep
kimkc 14506 14429 0 11:53 pts/0 00:00:22 /home/kimkc/.tiup/components/tidb/v6.1.0/tidb-server -P 4000 --store=tikv --host=127.0.0.1 --status=10080 --path=127.0.0.1:2379,127.0.0.1:2382,127.0.0.1:2384 --log-file=/home/kimkc/.tiup/data/TLr4XQE/tidb-0/tidb.log
[kimkc@localhost ~]$
Only the tidb-server process on port 4000 remains; the one on port 4001 is gone. Wait a while to see whether it comes back.
Even after waiting for some time, the killed process does not come back.
[kimkc@localhost ~]$ ps -ef | grep tidb-server | grep -v grep
kimkc 14506 14429 0 11:53 pts/0 00:00:27 /home/kimkc/.tiup/components/tidb/v6.1.0/tidb-server -P 4000 --store=tikv --host=127.0.0.1 --status=10080 --path=127.0.0.1:2379,127.0.0.1:2382,127.0.0.1:2384 --log-file=/home/kimkc/.tiup/data/TLr4XQE/tidb-0/tidb.log
[kimkc@localhost ~]$
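The manual "wait and check again" step above can also be scripted. The following is a rough sketch rather than an official tool: the function names count_procs and wait_for_count, the timeout, and the polling interval are all assumptions chosen for illustration. It reuses the same ps/grep pipeline as the test.

```shell
# Count running processes whose command line contains the given pattern,
# using the same ps/grep pipeline as in the test above.
count_procs() {
  ps -ef | grep -- "$1" | grep -vc grep
}

# Poll a counting command until it reports at least $1 processes,
# or give up once $2 seconds have elapsed. $3 is the polling interval
# in seconds. The remaining arguments are the command that prints the count.
wait_for_count() {
  local expected=$1 timeout=$2 interval=$3
  shift 3
  local elapsed=0 count
  while :; do
    count=$("$@")
    if [ "$count" -ge "$expected" ]; then
      echo "recovered: $count process(es) running after ${elapsed}s"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "not recovered: $count process(es) running after ${elapsed}s"
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Example: wait up to 5 minutes, checking every 10 seconds, for both
# tidb-server processes to be running again.
# wait_for_count 2 300 10 count_procs tidb-server
```

Passing the counting command as an argument keeps the polling loop independent of how processes are counted, so the same loop could be pointed at a different check later.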
This shows that a killed tidb-server process is not restarted by TiDB's basic functionality alone.
TiDB Operator
Automatic failover requires a separate tool called TiDB Operator.
TiDB Operator manages TiDB clusters running on Kubernetes and automates their operational tasks. With it, TiDB can run as a truly cloud-native database.