Monthly Archives: April 2021

short polling, long polling, websocket

short polling
ajax based call. Every fixed amount of time, it calls server and get the result(the result might be empty as well).

long polling
When client and server has opened a TCP connection, it won’t close. Client just wait. Whenever there is message in server side, it sends to client.

web socket
client builds a web socket with server. When it opens, server can control the websocket, and send whatever server wants to client.

https://www.youtube.com/watch?v=ZBM28ZPlin8

Union Find complexity

space complexity O(n),

time complexity for both union and find O(\alpha (n)), in practice O(\alpha (n)) is normally less than 5. So can be treated as O(1) constant time.

 

 

https://zh.wikipedia.org/wiki/%E5%B9%B6%E6%9F%A5%E9%9B%86

空间复杂度[编辑]

容易看出,不交集森林的空间复杂度是{\displaystyle \mathrm {O} \left(n\right)}{\displaystyle \mathrm {O} \left(n\right)}

时间复杂度[编辑]

对于同时使用路径压缩和按秩合并优化的不交集森林,每个查询和合并操作的平均时间复杂度仅为{\displaystyle O(\alpha (n))}O(\alpha (n)){\displaystyle \alpha (n)}\alpha (n)是反阿克曼函数。由于阿克曼函数{\displaystyle A}A增加极度迅速,所以{\displaystyle \alpha (n)}\alpha (n)增长极度缓慢,对于任何在实践中有意义的元素数目{\displaystyle n}n{\displaystyle \alpha (n)}\alpha (n)均小于5,因此,也可以粗略地认为,并查集的操作有常数的时间复杂度。

write intensive DB. LSMTree

scenario, intensive write. Also once written in DB, can be used immediately. Such as

LSM Tree, (Log Structured Merge Tree)
In memory, calld memtable.
In disk called sstable(Sorted String Table)

1. append to WAL first, then append to memtable. In any case system is crashed, can recover from WAL. memtable is sorted. When it reaches the size, flush to disk sstable.
2. When query, 1st find in memtable. If miss, find in disk. The latest record in sstable. The worst case is O(NlogN). N is the number of sstable. Also when query, use bloom filter to improve accuracty.
3. compact. Have a background thread, always compact sstable.

https://www.youtube.com/watch?v=oUNjDHYFES8

DDIA chapter 7, Replica

leader based repliaction, elasticache
replica pulls master replication log

sync, async, semi-sync(only 1 replica is sync, other replicas are async)

setup a new replica, or replica is down
1. copy snapshot
2. start sync from the log sequence number

master is down
1. replica votes a new master
2. new master has data almost same as old master
3. when old master came back, let old master be a replica

How to sync to replica?
1. statement based. INSERT/UPDATE/DELETE, but it may have nondeterministic statement, such as random(). Or where conditions. But different replica are not synced
2. WAL, consistent, low level, binary. But it’s highly relates to database product and maybe version.
3. row/record based. change data capatured

Async replica could cause data inconsistency between master and replica. In order to conquer this, there is read-after-write consistency. It enforces that replica are all updated.

transaction, distributed, replica keywords

facebook tao system.
single leader replication.

collaborative editing
https://www.zhihu.com/question/274573543
https://operational-transformation.github.io/index.html

OT, operational transaction

write through cache, read through cache

2 leaders

ddia figure 5-8

ring replication

tungstn

master-master replication
Conflict-free replicated data type

distributed lock -> distributed cache, chubby

读写异步
写异步->写事件流

paxos
multi-paxos

read after write, cascading rollback
write after read, non repeatable read.

https://jepsen.io/consistency

ACID vs BASE

cockroachDB vs spanner

compare and swap

sharding, replica

single leader
multiple leader
leaderless

sync只比async快一点点,从多个replica上读,能够确保一致性。
redis sync replication, use wait timeout,still can’t gurantee strong consistency.
Kafka单个partition其实read和write都在leader

Chandy Lamport algo, 增量snapshot, which is used by flink
WAL, use UPS to keep not down.
snapshot is actually compact for WAL.

zookeeper, need to review, how the CP works.
raft

paxos
consensus, agree on one result
3 roles: proposers, acceptors, learners
single node can take multiple roles, all all of roles.

Isolation Level

Optimistic lock, when do transaction, read the version I’d and timestamp. Dismiss when version is or timestamp changed. Good for a lot of read scenario

Pessimistic lock, when do transaction, lock the record

3 types of read anomaly:
Dirty read
Non repeatable read
Phantom read

4 Isolation levels

Uncommitted read
read an uncommitted data. After read the, data change is committed. There might be dirty read.

Committed read

Make sure the data to read is committed. But when read the data 2nd time, might be different

Non-repeatable read

In same transaction, when reading a record for several times. Need to make sure it reads same value.

Snapshot isolation(MVCC)

It is same as repeatable read. It solves non repeatable read problem. Can do read, write at same time, no need lock. Compared with lock way.

Serializable