Monthly Archives: March 2021

Rewrite lock, copy-on-write

Rewrite lock: lock the data, write it in place, then release the lock.

Copy-on-write: copy the data to a new location, update the copy there, then switch the data's reference to the new location.
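
A minimal sketch of the two approaches side by side, in Python. The class names `LockedStore` and `CopyOnWriteStore` are made up for illustration; the point is only where the write happens (in place under a lock, or on a copy that is swapped in afterwards).

```python
import threading


class LockedStore:
    """Lock the data, write it in place, release the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def put(self, key, value):
        with self._lock:              # acquire; released when the block exits
            self._data[key] = value   # write in place

    def get(self, key):
        with self._lock:
            return self._data.get(key)


class CopyOnWriteStore:
    """Copy the data, update the copy, then swap the reference."""

    def __init__(self):
        self._write_lock = threading.Lock()  # only serializes writers
        self._data = {}

    def put(self, key, value):
        with self._write_lock:
            new_data = dict(self._data)  # copy into a new place
            new_data[key] = value        # update the copy
            self._data = new_data        # point the reference at the copy

    def snapshot(self):
        # Readers take no lock: the dict they get is never modified again.
        return self._data
```

A reader that grabbed `snapshot()` before a write keeps seeing the old data, which is the main trade-off of copy-on-write: cheap, consistent reads in exchange for a full copy on every write.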

Write ahead log(WAL)

WAL persists each operation to disk first, then writes it to the cache. Persisting every operation to disk individually is inefficient. Batching the operations instead improves performance, and it also confines the effect of a failure to a single batch.

https://martinfowler.com/articles/patterns-of-distributed-systems/wal.html

Flushing every log write to the disk gives a strong durability guarantee (which is the main purpose of having logs in the first place), but this severely limits performance and can quickly become a bottleneck. If flushing is delayed or done asynchronously, it improves performance but there is a risk of losing entries from the log if the server crashes before entries are flushed. Most implementations use techniques like Batching, to limit the impact of the flush operation.
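
A rough sketch of the batching idea in Python. The class `BatchedWAL`, the `key=value` line format, and the `batch_size` parameter are assumptions made for illustration, not taken from any particular database. Operations are buffered, written and fsynced to the log as one batch, and only then applied to the in-memory cache.

```python
import os


class BatchedWAL:
    def __init__(self, path, batch_size=3):
        self._file = open(path, "a", encoding="utf-8")
        self._batch_size = batch_size
        self._pending = []  # operations accepted but not yet durable
        self.cache = {}     # in-memory state, updated only after a flush

    def put(self, key, value):
        self._pending.append((key, value))
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        if not self._pending:
            return
        # 1. Persist the whole batch with a single flush + fsync.
        for key, value in self._pending:
            self._file.write(f"{key}={value}\n")
        self._file.flush()
        os.fsync(self._file.fileno())
        # 2. Only after the log is on disk, apply the batch to the cache.
        for key, value in self._pending:
            self.cache[key] = value
        self._pending.clear()

    def close(self):
        self.flush()
        self._file.close()
```

With this layout a crash loses at most the operations still sitting in `_pending`, which is the risk the quoted passage describes for delayed flushing.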

mysql on mac

1. restart
/usr/local/bin/mysql.server restart

2. show mysql variables
mysqladmin variables

Several ways to read the AWS access key/secret, each using a different CredentialsProvider:

  1. InstanceProfileCredentialsProvider. An instance can have an instance role. An application running on the instance can use InstanceProfileCredentialsProvider() to retrieve the instance role and get its access. For example, the instance_profile of an EMR cluster is the role for the EMR instances.
  2. StsAssumeRoleSessionCredentialsProvider
  3. AWSStaticCredentialsProvider
  4. AWSCredentialsProviderChain. It tries each credentials provider in turn until one of them returns credentials.

Here is a code example

Category: aws

Find the count of pairs in an array whose sum is greater than or equal to a target.

Given an array [4 2 1 3 5] and a target of 5, return the number of pairs whose sum is greater than or equal to the target.

Technique: 1. sort, 2. use two pointers.

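1. Order is not fixed. [2, 3] and [3, 2] count as different pairs, and an element may be paired with itself, e.g. [3, 3]. Each row below shows the first valid partner for a, then all pairs counted for that a.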

[1 2 3 4 5]

[1, 4] -> [1, 4], [1, 5]
[2, 3] -> [2, 3], [2, 4], [2, 5]
[3, 2] -> [3, 2], [3, 3], [3, 4], [3, 5]
[4, 1] -> [4, 1], [4, 2], [4, 3], [4, 4], [4, 5]
[5, 1] -> [5, 1], [5, 2], [5, 3], [5, 4], [5, 5]
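
A sketch of this first case in Python, assuming the function name `count_ordered_pairs` (made up here). After sorting, `j` is the index of the first partner that works for the current `a`; it only ever moves left as `a` grows, which is what makes this a two-pointer scan.

```python
def count_ordered_pairs(nums, target):
    """Count ordered pairs (a, b), self-pairs allowed, with a + b >= target."""
    values = sorted(nums)
    n = len(values)
    j = n       # first index whose value pairs with values[i]; n means none yet
    total = 0
    for i in range(n):
        # A larger a can accept smaller partners, so j moves left only.
        while j > 0 and values[i] + values[j - 1] >= target:
            j -= 1
        total += n - j  # every partner from index j to the end is valid
    return total


print(count_ordered_pairs([4, 2, 1, 3, 5], 5))  # 2 + 3 + 4 + 5 + 5 = 19
```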

2. Pair order is fixed: v0 < v1. [2, 3] and [3, 2] are the same pair, and [3, 3] is not allowed.

[1 2 3 4 5]

[a, b]
[1, 4] -> [1, 4], [1, 5]
[2, 3] -> [2, 3], [2, 4], [2, 5]
[3, 2] -> [3, 2], [3, 3], [3, 4], [3, 5]
[4, 1] -> [4, 1], [4, 2], [4, 3], [4, 4], [4, 5]
[5, 1] -> [5, 1], [5, 2], [5, 3], [5, 4], [5, 5]

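For each a, the count is len - Math.max(i + 1, j), where i is the index of a in the sorted array and j is the index of the first valid partner b. In other words, from each row above we keep only the pairs with a < b to avoid duplicates: [1, 4], [1, 5], [2, 3], [2, 4], [2, 5], [3, 4], [3, 5], [4, 5].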
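
A sketch of the fixed-order case in Python, assuming the function name `count_unordered_pairs` (made up here). It uses the same two-pointer scan over the sorted array, but for the element at index `i` it only counts partners at indices greater than `i`, so each pair is counted once. With duplicate values it counts pairs of distinct positions.

```python
def count_unordered_pairs(nums, target):
    """Count pairs of distinct positions whose values sum to >= target."""
    values = sorted(nums)
    n = len(values)
    j = n       # first index whose value pairs with values[i]; n means none yet
    total = 0
    for i in range(n):
        while j > 0 and values[i] + values[j - 1] >= target:
            j -= 1
        # Partners must be valid (index >= j) and to the right of i (index > i).
        total += n - max(i + 1, j)
    return total


print(count_unordered_pairs([4, 2, 1, 3, 5], 5))  # 2 + 3 + 2 + 1 + 0 = 8
```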

 

Related: https://www.youtube.com/watch?v=9pAVTzVYnPc