Monthly Archives: February 2016

SHOW CREATE TABLE in SqlServer

In Oracle, we can easily get the DDL clause. In mysql, we can use SHOW CREATE TABLE.

In SqlServer, there is no similar command. However, we can get data type by below command:

SELECT name, system_type_name, is_nullable FROM
sys.dm_exec_describe_first_result_set(‘SELECT * FROM table‘, NULL, 0)

Zookeeper Study Summary

In terms of CAP theorem, Zookeeper is a CP system. It emphasizes consistency in distributed system. It is a Leader – Follower based system. Zookeeper assumes that at least n/2+1 nodes are available. This guarantees brain-split won’t happen to zookeeper.

How to handle partial failure?
For example, there are 5 nodes. 2 of them are down. The rest of 3 keeps working. Because more than half is running. If another one is down, only 2 are working. When client connects to one of the 2 working nodes. Even though 2 of them are not down. But because 2 nodes know that less than 3 nodes are available among them, the 2 nodes will become unavailable to client. We should know that 3 out of 5 nodes are down is unacceptable to zookeeper.

How to write?
When a client submits a write request. This request will be forwarded to leader. Then leader will forward this write request to all followers. Only more than n/2+1 nodes acknowledged, then it means a write is successful. Write to data will be firstly persistent to disk, then it will be available in memory.

How to read?
Client reach a node, node directly returns the data values in-memory. Because they maybe network delay, the data in a node still may not be the up-to-date. So in order to keep a more strict consistency, we can call sync() first before read.

How to recover?
Zookeeper keeps a snapshot of each state called fuzzy snapshot. Because there may be network delay, it is possible that a state at a node doesn’t correspond to any state in leader. So we called it fuzzy snapshot. When a node is crashed and tried to recover, it will recover based on the fuzz snapshot it has and log from other nodes.

Study materials:
ZooKeeper: Wait-freecoordinationforInternet-scalesystems
http://zookeeper.apache.org/doc/r3.1.2/zookeeperInternals.html

Self Crossing

You are given an array x of n positive numbers. You start at point (0,0) and moves x[0] metres to the north, then x[1] metres to the west, x[2] metres to the south, x[3] metres to the east and so on. In other words, after each move your direction changes counter-clockwise.

Write a one-pass algorithm with O(1) extra space to determine, if your path crosses itself, or not.

Example 1:
Given x = [2, 1, 1, 2]
Return true (self crossing)
Example 2:
Given x = [1, 2, 3, 4]
Return false (not self crossing)
Example 3:
Given x = [1, 1, 1, 1]
Return true (self crossing)

Solution. The solution is to solve the below 3 scenarios.

selfcrossing1  selfcrossing2 selfcrossing3

Accordingly, the code is like below:

public static boolean isSelfCrossing(int[] x) {
    if (x == null || x.length <= 3) {
        return false;
    }
    for (int i = 3; i < x.length; i++) {
        if (x[i - 1] <= x[i - 3] && x[i] >= x[i - 2]) {
            return true;
        }
        else if (i >= 4 && x[i - 1] == x[i - 3] && x[i] + x[i - 4] >= x[i - 2]) {
            return true;
        }
        else if (i >= 5 && x[i - 1] <= x[i - 3] && x[i - 2] >= x[i - 4]
                && x[i] + x[i - 4] >= x[i - 2] && x[i - 5] + x[i - 1] >= x[i - 3]) {
            return true;
        }
    }
    return false;
}

The solution is found here.

Check my code on github: link

Largest BST subtree

http://www.geeksforgeeks.org/find-the-largest-subtree-in-a-tree-that-is-also-a-bst/

Given a Binary Tree, write a function that returns the size of the largest subtree which is also a Binary Search Tree (BST). If the complete Binary Tree is BST, then return the size of whole tree.

Examples:

Input: 
      5
    /  \
   2    4
 /  \
1    3

Output: 3 
The following subtree is the maximum size BST subtree 
   2  
 /  \
1    3


Input: 
       50
     /    \
  30       60
 /  \     /  \ 
5   20   45    70
              /  \
            65    80
Output: 5
The following subtree is the maximum size BST subtree 
      60
     /  \ 
   45    70
        /  \
      65    80

Solution. If a node is a subtree, max value from left subtree should be less than node.val and min value from right subtree should be greater than node.val.

Define a Triplet for (max, min, ans). If ans is negative, it means subtree is not a BST. The largest BST in subtree is abs(ans).

Check my code on github: link

Increasing Triplet Subsequence

Given an unsorted array return whether an increasing subsequence of length 3 exists or not in the array.

Formally the function should:
Return true if there exists i, j, k
such that arr[i] < arr[j] < arr[k] given 0 ≤ i < j < k ≤ n-1 else return false.
Your algorithm should run in O(n) time complexity and O(1) space complexity.

Examples:
Given [1, 2, 3, 4, 5],
return true.

Given [5, 4, 3, 2, 1],
return false.

Solution. The idea is to define firstMin and secondMin. If any element nums[i] is greater than secondMin, then return true. Because firstMin < secondMin < nums[i]. If not, then we update firstMin and secondMin accordingly.

increasingTripletSubsequence

I came up with a solution passing all test. But it is not as clean as the solution from this post.

Check my code on github: link

Reconstruct Itinerary

leetcode. 332

Given a list of airline tickets represented by pairs of departure and arrival airports [from, to], reconstruct the itinerary in order. All of the tickets belong to a man who departs from JFK. Thus, the itinerary must begin with JFK.

Note:
If there are multiple valid itineraries, you should return the itinerary that has the smallest lexical order when read as a single string. For example, the itinerary [“JFK”, “LGA”] has a smaller lexical order than [“JFK”, “LGB”].
All airports are represented by three capital letters (IATA code).
You may assume all tickets form at least one valid itinerary.
Example 1:
tickets = [[“MUC”, “LHR”], [“JFK”, “MUC”], [“SFO”, “SJC”], [“LHR”, “SFO”]]
Return [“JFK”, “MUC”, “LHR”, “SFO”, “SJC”].
Example 2:
tickets = [[“JFK”,”SFO”],[“JFK”,”ATL”],[“SFO”,”ATL”],[“ATL”,”JFK”],[“ATL”,”SFO”]]
Return [“JFK”,”ATL”,”JFK”,”SFO”,”ATL”,”SFO”].
Another possible reconstruction is [“JFK”,”SFO”,”ATL”,”JFK”,”ATL”,”SFO”]. But it is larger in lexical order.

Solution. It has a lot of descriptions. There is short and awesome solution from this post. But it’s hard to me to figure out this efficient solution. I prefer a classical backtrace.

Basically, it asks to return a alphabetic order Euler path. For each stop, we can do DFS on the departure points by alphabetic order. But there is chance that a departure point won’t have final result. So we need to do backtrace.

check my code on github: link

Intensive write: Mysql vs Redis

This is another test on mysql and redis write speed. The test is easy. Keep updating a counter in mysql and redis for 10s. And see how many times it can update. Mysql uses InnoDB engine. Each update commits.

redis_mysql2

redis_mysql

Result shows that redis has better update performance. A redis server can handle up to 40K update per second.

Redis / Mysql stays in same machine with 2.9GHz CPU, 4GB memory.

You can find my testing code on github: link

Stored Procedure vs ETL Tool

This is a test for stored procedure and ETL tool kettle. The test is very easy: insert [1, 2, …, 10000] numbers to a table in Mysql(InnoDB). And I got below result.

kettle_sp2

kettle_sp3

kettle_sp

The conclusion is that performance between stored procedure and etl tool doesn’t differ more than 10 times. This is relatively fine if we are not too picky about performance.

———————————–

Code to insert rows in mysql:

DELIMITER $$

USE `test`$$

DROP PROCEDURE IF EXISTS `insert_test`$$

CREATE DEFINER=`root`@`%` PROCEDURE `insert_test`()
BEGIN
DECLARE a INT DEFAULT 0;
SET a = 1;

WHILE a <= 10000 DO
INSERT INTO test VALUES(a);
SET a = a + 1;
END WHILE;

END$$

DELIMITER ;

SET autocommit = 0;
CALL test.`insert_test`();
COMMIT;

Difference between Varray and Nested Table in PL/SQL

In PL/SQL, there are 3 types of collections: Nested Table, Varray and Associative Array. Associative Array is like HashMap in Java which is easy to understand. Nested Table and Varray are like array in java. They are confusing for java developers. Below are their differences:

Varray
It has max size which is defined in type. It can’t extend over than max size defined in type.
Element in it can be changed, but can’t be deleted. It will have exception if visit an OutofBound position.

Nested Table
An array without limit size. We can extend as much as we want. An element can be deleted. But will raise exception if we visit a deleted element.
It will have exception if visit an OutofBound position.

Try below code in PL/SQL:

Varray

DECLARE
TYPE varray_type IS VARRAY(5) OF NUMBER;
p_level varray_type;
Test 1
p_level := varray_type(1, 2, 3, 4, 5);
DBMS_OUTPUT.PUT_LINE(p_level.COUNT); — 5
DBMS_OUTPUT.PUT_LINE(p_level(2)); — 2
p_level.DELETE(2); — won’t pass compilation. Not allow to delete.
Test 2
p_level := varray_type(1, 2, 3, 4, 5, 6); — error because greater than limit.
Test 3
p_level := varray_type(1, 2, 3, 4);
p_level(5) := 5; — error because reach the limit.
Test 4
p_level := varray_type(1, 2, 3, 4);
P_level.EXTEND;
p_level(5) := 5; — 5
DBMS_OUTPUT.PUT_LINE(p_level(5)); — 5
Test 5
p_level := varray_type(1, 2, 3, 4, 5);
DBMS_OUTPUT.PUT_LINE(p_level.COUNT); — 5
p_level.EXTEND; — error. Doesn’t allow to extend. Because already reach the limit 5.
DBMS_OUTPUT.PUT_LINE(p_level.COUNT); — won’t reach here.

Nested Table

DECLARE
TYPE varray_type IS TABLE OF NUMBER;
p_level varray_type;
p_level := varray_type(1, 2, 3, 4, 5);
— Test 1
DBMS_OUTPUT.PUT_LINE(p_level.COUNT); — 5
DBMS_OUTPUT.PUT_LINE(p_level(2)); — 2
p_level.DELETE(2);
DBMS_OUTPUT.PUT_LINE(‘3nd element: ‘||p_level(3)); — 3
DBMS_OUTPUT.PUT_LINE(p_level.LAST); — 5
— Test 2
DBMS_OUTPUT.PUT_LINE(p_level.COUNT); — 5
DBMS_OUTPUT.PUT_LINE(p_level(2)); — 2
p_level.DELETE(2);
DBMS_OUTPUT.PUT_LINE(‘2nd element: ‘||p_level(2)); — silent error and won’t show anything
DBMS_OUTPUT.PUT_LINE(‘3nd element: ‘||p_level(3)); — won’t show anything. Because above clause has an error
DBMS_OUTPUT.PUT_LINE(p_level.LAST); — won’t show anything.
— Test 3
DBMS_OUTPUT.PUT_LINE(p_level.COUNT); — 5
DBMS_OUTPUT.PUT_LINE(p_level(2)); — 2
p_level.DELETE(2);
DBMS_OUTPUT.PUT_LINE(‘3nd element: ‘||p_level(3)); — 3
DBMS_OUTPUT.PUT_LINE(p_level.LAST); — 5
p_level.EXTEND;
DBMS_OUTPUT.PUT_LINE(p_level.LAST); — 6
p_level(2) := 22;
DBMS_OUTPUT.PUT_LINE(p_level(2)); — 22
Test 4
p_level := varray_type(1, 2, 3, 4);
p_level(5) := 5; — error. Because IndexOutofbound

Number of Digit One

https://leetcode.com/problems/number-of-digit-one/

Given an integer n, count the total number of digit 1 appearing in all non-negative integers less than or equal to n.

For example:
Given n = 13,
Return 6, because digit 1 occurred in the following numbers: 1, 10, 11, 12, 13.

Solution. The solution is to iterate each digit. Each time, split the digit into 3 parts and do calculation. For example, if n = 320456. When we iterate to mid = 0. Left part is 32, right part is 456. Let’s fix 1 and count combination of left and right. Left can iterate from 0 ~ 31, totally 32 times. Right part iterates from 0 to 999, totally 1000 times.

So first part is ans = 32 * 1000.

Then let’s analyze when left part is 32.

Since mid == 0, for left part plus mid == 1(for 32), there is no chance to iterate right part.

If n = 321456. Since mid == 1, so totally, when left part plus mid == 1(for 32), right part can iterate from 0 to 456, totally 457.

If n = 322456. Since mid  > 1, so totally, when left part plus mid == 1(for 321), right part can iterate from 0 to 999, totally 1000.

When writing the code, we should loop k = 1 to k <= n.

/*
For n = 54321, k = 1, 10, 100, 1000, 10000
 when k = 100, should divide into 54, 3, 21
 l = 543,
 m = 543 % 100 = 3
 r = 54321 - 543 * 100 = 21
 l = 543 / 10 = 543
*/
public int countDigitOne(int n) {
    int ans = 0;
    for (long k = 1; k <= n; k *= 10) {
        long left = n / k;
        long mid = left % 10;
        long right = n - left * k;
        left = left / 10;
        ans += left * k;
        if (mid > 1) {
            ans += k;
        }
        else if (mid == 1) {
            ans += right + 1;
        }
    }
    return ans;
}

Check my code on github link.