Monthly Archives: March 2016

Basic calculators

These problem can be surely solved by OPG. However, since they are just basic calculator. They can still be solved by some simple code.

For calculator 1, we need the help of a stack. Every time when we see a ‘(‘, we store the result and sign in stack. When we see ‘)’, we pop sign and number and calculate them with current number.

For calculator 2, if we encounter ‘+’ or ‘-‘, we simply calculate and put in a stack. If we encounter ‘*’ or ‘/’, we need to pop sign stack and number stack, and calculate with current number, then put back to stack.

Calculator 1:

"1 + 1" = 2
" 2-1 + 2 " = 3
"(1+(4+5+2)-3)+(6+8)" = 23
public static int calculate(String s) {
    Stack<Integer> stack = new Stack<>();
    int result = 0, sign = 1;
    for (int i = 0; i < s.length(); i++) {
        char ch = s.charAt(i);
        if (ch == ' ') {
            continue;
        }
        else if (Character.isDigit(ch)) {
            int num = 0;
            for (; i < s.length() && Character.isDigit(s.charAt(i)); i++) {
                num = num * 10 + (s.charAt(i) - '0');
            }
            result += num * sign;
            i--;
        }
        else if (ch == '+') {
            sign = 1;
        }
        else if (ch == '-') {
            sign = -1;
        }
        else if (ch == '(') {
            stack.push(result);
            stack.push(sign);
            result = 0;
            sign = 1;
        }
        else if (ch == ')') {
            result = stack.pop() * result + stack.pop();
        }
    }
    return result;
}

Code for calculator 2:

"3+2*2" = 7
" 3/2 " = 1
" 3+5 / 2 " = 5
public static int calculate(String s) {
    Stack<Integer> stack = new Stack<>();
    int num = 0, len = s.length();
    char sign = '+';
    for (int i = 0; i < len; i++) {
        char ch = s.charAt(i);
        if (ch == ' ' && i != len - 1) {
            continue;
        }
        if (Character.isDigit(ch)) {
            num = num * 10 + (ch - '0');
        }
        // it means this one is a sign or it is the end. Use previous one to calculate
        if (ch < '0' || ch > '9' || i == len - 1) {
            switch (sign) {
                case '+':
                    stack.add(num);
                    break;
                case '-':
                    stack.add(-num);
                    break;
                case '*':
                    stack.add(stack.pop() * num);
                    break;
                case '/':
                    stack.add(stack.pop() / num);
                    break;
                default:
                    continue;
            }
            num = 0;
            sign = ch;
        }
    }
    int ans = 0;
    while (!stack.empty()) {
        ans += stack.pop();
    }
    return ans;
}

Palindrome Pairs

Leetcode 336. Given a list of unique words. Find all pairs of distinct indices (i, j) in the given list, so that the concatenation of the two words, i.e. words[i] + words[j] is a palindrome.

Example 1:
Given words = ["bat", "tab", "cat"]
Return [[0, 1], [1, 0]]
The palindromes are ["battab", "tabbat"]

Example 2:
Given words = ["abcd", "dcba", "lls", "s", "sssll"]
Return [[0, 1], [1, 0], [3, 2], [2, 4]]
The palindromes are ["dcbaabcd", "abcddcba", "slls", "llssssll"]

Solution. The solution is from here. We uses trie. For each string, we save it trie in reverse order. For example, if it is abc. The trie will be like below. a has value pos. This tells that trie cba is a word.
palinpair1

When we use cbaxyx to match the trie, it will finally goes to trie a. And when it sees a has pos, we check if xyx is a palindrome. If it is, then we know cbaxyx and abc is a pair of palindrome.

Think about words[] = {“cbaaa”, “bc”, “abc”}. The trie will be like:
palinpair2
When it tries to match cbaaa, when it arrives cb, its pos is 1, then we should check if aaa is a palindrome. When it arrives cba, we should check if rest of string aa is a palindrome. In this way, we can know that [“cbaaa”, “bc”], [“cbaaa”, “abc”] are pairs.

How about [“xyxcba”, “abc”, “xcba”]?
palinpair3

When we check abc and it arrives c in trie. We should know that the rest of string after abc has palindrome. So we modify trie like this:
palinpair4
c has a list palins. It tells string below c, what rest of word is palindrome. When we scan abc, it will finally stops at c in xyxabc. When it ends, check if palins list of trie c is not empty. If not, it may be able to concatenate into palindrome.

We can do it in another way by rolling hash which I wrote a post before. But the code is too much overhead.

Below is my code:

public static class Trie {
    int pos;
    Trie[] nodes;   // consider xyxabc. if current trie is 'a'. Then a.nodes has information. It means string after a is palindrome
    List<Integer> palins;
    public Trie() {
        pos = -1;
        nodes = new Trie[26];
        palins = new ArrayList<>();
    }
}

public static void add(Trie root, String word, int pos) {
    for (int i = word.length() - 1; i >= 0; i--) {
        char ch = word.charAt(i);
        if (isPalindrome(word, 0, i)) { // check if substring(0, i) is palindrome.
            root.palins.add(pos);
        }
        if (root.nodes[ch - 'a'] == null) {
            root.nodes[ch - 'a'] = new Trie();
        }
        root = root.nodes[ch - 'a'];
    }
    root.pos = pos; // if it is xyxcba. Until now, the node should be at x.
    root.palins.add(pos);
}

public static void search(Trie root, String[] words, int i, List<List<Integer>> ans) {
    int len = words[i].length();
    for (int j = 0; j < len && root != null; j++) {
        if (root.pos >= 0 && i != root.pos && isPalindrome(words[i], j, len - 1)) {
            ans.add(Arrays.asList(new Integer[] {i, root.pos}));
        }
        char ch = words[i].charAt(j);
        root = root.nodes[ch - 'a'];
    }
    if (root != null && root.palins.size() > 0) { // assume 'xyxabc' is in trie, now try 'cba'
        for (int j : root.palins) {
            if (j != i) {
                ans.add(Arrays.asList(new Integer[] {i, j}));
            }
        }
    }
}

public static List<List<Integer>> palindromePairs(String[] words) {
    List<List<Integer>> ans = new ArrayList<>();
    Trie trie = new Trie();
    for (int i = 0; i < words.length; i++) {
        add(trie, words[i], i);
    }
    for (int i = 0; i < words.length; i++) {
        search(trie, words, i, ans);
    }
    return ans;
}

public static boolean isPalindrome(String str, int i, int j) {
    while (i < j) {
        if (str.charAt(i++) != str.charAt(j--)) {
            return false;
        }
    }
    return true;
}

Check my code on github.

Data Warehouse Type 2 slowly change dimension in kettle

In kettle, we can implement type 2 slowly change dimension easily by using “Dimension lookup/update”.

We have below table in a CSV file:
COUNTRY_ID,COUNTRY_NAME,REGION,LAST_UPDATE_DATE
11,Baharian,A,2015-02-15
12,China,Beijing,2013-02-15
13,USA,AZ,2014-02-20
14,CUBA,CIUDAD,2011-02-19
15,USA,Los Angeles,2011-02-18
16,Japan,TOKYO,2011-02-15

We want to put into table with below structure:
type2_update5

First, we have 2 elements: Text file input, Dimension lookup/update.

type2_update

Text file input configuration:
type2_update2

Dimension lookup / update configuration:
type2_update3

type2_update4
In the Dimension Lookup / Update tab, there are several fields that we need to pay attention:
Key: This is the natural key. We need to both specify them in source and target table.
Technical key field: In target table, the surrogate key.
Version field: In target table, version in dimension.
Stream Datefield: Source table, the last_update_date which is used to indicates start / end date.
Date range start / end field: In target table, indicate to fill out the start / end date.
Lookup / update field: This indicate which field do we update, and the way we update. Insert, type 2.(Once changed then create a new row with new version. Set the end_date in old row and start_date in new row). Update, only update the value in latest version. Punch through, update all the fields with natural key.

When we run kettle for the first time, the table is like below:
type2_update8

Let’s change the id15 from “15,USA,Los Angeles,2011-02-18” to “15,China,Los Angeles,2011-02-19”. Run the kettle again, the table changed to below. We see a new country_id 15 is generated with new version and eff_date. Old one has end_date set. This is a typical type 2 change.
type2_update9

Later, let’s change id15 “15,China,Los Angeles,2011-02-19” to “15,China,Shenzhen,2011-02-20”. Table changed to below. It only updated the region on the latest version.
type2_update10