Monthly Archives: November 2015

Worm walking on a line

Given a array consisted of only 0 or 1, and a number N.
We have N times to flip 0 to 1 in array. Find the longest continuous 1 after we flipped the N zero elements. Rotation is allowed.

For example, 100100111000, n=2, the best flip is 100111111000, return 6
10001101101, n=2, the best flip is 10001111111, return 8(rotation is allowed)

The idea is to set a head and tail in array. 0 between head and tail is N. Each time, move head and tail to next 0 position, and calculate the current distance between head and tail.

Because rotation is considered, we should consider about the boundary condition specially:
when number of zero in array equal N+1, should return arr.length-1.
when number of zero is less than N+1, should return arr.length.

So I added isValid to check number of zero.

A typical process without rotation is showed below:

check my code on github: link

power(a, n) in iteration way

Given a, n, calculate a^n in non-recursive way.

The traditional way is to use recursion to reach O(logN) solution. For iteration, we can use stack to implement the recursion process. In stack, 1 means result times a, -1 means result times result, 0 doing nothing.

Take 11 as example, the process is like:

Take 15 as example, the process is like:

Below is my code:

public static int power(int a, int n) {
    // build stack
    Stack s = new Stack();
    s.add(n);
    while(s.peek()>1) {
        int curr = s.pop();
        s.push(curr%2); //1 means result should time a.
        s.push(-1); //-1 means result should time result.
        s.push(curr/2);
    }
    int result = 1;
    while(!s.isEmpty()) {
        int curr = s.pop();
        switch (curr) {
            case 1:
                result = result * a;
                break;
            case -1:
                result = result * result;
                break;
        }
    }
    return result;
}

However, this problem can be solved in O(logN) time and O(1) space. Below is a process to calculate a^9:

 

public static int power(int a, int n) {
    int result = 1;
    while(n>0) {
        if(n%2==1)
            result = result * a;
        a = a * a;
        n = n>>1;
    }
    return result;
}

In order to understand this solution more clearly, we can transform the exponent in binary: power. We know a in the code is a base. It changes like: power2. And we know power3. In this way, we know that calculating power is similar the process to build binary 1001.  The difference is that instead doing plus, we do multiplication.

In the code, a = a* a is to enlarge the base. result = result * a is to add the base in the result.

Print prime number and Number Expressible(Ugly Number)

Print prime number:
Given a number n, print all primes smaller than or equal to n. It is also given that n is a small number. For example, if n is 10, the output should be “2, 3, 5, 7″. If n is 20, the output should be “2, 3, 5, 7, 11, 13, 17, 19″.

Nubmer Expressible(This problem is still called ugly number):
Given a prime set, we call “prime expressible” if a number can be factorized only using given prime numbers. Find n-th big expressible number. For example, if prime set = {2, 3}, expressible number = {1,2,3,4,6,8, 9, 12…}, non-expressible number = {5, 10… }

For printing prime number, we can use Sieve of Eratosthenes to solve. Space complexity is O(N), time complexsity is O(NloglogN)

For number expressible, it has the similar way which we store expressible number in a TreeSet.
At the TreeSet only has {2, 3}. Remove the min element in TreeSet, and use it to multiply set {2, 3}. We get 4, 6. Put 4, 6, in TreeSet. It will be {3, 4, 6}.
Then we get min element in TreeSet which is 3. Use 3 to multiply set {2, 3}. We get 6, 9. Put 6, 9 in TreeSet, It will be {4, 6, 9}.
We iterate this step for n times. Then we can get the Nth expressible number. Time complexity is O(NlogN), space is O(N).

There is another O(nk) time and space solution. Each time, calculate each prime time where it is at by its index. Find the minimum value, add in the array. Then move the index to next ugly number. The process is like below:

check my code on github: Print prime numberNubmer Expressible, Ugly Number O(n) solution

Longest palindrome starting from left side

Given a string, find the longest length of palindrome starting from the left.
For example:
abaeab, the longest from left is aba, so return 3.
abcbaef, the longest from left is abcba, so return 5.
ababa, the longest from left is ababa, so return 5.

One of the solution is to use rolling hash, which I wrote in previous post link.

Here is a better solution, which the idea is from calculating next array in KMP.

Take abae for example, let’s add the reverse to the right: abaeeaba.
Let’s implement KMP next function. To begin with, we set i=0, j=4. Then, let’s find the max next_value.

We can find the max next_value in O(n) time. In this example, the max next_value is 3. And that is the result.

Take another example, abbaaba, the process is like below:

check my code on github: link

cap theorem

I’v been thinking such a problem for a long time. Let’s take my website for example. What if more and more people visit my website, and what do I do with my server. Natural way is that I pay more money to upgrade the CPU, memory or storage. But what if there are millions of visits that one server couldn’t handle it anymore? Another enhancement is that I use distribudated system.

Let’s say I simply have 2 servers in load balancing mode. Both 2 servers can response requests. We can use round robin to load balance.

It seems fine, but actually, the database behind it is still one database. Millions request will finally go to the single database.

Similarly, we should do same way to the database, make it a distributed database.

Ok, here is the question, when there is a write to db1 on data1, and another request try to check the data1 in db2. How do we handle this? Because it is a distribudated system, the data consistency is a thing we need to think about.

This is the problem which confuses me for a long time, until today I read about cap theorem. Below are the definition for each letter:
Consistency (all nodes see the same data at the same time)
Availability (a guarantee that every request receives a response about whether it succeeded or failed)
Partition tolerance (the system continues to operate despite arbitrary partitioning due to network failures)

And cap theorem says no 3 conditions together can satisfy for distribudated system.

For CA, we want consistency and availability at any time that we can’t tolerent any point failure. Because if network fails, we can’t gurantee each data is newly updated, or any server on service can’t always provide available service.

For AP, the application has high availability and partition tolerance, which means we always provide end-user service. But we can’t guarantee the data is recent updated. Facebook, twitter status can be designed in this way. Because it is ok that status update delayed from seconds to minutes. The idea for this scenario is aiming eventual consistency.

For CP, the system accepts partition tolerance and consistency. But it will lock the request when a server fails until the data is consistent.One of ths type of model I can think maybe is P2P download. Because it really requires data 100% correct data.

All we do is that we choose the database according to the business characteristic. Below is the different database for different scenarios:

It would be easy to answer my question now. There is no way that both 3 conditions exist at the same time. The only solution is that we trade-off one of the characteristic out of three according to our application scenario.

Open Closed range in a sorted array

1. Find leftOpen
For a sorted array 1, 2, 3, 4, 4, 4, 5, 5, 6. And given 4. Find the left-most position, where 4>=arr[pos]. The answer of position is 3

2. Find leftClose
For a sorted array 1, 2, 3, 4, 4, 4, 5, 5, 6. And given 4. Find the left-most position, where 4>arr[pos]. The answer of position is 6

Similarly, there are problem to find right open and right close. This is a very useful question. And I finally wrote it for practice.

My general process for left open and left close.

1.Move to right as far as possible. When mid is less than n, then we should find the right side. right = mid + 1;

2. Or it means mid is at below somewhere. Then we check if arr[mid-1] is eligible for the right answer.

3. Else it means both mid-1 and mid are at the right part. We should move to left.

Because we don’t want arr[mid-1] out of index boundary. At the beginning, check if pos=0 is a valid position. Similar ways are applicable to check right close and right open scenario.

Check my code on github: link

Max Gap between consecutive numbers

Source: http://www.careercup.com/question?id=14764703

Given a unsorted array with n elements. How can we find the largest gap between consecutive numbers of sorted version in O(n)?
For example, we have a unsorted array
9 19 13 12 33 41 22
the sorted version is
9 12 13 19 22 33 41
the output of the algorithm should be 33-22 = 11

One solution is by bitmap. Let’s go through all elements, and set the bitmap 1 for each element position. Then, we go throught whole bitmap to find the max gap.

Here is a better solution. Let’s think about 1, 2, 2, 4. There are 4 elements, let’s put them into 5 buckets. Each buckets maintain a distance=(maxValue – minValue)/5=(4-1)/5=0.6 In this way, the bucket will look like:

Then, let’s put each element into each bucket. It will look like:

Because we choose 5 buckets, then there is at least 1 empty bucket. And we can gurantee the first bucket and last bucket always has element.

Thus, we know max gap exists between element in left and right bucket of empty bucket. We just find all the empty bucket, then maxValuein its left non-empty bucket, minValue in its right non-empty bucket, gap=minValue-maxValue.

public static int maximumGap(int[] nums) {
    int min = Integer.MAX_VALUE, max = Integer.MIN_VALUE;
    for (int i : nums) {
        min = Math.min(min, i);
        max = Math.max(max, i);
    }
    int bucketNum = nums.length + 1;
    double interval = (double)(max - min) / bucketNum;
    int[] minBucket = new int[bucketNum];
    int[] maxBucket = new int[bucketNum];
    Arrays.fill(minBucket, Integer.MAX_VALUE);
    Arrays.fill(maxBucket, Integer.MIN_VALUE);
    for (int i = 0; i < nums.length; i++) {
        int bucket = (int)((nums[i] - min) / interval);
        bucket = bucket >= minBucket.length ? bucket - 1 : bucket;
        minBucket[bucket] = Math.min(minBucket[bucket], nums[i]);
        maxBucket[bucket] = Math.max(maxBucket[bucket], nums[i]);
    }
    int ans = 0;
    for (int leftBucket = 0, rightBucket = 1; rightBucket < minBucket.length; rightBucket++) {
        if (minBucket[rightBucket] == Integer.MAX_VALUE) {
            continue;
        }
        ans = Math.max(ans, minBucket[rightBucket] - maxBucket[leftBucket]);
        leftBucket = rightBucket;
    }
    return ans;
}

Check my code on github: link

minimum consecutive sub-string of S containing T

source: http://careercup.com/question?id=4855286160424960

Given a random string S and another string T with unique elements,. find the minimum consecutive sub-string of S such that it contains all the elements in T
example:
S=’adobecodebanc’
T=’abc’
answer=’banc”

The idea is to maintain a double-linked-list. For each character in T, there is an element in double-linked-list. Every time, head has the indicates where window start. Tail indicates where window end. For string T, initialize double-linked-list like below:
a    b   c
-1  -1  -1

Then, for each character in S, find its element in double-linked-list, update element position, and move it to list tail. Then new window starts at head.pos, window ends at tail.pos. Check if window size is less than current one. The process is like below:

check  my code on github: link

Word Ladder

Given two words (start and end), and a dictionary, find the length of shortest transformation sequence from start to end, such that only one letter can be changed at a time and each intermediate word must exist in the dictionary. For example, given:

start = "hit"
end = "cog"
dict = ["hot","dot","dog","lot","log"]

This is a dijistra problem which can build graph, and run dijistra on it. It be solved in O(n^2) time. Dijistra is actually a BFS.

Check my code on github: link