I got this problem from Junmin: link.
Problem: given a list of words and a window size of w, how to split them into multiple lines so that the fewest lines are used if possible.
Example: w = 10, input: “given a list of words and a window size of w”
potential optimal alignment:
given a
list of
words and
a window
size of w
not a good alignment:
given
a list
of words
and a
window
size of
w
This one is a DP problem. One of the critical issue is how to measure the cost of a set of word group. The answer is to calcualte sum of the rest of the space of each line.
First of all, we define a array l[][]. l[i][j] has the cost of the word sequence from WORDi to WORDj. And for each rest space of line, we power it by 3. Taking a string “a given word list”, with line length 10 for example, it has the following cost matrix:
The reason why we power it by 3 is because we want the result to be smoothly aligned. For example, considering a text “a b c d e” with length 7, following A, B group are both ok. But abviously A is more smoothly arranged than B.
A:
a b c <– rest space is 7-5=2
d e <– rest space is 7-3=4
B:
a <–rest space is 7-1=6
b c d e <–rest space is 7-7=0
In order to solve this, we transfer the cost by adding a power by 3. I don’t understand why in official answer it chooses power by 3 of cost of each line. In my opinion, I think power by 2 works too. But the thing here we don’t use plain add is because we want the text to be smoothly arranged.
Now, let’s enter the DP process. We define the cost[]. Cost[i] indicates the cost of the optimized align from WORD0 to WRODi. And I define another array segment[]. Segment[i] indicates the index of the left-most word from WORDi, which is in the same line with WORDi. Segment[] is to store the result for printing. And we have the following DP formula:
The interesting part of this problem lies in cost measurement. We need to define a new cost measurement in order to do the DP.
My code:
- package feb;
- public class TextJustification {
- public static void main(String[] args) {
- String str = “Geeks for Geeks presents word wrap problem”;
- // str = “given a list of words and a window size of w”;
- str = “aaa bb cc ddddd”;
- textJustification(str, 6);
- }
- public static void textJustification(String str, int rowLen) {
- if (str == null || str == “”) {
- return;
- }
- String[] strs = str.split(” “);
- /*
- * l[i][j] indicate the cost of word sequence, from wordi to wordj
- */
- int[][] l = new int[strs.length][strs.length];
- for (int i = 0; i < l.length; i++) {
- l[i][i] = (int) Math.pow(rowLen – strs[i].length(), 3);
- }
- for (int wordStart = 0; wordStart < strs.length – 1; wordStart++) {
- int len = strs[wordStart].length();
- for (int i = 0; i < wordStart; i++) {
- l[wordStart][i] = Integer.MAX_VALUE / 100;
- }
- for (int wordEnd = wordStart + 1; wordEnd < strs.length; wordEnd++) {
- len += strs[wordEnd].length() + 1;
- if (len > rowLen) {
- for (int i = wordEnd; i < strs.length; i++) {
- l[wordStart][i] = Integer.MAX_VALUE / 100;
- }
- break;
- } else {
- l[wordStart][wordEnd] = (int) Math.pow(rowLen – len, 3);
- }
- }
- }
- /*
- * cost[i] stores the least cost of the justification from word0 to
- * wordi
- */
- int[] cost = new int[strs.length];
- /*
- * segment[i] stores index of left-most word which combines with wordi
- * in the same line. e.g. segment[6] = 3, it means word3, word4, word5,
- * word6 are in the same line.
- */
- int[] segment = new int[strs.length];
- cost[0] = l[0][0];
- segment[0] = 0;
- for (int i = 1; i < strs.length; i++) {
- cost[i] = l[0][i];
- segment[i] = 0;
- for (int j = 0; j < i; j++) {
- int currCost = cost[j] + l[j + 1][i];
- if (currCost < cost[i]) {
- cost[i] = currCost;
- segment[i] = j + 1;
- }
- }
- }
- /*
- * reverse the print sequence, in order to print
- */
- for (int i = strs.length – 1; i > 0;) {
- int wordStart = segment[i];
- segment[wordStart] = i;
- i = wordStart – 1;
- }
- for (int i = 0; i < strs.length;) {
- int wordEnd = segment[i];
- for (int j = i; j <= wordEnd; j++) {
- System.out.print(strs[j] + ” “);
- }
- System.out.println();
- i = wordEnd + 1;
- }
- }
- }