LOJ 1013 - Love Calculator
Tags : String, Subsequence, Dynamic Programming, Memoization
We need to find some shortest possible Strings(meaning that they won't have any unnecessary suffix or prefix) that can produce the names as Subsequences and we have to output :
- How many unique Strings are there?
- What is the length?
Helpful Resources
Solution
The problem statement has already mentioned that the length of the names (Strings) will not cross 30. We will take two 2D arrays of size 31 x 31
(although we just need length of name1 * length of name2 for each test case but let's keep it this way just for the sake of ease of implementation and since neither it consumes a lot of memory),
- one for memoizing the length of the shortest String
- another one to keep record of how many unique Strings are generated.
But first let us understand what type of Strings we are counting and to do that let's take an example:
USA
USSR
For the above example (and others), all (meaning that they may contain unnecessary suffix and prefix) the possible Strings whose Subsequences can produce both the name is actually inifinite. For example:
USSRA
abcdUSSRApzxockbvbkpvzxco
USASRadasdasd
USSAR
fwetUSSARrweUSASRadasdasd
dasdsadasdsadcwqrfqtgUSASR
gjoUSSRApidhUSSARglkdfg
dfgdgfUSSAR
USSARgnsdihgisgf
...
...
But the problem has asked us to find the shortest Strings whose Subsequences can produce both of the names. So we don't keep any unnecessary suffix or prefix.
For distinction only and to make them easily detectable, some Substrings are capitalized in each Strings shown above, from which we can obtain both USA
and USSR
.
For example :
From the String fwetUSSARrweUSASRadasdasd
, we can take any of the Substrings USSAR
or USASR
which both of them can produce the Subsequences USA
and USSR
. Remember that both the Substrings and Subsequences must maintain the order of characters of its original String. The only difference between Substrings and Subsequences is that Substrings must contain consecutive characters too while Subsequences has NO such bound.
Now if we just think of shortest unique Strings for this particular test case, those would be only these 3 :
USSRA
USSAR
USASR
And as already explained before, RAUSS
or similar Strings that contain all the characters of both the names but if the characters to produce the names can NOT be retrieved in a Sequential order, are not Subsequences at all, thus we are not counting them nor they are our concern.
Now that we have finally understand what type of Strings we are looking for, let's understand how we are using the arrays to memoize and find the solutions.
How are we keeping track of common characters (shortest length of the substring)?
At first we will take the names and shift their characters right by 1 index for ease of comparison and implementation in the upcoming steps. Also because at length == 0
they won't have any common characters at all.
As we have shifted the characters to right by 1 index, then we simply create both the matrix for dynamic programming by occupying the first column and first row. We can we do it in any fashion we want. Row and Column represents the characters of each of names. I have chosen Row to represent the first name and the Column to represent the second name, it can be done other way around as well.
For the shortest Strings matrix, we will put the max(row,column)
value to each of blocks in j-th column and i-th row. For example: for (4,0)
and (0,4)
, 4
is the maximum of both the values so those two blocks will have 4
as their value. We are doing this because we are assuming name1 == name2
at the initial stage.
The shortestString
matrix should look like something like this at the initial state:
I/J |
0 |
1 |
2 |
3 |
4 |
... |
30 |
0 |
0 |
1 |
2 |
3 |
4 |
... |
30 |
1 |
1 |
|
|
|
|
|
|
2 |
2 |
|
|
|
|
|
|
3 |
3 |
|
|
|
|
|
|
4 |
4 |
|
|
|
|
|
|
... |
... |
|
|
|
|
|
|
30 |
30 |
|
|
|
|
|
|
Now to memoize the shortestString matrix, we must check the characters of a name in respect to the other one. Each block
represents the common length of the shortest string required for both of the names' 1st character to those particular indice
of those names respectively. (Remember, we already shifted the characters by 1
character.)
For Example : shortestString[1][3]
's value
is the length required when I am taking name1's to substring length = 1 (USA
's length 1 substring == "U
") and name2's substring length = 3 (USSR
's length 3 substring == "USS
"). So, shortestString[1][3]
= 3
is required as the bare minimum length for U
and USS
.
i/j |
0 |
1(U) |
2(S) |
3(S) |
0 |
0 |
1 |
2 |
3 |
1(U) |
1 |
1(U) |
2(US) |
3(USS) |
We just simply keep adding 1
to the last updated cumulative sum. Let's understand what to do on which condition :
Case name1[i] == name2[j] : Since both of the characters
are same, we don't need to extend the string length. We can just add 1
to previous common length from upper-left block, which is shortestString[i-1][j-1]
. Look at the table below for clarification of an example:
In example of USA
and USSR
, it will look something like this after memoization.
i/j |
0 |
1(U) |
2(S) |
3(S) |
4(R) |
0 |
0 |
1 |
2 |
3 |
4 |
1(U) |
1 |
1(U) |
2(US) |
3(USS) |
4(USSR) |
2(S) |
2 |
2(US) |
2(US) |
3(USS) |
4(USSR) |
3(A) |
3 |
3(USA) |
3(USA) |
4 (USA + S) [from j=3] / (USS + A)from[i=3] =(USAS/USSA) |
5 ((USAS/USSA) + R)) /(USSR + A) =(USASR/USSAR/USSRA) |
And shortestString[name1.length()-1][name2.length()-1]
give us the length of the shortest string which is 5
.
How are we keeping track of unique strings?
For the unique String matrix, we occupy the 1st column and 1st row with 1
s because we are assuming that the number of unique string is 1
, simply name1 = name2
meaning there is no difference of characters.
The uniqueString
matrix should look like this initially:
I/J |
0 |
1 |
2 |
3 |
4 |
... |
30 |
0 |
1 |
1 |
1 |
1 |
1 |
... |
1 |
1 |
1 |
|
|
|
|
|
|
2 |
1 |
|
|
|
|
|
|
3 |
1 |
|
|
|
|
|
|
4 |
1 |
|
|
|
|
|
|
... |
... |
|
|
|
|
|
|
30 |
1 |
|
|
|
|
|
|
While populating there can be these cases :
Case name1[i] == name2[j] : When the characters matches, we don't need to create/branch out to another new sequence and thus we simply just update the value of the current block by taking the number of unique string from upper-left block.
Case name1[i] != name2[j] && shortestString[i][j - 1] == shortestString[i - 1][j] : It means there had been 2 unique strings sequence of the same length as we can already see in the shortestString
matrix. And we need to take both the sequence in account.
Example :
shortestString[3][3]
is populated by this way. How? name1 length 3 substring is USA
and name2 length 3 substring is USS
. shortestString[2][3]
is USS
and shortestString[3][2]
is USA
and neither of them can produce the other name's substring upto that length. So we can add the unique sequence character A
of name1 to name2's length 3 substring USS+A
; also if we do it the other way around, add the unique sequence character S
of name2 to name1's length 3 substring USA+S
, is valid and has to be counted as per question requirement. So, we are adding the number of unique strings found in those states.
shortestString
= number of unique strings found for name1 length 3 substring
+ number of unique strings found for name2 length 3 substring
= shortestString + shortestString = 2
Case name1[i] != name2[j] && shortestString[i][j - 1] != shortestString[i - 1][j] : In this case we just simply take the minimum of which can be taken from left block or upper block. The logic is similar to shortestString[][]
population. We don't want unnecessary suffix to be added.
If we have done everything accordingly, uniqueString
matrix should look like this :
i/j |
0 |
1 |
2 |
3 |
4 |
0 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
2 |
1 |
1 |
1 |
1 |
1 |
3 |
1 |
1 |
1 |
2 |
3 |
Now, uniqueString[name1.length()-1][name2.length()-1]
tells us how many unique Strings are there.
The above implementation is accepted
.
Caution : The matrix to memoize the number of unique strings, must be of an integer data type that can hold 263.
Notes :
- You can solve the problem recursively too and it won't throw any MLE.
long
data type's size of c++
is dependent on the hardware (processor) and the software (operating system and compiler). LightOJ
uses a 64-bit compiler so only long
is sufficient. However we might need to use long long
if any of the dependency is 32-bit.
Solution in C++ (Iterative)
#include <bits/stdc++.h>
using namespace std;
int main()
{
ios::sync_with_stdio(false);
cin.tie(NULL);
cout.tie(NULL);
int testCases;
string name1, name2;
int shortestString[31][31];
long uniqueString[31][31];
cin >> testCases;
for (int testCase = 1; testCase <= testCases; testCase++)
{
cin >> name1 >> name2;
//Shift the characters of the name to right for ease of memoizing
name1.insert(0, "0");
name2.insert(0, "1");
//Prepare the matrices for memoization
for (int i = 0; i < 31; i++)
shortestString[0][i] = shortestString[i][0] = i, uniqueString[i][0] = uniqueString[0][i] = 1;
for (int i = 1; name1[i]; i++)
{
for (int j = 1; name2[j]; j++)
{
//Checking if we need to take the cumulative sum from upper-left block
if (name1[i] == name2[j])
{
//Adding 1 to cumulative sum from upper-left block
shortestString[i][j] = 1 + shortestString[i - 1][j - 1];
//No need to add a new branch of unique strings so taking cumulative sum from upper-left block
uniqueString[i][j] = uniqueString[i - 1][j - 1];
}
else
{
//Finding the minimum from left and upper block and adding 1 to the value of current block
shortestString[i][j] = 1 + min(shortestString[i][j - 1], shortestString[i - 1][j]);
//Checking if there are two unique strings of the same length
if (shortestString[i][j - 1] == shortestString[i - 1][j])
uniqueString[i][j] = uniqueString[i][j - 1] + uniqueString[i - 1][j];
//Checking if left block has the minimum value in shortestString matrix
else if (shortestString[i][j - 1] < shortestString[i - 1][j])
uniqueString[i][j] = uniqueString[i][j - 1];
else
uniqueString[i][j] = uniqueString[i - 1][j];
}
}
}
cout << "Case " << testCase << ": " << shortestString[name1.length() - 1][name2.length() - 1] << " " << uniqueString[name1.length() - 1][name2.length() - 1] << "\n";
}
return 0;
}
Solution in C++ (Recursive)
#include <bits/stdc++.h>
using namespace std;
string name1, name2;
int shortestString[31][31];
long uniqueString[31][31];
void memoizeArrays(int i, int j){
if(i == name1.length())
return;
if(j == name2.length())
return memoizeArrays(++i, 1);
else{
if (name1[i] == name2[j]){
shortestString[i][j] = 1 + shortestString[i - 1][j - 1];
uniqueString[i][j] = uniqueString[i - 1][j - 1];
}else{
shortestString[i][j] = 1 + min(shortestString[i][j - 1], shortestString[i - 1][j]);
if (shortestString[i][j - 1] == shortestString[i - 1][j])
uniqueString[i][j] = uniqueString[i][j - 1] + uniqueString[i - 1][j];
else if (shortestString[i][j - 1] < shortestString[i - 1][j])
uniqueString[i][j] = uniqueString[i][j - 1];
else
uniqueString[i][j] = uniqueString[i - 1][j];
}
return memoizeArrays(i, ++j);
}
}
int main()
{
ios::sync_with_stdio(false);
cin.tie(NULL);
cout.tie(NULL);
int testCases;
cin >> testCases;
for (int testCase = 1; testCase <= testCases; testCase++)
{
cin >> name1 >> name2;
name1.insert(0, "0");
name2.insert(0, "1");
for (int i = 0; i < 31; i++)
shortestString[0][i] = shortestString[i][0] = i, uniqueString[i][0] = uniqueString[0][i] = 1;
memoizeArrays(1, 1);
cout << "Case " << testCase << ": " << shortestString[name1.length() - 1][name2.length() - 1] << " " << uniqueString[name1.length() - 1][name2.length() - 1] << "\n";
}
return 0;
}
Tutorial source