Algorithm(9) Search

Find particular properties based on the abstracted decision tree or abstracted graph.

The input data is not the explicit tree with Tree node for questions listed in this blog. For the string structure, it may use some ideas like sliding window with loer bound and upper bound as the mechanism to search things, we do not list them here, we mainly list the question with the decision tree or graph in an logical way but the input is not the not the explicit tree structure.

Input data DFS
(up to bottom)
DFS
(bottom to up)
Naive BFS BFS with visited flag BFS+DFS Binary search Accelaration by Map
single number (combination)
22
array of numbers (combination)
39,40,77
78,90,216
(permuataion)
46,47,996
string (permutation)784
(expression)856,726
301
(expression) 394
2d Array 943,37,51 17,

Key techniques and considerations

  • Extracting decision tree from the actual problem. One difficulty with the search problem is that we need to extract the decision tree logically from different problem contexts. Typical scenarios are 1> the binary tree type, which decide add or not add current character. Each node represent choose or not choose here 2> Each node represents an actual character as the tree node, such as the permulation problems or combination of different cases. 3> Each node represents the derived results about the givin input, a little bit similar to the reduction opetaion. such as 241, the resutls can be a set contains all possible solutions.

  • Using DFS or BFS, we can decide to use BFS or DFS from several aspects:

    1. For the tree node expression, the BFS case might need to use a customized tree node (which is rarely used in leetcode questions), usually use the pair or tuple if we need multiple variables in the tree node. For the DFS code, there are less restrictions since we basically hide the tree node inexplicitly in the function call. We can transfer the necessary variables used in each call by function parameters. We can also set some varaibles as the class variables to avoid transfering them by tree node and simplify the code a little bit.

    2. Time complexity and if there is explicit layered structure. Although both the time compexity is O(V+E), the DFS is suitable for the narrow long tree, ans the BFS is suitable for the wide and short tree. In practical case, we may need to evaluate the data scale of the leetcode question, this may decide which algorithm is used for the giving data sets. This is tricky and is easy to be neglected when we start to code in in a rush without considering time compexity in advance carefully.

      If the question shows an obvious layered structure, the BFS can be more approporiate. For example, we find the minimal path with the particular constrians, we use the BFS, once find the path, we return it direactly. These type of questions are typical BFS style, and the BFS can ususlly save more efficient for solving these type of questions.

    3. For the DFS solution, pay attention that if we need return sth from the child tree. Namely make sure which is more convenient, either top to bottom operation (the previous informaion in the path can be used to determin the subsequent nodes in the path) or bottom to up operation. The core idea is to make sure what are computation processed in each node (is it merging two results from subtree or divide the current results to two subtree). It is worth noting that, for the bottom to up case, we can only use the DFS since it keep the status of the current node, so we can update the status when getting back results from the subtree. For the BFS, it just pop the parent out during the traverse process and we can not further update the previous status of the nodes.

    For the DFS case, there are three typical design: (a) Check all solutions and return void, put the valid path into a separate ans (b) Return true or false, we only need to find one path in this case, if we find the valid path, just return true (such as 37) (c) Return a specific value for each dfs call, maybe a reduced integer or more complex results such as a set.

    1. Set back the status. It is important to always check if we set the status back for the dfs operation. For example, a> when we find specific path, remember to pop the current node back from the ans if we insert it for the current layer (make sure the pop_back match with the insert, otherwise, there might be the memory issue). b> The status can also be the flag variable that label if a specific character is visited or not. c> In the for loop scenarios, we need to make sure the status if set back to the parent case before calling the dfs operations each time.

    If there is no difference between the aspects discussed above, we can either use BFS or DFS. And using DFS might be more easier from coding’s perspective if we write it into a recursive way instead of using a stack.

  • Commonly used debug strategy. This part of question is easy prone and some small type of condition error can lead to totally different results. Start from the small scale case or print out the traverse path explicitly are commonly used strategies.

  • Commonly used optimization strategy or trimming strategy. Sort the input array. Using flag to label the status of associated elements. Using map to memorize existing path (if the key and values are clear). Avoid the unnecessary string copy.

  • Ensentially the bfs or dfs questions discussed this part can process the unweighted graph. For the question which can be abstracted as the weighted graph, we need to use more complex mechanism to do that such as the dp method to acclerate the searching process (such as dp approach of the 943)

  • Be careful about using the vector<bool> when you need to use a flag array. There are lots of discussion online, you may just use the vector<int> for simplicity or use the bit mask if the flag length is fixed. Using the vector<bool> can slow down the your code heavily (a least 10x times slower compared with usinig the vector<int>) 996 is a good example (three important conditions) to discuss about how to set the flag to make sure how the same node can be visited once in the same level.

  • A tip for checking the associated elements in 2d case. We might check the element at the different direaction such as up down left and right even for slash direaction sometimes, instead of go through each dimention separately, the good way is to initilize a 2d vector, such as (-1,1),(1,-1)... then we use one loop to range this vector and adjust the x and y coordinates accordingly. This method is more simple to check the different direction of the 2d grid (Related question: 51).

Naive DFS-UpToBottom-Combination

The nature of the combination is the cartision product, we may need to have an idea about the set. Then the results of the combination might be to do the cartision products between different sets. It is important to figure out what are avalible set for some questions with complex background.

39. Combination Sum

This is the recursion version, which is really fast. For this recursion version based on dfs, the pop out operation at the end of dfs call is really important. Sorting the input array is an important techniques, sometimes, it can trim the searching tree (when the target value is negative one we trim the tree). The key idea to make sure that the elements can be used multiple times is that for each step of creating the children node, we go through all possible options no matter it is visited or not.

Another key point is that, if we do not use the sort, we may have repeated resutls such as [2,3,3] and [3,2,3]. From the implementaion perspective, using the remain value, and we decrease it gradually (instead of sum value) can be more convenient. Otherwise, we need to check if the current path exists in the ans set (we need to sort the array and check back and forth, which is redundant)

class Solution {
public:
vector<vector<int>> combinationSum(vector<int>& candidates, int target) {
vector<vector<int>> ans;
vector<int> cur;
sort(candidates.begin(), candidates.end());
dfs(candidates, target, 0, cur, ans);
return ans;
}

void dfs(vector<int>& candidates, int target, int s, vector<int>& cur, vector<vector<int>>& ans){
if(target==0){
ans.push_back(cur);
return;
}

for(int i=s;i<candidates.size();i++){
if(candidates[i]>target){
//current path not valid
continue;
}
cur.push_back(candidates[i]);
dfs(candidates, target-candidates[i], i, cur, ans);
//set back to original status
cur.pop_back();
}
return;
}
};

We can also use the BFS for this simple context, but for this question, there is no obvious benifits for using the BFS, since we need to search all possible solutions. However, it needs some difficuties to process the intermidiate node. Using the targetNew value as the distinction for different strategies is important. (how to define the control block is important)

class Solution {
struct Node {
Node(vector<int> path, int target, int index) : m_path(path), m_target(target), m_index(index){}
vector<int> m_path;
int m_target;
int m_index;
};
public:
vector<vector<int>> combinationSum(vector<int>& candidates, int target) {
queue<Node> q;
vector<vector<int>> ans;
//avoid the repeated answer such as [2,3,3] and [3,2,3]
sort(candidates.begin(), candidates.end());

int len = candidates.size();
// init the q
for (int i = 0; i < len; i++) {
int v = candidates[i];
int targetnew = target - v;
Node d({v}, targetnew, i);

if (targetnew > 0) {
q.push(d);
} else if (targetnew == 0) {
ans.push_back({v});
continue;
}

while (q.empty() == false) {
int levelSize = q.size();
// go through the queue
while (levelSize > 0) {
Node node = q.front();
//there should not be the i
//it should be the value of the parent
for (int j = node.m_index; j < len; j++) {
int vnew = candidates[j];
int targetNew = node.m_target - vnew;
vector<int> pathNew = node.m_path;
pathNew.push_back(vnew);
// push the satiafied results into the ans
// push the valid results into the next level
if (targetNew > 0) {
Node nodnew(pathNew, targetNew, j);
q.push(nodnew);
} else if (targetNew == 0) {
ans.push_back(pathNew);
}
// targetNew < 0
// do nothing for this case
// do not push subsequent nodes
}
q.pop();
levelSize--;
}
}
}
return ans;
}
};

40. Combination Sum II

This one is similar with previous one, but the only difference is that the element is not unique in the input string, we add some constrains to avoid the repeated ans. The sorting operation is still necessary, we just skip the current element if it has been visited.

class Solution {
public:
vector<vector<int>> combinationSum2(vector<int>& candidates, int target) {
vector<vector<int>> ans;
vector<int> curr;
sort(candidates.begin(),candidates.end());
dfs(ans,candidates,0,target,curr, true);
return ans;
}

void dfs(vector<vector<int>>& ans, vector<int>& candidates, int start, int target, vector<int>& curr, bool firstlayer){
//dfs children
for(int i=start;i<candidates.size();i++){
// if i is in visited, jump
// attention, it is the i>start here
if(i>start && candidates[i]==candidates[i-1]){
continue;
}
int targetNew = target-candidates[i];
if(targetNew<0){
continue;
}else if(targetNew==0){
curr.push_back(candidates[i]);
ans.push_back(curr);
curr.pop_back();
continue;
}else{
//targetNew >0
//visit current node
curr.push_back(candidates[i]);
//dfs
dfs(ans,candidates,i+1,targetNew, curr, false);
curr.pop_back();
}
}
return;
}
};

77. Combinations

The idea is similar with previous one. Be careful the meaning of the n and k. This can be viewed as the basic one for this type of question. Using the dfs can make things much more easier. For the first layer, there are n possibilites, and then the next layer, there is n-1 combinations.

class Solution {
public:
vector<vector<int>> combine(int n, int k) {
//k numbers
vector<vector<int>> ans;
vector<int> curr;
//in the range of 1 to n
dfs(ans, curr, 1, k, n);
return ans;
}
void dfs(vector<vector<int>>&ans, vector<int>&curr, int s, int k, int n){
int currsize = curr.size();
if(currsize==k){
ans.push_back(curr);
return;
}
for(int i=s;i<=n;i++){
curr.push_back(i);
dfs(ans, curr, i+1, k, n);
curr.pop_back();
}
return;
}
};

78. Subsets

This is also comparatively easy and we can use the similar ideas to solve this questions. The difference is that is stores the results into the ans for every possibilities. For previous quesitons, it stores the value only when it satisfies specific constraints. This one is more generous without specific constraints.

class Solution {
public:
vector<vector<int>> subsets(vector<int>& nums) {
//the number in sets can be 1 to n
sort(nums.begin(),nums.end());
vector<int> cur;
vector<vector<int>> ans;
ans.push_back({});
dfs(ans, cur, nums, 0);
return ans;
}

void dfs(vector<vector<int>>&ans, vector<int>&curr, vector<int>& nums, int s){
for(int i=s;i<nums.size();i++){
curr.push_back(nums[i]);
//for each possible results, put into ans
ans.push_back(curr);
dfs(ans,curr,nums,i+1);
curr.pop_back();
}
return;
}
};

90. Subsets II

There are some difference, it contains the repeated elements in queue, and results should not be repeative. We can solve this using the similar techniques used in 40

class Solution {
public:
vector<vector<int>> subsetsWithDup(vector<int>& nums) {
//the number in sets can be 1 to n
sort(nums.begin(),nums.end());
vector<int> cur;
vector<vector<int>> ans;
ans.push_back({});
dfs(ans, cur, nums, 0);
return ans;
}

void dfs(vector<vector<int>>&ans, vector<int>&curr, vector<int>& nums, int s){
for(int i=s;i<nums.size();i++){
if(i>s && nums[i]==nums[i-1]){
continue;
}
curr.push_back(nums[i]);
//for each possible results, put into ans
ans.push_back(curr);
dfs(ans,curr,nums,i+1);
curr.pop_back();
}
return;
}
};

216. Combination Sum III

There are not much differences compred with previous type, we put the curr into the ans when the constraints are satisfied.

class Solution {
public:
vector<vector<int>> combinationSum3(int k, int n) {
//n is the targeted value
//k represents the list from 1 to k
vector<vector<int>> ans;
vector<int> curr;
dfs(ans, curr, 1, k, n);
return ans;
}

void dfs(vector<vector<int>>& ans,vector<int>& curr, int s, int k, int target){
if(target==0 && curr.size()==k){
ans.push_back(curr);
return;
}
for(int i=s;i<=9;i++){
int targetnew = target-i;
if(targetnew<0){
//do not further go through along this path
continue;
}
//for case targetnew >0 or ==0 using dfs
//we list the return point at the begining of the dfs
//it can simplify the code a little bit compared with
//processing every if condition separately
curr.push_back(i);
dfs(ans,curr,i+1,k,targetnew);
curr.pop_back();
}
return;
}
};

22. Generate Parentheses

Evey node can be the left or right parentheses, and then we trim some results. From the top to bottom, the final path from the root to the leaf node is the actual ans we need to record. We can only place the right parathesis if the count of the left parathesis is larger than 0.

class Solution {
public:
int totalLen=0;
vector<string> generateParenthesis(int n) {
vector<string> ans;
totalLen=n*2;
helper(ans, "", 0);
return ans;
}

void helper(vector<string>& ans, string str, int leftc){
int strlen = str.length();
if(strlen == totalLen && leftc==0){
//put string into ans
ans.push_back(str);
return;
}

if(strlen<totalLen){
//can put leftc anyway, left child
helper(ans,str+"(",leftc+1);
if(leftc>0){
// can put right if there is remaining leftc
// left child
helper(ans,str+")",leftc-1);
}
}
return;
}
};

37. Sudoku Solver

the idea of checking the conflicts is just check it from the 1-9 and keep calling the next level, try to set the cell to a dot if there is no possible results. return the true if all cells are filled in.

The logic is not complex but it sill take me some time to figure it out. Originally we did not add return true or false, so it basically search all possibilities even if we find a valid results. The idea here is that we return it immediately if we find a valid ans. For the case when there is number at a specific position, we also need to add the return true or false.

class Solution {
public:
// assume i and j is less then 9
// assume value is from 1 to 9
// the current one is .
// make sure what are possible options for specific set
vector<int> getOptions(int row, int column, vector<vector<char>>& board) {
int boardDim = 9;
vector<bool> flag(10, true);
// filter row i
for (int j = 0; j < boardDim; j++) {
if (j == column) continue;
if (board[row][j] == '.') {
continue;
}
// filter the conflict value
flag[board[row][j] - '0'] = false;
}

// filter column j
for (int i = 0; i < boardDim; i++) {
if (i == row) continue;
if (board[i][column] == '.') {
continue;
}
// filter out the conflict one
flag[board[i][column] - '0'] = false;
}

// filter box
int rowBox = row / 3;
int columnBox = column / 3;

for (int i = rowBox * 3; i < (rowBox + 1) * 3; i++) {
for (int j = columnBox * 3; j < (columnBox + 1) * 3; j++) {
if (i == row && j == column) continue;
if (board[i][j] == '.') continue;
flag[board[i][j] - '0'] = false;
}
}

// get option
vector<int> ans;
for (int i = 1; i <= 9; i++) {
if (flag[i]) {
ans.push_back(i);
}
}
return ans;
}

void solveSudoku(vector<vector<char>>& board) {
// using recursion based dfs
dfs(0, 0, board);
return;
}

bool dfs(int row, int column, vector<vector<char>>& board) {
int boardRow = 9;
int boadColumn = 9;

if (row == boardRow) {
return true;
}

int nextRow = row;
if (column == (boadColumn - 1)) {
nextRow = row + 1;
}
int nextColumn = (column + 1) % boadColumn;
if (board[row][column] >= '1' && board[row][column] <= '9') {
return dfs(nextRow, nextColumn, board);
} else {
// current value node exist
// get options
vector<int> options = getOptions(row, column, board);
if (options.size() == 0) {
// current is conflict
return false;
}
// range options
for (int i = 0; i < options.size(); i++) {
board[row][column] = options[i] + '0';
bool find = dfs(nextRow, nextColumn, board);
if (find==true) return true;
// pop back (not necessary, there is valid answer anyway)
board[row][column] = '.';
}
}
return false;
}
};

51. N-Queens

This question is similar to previous one. The dfs need to go thrrough all possible combinations and store the valid results during the traverse. For each level, we can assume the possible options (namely the elements in the set) is the position for each one in different column. Then we use a function to decide if each position is conflict with the current grid results to decide if the current tree can move one step further. The recursion depth is the size of the grid.

Fur the current code, the meaning of each node means if the tree is at the specific row, column position. Since each row can only put one element, so we can simply use a for loop to detect each case and range it. (Every option is exclusive)

Origianlly, we consider a more general case and consider each cell separately. Each cell can put a queue or not, for that code, even though it can work, but we create too many recursions and the code become slow. The time complexity is (2^(n*n)) for that case.

class Solution {
public:
int m_gridDim;
vector<vector<string>> m_ans;
vector<vector<int>> dir = {{-1,1},{1,1},{-1,-1},{1,-1}};

vector<vector<string>> solveNQueens(int n) {
vector<string> initGrid(n);
m_gridDim = n;
// init grid
for (int i = 0; i < m_gridDim; i++) {
for (int j = 0; j < n; j++) {
initGrid[j] += ".";
}
}

dfs(n, 0, initGrid);
return m_ans;
}

//each row is a new level of the tree
bool conflict(int row, int column, vector<string>& grid) {

// edge case
if (m_gridDim == 1) {
return false;
}

// check row
if (grid[row].find("Q") != string::npos) {
// there is Q in this row
return true;
}

// check column
for (int i = 0; i < m_gridDim; i++) {
if (grid[i][column] == 'Q') {
return true;
}
}

//check for dir
for(int i=0;i<4;i++){
int nextr = row+dir[i][0];
int nextc = column+dir[i][1];
//if the results is in the boundry
while (nextr >= 0 && nextr < m_gridDim && nextc >= 0 && nextc < m_gridDim) {
if (grid[nextr][nextc] == 'Q') {
return true;
}
//continue move
nextr=nextr+dir[i][0];
nextc = nextc+dir[i][1];
}
}
return false;
}

// put one queue for one row
void dfs(int remain, int curRow, vector<string>& currGrid) {

//printf("r %d\n", curRow);
//for (int i = 0; i < m_gridDim; i++) {
// printf("%s\n", currGrid[i].c_str());
//}
//printf("\n---\n");

if (curRow == m_gridDim) {
// out of bound
// the current ans might ok
if (remain == 0) {
m_ans.push_back(currGrid);
}
return;
}

for(int c=0;c<m_gridDim;c++){
//if put at the current row and c
bool ifconflict = conflict(curRow, c, currGrid);
if(ifconflict){
//can not put at the current position
continue;
}
//put it at the current position
currGrid[curRow][c] = 'Q';
dfs(remain - 1, curRow+1, currGrid);
// set status back
currGrid[curRow][c] = '.';
}
return;
}
};

Naive DFS-UpToBottom-Permutation

46. Permutations

Typical permutation question, we can reuse the similar framework and put things into the ans only when all elements are checked. Still go through the path as previous one, put the path into the ans only when it get to the end.

Maybe there are more efficient way to consider which path is unnecessary, in this example, we just check if the particular elements is in the curr, if it is already in the curr, we do not go through along that path. The start position in this question start from the 0 to end instead of customized start position to end (since the results can be any order).

We can use another flag vector to label if the element is used or not, if then when we pop the element out, we set the specific label into false. This can further improve the code a little bit.

class Solution {
public:
vector<vector<int>> permute(vector<int>& nums) {
vector<vector<int>> ans;
vector<int> curr;
dfs(ans,curr,nums);
return ans;
}

bool exist(int k, vector<int>&curr){
for(auto v:curr){
if(v==k){
return true;
}
}
return false;
}

void dfs(vector<vector<int>>&ans, vector<int>&curr,vector<int>& nums){
if(curr.size()==nums.size()){
ans.push_back(curr);
return;
}

for(int i=0;i<nums.size();i++){
//if i is in the curr
//do not go through along this path
if(exist(nums[i],curr)){
continue;
}

curr.push_back(nums[i]);
dfs(ans,curr,nums);
curr.pop_back();
}

return;
}
};

This is the code with the visited flag. It looks that the permutation related problem use the visited flag a lot (each ans is a complete path) and the combination problems use the sort a lot (the answer can be half of the path).

class Solution {
public:
vector<vector<int>> permute(vector<int>& nums) {
vector<vector<int>> ans;
vector<int> curr;
vector<bool> visited(nums.size(),false);
dfs(ans,curr,nums,visited);
return ans;
}

void dfs(vector<vector<int>>&ans, vector<int>&curr,vector<int>& nums, vector<bool>&visited){
if(curr.size()==nums.size()){
//put current path into the ans
ans.push_back(curr);
return;
}

for(int i=0;i<nums.size();i++){
//if i is in the curr
//do not go through along this path
if(visited[i]){
continue;
}

curr.push_back(nums[i]);
visited[i]=true;
dfs(ans,curr,nums,visited);
//set status back
curr.pop_back();
visited[i]=false;
}
return;
}
};

47. Permutations II

Similar to the previous question, the input of this problem contains the repeated elements. In previous one, we solve this issue by sorting the input firstly and then compares nums[i] and nums[i-1], when they appears at the second time, we do not go through along that path. It use both the sorting strategy and the visited flag to trim the path

class Solution {
public:
vector<vector<int>> permuteUnique(vector<int>& nums) {
vector<vector<int>> ans;
vector<int> curr;
vector<bool> label(nums.size(),false);
sort(nums.begin(),nums.end());
dfs(ans,curr,nums,label);
return ans;
}

void dfs(vector<vector<int>>&ans, vector<int>&curr,vector<int>& nums, vector<bool>& label){
if(curr.size()==nums.size()){
ans.push_back(curr);
return;
}

for(int i=0;i<nums.size();i++){
//if i is in the curr
//do not go through along this path
if(label[i]){
continue;
}
//only in the current layer
//if the previous same one is visited
//we do not visit current one
if(i>0 && nums[i]==nums[i-1] && label[i-1]==true){
continue;
}

curr.push_back(nums[i]);
label[i]=true;
dfs(ans,curr,nums,label);
curr.pop_back();
label[i]=false;
}
return;
}
};

943. Find the Shortest Superstring

The original idea of this question is that, do all the permulations, and then remove the potential overlapping part.

The original idea to remove the overlapping part is to find if the start position of second string is in the first string. Assuming in the position i of the first string, then compare the i to the end, to see if it appears in the second string, it is possible there are multiple positions. This operation can be applied every time we add a new string.

this operation can be also done before searching (this is a good point), and we store the results explicitly. then we do not need to do the similar operation during the search (we store the information beforehand) This info is important to construct the final path, dfs call is n! for every node, we call n nodes.

//this is the naive search solution
//did not pass all the tests, time limit issue
//56 / 83 test cases passed.

class Solution {
public:
int overlappingLen2(string& first, string& second) {
int ans = 0;
int upperBound = min(first.length(), second.length());
for (int k = 1; k <= upperBound; k++) {
if (first.substr(first.length() - k) == second.substr(0, k)) {
ans = max(ans, k);
}
}
return ans;
}

string shortestSuperstring(vector<string>& words) {
string ans;
vector<bool> labelUsed(words.size(), false);
int wordsLen = words.size();
dfs(words,wordsLen,"",ans,labelUsed,0);
return ans;
}


//we use the parent Str to remember last steps results
void dfs(vector<string>& words, int& wordsSize, string parentStr, string& ans, vector<bool>& labelUsed, int count){
//check all the possible of strings here
if(count==wordsSize){
//update ans when all string is counted
if(ans.length()==0 || (parentStr.length()<ans.length())){
ans = parentStr;
}
return;
}

for(int i=0;i<wordsSize;i++){
if(labelUsed[i]==true){
continue;
}
int overlapLen = overlappingLen2(parentStr,words[i]);
string currStr = parentStr+words[i].substr(overlapLen);
labelUsed[i]=true;
//dfs check
dfs(words,wordsSize,currStr,ans,labelUsed,count+1);

//udpate status of the current layer
labelUsed[i]=false;
}
return;
}
};

This is the version that computes the prior information about the overlap length between each two pairs.

class Solution {
public:
//use shared part for the ans
vector<vector<int>> m_priorInfo;
vector<int> m_bestPath;
int m_bestLen;

int overlappingLen(string& first, string& second) {
int ans = 0;
int upperBound = min(first.length(), second.length());
for (int k = 1; k <= upperBound; k++) {
if (first.substr(first.length() - k) == second.substr(0, k)) {
ans = max(ans, k);
}
}
return ans;
}

string shortestSuperstring(vector<string>& words) {

int wordsLen = words.size();

m_priorInfo =vector<vector<int>> (wordsLen, vector<int>(wordsLen,0));

//process the prior info
for(int i=0;i<wordsLen;i++){
for(int j=0;j<wordsLen;j++){
if(i==j){
m_priorInfo[i][j]=words[i].length();
}else{
m_priorInfo[i][j]=overlappingLen(words[i],words[j]);
}
//printf("info i %d j %d len %d\n",i,j,priorInfo[i][j]);
}
}


vector<bool> labelUsed(words.size(), false);
vector<int> currPath;

//do the dfs
dfs(words,wordsLen,labelUsed,0,0,currPath);

//recover the string from the bestPath
string ans;
for(int i=0;i<m_bestPath.size();i++){
//printf("path i %d %d\n",i,m_bestPath[i]);
int currIndex = m_bestPath[i];
if(i==0){
ans+=words[currIndex];
}else{
int lastIndex = m_bestPath[i-1];
ans+=words[currIndex].substr(m_priorInfo[lastIndex][currIndex]);
}
}
return ans;
}

//we use the parent Str to remember last steps results
void dfs(vector<string>& words, int& wordsSize, vector<bool>& labelUsed, int count, int currLen, vector<int>& currPath){
//check all the possible of strings here
if(m_bestLen>0 && currLen>=m_bestLen){
return;
}

if(count==wordsSize){
//update best path and best len
m_bestLen=currLen;
m_bestPath=currPath;
return;
}

for(int i=0;i<wordsSize;i++){
if(labelUsed[i]==true){
continue;
}

int currLenNew = currLen;
if(currPath.size()==0){
//the first one
currLenNew=words[i].length();
}else{
int overlapLen = m_priorInfo[currPath.back()][i];
//overlappingLen2(words[currPath.back()],words[i]);
currLenNew = currLen+(words[i].length()-overlapLen);
}

currPath.push_back(i);

//string currStr = parentStr+words[i].substr(overlapLen);
labelUsed[i]=true;
//dfs check
dfs(words,wordsSize,labelUsed, count+1, currLenNew, currPath);

//udpate status of the current layer
labelUsed[i]=false;
currPath.pop_back();
//make sure currLen do not change
//after each iteration in the for loop
}
return;
}
};

This code can not pass all tests because and exceeds the limited time. The ideal way is still to view the problem as a weighted graph and find the shortest path. Refer to the graph listed in huahua’s blog and the official solution here. Using the mask to remember the existing visited nodes is a little bit tricky. The 1 or 0 in each bit position of the mask represents that if the associated value is visited or not. The mask ^ (1<<j) represents that we set jth position as 1 and put it into the mask. (The ^ is the xor opertaion, if the original bit is 0 and current one 1<<j is 1, the results here for specific bit is 1). Using s & (1 << j) operation can check if the specific bit is set. If it is set, the results is a non zero number, otherwise, the result is zero. We do not discuss the dp solution here. The dp solution can be written into the recursive way or not. The core ideas is that we keeping the intermidiate status during the search. How to memorise that information is key point to improve the speed.

996. Number of Squareful Arrays

This is a permutation operation pluse the constraint. The idea is a little bit similar with previous one. Same to the previous one, the naive searching method is still time consuming and the dp method can improve the code a lot, but it is not easy to figure out that dp solution. There are several small points, for example, 0 is also a square number. We can also sort the array and compute the prefix sum to see if the results is squere number. We can avoid the repeated solutions by this way.

class Solution {
public:
vector<vector<bool>> m_priorInfo;
bool squareNum(int num){
for(int i=0;i*i<=num;i++){
if(i*i==num){
return true;
}
}
return false;
}


int numSquarefulPerms(vector<int>& nums) {
int numSize = nums.size();
m_priorInfo = vector<vector<bool>> (numSize,vector<bool>(numSize,false));
sort(nums.begin(), nums.end());
//do the search
int ans=0;
vector<int> path;
vector<bool> labels(numSize, false);
dfs(ans,nums,labels,path);
return ans;
}

void dfs(int& ans, vector<int>& nums, vector<bool>& labels, vector<int>& path){

//check the results
//if valid
//how to remove the repeated one?
if(path.size() == nums.size()){
ans++;
return;
}
//we need to start from the 0 since we need to go through
//all permutations
for(int i=0;i<nums.size();i++){
if(labels[i]==true) continue;
if(i>0 && nums[i]==nums[i-1] && labels[i-1]==true) continue;
if(!path.empty() && !squareNum(path.back()+nums[i])) continue;

//all situation need to put the elements
path.push_back(nums[i]);
labels[i]=true;

//dfs for children layers
dfs(ans,nums,labels,path);

//back to current layer
labels[i]=false;
path.pop_back();
}

}
};

This is the version to utilize the prior knoledge, maybe the good strategy is not to compute them all at once, we can cache them during the searching process.

class Solution {
public:
bool squareNum(int num){
int s = sqrt(num);
return s * s == num;
}


int numSquarefulPerms(vector<int>& nums) {
int numSize = nums.size();
sort(nums.begin(), nums.end());
//do the search
int ans=0;
vector<int> path;
vector<int> labels(numSize, 0);
dfs(ans,nums,labels,path);
return ans;
}

void dfs(int& ans, vector<int>& nums, vector<int>& labels, vector<int>& path){

//check the results
//if valid
//how to remove the repeated one?
if(path.size() == nums.size()){
ans++;
return;
}
//we need to start from the 0 since we need to go through
//all permutations
for(int i=0;i<nums.size();i++){
if(labels[i]) continue;
//be carefule here
//if there are multiple same elements
//only the first can be used in the same level, the third one is important
//only when the labels for previous one is 0
//we can make sure its dfs complete
//otherwise, it might be the condition during the dfs process
if(i>0 && nums[i]==nums[i-1] && labels[i-1]==0) continue;
if(!path.empty() && !squareNum(path.back()+nums[i])) continue;

//all situation need to put the elements
path.push_back(nums[i]);
labels[i]=1;

//dfs for children layers
dfs(ans,nums,labels,path);

//back to current layer
labels[i]=0;
path.pop_back();
}

}
};

The efficient way is still to use the dp method, it is ensentially a hamitonion path in the graphy theory, refer to this explanation (https://www.youtube.com/watch?v=ISnktonx7rY). When there two numbers are prefect squre, there is a path between them.

Naive DFS-UpToBottom-Expression

This type of question provide a expression, then maybe we need to compute some properties according to the expression. There usually exist the inserted structure which can be abstracted into a tree. The question in this type is a little bit like decoding the string into a tree structure with layered information, even if we do not use an actually tree structure, we still need to keep a layer variable to trace current part is at which layer. The 856 is a good example.

856 Score of Parentheses

the code looks super easy we do not even write it into a recursion way, but it is not straightforward to figure out why it works. The whole idea for solving this problem is to emulate a tree structure, different with the previous one, every node is a () pair.

For example, if the input is (()), the imaginary tree is as follows:

()
|
()

If there is (()()), the imaginary tree (or the decision tree) is as follows:

()
| \
() ()

The key is that we do not consider the implementaion, we consider the abstraction first. Then according to the rule of the score, for each leaf node, its score is 2^depth, then we accumulate all the scores of leaf node together. Although the description of the question is to compute the score from the bottom to top, it is much easier to compute scores from the top to the bottom.

Regarding to the implementation, when there is (, the depth increase, when there is ) the depth decrease, and if the previous one is (, we assume that we met the leaf node. The whole sequence is just like the LRN traversal of the tree

class Solution {
public:
int scoreOfParentheses(string s) {
int score =0;
int len = s.length();
int depth=0;
//range s
for(int i=0;i<len;i++){
if(s[i] == '('){
depth++;
}else if(s[i] == ')'){
if(i>0 && s[i-1]=='('){
score+=pow(2,depth-1);
}
depth--;
}
}
return score;
}
};

Another way is to find the balanced outlayer strcutre, and then use the recursion to call the inner layer, which is like the tree taversal from the top to the bottom, in the string case, it can use the left and right index to label the each layer of the structure, when you find each layer, the tree depth increase one.

Enssentially, it like the process of decoding the string into the tree and then compute some properties of the tree. For the 394. Decode String, this idea is more obvious, and we basically use the bottom to up strategy of the tree based approach and we decode each subtree recursively. For each level, we just need a function to let it return the reuslts of the subtree. For the tree structure in logically, the leaf node is the character and the parent node is the number. The issue is that in the string structure, we basically flatten everything and not knwo when it end. So that’s why we also return a right index. That right index labels when the current level moves to the end. And then we start from the next segment (which is the node at the same level with the current node)

726. Number of Atoms

This is the a hard one, the main reason is that the rules are a liitle bit complex. The idea is same with previous one, the different place is that we read the string from right side to the left side, it might be easier to consider things in this way. By this way, we check the number first (the non leaf node) then the atom (the leaf node). The code can be further optimized. Although current code can work, it is still a little bit unclear for when to put the information into the map.

class Solution {
public:
map <string, int> atomMap;
string countOfAtoms(string formula) {
// ordered map to store the atom and its number
string str;
int len = formula.length();
int lindex=0;
helper(formula,lindex,len-1,len,1);
//output the value from map into string
for(auto it : atomMap){
string key = it.first;
//std::cout << "key" << key <<std::endl;
int value = it.second;
str+=key;
if(value!=1){
str+=to_string(value);
}
}

return str;
}

//paras: formular, start position, len of the formula
//read it from right to left for simplicity
void helper(string& f, int& lindex, int rindex, int& len, int multiplier){
string atomName;
int fNum=1;
int i=rindex;
while(i>=0){
//std::cout << "i " << i << std::endl;
//number
if(f[i] >= '0' && f[i]<='9'){
// H2O case, it can be a new started atom
if(i+1<len && f[i+1]>='A' && f[i+1]<='Z'){
//this condition is prior than case where there is (
//when this executed, we do not check ( further
//this is a new atom
//std::cout<<"anmed "<<atomName << "fnum " << fNum << "multi " << multiplier<<std::endl;
atomMap[atomName]+=fNum*multiplier;
//reset key variables
atomName="";
fNum=1;
}

//current number
int tempNum = f[i]-'0';
//read from right to left
if(i+1 < len && f[i+1] >= '0' && f[i+1]<='9'){
fNum = tempNum*10+fNum;
}else{
fNum = tempNum;
}

i--;
continue;
}

//character(multiple)
if((f[i]>='A' && f[i]<='Z') || (f[i]>='a' && f[i]<='z')){
//reset the fNum if there are two upper case
//the 1 is omitted in expression
//also update in this case
if(i+1<len && f[i+1]>='A' && f[i+1]<='Z'){
//this condition is prior than case where there is (
//when this executed, we do not check ( further
//this is a new atom
//std::cout<<"anmec "<<atomName<<std::endl;
atomMap[atomName]+=fNum*multiplier;
//reset key variables
atomName=f[i];
fNum=1;
}else{
atomName=f[i]+atomName;
}
i--;
continue;
}

if(f[i]==')'){
//recursion, process multiple elements inserted
//in current section
//update l index
helper(f, lindex, i-1, len, multiplier*fNum);
//remember to set new fNum, from this point, it is a new segment
fNum=1;
i=lindex-1;
continue;
}

// can start with chracter
// when i==0 also insert
if(f[i]=='(' || i==0){
//return point
//put thing into map
//std::cout<<"anmeb "<<atomName<<std::endl;
if(atomName!=""){
atomMap[atomName]+=fNum*multiplier;
}

//reset key variables
atomName="";
fNum=1;
lindex=i--;
return;
}
}
//process the first one
if(atomName!=""){
//std::cout<<"anmea "<<atomName<< " fNum " << fNum << " multi " << multiplier << std::endl;
atomMap[atomName]+=fNum*multiplier;
}
}
};

Another example is 736, which is a little bit complex. (we will solve it in future)

301. Remove Invalid Parentheses

For this type of question, we need to generate new expression based on given expression. The naive way is to use the binaray tree, if the results increase the specific character or not. Without the trim, the complexity is 2^n, every time we map the status array into a new case.

class Solution {
private:
set<string> m_ans;
int minUpdated = INT_MAX;
string inputStr;
public:
bool valid(string s){
int depth=0;
for(int i=0;i<s.length();i++){
if(s[i]=='('){
depth++;
}else if(s[i]==')'){
depth--;
}
//do nothing for the letter
if(depth<0){
//there are extra right parenthesis
return false;
}
}
//if the results is 0 all match
return depth==0;
}

vector<string> removeInvalidParentheses(string s) {
inputStr = s;
vector<int> removedLabel(s.length(), 0);
dfs(0,0,removedLabel);
vector<string> ansvec;
for (auto v:m_ans){
ansvec.push_back(v);
}
if(ansvec.size()==0){
ansvec.push_back("");
}
return ansvec;
}

string getNewStr(vector<int>& removedLabel){
string ans;
for(int i=0;i<inputStr.length();i++){
//keep the string where there is false labels
if(removedLabel[i]==0){
ans+=inputStr[i];
}
}
return ans;
}

void dfs(int removedNum, int s, vector<int>& removedLabel){
if(s>inputStr.length()){
return;
}
if(removedNum>minUpdated){
return;
}

//maybe add a cache here
string currStr=getNewStr(removedLabel);
//printf("currStr %s label %d %d\n",currStr.c_str(),removedLabel[0],removedLabel[1]);
if(removedNum==minUpdated){
if(valid(currStr)){
m_ans.insert(currStr);
}
}else if(removedNum<minUpdated){
//removedNum < minUpdated
//update resutls
if(valid(currStr)){
m_ans.clear();
m_ans.insert(currStr);
minUpdated = removedNum;
}
}

//todo reorganize the code, the s position mayber larger than
//length in previous section
if(s>=inputStr.length()){
return;
}

//trim cases
//if letters
if(inputStr[s]!='(' && inputStr[s]!=')'){
removedLabel[s] = false;
dfs(removedNum,s+1,removedLabel);
}else{
//remove current or not
removedLabel[s]=1;
//do the dfs, removed current
dfs(removedNum+1,s+1,removedLabel);

//next is
removedLabel[s]=0;
char c = inputStr[s];
//skip the repeated elements
while(s<inputStr.length() && inputStr[s]==c) s++; //if not take then don't take it until the same char
dfs(removedNum,s,removedLabel);
}
return;
}
};

This is the version based on the bit mask and the cache, it does not improve the performance a lot, it however slows down the code, not sure the reason. (maybe the cache is not necessary)

class Solution {
private:
set<string> m_ans;
int minUpdated = INT_MAX;
string inputStr;
map<int, string> cache; // map the mask into the string
public:
bool valid(string s){
int depth=0;
for(int i=0;i<s.length();i++){
if(s[i]=='('){
depth++;
}else if(s[i]==')'){
depth--;
}
//do nothing for the letter
if(depth<0){
//there are extra right parenthesis
return false;
}
}
//if the results is 0 all match
return depth==0;
}

vector<string> removeInvalidParentheses(string s) {
inputStr = s;
vector<int> removedLabel(s.length(), 0);
int mask=0;
dfs(0,0,mask);
vector<string> ansvec;
for (auto v:m_ans){
ansvec.push_back(v);
}
if(ansvec.size()==0){
ansvec.push_back("");
}
return ansvec;
}

//mask labels the removed part
string getNewStr(int mask){
string ans;
for(int i=0;i<inputStr.length();i++){
//keep the string where there is false labels
if((mask & (1<<i))==0) {
//if not labeled as removed
//added into the ans
ans+=inputStr[i];
}
}
return ans;
}

//void dfs(int removedNum, int s, vector<int>& removedLabel){
void dfs(int removedNum, int s, int mask){
if(s>inputStr.length()){
return;
}
if(removedNum>minUpdated){
return;
}

//maybe add a cache here
string currStr;
if(cache.count(mask)>0){
currStr=cache[mask];
}else{
currStr=getNewStr(mask);
cache[mask]=currStr;
}
//string currStr=getNewStr(removedLabel);
//printf("currStr %s label %d %d\n",currStr.c_str(),removedLabel[0],removedLabel[1]);
if(removedNum==minUpdated){
if(valid(currStr)){
m_ans.insert(currStr);
}
}else if(removedNum<minUpdated){
//removedNum < minUpdated
//update resutls
if(valid(currStr)){
m_ans.clear();
m_ans.insert(currStr);
minUpdated = removedNum;
}
}

//todo reorganize the code, the s position mayber larger than
//length in previous section
if(s>=inputStr.length()){
return;
}

//trim cases
//if letters
if(inputStr[s]!='(' && inputStr[s]!=')'){
//removedLabel[s] = false;
//do not need to set, it is 0 initially
mask = (mask & ~(1 << s));
dfs(removedNum,s+1,mask);
}else{
//remove current or not
//removedLabel[s]=1;
mask = (mask | (1<<s));
//do the dfs, removed current
dfs(removedNum+1,s+1,mask);

//next is
//removedLabel[s]=0;
mask = (mask & ~(1<<s));
char c = inputStr[s];
//skip the repeated elements
while(s<inputStr.length() && inputStr[s]==c) s++; //if not take then don't take it until the same char
dfs(removedNum,s,mask);
}
return;
}
};

394. Decode String

We write this code into a recursion way, once we got the bracket, we do a recursion call. It actually use the bottom to up way to do the dfs

class Solution {
public:
string decodeString(string s) {
int len = s.length();
string str = getSubstr(s, 0, len, len);
return str;
}

string getSubstr(string& s, int l, int len, int& rindex){
string fullstr;
int depth=0;
int i=l;
int repeat=0;

while(i<len){
if( s[i]==']'){
rindex=i;
repeat=0;
return fullstr;
}

if(s[i]>='a' && s[i]<='z'){
fullstr+=s[i];
i++;
continue;
}

if(s[i]>='0' && s[i]<='9'){
int tempRepeat = (s[i]-'0');
if(i>0 && s[i-1]>='0' && s[i-1]<='9'){
//consider previous one only when
//there are adjacent digits
repeat = repeat*10+tempRepeat;
}else{
repeat = tempRepeat;
}
i++;
continue;
}

string tempstr;
if(s[i]=='['){
tempstr = getSubstr(s,i+1,len,rindex);
i=rindex+1;
}

for(int j=0;j<repeat;j++){
fullstr+=tempstr;
}
}
return fullstr;
}
};

Partition DFS

Naive BFS

17 Output all possible combinations.

We basically need to output all the paths, each number represents a new level for each all possible nodes. Essentailly, there is a tree structure and it is good to use the bfs method to search all the tree and only put the results in vector at the last layer. We do not need to put the value in ans at the first several layers and we just need to put the value into the ans at the final layer. The code structure is same with the tree based version, the difference is that the node is not the actual tree node, it is just the imaginary tree node.

class Solution {
public:
vector<string> letterCombinations(string digits) {
vector<string> ans;
int len = digits.length();
if(len==0){
return ans;
}
//init dial table
map<int,vector<string>> dialMap;
dialMap[2]={"a","b","c"};
dialMap[3]={"d","e","f"};
dialMap[4]={"g","h","i"};
dialMap[5]={"j","k","l"};
dialMap[6]={"m","n","o"};
dialMap[7]={"p","q","r","s"};
dialMap[8]={"t","u","v"};
dialMap[9]={"w","x","y","z"};
//init the q
queue<string> q;
int initkey = digits[0]-'0';
for(auto v:dialMap[initkey]){
if(len>1){
q.push(v);
}else if(len==1){
ans.push_back(v);
}
}
for(int i=1;i<len;i++){
//construct all possible results at current layer
int key = digits[i]-'0';
//fetch the value at i-1 layer
int qsize = q.size();
while(qsize>0){
string preStr=q.front();
//put new nodes into the queue
for(auto v:dialMap[key]){
string currStr = preStr+v;
//put value into the q if i<len-1
//put the q value into the ans i ==len-1
if(i<len-1){
q.push(currStr);
}else if(i==(len-1)){
ans.push_back(currStr);
}
}
//pop out old nodes
q.pop();
qsize--;
}
}
return ans;
}
};

784. Letter Case Permutation

This is the sample code, the framework is still same with previous one (dfs split the current string into the current part and the remaning part). We just classifiy the type of each elements can consider each operations gradually. the idea is still the recursion based on dfs manner.

The goal of algorithm is to find all the combinations, for each character which is a letter, it can be either a small case or large case, so there are two options for current layer

class Solution {
public:
vector<string> letterCasePermutation(string s) {
vector<string> ans;
string curr;
dfs(ans, s, curr, 0);
return ans;
}

void dfs(vector<string>& ans, string&sample, string curr, int index){
if(index==sample.length()){
//all elements are pushed into the curr
ans.push_back(curr);
return;
}
//if not letter, move to next
char temp = sample[index];
if(temp>='a' && temp<='z'){
//if letter
//dfs results with the change
char letterNew = temp-'a' + 'A';
string currLower=curr+temp;
string currUpper=curr+letterNew;
dfs(ans, sample, currLower, index+1);
dfs(ans, sample, currUpper, index+1);
}else if(temp>='A' && temp<='Z'){
char letterNew = temp-'A' + 'a';
string currUpper=curr+temp;
string currLower=curr+letterNew;
dfs(ans, sample, currLower, index+1);
dfs(ans, sample, currUpper, index+1);
}else{
//not letter
curr=curr+temp;
dfs(ans, sample, curr, index+1);
}
return;
}
};

BFS with visited flag

BFS+DFS

Accelartion by map

推荐文章