Regular bracket strings, also called Dyck words, are defined as follows:
- An empty string is a regular bracket string
- If you have a regular bracket string, you can put a
(
in front and)
at the end of the string to make another regular bracket string - If you have two regular bracket strings, you can concatenate them together to make another regular bracket string
In RegularBracketString.v
, this definition is named isBalanced
.
There are many competitive programming problems about regular bracket strings. Every competitive programming problem about regular bracket strings is a novel snowflake.
Here are some results about these strings.
There is another way to describe regular bracket strings
There is another definition of regular bracket strings that is widely used in the competitive programming community: For a string to be a regular bracket string, the number of (
characters must be equal to the number of )
characters and for every prefix of the string, the number of )
characters must not exceed the number of (
characters. In RegularBracketString.v
, this definition is named balanceFactorBasedDefinition
.
We now prove that this definition is equivalent to the inductive definition isBalanced
above.
First, we prove isBalanced
implies balanceFactorBasedDefinition
.
Let countOpen(s)
be the number of (
characters in s
, and countClose(s)
be the number of )
characters in s
.
We perform induction on the isBalanced
predicate. There are three cases to consider.
Case 1: Empty string. Since it is an empty string, the number of (
is equal to the number of )
. Furthermore, since it is an empty string, its only prefix is also an empty string. Both conditions of balanceFactorBasedDefinition
are satisfied.
Case 2: The current string s
is equal to "(" + s' + ")"
, where s'
is a string that satisfies balanceFactorBasedDefinition
. We have countOpen(s) = countOpen(s') + 1
, and countClose(s) = countClose(s') + 1
. Since countOpen(s') = countClose(s')
, we have countOpen(s) = countClose(s)
. The string now satisfies the first condition of balanceFactorBasedDefintion
.
To prove that the string s
satisfies the second condition of balanceFactorBasedDefinition
, we need to consider several cases.
- The prefix
s1
ofs
is empty. Since the prefix is empty,countOpen(s1) = countClose(s1) = 0
, thereforecountOpen(s1) >= countClose(s1)
. - The prefix
s1
is equal to"(" + s2
, wheres2
is a prefix ofs'
. We havecountOpen(s1) = countOpen(s2) + 1
, andcountClose(s1) = countClose(s2)
. Ass2
is a prefix ofs'
, we havecountOpen(s2) >= countClose(s2)
. SocountOpen(s1) = countOpen(s2) + 1 >= countClose(s2) + 1 >= countClose(s1)
. - The prefix
s1
is identical tos
. Since this prefix is identical tos
,countOpen(s1) = countClose(s1)
, thereforecountOpen(s1) >= countClose(s1)
.
Now the string s
satisfies both conditions.
Case 3: The current string s
is the concatenation of two strings s1
and s2
. Both s1
and s2
satisfy the balanceFactorBasedDefinition
conditions. Since s1
satisfies balanceFactorBasedDefinition
, countOpen(s1) = countClose(s1)
. Additionally, since s2
satisfies balanceFactorBasedDefinition
, countOpen(s2) = countClose(s2)
. Since the current string s
is the concatenation of s1
and s2
, countOpen(s) = countOpen(s1) + countOpen(s2)
. Therefore, countClose(s) = countClose(s1) + countClose(s2) = countOpen(s1) + countOpen(s2) = countOpen(s)
, thus satisfying the first condition of balanceFactorBasedDefinition
.
To prove that the string s
satisfies the second condition of balanceFactorBasedDefinition
, we need to consider two cases. If the prefix s'
of s
is also a prefix of s1
, countOpen(s') >= countClose(s')
. Otherwise, the prefix s'
is equal to s1 + s''
, where s''
is a prefix of s2
. Now, countOpen(s1) = countClose(s1)
, and countOpen(s'') >= countClose(s'')
, so countOpen(s1) + countOpen(s'') >= countClose(s1) + countClose(s'')
, therefore countOpen(s') >= countClose(s')
.
Now the string s
satisfies both conditions.
We now prove balanceFactorBasedDefinition
implies isBalanced
.
We perform induction on the length of the string s
.
We handle several special cases first.
First, if the string s
is empty, it already satisfies isBalanced
.
Second, if the string s
consists of a single character, we have two cases.
- If
s = "("
,countOpen(s) = 1
andcountClose(s) = 0
, thereforecountOpen(s)
is not equal tocountClose(s)
. Therefore the string does not satisfybalanceFactorBasedDefinition
. We can ignore this case. - If
s = "("
,countOpen(s) = 0
andcountClose(s) = 1
, thereforecountOpen(s)
is not equal tocountClose(s)
. Therefore the string does not satisfybalanceFactorBasedDefinition
. We can ignore this case.
Now we need to consider the case where the length of s
is at least 2 and satisfies balanceFactorBasedDefinition
.
We need to prove a small fact: s
can always be decomposed as "(" + s1 + ")"
. s1
might not necessarily satisfy balanceFactorBasedDefinition
.
- First character: We consider the case where the first character is
)
. We take a prefix of length 1 of the string. SincecountOpen(")") = 0
, andcountClose(")") = 1
, this meanscountOpen(")") < countClose(")")
, which is a contradiction. The first character is always(
. - Last character:
s
is equal tos' + a
, wherea
is a single character. We havecountOpen(s') >= countClose(s')
, andcountOpen(s' + a) = countClose(s' + a)
. We consider the case wherea
is(
. We havecountOpen(s' + "(") = countOpen(s') + 1
. Also, we havecountClose(s' + "(") = countClose(s')
. ThereforecountOpen(s') + 1 = countClose(s')
, which meanscountOpen(s') < countClose(s')
, which is a contradiction. The last character is always)
.
We now consider two cases.
- For all prefixes
s'
of length from 1 tolength(s) - 1
(inclusive),countOpen(s') > countClose(s')
. We decomposes
as"(" + s1 + ")"
. AscountOpen(s) = countClose(s)
, andcountOpen(s) = countOpen(s1) + 1
,countClose(s) = countClose(s1) + 1
, we havecountOpen(s1) = countClose(s1)
. This meanss1
satisfies the first condition ofbalanceFactorBasedDefinition
. For all prefixess'
ofs1
, we take a prefixt
of lengthlength(s') + 1
of the strings
. We havecountOpen(t) > countClose(t)
and ast = "(" + s'
, we havecountOpen(s') + 1 > countClose(s')
. Therefore,countOpen(s') >= countClose(s')
. Now the strings1
satisfies the second condition ofbalanceFactorBasedDefinition
. Ass1
satisfiesbalanceFactorBasedDefinition
, and the length ofs1
is less than the length ofs
, we can apply the induction hypothesis to deduce thats1
satisfiesisBalanced
. Ass1
satisfiesisBalanced
,"(" + s1 + ")"
satisfiesisBalanced
and thuss
satisfiesisBalanced
. - There exists a prefix
s1
of length from1
tolength(s) - 1
(inclusive) wherecountOpen(s1) = countClose(s1)
. Every prefix ofs1
is also a prefix ofs
, sos'
satisfiesbalanceFactorBasedDefinition
. As the length ofs1
is less than the length ofs
ands1
satisfiesbalanceFactorBasedDefinition
, we can apply the induction hypothesis to deduce thats1
satisfiesisBalanced
. The strings
can be decomposed ass1 + s2
. AscountOpen(s) = countClose(s)
andcountOpen(s1) + countClose(s1)
, we deduce thatcountOpen(s2) = countClose(s2)
. For every prefixs'
ofs2
, we take a prefixt
of lengthlength(s1) + length(s')
froms
. We havecountOpen(t) >= countClose(t)
. We havet = s1 + s'
, and ascountOpen(s1) = countClose(s1)
, we havecountOpen(s2) >= countClose(s2)
. Nows2
satisfiesbalanceFactorBasedDefinition
. Ass2
satisfiesbalanceFactorBasedDefinition
and the length ofs2
is less than the length ofs
, we can apply the induction hypothesis to deduce thats2
satisfiesisBalanced
. As boths1
ands2
satisfyisBalanced
, the concatenation also satisfiesisBalanced
and thuss
satisfiesisBalanced
.
Therefore the two definitions are equivalent.
Given a string with (
, )
and ?
, there is a quick way to check if it is possible to replace the question marks with (
and )
to make a regular bracket string
Let the number of already filled (
characters be filledOpen
. Similarly, let the number of already filled )
characters be filledClose
. Let the length of the string be length
. If length
is odd, there is no solution. If length / 2 < filledOpen
, there is no solution. If length / 2 < filledClose
, there is no solution. Now fill the first length / 2 - filledOpen
question marks with (
, and the rest with )
. If the result string is a regular bracket string, there is a solution. Otherwise, there is no solution.
We prove that the above approach is correct, that is if there is a solution, the construction we provided is a solution.
A solution is described as an array of (
and )
characters to be filled sequentially from left to right into the string. The length of a solution must always be equal to the number of ?
characters present in the string.
We notice that all solutions are permutations of each other because the number of (
characters to be filled and the number of )
characters to be filled are both fixed. If we sort any arbitrary solution, we get our construction. We will prove that for every regular bracket string, if we swap two characters, not necessarily adjacent, where the left character is )
and the right character is (
, the resulting string is also a regular bracket string. Then it follows that for an arbitray solution, if we swap two characters, not necessarily adjacent, where the left character is )
and the right character is (
, we also get a valid solution. And since there are a lot of sorting algorithms that only swap inversions, it follows that when we sort an arbitrary solution, we also get a valid solution and thus, our construction is a valid solution. In FillInTheBlanks.v
, we use the selection sort algorithm. But it is also possible to use bubble sort, quick sort and other algorithms, as long as they only swap inversions. Check out this file for the correctness proof.
Now, we prove for a regular bracket string of the form a = s1 + ")" + s2 + "(" + s3
, b = s1 + "(" + s2 + ")" + s3
is also a regular bracket string. We will use balanceFactorBasedDefinition
as it is more convenient. First, b
already satisfies the first condition of balanceFactorBasedDefinition
as it is a permutation of a
. To prove that b
satisfies the second definition, we need to consider several cases.
- The prefix
s
is a prefix ofs1
. This meanss
is a prefix ofa
, and asa
is a regular bracket string,countOpen(s) >= countClose(s)
. - The prefix
s
is equal tos1 + "(" + s'
, wheres'
is a prefix ofs2
. We haves1 + ")" + s'
is a prefix ofa
, thereforecountOpen(s1) + countOpen(s') >= countClose(s1) + countClose(s') + 1
. To provecountOpen(s) >= countClose(s)
, we provecountOpen(s1) + countOpen(s') + 1 >= countClose(s1) + countClose(s')
. This is true becausecountOpen(s1) + countOpen(s') + 1 >= countClose(s1) + countClose(s') + 2 >= countClose(s1) + countClose(s')
. - The prefix
s
is equal tos1 + "(" + s2 + ")" + s'
, wheres'
is a prefix ofs3
. We haves1 + ")" + s2 + "(" + s'
is a prefix ofa
, thereforecountOpen(s1) + countOpen(s2) + countOpen(s') + 1 >= countClose(s1) + countClose(s2) + countClose(s') + 1
. From this, we can conclude thatcountOpen(s) >= countClose(s)
, since if we expand the two sides, we get the same inequality as the previous step.
With this result, we can prove the next result.
There is also a quick way to check if there are two separate ways to replace the question marks
Problem link: https://codeforces.com/contest/1709/problem/C
This is proved in the TwoWaysToFill.v
file.
To check, we fill in the question marks according to the construction described in the previous result. But then we swap the last question mark filled with (
and the first question mark filled with )
and then check if the resulting string is a regular bracket string.
As described above, a solution is an array of characters to be filled sequentially from left to right into the string. Now we have two arbitrary solutions, witness1
and witness2
. These two solutions are different. We perform these preprocessing steps:
- If
witness1
is already equal to our construction, that is, we fill the first several question marks with(
and the rest with)
, don't do anything. - Otherwise, swap
witness1
andwitness2
then replacewitness1
with our construction.
After these preprocessing steps, witness1
and witness2
are still two distinct valid solutions.
witness1
has the form ((((...())))...)
. We can decompose witness1
into two segments: the first segment filled with (
and the second segment filled with )
. While witness2
has an arbitrary shape, we will still split it into two segments, the first segment with length equal to the first segment of witness1
and the second segment with length equal to the second segment of witness1
. And we will apply this decomposition to any other witness that we create.
As witness1
and witness2
are different, there is at least one character that is different.
There are two cases.
-
The different character is in the first segment. In
witness1
, the character is(
. Inwitness2
, the character is)
. We have no information about other characters inwitness2
. In order to proceed, we create another witness that is easier to inspect. We keep the different character, but for the other characters, we erase them and then fill them according to our construction.Let's take the string
(????)
. And for example, ourwitness1
is(())
, and ourwitness2
is()()
. The first different character is the second character inwitness2
. Let's applywitness2
to the string, we then get(()())
. Now, let's empty all the characters except the different character. We then get(?)??)
. And then we can fill the question marks again with our construction and get a new witness that is easier to inspect.Let's call the new witness
witness3
. There is already a)
residing in the first segment. Because of the way we fill the other characters, there is also a(
residing in the second segment. -
The different character is in the second segment. Similarly, we create a new witness
witness3
where there is a)
residing in the first segment and a(
residing in the second segment.
Let's take a look at witness3
again. witness3
looks something like this:
a ( in segment 2
----first------ |
((((((()((((((()))())))))
| --second--
a ) in segment 1
We perform these steps to transform witness3
into the witness checked by the problem's solution.
- If the
)
in segment 1 is at the end of the segment, we need to do nothing. Otherwise, we swap the)
in segment 1 with the character at the end of segment 1, which is guaranteed to be(
. The witness remains a valid witness. - If the
(
in segment 2 is at the beginning of the segment, we need to do nothing. Otherwise, we swap the character at the beginning of segment 2, which is guaranteed to be)
, with the(
in segment 2. The witness remains a valid witness.