
SIGSEGV if Match is called before Compile #484

@masklinn

>>> f = Filter()
>>> f.Compile()
re2/filtered_re2.cc:74: Compile called before Add.
>>> f.Match("")
>>> f = Filter()
>>> f.Match("")
py311: exit -11

This was run inside tox, using a Python installed with pyenv, and it reproduces with both Python 3.8 and 3.11.

I would assume this is because, while FirstMatch has a guard that checks whether the filter has been compiled,

re2/re2/filtered_re2.cc

Lines 96 to 99 in 108914d

if (!compiled_) {
  LOG(DFATAL) << "FirstMatch called before Compile.";
  return -1;
}
there is no such guard for AllMatches:

re2/re2/filtered_re2.cc

Lines 108 to 118 in 108914d

bool FilteredRE2::AllMatches(absl::string_view text,
                             const std::vector<int>& atoms,
                             std::vector<int>* matching_regexps) const {
  matching_regexps->clear();
  std::vector<int> regexps;
  prefilter_tree_->RegexpsGivenStrings(atoms, &regexps);
  for (size_t i = 0; i < regexps.size(); i++)
    if (RE2::PartialMatch(text, *re2_vec_[regexps[i]]))
      matching_regexps->push_back(regexps[i]);
  return !matching_regexps->empty();
}

I assumed the PrefilterTree would be the issue, but apparently it guards against this:

re2/re2/prefilter_tree.cc

Lines 271 to 276 in 108914d

if (!compiled_) {
  // Some legacy users of PrefilterTree call Compile() before
  // adding any regexps and expect Compile() to have no effect.
  // This kludge is a counterpart to that kludge.
  if (prefilter_vec_.empty())
    return;
so I'm not sure why it SIGSEGVs, but I have not debugged the C++ code, only read it.

Interestingly there is a guard in Set::Match:

re2/re2/set.cc

Lines 128 to 133 in 108914d

if (!compiled_) {
  if (error_info != NULL)
    error_info->kind = kNotCompiled;
  LOG(DFATAL) << "RE2::Set::Match() called before compiling";
  return false;
}
but Filter::Match does not check the result of that call:
set_->Match(text, &atoms);
so the guard does not protect the FilteredRE2 call that follows (assuming that is what crashes; for all I know it might be something else entirely).
