"""

The regular expression engine in '_sre' can segfault when interpreting

bogus bytecode.



It is unclear whether this is a real bug or a "won't fix" case like

bogus_code_obj.py, because it requires bytecode that is built by hand,

as opposed to compiled by 're' from a string-source regexp.  The

difference with bogus_code_obj, though, is that the only existing regexp

compiler is written in Python, so that the C code has no choice but

accept arbitrary bytecode from Python-level.



The test below builds and runs random bytecodes until 'match' crashes

Python.  I have not investigated why exactly segfaults occur nor how

hard they would be to fix.  Here are a few examples of 'code' that

segfault for me:



    [21, 50814, 8, 29, 16]

    [21, 3967, 26, 10, 23, 54113]

    [29, 23, 0, 2, 5]

    [31, 64351, 0, 28, 3, 22281, 20, 4463, 9, 25, 59154, 15245, 2,

                  16343, 3, 11600, 24380, 10, 37556, 10, 31, 15, 31]



Here is also a 'code' that triggers an infinite uninterruptible loop:



    [29, 1, 8, 21, 1, 43083, 6]



"""



import _sre, random



def pick():

    n = random.randrange(-65536, 65536)

    if n < 0:

        n &= 31

    return n



ss = ["", "world", "x" * 500]



while 1:

    code = [pick() for i in range(random.randrange(5, 25))]

    print code

    pat = _sre.compile(None, 0, code)

    for s in ss:

        try:

            pat.match(s)

        except RuntimeError:

            pass

