PyPy Advent Calendar Day 21: I Tried Extending the Grammar

Hello again, this is [twitter:@shomah4a], covering day 21 of the PyPy Advent Calendar.
This is my third time around.
It's getting tough.
Isn't there anyone else who wants to write?

So this time, inspired by [twitter:@yanolab] modifying PyPy's Python interpreter on day 11, I tried extending it myself.

What I added is the yield from syntax that I introduced on day 1 of the Python3 Advent Calendar.

Modifying the grammar definition

First, edit pypy/interpreter/pyparser/data/Grammar2.7, the file that defines Python's grammar.

The definition of small_stmt looks like this:

small_stmt: (expr_stmt | print_stmt  | del_stmt | pass_stmt | flow_stmt |
             import_stmt | global_stmt | exec_stmt | assert_stmt)

Add yield_from_stmt to it:

small_stmt: (expr_stmt | print_stmt  | del_stmt | pass_stmt | flow_stmt |
             import_stmt | global_stmt | exec_stmt | assert_stmt | yield_from_stmt)

Then add the definition of yield_from_stmt itself:

yield_from_stmt: 'yld' 'from' [testlist]

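The reason this is 'yld' 'from' rather than 'yield' 'from' is that when I defined it the straightforward way, the parser generator yelled at me that the grammar was ambiguous and could not be resolved to a single rule.
Fixing that properly looked like a lot of work, so...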
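
My guess at what that complaint means (an assumption, since the exact message is not shown here): the parser picks an alternative of small_stmt by looking at the next token only, and flow_stmt already leads to yield_stmt, which starts with 'yield'. The toy check below is not PyPy's parser generator. The first-token sets are written by hand and abbreviated, and first_token_conflicts is a hypothetical helper, but it shows why 'yld' sidesteps the clash.

```python
# Toy LL(1)-style check with hand-written, abbreviated first-token sets.
# This is an illustration only, not PyPy's parser generator.

def first_token_conflicts(alternatives):
    """Map each token that starts more than one alternative to those alternatives."""
    owners = {}
    for name, first_tokens in alternatives.items():
        for token in sorted(first_tokens):
            owners.setdefault(token, []).append(name)
    return {tok: names for tok, names in owners.items() if len(names) > 1}

# In the 2.7 grammar, flow_stmt covers break, continue, return, raise and yield.
FLOW_STMT_FIRST = {"break", "continue", "return", "raise", "yield"}

with_yield = {"flow_stmt": FLOW_STMT_FIRST, "yield_from_stmt": {"yield"}}
with_yld = {"flow_stmt": FLOW_STMT_FIRST, "yield_from_stmt": {"yld"}}

print(first_token_conflicts(with_yield))  # 'yield' is claimed by both alternatives
print(first_token_conflicts(with_yld))    # no shared first token
```

With 'yield' both alternatives compete for the same first token, while 'yld' gives the new statement a first token of its own.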

Modifying the AST builder

Now that the parser can handle yld from, the next step is the part that builds the abstract syntax tree.

Add a handle_yield_from_stmt method to pypy/interpreter/astcompiler/astbuilder.py.

    def handle_yield_from_stmt(self, stmt):

        expr = stmt.children[2]
        
        print expr  # debug output: dumps the parse tree node of the expression

        values = ast.Name('__________', ast.Load, stmt.lineno, stmt.column)
        iters = ast.Name('_________________', ast.Load, stmt.lineno, stmt.column)
        valuel = ast.Name('__________', ast.Load, stmt.lineno, stmt.column)
        iterl = ast.Name('_________________', ast.Load, stmt.lineno, stmt.column)

        self.set_context(values, ast.Store)
        self.set_context(iters, ast.Store)
        
        num = ast.Num(self.parse_number('1'), stmt.lineno, stmt.column)
        yld = ast.Yield(valuel, stmt.lineno, stmt.column)
        none = ast.Name('None', ast.Load, stmt.lineno, stmt.column)
        stopiteration = ast.Name('StopIteration', ast.Load, stmt.lineno, stmt.column)

        # iter.next()
        funcall = ast.Call(ast.Attribute(iterl, 'next', ast.Load, stmt.lineno, stmt.column), None, None, None, None, stmt.lineno, stmt.column)

        # initialization
        initi = ast.Assign([iters], self.handle_testlist(expr), stmt.lineno, stmt.column)
        initv = ast.Assign([values], funcall, stmt.lineno, stmt.column)
        assignv = ast.Assign([values], funcall, stmt.lineno, stmt.column)

        # while
        whl = ast.While(num, [assignv, yld], None, stmt.lineno, stmt.column)

        # try
        excp = ast.ExceptHandler(stopiteration, None, [], stmt.lineno, stmt.column)
        tr = ast.TryExcept([initi, initv, yld, whl], [excp], None, stmt.lineno, stmt.column)

        return tr

What this method does is build the abstract syntax tree for Python code like the following.

try:
    iter = expr()
    value = iter.next()
    yield value

    while 1:
        value = iter.next()
        yield value

except StopIteration:
    pass

Put simply, it just uses the existing yield to imitate the behavior of yield from.
This alone is far from complete, but I was happy as long as I could write something that looks the part.
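
For anyone who wants to play with the same rewrite without patching PyPy, here is a rough sketch on top of CPython 3's ast module. Everything in it is my own stand-in rather than PyPy's astbuilder API: a bare call to the hypothetical marker yld_from(EXPR) plays the role of yld from EXPR, and the try/while tree comes from parsing a template instead of constructing nodes by hand.

```python
import ast

# Template with the same shape as the tree built in handle_yield_from_stmt,
# spelled for Python 3 (next(it) instead of it.next()).
TEMPLATE = """\
try:
    _it = PLACEHOLDER
    _v = next(_it)
    yield _v
    while 1:
        _v = next(_it)
        yield _v
except StopIteration:
    pass
"""

class YldFromRewriter(ast.NodeTransformer):
    """Replace the statement `yld_from(EXPR)` (hypothetical marker) with the template."""

    def visit_Expr(self, node):
        call = node.value
        is_marker = (isinstance(call, ast.Call)
                     and isinstance(call.func, ast.Name)
                     and call.func.id == "yld_from"
                     and len(call.args) == 1)
        if not is_marker:
            return node
        try_stmt = ast.parse(TEMPLATE).body[0]
        try_stmt.body[0].value = call.args[0]  # _it = <the delegated expression>
        return try_stmt

SOURCE = """
def genB():
    for i in range(3):
        yield i

def genA():
    yld_from(genB())
"""

tree = YldFromRewriter().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, "<yld_from demo>", "exec"), namespace)
print(list(namespace["genA"]()))  # [0, 1, 2]
```

Because the rewritten body of genA contains yield, the compiler treats genA as a generator even though the original source never says yield inside it.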

By the way, according to PEP 380, yield from is equivalent to the following Python code.

_i = iter(EXPR)
try:
    _y = next(_i)
except StopIteration as _e:
    _r = _e.value
else:
    while 1:
        try:
            _s = yield _y
        except GeneratorExit as _e:
            try:
                _m = _i.close
            except AttributeError:
                pass
            else:
                _m()
            raise _e
        except BaseException as _e:
            _x = sys.exc_info()
            try:
                _m = _i.throw
            except AttributeError:
                raise _e
            else:
                try:
                    _y = _m(*_x)
                except StopIteration as _e:
                    _r = _e.value
                    break
        else:
            try:
                if _s is None:
                    _y = next(_i)
                else:
                    _y = _i.send(_s)
            except StopIteration as _e:
                _r = _e.value
                break
RESULT = _r

That looks like a lot of work.
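
To make the gap concrete, here is a small comparison that needs Python 3.3 or later for the real yield from. The names child, naive_delegate, real_delegate and drive are made up for this illustration. The naive version has the same shape as my yld from expansion, so it throws away values passed to send() and the subgenerator's return value, both of which the PEP 380 expansion takes care of.

```python
def child():
    # Reports what the caller sent, then finishes with a return value.
    received = yield "ready"
    yield "got %r" % (received,)
    return "child result"

def naive_delegate():
    # Same shape as the tree built for `yld from` (Python 3 spelling).
    try:
        it = child()
        value = next(it)
        yield value
        while 1:
            value = next(it)
            yield value
    except StopIteration:
        pass

def real_delegate():
    # Real PEP 380 delegation.
    result = yield from child()
    yield "returned %s" % result

def drive(gen):
    out = [next(gen)]
    out.append(gen.send("hello"))  # try to talk to the subgenerator
    out.extend(gen)                # drain whatever is left
    return out

print(drive(naive_delegate()))  # ['ready', 'got None']
print(drive(real_delegate()))   # ['ready', "got 'hello'", 'returned child result']
```

The naive version never forwards "hello" to child and silently drops "child result".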

Now that the yield-from-like syntax tree can be built, the remaining step is to wire it into the statement handling.

Add the following clause to the if stmt_type == syms.small_stmt: block of the handle_stmt method.

            elif stmt_type == syms.yield_from_stmt:
                return self.handle_yield_from_stmt(stmt)

With that, we have a Python implementation in which yld from can be used.

Running it

As a test, I wrote the following source.

#-*- coding:utf-8 -*-

print 10

def genA():

    yld from genB()


def genB():

    for i in xrange(3):

        yield i


if __name__ == '__main__':

    for i in genA():
        print i

    for i in genB():
        print i

Now to run it, although translating the interpreter with translate.py is frankly a hassle.
In that case you can use pypy/bin/py.py, which runs the Python interpreter on top of CPython or PyPy.

$ python /path/to/pypy/bin/py.py yieldfrom.py
[version:WARNING] Errors getting Mercurial information: command does not identify itself as Mercurial
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-1/platcheck_2.c -o /tmp/usession-unknown-1/platcheck_2.o
[platform:execute] gcc /tmp/usession-unknown-1/platcheck_2.o -pthread -lintl -lrt -o /tmp/usession-unknown-1/platcheck_2
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-1/platcheck_3.c -o /tmp/usession-unknown-1/platcheck_3.o
[platform:execute] gcc /tmp/usession-unknown-1/platcheck_3.o -pthread -lrt -o /tmp/usession-unknown-1/platcheck_3
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-1/platcheck_4.c -o /tmp/usession-unknown-1/platcheck_4.o
[platform:execute] gcc /tmp/usession-unknown-1/platcheck_4.o -pthread -lrt -o /tmp/usession-unknown-1/platcheck_4
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-1/platcheck_5.c -o /tmp/usession-unknown-1/platcheck_5.o
[platform:execute] gcc /tmp/usession-unknown-1/platcheck_5.o -pthread -lrt -o /tmp/usession-unknown-1/platcheck_5
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-1/platcheck_7.c -o /tmp/usession-unknown-1/platcheck_7.o
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-1/platcheck_8.c -o /tmp/usession-unknown-1/platcheck_8.o
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-1/platcheck_14.c -o /tmp/usession-unknown-1/platcheck_14.o
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused -I/home/shoma/files/pypy/pypy/translator/c /tmp/usession-unknown-1/platcheck_18.c -o /tmp/usession-unknown-1/platcheck_18.o
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused -I/home/shoma/files/pypy/pypy/translator/c /tmp/usession-unknown-1/module_cache/module_0.c -o /tmp/usession-unknown-1/module_cache/module_0.o
[platform:execute] gcc /tmp/usession-unknown-1/platcheck_18.o /tmp/usession-unknown-1/module_cache/module_0.o -pthread -Wl,--export-dynamic,--version-script=/tmp/usession-unknown-1/dynamic-symbols-0 -lrt -o /tmp/usession-unknown-1/platcheck_18
faking <type 'module'>
faking <type 'member_descriptor'>
[version:WARNING] Errors getting Mercurial information: command does not identify itself as Mercurial
[version:WARNING] Errors getting Mercurial information: command does not identify itself as Mercurial
Node(type=329, children=[Node(type=328, children=[Node(type=312, children=[Node(type=258, children=[Node(type=309, children=[Node(type=271, children=[Node(type=287, children=[Node(type=339, children=[Node(type=257, children=[Node(type=319, children=[Node(type=261, children=[Node(type=327, children=[Node(type=290, children=[Node(type=315, children=[Node(type=263, children=[Node(type=1, value='genB')]), Node(type=333, children=[Node(type=7, value='('), Node(type=8, value=')')])])])])])])])])])])])])])])])
10
0
1
2
0
1
2

It worked!

It also runs if you translate it with translate.py.
See, easy, right?

The punchline

So it does run, but if I raise the number of iterations of the xrange in genB to around 10, I get an error.
There seems to be a problem somewhere.

$ python /path/to/pypy/bin/py.py yieldfrom.py
[version:WARNING] Errors getting Mercurial information: command does not identify itself as Mercurial
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-2/platcheck_2.c -o /tmp/usession-unknown-2/platcheck_2.o
[platform:execute] gcc /tmp/usession-unknown-2/platcheck_2.o -pthread -lintl -lrt -o /tmp/usession-unknown-2/platcheck_2
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-2/platcheck_7.c -o /tmp/usession-unknown-2/platcheck_7.o
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-2/platcheck_8.c -o /tmp/usession-unknown-2/platcheck_8.o
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused /tmp/usession-unknown-2/platcheck_14.c -o /tmp/usession-unknown-2/platcheck_14.o
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused -I/home/shoma/files/pypy/pypy/translator/c /tmp/usession-unknown-2/platcheck_18.c -o /tmp/usession-unknown-2/platcheck_18.o
[platform:execute] gcc -c -O3 -pthread -fomit-frame-pointer -Wall -Wno-unused -I/home/shoma/files/pypy/pypy/translator/c /tmp/usession-unknown-2/module_cache/module_0.c -o /tmp/usession-unknown-2/module_cache/module_0.o
[platform:execute] gcc /tmp/usession-unknown-2/platcheck_18.o /tmp/usession-unknown-2/module_cache/module_0.o -pthread -Wl,--export-dynamic,--version-script=/tmp/usession-unknown-2/dynamic-symbols-0 -lrt -o /tmp/usession-unknown-2/platcheck_18
faking <type 'module'>
faking <type 'member_descriptor'>
[version:WARNING] Errors getting Mercurial information: command does not identify itself as Mercurial
[version:WARNING] Errors getting Mercurial information: command does not identify itself as Mercurial
Node(type=329, children=[Node(type=328, children=[Node(type=312, children=[Node(type=258, children=[Node(type=309, children=[Node(type=271, children=[Node(type=287, children=[Node(type=339, children=[Node(type=257, children=[Node(type=319, children=[Node(type=261, children=[Node(type=327, children=[Node(type=290, children=[Node(type=315, children=[Node(type=263, children=[Node(type=1, value='genB')]), Node(type=333, children=[Node(type=7, value='('), Node(type=8, value=')')])])])])])])])])])])])])])])])
10
0
1
2
3
4
Traceback (most recent call last):
  File "/home/shoma/files/pypy/pypy/bin/py.py", line 185, in <module>
    sys.exit(main_(sys.argv))
  File "/home/shoma/files/pypy/pypy/bin/py.py", line 156, in main_
    verbose=interactiveconfig.verbose):
  File "/home/shoma/files/pypy/pypy/interpreter/main.py", line 103, in run_toplevel
    f()
  File "/home/shoma/files/pypy/pypy/bin/py.py", line 140, in doit
    main.run_file(args[0], space=space)
  File "/home/shoma/files/pypy/pypy/interpreter/main.py", line 68, in run_file
    run_string(istring, filename, space)
  File "/home/shoma/files/pypy/pypy/interpreter/main.py", line 59, in run_string
    _run_eval_string(source, filename, space, False)
  File "/home/shoma/files/pypy/pypy/interpreter/main.py", line 48, in _run_eval_string
    retval = pycode.exec_code(space, w_globals, w_globals)
  File "/home/shoma/files/pypy/pypy/interpreter/eval.py", line 34, in exec_code
    return frame.run()
  File "/home/shoma/files/pypy/pypy/interpreter/pyframe.py", line 142, in run
    return self.execute_frame()
  File "/home/shoma/files/pypy/pypy/interpreter/pyframe.py", line 176, in execute_frame
    executioncontext)
  File "/home/shoma/files/pypy/pypy/interpreter/pyopcode.py", line 85, in dispatch
    next_instr = self.handle_bytecode(co_code, next_instr, ec)
  File "/home/shoma/files/pypy/pypy/interpreter/pyopcode.py", line 91, in handle_bytecode
    next_instr = self.dispatch_bytecode(co_code, next_instr, ec)
  File "/home/shoma/files/pypy/pypy/interpreter/pyopcode.py", line 266, in dispatch_bytecode
    res = meth(oparg, next_instr)
  File "/home/shoma/files/pypy/pypy/interpreter/pyopcode.py", line 880, in FOR_ITER
    w_nextitem = self.space.next(w_iterator)
  File "/home/shoma/files/pypy/pypy/objspace/descroperation.py", line 295, in next
    return space.get_and_call_function(w_descr, w_obj)
  File "/home/shoma/files/pypy/pypy/objspace/descroperation.py", line 142, in get_and_call_function
    return descr.funccall(w_obj, *args_w)
  File "/home/shoma/files/pypy/pypy/interpreter/function.py", line 83, in funccall
    return code.fastcall_1(self.space, self, args_w[0])
  File "/home/shoma/files/pypy/pypy/interpreter/gateway.py", line 711, in fastcall_1
    raise self.handle_exception(space, e)
  File "/home/shoma/files/pypy/pypy/interpreter/gateway.py", line 704, in fastcall_1
    w_result = self.fastfunc_1(space, w1)
  File "<124-codegen /home/shoma/files/pypy/pypy/tool/sourcetools.py:174>", line 3, in fastfunc_descr_next_1
  File "/home/shoma/files/pypy/pypy/interpreter/generator.py", line 113, in descr_next
    return self.send_ex(self.space.w_None)
  File "/home/shoma/files/pypy/pypy/interpreter/generator.py", line 76, in send_ex
    w_result = frame.execute_frame(w_arg, operr)
  File "/home/shoma/files/pypy/pypy/interpreter/pyframe.py", line 176, in execute_frame
    executioncontext)
  File "/home/shoma/files/pypy/pypy/interpreter/pyopcode.py", line 85, in dispatch
    next_instr = self.handle_bytecode(co_code, next_instr, ec)
  File "/home/shoma/files/pypy/pypy/interpreter/pyopcode.py", line 91, in handle_bytecode
    next_instr = self.dispatch_bytecode(co_code, next_instr, ec)
  File "/home/shoma/files/pypy/pypy/interpreter/pyopcode.py", line 266, in dispatch_bytecode
    res = meth(oparg, next_instr)
  File "/home/shoma/files/pypy/pypy/interpreter/pyopcode.py", line 330, in LOAD_FAST
    self.pushvalue(w_value)
  File "/home/shoma/files/pypy/pypy/interpreter/pyframe.py", line 193, in pushvalue
    self.locals_stack_w[depth] = w_object
IndexError: list assignment index out of range

Well, it ran for the time being, so I'll call that good enough.
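
One guess at the cause, not verified against PyPy: the traceback ends in pushvalue writing past the end of the frame's stack, and handle_yield_from_stmt puts ast.Yield nodes straight into statement lists without wrapping them in an expression statement. If nothing pops the value each yield leaves behind, the stack would grow by one per iteration. The CPython 3 snippet below only illustrates the general idea that a yield used as a statement is normally followed by a pop. It says nothing definite about PyPy's compiler, and opnames is a hypothetical helper.

```python
import dis

def gen_statement():
    yield 1  # yield used as a statement: the value sent back in is discarded

def opnames(func):
    """List the bytecode instruction names of a function."""
    return [ins.opname for ins in dis.get_instructions(func)]

names = opnames(gen_statement)
after_yield = names[names.index("YIELD_VALUE") + 1:]
# Check that the compiler emitted a POP_TOP somewhere after the yield.
print("POP_TOP" in after_yield)
```

If that guess is right, wrapping each ast.Yield in an expression statement node would be the first thing to try.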

Tomorrow

So that was day 21 of the PyPy Advent Calendar.

Tomorrow is surely [twitter:@aodag]-sensei's turn.