I just filed a bug to request a review for merging my invoke-rewrite branch. This is a huge overhaul of the internals of the introspection layer of PyGObject which I feel both speeds up calls and makes it easier to fix bugs and add features.
Right now the invoke-rewrite branch is stable and complete enough to be merged to master. It passes all the tests and even cleans up after itself when an error is raised while processing input parameters (something the current code doesn’t do).
A brief list of the advantages of this branch
- over 2 times faster running through the test suite than the current implementation
- more robust – handles a couple of corner cases which the current code does not, such as multiple lists pointing to the same aux length value, argument 0 being a length value, and correctly blocking all NULL values which are not marked as allow-none
- correctly handles argument parameter counting and reduces the amount of complicated math used to determine argument counts and indexes
- splits code into logical layers such as caching, invoking, marshalling and cleanup layers
- splits code into manageable units so, for example, if there is an error in list processing it is easier to set break points and focus on the list code instead of wading through large switch statements
- ripe for even more optimizations and better layout for profiling
Things that are yet to be done
- Memory management still leaks and some types such as gclosures are not yet cleaned
- Some of the earlier code like struct marshalling can be split up further
- signals and closures still use the old marshalling paths
- remove the FFI/GArgument layer and marshal directly to FFI
Why merge now?
Code rots, and getting more people to use the new code path makes it easier to fix bugs. It is already at feature parity with the old code and simply trades a few new bugs for fixes to a slew of old ones. Because of the caching layer the code is somewhat more complex, but it follows a much more consistent pattern. In my eyes it is easier to understand once learned, but developers will need ramp-up time since the structure is different. This is the future; there is no need to keep bandaging the old invoke code. As we hit more and more corner cases it gets harder to patch the old code without breaking other pieces or causing memory leaks that are hard to track down and plug (see the garray hack we added to handle C Array vs GArray failures).
A brief overview of the design
New modules
- pygi-invoke-ng – the new invoke API handles setup/invoking/teardown at a high level. Replaces pygi-invoke and does not attempt to handle memory it did not itself create (see garray handling for example)
- pygi-invoke-state-structure – the state struct used to pass around changing state during the caching, marshalling, invoking and cleanup stages
- pygi-cache – sets up the cache, which normalizes the introspection information and stores the support methods (such as marshalling and cleanup) used for each interface when it is called. This allows us to process method calls in a more efficient manner and makes debugging easier
- pygi-marshal – marshalling functions for each parameter type. Replaces pygi-arguments
- pygi-marshal-cleanup – cleanup functions for each parameter type. Includes cleanup for parameters which were not processed due to an error.
Basic flow
invoke is called on a GICallableInfo object
|
|-> if GICallableInfo does not have a cache we loop over the interface
|   creating the PyGICallableCache and ArgCaches for the interface
|   and its arguments
|
|----> use the cache to marshal the parameters into the state struct
|
|---------> using the marshalled values invoke the interface using the
|           GI FFI calls
|
|----> use the cache to cleanup any in values that need cleaning
|
|---------> use the cache to marshal the out values to python objects
|
|----> use the cache to cleanup any out values that need cleaning
|
|-> cleanup state
|
return out values to python app
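As a rough illustration of that ordering, here is a minimal C sketch that only records which stage runs when. Every name in it (SketchCallableCache, SketchInvokeState, sketch_invoke, the fail_in_marshal switch) is hypothetical and not taken from the branch, and the error path is my assumption about what "cleans up after itself" looks like, not a copy of the real code.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-callable cache. */
typedef struct {
    int built;        /* non-zero once the cache has been created */
    int times_built;  /* how often the cache was (re)built */
} SketchCallableCache;

/* Hypothetical stand-in for the invoke state struct. */
typedef struct {
    char trace[128];      /* records the order in which stages ran */
    int  fail_in_marshal; /* test switch: simulate a bad input parameter */
} SketchInvokeState;

/* Append "stage;" to the trace without overrunning the buffer. */
static void sketch_trace(SketchInvokeState *state, const char *stage)
{
    size_t used = strlen(state->trace);
    snprintf(state->trace + used, sizeof state->trace - used, "%s;", stage);
}

/* Returns 1 on success, 0 if marshalling the in values failed. */
int sketch_invoke(SketchCallableCache *cache, SketchInvokeState *state)
{
    state->trace[0] = '\0';

    /* The cache is built only the first time the callable is invoked. */
    if (!cache->built) {
        cache->built = 1;
        cache->times_built++;
        sketch_trace(state, "build_cache");
    }

    sketch_trace(state, "marshal_in");
    if (state->fail_in_marshal) {
        /* Assumed error path: release what was marshalled, then the state. */
        sketch_trace(state, "cleanup_in");
        sketch_trace(state, "cleanup_state");
        return 0;
    }

    sketch_trace(state, "invoke");
    sketch_trace(state, "cleanup_in");
    sketch_trace(state, "marshal_out");
    sketch_trace(state, "cleanup_out");
    sketch_trace(state, "cleanup_state");
    return 1;
}
```

Calling sketch_invoke twice on the same SketchCallableCache shows the cache being built only on the first call, and setting fail_in_marshal shows the call stopping before the invoke stage while still running cleanup.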
Standard patterns
All marshalling and cleanup methods for types are added to the argument cache as function pointers. A PyGIArgCache has these four function pointers no matter what type we are caching for:
- in_marshaller – the function which marshals in values for this argument
- out_marshaller – the function which marshals out values for this argument
- in_cleanup – the function that cleans up in values for this argument
- out_cleanup – the function that cleans up out and in/out values for this argument
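To make the shape of such a cache entry concrete, here is a minimal C sketch. Only the four field names come from the list above; the Sketch* types, the function signatures and the use of a plain C string in place of a Python object are assumptions made for illustration, and the real PyGIArgCache carries more than this.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for GIArgument; only the fields this sketch needs. */
typedef union {
    long  v_long;
    char *v_string;
} SketchArgument;

typedef struct SketchArgCache SketchArgCache;

/* Assumed signatures: in marshallers return 1 on success, 0 on failure.
 * A plain C string stands in for the Python object. */
typedef int (*SketchInMarshaller)(SketchArgCache *cache,
                                  const char *py_value,
                                  SketchArgument *arg);
typedef const char *(*SketchOutMarshaller)(SketchArgCache *cache,
                                           const SketchArgument *arg);
typedef void (*SketchCleanup)(SketchArgCache *cache, SketchArgument *arg);

/* The four function pointers every argument cache carries. */
struct SketchArgCache {
    SketchInMarshaller  in_marshaller;
    SketchOutMarshaller out_marshaller;
    SketchCleanup       in_cleanup;
    SketchCleanup       out_cleanup;
};

/* In marshaller for a string: copies the value so the callee gets its own
 * buffer. Rejects NULL, as a stand-in for a missing allow-none annotation. */
int sketch_marshal_in_utf8(SketchArgCache *cache, const char *py_value,
                           SketchArgument *arg)
{
    size_t len;

    (void)cache;
    if (py_value == NULL)
        return 0;

    len = strlen(py_value) + 1;
    arg->v_string = malloc(len);
    if (arg->v_string == NULL)
        return 0;
    memcpy(arg->v_string, py_value, len);
    return 1;
}

/* Matching in cleanup: frees the copy made by the in marshaller. */
void sketch_marshal_cleanup_in_utf8(SketchArgCache *cache, SketchArgument *arg)
{
    (void)cache;
    free(arg->v_string);
    arg->v_string = NULL;
}
```

Because the caller only ever goes through cache->in_marshaller and cache->in_cleanup, the invoke code does not need a switch on the argument type; an entry with nothing to free can simply leave its cleanup pointer NULL.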
All type marshallers and cleanup methods follow this naming pattern:
- in marshallers – _pygi_marshal_in_<type> (e.g. _pygi_marshal_in_utf8 or _pygi_marshal_in_interface_enum)
- out marshallers – _pygi_marshal_out_<type> (e.g. _pygi_marshal_out_utf8 or _pygi_marshal_out_interface_enum)
- in_cleanup – _pygi_marshal_cleanup_in_<type> (e.g. _pygi_marshal_cleanup_in_utf8)
- out_cleanup – _pygi_marshal_cleanup_out_<type> (e.g. _pygi_marshal_cleanup_out_utf8)
These are all set up during the caching phase in functions with similar names (e.g. _arg_cache_in_utf8_setup)
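A setup function in this style might look roughly like the sketch below. It redeclares a cut-down cache type so it stands alone, and everything prefixed with sketch_ or Sketch is hypothetical: the type tags, the dispatcher and the placeholder marshallers only mirror the naming pattern and are not the branch's actual functions.

```c
#include <stddef.h>

/* Hypothetical type tags; the real code works from GI type information. */
typedef enum {
    SKETCH_TYPE_UTF8,
    SKETCH_TYPE_INT32,
    SKETCH_TYPE_UNSUPPORTED
} SketchTypeTag;

typedef struct SketchArgCache SketchArgCache;
typedef int  (*SketchMarshalFunc)(SketchArgCache *cache, void *value);
typedef void (*SketchCleanupFunc)(SketchArgCache *cache, void *value);

/* Cut-down cache entry: only the in direction is shown. */
struct SketchArgCache {
    SketchTypeTag     type_tag;
    SketchMarshalFunc in_marshaller;
    SketchCleanupFunc in_cleanup;
};

/* Placeholder bodies; a real marshaller would convert the value. */
int sketch_marshal_in_utf8(SketchArgCache *cache, void *value)
{
    (void)cache; (void)value;
    return 1;
}

void sketch_marshal_cleanup_in_utf8(SketchArgCache *cache, void *value)
{
    (void)cache; (void)value;
}

int sketch_marshal_in_int32(SketchArgCache *cache, void *value)
{
    (void)cache; (void)value;
    return 1;
}

/* One setup function per type, named after the marshaller it installs. */
void sketch_arg_cache_in_utf8_setup(SketchArgCache *cache)
{
    cache->in_marshaller = sketch_marshal_in_utf8;
    cache->in_cleanup    = sketch_marshal_cleanup_in_utf8;
}

void sketch_arg_cache_in_int32_setup(SketchArgCache *cache)
{
    cache->in_marshaller = sketch_marshal_in_int32;
    cache->in_cleanup    = NULL; /* plain integers need no cleanup */
}

/* Caching phase: pick the setup function once, so later calls never
 * have to inspect the type again. Returns 0 for unsupported types. */
int sketch_arg_cache_in_setup(SketchArgCache *cache, SketchTypeTag tag)
{
    cache->type_tag = tag;
    switch (tag) {
    case SKETCH_TYPE_UTF8:
        sketch_arg_cache_in_utf8_setup(cache);
        return 1;
    case SKETCH_TYPE_INT32:
        sketch_arg_cache_in_int32_setup(cache);
        return 1;
    default:
        cache->in_marshaller = NULL;
        cache->in_cleanup    = NULL;
        return 0;
    }
}
```

The type switch runs once per argument while the cache is built, which is where the per-call saving comes from: every later invoke just follows the stored pointers.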