GilgaLab

3May/112

Python C API – Second Step

As promised, here is the second step in playing with the Python C API. If you haven't read the first post, I strongly suggest you do so: Python C API - First Step

This will be a much more exciting read with many new concepts. If you find something confusing or are having a hard time understanding it, please let me know and we can discuss!

For this post I will show code that does many things, such as:

  • Look for the 'sys' module in the default loaded modules
  • Add a new path to the sys.path list in the sys module
  • Import a module written by us
  • Access the members of the module that we loaded
  • Call a function from the module that we loaded
  • And more!

I hope that from this introduction you feel excited about what is coming ahead!

So, let's start looking at the module we will import:

''' Save this code in a file called example.py '''
 
def doubleValue(x):
    y = 2 * x
    return y
 
myString = "Hello World!"

And finally the code that will do all those wonderful things mentioned in the list above:

/**
 * Save the code in a file called main.c
 * Compile with one of the following:
 *    gcc -Wall main.c -o pyx `python3.2-config --includes` `python3.2-config --libs`
 *    gcc -Wall -I/usr/include/python3.2mu main.c -o pyx -lpython3.2mu -lpthread -ldl -lutil -lm
 * 
 * Run with: ./pyx
 * 
 * Use the code as you wish. If any help is needed, please let me know.
 * If you are going to republish this code somewhere, I would be grateful
 * if you can keep the credits of the code and/or a link to the original.
 *
 * Author: Henrique M. D.
 * Email: typoon at gmail dot com
 */
 
#include "Python.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* memset() */
 
int main(int argc, char** argv) {
    int i;
    char module_dir[255];
    char *name;
    char *str;
    //wchar_t *module_path;
    PyObject *poModule;
    PyObject *poDictModule;
    PyObject *poKey;
    PyObject *poValue;
    PyObject *poString;
    PyObject *poSys;
    PyObject *poList;
    PyObject *poListItem;
    PyObject *poPath;
    PyObject *poMyString;
    PyObject *poDoubleValue;
    PyObject *poResult;
    Py_ssize_t size;
 
    /* Start Part 1 */
 
    // This is the path where our .py scripts are. We will need to
    // add this path to the sys.path list later on prior to loading
    // it. If we don't do that, Python won't be able to load our script
    // as it won't find it.
    // Change this path accordingly to your setup
    memset(module_dir, 0, sizeof(module_dir));
    sprintf(module_dir, "/home/gilgamesh/codes/Pyx/scripts");
 
    Py_Initialize();
 
    poDictModule = PyImport_GetModuleDict();
 
    poString = PyUnicode_FromString("sys");
    poSys = PyDict_GetItem(poDictModule, poString);
    Py_DECREF(poString); // refcount = 0, it can be freed now
 
    if(poSys == NULL)
    {
        printf("Cannot find the sys item in the dictionary. Aborting...\n");
        return -1;
    }
 
    if(!PyModule_Check(poSys))
    {
        printf("This is not the sys module. Leaving...\n");
        return -1;
    }
 
    poDictModule = PyModule_GetDict(poSys);
    poList = PyDict_GetItemString(poDictModule, "path");
 
    if(poList == NULL || !PyList_Check(poList))
    {
        printf("This is not the path list. Aborting...\n");
        return -1;
    }
 
    printf("Path list before:\n");
    for(i = 0; i < PyList_Size(poList); i++)
    {
        poListItem = PyList_GetItem(poList, i);
        if(PyUnicode_Check(poListItem))
        {
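            // PyUnicode_AsEncodedString() returns a new reference that we must release
            PyObject *poBytes = PyUnicode_AsEncodedString(poListItem, "utf-8", "strict");
            if(poBytes != NULL)
            {
                printf("\tValue [%d] = %s\n", i, PyBytes_AsString(poBytes));
                Py_DECREF(poBytes);
            }
            else
            {
                PyErr_Print();
            }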
        }
    }
    printf("\n");
 
    poPath = PyUnicode_FromString(module_dir);
    PyList_Append(poList, poPath);
    Py_DECREF(poPath);
 
    printf("Path list after:\n");
    for(i = 0; i < PyList_Size(poList); i++)
    {
        poListItem = PyList_GetItem(poList, i);
        if(PyUnicode_Check(poListItem))
        {
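            // PyUnicode_AsEncodedString() returns a new reference that we must release
            PyObject *poBytes = PyUnicode_AsEncodedString(poListItem, "utf-8", "strict");
            if(poBytes != NULL)
            {
                printf("\tValue [%d] = %s\n", i, PyBytes_AsString(poBytes));
                Py_DECREF(poBytes);
            }
            else
            {
                PyErr_Print();
            }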
        }
    }
 
    /* End Part 1 */
 
    /* Start Part 2 */
 
    // Here we will import our script. It is called example.py
    // We just pass the name of the module without the .py as the
    // Python interpreter knows that it should look for the file
    // example.py
    poModule = PyImport_ImportModule("example");
 
    if(poModule == NULL)
    {
        printf("Error importing module\n");
        PyErr_Print();
        return -1;
    }
 
    printf("Module imported successfully!\n\n");
    poDictModule = PyModule_GetDict(poModule);
 
    size = 0;
    while(PyDict_Next(poDictModule, &size, &poKey, &poValue))
    {
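        // The encoded bytes object is a new reference; keep it until we are done with 'name'
        PyObject *poBytes = PyUnicode_AsEncodedString(poKey, "utf-8", "strict");
        if(poBytes == NULL)
        {
            PyErr_Print();
            continue;
        }
        name = PyBytes_AsString(poBytes);
        printf("Type: %s - Symbol name: %s \n", Py_TYPE(poValue)->tp_name, name);
        Py_DECREF(poBytes);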
    }
    printf("\n");
 
    // Get the objects of the symbols we defined in our script    
    poMyString = PyDict_GetItemString(poDictModule, "myString");
 
    if(poMyString)
    {
        if(PyUnicode_Check(poMyString))
        {
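            // The encoded bytes object is a new reference; release it after printing
            PyObject *poBytes = PyUnicode_AsEncodedString(poMyString, "utf-8", "strict");
            if(poBytes != NULL)
            {
                str = PyBytes_AsString(poBytes);
                printf("myString value is: %s\n", str);
                Py_DECREF(poBytes);
            }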
        }
        else
        {
            printf("myString is not a PyUnicode object. Are you messing up with the script?\n");
        }
    }
    else
    {
        printf("Symbol 'myString' not found...\n");
    }
 
    poDoubleValue = PyDict_GetItemString(poDictModule, "doubleValue");
 
    if(poDoubleValue)
    {
        if(PyFunction_Check(poDoubleValue))
        {
            poResult = PyObject_CallFunction(poDoubleValue, "i", 2);
            if(poResult != NULL)
            {
                printf("doubleValue(2) = %ld\n", PyLong_AsLong(poResult));
                Py_DECREF(poResult);
            }
            else
            {
                PyErr_Print(); // the call raised a Python exception
            }
        }
        else
        {
            printf("doubleValue is not a PyFunction object. Are you messing up with the script?\n");
        }
    }
    else
    {
        printf("Symbol 'doubleValue' not found...\n");
    }
 
    Py_Finalize();
 
    /* End Part 2 */
 
    return 0;
 
}

I will not repeat myself about things that have already been explained in the first post, so I will just give an overview on what is happening here.

This code will import a Python module (a script written by us), print the value of the variable 'myString' that is declared in the script and execute the function 'doubleValue' that (as the name suggests) takes one integer argument, multiplies it by 2 and returns the result.

In order to achieve this, we need some preparation.

First, for Python to be able to find our module, we need to provide it with the path to where our script is. This is what 'Part 1' in the code is all about.

When one wants to import a module, the Python interpreter checks all the paths in the list "sys.path" for the module to be imported. If the module is found, it is loaded; otherwise the interpreter raises an ImportError saying that the module was not found.

As our module will be in a different place from the default locations that Python searches, we need to add this path to the list of module paths.
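To see the effect of sys.path from the Python side, here is a minimal sketch. The module name pyx_demo_example and the temporary directory are assumptions made for illustration (a stand-in for example.py in your own scripts directory), not part of the original code.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module name, chosen so it does not clash with anything installed.
MODULE_NAME = "pyx_demo_example"
SOURCE = 'def doubleValue(x):\n    return 2 * x\n\nmyString = "Hello World!"\n'


def import_with_extra_path():
    """Try the import before and after adding the module's directory to sys.path."""
    # The temporary directory is left on disk; remove it yourself if you care.
    module_dir = tempfile.mkdtemp()
    with open(os.path.join(module_dir, MODULE_NAME + ".py"), "w") as f:
        f.write(SOURCE)

    try:
        importlib.import_module(MODULE_NAME)
        failed_before = False
    except ImportError:
        # Expected: the directory is not in sys.path yet
        failed_before = True

    sys.path.append(module_dir)       # the same thing Part 1 does from C
    importlib.invalidate_caches()     # make sure the new directory is scanned
    module = importlib.import_module(MODULE_NAME)
    return failed_before, module


failed_before, module = import_with_extra_path()
print(failed_before, module.myString)
```

The first import attempt fails because the directory is unknown to the interpreter, and the second one succeeds once the directory has been appended.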

    poDictModule = PyImport_GetModuleDict();
    poString = PyUnicode_FromString("sys");
    poSys = PyDict_GetItem(poDictModule, poString);
    Py_DECREF(poString); // refcount = 0, it can be freed now

First, we grab the sys.modules dictionary so we can navigate later to the sys module and grab the path list.

We then create a PyUnicode representing the string "sys" that will be used to search for the "sys" module in the dictionary.

PyDict_GetItem() takes two arguments: the dictionary we want to get an item from, and a PyObject (here a PyUnicode) holding the key to look up. It returns a PyObject* for the item associated with that key, or NULL if the key is not present. The returned pointer is a borrowed reference, so we must not Py_DECREF it.

After we have our item, we decrement the reference count of 'poString' so it can be freed by the interpreter. I will talk about refcounts in another post that is on its way, so don't worry too much about it for now.

At this point we have the module 'sys' in the poSys variable, and can finally go ahead and grab its dictionary to find the 'path' list in it.
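For comparison, the lookup we just did can be sketched in Python roughly like this. The key "no_such_module_for_demo" is a made-up name used only to show what a miss looks like.

```python
import sys
import types

modules = sys.modules                     # the dictionary PyImport_GetModuleDict() gives us
sys_module = modules.get("sys")           # like PyDict_GetItem(): no exception on a miss
missing = modules.get("no_such_module_for_demo")  # hypothetical key that does not exist

# Rough counterpart of PyModule_Check(poSys)
is_module = isinstance(sys_module, types.ModuleType)

print(sys_module is sys, is_module, missing)
```

dict.get() is used here instead of indexing because, like PyDict_GetItem(), it reports a missing key with an empty result (None in Python, NULL in C) rather than raising an exception.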

    poDictModule = PyModule_GetDict(poSys);
    poList = PyDict_GetItemString(poDictModule, "path");

This is the other way of getting an item from a dictionary. Instead of going to the trouble of creating a PyUnicode object, we can use PyDict_GetItemString() to get the item identified by a key passed as a char*.
Not much new here: we grab the dictionary of the "sys" module and then grab the "path" list from it.
I know that "sys.path" is a list because the Python documentation for the sys module says so.
So now we have the sys.path list in the 'poList' variable.
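In Python terms, and only as a rough sketch, those two C calls correspond to reading the module's __dict__ and looking up the "path" key in it:

```python
import sys

sys_dict = vars(sys)               # PyModule_GetDict(poSys): the module's __dict__
path_list = sys_dict.get("path")   # PyDict_GetItemString(poDictModule, "path")

# The dictionary entry is the very same list object as sys.path
same_object = path_list is sys.path
print(type(path_list).__name__, same_object)
```

Because the dictionary hands back the list object itself and not a copy, anything we append to it is immediately visible to the import machinery.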

    printf("Path list before:\n");
    for(i = 0; i < PyList_Size(poList); i++)
    {
        poListItem = PyList_GetItem(poList, i);
        if(PyUnicode_Check(poListItem))
        {
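            // PyUnicode_AsEncodedString() returns a new reference that we must release
            PyObject *poBytes = PyUnicode_AsEncodedString(poListItem, "utf-8", "strict");
            if(poBytes != NULL)
            {
                printf("\tValue [%d] = %s\n", i, PyBytes_AsString(poBytes));
                Py_DECREF(poBytes);
            }
            else
            {
                PyErr_Print();
            }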
        }
    }
    printf("\n");

Here we are just printing all the items in the sys.path list. Prior to this point one can see that I have made all the checks to make sure that poList is really a PyList and that we can access it as such.

The sys.path list is a list of strings, so it is okay to just go ahead and convert each item (that we retrieved using PyList_GetItem()) to a char* and then print it out.

The equivalent code in Python would be:

import sys
 
print(sys.path)
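If you want something closer to the C loop, step by step, a sketch could look like the following. The helper name describe_path_entries and the sample list are made up for this example.

```python
def describe_path_entries(paths):
    """Mirror the C loop: skip non-str items, encode the rest to UTF-8, format a line."""
    lines = []
    for i, item in enumerate(paths):
        if isinstance(item, str):                      # PyUnicode_Check()
            value = item.encode("utf-8", "strict")     # PyUnicode_AsEncodedString()
            lines.append("\tValue [%d] = %s" % (i, value.decode("utf-8")))
    return lines


# Hypothetical sample data; the non-str entry is skipped just like in the C code
sample = ["/usr/lib/python3", 42, "/home/user/scripts"]
for line in describe_path_entries(sample):
    print(line)
```

Passing sys.path instead of the sample list gives the same kind of listing as the C program prints.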

Now we want to add the directory where our module lives:

    poPath = PyUnicode_FromString(module_dir);
    PyList_Append(poList, poPath);
    Py_DECREF(poPath);

As we are working with a list of PyUnicode (strings), we need to create a PyObject that represents our path as a PyUnicode, so again we are using PyUnicode_FromString() for this.

After that, we simply append the new path to the list, and again decrement the refcount of the newly created PyObject. PyList_Append() takes its own reference to the item, so the object stays alive for as long as the list holds it and is cleaned up by the interpreter when the time comes.
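You can watch the list take its own reference from Python. This is a small sketch that relies on sys.getrefcount(), which is specific to CPython; the path used is a made-up value.

```python
import sys


def refcount_change_on_append():
    """Return the refcount of a string before and after appending it to a list."""
    # Build the string at runtime so it is an ordinary, freshly created object
    new_path = "".join(["/hypothetical/", "scripts/", "dir"])
    paths = []
    before = sys.getrefcount(new_path)
    paths.append(new_path)              # what PyList_Append(poList, poPath) does
    after = sys.getrefcount(new_path)
    return before, after, paths


before, after, paths = refcount_change_on_append()
print(after - before)
```

The absolute numbers are an implementation detail, but the difference shows that the list now holds one reference of its own, which is why the C code can safely drop the one it created.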

    printf("Path list after:\n");
    for(i = 0; i < PyList_Size(poList); i++)
    {
        poListItem = PyList_GetItem(poList, i);
        if(PyUnicode_Check(poListItem))
        {
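            // PyUnicode_AsEncodedString() returns a new reference that we must release
            PyObject *poBytes = PyUnicode_AsEncodedString(poListItem, "utf-8", "strict");
            if(poBytes != NULL)
            {
                printf("\tValue [%d] = %s\n", i, PyBytes_AsString(poBytes));
                Py_DECREF(poBytes);
            }
            else
            {
                PyErr_Print();
            }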
        }
    }

Print the list again, to prove that our path is now in the sys.path list! :D

Great! Up to this point we have seen some new functions, learned some cool stuff about handling a list, and practiced handling dictionaries again.

What we want to do now is load our module, print all its dictionary entries, access its 'myString' member and call its 'doubleValue' function.
All these things are identified in the code as Part 2.

    // Here we will import our script. It is called example.py
    // We just pass the name of the module without the .py as the
    // Python interpreter knows that it should look for the file
    // example.py
    poModule = PyImport_ImportModule("example");

I guess the comment in this part says it all. Just to make sure, the script presented at the beginning of this post should be saved in a file called 'example.py' for this to work. If you have saved it in a file with a different name, just change the module name here :)

    printf("Module imported successfully!\n\n");
    poDictModule = PyModule_GetDict(poModule);
 
    size = 0;
    while(PyDict_Next(poDictModule, &size, &poKey, &poValue))
    {
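        // The encoded bytes object is a new reference; keep it until we are done with 'name'
        PyObject *poBytes = PyUnicode_AsEncodedString(poKey, "utf-8", "strict");
        if(poBytes == NULL)
        {
            PyErr_Print();
            continue;
        }
        name = PyBytes_AsString(poBytes);
        printf("Type: %s - Symbol name: %s \n", Py_TYPE(poValue)->tp_name, name);
        Py_DECREF(poBytes);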
    }
    printf("\n");

The module was successfully imported, so we get its dictionary and print everything in it!
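A Python sketch of the same walk over the module dictionary is shown below. To keep it self-contained, it builds the module from the example source in memory instead of importing a file from disk; that shortcut is an assumption of this sketch, not what the C code does.

```python
import types

SOURCE = '''
def doubleValue(x):
    y = 2 * x
    return y

myString = "Hello World!"
'''

# Build a module object named "example" from the source text
module = types.ModuleType("example")
exec(SOURCE, module.__dict__)

# Same idea as the PyDict_Next() loop: every key with the type name of its value
symbols = {name: type(value).__name__ for name, value in vars(module).items()}
for name, type_name in symbols.items():
    print("Type: %s - Symbol name: %s" % (type_name, name))
```

Besides our own doubleValue and myString, the listing also contains entries such as __name__ that the interpreter puts in every module.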

    poMyString = PyDict_GetItemString(poDictModule, "myString");
 
    if(poMyString)
    {
        if(PyUnicode_Check(poMyString))
        {
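            // The encoded bytes object is a new reference; release it after printing
            PyObject *poBytes = PyUnicode_AsEncodedString(poMyString, "utf-8", "strict");
            if(poBytes != NULL)
            {
                str = PyBytes_AsString(poBytes);
                printf("myString value is: %s\n", str);
                Py_DECREF(poBytes);
            }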
        }
        else
        {
            printf("myString is not a PyUnicode object. Are you messing up with the script?\n");
        }
    }
    else
    {
        printf("Symbol 'myString' not found...\n");
    }

We retrieve the object that refers to the 'myString' variable in our Python code. If the symbol 'myString' is not present in the dictionary, the function PyDict_GetItemString() will return NULL, so that is why we check to see if there is anything in the poMyString variable. After that, we just want to make sure it is a PyUnicode object, so we can print its value. The printing part is pretty much the same thing we have been doing for every other PyUnicode.

    poDoubleValue = PyDict_GetItemString(poDictModule, "doubleValue");
 
    if(poDoubleValue)
    {
        if(PyFunction_Check(poDoubleValue))
        {
            poResult = PyObject_CallFunction(poDoubleValue, "i", 2);
            if(poResult != NULL)
            {
                printf("doubleValue(2) = %ld\n", PyLong_AsLong(poResult));
                Py_DECREF(poResult);
            }
            else
            {
                PyErr_Print(); // the call raised a Python exception
            }
        }
        else
        {
            printf("doubleValue is not a PyFunction object. Are you messing up with the script?\n");
        }
    }
    else
    {
        printf("Symbol 'doubleValue' not found...\n");
    }

Now we retrieve the object that refers to the function 'doubleValue' that we declared in the script. So far, this is pretty much the same thing we have been doing all along. After that we check to make sure that what we got is really a function, and then comes the interesting part!
PyObject_CallFunction() is one of the many ways to call a function. Let's have a look at its prototype:

PyObject* PyObject_CallFunction(PyObject *callable, char *format, ...)

Let's talk about the arguments first.
callable is the PyObject that represents the function we want to call.

format works much like the 'format' parameter of printf(): it describes the arguments that come right after it, although the characters used to represent each type are different. For our code, we have PyObject_CallFunction(poDoubleValue, "i", 2). Here the parameter right after the format is taken as a C int, because the format string is "i". If the format string were "iii", we would need to pass three ints, like this: PyObject_CallFunction(someFunction, "iii", 2, 3, 4); where someFunction stands for some Python function that accepts three arguments.

... stands for all the other parameters that are needed, depending on the 'format' string.

The "format" string must describe as many arguments as the Python function expects.
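To illustrate why the number of arguments matters, here is a small Python sketch. The helper call_with is made up for this example; it simply forwards the arguments the way the format string and the values after it would.

```python
def doubleValue(x):
    y = 2 * x
    return y


def call_with(func, *args):
    """Call func(*args) and report either the result or the kind of error raised."""
    try:
        return ("ok", func(*args))
    except TypeError as exc:
        return ("error", type(exc).__name__)


one_arg = call_with(doubleValue, 2)            # like format "i" followed by 2
three_args = call_with(doubleValue, 2, 3, 4)   # like format "iii" followed by 2, 3, 4
print(one_arg, three_args)
```

From C, the second case shows up as PyObject_CallFunction() returning NULL with a TypeError set, which is a good reason to check the result before using it.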

After calling the function, we simply print the result. Integers in Python 3 are PyLong objects, so we convert the result to a C long with PyLong_AsLong().
We then decrement its reference count for cleanup.

After all this, we just call Py_Finalize() to shut down the interpreter, and we are done!

References:

Python/C API Reference Manual

Posted by Henrique

Comments (2) Trackbacks (0)
  1. … swig for python c (http://www.swig.org/) …

  2. Thanks! I will have a look into that.
    I am doing this using the Python C API mostly because of the fun and the opportunity to learn a little bit about how Python works internally ;D

