GLib.MarkupParseContext

record (struct)

A parse context is used to parse a stream of bytes that you expect to contain marked-up text.

See MarkupParseContext.new and MarkupParser for more details.

Constructors

new

@classmethod
def new(cls, parser: MarkupParser, flags: MarkupParseFlags | int, user_data: int | None, user_data_dnotify: DestroyNotify) -> MarkupParseContext

Creates a new parse context. A parse context is used to parse marked-up documents. You can feed any number of documents into a context, as long as no errors occur; once an error occurs, the parse context can't continue to parse text (you have to free it and create a new parse context).

Parameters:

  • parser — a MarkupParser
  • flags — one or more MarkupParseFlags
  • user_data — user data to pass to MarkupParser functions
  • user_data_dnotify — user data destroy notifier called when the parse context is freed

Methods

end_parse

def end_parse(self) -> bool

Signals to the MarkupParseContext that all data has been fed into the parse context with MarkupParseContext.parse.

This function reports an error if the document isn't complete, for example if elements are still open.

free

def free(self) -> None

Frees a MarkupParseContext.

This function can't be called from inside one of the MarkupParser functions or while a subparser is pushed.

get_element

def get_element(self) -> str

Retrieves the name of the currently open element.

If called from the start_element or end_element handlers this will give the element_name as passed to those functions. For the parent elements, see MarkupParseContext.get_element_stack.

get_element_stack

def get_element_stack(self) -> list[str]

Retrieves the element stack from the internal state of the parser.

The returned list contains strings: the first item is the currently open tag (as would be returned by MarkupParseContext.get_element), the next item is its immediate parent, and so on outward.

This function is intended to be used in the start_element and end_element handlers where MarkupParseContext.get_element would merely return the name of the element that is being processed.
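The ordering described above (innermost element first, then its ancestors) can be sketched with a small toy model. This is only an illustration in plain Python, not GLib code: the class ElementStackModel and its start and end methods are hypothetical names, and returning None when no element is open is an assumption of the model.

```python
from typing import Optional


class ElementStackModel:
    """Toy model of the element-stack ordering (hypothetical, not GLib code)."""

    def __init__(self) -> None:
        # Innermost (currently open) element is kept at index 0.
        self._open = []

    def start(self, name: str) -> None:
        # Opening an element makes it the new first item.
        self._open.insert(0, name)

    def end(self) -> str:
        # Closing removes the innermost element and returns its name.
        return self._open.pop(0)

    def get_element(self) -> Optional[str]:
        # Assumption of this model: None when nothing is open.
        return self._open[0] if self._open else None

    def get_element_stack(self) -> list:
        # Return a copy so callers cannot modify the model's state.
        return list(self._open)
```

With doc, section and item opened in that order, the model reports item as the current element and item, section, doc as the stack, so the immediate parent is always at index 1.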

get_offset

def get_offset(self) -> int

Retrieves the current offset from the beginning of the document, in bytes.

The information is meant to accompany the values returned by MarkupParseContext.get_position, and comes with the same accuracy guarantees.

get_position

def get_position(self) -> tuple[int, int]

Retrieves the current line number and the number of the character on that line. Intended for use in error messages; there are no strict semantics for what constitutes the "current" line number other than "the best number we could come up with for error messages."

get_tag_start

def get_tag_start(self) -> tuple[int, int, int]

Retrieves the start position of the current start or end tag.

This function can be used in the start_element or end_element callbacks to obtain location information for error reporting.

Note that line_number and char_number are intended for human readable error messages and are therefore 1-based and in Unicode characters. offset on the other hand is meant for programmatic use, and thus is 0-based and in bytes.

The information is meant to accompany the values returned by MarkupParseContext.get_position, and comes with the same accuracy guarantees.
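To make the difference between the two kinds of units concrete, here is a minimal sketch in plain Python. The helper locate is hypothetical and is not how GLib computes positions; it only shows that a 1-based line and character position counted in Unicode characters can diverge from a 0-based offset counted in UTF-8 bytes as soon as the text contains non-ASCII characters.

```python
def locate(document: str, char_index: int) -> tuple:
    """Return (line_number, char_number, offset) for a character index.

    Hypothetical helper for illustration only:
    - line_number and char_number are 1-based and count Unicode characters
    - offset is 0-based and counts UTF-8 bytes
    """
    before = document[:char_index]
    line_number = before.count("\n") + 1
    # Characters since the last newline, plus one for the 1-based position.
    char_number = len(before) - (before.rfind("\n") + 1) + 1
    offset = len(before.encode("utf-8"))
    return line_number, char_number, offset
```

For the document "<a>\n  é<b/>\n</a>", the tag <b/> starts at line 2, character 4, but at byte offset 8, because é occupies two bytes in UTF-8.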

get_user_data

def get_user_data(self) -> int | None

Returns the user_data associated with context.

This will either be the user_data that was provided to MarkupParseContext.new or to the most recent call of MarkupParseContext.push.

parse

def parse(self, text: str, text_len: int) -> bool

Feeds some data to the MarkupParseContext.

The data need not be valid UTF-8; if it is invalid, an error is signaled. The data need not be an entire document either: you can feed a document into the parser incrementally, through multiple calls to this function. Typically, as you receive data from a network connection or file, you feed each received chunk into this function and abort if an error occurs. All errors are fatal: once an error is reported, no further data may be fed to the MarkupParseContext.

Parameters:

  • text — chunk of text to parse
  • text_len — length of text in bytes
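The feed-in-chunks pattern described above can be sketched as follows. Note that this sketch does not use GLib at all: it uses Python's standard xml.parsers.expat module as a stand-in, with Parse(chunk, False) playing the role of parse and a final Parse(b"", True) playing the role of end_parse. The function parse_in_chunks is a hypothetical name, and error reporting in GLib works differently from the ExpatError exception caught here.

```python
import xml.parsers.expat


def parse_in_chunks(chunks):
    """Feed byte chunks one at a time; return (ok, element_names).

    Stand-in illustration using expat, not GLib's markup parser.
    """
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # Record every start tag, in document order.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    try:
        for chunk in chunks:
            # Comparable to parse(): more data may follow.
            parser.Parse(chunk, False)
        # Comparable to end_parse(): no more data; incomplete documents fail.
        parser.Parse(b"", True)
    except xml.parsers.expat.ExpatError:
        # Errors are fatal: stop feeding and discard the parser.
        return False, names
    return True, names
```

A chunk boundary may fall in the middle of a tag, as in [b"<doc><it", b"em/></doc>"], and the document still parses. Feeding only b"<doc><item>" succeeds chunk by chunk but fails at the final step, because elements are still open.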

pop

def pop(self) -> int | None

Completes the process of a temporary sub-parser redirection.

This function exists to collect the user_data allocated by a matching call to MarkupParseContext.push. It must be called in the end_element handler corresponding to the start_element handler during which MarkupParseContext.push was called. You must not call this function from the error callback; in that case the user_data is provided directly to the callback.

This function is not intended to be directly called by users interested in invoking subparsers. Instead, it is intended to be used by the subparsers themselves to implement a higher-level interface.

push

def push(self, parser: MarkupParser, user_data: int | None = ...) -> None

Temporarily redirects markup data to a sub-parser.

This function may only be called from the start_element handler of a MarkupParser. It must be matched with a corresponding call to MarkupParseContext.pop in the matching end_element handler (except in the case that the parser aborts due to an error).

All tags, text and other data between the matching tags is redirected to the subparser given by parser. user_data is used as the user_data for that parser. user_data is also passed to the error callback in the event that an error occurs. This includes errors that occur in subparsers of the subparser.

The end tag matching the start tag for which this call was made is handled by the previous parser, which is given its own user_data. This is why MarkupParseContext.pop is provided: it allows "one last access" to the user_data provided to this function. In the case of an error, the user_data provided here is passed directly to the error callback of the subparser, and MarkupParseContext.pop should not be called. If user_data was allocated, it therefore ought to be freed in both places: after MarkupParseContext.pop on the normal path, and in the error callback on the error path.

This function is not intended to be directly called by users interested in invoking subparsers. Instead, it is intended to be used by the subparsers themselves to implement a higher-level interface.

As an example, see the following implementation of a simple parser that counts the number of tags encountered.

typedef struct
{
  gint tag_count;
} CounterData;

static void
counter_start_element (GMarkupParseContext  *context,
                       const gchar          *element_name,
                       const gchar         **attribute_names,
                       const gchar         **attribute_values,
                       gpointer              user_data,
                       GError              **error)
{
  CounterData *data = user_data;

  data->tag_count++;
}

static void
counter_error (GMarkupParseContext *context,
               GError              *error,
               gpointer             user_data)
{
  CounterData *data = user_data;

  g_slice_free (CounterData, data);
}

static GMarkupParser counter_subparser =
{
  counter_start_element,
  NULL,
  NULL,
  NULL,
  counter_error
};

In order to allow this parser to be easily used as a subparser, the following interface is provided:

void
start_counting (GMarkupParseContext *context)
{
  CounterData *data = g_slice_new (CounterData);

  data->tag_count = 0;
  g_markup_parse_context_push (context, &counter_subparser, data);
}

gint
end_counting (GMarkupParseContext *context)
{
  CounterData *data = g_markup_parse_context_pop (context);
  int result;

  result = data->tag_count;
  g_slice_free (CounterData, data);

  return result;
}

The subparser would then be used as follows:

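static void
start_element (GMarkupParseContext  *context,
               const gchar          *element_name,
               const gchar         **attribute_names,
               const gchar         **attribute_values,
               gpointer              user_data,
               GError              **error)
{
  // Redirect everything inside <count-these> to the counting subparser.
  if (strcmp (element_name, "count-these") == 0)
    start_counting (context);

  // else, handle other tags...
}

static void
end_element (GMarkupParseContext  *context,
             const gchar          *element_name,
             gpointer              user_data,
             GError              **error)
{
  // The matching end tag arrives here, so collect the result now.
  if (strcmp (element_name, "count-these") == 0)
    g_print ("Counted %d tags\n", end_counting (context));

  // else, handle other tags...
}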

ref

def ref(self) -> MarkupParseContext

Increases the reference count of context.

unref

def unref(self) -> None

Decreases the reference count of context. When its reference count drops to 0, it is freed.