Parser

Parser — RDF parsers - from a syntax to RDF triples

Synopsis

typedef             raptor_parser;
raptor_parser *     raptor_new_parser                   (raptor_world *world,
                                                         const char *name);
raptor_parser *     raptor_new_parser_for_content       (raptor_world *world,
                                                         raptor_uri *uri,
                                                         const char *mime_type,
                                                         const unsigned char *buffer,
                                                         size_t len,
                                                         const unsigned char *identifier);
void                raptor_free_parser                  (raptor_parser *parser);
void                (*raptor_graph_mark_handler)        (void *user_data,
                                                         raptor_uri *graph,
                                                         int flags);
void                (*raptor_namespace_handler)         (void *user_data,
                                                         raptor_namespace *nspace);
void                raptor_parser_set_statement_handler (raptor_parser *parser,
                                                         void *user_data,
                                                         raptor_statement_handler handler);
enum                raptor_graph_mark_flags;
void                raptor_parser_set_graph_mark_handler
                                                        (raptor_parser *parser,
                                                         void *user_data,
                                                         raptor_graph_mark_handler handler);
void                raptor_parser_set_namespace_handler (raptor_parser *parser,
                                                         void *user_data,
                                                         raptor_namespace_handler handler);
const raptor_syntax_description * raptor_parser_get_description
                                                        (raptor_parser *rdf_parser);
raptor_locator *    raptor_parser_get_locator           (raptor_parser *rdf_parser);
void                raptor_parser_parse_abort           (raptor_parser *rdf_parser);
int                 raptor_parser_parse_chunk           (raptor_parser *rdf_parser,
                                                         const unsigned char *buffer,
                                                         size_t len,
                                                         int is_end);
int                 raptor_parser_parse_file            (raptor_parser *rdf_parser,
                                                         raptor_uri *uri,
                                                         raptor_uri *base_uri);
int                 raptor_parser_parse_file_stream     (raptor_parser *rdf_parser,
                                                         FILE *stream,
                                                         const char *filename,
                                                         raptor_uri *base_uri);
int                 raptor_parser_parse_iostream        (raptor_parser *rdf_parser,
                                                         raptor_iostream *iostr,
                                                         raptor_uri *base_uri);
int                 raptor_parser_parse_start           (raptor_parser *rdf_parser,
                                                         raptor_uri *uri);
int                 raptor_parser_parse_uri             (raptor_parser *rdf_parser,
                                                         raptor_uri *uri,
                                                         raptor_uri *base_uri);
int                 raptor_parser_parse_uri_with_connection
                                                        (raptor_parser *rdf_parser,
                                                         raptor_uri *uri,
                                                         raptor_uri *base_uri,
                                                         void *connection);
raptor_uri *        raptor_parser_get_graph             (raptor_parser *rdf_parser);
const char *        raptor_parser_get_name              (raptor_parser *rdf_parser);
int                 raptor_parser_set_option            (raptor_parser *parser,
                                                         raptor_option option,
                                                         const char *string,
                                                         int integer);
int                 raptor_parser_get_option            (raptor_parser *parser,
                                                         raptor_option option,
                                                         char **string_p,
                                                         int *integer_p);
const char *        raptor_parser_get_accept_header     (raptor_parser *rdf_parser);
void                raptor_parser_set_uri_filter        (raptor_parser *parser,
                                                         raptor_uri_filter_func filter,
                                                         void *user_data);
raptor_world *      raptor_parser_get_world             (raptor_parser *rdf_parser);

Description

The parser class creates a parser for a particular syntax, either named explicitly or guessed from contextual information. As chunks of syntax data are passed into the parser, it generates RDF triples on demand and delivers them to a handler function. Parsing can be done from strings in memory, from files or from URIs on the web.

There are also methods for handling errors, warnings and returned triples, and for setting options (features) that adjust how parsing is performed.

Details

raptor_parser

raptor_parser* raptor_parser;

Raptor Parser class


raptor_new_parser ()

raptor_parser *     raptor_new_parser                   (raptor_world *world,
                                                         const char *name);

Constructor - create a new raptor_parser object.

world :

world object

name :

the parser name or NULL for default parser

Returns :

a new raptor_parser object or NULL on failure

raptor_new_parser_for_content ()

raptor_parser *     raptor_new_parser_for_content       (raptor_world *world,
                                                         raptor_uri *uri,
                                                         const char *mime_type,
                                                         const unsigned char *buffer,
                                                         size_t len,
                                                         const unsigned char *identifier);

Constructor - create a new raptor_parser.

Uses raptor_world_guess_parser_name() to find a parser by scoring recognition of the syntax by a block of characters, the content identifier or a mime type. The content identifier is typically a filename or URI or some other identifier.

world :

world object

uri :

URI identifying the syntax (or NULL)

mime_type :

mime type identifying the content (or NULL)

buffer :

buffer of content to guess (or NULL)

len :

length of buffer

identifier :

identifier of content (or NULL)

Returns :

a new raptor_parser object or NULL on failure

raptor_free_parser ()

void                raptor_free_parser                  (raptor_parser *parser);

Destructor - destroy a raptor_parser object.

parser :

raptor_parser object

raptor_graph_mark_handler ()

void                (*raptor_graph_mark_handler)        (void *user_data,
                                                         raptor_uri *graph,
                                                         int flags);

Graph start/end mark handler function.

Reports the start and end of graphs within the stream of raptor_statement objects delivered via the statement handler. The mark is the start of a graph when flags has the RAPTOR_GRAPH_MARK_START bit set; otherwise it is the end of a graph.

The start and end may be declared in the syntax via some keyword or mechanism, such as the TRiG {} syntax, in which case flags has the RAPTOR_GRAPH_MARK_DECLARED bit set. In other syntaxes they are implied by the start and end of the data, and the bit is unset.

user_data :

user data

graph :

graph to report, NULL for the default graph

flags :

bitmask of raptor_graph_mark_flags flags

raptor_namespace_handler ()

void                (*raptor_namespace_handler)         (void *user_data,
                                                         raptor_namespace *nspace);

XML Namespace declaration reporting handler set by raptor_parser_set_namespace_handler().

user_data :

user data

nspace :

raptor_namespace declared

raptor_parser_set_statement_handler ()

void                raptor_parser_set_statement_handler (raptor_parser *parser,
                                                         void *user_data,
                                                         raptor_statement_handler handler);

Set the statement handler function for the parser.

Use this to set the function that receives statements as parsing proceeds. The statement argument to handler is shared, so a handler that needs to keep it must copy it with raptor_statement_copy().

parser :

raptor_parser parser object

user_data :

user data pointer for callback

handler :

new statement callback function

enum raptor_graph_mark_flags

typedef enum {
  RAPTOR_GRAPH_MARK_START = 1,
  RAPTOR_GRAPH_MARK_DECLARED = 2
} raptor_graph_mark_flags;

Graph mark handler bitmask flags

RAPTOR_GRAPH_MARK_START

mark is start of graph (otherwise is end)

RAPTOR_GRAPH_MARK_DECLARED

mark was declared in syntax rather than implicit

raptor_parser_set_graph_mark_handler ()

void                raptor_parser_set_graph_mark_handler
                                                        (raptor_parser *parser,
                                                         void *user_data,
                                                         raptor_graph_mark_handler handler);

Set the graph mark handler function for the parser.

See raptor_graph_mark_handler and raptor_graph_mark_flags for the marks that may be passed to the handler.

parser :

raptor_parser parser object

user_data :

user data pointer for callback

handler :

new graph callback function

raptor_parser_set_namespace_handler ()

void                raptor_parser_set_namespace_handler (raptor_parser *parser,
                                                         void *user_data,
                                                         raptor_namespace_handler handler);

Set the namespace handler function for the parser.

When a prefix/namespace declaration is seen by the parser, the given handler is called with the raptor_namespace that was declared, which carries the prefix string and the raptor_uri namespace URI. Either can be NULL for the default prefix or default namespace.

The handler function does not deal with duplicates so any namespace may be declared multiple times.

parser :

raptor_parser parser object

user_data :

user data pointer for callback

handler :

new namespace callback function

raptor_parser_get_description ()

const raptor_syntax_description * raptor_parser_get_description
                                                        (raptor_parser *rdf_parser);

Get description of the syntaxes of the parser.

The returned description is static and lives as long as the raptor library (raptor world).

rdf_parser :

raptor_parser parser object

Returns :

description of syntax

raptor_parser_get_locator ()

raptor_locator *    raptor_parser_get_locator           (raptor_parser *rdf_parser);

Get the current raptor locator object.

rdf_parser :

raptor parser

Returns :

raptor locator

raptor_parser_parse_abort ()

void                raptor_parser_parse_abort           (raptor_parser *rdf_parser);

Abort an ongoing parsing.

Causes any ongoing generation of statements by a parser to be terminated and the parser to return control to the application as soon as any existing buffers have been drained.

Most useful inside raptor_parser_parse_file() or raptor_parser_parse_uri(), when the Raptor library is directing the parsing and one of the callback handlers, such as the one set by raptor_parser_set_statement_handler(), needs to return to the main application code.

rdf_parser :

raptor_parser parser object

raptor_parser_parse_chunk ()

int                 raptor_parser_parse_chunk           (raptor_parser *rdf_parser,
                                                         const unsigned char *buffer,
                                                         size_t len,
                                                         int is_end);

Parse a block of content into triples.

This method can only be called after raptor_parser_parse_start() has initialised the parser.

rdf_parser :

RDF parser

buffer :

content to parse

len :

length of buffer

is_end :

non-0 if this is the end of the content (such as EOF)

Returns :

non-0 on failure.

raptor_parser_parse_file ()

int                 raptor_parser_parse_file            (raptor_parser *rdf_parser,
                                                         raptor_uri *uri,
                                                         raptor_uri *base_uri);

Parse RDF content at a file URI.

If uri is NULL (source is stdin), then the base_uri is required.

rdf_parser :

parser

uri :

URI of RDF content or NULL to read from standard input

base_uri :

the base URI to use (or NULL if the same)

Returns :

non 0 on failure

raptor_parser_parse_file_stream ()

int                 raptor_parser_parse_file_stream     (raptor_parser *rdf_parser,
                                                         FILE *stream,
                                                         const char *filename,
                                                         raptor_uri *base_uri);

Parse RDF content from a FILE*.

After draining the FILE* stream (EOF), fclose is not called on it.

rdf_parser :

parser

stream :

FILE* of RDF content

filename :

filename of content or NULL if it has no name

base_uri :

the base URI to use

Returns :

non 0 on failure

raptor_parser_parse_iostream ()

int                 raptor_parser_parse_iostream        (raptor_parser *rdf_parser,
                                                         raptor_iostream *iostr,
                                                         raptor_uri *base_uri);

Parse content from an iostream.

If the parser requires a base URI and base_uri is NULL, an error will be generated and the function will fail.

rdf_parser :

parser

iostr :

iostream to read from

base_uri :

the base URI to use (or NULL)

Returns :

non 0 on failure, <0 if a required base URI was missing

raptor_parser_parse_start ()

int                 raptor_parser_parse_start           (raptor_parser *rdf_parser,
                                                         raptor_uri *uri);

Start a parse of content with base URI.

Parsers that need a base URI can be identified using a syntax description returned by raptor_world_get_parser_description() statically or raptor_parser_get_description() on a constructed parser.

rdf_parser :

RDF parser

uri :

base URI or may be NULL if no base URI is required

Returns :

non-0 on failure, <0 if a required base URI was missing

raptor_parser_parse_uri ()

int                 raptor_parser_parse_uri             (raptor_parser *rdf_parser,
                                                         raptor_uri *uri,
                                                         raptor_uri *base_uri);

Parse the RDF content at URI.

Sends an HTTP Accept: header when the URI is of the HTTP protocol; see raptor_parser_parse_uri_with_connection() for details including how the base_uri is used.

rdf_parser :

parser

uri :

URI of RDF content

base_uri :

the base URI to use (or NULL if the same)

Returns :

non 0 on failure

raptor_parser_parse_uri_with_connection ()

int                 raptor_parser_parse_uri_with_connection
                                                        (raptor_parser *rdf_parser,
                                                         raptor_uri *uri,
                                                         raptor_uri *base_uri,
                                                         void *connection);

Parse RDF content at URI using existing WWW connection.

If base_uri is not given and a protocol redirection occurs during resolution of the URI, the final resolved URI will be used as the base URI. If redirection does not occur, the base URI will be uri.

If base_uri is given, it overrides the process above.

When connection is NULL and a MIME Type exists for the parser type, this type is sent in an HTTP Accept: header in the form Accept: MIME-TYPE along with a wildcard of 0.1 quality, so MIME-TYPE is preferred rather than the sole answer. The latter part may not be necessary but should ensure an HTTP 200 response.

rdf_parser :

parser

uri :

URI of RDF content

base_uri :

the base URI to use (or NULL if the same)

connection :

connection object pointer or NULL to create a new one

Returns :

non 0 on failure

raptor_parser_get_graph ()

raptor_uri *        raptor_parser_get_graph             (raptor_parser *rdf_parser);

Get the current graph for the parser.

The returned URI is owned by the caller and must be freed with raptor_free_uri().

rdf_parser :

parser

Returns :

raptor_uri* graph name or NULL for the default graph

raptor_parser_get_name ()

const char *        raptor_parser_get_name              (raptor_parser *rdf_parser);

Get the name of a parser.

Use raptor_parser_get_description() to get the alternate names and aliases as well as other descriptive values.

rdf_parser :

raptor_parser parser object

Returns :

the short name for the parser.

raptor_parser_set_option ()

int                 raptor_parser_set_option            (raptor_parser *parser,
                                                         raptor_option option,
                                                         const char *string,
                                                         int integer);

Set parser option.

If string is not NULL and the option type is numeric, the string value is converted to an integer and used in preference to integer.

If string is NULL and the option type is not numeric, an error is returned.

The string values used are copied.

The allowed options are available via raptor_world_get_option_description().

parser :

raptor_parser parser object

option :

option to set from enumerated raptor_option values

string :

string option value (or NULL)

integer :

integer option value

Returns :

non 0 on failure or if the option is unknown

raptor_parser_get_option ()

int                 raptor_parser_get_option            (raptor_parser *parser,
                                                         raptor_option option,
                                                         char **string_p,
                                                         int *integer_p);

Get parser option.

Any string value returned in *string_p is shared and must be copied by the caller.

The allowed options are available via raptor_world_get_option_description().

parser :

raptor_parser parser object

option :

option to get value

string_p :

pointer to where to store string value

integer_p :

pointer to where to store integer value

Returns :

option value or < 0 for an illegal option

raptor_parser_get_accept_header ()

const char *        raptor_parser_get_accept_header     (raptor_parser *rdf_parser);

Get an HTTP Accept value for the parser.

The returned string must be freed by the caller such as with raptor_free_memory().

rdf_parser :

parser

Returns :

a new Accept: header string or NULL on failure

raptor_parser_set_uri_filter ()

void                raptor_parser_set_uri_filter        (raptor_parser *parser,
                                                         raptor_uri_filter_func filter,
                                                         void *user_data);

Set URI filter function for WWW retrieval.

parser :

parser object

filter :

URI filter function

user_data :

User data to pass to filter function

raptor_parser_get_world ()

raptor_world *      raptor_parser_get_world             (raptor_parser *rdf_parser);

Get the raptor_world object associated with a parser.

rdf_parser :

parser

Returns :

raptor_world* pointer