eulfedora – Python objects to interact with the Fedora Commons repository

Server objects

Repository

eulfedora.server.Repository can pull its connection configuration from Django settings automatically when they are available, but it also works without Django.

When you create an instance of Repository, if you do not specify connection parameters, it will attempt to initialize the repository connection based on Django settings, using the configuration names documented below.

If you are writing unit tests that use eulfedora, you may want to use eulfedora.testutil.FedoraTestSuiteRunner, which has logic to set up and switch configurations between a development Fedora repository and a test repository.

Projects that use this module should include the following settings in their settings.py:

# Fedora Repository settings
FEDORA_ROOT = 'http://fedora.host.name:8080/fedora/'
FEDORA_USER = 'user'
FEDORA_PASSWORD = 'password'
FEDORA_PIDSPACE = 'changeme'
FEDORA_TEST_ROOT = 'http://fedora.host.name:8180/fedora/'
FEDORA_TEST_PIDSPACE = 'testme'

If username and password are not specified, the Repository instance will be initialized without credentials and access Fedora as an anonymous user. If pidspace is not specified, the Repository will use the default pidspace for the configured Fedora instance.
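The fallback described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not eulfedora's own code: resolve_connection and the SimpleNamespace stand-in for a Django settings module are hypothetical names used only here.

```python
from types import SimpleNamespace


def resolve_connection(root=None, username=None, password=None, settings=None):
    """Hypothetical helper: explicit arguments win, settings fill the gaps."""
    if root is None:
        root = getattr(settings, "FEDORA_ROOT", None)
        if root is None:
            raise ValueError("no root given and no FEDORA_ROOT setting available")
        # Only borrow credentials from settings when none were passed in.
        if username is None and password is None:
            username = getattr(settings, "FEDORA_USER", None)
            password = getattr(settings, "FEDORA_PASSWORD", None)
    return {
        "root": root,
        "username": username,
        "password": password,
        # No username means the repository is accessed anonymously.
        "anonymous": username is None,
    }


# Stand-in for a Django settings module (assumption for this sketch).
settings = SimpleNamespace(
    FEDORA_ROOT="http://fedora.host.name:8080/fedora/",
    FEDORA_USER="user",
    FEDORA_PASSWORD="password",
)

from_settings = resolve_connection(settings=settings)
explicit = resolve_connection(root="http://localhost:8080/fedora/")
```

Here from_settings carries the configured credentials, while explicit has no credentials and would therefore be treated as anonymous access.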

Projects that need unit test setup and clean-up tasks (syncrepo and test object removal) to access Fedora with different credentials than the configured Fedora credentials should use the following settings:

FEDORA_TEST_USER = 'testuser'
FEDORA_TEST_PASSWORD = 'testpassword'
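One way to picture how the test credentials relate to the regular ones is the sketch below. It assumes that setup and clean-up tasks use FEDORA_TEST_USER and FEDORA_TEST_PASSWORD when they are set and otherwise fall back to FEDORA_USER and FEDORA_PASSWORD; setup_credentials is a hypothetical helper, not part of eulfedora.

```python
from types import SimpleNamespace


def setup_credentials(settings):
    """Hypothetical helper: prefer FEDORA_TEST_* values, else the regular ones."""
    user = getattr(settings, "FEDORA_TEST_USER", None)
    password = getattr(settings, "FEDORA_TEST_PASSWORD", None)
    if user is None:
        # Assumed fallback to the normally configured credentials.
        user = getattr(settings, "FEDORA_USER", None)
        password = getattr(settings, "FEDORA_PASSWORD", None)
    return user, password


# Stand-ins for Django settings modules (assumption for this sketch).
with_test = SimpleNamespace(
    FEDORA_USER="user",
    FEDORA_PASSWORD="password",
    FEDORA_TEST_USER="testuser",
    FEDORA_TEST_PASSWORD="testpassword",
)
without_test = SimpleNamespace(FEDORA_USER="user", FEDORA_PASSWORD="password")

test_pair = setup_credentials(with_test)
default_pair = setup_credentials(without_test)
```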

class eulfedora.server.Repository(root=None, username=None, password=None, request=None)

Pythonic interface to a single Fedora Commons repository instance.

best_subtype_for_object(obj, content_models=None)

Given a DigitalObject, examine the object to select the most appropriate subclass to instantiate. This generic implementation examines the object’s content models and compares them against the defined subclasses of DigitalObject to pick the best match. Projects that have a more nuanced understanding of their particular objects should override this method in a Repository subclass. This method is intended primarily for use by infer_object_subtype().

Parameters:
  • obj – a DigitalObject to inspect
  • content_models – optional list of content models, if they are known ahead of time (e.g., from a Solr search result), to avoid an additional Fedora look-up
Return type:

a subclass of DigitalObject
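The matching idea can be illustrated without a repository. The sketch below is not the library's implementation: it assumes each candidate class lists its content models in a CONTENT_MODELS attribute, and the class names and model URIs are made up for the example.

```python
class DigitalObject:
    """Stand-in base class for this sketch (not the eulfedora class)."""
    CONTENT_MODELS = []


class Image(DigitalObject):
    CONTENT_MODELS = ["info:fedora/example:Image-1.0"]


class MasterImage(DigitalObject):
    CONTENT_MODELS = [
        "info:fedora/example:Image-1.0",
        "info:fedora/example:Master-1.0",
    ]


def best_subtype(object_models, candidates, default=DigitalObject):
    """Pick the candidate whose content models are all present on the object,
    preferring the candidate that accounts for the most models."""
    object_models = set(object_models)
    best, best_count = default, 0
    for cls in candidates:
        wanted = set(cls.CONTENT_MODELS)
        if wanted and wanted <= object_models and len(wanted) > best_count:
            best, best_count = cls, len(wanted)
    return best


candidates = [Image, MasterImage]
plain = best_subtype(["info:fedora/example:Image-1.0"], candidates)
master = best_subtype(
    ["info:fedora/example:Image-1.0", "info:fedora/example:Master-1.0"],
    candidates,
)
unknown = best_subtype(["info:fedora/example:Other-1.0"], candidates)
```

Passing the content models in directly mirrors the optional content_models argument, which avoids a second look-up when the models are already known.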

default_object_type

Default type to use for methods that return fedora objects - DigitalObject

alias of DigitalObject

find_objects(terms=None, type=None, chunksize=None, **kwargs)

Find objects in Fedora. Build the query with keyword arguments named after the search fields in the Fedora documentation. By default, the query uses a contains (~) search for all search terms. Calls ApiFacade.findObjects(). Results seem to return consistently in ascending PID order.

Example usage - search for all objects where the owner contains ‘jdoe’:

repository.find_objects(ownerId='jdoe')

Supports all search operators provided by Fedora findObjects query (exact, gt, gte, lt, lte, and contains). To specify the type of query for a particular search term, call find_objects like this:

repository.find_objects(ownerId__exact='lskywalker')
repository.find_objects(date__gt='20010302')

Parameters:
  • type – type of objects to return; defaults to DigitalObject
  • chunksize – number of objects to return at a time
Return type:

generator for list of objects
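To make the keyword convention concrete, here is a sketch of how such keyword arguments could be turned into a findObjects query string. It is an illustration, not eulfedora's code: build_query is a hypothetical helper, the operator symbols other than ~ follow the Fedora findObjects query syntax as I understand it, and the aliases are the ones listed under search_fields_aliases.

```python
# Operator suffixes mapped to Fedora findObjects condition operators (assumed).
SEARCH_OPERATORS = {
    "exact": "=",
    "gt": ">",
    "gte": ">=",
    "lt": "<",
    "lte": "<=",
    "contains": "~",
}

# Human-readable aliases, as documented for Repository.search_fields_aliases.
ALIASES = {
    "owner": "ownerId",
    "dc_modified": "dcmDate",
    "modified": "mDate",
    "created": "cDate",
}


def build_query(**kwargs):
    """Hypothetical helper: turn field__operator=value pairs into a query."""
    conditions = []
    for key, value in sorted(kwargs.items()):
        # "ownerId__exact" -> ("ownerId", "exact"); "ownerId" -> ("ownerId", "")
        field, _, op_name = key.partition("__")
        op_name = op_name or "contains"  # contains is the documented default
        if op_name not in SEARCH_OPERATORS:
            raise ValueError("unsupported search operator: %s" % op_name)
        field = ALIASES.get(field, field)
        conditions.append("%s%s%s" % (field, SEARCH_OPERATORS[op_name], value))
    return " ".join(conditions)


contains_query = build_query(ownerId="jdoe")
exact_query = build_query(ownerId__exact="lskywalker")
date_query = build_query(date__gt="20010302")
alias_query = build_query(owner="jdoe")
```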

get_next_pid(namespace=None, count=None)

Request next available pid or pids from Fedora, optionally in a specified namespace. Calls ApiFacade.getNextPID().

Deprecated since version 0.14: Mint pids for new objects with eulfedora.models.DigitalObject.get_default_pid() instead, or call ApiFacade.getNextPID() directly.

Parameters:
  • namespace – (optional) get the next pid in the specified pid namespace; otherwise, Fedora will return the next pid in the configured default namespace.
  • count – (optional) get the specified number of pids; by default, returns 1 pid
Return type:

string or list of strings

get_object(pid=None, type=None, create=None)

Initialize a single object from Fedora, or create a new one, with the same Fedora configuration and credentials.

Parameters:
  • pid – pid of the object to request, or a function that can be called to get one. If not specified, get_next_pid() will be called if a pid is needed
  • type – type of object to return; defaults to DigitalObject
  • create – boolean: create a new object? (if not specified, defaults to False when pid is specified, and True when it is not)
Return type:

single object of the type specified
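The default for create depends on whether a pid was passed. A minimal sketch of that rule, using a hypothetical default_create helper rather than the library's code:

```python
def default_create(pid=None, create=None):
    """Hypothetical helper mirroring the documented default for create."""
    if create is None:
        # No pid given means a new object; a given pid means an existing one.
        create = pid is None
    return create


new_object = default_create()                        # no pid, no create flag
existing_object = default_create(pid="demo:1")       # pid given
forced_create = default_create(pid="demo:1", create=True)
```

An explicit create value always wins; the pid only matters when create is left unspecified.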

get_objects_with_cmodel(cmodel_uri, type=None)

Find objects in Fedora with the specified content model.

Parameters:
  • cmodel_uri – content model URI (should be full URI in info:fedora/pid:### format)
  • type – type of object to return (e.g., DigitalObject)
Return type:

list of objects

infer_object_subtype(api, pid=None, create=False, default_pidspace=None)

Construct a DigitalObject or appropriate subclass, inferring the appropriate subtype using best_subtype_for_object(). Note that this method signature has been selected to match the DigitalObject constructor so that this method might be passed directly to get_object() as a type:

>>> obj = repo.get_object(pid, type=repo.infer_object_subtype)

See also: TypeInferringRepository

ingest(text, log_message=None)

Ingest a new object into Fedora. Returns the pid of the new object on success. Calls ApiFacade.ingest().

Parameters:
  • text – full text content of the object to be ingested
  • log_message – optional log message
Return type:

string

purge_object(pid, log_message=None)

Purge an object from Fedora. Calls ApiFacade.purgeObject().

Parameters:
  • pid – pid of the object to be purged
  • log_message – optional log message
Return type:

boolean

risearch

instance of eulfedora.api.ResourceIndex, with the same root url and credentials

search_fields = ['pid', 'label', 'state', 'ownerId', 'cDate', 'mDate', 'dcmDate', 'title', 'creator', 'subject', 'description', 'publisher', 'contributor', 'date', 'type', 'format', 'identifier', 'source', 'language', 'relation', 'coverage', 'rights']

fields that can be searched against in find_objects()

search_fields_aliases = {'owner': 'ownerId', 'dc_modified': 'dcmDate', 'modified': 'mDate', 'created': 'cDate'}

human-readable aliases for oddly-named fedora search fields

class eulfedora.server.TypeInferringRepository(root=None, username=None, password=None, request=None)

A simple Repository subclass whose default object type for get_object() is infer_object_subtype(). Thus, each call to get_object() on a repository such as this will automatically use best_subtype_for_object() (or a subclass override) to infer the object’s proper type.

default_object_type(api, pid=None, create=False, default_pidspace=None)

Construct a DigitalObject or appropriate subclass, inferring the appropriate subtype using best_subtype_for_object(). Note that this method signature has been selected to match the DigitalObject constructor so that this method might be passed directly to get_object() as a type:

>>> obj = repo.get_object(pid, type=repo.infer_object_subtype)

See also: TypeInferringRepository

Resource Index

class eulfedora.api.ResourceIndex(opener)

Python object for accessing Fedora’s Resource Index.

RISEARCH_FLUSH_ON_QUERY = False

Specify whether or not RI search queries should specify flush=true to obtain the most recent results. If flush is specified to the query method, that takes precedence.

Irrelevant if Fedora RIsearch is configured with syncUpdates = True.
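The precedence between the class-level default and a per-query flush argument can be sketched as follows. effective_flush is a hypothetical helper for illustration, not a method of ResourceIndex.

```python
RISEARCH_FLUSH_ON_QUERY = False  # class-level default, as documented


def effective_flush(flush=None, default=RISEARCH_FLUSH_ON_QUERY):
    """Hypothetical helper: a flush value passed to the query method wins."""
    return default if flush is None else flush


unspecified = effective_flush()            # falls back to the default
forced = effective_flush(flush=True)       # explicit argument takes precedence
suppressed = effective_flush(flush=False, default=True)
```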

count_statements(query, language='spo', type='triples', flush=None)

Run a query in a format supported by the Fedora Resource Index (e.g., SPO or Sparql) and return the count of the results.

Parameters:
  • query – query as a string
  • language – query language to use; defaults to ‘spo’
  • flush – flush results to get recent changes; defaults to False
Return type:

integer

find_statements(query, language='spo', type='triples', flush=None)

Run a query in a format supported by the Fedora Resource Index (e.g., SPO or Sparql) and return the results.

Parameters:
  • query – query as a string
  • language – query language to use; defaults to ‘spo’
  • type – type of query - tuples or triples; defaults to ‘triples’
  • flush – flush results to get recent changes; defaults to False
Return type:

rdflib.ConjunctiveGraph when type is triples; list of dictionaries (keys based on return fields) when type is tuples

get_objects(subject, predicate)

Search for all objects related to the specified subject and predicate.

Parameters:
  • subject
  • predicate
Return type:

generator of RDF statements

get_predicates(subject, object)

Search for all predicates related to the specified subject and object.

Parameters:
  • subject
  • object
Return type:

generator of RDF statements

get_subjects(predicate, object)

Search for all subjects related to the specified predicate and object.

Parameters:
  • predicate
  • object
Return type:

generator of RDF statements

sparql_query(query, flush=None)

Run a Sparql query.

Parameters:
  • query – SPARQL query string
Return type:

list of dictionaries

spo_search(subject=None, predicate=None, object=None)

Create and run a subject-predicate-object (SPO) search. Any search terms that are not specified are replaced with a wildcard in the query.

Parameters:
  • subject – optional subject to search
  • predicate – optional predicate to search
  • object – optional object to search
Return type:

rdflib.ConjunctiveGraph
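The wildcard substitution described above can be sketched as a small query builder. This is an assumption-laden illustration, not the library's code: encode_term and spo_query are hypothetical names, every supplied term is treated as a URI wrapped in angle brackets, and literal values (which need quoting instead) are not handled.

```python
def encode_term(value):
    """Hypothetical helper: missing terms become the SPO wildcard."""
    if value is None:
        return "*"
    # Assumes the term is a URI; literals would need different quoting.
    return "<%s>" % value


def spo_query(subject=None, predicate=None, obj=None):
    """Join the three encoded terms into a single SPO query string."""
    return " ".join(encode_term(term) for term in (subject, predicate, obj))


by_subject = spo_query(subject="info:fedora/demo:1")
by_predicate_and_object = spo_query(
    predicate="info:fedora/fedora-system:def/model#hasModel",
    obj="info:fedora/demo:cmodel",
)
```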

spoencode(val)

Encode search terms for an SPO query.

Parameters:
  • val – string to be encoded
Return type:

string

RDF Namespaces

Predefined RDF namespaces for convenience, for use with RdfDatastream objects, in ResourceIndex queries, for defining a eulfedora.models.Relation, for adding relationships via eulfedora.models.DigitalObject.add_relationship(), or anywhere else that Fedora-related rdflib.term.URIRef objects might come in handy.

Example usage:

from eulfedora.models import DigitalObject, Relation
from eulfedora.rdfns import relsext as relsextns

class Item(DigitalObject):
  collection = Relation(relsextns.isMemberOfCollection)

eulfedora.rdfns.model = rdf.namespace.ClosedNamespace('info:fedora/fedora-system:def/model#')

rdflib.namespace.ClosedNamespace for the Fedora model namespace (currently only includes hasModel).

eulfedora.rdfns.oai = rdf.namespace.ClosedNamespace('http://www.openarchives.org/OAI/2.0/')

rdflib.namespace.ClosedNamespace for the OAI relations commonly used with Fedora and the PROAI OAI provider. Available URIs are: itemID, setSpec, and setName.

eulfedora.rdfns.relsext = rdf.namespace.ClosedNamespace('info:fedora/fedora-system:def/relations-external#')

rdflib.namespace.ClosedNamespace for the Fedora external relations ontology.

Django integration

views Fedora views

indexdata Fedora Indexing

Management commands

The following management commands will be available when you include eulfedora in your Django INSTALLED_APPS and rely on the Fedora settings described above.

For more details on these commands, use manage.py help <command>

  • syncrepo - load simple content models and fixture objects to the configured Fedora repository

eulfedora Template tags

eulfedora adds custom template tags for use in templates.

fedora_access

Catch fedora failures and permission errors encountered during template rendering:

{% load fedora %}

{% fedora_access %}
   <p>Try to access data on fedora objects which could be
     <span class='{{ obj.inaccessible_ds.content.field }}'>inaccessible</span>
     or when fedora is
     <span class='{{ obj.regular_ds.content.other_field }}'>down</span>.</p>
{% permission_denied %}
   <p>Fall back to this content if the main body results in a permission
     error while trying to access fedora data.</p>
{% fedora_failed %}
   <p>Fall back to this content if the main body runs into another error
     while trying to access fedora data.</p>
{% end_fedora_access %}

The permission_denied and fedora_failed sections are optional. If only permission_denied is present then non-permission errors will result in the entire block rendering empty. If only fedora_failed is present then that section will be used for all errors whether permission-related or not. If neither is present then all errors will result in the entire block rendering empty.
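The fallback rules in the paragraph above amount to a small decision table. The sketch below models them in Python for clarity; choose_section is a hypothetical helper, not the template tag's implementation, and the error labels "permission" and "other" are made up for the example.

```python
def choose_section(error, sections):
    """Return which part of a fedora_access block renders.

    error    -- None, "permission", or "other" (labels assumed for this sketch)
    sections -- set of optional sections present in the template
    """
    if error is None:
        return "body"
    if error == "permission" and "permission_denied" in sections:
        return "permission_denied"
    if "fedora_failed" in sections:
        # Used for any error not already handled by permission_denied.
        return "fedora_failed"
    return None  # the whole block renders empty


only_denied = {"permission_denied"}
only_failed = {"fedora_failed"}
both = {"permission_denied", "fedora_failed"}

results = {
    "denied_other": choose_section("other", only_denied),
    "failed_permission": choose_section("permission", only_failed),
    "both_permission": choose_section("permission", both),
    "neither": choose_section("other", set()),
}
```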

Note that when Django’s TEMPLATE_DEBUG setting is on, it precludes all error handling and displays the Django exception screen for all errors, including fedora errors, even if you use this template tag. To disable this Django internal functionality and see the effects of the fedora_access tag, add the following to your Django settings:

TEMPLATE_DEBUG = False

testutil Unittest utilities