eulfedora.models - Fedora models

DigitalObject

class eulfedora.models.DigitalObject(api, pid=None, create=False, default_pidspace=None)

A single digital object in a Fedora repository, with methods and properties to ease creating, accessing, and updating a Fedora object or any of its component parts, with pre-defined datastream mappings for the standard Fedora Dublin Core (dc) and RELS-EXT (rels_ext) datastreams.

Note

If you want idiomatic access to other datastreams, consider extending DigitalObject and defining your own datastreams using XmlDatastream, RdfDatastream, or FileDatastream as appropriate.

OWNER_ID_SEPARATOR = u','

Owner ID separator for multiple owners. Should match the OWNER-ID-SEPARATOR configured in Fedora. For more detail, see https://jira.duraspace.org/browse/FCREPO-82

add_relationship(rel_uri, obj)

Add a new relationship to the RELS-EXT for this object. Calls API_M.addRelationship().

Example usage:

isMemberOfCollection = 'info:fedora/fedora-system:def/relations-external#isMemberOfCollection'
collection_uri = 'info:fedora/foo:456'
object.add_relationship(isMemberOfCollection, collection_uri)
Parameters:
  • rel_uri – URI for the new relationship
  • obj – related object; can be DigitalObject or string; if string begins with info:fedora/ it will be treated as a resource, otherwise it will be treated as a literal
Return type:

boolean
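
The resource-versus-literal rule for string targets described above can be sketched in plain Python. The helper name classify_target is hypothetical and not part of eulfedora, and treating any object with a uri attribute as a DigitalObject is an assumption made only for this illustration.

```python
FEDORA_URI_PREFIX = 'info:fedora/'


def classify_target(obj):
    """Return ('resource', value) or ('literal', value) for a relationship target.

    Hypothetical helper: it only mirrors the rule stated in the parameter
    description and is not the library's implementation.
    """
    # Assumption: anything exposing a ``uri`` attribute stands in for a DigitalObject.
    if hasattr(obj, 'uri'):
        return ('resource', obj.uri)
    # Strings that start with the Fedora URI prefix are treated as resources.
    if obj.startswith(FEDORA_URI_PREFIX):
        return ('resource', obj)
    # Every other string is treated as a literal value.
    return ('literal', obj)
```

With this sketch, 'info:fedora/foo:456' is classified as a resource, while a plain string such as 'some text' is classified as a literal.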

audit_trail

Fedora audit trail as an instance of eulfedora.xml.AuditTrail

Note

Since Fedora (as of 3.5) does not make the audit trail available via an API call or as a datastream, accessing the audit trail requires loading the foxml for the object. If an object has large, versioned XML datastreams this may be slow.

audit_trail_users

A set of all usernames recorded in the audit_trail, if available.

dc

XmlDatastream for the required Fedora DC datastream; datastream content will be automatically loaded as an instance of eulxml.xmlmap.dc.DublinCore

default_pidspace = None

Default namespace to use when generating new PIDs in get_default_pid() (by default, calls Fedora getNextPid, which will use Fedora-configured namespace if default_pidspace is not set).

ds_list

Dictionary of all datastreams that belong to this object in Fedora. Key is datastream id, value is an ObjectDatastream for that datastream.

Only retrieved when requested; cached after first retrieval.

exists
Type:bool

True when the object actually exists (and can be accessed by the current user) in Fedora

getDatastreamObject(dsid, dsobj_type=None, as_of_date=None)

Get any datastream on this object as a DatastreamObject, or add a new datastream. If the datastream id corresponds to a predefined datastream, the configured datastream object will be returned. If type is not specified for an existing datastream, attempts to infer the appropriate subclass of datastream object to return based on the mimetype (for XML and RELS-EXT).

Note that if you use this method to add new datastreams, you should be sure to set all datastream metadata appropriately for your content (e.g., label, mimetype, control group).

Parameters:
  • dsid – datastream id
  • dsobj_type – optional DatastreamObject type to be returned
  • as_of_date – optional datetime, used to load a historical version of the requested datastream
getDatastreamProfile(dsid, date=None)

Get information about a particular datastream belonging to this object.

Parameters:dsid – datastream id
Return type:DatastreamProfile
getProfile()

Get information about this object (label, owner, date created, etc.).

Return type:ObjectProfile
get_default_pid()

Get the next default pid when creating and ingesting a new DigitalObject instance without specifying a pid. By default, calls ApiFacade.getNextPID() with the configured class default_pidspace (if specified) as the pid namespace.

If your project requires custom pid logic (e.g., object pids are based on an external pid generator), you should extend DigitalObject and override this method.

get_models()

Get a list of content models the object subscribes to.

get_object(pid, type=None)

Initialize and return a new DigitalObject instance from the same repository, passing along the connection credentials in use by the current object. If type is not specified, the current DigitalObject class will be used.

Parameters:
  • pid – pid of the object to return
  • type – (optional) DigitalObject type to initialize and return
has_model(model)

Check if this object subscribes to the specified content model.

Parameters:model – URI for the content model, as a string (currently only accepted in info:fedora/foo:### format)
Return type:boolean
has_requisite_content_models
Type:bool

True when the current object has the expected content models for whatever subclass of DigitalObject it was initialized as.

index_data()

Generate and return a dictionary of default fields to be indexed for searching (e.g., in Solr). Includes top-level object properties, Content Model URIs, and Dublin Core fields.

This method is intended to be customized and extended in order to easily modify the fields that should be indexed for any particular type of object in any project; data returned from this method should be serializable as JSON (the current implementation uses django.utils.simplejson).

This method was designed for use with eulfedora.indexdata.
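
As a rough illustration of the extension pattern described above, the sketch below overrides index_data() in a subclass and checks that the result serializes as JSON. BaseObject is a hypothetical stand-in for DigitalObject so that the example runs without a Fedora connection, and the field names (pid, label, object_type) are assumptions rather than the fields eulfedora actually emits.

```python
import json


class BaseObject(object):
    """Hypothetical stand-in for DigitalObject; only index_data() is sketched."""

    def __init__(self, pid, label):
        self.pid = pid
        self.label = label

    def index_data(self):
        # Assumed default fields; the real method returns more than this.
        return {'pid': self.pid, 'label': self.label}


class Article(BaseObject):
    def index_data(self):
        # Start from the inherited fields, then add project-specific ones.
        data = super(Article, self).index_data()
        data['object_type'] = 'article'
        return data


# The returned dictionary should be serializable as JSON.
payload = json.dumps(Article('foo:1', 'Example').index_data(), sort_keys=True)
```

Calling the parent implementation first keeps the default fields intact, so a subclass only has to describe what it adds or changes.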

index_data_descriptive()

Descriptive data to be included in index_data() output. This implementation includes all Dublin Core fields, but should be extended or overridden as appropriate for custom DigitalObject classes.

index_data_relations()

Standard Fedora relations to be included in index_data() output. This implementation includes all standard relations included in the Fedora relations namespace, but should be extended or overridden as appropriate for custom DigitalObject classes.

ingest_user

Username responsible for ingesting this object into the repository, as recorded in the audit_trail, if available.

label

object label

label_max_size = 255

maximum label size allowed by fedora

modify_relationship(rel_uri, old_object, new_object)

Modify a relationship in the RELS-EXT for this object. As the Fedora API-M does not contain a native “modifyRelationship”, this method purges the existing relationship and then adds a new one, pivoting on the predicate. Calls API_M.purgeRelationship(), then API_M.addRelationship().

Example usage:

predicate = 'info:fedora/fedora-system:def/relations-external#isMemberOfCollection'
old_object = 'info:fedora/foo:456'
new_object = 'info:fedora/foo:789'

object.modify_relationship(predicate, old_object, new_object)
Parameters:
  • rel_uri – URI for the existing relationship
  • old_object – previous target object for relationship; can be DigitalObject or string; if string begins with info:fedora/ it will be treated as a resource, otherwise it will be treated as a literal
  • new_object – new target object for relationship; can be DigitalObject or string; if string begins with info:fedora/ it will be treated as a resource, otherwise it will be treated as a literal
Return type:

boolean
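
The purge-then-add behaviour can be illustrated with a plain set of (subject, predicate, object) tuples standing in for the RELS-EXT. This is a sketch only: the function below is not the eulfedora method, and returning False without adding anything when the old triple is absent is an assumption.

```python
def modify_relationship(triples, subject, predicate, old_object, new_object):
    """Replace (subject, predicate, old_object) with (subject, predicate, new_object).

    ``triples`` is a plain set of 3-tuples standing in for the RELS-EXT graph.
    """
    old = (subject, predicate, old_object)
    if old not in triples:
        # Assumption: if there is nothing to purge, nothing is added either.
        return False
    # Stands in for API_M.purgeRelationship().
    triples.remove(old)
    # Stands in for API_M.addRelationship().
    triples.add((subject, predicate, new_object))
    return True
```

Because the two steps are separate calls, a failure between them would leave the old relationship purged and the new one missing, which is worth keeping in mind when relying on this method.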

object_xml

Fedora object XML as an instance of FoxmlDigitalObject (via REST_API.getObjectXML()).

owner

object owner

owner_max_size = 64

maximum owner size allowed by fedora

owners

Read-only list of object owners, separated by the configured OWNER_ID_SEPARATOR, with whitespace stripped.
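
A minimal sketch of how a combined owner string could be split into such a list follows. The function name split_owners is hypothetical, and returning an empty list for an empty owner value is an assumption.

```python
OWNER_ID_SEPARATOR = ','


def split_owners(owner):
    """Split a combined owner string into a list of individual owner ids."""
    # Assumption: an unset or empty owner yields an empty list.
    if not owner:
        return []
    # Split on the configured separator and strip surrounding whitespace.
    return [part.strip() for part in owner.split(OWNER_ID_SEPARATOR)]
```

For example, split_owners('alice, bob ,carol') yields three owner ids with no surrounding whitespace.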

pidspace

Fedora pidspace of this object

purge_relationship(rel_uri, obj)

Purge a relationship from RELS-EXT for this object. Calls API_M.purgeRelationship().

Example usage:

isMemberOfCollection = 'info:fedora/fedora-system:def/relations-external#isMemberOfCollection'
collection_uri = 'info:fedora/foo:789'
object.purge_relationship(isMemberOfCollection, collection_uri)
Parameters:
  • rel_uri – URI for the existing relationship
  • obj – related object; can be DigitalObject or string; if string begins with info:fedora/ it will be treated as a resource, otherwise it will be treated as a literal
Return type:

boolean

rels_ext

RdfDatastream for the standard Fedora RELS-EXT datastream

risearch

Instance of eulfedora.api.ResourceIndex, with the same root url and credentials

save(logMessage=None)

Save to Fedora any parts of this object that have been modified (including object profile attributes such as label, owner, or state, and any changes to datastream content or datastream properties). If a failure occurs at any point on saving any of the parts of the object, will back out any changes that have been made and raise a DigitalObjectSaveFailure with information about where the failure occurred and whether or not it was recoverable.

If the object is new, ingest it. If object profile information has been modified before saving, this data is used in the ingest. Datastreams are initialized to sensible defaults: XML objects are created using their default constructor, and RDF graphs start empty. If they’re updated before saving then those updates are included in the initial version. Datastream profile information is initialized from defaults specified in the Datastream declaration, though it too can be overridden prior to the initial save.

state

object state (Active/Inactive/Deleted)

uri

Fedora URI for this object (info:fedora/foo:### form of object pid)
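
The relationship between a pid, its pidspace, and its URI can be sketched with two small helpers. Both function names are hypothetical; the sketch relies only on the pidspace:identifier shape of Fedora pids and the info:fedora/ prefix shown above.

```python
def pid_to_uri(pid):
    """Return the info:fedora/ URI form of a pid such as 'foo:456'."""
    return 'info:fedora/' + pid


def pid_to_pidspace(pid):
    """Return the pidspace, i.e. the part of the pid before the first colon."""
    return pid.split(':', 1)[0]
```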

uriref

Fedora URI for this object, as an rdflib.URIRef URI object

Custom Exception

class eulfedora.models.DigitalObjectSaveFailure(pid, failure, to_be_saved, saved, cleaned)

Custom exception class for when a save error occurs part-way through saving an instance of DigitalObject. This exception should contain enough information to determine where the save failed, and whether or not any changes saved before the failure were successfully rolled back.

These properties are available:
  • obj_pid - pid of the DigitalObject instance that failed to save
  • failure - string indicating where the failure occurred (either a datastream ID or ‘object profile’)
  • to_be_saved - list of datastreams that were modified and should have been saved
  • saved - list of datastreams that were successfully saved before failure occurred
  • cleaned - list of saved datastreams that were successfully rolled back
  • not_cleaned - saved datastreams that were not rolled back
  • recovered - boolean, True indicates all saved datastreams were rolled back
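
One way to read the last two properties is that they are derived from saved and cleaned. The class below is a hypothetical stand-in written to show that derivation; it is not the eulfedora exception class, and the datastream ids in the example are made up.

```python
class SaveFailureSummary(object):
    """Hypothetical stand-in that mirrors the documented properties."""

    def __init__(self, pid, failure, to_be_saved, saved, cleaned):
        self.obj_pid = pid
        self.failure = failure
        self.to_be_saved = to_be_saved
        self.saved = saved
        self.cleaned = cleaned
        # Saved datastreams that were not rolled back.
        self.not_cleaned = [ds for ds in saved if ds not in cleaned]
        # Recovery succeeded only if every saved datastream was rolled back.
        self.recovered = not self.not_cleaned


# Example: MODS failed to save, and only DC could be rolled back afterwards.
summary = SaveFailureSummary(
    'foo:1', 'MODS',
    to_be_saved=['DC', 'RELS-EXT', 'MODS'],
    saved=['DC', 'RELS-EXT'],
    cleaned=['DC'],
)
```

In this example RELS-EXT ends up in not_cleaned, so recovered is False and the object would need manual attention.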

Datastream

Datastream Descriptors

class eulfedora.models.Datastream(id, label, defaults=None)

Datastream descriptor to simplify configuration and access to datastreams that belong to a particular DigitalObject.

When accessed, will initialize a DatastreamObject and cache it on the DigitalObject that it belongs to.

Example usage:

class MyDigitalObject(DigitalObject):
    text = Datastream("TEXT", "Text content", defaults={'mimetype': 'text/plain'})

All other configuration defaults are passed on to the DatastreamObject.
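
The initialize-on-access and cache-per-object behaviour is a standard use of Python descriptors. The sketch below shows the general pattern only: CachedDescriptor and the _ds_cache attribute are hypothetical names, and a plain dictionary stands in for the DatastreamObject that eulfedora would create.

```python
class CachedDescriptor(object):
    """Hypothetical descriptor that builds a value once per owning instance."""

    def __init__(self, id, label):
        self.id = id
        self.label = label

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class rather than an instance.
            return self
        # Keep one cache dictionary on each owning instance.
        cache = obj.__dict__.setdefault('_ds_cache', {})
        if self.id not in cache:
            # A plain dict stands in for the DatastreamObject built on first access.
            cache[self.id] = {'id': self.id, 'label': self.label}
        return cache[self.id]


class Example(object):
    text = CachedDescriptor('TEXT', 'Text content')
```

Repeated access on the same instance returns the same cached value, while each new instance gets its own.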

class eulfedora.models.XmlDatastream(id, label, objtype=None, defaults=None)

XML-specific version of Datastream. Datastreams are initialized as instances of XmlDatastreamObject. An additional, optional parameter objtype is passed to the Datastream object to configure the type of eulxml.xmlmap.XmlObject that should be used for datastream content.

Example usage:

from eulxml.xmlmap.dc import DublinCore

class MyDigitalObject(DigitalObject):
    extra_dc = XmlDatastream("EXTRA_DC", "Dublin Core", DublinCore)

my_obj = repo.get_object("example:1234", type=MyDigitalObject)
my_obj.extra_dc.content.title = "Example object"
my_obj.save(logMessage="automatically setting dc title")
class eulfedora.models.RdfDatastream(id, label, defaults=None)

RDF-specific version of Datastream for accessing datastream content as an rdflib RDF graph. Datastreams are initialized as instances of RdfDatastreamObject.

Example usage:

from rdflib import RDFS, Literal

class MyDigitalObject(DigitalObject):
    extra_rdf = RdfDatastream("EXTRA_RDF", "an RDF graph of stuff")

my_obj = repo.get_object("example:4321", type=MyDigitalObject)
my_obj.extra_rdf.content.add((my_obj.uriref, RDFS.comment,
                              Literal("This is an example object.")))
my_obj.save(logMessage="automatically setting rdf comment")
class eulfedora.models.FileDatastream(id, label, defaults=None)

File-based content version of Datastream. Datastreams are initialized as instances of FileDatastreamObject.

Datastream Objects

class eulfedora.models.DatastreamObject(obj, id, label, mimetype=None, versionable=False, state=u'A', format=None, control_group=u'M', checksum=None, checksum_type=u'MD5', as_of_date=None)

Object to ease accessing and updating a datastream belonging to a Fedora object. Handles datastream content as well as datastream profile information. Content and datastream info are only pulled from Fedora when content and info fields are accessed.

Intended to be used with DigitalObject and initialized via Datastream.

Initialization parameters:
param obj:the DigitalObject that this datastream belongs to.
param id:datastream id
param label:default datastream label
param mimetype:default datastream mimetype
param versionable:
 default configuration for datastream versioning
param state:default configuration for datastream state (default: A [active])
param format:default configuration for datastream format URI
param control_group:
 default configuration for datastream control group (default: M [managed])
param checksum:default configuration for datastream checksum
param checksum_type:
 default configuration for datastream checksum type (default: MD5)
param as_of_date:
 load a historical version of this datastream as of a particular date time. (Note that historical datastream versions are for access only, and cannot be saved.)
as_of_date = None

optional datetime for accessing a historical datastream version

content

contents of the datastream; for existing datastreams, content is only pulled from Fedora when first requested, and cached after first access; can be used to set or update datastream contents.

For an alternate method to set datastream content, see ds_location.

ds_location = None

Datastream content location: set this attribute to a URI that Fedora can resolve (e.g., http:// or file://) in order to add or update datastream content from a known, accessible location, rather than posting via content. If ds_location is set, it takes precedence over content.

get_chunked_content(chunksize=4096)

Generator that returns the datastream content in chunks, so larger datastreams can be used without reading the entire contents into memory.
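
The general chunked-read pattern looks roughly like the sketch below. It reads from any file-like object rather than from Fedora, and the function name read_in_chunks is hypothetical.

```python
import io


def read_in_chunks(stream, chunksize=4096):
    """Yield successive chunks of at most ``chunksize`` bytes from a file-like object."""
    while True:
        chunk = stream.read(chunksize)
        if not chunk:
            # An empty read means the stream is exhausted.
            break
        yield chunk


# Example with an in-memory stream of 10000 bytes.
chunks = list(read_in_chunks(io.BytesIO(b'x' * 10000), chunksize=4096))
```

Because the generator holds only one chunk at a time, memory use stays bounded by chunksize regardless of the total content size.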

history()

Get history/version information for this datastream and return as an instance of DatastreamHistory.

isModified()

Check if either the datastream content or profile fields have changed and should be saved to Fedora.

Return type:boolean
label

datastream label

mimetype

datastream mimetype

save(logmessage=None)

Save datastream content and any changed datastream profile information to Fedora.

Return type:boolean for success
size

Size of the datastream content

state

datastream state (Active/Inactive/Deleted)

undo_last_save(logMessage=None)

Undo the last change made to the datastream content and profile, effectively reverting to the state in Fedora before the last save.

For a versioned datastream, this will purge the most recent datastream version. For an unversioned datastream, this will overwrite the last changes with a cached version of any content and/or info pulled from Fedora.

validate_checksum(date=None)

Check if this datastream has a valid checksum in Fedora, by running the REST_API.compareDatastreamChecksum() API call. Returns a boolean based on the checksum valid response from Fedora.

Parameters:date – (optional) check the datastream validity at a particular date/time (e.g., for versionable datastreams)
versionable

boolean; indicates if Fedora is configured to version the datastream

class eulfedora.models.XmlDatastreamObject(obj, id, label, objtype=<class 'eulxml.xmlmap.core.XmlObject'>, **kwargs)

Extends DatastreamObject in order to initialize datastream content as an instance of a specified XmlObject.

See DatastreamObject for more details. Has one additional parameter:

Parameters:objtype – xml object type to use for datastream content; if not specified, defaults to XmlObject
class eulfedora.models.RdfDatastreamObject(obj, id, label, mimetype=None, versionable=False, state=u'A', format=None, control_group=u'M', checksum=None, checksum_type=u'MD5', as_of_date=None)

Extends DatastreamObject in order to initialize datastream content as an rdflib RDF graph.

replace_uri(src, dest)

Replace a uri reference everywhere it appears in the graph with another one. It could appear as the subject, predicate, or object of a statement, so for each position loop through each statement that uses the reference in that position, remove the old statement, and add the replacement.
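
The position-by-position replacement can be sketched on a plain set of 3-tuples. This is an illustration of the loop described above under the assumption that tuples stand in for RDF statements; it does not use rdflib and is not the library's code.

```python
def replace_uri(triples, src, dest):
    """Replace ``src`` with ``dest`` in every position of every triple, in place.

    ``triples`` is a set of (subject, predicate, object) tuples.
    """
    for position in range(3):
        # Collect matches first so the set is not modified while iterating over it.
        matching = [t for t in triples if t[position] == src]
        for old in matching:
            new = old[:position] + (dest,) + old[position + 1:]
            triples.remove(old)
            triples.add(new)
```

A statement that uses the reference in more than one position is rewritten once per position, so by the end of the loop no occurrence of src remains.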

class eulfedora.models.FileDatastreamObject(obj, id, label, mimetype=None, versionable=False, state=u'A', format=None, control_group=u'M', checksum=None, checksum_type=u'MD5', as_of_date=None)

Extends DatastreamObject in order to allow setting and reading datastream content as a file. To update contents, set datastream content property to a new file object. For example:

class ImageObject(DigitalObject):
    image = FileDatastream('IMAGE', 'image datastream', defaults={
        'mimetype': 'image/png'
    })

Then, with an instance of ImageObject:

obj.image.content = open('/path/to/my/file', 'rb')  # binary mode for image data
obj.save()
content

contents of the datastream; only pulled from Fedora when accessed, cached after first access

Relations

class eulfedora.models.Relation(relation, type=None, ns_prefix=None, rdf_type=None, related_name=None, related_order=None)

This descriptor is intended for use with DigitalObject RELS-EXT relations, and provides get, set, and delete functionality for a single related DigitalObject instance or literal value in the RELS-EXT of an individual object.

Example use for a related object: a Relation should be initialized with a predicate URI and optionally a subclass of DigitalObject that should be returned:

class Page(DigitalObject):
    volume = Relation(relsext.isConstituentOf, type=Volume)

When a Relation is created with a type that references a DigitalObject subclass, a corresponding ReverseRelation will automatically be added to the related subclass. For the example above, the fictional Volume class would automatically get a page_set attribute configured with the same URI and a class of Page. Reverse property names can be customized using the related_name parameter, which is documented below and follows the basic conventions of Django’s ForeignKey model field (to which Relation is roughly analogous).

Note

Currently, auto-generated ReverseRelation properties will always be initialized with multiple=True, since that is the most common pattern for Fedora object relations (one to many). Other variants may be added later, if and when use cases arise.

Relation also supports configuring the RDF type and namespace prefixes that should be used for serialization; for example:

from rdflib import XSD, URIRef
from rdflib.namespace import Namespace

MYNS = Namespace(URIRef("http://example.com/ns/2011/my-test-namespace/#"))

class MyObj(DigitalObject):
    total = Relation(MYNS.count, ns_prefix={"my": MYNS}, rdf_type=XSD.int)

This would allow us to access total as an integer on a MyObj object, e.g.:

myobj.total = 3

and when the RELS-EXT is serialized it will use the configured namespace prefix, e.g.:

<rdf:RDF xmlns:my="http://example.com/ns/2011/my-test-namespace/#"
    xmlns:fedora-model="info:fedora/fedora-system:def/model#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="info:fedora/myobj:1">
    <my:count rdf:datatype="http://www.w3.org/2001/XMLSchema#int">3</my:count>
  </rdf:Description>
</rdf:RDF>

Note

If a namespace prefix is not specified, rdflib will automatically generate a namespace prefix to produce valid output, but it may be less readable than a custom one.

Initialization options:

Parameters:
  • relation – the RDF predicate URI as a rdflib.URIRef
  • type – optional DigitalObject subclass to initialize (for object relations); use type="self" to specify that the current DigitalObject class should be used (currently no reverse relation will be created for recursive relations).
  • ns_prefix – optional dictionary to configure namespace prefixes to be used for serialization; key should be the desired prefix, value should be an instance of rdflib.namespace.Namespace
  • rdf_type – optional rdf type for literal values (passed to rdflib.Literal as the datatype option)
  • related_name – optional name for the auto-generated ReverseRelation property, when the relation is to a subclass of DigitalObject; if not specified, the related name will be classname_set; a value of + indicates no ReverseRelation should be created
  • related_order – optional URI for sorting related objects in the auto-generated ReverseRelation property.
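
The related_name rules listed above can be summarized in a short sketch. The helper reverse_property_name is hypothetical, and lowercasing the class name is inferred from the Page and page_set example rather than confirmed from the implementation.

```python
def reverse_property_name(cls, related_name=None):
    """Return the reverse property name, or None when no reverse relation is wanted."""
    # A value of '+' means no ReverseRelation should be created.
    if related_name == '+':
        return None
    # An explicit name is used as given.
    if related_name:
        return related_name
    # Default: classname_set (lowercasing is an assumption based on page_set).
    return '%s_set' % cls.__name__.lower()


class Page(object):
    """Placeholder class used only to demonstrate the naming rule."""
```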
class eulfedora.models.ReverseRelation(relation, type=None, multiple=False, order_by=None)

Descriptor for use with DigitalObject RELS-EXT reverse relations, where the owning object is the RDF object of the predicate and the related object is the RDF subject. This descriptor will query the Fedora ResourceIndex for the requested subjects, based on the configured predicate, and return resulting items.

This descriptor only provides read access; there is no functionality for setting or deleting reverse-related objects.

It is recommended to use Relation and let the corresponding ReverseRelation be automatically generated for you.

Example use:

class Volume(DigitalObject):
    pages = ReverseRelation(relsext.isConstituentOf, type=Page, multiple=True)
Parameters:
  • relation – RDF relation to be used for querying to find the items
  • type – object type for the related item or items
  • multiple – set to true if there are multiple related items, which will be returned as a list (defaults to false)
  • order_by – RDF predicate to be used for sorting multiple items (must be available for query in the RIsearch, as a property of the items being returned)