Utilities
RDF Namespaces
Predefined RDF namespaces, provided for convenience: use them with RdfDatastream objects, in ResourceIndex queries, when defining a eulfedora.models.Relation, when adding relationships via eulfedora.models.DigitalObject.add_relationship(), or anywhere else that Fedora-related rdflib.term.URIRef objects might come in handy.
Example usage:
from eulfedora.models import DigitalObject, Relation
from eulfedora.rdfns import relsext as relsextns

class Item(DigitalObject):
    collection = Relation(relsextns.isMemberOfCollection)
- eulfedora.rdfns.model = rdf.namespace.ClosedNamespace('info:fedora/fedora-system:def/model#')
  rdflib.namespace.ClosedNamespace for the Fedora model namespace (currently only includes hasModel).
- eulfedora.rdfns.oai = rdf.namespace.ClosedNamespace('http://www.openarchives.org/OAI/2.0/')
  rdflib.namespace.ClosedNamespace for the OAI relations commonly used with Fedora and the PROAI OAI provider. Available URIs are: itemID, setSpec, and setName.
- eulfedora.rdfns.relsext = rdf.namespace.ClosedNamespace('info:fedora/fedora-system:def/relations-external#')
  rdflib.namespace.ClosedNamespace for the Fedora external relations ontology.
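To make the idea of a closed namespace concrete, here is a minimal sketch in plain Python. ClosedNamespaceSketch is a hypothetical stand-in written for this illustration, not the rdflib class; it only shows that each term resolves to the base URI plus the term name, and that terms outside the fixed set are rejected. The exact exception rdflib raises for an unknown term is not reproduced here.

```python
class ClosedNamespaceSketch:
    """Hypothetical stand-in: a base URI plus a fixed set of allowed terms."""

    def __init__(self, base_uri, terms):
        self._base_uri = base_uri
        self._terms = frozenset(terms)

    def __getattr__(self, name):
        # Only reached for names not found as normal attributes,
        # which here means term lookups such as ``model.hasModel``.
        if name.startswith('_') or name not in self._terms:
            raise AttributeError(
                '%r is not defined in %s' % (name, self._base_uri))
        return self._base_uri + name


# Base URIs and term names are taken from the entries above.
model = ClosedNamespaceSketch(
    'info:fedora/fedora-system:def/model#', ['hasModel'])
oai = ClosedNamespaceSketch(
    'http://www.openarchives.org/OAI/2.0/', ['itemID', 'setSpec', 'setName'])

print(model.hasModel)  # info:fedora/fedora-system:def/model#hasModel
print(oai.setSpec)     # http://www.openarchives.org/OAI/2.0/setSpec
```

The real namespaces return rdflib.term.URIRef objects rather than plain strings, so they can be passed directly wherever a URIRef is expected.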
Testing utilities
eulfedora.testutil provides custom Django test suite runners with Fedora environment setup / teardown for all tests.
To use, configure as test runner in your Django settings:
TEST_RUNNER = 'eulfedora.testutil.FedoraTextTestSuiteRunner'
When xmlrunner is available, xmlrunner variants of the test runners are also available.
To use the XML variant, configure your Django test runner as follows:
TEST_RUNNER = 'eulfedora.testutil.FedoraXmlTestSuiteRunner'
The XML variant honors the same Django settings that the xmlrunner Django test runner does (TEST_OUTPUT_DIR, TEST_OUTPUT_VERBOSE, and TEST_OUTPUT_DESCRIPTIONS).
Any Repository instances created after the test suite starts will automatically connect to the test collection. If you have a test pidspace configured, it will be used as the default pidspace when creating test objects; if you have a pidspace but not a test pidspace, the tests will use a pidspace of ‘yourpidspace-test’ for their duration. Any objects in the test pidspace will be removed from the Fedora instance after the tests finish.
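The pidspace rule above can be sketched as a small function. This is an illustration of the described behavior only, not the library's code; the setting names FEDORA_PIDSPACE and FEDORA_TEST_PIDSPACE and the helper resolve_test_pidspace are assumptions made for this example.

```python
def resolve_test_pidspace(settings):
    """Pick the pidspace to use during tests from a dict of settings.

    The setting names used here are assumptions for illustration.
    """
    test_pidspace = settings.get('FEDORA_TEST_PIDSPACE')
    if test_pidspace:
        # An explicitly configured test pidspace always wins.
        return test_pidspace
    pidspace = settings.get('FEDORA_PIDSPACE')
    if pidspace:
        # Derive a test pidspace from the regular one.
        return '%s-test' % pidspace
    # Neither is configured; the text above does not cover this case.
    return None


print(resolve_test_pidspace({'FEDORA_PIDSPACE': 'yourpidspace'}))
print(resolve_test_pidspace({'FEDORA_PIDSPACE': 'yourpidspace',
                             'FEDORA_TEST_PIDSPACE': 'testonly'}))
```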
Note
The test configuration is not switched until after your test code is loaded, so repository connections should not be made at class instantiation time, but in a setup method.
If you are using nose or django-nose, you should use the EulfedoraSetUp plugin to use a separate Fedora Repository for testing. With django-nose, you should add eulfedora.testutil.EulfedoraSetUp to NOSE_PLUGINS and '--with-eulfedorasetup' to NOSE_ARGS to ensure the plugin is automatically enabled.
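As a sketch, the django-nose configuration described above might look like the following settings fragment. The INSTALLED_APPS entry and the TEST_RUNNER value come from django-nose's own setup instructions rather than from this page, so treat them as assumptions about your project.

```python
# settings.py (fragment)

INSTALLED_APPS = [
    # ... your other apps ...
    'django_nose',
]

# Test runner provided by django-nose.
TEST_RUNNER = 'django_nose.NoseTestSuiteRunner'

# Register the eulfedora plugin with nose.
NOSE_PLUGINS = [
    'eulfedora.testutil.EulfedoraSetUp',
]

# Enable the plugin on every test run.
NOSE_ARGS = [
    '--with-eulfedorasetup',
]
```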
- class eulfedora.testutil.FedoraTestWrapper
  A context manager that replaces the Django Fedora configuration with a test configuration inside the block, restoring the original configuration when the block exits. All objects are purged from the defined test pidspace before and after running tests.
- class eulfedora.testutil.FedoraTextTestRunner(stream=<open file '<stderr>', mode 'w'>, descriptions=True, verbosity=1, failfast=False, buffer=False, resultclass=None, tb_locals=False)
  A unittest.TextTestRunner that wraps test execution in a FedoraTestWrapper.
- class eulfedora.testutil.FedoraTextTestSuiteRunner(pattern=None, top_level=None, verbosity=1, interactive=True, failfast=False, keepdb=False, reverse=False, debug_sql=False, parallel=0, tags=None, exclude_tags=None, **kwargs)
  Extends django.test.simple.DjangoTestSuiteRunner to set up and tear down the Fedora test environment.
- eulfedora.testutil.alternate_test_fedora
  alias of FedoraTestWrapper
Synchronization
- syncutil.sync_object(src_obj, dest_repo, export_context='migrate', overwrite=False, show_progress=False, requires_auth=False, omit_checksums=False)
  Copy an object from one repository to another using the Fedora export functionality.
  Parameters:
  - src_obj – source DigitalObject to be copied
  - dest_repo – destination Repository where the object will be copied to
  - export_context – Fedora export format to use, one of “migrate” or “archive”; migrate is generally faster, but requires access from the destination repository to the source and may result in checksum errors for some content; archive exports take longer to process (default: migrate)
  - overwrite – if an object with the same pid is already present in the destination repository, it will be removed only if overwrite is set to true (default: false)
  - show_progress – if True, displays a progress bar with content size, progress, speed, and ETA (only applicable to archive exports)
  - requires_auth – content datastreams require authentication, and should have credentials patched in (currently only supported in archive-xml export mode) (default: False)
  - omit_checksums – scrubs contentDigest – aka checksums – from datastreams; helpful for datastreams with Redirect (R) or External (E) contexts (default: False)
  Returns: result of Fedora ingest on the destination repository on success
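A hedged usage sketch for sync_object, assuming it is importable from the same eulfedora.syncutil module as ArchiveExport and that both repositories are Repository instances offering get_object(). The copy_object helper is hypothetical, written only for this example.

```python
def copy_object(pid, src_repo, dest_repo):
    """Copy one object between repositories using an archive export."""
    # Imported here only so that this sketch loads where eulfedora
    # is not installed.
    from eulfedora.syncutil import sync_object

    src_obj = src_repo.get_object(pid)
    return sync_object(
        src_obj,
        dest_repo,
        # 'migrate' is generally faster, but needs the destination
        # repository to be able to reach the source.
        export_context='archive',
        # Replace any object with the same pid in the destination.
        overwrite=True,
        show_progress=False,
    )
```

Choosing 'archive' here trades speed for independence from network access between the two repositories; switch to the default 'migrate' when the destination can reach the source and checksum errors are not a concern.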
- class eulfedora.syncutil.ArchiveExport(obj, dest_repo, verify=False, progress_bar=None, requires_auth=False, xml_only=False)
  Iteratively process a Fedora archival export in order to copy an object into another Fedora repository. Use object_data() to process the content and provide the FOXML to be ingested into the destination repository.
  Parameters:
  - obj – source DigitalObject to be copied
  - dest_repo – destination Repository where the object will be copied to
  - verify – if True, datastream sizes and MD5 checksums will be calculated as they are decoded and logged for verification (default: False)
  - progress_bar – optional progressbar object to be updated as the export is read and processed
  - requires_auth – content datastreams require authentication, and should have credentials patched in; currently only relevant when xml_only is True. (default: False)
  - xml_only – only use archival data for xml datastreams; use fedora datastream dissemination urls for all non-xml content (optionally with credentials, if requires_auth is set). (default: False)
- dsinfo_regex = <_sre.SRE_Pattern object at 0x463ff10>
  regular expression used to identify datastream version information that is needed for processing datastream content in an archival export
- encoded_datastream()
  Generator for datastream content. Takes a list of sections of data within the current chunk (split on binaryContent start and end tags), runs a base64 decode, and yields the data. Computes datastream size and MD5 as data is decoded for sanity-checking purposes. If binary content is not completed within the current chunk, it will retrieve successive chunks of export data until it finds the end. Sets a flag when partial content is left within the current chunk for continued processing by object_data().
  Parameters: sections – list of export data split on binary content start and end tags, starting with the first section of binary content
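The core idea, decoding base64 content that arrives in arbitrary chunks while tracking size and MD5, can be sketched with the standard library alone. This is not the library's implementation: decode_base64_chunks is a hypothetical helper, and it leaves out the binaryContent tag handling and the fetching of further export chunks.

```python
import base64
import hashlib


def decode_base64_chunks(chunks, checksum):
    """Yield decoded bytes from an iterable of base64-encoded byte chunks.

    ``checksum`` is a hashlib object that is updated with the decoded data.
    """
    leftover = b''
    for chunk in chunks:
        # Whitespace such as line breaks is not part of the encoded data.
        data = leftover + b''.join(chunk.split())
        # base64 decodes in groups of four characters; hold back any
        # incomplete group until the next chunk arrives.
        usable = len(data) - (len(data) % 4)
        data, leftover = data[:usable], data[usable:]
        if data:
            decoded = base64.b64decode(data)
            checksum.update(decoded)
            yield decoded
    if leftover:
        raise ValueError('incomplete base64 data at end of input')


original = b'example datastream content' * 10
encoded = base64.b64encode(original)
# Split at arbitrary points, as chunk boundaries in an export would be.
chunks = [encoded[:7], encoded[7:50], encoded[50:]]

md5 = hashlib.md5()
size = 0
for part in decode_base64_chunks(chunks, md5):
    size += len(part)

# Compare against the expected size and checksum as a sanity check.
print(size == len(original))
print(md5.hexdigest() == hashlib.md5(original).hexdigest())
```

Holding back the incomplete group is what lets a chunk boundary fall anywhere in the encoded stream without corrupting the decoded output.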
- get_datastream_info(dsinfo)
  Use regular expressions to pull datastream [version] details (id, mimetype, size, and checksum) for binary content, in order to sanity check the decoded data.
  Parameters: dsinfo – text content just before a binaryContent tag
  Returns: dict with keys for id, mimetype, size, type and digest, or None if no match is found
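As an illustration of this kind of extraction, the sketch below matches a FOXML datastreamVersion tag followed by its contentDigest. The pattern is hypothetical and is not the library's dsinfo_regex; it assumes the attributes appear in the order ID, MIMETYPE, SIZE, and the sample text is invented for the example. Converting size to an integer is a choice made in this sketch.

```python
import re

# Hypothetical pattern for illustration; not the library's dsinfo_regex.
DSINFO_SKETCH = re.compile(
    r'<foxml:datastreamVersion[^>]*\bID="(?P<id>[^"]+)"'
    r'[^>]*\bMIMETYPE="(?P<mimetype>[^"]+)"'
    r'[^>]*\bSIZE="(?P<size>\d+)"[^>]*>\s*'
    r'<foxml:contentDigest[^>]*\bTYPE="(?P<type>[^"]+)"'
    r'[^>]*\bDIGEST="(?P<digest>[^"]+)"[^>]*/>\s*$'
)


def get_datastream_info_sketch(dsinfo):
    """Return a dict of datastream version details, or None if no match."""
    match = DSINFO_SKETCH.search(dsinfo)
    if match is None:
        return None
    info = match.groupdict()
    # Converted so the value can be compared with a computed byte count.
    info['size'] = int(info['size'])
    return info


# Invented sample of the text that would precede a binaryContent tag.
sample = (
    '<foxml:datastream ID="IMAGE" STATE="A" CONTROL_GROUP="M">\n'
    '<foxml:datastreamVersion ID="IMAGE.0" LABEL="image" '
    'MIMETYPE="image/jpeg" SIZE="3017">\n'
    '<foxml:contentDigest TYPE="MD5" '
    'DIGEST="d41d8cd98f00b204e9800998ecf8427e"/>\n'
)

print(get_datastream_info_sketch(sample))
```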
- object_data()
  Process the archival export and return a buffer with foxml content for ingest into the destination repository.
  Returns: io.BytesIO for ingest, with references to uploaded datastream content or content location urls
- url_credentials = ''
  url credentials, if needed for datastream content urls
- syncutil.estimate_object_size(obj, archive=True)
  Calculate a rough estimate of object size, based on the sizes of all versions of all datastreams. If archive is true, adjusts the size estimate of managed datastreams for base64 encoded data.
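The base64 adjustment follows from the encoding itself: every 3 bytes of input become 4 characters of output, so encoded content is roughly a third larger. The sketch below shows that arithmetic; estimate_size_sketch and its (size, is_managed) input pairs are hypothetical and do not reflect how the library reads datastream sizes.

```python
import base64


def base64_encoded_size(raw_size):
    """Size in bytes of raw_size bytes after base64 encoding (with padding)."""
    # Each group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((raw_size + 2) // 3)


def estimate_size_sketch(datastream_sizes, archive=True):
    """Sum (size, is_managed) pairs, inflating managed content for archives."""
    total = 0
    for size, is_managed in datastream_sizes:
        if archive and is_managed:
            total += base64_encoded_size(size)
        else:
            total += size
    return total


# Check the formula against the real encoder for a few lengths.
for n in (0, 1, 2, 3, 4, 1000):
    assert base64_encoded_size(n) == len(base64.b64encode(b'x' * n))

# One small inline datastream and one large managed datastream.
sizes = [(1000, False), (3000000, True)]
print(estimate_size_sketch(sizes, archive=True))
print(estimate_size_sketch(sizes, archive=False))
```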