OAI-PMH
Insights in Biology and Medicine provides comprehensive Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) support to enable efficient discovery, harvesting, and aggregation of our published content. Our OAI-PMH implementation follows version 2.0 of the protocol and supports multiple metadata formats for maximum interoperability.
OAI-PMH Overview
What is OAI-PMH?
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a widely adopted framework designed to facilitate the discovery and sharing of digital resources across repositories. As a data provider, we expose our metadata through standardized interfaces that allow service providers to harvest and aggregate content.
Base URL
https://www.biologymedjournal.com/oai
This is the primary endpoint for all OAI-PMH requests. Every protocol operation is performed by appending query parameters (a verb plus its arguments) to this base URL.
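As a minimal sketch of how a request is formed from the base URL, the Python helper below appends a verb and its arguments as a query string. The function name `build_request_url` is hypothetical and used only for illustration; it is not part of our service.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.biologymedjournal.com/oai"


def build_request_url(verb, params=None):
    """Return the base URL followed by the verb and any further arguments.

    `params` is a plain dict because `from` is a reserved word in Python
    and cannot be passed as a keyword argument.
    """
    query = {"verb": verb}
    query.update(params or {})
    # urlencode percent-encodes reserved characters such as ":" and "/",
    # which OAI-PMH requires for argument values (e.g. identifiers).
    return BASE_URL + "?" + urlencode(query)


print(build_request_url("Identify"))
print(build_request_url("ListRecords", {"metadataPrefix": "oai_dc", "from": "2023-10-01"}))
```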
Protocol Implementation
Supported OAI-PMH Verbs
Our implementation supports all six OAI-PMH protocol verbs:
Verb | Purpose | Required Parameters | Response |
---|---|---|---|
Identify | Retrieve information about the repository | None | Repository identification details |
ListMetadataFormats | List available metadata formats | None (identifier is optional) | Supported metadata schemas |
ListSets | List available collection sets | None (resumptionToken when continuing a list) | Repository organizational structure |
ListIdentifiers | List record identifiers | metadataPrefix | Record headers without metadata |
ListRecords | Harvest complete records | metadataPrefix | Full records with metadata |
GetRecord | Retrieve individual record | identifier, metadataPrefix | Single complete record |
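The required parameters in the table above can be checked on the client side before a request is sent. The following Python sketch is one way to do that; the names `REQUIRED_ARGS` and `missing_arguments` are illustrative assumptions, not part of our implementation.

```python
# Required arguments per verb, taken from the table above.
REQUIRED_ARGS = {
    "Identify": set(),
    "ListMetadataFormats": set(),
    "ListSets": set(),
    "ListIdentifiers": {"metadataPrefix"},
    "ListRecords": {"metadataPrefix"},
    "GetRecord": {"identifier", "metadataPrefix"},
}

# Verbs whose responses may be continued with a resumptionToken.
LIST_VERBS = {"ListSets", "ListIdentifiers", "ListRecords"}


def missing_arguments(verb, params):
    """Return a sorted list of required arguments absent from `params`."""
    if verb not in REQUIRED_ARGS:
        raise ValueError(f"unsupported verb: {verb}")
    # In OAI-PMH 2.0 a resumptionToken is an exclusive argument:
    # when present, it replaces all other arguments of a list request.
    if verb in LIST_VERBS and "resumptionToken" in params:
        return []
    return sorted(REQUIRED_ARGS[verb] - set(params))


print(missing_arguments("GetRecord", {"identifier": "oai:biologymedjournal.com:article/123"}))
```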
Metadata Formats
Repository Sets
Organizational Structure
Our repository is organized into logical sets for targeted harvesting:
Set Specification | Set Name | Description | Content |
---|---|---|---|
research_articles | Research Articles | Original research contributions | All research articles since 2015 |
review_articles | Review Articles | Comprehensive literature reviews | Systematic and narrative reviews |
methodology | Methodology Papers | Novel methods and protocols | Technical and methodological advances |
case_studies | Case Studies | Clinical and research case reports | Detailed case analyses |
year:2023 | 2023 Publications | Current year publications | All articles published in 2023 |
open_access | Open Access Content | All open access articles | Complete OA collection |
Implementation Details
Harvesting Configuration
Our OAI-PMH implementation includes the following technical features:
- Granularity: Day-level granularity (YYYY-MM-DD)
- Deleted Records: Not tracked persistently (check the deletedRecord value in the Identify response, which the protocol defines as no, transient, or persistent)
- Compression: gzip compression supported for large responses
- Rate Limiting: 1000 requests per hour per IP address
- Resumption Tokens: Automatic for result sets > 500 records
- Update Frequency: Real-time updates upon publication
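Because larger result sets are split into pages, a harvester has to follow resumption tokens until none is returned. The sketch below illustrates that loop in Python. The `fetch` callable is a hypothetical stand-in for an HTTP request, and the two sample pages are invented test data rather than real responses from our endpoint.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, metadata_prefix="oai_dc"):
    """Yield every record identifier, following resumption tokens.

    `fetch` takes a dict of request parameters and returns the response XML.
    """
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.findtext(".//" + OAI + "resumptionToken")
        if not token:  # a missing or empty token marks the last page
            break
        # The token replaces all other arguments on follow-up requests.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}


# Invented two-page response used in place of a live request.
PAGE_1 = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
    "<header><identifier>oai:biologymedjournal.com:article/123</identifier></header>"
    "<resumptionToken>page2</resumptionToken>"
    "</ListIdentifiers></OAI-PMH>"
)
PAGE_2 = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
    "<header><identifier>oai:biologymedjournal.com:article/456</identifier></header>"
    "<resumptionToken/>"
    "</ListIdentifiers></OAI-PMH>"
)


def fake_fetch(params):
    return PAGE_2 if params.get("resumptionToken") == "page2" else PAGE_1


print(list(harvest_identifiers(fake_fetch)))
```

Passing the fetch function in as an argument keeps the paging logic independent of any particular HTTP library.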
Usage Examples
Identify Request
https://www.biologymedjournal.com/oai?verb=Identify
List Records (Recent Articles)
https://www.biologymedjournal.com/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2023-10-01
Get Specific Record
https://www.biologymedjournal.com/oai?verb=GetRecord&identifier=oai:biologymedjournal.com:article/123&metadataPrefix=jats
List Records by Set
https://www.biologymedjournal.com/oai?verb=ListRecords&metadataPrefix=oai_dc&set=research_articles
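Responses to requests like those above are XML documents in the OAI-PMH namespace. As an illustration, the Python sketch below pulls identifiers and Dublin Core titles out of a ListRecords response in oai_dc format. The sample response and the function name `extract_titles` are assumptions for demonstration; real responses carry more fields.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Invented sample response, trimmed to the elements used here.
SAMPLE_RESPONSE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords><record>'
    "<header><identifier>oai:biologymedjournal.com:article/123</identifier>"
    "<datestamp>2023-10-05</datestamp></header>"
    "<metadata>"
    '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Example article title</dc:title>"
    "</oai_dc:dc>"
    "</metadata>"
    "</record></ListRecords></OAI-PMH>"
)

print(extract_titles(SAMPLE_RESPONSE))
```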
Common Use Cases
Institutional Repository Integration
Libraries and institutions can harvest our metadata to include in their discovery systems:
Recommended Harvesting Strategy
- Frequency: Weekly incremental harvests
- Metadata Format: oai_dc for basic discovery
- Sets: Use the sets listed under Repository Sets (for example, research_articles) for targeted collection
- From Parameter: Use last harvest date for incremental updates
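The incremental strategy above amounts to remembering the date of the last successful harvest and sending it as the `from` argument. A minimal Python sketch, assuming the harvester stores that date itself (the function name `incremental_params` is hypothetical):

```python
from datetime import date


def incremental_params(last_harvest=None, metadata_prefix="oai_dc", set_spec=None):
    """Build ListRecords arguments for a full or incremental harvest.

    `last_harvest` is a datetime.date, or None for the first (full) harvest.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if last_harvest is not None:
        # Day-level granularity: isoformat() yields YYYY-MM-DD.
        params["from"] = last_harvest.isoformat()
    if set_spec:
        params["set"] = set_spec
    return params


print(incremental_params(date(2023, 10, 1), set_spec="research_articles"))
```

The `from` bound is inclusive in OAI-PMH, so with day-level granularity a harvest may return records already collected on the last harvest date. Deduplicating by record identifier handles this.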
Aggregator Services
Subject-specific aggregators can harvest content for specialized indexes:
Aggregator Type | Recommended Configuration | Update Frequency |
---|---|---|
Biomedical Aggregators | jats format, research_articles set | Daily |
Multidisciplinary Services | oai_dc format, complete repository | Weekly |
Citation Databases | crossref format, all content | Real-time |
Compliance and Standards
Protocol Compliance
Our implementation fully complies with OAI-PMH version 2.0 specifications:
- Standards: OAI-PMH 2.0, Dublin Core, JATS 1.2, OpenAIRE
- Error Handling: Comprehensive error codes and messages
- XML Validation: All responses validate against OAI-PMH schema
- Character Encoding: UTF-8 encoding for international support
- Date Handling: ISO 8601 date formatting
Best Practices for Harvesters
To ensure efficient and respectful harvesting:
- Implement exponential backoff for rate limit errors
- Use resumption tokens for large result sets
- Respect robots.txt directives
- Cache responses appropriately to reduce server load
- Monitor for protocol changes and updates
- Provide clear user-agent identification
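The first practice in the list, exponential backoff, can be sketched as follows in Python. The `request` callable returning a `(status, body)` pair is a hypothetical interface chosen for illustration, and the delay values are assumptions rather than limits published by our service.

```python
import time


def retry_with_backoff(request, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call `request()` until it stops returning HTTP 503.

    The wait doubles after each rate-limited attempt (1s, 2s, 4s, ...)
    and is capped at `max_delay` seconds.
    """
    for attempt in range(max_attempts):
        status, body = request()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")


# Simulated responses: two rate-limit replies, then success.
responses = iter([(503, ""), (503, ""), (200, "<OAI-PMH/>")])
waits = []
result = retry_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

The `sleep` parameter is injectable so the retry logic can be exercised without real waiting.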
Technical Specifications
Supported Parameters
Detailed parameter specifications for each verb:
Parameter | Verb | Required | Description | Format |
---|---|---|---|---|
metadataPrefix | ListRecords, ListIdentifiers, GetRecord | Yes | Requested metadata format | oai_dc, jats, crossref |
from | ListRecords, ListIdentifiers | No | Start date for selective harvesting | YYYY-MM-DD |
until | ListRecords, ListIdentifiers | No | End date for selective harvesting | YYYY-MM-DD |
set | ListRecords, ListIdentifiers | No | Set specification for targeted harvesting | Set identifiers |
resumptionToken | ListRecords, ListIdentifiers, ListSets | No | Token for retrieving next result set | Opaque token string |
identifier | GetRecord, ListMetadataFormats | Yes for GetRecord | Unique record identifier | OAI identifier format |
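Date arguments are a common source of badArgument errors, so it can help to validate them before sending a request. The Python sketch below checks the YYYY-MM-DD format listed in the table and the ordering of `from` and `until`; the helper name `date_range_params` is an illustrative assumption.

```python
import re
from datetime import date

_DAY = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def date_range_params(from_date=None, until_date=None):
    """Validate optional from/until strings and return them as request arguments."""
    parsed = {}
    for name, value in (("from", from_date), ("until", until_date)):
        if value is None:
            continue
        if not _DAY.fullmatch(value):
            raise ValueError(f"{name} must use YYYY-MM-DD, got {value!r}")
        # fromisoformat also rejects impossible dates such as 2023-02-30.
        parsed[name] = date.fromisoformat(value)
    if "from" in parsed and "until" in parsed and parsed["from"] > parsed["until"]:
        raise ValueError("from must not be later than until")
    return {name: day.isoformat() for name, day in parsed.items()}


print(date_range_params("2023-10-01", "2023-10-31"))
```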
Identifier Structure
OAI Identifier Format
Our OAI identifiers follow the recommended structure oai:{repository identifier}:{item type}/{local identifier}.
Examples:
oai:biologymedjournal.com:article/123
oai:biologymedjournal.com:article/456
oai:biologymedjournal.com:review/789
The identifier components are:
- Repository Identifier: biologymedjournal.com
- Item Type: article, review, methodology, etc.
- Local Identifier: Unique numeric ID for the item
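The three components can be separated with a regular expression. The Python sketch below is based only on the example identifiers shown above; the pattern and the names `OaiIdentifier` and `parse_identifier` are assumptions, and identifiers with a different shape would need a looser pattern.

```python
import re
from typing import NamedTuple


class OaiIdentifier(NamedTuple):
    repository: str
    item_type: str
    local_id: int


# Pattern inferred from the examples: oai:<repository>:<item type>/<numeric id>
_PATTERN = re.compile(
    r"oai:(?P<repository>[A-Za-z0-9.-]+):(?P<item_type>[a-z_]+)/(?P<local_id>[0-9]+)"
)


def parse_identifier(identifier):
    """Split an OAI identifier into repository, item type, and local ID."""
    match = _PATTERN.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a recognised OAI identifier: {identifier!r}")
    return OaiIdentifier(
        match["repository"], match["item_type"], int(match["local_id"])
    )


print(parse_identifier("oai:biologymedjournal.com:review/789"))
```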
Performance and Scalability
System Architecture
Our OAI-PMH implementation is built for performance and reliability:
- Response Time: Average < 500ms for standard requests
- Availability: 99.9% uptime guarantee
- Caching: Multi-layer caching for frequently accessed data
- Load Balancing: Distributed across multiple servers
- Monitoring: Real-time performance monitoring and alerts
Troubleshooting
Common Issues and Solutions
Issue | Error Code | Solution |
---|---|---|
Invalid metadataPrefix | cannotDisseminateFormat | Use ListMetadataFormats to see supported formats |
Rate limiting | 503 Service Unavailable | Reduce request frequency or implement backoff |
Invalid date format | badArgument | Use YYYY-MM-DD format for dates |
Large result sets | N/A (incomplete results) | Use resumptionToken to retrieve all results |
Unknown identifier | idDoesNotExist | Verify identifier format and existence |
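Protocol-level errors from the table above are returned inside the response body as `error` elements with a `code` attribute. A harvester can detect them as sketched below in Python. The sample response is invented, and the `ADVICE` mapping simply restates the solutions from the table.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Suggested fixes, restated from the troubleshooting table.
ADVICE = {
    "cannotDisseminateFormat": "Use ListMetadataFormats to see supported formats",
    "badArgument": "Use YYYY-MM-DD format for dates",
    "idDoesNotExist": "Verify identifier format and existence",
}


def find_errors(xml_text):
    """Return (code, message, advice) for each error element in a response."""
    root = ET.fromstring(xml_text)
    errors = []
    # Error elements are direct children of the OAI-PMH root element.
    for element in root.findall("oai:error", NS):
        code = element.get("code")
        message = (element.text or "").strip()
        errors.append((code, message, ADVICE.get(code, "No specific advice")))
    return errors


# Invented error response for an unknown identifier.
SAMPLE_ERROR = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<responseDate>2023-10-05T12:00:00Z</responseDate>"
    '<request verb="GetRecord">https://www.biologymedjournal.com/oai</request>'
    '<error code="idDoesNotExist">No matching identifier</error>'
    "</OAI-PMH>"
)

print(find_errors(SAMPLE_ERROR))
```

Rate limiting is different: it surfaces as an HTTP 503 status rather than as an `error` element, so it has to be checked on the HTTP response itself.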
Testing and Validation
Validation Tools
We regularly test our implementation using standard validation tools:
- Repository Explorer: Interactive testing interface
- Protocol Validators: Automated compliance checking
- Metadata Validation: Schema validation for all formats
- Performance Testing: Load testing for scalability
Developers can use our test endpoint for experimentation: https://www.biologymedjournal.com/oai/test
Future Enhancements
Planned Improvements
We continuously enhance our OAI-PMH implementation:
Enhancement | Timeline | Description |
---|---|---|
Additional Metadata Formats | Q1 2024 | Datacite schema, ORCID integration |
Enhanced Sets | Q2 2024 | Subject-based sets, funder sets |
Bulk Download | Q3 2024 | Compressed metadata packages |
REST API | 2025 | Modern RESTful interface alongside OAI-PMH |
For technical questions about our OAI-PMH implementation, or to report issues, contact:
Technical Support: [email protected]
OAI-PMH Specialist: [email protected]
Documentation: https://www.biologymedjournal.com/oai/docs
We welcome feedback from the community to improve our services and interoperability.