Documents

Overview

Documents are the core resource in FormKiQ. A document combines a stable documentId, optional file content, a path, metadata, attributes, actions, events, and site-level access control.

FormKiQ stores document content in Amazon S3 and stores document metadata in Amazon DynamoDB. Documents belong to a FormKiQ site, identified by siteId, so the same deployment can support separate teams, departments, customers, or document spaces.

Use this page to understand how documents work conceptually. For implementation examples, use the tutorials and the generated API reference.

What Is a Document?

A FormKiQ document can represent:

  • A file uploaded into FormKiQ-managed S3 storage
  • A metadata-only record
  • A reference to an external file through a deep link
  • A document participating in workflows, rulesets, search, OCR, classification, or integrations

The main document fields are:

| Field | Purpose |
| --- | --- |
| documentId | Stable identifier used by the API to retrieve, update, process, or delete the document. |
| siteId | Tenant or site boundary that controls where the document belongs. |
| path | User-facing folder and filename path used for organization and browsing. |
| contentType | MIME type for the document content. |
| contentLength | Size of the document content when available. |
| checksum / checksumType | Optional integrity check for uploaded content. |
| deepLinkPath | External URL or S3 URI when the document points to content outside FormKiQ-managed storage. |
| metadata | Small set of document metadata fields stored with the document record. |
| attributes | Structured business fields used for classification, search, workflows, and access control. |

Document Lifecycle

A typical document lifecycle looks like this:

  1. Create the document metadata.
  2. Upload file content directly or through a presigned S3 URL.
  3. Add attributes, classifications, or relationships.
  4. Run document actions such as OCR, full-text extraction, antivirus scanning, webhooks, or EventBridge notifications.
  5. Search, retrieve, download, or route the document through workflows.
  6. Update metadata, attributes, content, or version information.
  7. Audit activity, review versions, or report on document state.
  8. Soft delete, restore, or permanently delete the document when required.

Document Identity and Organization

Sites and Tenancy

Every document belongs to a site. In a single-tenant setup, most documents use the default site. In a multi-tenant setup, different sites can represent teams, departments, customers, or regional repositories.

For tenancy planning, see Multi-Tenant and Multi-Instance Deployments.

Paths

The path field gives users and systems a familiar way to organize documents. Paths can mirror folders, departments, projects, customers, cases, or workflow stages.

Examples:

  • contracts/acme/master-service-agreement.pdf
  • finance/invoices/2026/05/invoice-1042.pdf
  • hr/policies/benefits-handbook.pdf

Paths are useful for browsing and display, but API integrations should store and use documentId for durable document references.

Relationships

Document relationships connect related files and provide context for how documents belong together.

| Relationship | Use case |
| --- | --- |
| PRIMARY | Main document in a related group. |
| ATTACHMENT | Supporting file linked to a primary document. |
| APPENDIX | Additional reference material. |
| SUPPLEMENT | Standalone supplementary information. |
| ASSOCIATED | Non-hierarchical relationship between documents. |
| RENDITION | Alternative format, translation, or generated output. |
| TEMPLATE | Template used by document generation. |
| DATASOURCE | Data source used by document generation. |

Metadata vs Attributes

FormKiQ supports both document metadata and document attributes. They are related, but they are not the same thing.

| Type | Best for | Notes |
| --- | --- | --- |
| Standard metadata | Core document facts such as path, content type, size, checksum, dates, and deep link path. | Created and maintained as part of the document record. |
| Extended metadata | Small amounts of custom data stored directly with the document. | Limited to 25 metadata entries per document. |
| Attributes | Structured business fields such as invoice number, department, owner, confidentiality, status, or classification. | Better for search, workflows, schemas, classification, and access policies. |

Note: Each document supports up to 25 metadata entries. For business classification, search filters, rules, and access decisions, prefer document attributes.
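As a rough illustration of the 25-entry limit, a client could check the metadata list before sending it. This is only a sketch: the function name is hypothetical, and the entry shape (key, value) is borrowed from the event payload schema shown later on this page rather than from a confirmed request schema.

```python
from typing import Dict, List

MAX_METADATA_ENTRIES = 25  # limit stated on this page


def check_metadata_limit(metadata: List[Dict[str, object]]) -> None:
    """Raise ValueError if a document would carry more than 25 metadata entries."""
    if len(metadata) > MAX_METADATA_ENTRIES:
        raise ValueError(
            f"{len(metadata)} metadata entries exceeds the limit of "
            f"{MAX_METADATA_ENTRIES}; consider using attributes instead"
        )


# Exactly 25 entries is still within the limit, so this call does not raise.
entries = [{"key": f"field{i}", "value": str(i)} for i in range(25)]
check_metadata_limit(entries)
```

A check like this only catches the problem early on the client side; the service remains the authority on what it accepts.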

Document Attributes

Document attributes help organize and automate document-heavy processes. Common uses include:

  • Classification, such as documentType, department, or confidentiality
  • Search filters, such as vendor, invoiceNumber, or projectId
  • Workflow routing, such as status, priority, or assignedTeam
  • Access control, such as owner or classification
  • Reporting, such as region, customer, or retentionCategory

For the full attributes guide, see Attributes. For schema-based classification, see Schemas.

Upload and Storage Options

Choose the upload pattern based on file size, source system, and integration requirements.

| Method | Best for | API or feature |
| --- | --- | --- |
| Inline document create | Small text, JSON, Markdown, or base64 content under the inline content limit. | POST /documents |
| Presigned upload | Most binary files, large files, PDFs, images, Office files, and browser uploads. | POST /documents/upload |
| Existing upload URL | Uploading content for a document where metadata has already been created. | GET /documents/{documentId}/upload |
| Deep link | Referencing external content without copying it into FormKiQ storage. | Deep Links / External Documents |
| FileSync or import tooling | Large imports, migrations, and bulk file movement. | FileSync CLI |

For storage architecture and bucket behavior, see Document Storage.

Deep links let FormKiQ manage a document record whose content lives outside FormKiQ-managed storage. The document can still have a FormKiQ path, attributes, relationships, and metadata.

Use deep links when:

  • Content already exists in another repository.
  • You want FormKiQ to index, classify, or organize external records.
  • You want to avoid copying large volumes of content during an initial integration.
  • Another system remains the system of record for the file content.

Deep links also have limitations. FormKiQ may not be able to process, OCR, extract, scan, or download external content unless it has access to the target object. For stronger integrations with Microsoft 365, SharePoint, Google Workspace, and related systems, see Document Gateways.

S3 deep links reference documents that already exist in your own S3 bucket or another AWS account's S3 bucket. This can reduce migration overhead and avoid duplicate storage when content already lives in S3.

For cross-account S3 deep links:

  • The source bucket must allow FormKiQ to read the referenced object.
  • Access should be scoped to the FormKiQ AWS account and expected execution role.
  • Bucket policy conditions should require secure transport.
  • If the source bucket uses a customer managed KMS key, the key policy must allow decrypt access through S3.

The exact bucket and KMS policy should be reviewed against your account, region, encryption, and least-privilege requirements. For broader storage guidance, see Document Storage.
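One possible starting point for the source bucket policy is sketched below. The bucket name, account ID, and role name are placeholders and not values from FormKiQ; the statement grants read-only object access to a single role and only over TLS. It does not cover the KMS key policy, which needs its own review.

```python
import json

# Placeholder values (assumptions): substitute your own bucket and the role
# that your FormKiQ deployment actually uses to read objects.
SOURCE_BUCKET = "example-source-bucket"
FORMKIQ_ROLE_ARN = "arn:aws:iam::111122223333:role/example-formkiq-execution-role"


def build_read_only_bucket_policy(bucket: str, role_arn: str) -> dict:
    """Build a bucket policy that lets one role read objects over TLS only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowFormKiQReadOverTls",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Only grant access when the request uses secure transport.
                "Condition": {"Bool": {"aws:SecureTransport": "true"}},
            }
        ],
    }


policy = build_read_only_bucket_policy(SOURCE_BUCKET, FORMKIQ_ROLE_ARN)
print(json.dumps(policy, indent=2))
```

Scoping the principal to a role ARN rather than a whole account keeps the grant closer to least privilege, at the cost of updating the policy if the role changes.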

Document Management Features

Document Actions

Document actions run processing or integration tasks against a document. Actions can be requested when a document is created, added later through the API, or triggered by workflows and rules depending on the deployment.

Common document action use cases:

| Use case | What it does |
| --- | --- |
| OCR | Extracts text from images and PDFs so scanned documents can become searchable. |
| Full-text extraction | Extracts text content for indexing and search. |
| Document tagging | Uses AI or configured extraction logic to generate tags or metadata. |
| Webhook | Calls an external system after a document event or processing step. |
| Notification | Sends an email notification when configured conditions occur. |
| Intelligent Document Processing | Classifies, extracts, and validates structured data from documents. |
| EventBridge | Publishes document events to Amazon EventBridge for downstream automation. |
| Antivirus | Scans documents for malware when the antivirus module is enabled. |

Supported Actions

| Action | Description | Availability |
| --- | --- | --- |
| ANTIVIRUS | Scans documents using ClamAV for malicious content. | Explore and commercial deployments. |
| DOCUMENTTAGGING | Generates document tags using configured AI or tagging logic. | Core. |
| EVENTBRIDGE | Sends document data and metadata to Amazon EventBridge. | Core. |
| FULLTEXT | Extracts and indexes text content for search using Typesense or OpenSearch. | Core with Typesense; OpenSearch is available as an add-on for Advanced and Enterprise. |
| IDP | Runs intelligent document processing using mappings and attributes. | Explore, commercial deployments, and optional add-ons. |
| NOTIFICATION | Sends email notifications. | Core. |
| OCR | Extracts text from images or PDFs. | Core with Tesseract; Amazon Textract is available with Explore, commercial deployments, and optional add-ons. |
| PUBLISH | Publishes approved documents. | Explore and commercial deployments. |
| WEBHOOK | Calls an external webhook. | Core. |

For endpoint details, see Get Document Actions and Add Document Actions.

Document Versions

Document versioning preserves the history of document content and metadata changes. It can support audit review, rollback, comparison, and controlled document management.

Versioning is useful for:

  • Legal contracts
  • Policies and procedures
  • Financial reports
  • Compliance records
  • Customer-facing generated documents

Note: Document versioning is not available with FormKiQ Core. It is available as part of FormKiQ Explore and commercial deployments.

For details, see Document Versioning and Get Document Versions.

Soft Deletes and Restore

Soft delete removes a document from normal active-document listings without immediately purging all record information. This can help with recovery, review, and controlled deletion workflows.

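Use:

  • DELETE /documents/{documentId} to soft delete or permanently delete a document, depending on parameters and configuration.
  • PUT /documents/{documentId}/restore to restore a soft-deleted document.

For walkthrough details, see Soft Deletes.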

Document User Activities

Document activity records provide visibility into who accessed or changed a document and when the action occurred. Activity data can support audit reviews, governance workflows, security investigations, and reporting.

Note: Document user activities are not available with FormKiQ Core. They are available as part of FormKiQ Explore and commercial deployments.

For details, see Get Document User Activities and Reporting, Analytics, and Audit.

Document Events Features

FormKiQ can publish document events when documents are created, updated, processed, deleted, or restored. Events let downstream systems react without being tightly coupled to the document API.

Use document events for:

  • Workflow automation
  • Search indexing
  • Analytics pipelines
  • External notifications
  • CRM, ERP, or case management integration
  • Compliance or audit reporting

Amazon EventBridge

As of version 1.17.0, each FormKiQ installation includes an Amazon EventBridge integration. FormKiQ publishes document events to EventBridge so other systems can subscribe and react.

EventBridge supports:

  • Real-time routing to subscribed targets
  • Decoupled downstream processing
  • Scalable event-driven integration
  • AWS-native filtering and routing

Supported Event Types

Each EventBridge document event uses the DetailType field to identify the type of event.

| DetailType | Description |
| --- | --- |
| New Document Create Metadata | Triggered when a new document metadata record is created. |
| New Document Create Content | Triggered when a document is created or updated with new content. |
| Document Delete | Triggered when a document is deleted. |
| Document Soft Delete | Triggered when a document is soft deleted. |
| Document Restore | Triggered when a document is restored from soft delete. |
| Document Create Metadata | Triggered when metadata or attributes are added to an existing document. |
| Document Update Metadata | Triggered when document metadata or attributes are changed. |
| Document Delete Metadata | Triggered when document metadata is deleted. |
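To subscribe to a subset of these events, an EventBridge rule can match on the detail-type key, which is where EventBridge exposes the DetailType field in a delivered event. The sketch below only builds the pattern document; the helper name is hypothetical, and a real rule would usually also filter on the event source, which this page does not specify.

```python
import json
from typing import Iterable

# DetailType values listed on this page.
KNOWN_DETAIL_TYPES = {
    "New Document Create Metadata",
    "New Document Create Content",
    "Document Delete",
    "Document Soft Delete",
    "Document Restore",
    "Document Create Metadata",
    "Document Update Metadata",
    "Document Delete Metadata",
}


def build_event_pattern(detail_types: Iterable[str]) -> dict:
    """Return an EventBridge event pattern matching the given DetailType values."""
    selected = list(detail_types)
    unknown = [d for d in selected if d not in KNOWN_DETAIL_TYPES]
    if unknown:
        raise ValueError(f"unknown DetailType values: {unknown}")
    # Rule patterns match the DetailType field under the "detail-type" key.
    return {"detail-type": selected}


pattern = build_event_pattern(["Document Soft Delete", "Document Restore"])
print(json.dumps(pattern))
```

Validating against the known list catches typos such as a missing "New" prefix, which would otherwise produce a rule that silently never matches.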

Event Payload Schema

EventBridge payloads use a consistent document schema. Fields can vary based on event type and document state.

{
  "siteId": "string",
  "path": "string",
  "deepLinkPath": "string",
  "insertedDate": "string",
  "lastModifiedDate": "string",
  "checksum": "string",
  "checksumType": "SHA1",
  "documentId": "string",
  "contentType": "string",
  "userId": "string",
  "contentLength": 0,
  "versionId": "string",
  "metadata": [
    {
      "key": "string",
      "value": "string",
      "values": ["string"]
    }
  ],
  "changed": {
    "path": "previous/path.txt"
  },
  "attributes": {
    "department": {
      "stringValue": "Finance"
    }
  },
  "url": "S3 presigned URL",
  "addedAttributes": ["department"],
  "changedAttributes": {
    "status": {
      "stringValue": "Approved"
    }
  }
}
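A consumer such as an AWS Lambda function might pull a few fields out of this payload as sketched below. Two assumptions are made here: that the document payload arrives under the standard EventBridge detail key, and that changed holds the previous value of a field, as the example path suggests. The function name is hypothetical.

```python
from typing import Any, Dict


def summarize_document_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Extract routing-relevant fields from an EventBridge document event."""
    # Assumption: the document payload is carried in the "detail" envelope key.
    detail = event.get("detail", {})
    changed_attributes = detail.get("changedAttributes", {})
    return {
        "detailType": event.get("detail-type"),
        "siteId": detail.get("siteId"),
        "documentId": detail.get("documentId"),
        "path": detail.get("path"),
        # Assumption: "changed" carries the value before the update.
        "previousPath": detail.get("changed", {}).get("path"),
        "changedAttributeKeys": sorted(changed_attributes),
    }


sample_event = {
    "detail-type": "Document Update Metadata",
    "detail": {
        "siteId": "default",
        "documentId": "example-document-id",
        "path": "finance/invoices/2026/05/invoice-1042.pdf",
        "changed": {"path": "previous/path.txt"},
        "changedAttributes": {"status": {"stringValue": "Approved"}},
    },
}
print(summarize_document_event(sample_event))
```

Because fields vary by event type and document state, every lookup uses a default rather than assuming the key is present.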

Amazon SNS (Legacy)

FormKiQ also supports a legacy Amazon SNS notification mechanism for backward compatibility with systems that have not migrated to EventBridge.

Supported Event Types

| Type | DetailType | Description |
| --- | --- | --- |
| CONTENT | Document Create Event | Triggered when a document is created or updated with new content. |
| DELETE_METADATA | Document Delete Metadata | Triggered when metadata is removed from a document. |
| SOFT_DELETE_METADATA | Document Soft Delete Metadata | Triggered when metadata is soft deleted for a document. |

SNS Subscription Policy Filter

SNS subscribers can use a subscription filter policy to receive only selected event types. The subscription filter inspects the type message attribute.

Supported values:

  • create
  • delete
  • softDelete

Example filter for create events:

{
  "type": ["create"]
}

Example filter for delete and soft-delete events:

{
  "type": ["delete", "softDelete"]
}
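The filter documents above can also be generated rather than hand-written. The helper below is a hypothetical sketch that validates the requested values against the supported list on this page and returns the JSON string; with Amazon SNS, such a string is what a subscription's FilterPolicy attribute expects.

```python
import json
from typing import Iterable

# Supported values of the "type" message attribute, as listed on this page.
SUPPORTED_TYPES = ("create", "delete", "softDelete")


def build_filter_policy(types: Iterable[str]) -> str:
    """Return an SNS subscription filter policy as a JSON string."""
    selected = list(types)
    for value in selected:
        if value not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported event type: {value!r}")
    return json.dumps({"type": selected})


print(build_filter_policy(["delete", "softDelete"]))
```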

Event Payload Schema

{
  "siteId": "string",
  "path": "string",
  "s3bucket": "string",
  "s3key": "string",
  "type": "string",
  "documentId": "string",
  "content": "string",
  "contentType": "string",
  "userId": "string",
  "url": "S3 presigned URL"
}

Document Checksum

FormKiQ supports S3 checksum validation for document uploads. When requesting an upload URL, you can provide a checksum and checksum type. Amazon S3 validates the checksum when the file is uploaded and rejects the upload if the checksum does not match.

Use checksums when:

  • Upload integrity is important.
  • Files are transferred over unreliable networks.
  • A source system already computes SHA checksums.
  • Regulated workflows require evidence that content was not altered during upload.

SHA-256 Example

Request an upload URL with SHA-256 checksum validation:

{
  "path": "mydoc.txt",
  "checksum": "6719766fe1a874fcf79c636a1be3ae37d0bf84ca08032c26fbd63f3fd837cda3",
  "checksumType": "SHA256"
}

The API response returns upload headers that must be included when uploading the file to S3:

{
  "headers": {
    "x-amz-checksum-sha256": "Zxl2b+GodPz3nGNqG+OuN9C/hMoIAywm+9Y/P9g3zaM=",
    "x-amz-sdk-checksum-algorithm": "SHA256"
  },
  "documentId": "09c10219...",
  "url": "https://..."
}

Upload the file using the returned URL and headers:

curl -X PUT "<url from response>" \
  -H "x-amz-checksum-sha256: Zxl2b+GodPz3nGNqG+OuN9C/hMoIAywm+9Y/P9g3zaM=" \
  -H "x-amz-sdk-checksum-algorithm: SHA256" \
  --upload-file ./mydoc.txt
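In this example, the hex checksum in the request and the base64 value in the returned header are two encodings of the same SHA-256 digest. The sketch below shows how a client might compute the hex form from file bytes and how the two forms relate; the helper names are illustrative and not part of any FormKiQ SDK.

```python
import base64
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest, the form used in the request body above."""
    return hashlib.sha256(data).hexdigest()


def hex_to_base64(hex_digest: str) -> str:
    """Re-encode a hex digest as base64, the form used in the S3 checksum header."""
    return base64.b64encode(bytes.fromhex(hex_digest)).decode("ascii")


# Hex checksum taken from the request example on this page.
example_hex = "6719766fe1a874fcf79c636a1be3ae37d0bf84ca08032c26fbd63f3fd837cda3"
print(hex_to_base64(example_hex))
# prints Zxl2b+GodPz3nGNqG+OuN9C/hMoIAywm+9Y/P9g3zaM=
```

To checksum a real file, read it in binary mode and pass the bytes to sha256_hex. In normal use the headers returned by the API should be sent as-is; the conversion is mainly useful for verifying that the header matches the file you intend to upload.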

Best Practices

Document Organization

Use consistent paths and naming conventions so documents remain easy to browse, search, and govern.

Good path examples:

  • clients/acme/contracts/master-service-agreement.pdf
  • finance/invoices/2026/05/invoice-1042.pdf
  • projects/alpha/design/final-specification.docx

Prefer attributes for business meaning instead of encoding every detail into the path. For example:

{
  "key": "department",
  "stringValue": "Legal"
}
{
  "key": "status",
  "stringValue": "In Review"
}
{
  "key": "keywords",
  "stringValues": ["contract", "renewal", "client-facing"]
}
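Attribute entries in this shape can be generated from a plain mapping of business fields. The helper below is a hypothetical sketch that covers only the two string shapes shown above (stringValue and stringValues); other value types are outside what this page documents.

```python
from typing import Dict, List, Union

AttributeValue = Union[str, List[str]]


def to_attribute_entries(fields: Dict[str, AttributeValue]) -> List[dict]:
    """Convert a field mapping into attribute entries in the shape shown above."""
    entries = []
    for key, value in fields.items():
        if isinstance(value, list):
            # Multi-valued fields use "stringValues".
            entries.append({"key": key, "stringValues": list(value)})
        else:
            # Single-valued fields use "stringValue".
            entries.append({"key": key, "stringValue": value})
    return entries


attributes = to_attribute_entries(
    {
        "department": "Legal",
        "status": "In Review",
        "keywords": ["contract", "renewal", "client-facing"],
    }
)
print(attributes)
```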

Version Control

Use document versioning for records where history matters, such as legal contracts, financial reports, compliance documents, and policies.

Consider adding attributes that describe version state:

{
  "key": "changeNotes",
  "stringValue": "Updated payment terms"
}
{
  "key": "approvedBy",
  "stringValue": "Thomas Johnson"
}

Security

Configure access controls based on how documents are used.

  • Use RBAC for site-level and group-level access.
  • Use folder permissions when a subset of a site requires narrower access.
  • Use ABAC and OPA when document attributes should affect access decisions.
  • Use GOVERN access for data governance and document control roles.
  • Review user activity data where auditability is required.

For details, see Security.

Backup and Recovery

Plan recovery around the document content, metadata, search index, and audit data.

  • Review S3 versioning and lifecycle requirements.
  • Confirm DynamoDB Point-in-Time Recovery.
  • Plan OpenSearch snapshot and restore if enhanced search is installed.
  • Test restore procedures before production.
  • Align retention with compliance and operational requirements.

For details, see Backup and Recovery.

API Document Endpoints

Use the generated API reference for exact request and response schemas.

| Operation | Purpose | API reference |
| --- | --- | --- |
| Add document | Create a document, optionally with small inline content. | POST /documents |
| List documents | Retrieve recent or filtered documents. | GET /documents |
| Get document | Retrieve metadata for a specific document. | GET /documents/{documentId} |
| Update document | Update document metadata or content. | PATCH /documents/{documentId} |
| Delete document | Soft delete or delete a document. | DELETE /documents/{documentId} |
| Restore document | Restore a soft-deleted document. | PUT /documents/{documentId}/restore |
| Get content URL | Retrieve a presigned URL for document content. | GET /documents/{documentId}/url |
| Get content | Retrieve text content directly or a content URL for binary content. | GET /documents/{documentId}/content |
| Create upload URL | Create a presigned URL for uploading document content. | POST /documents/upload |
| Get upload URL | Retrieve a presigned URL for uploading content to an existing document. | GET /documents/{documentId}/upload |

POST /documents

Creates a document. This endpoint is useful for metadata-only documents and small inline content. For most files, use POST /documents/upload.

GET /documents

Lists documents the caller is authorized to access. Results can be filtered and paginated.

GET /documents/<documentId>

Retrieves metadata for a specific document.

PATCH /documents/<documentId>

Updates document metadata and, in supported cases, document content.

DELETE /documents/<documentId>

Deletes a document. Depending on parameters and configuration, this can be a soft delete or permanent delete.

GET /documents/<documentId>/url

Returns a presigned URL for downloading document content.

GET /documents/<documentId>/content

Returns direct text content for supported text content types or a presigned content URL for other files.

POST /documents/upload

Creates a document upload URL. This is the recommended path for most file uploads.

API Document Attribute Endpoints

| Operation | Purpose | API reference |
| --- | --- | --- |
| Add attributes | Add multiple attributes to a document. | POST /documents/{documentId}/attributes |
| Get attributes | List attributes for a document. | GET /documents/{documentId}/attributes |
| Set attributes | Replace or set document attributes. | PUT /documents/{documentId}/attributes |
| Delete attribute | Delete an attribute from a document. | DELETE /documents/{documentId}/attributes/{attributeKey} |

POST /documents/<documentId>/attributes

Adds standard attributes, classification attributes, or relationship attributes to a document.

GET /documents/<documentId>/attributes

Retrieves the attributes attached to a document.

API Document Action Endpoints

| Operation | Purpose | API reference |
| --- | --- | --- |
| Get actions | Retrieve a document's actions and current status. | GET /documents/{documentId}/actions |
| Add action | Run an action such as OCR, full-text extraction, webhook, notification, IDP, EventBridge, or antivirus. | POST /documents/{documentId}/actions |
| Retry action | Retry a failed or selected document action. | POST /documents/{documentId}/actions/{actionId}/retry |

GET /documents/<documentId>/actions

Retrieves the processing actions associated with a document.

POST /documents/<documentId>/actions

Adds a processing or integration action to a document. For action-specific parameters, use the generated API reference.

Where to Go Next