Architecture

Overview

FormKiQ is an API-first document management platform that can run in any Amazon Web Services (AWS) account.

Document management is at the core of most organizations, with numerous documents that need to be stored, tracked, managed and organized. FormKiQ strives to be the most flexible, customizable and cost-effective document management system available.

FormKiQ’s flexibility and customizability come from its API-first design. All document management functionality (CRUD) is exposed through a robust set of APIs. This allows anyone to quickly and easily add document management functionality to any application and cut months off development time, while the FormKiQ Console gives organizations a full-featured document management system with as little or as much customization as they need.

FormKiQ was built to run in any AWS account. This ensures that you maintain full control of your documents 100% of the time.

Building From Source

FormKiQ-Core is built with Java and JavaScript. To build from source, you will need to install the development tools listed below.

Required Development Tools

Running Build

FormKiQ-Core uses Gradle as the main build tool.

To compile:

./gradlew clean build

Database

FormKiQ uses Amazon DynamoDB as the database for storing document data. DynamoDB is a fast, flexible NoSQL database service that delivers single-digit millisecond performance at any scale. It is also fully managed by AWS, so there are no database servers to manage.

FormKiQ stores its data in a small number of DynamoDB tables and, for the most part, follows a single-table design. Because the data is stored in a single table, querying and retrieving it is simplified, as there is no need to join multiple tables together. A single-table design also allows for more efficient and flexible data access, as well as easier scalability and lower costs.

Separate DynamoDB tables are used where data can become stale or outdated and needs to be archived.

Documents

The documents table stores document records along with all metadata.

If a siteId is used, the PK is prefixed with "{siteId}/".

Document Record

Key(s)

Key      Format
PK       docs#{documentId}
SK       document
GSI1PK   docts#{yyyy-MM-dd}
GSI1SK   {yyyy-MM-dd'T'HH:mm:ssZ}#{documentId}
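
The key formats above can be sketched as a small helper. This is only an illustration of the string layout, not FormKiQ's own code: the function name and parameters are hypothetical, the timestamp is assumed to be rendered with a numeric UTC offset such as +0000, and only the PK receives the siteId prefix because that is all the text above states.

```python
from datetime import datetime, timezone


def document_record_keys(document_id, inserted, site_id=None):
    """Build the key attributes of a document record (hypothetical helper).

    `inserted` must be a timezone-aware datetime so that %z produces an
    offset such as +0000.
    """
    # The text only states that PK is prefixed with "{siteId}/"; whether the
    # index keys are prefixed as well is not covered here.
    prefix = f"{site_id}/" if site_id else ""
    day = inserted.strftime("%Y-%m-%d")
    timestamp = inserted.strftime("%Y-%m-%dT%H:%M:%S%z")
    return {
        "PK": f"{prefix}docs#{document_id}",
        "SK": "document",
        "GSI1PK": f"docts#{day}",
        "GSI1SK": f"{timestamp}#{document_id}",
    }


if __name__ == "__main__":
    created = datetime(2024, 1, 15, 10, 30, 0, tzinfo=timezone.utc)
    for name, value in document_record_keys("abc", created, "group1").items():
        print(name, value)
```

Grouping GSI1PK by day and sorting GSI1SK by timestamp suggests that the index is meant for listing documents by date, although the exact query patterns are not described in this section.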

Subdocument Record

Key(s)

Key      Format
PK       docs#{documentId}
SK       document#{childDocumentId}
GSI1PK   docts#{yyyy-MM-dd}
GSI1SK   {yyyy-MM-dd'T'HH:mm:ssZ}#{documentId}

Document Tag

Key      Format
PK       docs#{documentId}
SK       tags#{tagKey}
GSI1PK   tag#{tagKey}#{tagValue}
GSI1SK   {yyyy-MM-dd'T'HH:mm:ssZ}#{documentId}
GSI2PK   tag#{tagKey}
GSI2SK   {tagValue}#{yyyy-MM-dd'T'HH:mm:ssZ}#{documentId}

Document Tag (Multi-Value)

Key      Format
PK       docs#{documentId}
SK       tags#{tagKey}#idx{index}
GSI1PK   tag#{tagKey}#{tagValue}
GSI1SK   {yyyy-MM-dd'T'HH:mm:ssZ}#{documentId}
GSI2PK   tag#{tagKey}
GSI2SK   {tagValue}#{yyyy-MM-dd'T'HH:mm:ssZ}#{documentId}
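
To make the difference between single-value and multi-value tag records concrete, the following sketch builds the keys for both. The function names are hypothetical, the timestamp is passed in as an already formatted string, and the index is assumed to start at 0, which the tables above do not specify.

```python
def tag_record_keys(document_id, tag_key, tag_value, timestamp, index=None):
    """Build the key attributes of one tag record (hypothetical helper)."""
    sort_key = f"tags#{tag_key}"
    if index is not None:
        # Multi-value tags get one record per value, told apart by the index.
        sort_key += f"#idx{index}"
    return {
        "PK": f"docs#{document_id}",
        "SK": sort_key,
        "GSI1PK": f"tag#{tag_key}#{tag_value}",
        "GSI1SK": f"{timestamp}#{document_id}",
        "GSI2PK": f"tag#{tag_key}",
        "GSI2SK": f"{tag_value}#{timestamp}#{document_id}",
    }


def multi_value_tag_records(document_id, tag_key, tag_values, timestamp):
    """Build one record per value; the 0-based index is an assumption."""
    return [
        tag_record_keys(document_id, tag_key, value, timestamp, index=i)
        for i, value in enumerate(tag_values)
    ]


if __name__ == "__main__":
    stamp = "2024-01-15T10:30:00+0000"
    print(tag_record_keys("abc", "category", "document", stamp))
    for record in multi_value_tag_records("abc", "user", ["1", "2"], stamp):
        print(record)
```

Judging by the formats, GSI1 is keyed on an exact key and value pair, while GSI2 is keyed on the tag key alone with the value leading the sort key, which would support queries by tag key regardless of value.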

Document Actions

Key   Format
PK    docs#{documentId}
SK    action#{index}#{type}

Document Syncs

The syncs table records document synchronization timestamps from external services.

Key   Format
PK    docs#{documentId}
SK    syncs#{yyyy-MM-dd'T'HH:mm:ssZ}#{UUID}

Data Caching

The data caching table holds data temporarily.

Storage

FormKiQ uses the Amazon Simple Storage Service (Amazon S3) as the backend object store for all documents. Amazon S3 is a managed object storage service that offers industry-leading scalability, data availability, security, and performance.

Amazon S3 is a cost-effective storage solution that’s easy to use, supports multiple storage classes for cost optimization, and allows for fine-tuned access controls to meet specific business, organizational, and compliance requirements.

By default FormKiQ installs with two S3 buckets.

Bucket      Description
Staging     A temporary holding place for documents waiting for processing
Documents   The permanent post-processing document storage

Path Layout

FormKiQ is a multi-tenant application, so a specific S3 key structure is used to identify which tenant owns the document.

Documents added to ROOT

Any documents that are added to the "ROOT" of the S3 bucket, e.g. a document with S3 key of document1.txt, are assumed to be part of the DEFAULT siteId.

Documents can also be added to the DEFAULT siteId if the key starts with default, e.g. S3 key of default/document1.txt.

Documents added to SiteId

Documents can be added to a specific siteId by having that siteId as the first "folder" of the key, e.g. S3 key of group1/document1.txt will add the document1.txt to the group1 siteId.

Documents with a PATH

As of version 1.7.0, documents can be added with a path tag created automatically. This follows the same pattern as above, except that the S3 key MUST start with either default or the siteId.

For example:

S3 key of default/dir1/dir2/document1.txt will add a document with a path tag of dir1/dir2/document1.txt to the default siteId.

S3 key of group1/dir1/dir2/document2.txt will add a document with a path tag of dir1/dir2/document2.txt to the group1 siteId.
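
The key layout rules above can be summarized in a short sketch. The function name is hypothetical and this is not FormKiQ's implementation; it only mirrors the rules as described: a key without a "folder" belongs to the default siteId, and otherwise the first segment is the siteId and the remainder is the path.

```python
def parse_staging_key(s3_key):
    """Split an S3 key into (site_id, path) following the layout above."""
    first, separator, rest = s3_key.partition("/")
    if not separator:
        # No "folder" at all: the document sits in the ROOT of the bucket.
        return "default", s3_key
    # The first segment is either "default" or a custom siteId.
    return first, rest


if __name__ == "__main__":
    for key in (
        "document1.txt",
        "default/document1.txt",
        "group1/document1.txt",
        "group1/dir1/dir2/document2.txt",
    ):
        print(key, "->", parse_staging_key(key))
```

Note that under these rules a ROOT-level key can never carry a directory path, since its first "folder" would be read as a siteId.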

Add Document Workflow

S3 Architecture

Documents can be added to S3 via the FormKiQ API or directly to the Staging S3 bucket. While it is recommended to only use the API for your standard workflow, it can be advantageous to add documents directly to the Staging S3 bucket, for operations such as initial document migration.

When a document is added to the Staging S3 bucket, an S3 object create event is created that calls the Document Create AWS Lambda. The Document Create Lambda writes a record to Amazon DynamoDB, and moves the document to the Documents S3 bucket.

Once the document is added to the Documents S3 bucket, another S3 event is created which adds a message to the Update Document Amazon SQS queue. An Update Document Lambda is listening to the Update Document SQS queue and adds and updates document metadata whenever an event is added to the queue. Any S3 object tags that have been specified will also be included as document metadata.

Each time a document is created or updated, the AWS Lambda function also posts a message to Amazon Simple Notification Service, which can be used to trigger additional document processing.

FKB64 File Format

For initial document migration or other occasional uses, the Staging S3 bucket allows direct uploads using an internal file format.

Writing files directly to the Documents S3 bucket (i.e., not the Staging bucket) is NOT supported and may cause stability issues.

As of version 1.7.0, you can use the S3 layout described above if the S3 key ends in .fkb64.

For example, creating the following JSON and saving it as document1.fkb64 in the ROOT of the Staging bucket will add the content field as a document in the default siteId.

Required fields are marked below.

{
  "path": "document1.txt",
  "userId": "joesmith", // <required>
  "contentType": "text/plain", // <required>
  "isBase64": true, // <required>
  "content": "dGhpcyBpcyBhIHRlc3Q=", // <required>
  "tags": [
    {
      "key": "category",
      "value": "document"
    },
    {
      "key": "user",
      "values": ["1", "2"]
    }
  ],
  "metadata": [
    {
      "key": "property1",
      "value": "value1"
    }
  ]
}

Note: The .fkb64 file format matches the Add Document Request. Refer to the API for a listing of all properties.
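
A file in this format could be generated with a few lines of Python. This is a minimal sketch that covers only the fields shown in the example above; the function name and parameters are hypothetical, and uploading the result to the Staging bucket is left out.

```python
import base64
import json


def build_fkb64(content, user_id, content_type, path=None, tags=None):
    """Return the JSON text of a .fkb64 file (hypothetical helper).

    `content` is the raw document as bytes; it is Base64-encoded here, so
    isBase64 is always set to true.
    """
    document = {
        "userId": user_id,
        "contentType": content_type,
        "isBase64": True,
        "content": base64.b64encode(content).decode("ascii"),
    }
    # Optional fields are only written when a value was supplied.
    if path is not None:
        document["path"] = path
    if tags:
        document["tags"] = tags
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    text = build_fkb64(
        b"this is a test",
        user_id="joesmith",
        content_type="text/plain",
        path="document1.txt",
        tags=[{"key": "category", "value": "document"}],
    )
    # Save the output as document1.fkb64 before uploading it.
    print(text)
```

Unlike the annotated example above, the generated file contains no comments, since standard JSON parsers reject them.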

API

The API is built using Amazon API Gateway. Amazon API Gateway is a fully-managed service that handles all of the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, CORS support, authorization and access control, throttling, and monitoring.

FormKiQ deploys with two APIs. One API is deployed with JWT authentication using Amazon Cognito as the JWT authorizer.

A second identical API is deployed using AWS Identity and Access Management (IAM).

The JWT-authenticated API is great for handling user requests, while the IAM-authenticated API is great for machine-to-machine or backend processing.

All endpoints require either Cognito or IAM authentication unless the URL starts with /public; the /public endpoint can be used to allow publicly-submitted documents such as web forms.
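
The authentication rule can be expressed as a one-line check. This sketch is illustrative only: the function name and the sample paths are hypothetical, and it assumes that /public must be a whole path segment, which is a stricter reading than "starts with /public".

```python
def requires_authentication(path):
    """Return True when a request path needs Cognito or IAM authentication."""
    # Assumption: only "/public" itself and paths below it are open, so a
    # path such as "/publications" is still treated as protected.
    is_public = path == "/public" or path.startswith("/public/")
    return not is_public


if __name__ == "__main__":
    for sample in ("/documents", "/public/documents", "/publications"):
        print(sample, requires_authentication(sample))
```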

Services

The following is a list of external or third-party services FormKiQ uses.

Typesense

Typesense Architecture

Typesense is an open source search solution and can be used as a replacement for Elasticsearch. FormKiQ uses it to provide full-text search for document metadata.

FormKiQ uses change data capture for DynamoDB to record all data changes in DynamoDB and then update Typesense.

Document Events

Document events are a powerful feature of FormKiQ. These events allow operations to be triggered on documents automatically, whenever a change occurs. For example, when a document is created, a document event can be triggered to perform one or many actions, such as:

  • sending an email notification

  • scanning for viruses

  • inserting data into a database

  • etc.

Document events are created and sent through Amazon Simple Notification Service (SNS). Amazon SNS is a messaging service that can be used for application-to-application communication. FormKiQ uses it as a publish/subscribe service, where applications can listen to the SNS service and be notified about different document events.

FormKiQ creates a single SnsDocumentEvent topic where all document events are sent. You can then use Amazon SNS subscription filter policies to set up actions for a specific type of event.

FormKiQ provides the following message attributes that you can filter on:

Message Attribute   Possible Value(s)          Description
type                create, delete, update     Document event(s) for create, update, or delete document
siteId              default, (custom siteId)   Site (tenant) in which the document event was created
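
As an illustration of filtering on these attributes, the sketch below builds a subscription filter policy and checks message attributes against it locally. The helper names are hypothetical, and the matcher covers only exact string matching, which is a small subset of what SNS filter policies support. The resulting JSON is what would be set as the FilterPolicy attribute of an SNS subscription.

```python
import json


def make_filter_policy(event_types, site_ids=None):
    """Build a filter policy on the type and siteId message attributes."""
    policy = {"type": list(event_types)}
    if site_ids:
        policy["siteId"] = list(site_ids)
    return policy


def matches(policy, attributes):
    """Local approximation of exact-match filtering, for illustration only.

    Every attribute named in the policy must be present on the message and
    equal to one of the allowed values.
    """
    return all(
        attributes.get(name) in allowed for name, allowed in policy.items()
    )


if __name__ == "__main__":
    policy = make_filter_policy(["create"], ["default"])
    print(json.dumps(policy))
    print(matches(policy, {"type": "create", "siteId": "default"}))
    print(matches(policy, {"type": "delete", "siteId": "default"}))
```

With such a policy, a subscriber that only sends email notifications for new documents would not be invoked for update or delete events.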