Running Watchful Scale in Production
Overview
The Watchful product is a set of containerized applications that work together to let a team of annotators label data quickly and efficiently. The product consists of two parts: the application (App) and the hub (Hub). Both components are distributed as Docker images stored in AWS ECR and can be run using the container orchestration tool of your choice.
Versioning
We follow the SemVer versioning system using the <major>.<minor>.<patch> pattern. All updates will be announced with Release Notes to let you know what has changed. Major and minor updates will also contain Upgrade notes, if necessary.
Pulling the Images
Component | URL |
---|---|
App | 610410161133.dkr.ecr.us-west-1.amazonaws.com/production/watchful:{TAG} |
Hub | 610410161133.dkr.ecr.us-west-1.amazonaws.com/production/watchful-hub:{TAG} |
We do not offer a latest tag on the images. You must specify the major, minor, and patch versions when pulling images.
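Because there is no latest tag, a partial or malformed tag only fails at pull time. A small pre-flight check can catch that earlier. The sketch below is illustrative only: the helper names is_semver_tag and image_ref are assumptions and not part of any Watchful tooling, while the registry and repository paths come from the table above.

```shell
REGISTRY="610410161133.dkr.ecr.us-west-1.amazonaws.com"

is_semver_tag() {
    # Accept exactly <major>.<minor>.<patch> with numeric parts; reject "latest", "3.0", etc.
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

image_ref() {
    # $1: component ("app" or "hub"), $2: tag
    is_semver_tag "$2" || { echo "invalid tag: $2" >&2; return 1; }
    case "$1" in
        app) echo "${REGISTRY}/production/watchful:$2" ;;
        hub) echo "${REGISTRY}/production/watchful-hub:$2" ;;
        *)   echo "unknown component: $1" >&2; return 1 ;;
    esac
}
```

After logging in, the helper could be combined with a pull, for example: docker pull "$(image_ref hub 3.0.1234)".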
This guide assumes you have downloaded and configured the AWS CLI and Docker.
## Login
aws ecr get-login-password --region us-west-1 | \
docker login \
--username AWS \
--password-stdin \
610410161133.dkr.ecr.us-west-1.amazonaws.com
## Pull the image
docker pull \
610410161133.dkr.ecr.us-west-1.amazonaws.com/production/watchful-hub:3.0.1234
Updates
The Watchful product is designed to run on air-gapped systems as well as those connected to the outside world. When we publish a new version, the Customer Support team will notify you. You can also check the Product Release Notes at any time for the latest version and what is in it.
To take the new version, you must update the tag you’re using to pull the images from ECR. We will never push updates to your system.
Recommendations
When updating the App or Hub, we recommend taking the following steps:
- Inform users of an impending update and ask them to stop any active sessions
- Back up the persistent volume storage attached to each instance
- Update your manifest files to point to the new tagged version
- Restart the running containers
- Verify persistent volumes are re-attached as necessary
- Inform users they can restart their sessions (they will need to re-authenticate and then reload projects)
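For a single App container started with docker run, the steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the container name watchful and the ~/watchful data directory mirror the Docker example in the Running the App section, while the update_app and run helpers, the DRY_RUN switch, and the backup file naming are hypothetical. It also assumes you are already logged in to ECR. Notifying users remains a manual step.

```shell
#!/bin/sh
REGISTRY="610410161133.dkr.ecr.us-west-1.amazonaws.com"

run() {
    # Print the command when DRY_RUN=1, execute it otherwise.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

update_app() {
    # $1: new image tag, $2: host data directory (assumed default: ~/watchful)
    new_tag="$1"
    data_dir="${2:-$HOME/watchful}"
    image="${REGISTRY}/production/watchful:${new_tag}"
    backup="${data_dir%/}-backup-$(date +%Y%m%d%H%M%S).tar.gz"

    run tar -czf "$backup" -C "$data_dir" . || return 1   # back up the persistent storage
    run docker pull "$image" || return 1                   # fetch the new tagged version
    run docker stop watchful                               # stop the running container
    run docker rm watchful                                 # remove it so the name can be reused
    run docker run --name watchful -d -p 9001:9001 \
        -v "${data_dir}:/root/watchful" \
        -e WATCHFUL_LOG=info \
        "$image"                                           # restart with the same volume attached
}
```

Setting DRY_RUN=1 before calling update_app 3.0.1235 (a made-up tag) prints the five commands without running them, which is a cheap way to review the plan first.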
Logs
Logs for both the App and the Hub are shipped to STDERR. You can change the log level at runtime for both the App and the Hub by setting the WATCHFUL_LOG environment variable (described later in this document).
Running the App
The App is the core of the Watchful product and contains all the components necessary for an individual to complete a labeling task. The App can be run in “standalone” mode and does not require the Hub. Without the Hub, however, project sharing is not available.
The App is a self-contained application written in Rust containing a server, the core functionality, and a frontend that are all compiled and distributed as a single binary within the Docker image.
Requirements
- Each unique Watchful user requires a dedicated App instance. Do not put App instances behind a load balancer or otherwise share an instance among different people during concurrent sessions.
- The Watchful App works by keeping project state in memory and by writing project files to the local file system. Ensure App instances are durable (NOT ephemeral) and are backed by persistent storage.
- For general workloads, App instances require a minimum of four (4) dedicated CPU cores. To determine memory requirements, look at the largest data set you’ll label and allocate four times (4x) its size in RAM. For example, a 200 MB project needs at least 800 MB of RAM per instance.
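The memory rule in the last requirement is simple arithmetic. As a minimal sketch (the function names are hypothetical, and sizes are taken in whole megabytes):

```shell
required_ram_mb() {
    # $1: size of the largest data set in MB; the rule of thumb is 4x that size
    echo $(( $1 * 4 ))
}

has_enough_ram() {
    # $1: RAM available to the instance in MB, $2: largest data set in MB
    [ "$1" -ge "$(required_ram_mb "$2")" ]
}
```

For the 200 MB example, required_ram_mb 200 prints 800, and has_enough_ram 512 200 fails because 512 MB is below that minimum.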
Docker
Running the App via Docker is a great way to work on a dataset locally. To run the App:
## Login
aws ecr get-login-password --region us-west-1 | \
docker login \
--username AWS \
--password-stdin \
610410161133.dkr.ecr.us-west-1.amazonaws.com
## Run the Container
docker run \
--name watchful \
-d -p 9001:9001 \
-v ~/watchful:/root/watchful \
-e WATCHFUL_LOG=info \
610410161133.dkr.ecr.us-west-1.amazonaws.com/production/watchful:3.0.1234
The following environment variables are available when running the App:
Name | Required | Key | Description |
---|---|---|---|
Log level | N | WATCHFUL_LOG | Set the log level you desire. Available options are info, trace, error, warn, and debug. The default is info. |
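This guide does not say how the App treats an unrecognized WATCHFUL_LOG value, so it may be worth checking the value before starting the container. A minimal sketch, assuming the levels are spelled in lowercase exactly as in the table (both function names are hypothetical):

```shell
is_valid_log_level() {
    # Succeeds only for the five documented levels.
    case "$1" in
        info|trace|error|warn|debug) return 0 ;;
        *) return 1 ;;
    esac
}

log_level_or_default() {
    # Print $1 if it is a documented level, otherwise fall back to the default "info".
    if is_valid_log_level "$1"; then
        echo "$1"
    else
        echo "info"
    fi
}
```

The result could then be passed to the container, for example with -e WATCHFUL_LOG="$(log_level_or_default "$LEVEL")", where LEVEL is whatever variable holds your preferred level.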
Running the Hub
What is Watchful Hub?
The Hub is responsible for facilitating project collaboration, user management, and sharing. It is a self-contained application written in Rust containing a server, the core functionality, and a frontend that are all compiled and distributed as a single binary within the Docker image.
Requirements
- The Watchful Hub requires a durable file system for project storage. Additionally, it uses a SQLite database to track users and permissions. Ensure the attached volume will persist after the container restarts.
- The Hub is designed to be stateless and can handle restarts. However, like the App, do not run the Hub behind a load balancer, as both the database and file system access need to be dedicated to the single running Hub instance.
- For general workloads, Hub instances require a single dedicated CPU core and 1 GB of RAM. Provision enough disk storage to handle 4x the total volume of projects you’ll run across all App instances.
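The disk guideline in the last requirement sums the project sizes first and then applies the 4x factor. A minimal sketch, with a hypothetical function name and sizes given in whole megabytes:

```shell
hub_disk_mb() {
    # Arguments: the size of each project in MB, across all App instances
    total=0
    for size in "$@"; do
        total=$(( total + size ))
    done
    echo $(( total * 4 ))
}
```

For three projects of 200 MB, 500 MB, and 300 MB, hub_disk_mb 200 500 300 prints 4000, so roughly 4 GB of disk would be the starting point.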
Docker
Running the Hub via Docker is fairly straightforward, though it offers little functionality on its own. To run the Hub via Docker:
## Login
aws ecr get-login-password --region us-west-1 | \
docker login \
--username AWS \
--password-stdin \
610410161133.dkr.ecr.us-west-1.amazonaws.com
## Run the container
docker run \
--name watchful-hub \
-d -p 9005:9005 \
-v ~/hub:/root \
-e DATABASE_URL=/root/watchful.db \
-e SHARED_SECRET=$(openssl rand -hex 36) \
-e AWS_ACCESS_KEY_ID=OMITTED \
-e AWS_SECRET_ACCESS_KEY=OMITTED \
-e AWS_REGION=OMITTED \
-e AWS_CUSTOMER_BUCKET=OMITTED \
-e WATCHFUL_LOG=info \
610410161133.dkr.ecr.us-west-1.amazonaws.com/production/watchful-hub:3.0.1234
Note:
When running Watchful within Azure, be sure to pay attention to the recommended mount options for Azure Kubernetes found here. Specifically, use the nobrl mount option. Failure to do so can result in a "database is locked" error that is the result of a Windows Explorer process sending a byte-range lock to the SQLite database file.
The following environment variables are available when running the Hub:
Name | Required | Key | Description |
---|---|---|---|
Database URL | Y | DATABASE_URL | Path to the SQLite database. This must be a path to the mounted, durable storage volume or you will lose data every time you update the Hub. |
Shared Secret | Y | SHARED_SECRET | A seed value used in authenticating and authorizing users. Any sufficiently long random string works as a value. $(openssl rand -hex 36) will generate a secure random string for you. |
AWS Access Key | N | AWS_ACCESS_KEY_ID | Optional. Set this value if you want to use AWS S3 Buckets for data import + export. |
AWS secret key | N | AWS_SECRET_ACCESS_KEY | Optional. Set this value if you want to use AWS S3 Buckets for data import + export. |
AWS storage region | N | AWS_REGION | Optional. Set this value if you want to use AWS S3 Buckets for data import + export. |
AWS customer bucket | N | AWS_CUSTOMER_BUCKET | Optional. Set this value if you want to use AWS S3 Buckets for data import + export. |
Log level | N | WATCHFUL_LOG | Set the log level you desire. Available options are info, trace, error, warn, and debug. The default is info. |
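Passing $(openssl rand -hex 36) inline produces a different SHARED_SECRET on every docker run. Since the value seeds authentication, keeping it stable across restarts seems prudent; that is an assumption on our part, not something this guide states. The sketch below generates the secret once and reuses it afterwards. The function name and the file location are hypothetical, and the od fallback is only there for hosts without openssl.

```shell
ensure_shared_secret() {
    # $1: path of the file that stores the secret (a location you choose)
    secret_file="$1"
    if [ ! -s "$secret_file" ]; then
        (
            umask 077   # keep the file readable by the current user only
            if command -v openssl >/dev/null 2>&1; then
                openssl rand -hex 36
            else
                # Fallback: 36 random bytes printed as 72 hex characters
                od -An -tx1 -N36 /dev/urandom | tr -d ' \n'
                echo
            fi > "$secret_file"
        ) || return 1
    fi
    cat "$secret_file"
}
```

The Hub could then be started with -e SHARED_SECRET="$(ensure_shared_secret "$HOME/.watchful-hub-secret")" in place of the inline openssl call.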
Common deployment scenarios
The following sections show how to run an App and a Hub instance via Docker Compose and in Kubernetes.
Docker Compose
version: '3'
services:
  hub:
    image: 610410161133.dkr.ecr.us-west-1.amazonaws.com/production/watchful-hub:3.0.1234
    ports:
      - "9005"
    volumes:
      # Mount all of /root so that DATABASE_URL (/root/watchful.db) lives on durable storage
      - ~/hub/:/root/
    environment:
      - CUSTOMER_ID=thisisanid
      - WATCHFUL_KEY=thisisakey
      - WATCHFUL_SECRET=thisisasecret
      - SHARED_SECRET=thisisnotasecurenorrandomsecret
      - DATABASE_URL=/root/watchful.db
      - AUTH_CONF=/app/auth/auth_model.conf
  app:
    image: 610410161133.dkr.ecr.us-west-1.amazonaws.com/production/watchful:3.0.1234
    ports:
      - "9001:9001"
    volumes:
      - ~/watchful/:/root/watchful/
Kubernetes
apiVersion: v1
kind: Namespace
metadata:
  labels:
    app: watchful
  name: watchful
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: watchful
  name: watchful-secret-robot
  namespace: watchful
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app: watchful
  name: role-watchful-secret-robot
  namespace: watchful
rules:
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - create
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app: watchful
  name: role-watchful-secret-robot-binding
  namespace: watchful
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: role-watchful-secret-robot
subjects:
- kind: ServiceAccount
  name: watchful-secret-robot
  namespace: watchful
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: watchful
  name: watchful-app
  namespace: watchful
spec:
  ports:
  - name: app-port
    port: 9001
  selector:
    app: watchful
    # The component label keeps this Service from also selecting the Hub pod
    component: app
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: watchful
  name: watchful-hub
  namespace: watchful
spec:
  ports:
  - name: hub-port
    port: 9005
  selector:
    app: watchful
    # The component label keeps this Service from also selecting the App pod
    component: hub
  type: LoadBalancer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: watchful
  name: watchful-app-pvc
  namespace: watchful
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: watchful
  name: watchful-hub-pvc
  namespace: watchful
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: watchful
  name: watchful-app
  namespace: watchful
spec:
  replicas: 1
  selector:
    matchLabels:
      app: watchful
      component: app
  template:
    metadata:
      labels:
        app: watchful
        component: app
    spec:
      automountServiceAccountToken: false
      # Pull secret created by the ecr-cred-helper CronJob for the us-west-1 registry
      imagePullSecrets:
      - name: watchful-us-west-1-ecr-registry
      containers:
      - image: 610410161133.dkr.ecr.us-west-1.amazonaws.com/production/watchful:3.0.1234
        imagePullPolicy: Always
        name: watchful-app
        ports:
        - containerPort: 9001
        volumeMounts:
        - mountPath: /root/watchful
          name: watchful-app-storage
      volumes:
      - name: watchful-app-storage
        persistentVolumeClaim:
          claimName: watchful-app-pvc
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: watchful
  name: watchful-hub
  namespace: watchful
spec:
  replicas: 1
  selector:
    matchLabels:
      app: watchful
      component: hub
  template:
    metadata:
      labels:
        app: watchful
        component: hub
    spec:
      automountServiceAccountToken: false
      # Pull secret created by the ecr-cred-helper CronJob for the us-west-1 registry
      imagePullSecrets:
      - name: watchful-us-west-1-ecr-registry
      containers:
      - env:
        - name: DATABASE_URL
          value: /root/watchful.db
        - name: SHARED_SECRET
          value: <insert_value>
        - name: AWS_CUSTOMER_BUCKET
          value: null
        - name: AWS_ACCESS_KEY_ID
          value: null
        - name: AWS_SECRET_ACCESS_KEY
          value: null
        - name: AWS_REGION
          value: us-west-1
        image: 610410161133.dkr.ecr.us-west-1.amazonaws.com/production/watchful-hub:3.0.1234
        imagePullPolicy: Always
        livenessProbe:
          httpGet:
            path: /ruok
            port: 9005
          periodSeconds: 5
        name: watchful-hub
        ports:
        - containerPort: 9005
        volumeMounts:
        - mountPath: /root
          name: watchful-hub-storage
      volumes:
      - name: watchful-hub-storage
        persistentVolumeClaim:
          claimName: watchful-hub-pvc
---
# batch/v1 CronJob is available from Kubernetes 1.21; batch/v1beta1 was removed in 1.25
apiVersion: batch/v1
kind: CronJob
metadata:
  labels:
    app: watchful
  name: ecr-cred-helper
  namespace: watchful
spec:
  concurrencyPolicy: Allow
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
      labels:
        app: watchful
    spec:
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: watchful
        spec:
          containers:
          - command:
            - /bin/sh
            - -c
            - |-
              SECRET_NAME=watchful-${AWS_REGION}-ecr-registry
              EMAIL="<insert_email>"
              # `aws ecr get-login` exists only in AWS CLI v1; v2 replaced it with `get-login-password`
              TOKEN=`aws ecr get-login --region ${AWS_REGION} --registry-ids ${ACCOUNT_ID} | cut -d' ' -f6`
              echo "ENV variable setup done"
              # Recreate the pull secret so it always holds a fresh token
              kubectl delete secret --ignore-not-found $SECRET_NAME
              kubectl create secret docker-registry $SECRET_NAME \
              --docker-server=https://${ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com \
              --docker-username=AWS \
              --docker-password="${TOKEN}" \
              --docker-email="${EMAIL}"
              echo "Secret created with name: $SECRET_NAME"
              echo "Finished"
            env:
            - name: ACCOUNT_ID
              value: "610410161133"
            - name: AWS_REGION
              value: us-west-1
            - name: AWS_ACCESS_KEY_ID
              value: < Watchful, Inc. will provide this value to you >
            - name: AWS_SECRET_ACCESS_KEY
              value: < Watchful, Inc. will provide this value to you >
            image: odaniait/aws-kubectl:latest
            imagePullPolicy: IfNotPresent
            name: ecr-cred-helper
            resources: {}
            securityContext:
              capabilities: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: Default
          hostNetwork: true
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          serviceAccountName: watchful-secret-robot
          terminationGracePeriodSeconds: 30
  schedule: "0 */6 * * *"
  successfulJobsHistoryLimit: 3
  suspend: false
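The TOKEN line in the CronJob depends on the shape of the command that aws ecr get-login prints: the token is taken as the sixth space-separated field. The sketch below shows that extraction on a made-up sample line. EXAMPLETOKEN is a placeholder, the function name is hypothetical, and the sample format is an assumption based on AWS CLI v1 output.

```shell
extract_ecr_token() {
    # Reads the printed "docker login ..." command on stdin and prints field 6 (the password).
    cut -d' ' -f6
}

# Illustrative line only; a real token is a long base64 string.
sample='docker login -u AWS -p EXAMPLETOKEN -e none https://610410161133.dkr.ecr.us-west-1.amazonaws.com'
printf '%s\n' "$sample" | extract_ecr_token
```

If a CLI version prints the fields in a different order, the cut index would need adjusting, so it is worth checking the raw output once before relying on the job.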