The moment has come: the Vitastor S3 implementation based on Zenko CloudServer is released.
Highlights
- Zenko CloudServer is implemented in node.js.
- Object metadata is stored in MongoDB.
- Vitastor uses a modified version of Zenko CloudServer. It differs slightly from the original: the build is optimised and unneeded dependencies are stripped out.
- Object data is stored in Vitastor block volumes, but the volume metadata is stored in the same MongoDB, not in Vitastor etcd.
- Objects are written to volumes sequentially, one after another. Space is allocated with rounding up to the sector size (4 KB), so each object takes at least 4 KB (see the sketch after this list).
- An important property of this storage scheme is that small objects aren’t chunked into parts in Vitastor EC N+K pools and thus don’t require reads from all N disks on download.
- Deleted objects are marked as deleted, but the space is only actually freed by an asynchronous “defragmentation” process. Defragmentation runs automatically in the background when a volume reaches a configured share of “garbage” (20% by default): it copies the live objects to new volume(s) and then removes the old volume. Defragmentation can be configured in locationConfig.json.
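To make the allocation and defragmentation rules above concrete, here is a minimal TypeScript sketch of the arithmetic involved. It is illustrative only: the names are made up and this is not the actual VitastorBackend code.

```ts
const SECTOR = 4096; // allocation granularity: 4 KB

// Space taken by an object: its size rounded up to the sector size,
// so even a 1-byte object occupies a full 4 KB sector.
function allocatedSize(objectSize: number): number {
  return Math.max(SECTOR, Math.ceil(objectSize / SECTOR) * SECTOR);
}

// A volume accumulates "garbage" as objects are deleted. When the garbage
// share reaches the threshold (20% by default), live objects are copied
// to a new volume and the old volume is removed.
function needsDefrag(liveBytes: number, deletedBytes: number, threshold = 0.2): boolean {
  const total = liveBytes + deletedBytes;
  return total > 0 && deletedBytes / total >= threshold;
}

// Example: a 5000-byte object occupies 8192 bytes (two sectors).
console.log(allocatedSize(5000)); // 8192
```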
Plans for future development
- User account storage in the DB instead of a static file. The original Zenko uses a separate closed-source “Scality Vault” service for this, which is why a static file is used for now.
- More detailed documentation.
- Support for other (and faster) key-value DBMSs for object metadata storage.
- Other performance optimisations, for example around the hash function: MD5, used for Amazon compatibility, is relatively slow.
- Object Lifecycle support. There is a Lifecycle implementation for Zenko called Backbeat but it’s not adapted for Vitastor yet.
- Quota support. Original Zenko uses a separate “SCUBA” service for quotas, but it’s also proprietary and not available publicly.
Installation
In a few words:
- Install MongoDB, create a user for S3 metadata DB.
- Create a Vitastor pool for S3 data.
- Download and set up the Docker container `vitalif/vitastor-zenko`.
Setup MongoDB
You can set up MongoDB yourself, following the MongoDB manual.
Or you can follow the instructions below; they describe a simple example of a MongoDB setup in Docker (via docker-compose) with 3 replicas.
- On each host, create a file `docker-compose.yml` with the content listed below. Replace `<YOUR_PASSWORD>` with your future MongoDB administrator password, and optionally replace `0.0.0.0` with `localhost,<server_IP>`. It’s recommended to either use a private IP or set up TLS afterwards.
```yaml
version: '3.1'
services:
  mongo:
    container_name: mongo
    image: mongo:7-jammy
    restart: always
    environment:
      MONGO_INITDB_ROOT_USERNAME: root
      MONGO_INITDB_ROOT_PASSWORD: <YOUR_PASSWORD>
    network_mode: host
    volumes:
      - ./keyfile:/opt/keyfile
      - ./mongo-data/db:/data/db
      - ./mongo-data/configdb:/data/configdb
    entrypoint: /bin/bash -c
    command: [ "chown mongodb /opt/keyfile && chmod 600 /opt/keyfile && . /usr/local/bin/docker-entrypoint.sh mongod --replSet rs0 --keyFile /opt/keyfile --bind_ip 0.0.0.0" ]
```
- Generate a shared cluster key with `openssl rand -base64 756 > ./keyfile` and copy that `keyfile` to all hosts.
- Start MongoDB on all hosts with `docker compose up -d mongo`.
- Enter the Mongo shell with `docker exec -it mongo mongosh -u root -p <YOUR_PASSWORD> localhost/admin` and execute the following command (replace the IP addresses `10.10.10.{1,2,3}` with your host IPs):

```
rs.initiate({ _id: 'rs0', members: [
  { _id: 1, host: '10.10.10.1:27017' },
  { _id: 2, host: '10.10.10.2:27017' },
  { _id: 3, host: '10.10.10.3:27017' }
] })
```
- Stay in the Mongo shell and create a user for the future S3 database:

```
db.createUser({ user: 's3', pwd: '<YOUR_S3_PASSWORD>', roles: [
  { role: 'readWrite', db: 's3' }, { role: 'dbAdmin', db: 's3' },
  { role: 'readWrite', db: 'vitastor' }, { role: 'dbAdmin', db: 'vitastor' }
] })
```
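You can sanity-check the replica set from the same shell with the standard `rs.status()` command. The MongoDB connection data you will later put into the S3 config files corresponds to a standard MongoDB connection string for this user; for the example replica set above it would look roughly like the line below (illustrative only; the exact shape expected by `config.json` is visible in the example files extracted later in this guide):

```
mongodb://s3:<YOUR_S3_PASSWORD>@10.10.10.1:27017,10.10.10.2:27017,10.10.10.3:27017/s3?replicaSet=rs0
```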
Setup Vitastor
Create a pool in Vitastor for S3 object data, for example:
```
vitastor-cli create-pool --ec 2+1 -n 512 s3-data --used_for_app s3:standard
```
The `--used_for_app` option works as fool-proofing: it prevents you from accidentally creating a regular block volume in the S3 pool and overwriting S3 data. It also hides inode space statistics from Vitastor etcd.

Retrieve the ID of your pool with `vitastor-cli ls-pools s3-data --detail`.
Setup Vitastor S3
- Add the following lines to `docker-compose.yml` (instead of `network_mode: host`, you can use `ports: [ "8000:8000", "8002:8002" ]`):
```yaml
  zenko:
    container_name: zenko
    image: vitalif/vitastor-zenko
    restart: always
    security_opt:
      - seccomp:unconfined
    ulimits:
      memlock: -1
    network_mode: host
    volumes:
      - /etc/vitastor:/etc/vitastor
      - /etc/vitastor/s3:/conf
```
- Download the Docker image:

```
docker pull vitalif/vitastor-zenko
```
- Extract configuration file examples from the Docker image:

```
docker run --rm -it -v /etc/vitastor:/etc/vitastor -v /etc/vitastor/s3:/conf vitalif/vitastor-zenko configure.sh
```
- Edit the configuration files in `/etc/vitastor/s3/`:
  - `config.json` - common settings.
  - `authdata.json` - user accounts and access keys.
  - `locationConfig.json` - the S3 storage class list with placement settings. Note: it actually contains storage classes (like STANDARD, COLD, etc.) instead of “locations” (zones like us-east-1) as in the original Zenko CloudServer.
- Put your MongoDB connection data into `config.json` and `locationConfig.json`.
- Put your Vitastor pool ID into `locationConfig.json` (see the sketch after this list).
- For now, the complete list of Vitastor backend settings is only available in the code.
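For orientation only, a storage class entry in `locationConfig.json` might look roughly like the sketch below. The key names inside `details` are assumptions for illustration, not the authoritative schema; consult the extracted example file and the backend code for the real settings:

```json
{
    "STANDARD": {
        "type": "vitastor",
        "details": {
            "config_path": "/etc/vitastor/vitastor.conf",
            "pool_id": 3
        }
    }
}
```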
Start Zenko
Start the S3 server with:
```
docker run --restart always --security-opt seccomp:unconfined --ulimit memlock=-1 --network=host \
    -v /etc/vitastor:/etc/vitastor -v /etc/vitastor/s3:/conf --name zenko vitalif/vitastor-zenko
```
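Alternatively, if you added the `zenko` service to `docker-compose.yml` as shown above, you can start it the same way as MongoDB:

```
docker compose up -d zenko
```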
If you use the default settings, Zenko CloudServer starts on port 8000.
The default access key is `accessKey1` with secret key `verySecretKey1`.
Now you can access your S3 with, for example, s3cmd:

```
s3cmd --access_key=accessKey1 --secret_key=verySecretKey1 --host=http://localhost:8000 mb s3://testbucket
```
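Once the bucket exists, ordinary s3cmd operations work the same way, for example:

```
s3cmd --access_key=accessKey1 --secret_key=verySecretKey1 --host=http://localhost:8000 put some-file.txt s3://testbucket
s3cmd --access_key=accessKey1 --secret_key=verySecretKey1 --host=http://localhost:8000 ls s3://testbucket
```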
Or even mount it with GeeseFS:

```
AWS_ACCESS_KEY_ID=accessKey1 \
AWS_SECRET_ACCESS_KEY=verySecretKey1 \
geesefs --endpoint http://localhost:8000 testbucket mountdir
```
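When you are done, the FUSE mount can be detached with the standard `fusermount -u mountdir` (or `umount mountdir` as root).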
Author & License
- The author of Zenko CloudServer is Scality; it is licensed under the Apache License, version 2.0.
- The author of Vitastor and the Zenko Vitastor backend is Vitaliy Filippov; they are licensed under VNPL-1.1 (a “network copyleft” license based on AGPL/SSPL, but worded in a better way).
- Vitastor S3 repository: https://git.yourcmc.ru/vitalif/zenko-cloudserver-vitastor
- Vitastor S3 backend code: https://git.yourcmc.ru/vitalif/zenko-arsenal/src/branch/master/lib/storage/data/vitastor/VitastorBackend.ts