
AWS FSx Lustre File System

Deploys an Amazon FSx for Lustre file system with configurable deployment type, storage capacity, throughput tiers, optional S3 data integration, CloudWatch audit logging, and automatic backups. The component supports both ephemeral scratch file systems for temporary high-performance processing and persistent file systems with intra-AZ data replication and backup support.

What Gets Created

When you deploy an AwsFsxLustreFileSystem resource, OpenMCF provisions:

  • FSx for Lustre File System — an aws_fsx_lustre_file_system resource placed in the specified subnet with the configured deployment type, storage capacity, encryption settings, and optional S3 import/export, CloudWatch log configuration, backup schedule, and metadata performance tuning

Prerequisites

  • AWS credentials configured via environment variables or OpenMCF provider config
  • A subnet in the target Availability Zone — Lustre file systems are single-AZ, so exactly one subnet is required
  • A security group allowing Lustre traffic between the file system and its clients: TCP port 988 (Lustre protocol) and TCP ports 1018-1023 (data channels)
  • A KMS key ARN if using customer-managed encryption at rest (all Lustre file systems are encrypted by default with an AWS-managed key)
  • A CloudWatch Logs log group with an FSx resource policy if enabling audit logging
  • An S3 bucket if configuring import/export paths on scratch file systems

Quick Start

Create a file fsx-lustre.yaml:

apiVersion: aws.openmcf.org/v1
kind: AwsFsxLustreFileSystem
metadata:
  name: my-fsx-lustre
  labels:
    openmcf.org/provisioner: pulumi
    pulumi.openmcf.org/organization: my-org
    pulumi.openmcf.org/project: my-project
    pulumi.openmcf.org/stack.name: dev.AwsFsxLustreFileSystem.my-fsx-lustre
spec:
  region: us-east-1
  storageCapacityGib: 1200
  subnetId: subnet-0a1b2c3d4e5f00001

Deploy:

openmcf apply -f fsx-lustre.yaml

This creates a 1200 GiB SCRATCH_2 SSD file system in the specified subnet. No data replication, no backups — suitable for temporary processing workloads.

Configuration Reference

Required Fields

| Field | Type | Description | Validation |
| --- | --- | --- | --- |
| region | string | AWS region where the FSx Lustre file system will be created (e.g., us-east-1). | Required; non-empty |
| storageCapacityGib | int32 | Storage capacity in GiB. Valid increments depend on deployment type and storage type. Can be increased after creation but never decreased. | Minimum 1200 |
| subnetId | string | Subnet ID for the file system's network interface. Lustre is single-AZ — exactly one subnet. ForceNew. | Required |
| subnetId.value | string | Direct subnet ID value. | — |
| subnetId.valueFrom | object | Foreign key reference to an AwsVpc resource. | Default kind: AwsVpc |

Optional Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| deploymentType | string | SCRATCH_2 | Deployment type controlling durability and performance. ForceNew. Valid: SCRATCH_1, SCRATCH_2, PERSISTENT_1, PERSISTENT_2. |
| storageType | string | SSD | Storage media type. ForceNew. SSD for sub-millisecond latency. HDD for lower cost — only available with PERSISTENT_1. |
| perUnitStorageThroughput | int32 | — | Throughput in MB/s/TiB. Required for PERSISTENT types, invalid for SCRATCH. PERSISTENT_1 + SSD: 50, 100, 200. PERSISTENT_1 + HDD: 12, 40. PERSISTENT_2 + SSD: 125, 250, 500, 1000. |
| dataCompressionType | string | NONE | Data compression algorithm. NONE or LZ4. Can be changed after creation without impacting Lustre operations. |
| fileSystemTypeVersion | string | — | Lustre version (e.g., 2.12, 2.15). ForceNew. Leave empty for the latest version supported by the deployment type. |
| securityGroupIds | string[] | [] | Security group IDs attached to the file system ENI. ForceNew. Must allow TCP 988 and 1018-1023. Can reference AwsSecurityGroup resources via valueFrom. Up to 50. |
| kmsKeyId | string | — | Customer-managed KMS key ARN for encryption at rest. ForceNew. When omitted, uses the AWS-managed FSx key. Can reference an AwsKmsKey resource via valueFrom. |
| importPath | string | — | S3 URI to import data from (e.g., s3://my-bucket/prefix). ForceNew. SCRATCH_1 and SCRATCH_2 only. |
| exportPath | string | — | S3 URI for exporting data back to S3. ForceNew. Requires importPath to be set. |
| logConfiguration.destination | string | — | CloudWatch Logs log group ARN for audit events. Can reference an AwsCloudwatchLogGroup resource via valueFrom. |
| logConfiguration.level | string | WARN_ERROR | Audit log level. Valid: DISABLED, WARN_ONLY, ERROR_ONLY, WARN_ERROR. |
| automaticBackupRetentionDays | int32 | 0 | Days to retain automatic backups. Range: 0-90. Set to 0 to disable. PERSISTENT deployments only. |
| dailyAutomaticBackupStartTime | string | — | UTC time to start daily backups in HH:MM format (e.g., 05:00). |
| copyTagsToBackups | bool | false | Copy file system tags to automatic backups. ForceNew. |
| skipFinalBackup | bool | true | Skip creating a final backup on deletion. PERSISTENT deployments only. |
| weeklyMaintenanceStartTime | string | — | Weekly UTC maintenance window in d:HH:MM format where d is 1=Monday through 7=Sunday (e.g., 1:05:00 for Monday 05:00 UTC). |
| metadataConfiguration.mode | string | AUTOMATIC | Metadata IOPS mode. PERSISTENT_2 only. AUTOMATIC scales with storage capacity. USER_PROVISIONED allows explicit IOPS. |
| metadataConfiguration.iops | int32 | — | Metadata IOPS when mode is USER_PROVISIONED. Valid values: 1500 through 192000 in documented increments. Ignored in AUTOMATIC mode. |

Examples

Scratch File System with S3 Import

A temporary file system that imports data from S3 for batch processing jobs:

apiVersion: aws.openmcf.org/v1
kind: AwsFsxLustreFileSystem
metadata:
  name: batch-fsx
  labels:
    openmcf.org/provisioner: pulumi
    pulumi.openmcf.org/organization: my-org
    pulumi.openmcf.org/project: my-project
    pulumi.openmcf.org/stack.name: dev.AwsFsxLustreFileSystem.batch-fsx
spec:
  region: us-west-2
  storageCapacityGib: 2400
  subnetId: subnet-private-az1
  securityGroupIds:
    - sg-lustre-clients
  dataCompressionType: LZ4
  importPath: s3://my-data-bucket/training-data
  exportPath: s3://my-data-bucket/results

Persistent High-Throughput for ML Training

PERSISTENT_2 with maximum throughput tier, LZ4 compression, automatic backups, and metadata IOPS scaling for production ML workloads:

apiVersion: aws.openmcf.org/v1
kind: AwsFsxLustreFileSystem
metadata:
  name: ml-training-fsx
  labels:
    openmcf.org/provisioner: pulumi
    pulumi.openmcf.org/organization: my-org
    pulumi.openmcf.org/project: my-project
    pulumi.openmcf.org/stack.name: prod.AwsFsxLustreFileSystem.ml-training-fsx
spec:
  region: us-east-1
  deploymentType: PERSISTENT_2
  storageCapacityGib: 4800
  storageType: SSD
  perUnitStorageThroughput: 1000
  dataCompressionType: LZ4
  subnetId: subnet-private-az1
  securityGroupIds:
    - sg-lustre-ml
  kmsKeyId: arn:aws:kms:us-east-1:123456789012:key/mrk-example
  automaticBackupRetentionDays: 7
  dailyAutomaticBackupStartTime: "04:00"
  copyTagsToBackups: true
  weeklyMaintenanceStartTime: "7:03:00"
  metadataConfiguration:
    mode: AUTOMATIC

HDD Data Lake with Cost-Optimized Storage

PERSISTENT_1 HDD for large-capacity, sequential-throughput workloads where cost per GiB is the primary concern:

apiVersion: aws.openmcf.org/v1
kind: AwsFsxLustreFileSystem
metadata:
  name: datalake-fsx
  labels:
    openmcf.org/provisioner: pulumi
    pulumi.openmcf.org/organization: my-org
    pulumi.openmcf.org/project: my-project
    pulumi.openmcf.org/stack.name: prod.AwsFsxLustreFileSystem.datalake-fsx
spec:
  region: us-east-1
  deploymentType: PERSISTENT_1
  storageCapacityGib: 6000
  storageType: HDD
  perUnitStorageThroughput: 12
  dataCompressionType: LZ4
  subnetId: subnet-private-az1
  securityGroupIds:
    - sg-lustre-data
  automaticBackupRetentionDays: 14
  dailyAutomaticBackupStartTime: "02:00"
  copyTagsToBackups: true
  weeklyMaintenanceStartTime: "1:05:00"
  logConfiguration:
    destination: arn:aws:logs:us-east-1:123456789012:log-group:/aws/fsx/datalake
    level: WARN_ERROR
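
This example and the ML training example above encode their maintenance windows in the d:HH:MM format described in the configuration reference (1 = Monday through 7 = Sunday). A small decoder (hypothetical, for illustration only) makes such values easier to review:

```python
import re

DAYS = {1: "Monday", 2: "Tuesday", 3: "Wednesday", 4: "Thursday",
        5: "Friday", 6: "Saturday", 7: "Sunday"}

def describe_maintenance_window(window: str) -> str:
    """Translate a weeklyMaintenanceStartTime value (d:HH:MM) into prose."""
    m = re.fullmatch(r"([1-7]):([01]\d|2[0-3]):([0-5]\d)", window)
    if not m:
        raise ValueError(f"expected d:HH:MM with d in 1-7, got {window!r}")
    day, hh, mm = m.groups()
    return f"{DAYS[int(day)]} {hh}:{mm} UTC"
```

For example, "1:05:00" decodes to Monday 05:00 UTC and "7:03:00" to Sunday 03:00 UTC.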

Full-Featured with Logging and Custom Metadata IOPS

Production PERSISTENT_2 deployment with CloudWatch audit logging, customer-managed KMS encryption, explicit metadata IOPS, and final backup on deletion:

apiVersion: aws.openmcf.org/v1
kind: AwsFsxLustreFileSystem
metadata:
  name: prod-fsx
  labels:
    openmcf.org/provisioner: pulumi
    pulumi.openmcf.org/organization: my-org
    pulumi.openmcf.org/project: my-project
    pulumi.openmcf.org/stack.name: prod.AwsFsxLustreFileSystem.prod-fsx
spec:
  region: us-east-1
  deploymentType: PERSISTENT_2
  storageCapacityGib: 7200
  storageType: SSD
  perUnitStorageThroughput: 500
  dataCompressionType: LZ4
  fileSystemTypeVersion: "2.15"
  subnetId: subnet-private-az1
  securityGroupIds:
    - sg-lustre-prod
    - sg-lustre-admin
  kmsKeyId: arn:aws:kms:us-east-1:123456789012:key/mrk-prod-key
  automaticBackupRetentionDays: 30
  dailyAutomaticBackupStartTime: "03:00"
  copyTagsToBackups: true
  skipFinalBackup: false
  weeklyMaintenanceStartTime: "7:02:00"
  logConfiguration:
    destination: arn:aws:logs:us-east-1:123456789012:log-group:/aws/fsx/prod
    level: WARN_ERROR
  metadataConfiguration:
    mode: USER_PROVISIONED
    iops: 12000
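
The iops value above must fall within the documented range for USER_PROVISIONED mode. The sketch below is a hypothetical check; the specific increments reflect AWS documentation at the time of writing (1500, 3000, 6000, then multiples of 12000 up to 192000) and may change:

```python
def valid_metadata_iops(iops: int) -> bool:
    """Rough check of USER_PROVISIONED metadata IOPS.

    Values reflect AWS documentation at the time of writing (1500, 3000,
    6000, then multiples of 12000 up to 192000); treat as a sanity check only.
    """
    return iops in (1500, 3000, 6000) or (
        12000 <= iops <= 192000 and iops % 12000 == 0)
```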

Using Foreign Key References

Reference other OpenMCF-managed resources instead of hardcoding IDs:

apiVersion: aws.openmcf.org/v1
kind: AwsFsxLustreFileSystem
metadata:
  name: ref-fsx
  labels:
    openmcf.org/provisioner: pulumi
    pulumi.openmcf.org/organization: my-org
    pulumi.openmcf.org/project: my-project
    pulumi.openmcf.org/stack.name: prod.AwsFsxLustreFileSystem.ref-fsx
spec:
  region: us-west-2
  deploymentType: PERSISTENT_2
  storageCapacityGib: 2400
  storageType: SSD
  perUnitStorageThroughput: 250
  subnetId:
    valueFrom:
      kind: AwsVpc
      name: my-vpc
      field: status.outputs.private_subnets[0].id
  securityGroupIds:
    - valueFrom:
        kind: AwsSecurityGroup
        name: lustre-sg
        field: status.outputs.security_group_id
  kmsKeyId:
    valueFrom:
      kind: AwsKmsKey
      name: fsx-key
      field: status.outputs.key_arn
  logConfiguration:
    destination:
      valueFrom:
        kind: AwsCloudwatchLogGroup
        name: fsx-logs
        field: status.outputs.log_group_arn
    level: WARN_ERROR

Presets

OpenMCF includes preset configurations for common FSx Lustre deployment patterns. Each preset is a ready-to-customize manifest with placeholder values for subnet and security group IDs.

Scratch Development

File: presets/01-scratch-development.yaml

SCRATCH_2 SSD with 1200 GiB — the smallest and cheapest Lustre configuration. No data replication, no backups. Use for development and test environments, short-lived batch processing, CI/CD scratch space, and experiments where data loss is acceptable.

Persistent High Throughput

File: presets/02-persistent-high-throughput.yaml

PERSISTENT_2 SSD with 2400 GiB and 1000 MB/s/TiB throughput. LZ4 compression, 7-day automatic backups at 04:00 UTC, AUTOMATIC metadata IOPS, and Sunday 03:00 UTC maintenance window. Use for distributed ML training, HPC simulations, video rendering, and production workloads requiring maximum I/O performance with data durability.

Persistent Capacity Data Lake

File: presets/03-persistent-capacity-datalake.yaml

PERSISTENT_1 HDD with 6000 GiB and 12 MB/s/TiB throughput. LZ4 compression, 14-day automatic backups at 02:00 UTC, and Monday 05:00 UTC maintenance window. Use for data lake staging, genomics pipelines, log analysis, and workloads where cost per GiB matters more than latency.

Stack Outputs

After deployment, the following outputs are available in status.outputs:

| Output | Type | Description |
| --- | --- | --- |
| file_system_id | string | ID of the file system (e.g., fs-0123456789abcdef0). Primary identifier for EKS PersistentVolumes, ECS task definitions, and AWS Batch compute environments. |
| file_system_arn | string | ARN of the file system. Used in IAM policies and for creating data repository associations. |
| dns_name | string | DNS name for the file system (e.g., fs-0123456789abcdef0.fsx.us-east-1.amazonaws.com). Used in mount commands with mount_name. |
| mount_name | string | Lustre mount name (e.g., fsx or 2p5wpbwj). Auto-generated by AWS. Mount command: mount -t lustre <dns_name>@tcp:/<mount_name> /mnt/fsx. |
| network_interface_ids | string[] | Network interface IDs created for the file system. Lustre creates one ENI in the specified subnet. Useful for security group debugging and network troubleshooting. |
| vpc_id | string | VPC ID in which the file system was created. Computed from the subnet. |
| file_system_type_version | string | Actual Lustre version deployed (e.g., 2.12, 2.15). May differ from the requested version if the field was left empty. |
| owner_id | string | AWS account ID of the file system owner. |
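
The dns_name and mount_name outputs combine into the client mount command shown above. A trivial helper (hypothetical, for illustration) assembles it:

```python
def lustre_mount_command(dns_name: str, mount_name: str,
                         mount_point: str = "/mnt/fsx") -> str:
    """Build the Lustre client mount command from the dns_name and
    mount_name stack outputs, following mount -t lustre <dns>@tcp:/<name> <dir>."""
    return f"mount -t lustre {dns_name}@tcp:/{mount_name} {mount_point}"
```

For example, the outputs fs-0123456789abcdef0.fsx.us-east-1.amazonaws.com and 2p5wpbwj yield mount -t lustre fs-0123456789abcdef0.fsx.us-east-1.amazonaws.com@tcp:/2p5wpbwj /mnt/fsx.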

Related Components

  • AwsElasticFileSystem — alternative managed file system for general-purpose NFS workloads
  • AwsVpc — provides the subnet for file system placement
  • AwsSecurityGroup — controls Lustre protocol traffic (TCP 988, 1018-1023) to the file system
  • AwsKmsKey — provides a customer-managed encryption key for data at rest
  • AwsCloudwatchLogGroup — receives audit log events from the file system
