Tarek Cheikh
Founder & AWS Cloud Architect
SQS and SNS are the two core messaging services on AWS. SQS is a message queue -- producers send messages, consumers poll and process them. SNS is a pub/sub service -- publishers send messages to a topic, and all subscribers receive a copy. Together, they enable decoupled, event-driven architectures where services communicate asynchronously without direct dependencies.
This article covers both services from fundamentals to production patterns: queue types, message lifecycle, dead-letter queues, SNS fan-out, message filtering, Lambda integration, FIFO ordering, encryption, pricing, and the architectural patterns that determine reliability.
# SQS message lifecycle:
#
# 1. Producer sends message to queue
# 2. Message is stored redundantly across multiple AZs
# 3. Consumer polls the queue and receives the message
# 4. Message becomes invisible (visibility timeout starts)
# 5. Consumer processes the message
# 6. Consumer deletes the message from the queue
# 7. If consumer fails to delete within visibility timeout,
# message becomes visible again for another consumer to retry
#
# Key properties:
# - At-least-once delivery (standard queues)
# - Exactly-once processing (FIFO queues)
# - Message retention: 1 minute to 14 days (default 4 days)
# - Max message size: 256 KB (use S3 for larger payloads)
# - Unlimited throughput for standard queues
# - 300 API calls/s per action for FIFO (3,000 msg/s with batches of 10);
#   high throughput mode raises these limits further
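The lifecycle above can be illustrated with a small in-memory model. This is only a sketch: `ToyQueue` and its methods are hypothetical names, not part of any AWS SDK, and the clock is passed in explicitly so the behaviour is deterministic.

```python
import itertools


class ToyQueue:
    """In-memory model of the SQS receive/delete cycle (not a real client)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, visible_at, receive_count]
        self._ids = itertools.count(1)

    def send(self, body, now):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, now, 0]
        return msg_id

    def receive(self, now):
        """Return (id, body) of one visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            body, visible_at, _ = entry
            if visible_at <= now:
                entry[1] = now + self.visibility_timeout  # step 4: hide it
                entry[2] += 1
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Step 6: the consumer acknowledges successful processing."""
        self._messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=60)
queue.send("ORD-001", now=0)

first = queue.receive(now=0)     # consumer A receives the message
hidden = queue.receive(now=30)   # still invisible: nothing to receive
retry = queue.receive(now=61)    # A never deleted it, so it reappears
queue.delete(retry[0])
empty = queue.receive(now=200)   # deleted messages never come back
```

The second receive returns nothing because the message is still in flight. The third returns the same message again, which is why consumers of standard queues need to tolerate duplicates.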
# Create a standard queue
aws sqs create-queue \
--queue-name order-processing \
--attributes '{
"VisibilityTimeout": "60",
"MessageRetentionPeriod": "1209600",
"ReceiveMessageWaitTimeSeconds": "20",
"DelaySeconds": "0"
}'
# VisibilityTimeout (default 30s, max 12 hours):
# How long a message is hidden after a consumer receives it.
# Set this to at least 6x your average processing time.
# If processing takes 10 seconds, set visibility timeout to 60 seconds.
#
# MessageRetentionPeriod (default 345600 = 4 days, max 1209600 = 14 days):
# How long unprocessed messages are kept in the queue.
#
# ReceiveMessageWaitTimeSeconds (default 0, max 20):
# Long polling wait time. Set to 20 to reduce empty responses and API costs.
# With 0 (short polling), SQS responds immediately even if no messages exist.
# With 20 (long polling), SQS waits up to 20 seconds for a message to arrive.
#
# DelaySeconds (default 0, max 900 = 15 minutes):
# Messages are invisible for this duration after being sent.
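As a rough illustration of the sizing rule above, the attribute map can be derived from the average processing time. The helper names here are hypothetical; the limits are the ones listed in the comments above.

```python
import math

MAX_VISIBILITY_TIMEOUT = 12 * 60 * 60       # 43,200 seconds (12 hours)
MAX_RETENTION_SECONDS = 14 * 24 * 60 * 60   # 1,209,600 seconds (14 days)


def suggested_visibility_timeout(avg_processing_seconds, factor=6):
    """Apply the 6x rule of thumb and clamp to the SQS maximum."""
    if avg_processing_seconds <= 0:
        raise ValueError("processing time must be positive")
    return min(math.ceil(avg_processing_seconds * factor), MAX_VISIBILITY_TIMEOUT)


def queue_attributes(avg_processing_seconds):
    """Build the string-valued attribute map that create-queue expects."""
    return {
        "VisibilityTimeout": str(suggested_visibility_timeout(avg_processing_seconds)),
        "MessageRetentionPeriod": str(MAX_RETENTION_SECONDS),
        "ReceiveMessageWaitTimeSeconds": "20",  # long polling
        "DelaySeconds": "0",
    }


attributes = queue_attributes(avg_processing_seconds=10)
```

For a 10-second average this reproduces the values used in the create-queue call above. All SQS attribute values are strings, even the numeric ones.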
# Send a message
aws sqs send-message \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-processing \
--message-body '{"order_id":"ORD-001","amount":99.50,"customer":"C-123"}' \
--message-attributes '{
"order_type": {"DataType":"String","StringValue":"standard"},
"priority": {"DataType":"Number","StringValue":"5"}
}'
# Send a batch (up to 10 messages per call)
aws sqs send-message-batch \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-processing \
--entries '[
{"Id":"1","MessageBody":"{\"order_id\":\"ORD-001\"}"},
{"Id":"2","MessageBody":"{\"order_id\":\"ORD-002\"}"},
{"Id":"3","MessageBody":"{\"order_id\":\"ORD-003\"}"}
]'
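Because a batch call accepts at most 10 entries, larger workloads have to be split before sending. A minimal sketch of that chunking follows; `to_batches` is a hypothetical helper, and the order payloads are made up for illustration.

```python
import json

MAX_BATCH_ENTRIES = 10  # limit for send-message-batch stated above


def to_batches(orders, batch_size=MAX_BATCH_ENTRIES):
    """Yield lists of batch entries; each Id is unique within its batch."""
    for start in range(0, len(orders), batch_size):
        chunk = orders[start:start + batch_size]
        yield [
            {"Id": str(offset), "MessageBody": json.dumps(order)}
            for offset, order in enumerate(chunk)
        ]


orders = [{"order_id": f"ORD-{n:03d}"} for n in range(1, 24)]
batches = list(to_batches(orders))
```

With 23 orders this yields three batches of 10, 10, and 3 entries. Each list has the same shape as the `--entries` argument shown above.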
# Receive messages (long polling)
aws sqs receive-message \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-processing \
--max-number-of-messages 10 \
--wait-time-seconds 20 \
--attribute-names All \
--message-attribute-names All
# Delete a message after processing
aws sqs delete-message \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-processing \
--receipt-handle "AQEBwJnKyrHigUMZj6rYigCgxlaS3SLy0a..."
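The receive, process, delete sequence above is typically wrapped in a loop. The sketch below assumes `sqs` behaves like `boto3.client("sqs")` (only `receive_message` and `delete_message` are used); `drain_once` and `handler` are hypothetical names.

```python
def drain_once(sqs, queue_url, handler):
    """Receive one batch, process each message, delete only the successes.

    Returns the number of messages deleted.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling
    )
    deleted = 0
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception as exc:
            # Do not delete: the message becomes visible again after the
            # visibility timeout and will be retried.
            print(f"[FAIL] {message['MessageId']}: {exc}")
            continue
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        deleted += 1
    return deleted
```

Deleting only after the handler returns is what gives at-least-once processing. A crash between processing and deletion leads to a retry, not a lost message.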
# Lambda handler for SQS event source mapping
import json

def process_order(order):
    """Placeholder for the business logic; raise an exception on failure."""
    print(f"[OK] processing order {order['order_id']}")

def lambda_handler(event, context):
    failed_ids = []
    for record in event['Records']:
        try:
            body = json.loads(record['body'])
            # Process the order
            process_order(body)
            # No need to delete -- Lambda deletes successful messages automatically
        except Exception as e:
            print(f"[FAIL] message {record['messageId']}: {e}")
            failed_ids.append({"itemIdentifier": record["messageId"]})
    # Partial batch failure reporting:
    # Return failed message IDs so only those are retried.
    # Without this, the entire batch is retried on any failure.
    return {"batchItemFailures": failed_ids}
# Create a Lambda event source mapping for SQS
aws lambda create-event-source-mapping \
--function-name order-processor \
--event-source-arn arn:aws:sqs:us-east-1:123456789012:order-processing \
--batch-size 10 \
--maximum-batching-window-in-seconds 5 \
--function-response-types ReportBatchItemFailures
# batch-size: max messages per Lambda invocation (1-10000, default 10)
# maximum-batching-window: wait up to N seconds to fill the batch
# ReportBatchItemFailures: enable partial batch failure reporting
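Partial batch failure reporting can be checked locally before deploying. The sketch below uses a stripped-down handler and a hand-built event containing only the two fields the handler reads; `demo_handler` and `validate_order` are hypothetical names.

```python
import json


def validate_order(order):
    """Hypothetical business rule: an order must carry an amount."""
    if "amount" not in order:
        raise ValueError("missing amount")


def demo_handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            validate_order(json.loads(record["body"]))
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# Reduced form of the event that the SQS event source mapping passes in.
event = {
    "Records": [
        {"messageId": "id-1", "body": json.dumps({"order_id": "ORD-001", "amount": 99.5})},
        {"messageId": "id-2", "body": json.dumps({"order_id": "ORD-002"})},
        {"messageId": "id-3", "body": "not json"},
    ]
}
result = demo_handler(event, context=None)
```

Only `id-2` and `id-3` appear in `batchItemFailures`. With `ReportBatchItemFailures` enabled, Lambda deletes `id-1` and returns the other two to the queue.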
# FIFO queues guarantee message ordering and exactly-once processing
aws sqs create-queue \
--queue-name order-processing.fifo \
--attributes '{
"FifoQueue": "true",
"ContentBasedDeduplication": "true",
"DeduplicationScope": "messageGroup",
"FifoThroughputLimit": "perMessageGroupId",
"VisibilityTimeout": "60"
}'
# Queue name MUST end with .fifo
# ContentBasedDeduplication: SQS uses a SHA-256 hash of the message body
# as the deduplication ID. Duplicate messages within the 5-minute
# deduplication interval are discarded.
# DeduplicationScope + FifoThroughputLimit = "high throughput mode"
# Allows up to 3,000 msg/s per message group with batching.
# Send a FIFO message
aws sqs send-message \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-processing.fifo \
--message-body '{"order_id":"ORD-001","action":"created"}' \
--message-group-id "customer-C-123" \
--message-deduplication-id "ORD-001-created"
# MessageGroupId: messages with the same group ID are processed in order.
# Different group IDs are processed in parallel.
# Use customer ID, order ID, or entity ID as the group ID.
#
# MessageDeduplicationId: prevents duplicate messages within a 5-minute window.
# If ContentBasedDeduplication is enabled, this is optional.
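The deduplication behaviour described above can be modelled in a few lines. This is a toy model, not an AWS API: `DedupWindow` is a hypothetical class, it ignores message groups, and the clock is passed in explicitly.

```python
import hashlib

DEDUP_WINDOW_SECONDS = 5 * 60


class DedupWindow:
    """Toy model of FIFO deduplication within a 5-minute interval."""

    def __init__(self):
        self._seen = {}  # deduplication id -> time the message was accepted

    def accept(self, body, now, dedup_id=None):
        """Return True if the message is accepted, False if discarded."""
        if dedup_id is None:
            # Content-based deduplication: derive the ID from the body.
            dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False
        self._seen[dedup_id] = now
        return True


window = DedupWindow()
body = '{"order_id":"ORD-001","action":"created"}'
results = [
    window.accept(body, now=0),                                    # first copy
    window.accept(body, now=120),                                  # duplicate inside the window
    window.accept('{"order_id":"ORD-002","action":"created"}', now=121),  # different body
    window.accept(body, now=301),                                  # window has elapsed
]
```

The last call is accepted, which shows that deduplication only guards against retries that arrive within five minutes. Anything later must be handled by the consumer.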
# Standard vs FIFO comparison:
#
# Feature              Standard Queue          FIFO Queue
# ------------------------------------------------------------------------------
# Throughput           Nearly unlimited        300 calls/s per API action (3,000 msg/s
#                                              with batching); more in high throughput mode
# Delivery             At-least-once           Exactly-once processing
# Message ordering     Best-effort             Strict (per message group)
# Deduplication        None                    5-minute deduplication window
# Pricing per million  $0.40 (first 1M free)   $0.50 (first 1M free)
# Queue name           Any                     Must end with .fifo
# Lambda batch size    Up to 10,000            Up to 10
#
# Use Standard when: high throughput is critical and your consumer
# can handle duplicate or out-of-order messages (idempotent processing).
#
# Use FIFO when: message ordering matters (e.g., state changes for an entity)
# or exactly-once processing is required (e.g., financial transactions).
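Idempotent processing, the condition for using a standard queue safely, usually means remembering which messages have already been handled. A minimal sketch: `IdempotentProcessor` is a hypothetical class, and the in-memory set stands in for what would need to be a durable store in production.

```python
class IdempotentProcessor:
    """Run a handler at most once per business key."""

    def __init__(self, handler):
        self._handler = handler
        self._processed = set()  # assumption: a durable store in production

    def handle(self, message_key, payload):
        """Return True if the handler ran, False if the key was a duplicate."""
        if message_key in self._processed:
            return False
        self._handler(payload)
        self._processed.add(message_key)
        return True


charged = []
processor = IdempotentProcessor(charged.append)

outcomes = [
    processor.handle("ORD-001", 99.50),
    processor.handle("ORD-001", 99.50),  # duplicate delivery from a standard queue
    processor.handle("ORD-002", 20.00),
]
```

The key is recorded only after the handler succeeds, so a failed attempt can still be retried. The duplicate delivery of `ORD-001` is ignored, and the customer is charged once.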
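# A DLQ captures messages that fail processing after a specified number of retries.
# Without a DLQ, poison messages (messages that always fail) are retried until the
# retention period expires, wasting consumer capacity, and in a FIFO queue they
# block every later message in the same message group.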
# Step 1: Create the DLQ
aws sqs create-queue \
--queue-name order-processing-dlq \
--attributes '{"MessageRetentionPeriod":"1209600"}'
# Get the DLQ ARN
DLQ_ARN=$(aws sqs get-queue-attributes \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-processing-dlq \
--attribute-names QueueArn \
--query 'Attributes.QueueArn' --output text)
# Step 2: Configure the main queue to use the DLQ
aws sqs set-queue-attributes \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-processing \
--attributes '{
"RedrivePolicy": "{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:order-processing-dlq\",\"maxReceiveCount\":\"3\"}"
}'
# maxReceiveCount: number of times a message can be received before
# being moved to the DLQ. Set to 3-5 for most workloads.
# After 3 failed attempts, the message is moved to the DLQ.
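The effect of `maxReceiveCount` can be sketched with a small simulation. Everything here is hypothetical (`deliver_with_redrive`, the message names, the handler); it models the counting rule only, not the timing.

```python
def deliver_with_redrive(messages, handler, max_receive_count=3):
    """Retry each message up to max_receive_count times, then dead-letter it.

    Returns (processed, dead_lettered) lists.
    """
    processed, dead_lettered = [], []
    for message in messages:
        for _attempt in range(max_receive_count):
            try:
                handler(message)
            except Exception:
                continue  # visibility timeout expires, message is received again
            processed.append(message)
            break
        else:
            # Receive count exhausted without a successful delete.
            dead_lettered.append(message)
    return processed, dead_lettered


attempts = {}


def handler(message):
    attempts[message] = attempts.get(message, 0) + 1
    if message == "poison":
        raise ValueError("cannot parse")
    if message == "flaky" and attempts[message] < 2:
        raise TimeoutError("downstream busy")


processed, dead = deliver_with_redrive(["ok", "flaky", "poison"], handler)
```

The transient failure recovers on its second attempt, while the poison message uses all three receives and ends up in the dead-letter list. A low `maxReceiveCount` such as 1 would have dead-lettered the flaky message too.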
# Step 3: Configure a redrive allow policy on the DLQ
# This controls which source queues can use this DLQ
aws sqs set-queue-attributes \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-processing-dlq \
--attributes '{
"RedriveAllowPolicy": "{\"redrivePermission\":\"byQueue\",\"sourceQueueArns\":[\"arn:aws:sqs:us-east-1:123456789012:order-processing\"]}"
}'
# Step 4: Set an alarm on the DLQ
aws cloudwatch put-metric-alarm \
--alarm-name dlq-has-messages \
--namespace AWS/SQS \
--metric-name ApproximateNumberOfMessagesVisible \
--dimensions Name=QueueName,Value=order-processing-dlq \
--statistic Sum \
--period 300 \
--evaluation-periods 1 \
--threshold 1 \
--comparison-operator GreaterThanOrEqualToThreshold \
--alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts
# Step 5: Redrive messages from DLQ back to the source queue
aws sqs start-message-move-task \
--source-arn arn:aws:sqs:us-east-1:123456789012:order-processing-dlq \
--destination-arn arn:aws:sqs:us-east-1:123456789012:order-processing
SNS is a pub/sub messaging service. Publishers send messages to a topic, and all subscribers receive a copy. Subscribers can be SQS queues, Lambda functions, HTTP endpoints, email addresses, SMS, or mobile push notifications.
# Create a topic
aws sns create-topic --name order-events
# Returns: TopicArn: arn:aws:sns:us-east-1:123456789012:order-events
# Subscribe an SQS queue
aws sns subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:order-events \
--protocol sqs \
--notification-endpoint arn:aws:sqs:us-east-1:123456789012:order-analytics
# Subscribe a Lambda function
aws sns subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:order-events \
--protocol lambda \
--notification-endpoint arn:aws:lambda:us-east-1:123456789012:function:order-notification
# Subscribe an email address
aws sns subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:order-events \
--protocol email \
--notification-endpoint ops-team@company.com
# Subscribe an HTTPS endpoint
aws sns subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:order-events \
--protocol https \
--notification-endpoint https://api.partner.com/webhooks/orders
# Publish a message
aws sns publish \
--topic-arn arn:aws:sns:us-east-1:123456789012:order-events \
--message '{"order_id":"ORD-001","event":"created","amount":99.50}' \
--message-attributes '{
"event_type": {"DataType":"String","StringValue":"order_created"},
"priority": {"DataType":"Number","StringValue":"1"}
}'
# Message filtering lets subscribers receive only the messages they care about.
# Filtering happens at the SNS level -- filtered messages are never delivered,
# so the subscriber does not pay for them.
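# Subscribe with a filter policy.
# With FilterPolicyScope "MessageAttributes", both event_type and amount must be
# published as message attributes (amount with DataType Number).
# The FilterPolicy value is a JSON string, so it must not contain raw line breaks.
aws sns subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:order-events \
--protocol sqs \
--notification-endpoint arn:aws:sqs:us-east-1:123456789012:high-value-orders \
--attributes '{
"FilterPolicy": "{\"event_type\":[\"order_created\",\"order_updated\"],\"amount\":[{\"numeric\":[\">\",1000]}]}",
"FilterPolicyScope": "MessageAttributes"
}'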
# This subscriber only receives messages where:
# - event_type is "order_created" OR "order_updated"
# - AND amount is greater than 1000
# Filter policy operators:
# Exact match: ["value1", "value2"]
# Prefix match: [{"prefix": "order_"}]
# Numeric match: [{"numeric": [">", 100]}]
# [{"numeric": [">=", 0, "<=", 1000]}] (range)
# Exists: [{"exists": true}]
# Negation: [{"anything-but": "test"}]
# IP address: [{"cidr": "10.0.0.0/8"}]
#
# FilterPolicyScope options:
# "MessageAttributes" (default) -- filter on message attributes
# "MessageBody" -- filter on JSON fields in the message body
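To make the matching rules concrete, here is a simplified evaluator for a subset of the operators listed above (exact, prefix, numeric, exists, anything-but). It is a sketch for reasoning about policies, not a reimplementation of SNS: `matches` is a hypothetical function, and attributes are passed as a plain dict of already-typed values.

```python
def _matches_rule(rule, value):
    """Check one attribute value against one entry of a policy list."""
    if not isinstance(rule, dict):
        return value == rule  # exact match
    if "prefix" in rule:
        return isinstance(value, str) and value.startswith(rule["prefix"])
    if "anything-but" in rule:
        blocked = rule["anything-but"]
        blocked = blocked if isinstance(blocked, list) else [blocked]
        return value not in blocked
    if "numeric" in rule:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        ops = {
            "=": lambda a, b: a == b, ">": lambda a, b: a > b,
            ">=": lambda a, b: a >= b, "<": lambda a, b: a < b,
            "<=": lambda a, b: a <= b,
        }
        conditions = rule["numeric"]  # e.g. [">=", 0, "<=", 1000]
        pairs = zip(conditions[0::2], conditions[1::2])
        return all(ops[op](value, bound) for op, bound in pairs)
    raise ValueError(f"unsupported rule: {rule}")


def matches(policy, attributes):
    """AND across policy keys, OR across the rules listed for each key."""
    for key, rules in policy.items():
        present = key in attributes
        ok = False
        for rule in rules:
            if isinstance(rule, dict) and "exists" in rule:
                ok = ok or (rule["exists"] == present)
            elif present:
                ok = ok or _matches_rule(rule, attributes[key])
        if not ok:
            return False
    return True


policy = {
    "event_type": ["order_created", "order_updated"],
    "amount": [{"numeric": [">", 1000]}],
}
high_value = matches(policy, {"event_type": "order_created", "amount": 1500})
low_value = matches(policy, {"event_type": "order_created", "amount": 99.5})
wrong_type = matches(policy, {"event_type": "order_cancelled", "amount": 5000})
in_range = matches({"amount": [{"numeric": [">=", 0, "<=", 1000]}]}, {"amount": 1000})
```

Only the first message satisfies both keys of the policy. The other two fail on `amount` and `event_type` respectively, so the subscriber would never see them.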
# SNS FIFO topics pair with SQS FIFO queues for ordered fan-out
aws sns create-topic \
--name order-events.fifo \
--attributes '{
"FifoTopic": "true",
"ContentBasedDeduplication": "true"
}'
# Subscribe a FIFO queue to a FIFO topic
aws sns subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:order-events.fifo \
--protocol sqs \
--notification-endpoint arn:aws:sqs:us-east-1:123456789012:order-analytics.fifo
# Publish with message group ID
aws sns publish \
--topic-arn arn:aws:sns:us-east-1:123456789012:order-events.fifo \
--message '{"order_id":"ORD-001","event":"shipped"}' \
--message-group-id "customer-C-123"
# FIFO topic constraints:
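# - Subscribers must be SQS queues: FIFO queues to keep ordering and deduplication
#   (standard queues are also accepted but lose both); Lambda, HTTP/S, email,
#   and SMS endpoints cannot subscribe directly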
# - Maximum 100 subscriptions per FIFO topic
# - Throughput: 300 publishes/s (3,000 with batching)
# The fan-out pattern uses SNS to broadcast a message to multiple SQS queues.
# Each queue processes the message independently and at its own pace.
#
# +---> [SQS: order-fulfillment] ---> Fulfillment Lambda
# |
# [SNS: order-events] ---> [SQS: order-analytics] ---> Analytics Lambda
# |
# +---> [SQS: order-notification] --> Notification Lambda
#
# Benefits:
# - Producers do not know about consumers (decoupled)
# - Each consumer processes at its own speed
# - Adding a new consumer = add a new SQS subscription (no code changes)
# - Each queue has its own DLQ and retry logic
# - If one consumer fails, others are not affected
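The independence of the consumers in the diagram above can be shown with an in-memory stand-in. `ToyTopic` is a hypothetical class used purely for illustration; the queue names mirror the diagram.

```python
from collections import deque


class ToyTopic:
    """In-memory stand-in for an SNS topic with SQS subscribers."""

    def __init__(self):
        self._queues = {}

    def subscribe(self, name):
        self._queues[name] = deque()
        return self._queues[name]

    def publish(self, message):
        # Every subscribed queue gets its own copy of the message.
        for queue in self._queues.values():
            queue.append(dict(message))


topic = ToyTopic()
fulfillment = topic.subscribe("order-fulfillment")
analytics = topic.subscribe("order-analytics")

topic.publish({"order_id": "ORD-001", "event": "created"})
topic.publish({"order_id": "ORD-002", "event": "created"})

# The fulfillment consumer drains its queue; analytics is slower and
# still has both messages buffered.
handled = [fulfillment.popleft()["order_id"] for _ in range(len(fulfillment))]
```

Draining one queue leaves the other untouched, so a slow or failing consumer does not hold up the rest.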
# SQS queue policy: allow SNS to send messages to the queue
aws sqs set-queue-attributes \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-fulfillment \
--attributes '{
"Policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"sns.amazonaws.com\"},\"Action\":\"sqs:SendMessage\",\"Resource\":\"arn:aws:sqs:us-east-1:123456789012:order-fulfillment\",\"Condition\":{\"ArnEquals\":{\"aws:SourceArn\":\"arn:aws:sns:us-east-1:123456789012:order-events\"}}}]}"
}'
# Without this policy, SNS cannot deliver messages to the queue.
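Hand-escaping a policy inside a JSON attribute map is error-prone. One option is to build the policy as a dictionary and let `json.dumps` do the nesting; `sns_to_sqs_policy` is a hypothetical helper, and the ARNs are the example values used above.

```python
import json


def sns_to_sqs_policy(queue_arn, topic_arn):
    """Return the attribute map for set-queue-attributes as a dict."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "sns.amazonaws.com"},
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
            "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
        }],
    }
    # The Policy attribute value is itself a JSON document encoded as a string.
    return {"Policy": json.dumps(policy)}


attributes = sns_to_sqs_policy(
    "arn:aws:sqs:us-east-1:123456789012:order-fulfillment",
    "arn:aws:sns:us-east-1:123456789012:order-events",
)
cli_argument = json.dumps(attributes)  # value to pass after --attributes
```

The double encoding produces the same escaped single-line form as the command above, with no risk of a stray quote or line break.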
# SQS encryption at rest
# SSE-SQS (AWS managed key, free, enabled by default on new queues)
aws sqs create-queue \
--queue-name secure-queue \
--attributes '{"SqsManagedSseEnabled":"true"}'
# SSE-KMS (customer managed key, $1/month per key + API costs)
aws sqs create-queue \
--queue-name secure-queue-kms \
--attributes '{
"KmsMasterKeyId": "alias/sqs-key",
"KmsDataKeyReusePeriodSeconds": "300"
}'
# KmsDataKeyReusePeriodSeconds (default 300, max 86400):
# How long SQS reuses a data encryption key before requesting a new one.
# Higher values reduce KMS API costs but reuse keys longer.
# At 300s with 1000 msg/s, you get ~12 KMS API calls/hour instead of 3.6M.
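The arithmetic behind that comparison is short enough to write out. This is a simplified model that assumes a single data key in use at a time (real call counts also depend on how many principals send and receive); the function names are hypothetical.

```python
SECONDS_PER_HOUR = 3600


def kms_calls_per_hour_with_reuse(reuse_period_seconds):
    """One data key request per reuse period."""
    return SECONDS_PER_HOUR / reuse_period_seconds


def kms_calls_per_hour_without_reuse(messages_per_second):
    """One data key request per message."""
    return messages_per_second * SECONDS_PER_HOUR


with_reuse = kms_calls_per_hour_with_reuse(300)         # 12.0
without_reuse = kms_calls_per_hour_without_reuse(1000)  # 3,600,000
```

Raising the reuse period to the 86400-second maximum would cut the figure to one request per day, at the cost of using each data key for longer.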
# SNS encryption at rest
aws sns create-topic \
--name secure-events \
--attributes '{"KmsMasterKeyId":"alias/sns-key"}'
# In-transit: both SQS and SNS expose HTTPS (TLS) endpoints, and the AWS CLI
# and SDKs use them by default.
# SQS can enforce HTTPS-only with a queue policy:
# Condition: {"Bool": {"aws:SecureTransport": "false"}} -> Deny
# SQS queue policies control who can send to and receive from the queue.
# Common pattern: allow a specific AWS account or service to send messages.
aws sqs set-queue-attributes \
--queue-url https://sqs.us-east-1.amazonaws.com/123456789012/order-processing \
--attributes '{
"Policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":\"arn:aws:iam::987654321098:root\"},\"Action\":\"sqs:SendMessage\",\"Resource\":\"arn:aws:sqs:us-east-1:123456789012:order-processing\"},{\"Effect\":\"Deny\",\"Principal\":\"*\",\"Action\":\"sqs:*\",\"Resource\":\"arn:aws:sqs:us-east-1:123456789012:order-processing\",\"Condition\":{\"Bool\":{\"aws:SecureTransport\":\"false\"}}}]}"
}'
# The Policy value is a JSON string, so it must stay on one line with escaped quotes.
# The second statement enforces HTTPS-only access.
# Key SQS CloudWatch metrics:
# ApproximateNumberOfMessagesVisible -- messages available for processing
# ApproximateNumberOfMessagesNotVisible -- messages being processed (in flight)
# ApproximateNumberOfMessagesDelayed -- messages in delay period
# NumberOfMessagesSent -- messages sent to the queue
# NumberOfMessagesReceived -- messages received by consumers
# NumberOfMessagesDeleted -- messages deleted (successfully processed)
# ApproximateAgeOfOldestMessage -- age of the oldest message in seconds
# SentMessageSize -- size of messages sent
# Critical alarms:
# Queue depth growing (consumers falling behind)
aws cloudwatch put-metric-alarm \
--alarm-name queue-depth-high \
--namespace AWS/SQS \
--metric-name ApproximateNumberOfMessagesVisible \
--dimensions Name=QueueName,Value=order-processing \
--statistic Average \
--period 300 \
--evaluation-periods 3 \
--threshold 10000 \
--comparison-operator GreaterThanThreshold \
--alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts
# Oldest message too old (messages stuck)
aws cloudwatch put-metric-alarm \
--alarm-name oldest-message-age \
--namespace AWS/SQS \
--metric-name ApproximateAgeOfOldestMessage \
--dimensions Name=QueueName,Value=order-processing \
--statistic Maximum \
--period 300 \
--evaluation-periods 1 \
--threshold 3600 \
--comparison-operator GreaterThanThreshold \
--alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts
# Key SNS CloudWatch metrics:
# NumberOfMessagesPublished -- messages published to the topic
# NumberOfNotificationsDelivered -- successfully delivered to subscribers
# NumberOfNotificationsFailed -- delivery failures
# SQS pricing (us-east-1):
#
# Standard queues:
# First 1 million requests/month: free
# $0.40 per million requests after that
#
# FIFO queues:
# First 1 million requests/month: free
# $0.50 per million requests after that
#
# A "request" = 1 API action (SendMessage, ReceiveMessage, DeleteMessage, etc.)
# A single request can contain up to 10 messages (batch).
# Each 64 KB chunk of a message counts as 1 request.
# A 256 KB message = 4 requests.
#
# Data transfer: free within the same region.
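The 64 KB chunk rule above translates directly into a small calculation. `billed_requests` is a hypothetical helper for estimating; it is not an AWS API.

```python
import math

CHUNK_BYTES = 64 * 1024  # each 64 KB chunk of payload is billed as one request


def billed_requests(payload_bytes):
    """Number of requests billed for a single API call with this payload."""
    return max(1, math.ceil(payload_bytes / CHUNK_BYTES))


small = billed_requests(2 * 1024)          # a typical small JSON message
full = billed_requests(256 * 1024)         # the maximum size discussed above
just_over = billed_requests(64 * 1024 + 1)
```

A 256 KB payload costs four times as much as a 2 KB one, which is one reason to keep messages small and store large bodies in S3.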
# SNS pricing (us-east-1):
#
# Publishes:
# First 1 million/month: free
# $0.50 per million after that
#
# Deliveries:
# SQS: free
# Lambda: free
# HTTP/S: $0.06 per 100,000
# Email: $2.00 per 100,000
# SMS: varies by country ($0.00645/msg in US)
#
# SNS message filtering: free (no extra cost)
# Data transfer: free within the same region.
# Cost example: 10 million messages/month through SQS + SNS fan-out to 3 queues
#
# SNS publishes: 10M * $0.50/M = $5.00
# SNS to SQS deliveries: free
# SQS receives (3 queues): 30M receives + 30M deletes = 60M * $0.40/M = $24.00
# SQS sends (original): 10M * $0.40/M = $4.00
# Total: $33.00/month for 10M messages processed by 3 consumers
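The same estimate can be parameterised to try other volumes and queue counts. The sketch uses the per-million prices quoted above and, like the example, ignores the free tier and batching; `fanout_monthly_cost` is a hypothetical helper.

```python
SNS_PUBLISH_PER_MILLION = 0.50
SQS_STANDARD_PER_MILLION = 0.40


def fanout_monthly_cost(messages_millions, queue_count):
    """Estimated monthly cost in USD for the fan-out example above."""
    sns_publishes = messages_millions * SNS_PUBLISH_PER_MILLION
    # One receive and one delete per message in every subscribed queue.
    sqs_consume = messages_millions * queue_count * 2 * SQS_STANDARD_PER_MILLION
    sqs_sends = messages_millions * SQS_STANDARD_PER_MILLION
    return round(sns_publishes + sqs_consume + sqs_sends, 2)


baseline = fanout_monthly_cost(messages_millions=10, queue_count=3)
```

Most of the total comes from receives and deletes, so batching receives (up to 10 messages per request) is the first lever to pull if the bill matters.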
# When to use each:
#
# SQS (queue):
# - Point-to-point communication (one producer, one consumer)
# - Buffering: consumer processes at its own pace
# - Retry with DLQ: failed messages are retried automatically
# - Batch processing: accumulate messages and process in batches
#
# SNS (pub/sub):
# - Fan-out: one message to many consumers
# - Push-based: messages delivered immediately to subscribers
# - Multiple protocols: SQS, Lambda, HTTP, email, SMS
# - Message filtering: subscribers receive only relevant messages
#
# EventBridge (event bus):
# - Complex routing rules with 28+ targets
# - Schema registry and discovery
# - Cross-account event delivery
# - Scheduled events (cron/rate)
# - Third-party SaaS integrations (Shopify, Zendesk, PagerDuty)
# - Archive and replay events
# - $1.00 per million events (2x SNS price)
#
# Common combinations:
# SNS + SQS: fan-out with buffered processing
# EventBridge + SQS: complex routing with buffered processing
# SQS + Lambda: event-driven processing with automatic scaling
- Use long polling (WaitTimeSeconds: 20) to reduce empty responses and API costs. Short polling returns immediately even when the queue is empty.
- Enable partial batch failure reporting (ReportBatchItemFailures) so that only failed messages are retried, not the entire batch.
- Alarm on ApproximateAgeOfOldestMessage to detect stuck messages and ApproximateNumberOfMessagesVisible to detect growing backlogs.