AEM 6.3 – Amazon S3 Data Store vs File Data Store


In Adobe Experience Manager (AEM), binary data can be stored independently of the content nodes. Binaries are kept in a data store, whereas content nodes are kept in a node store. When dealing with a large number of binaries, it is recommended to use an external, shared data store instead of the default node store in order to maximize performance.

A separate binary data store is recommended for larger digital asset implementations. By default, unless you configure otherwise, binary data and content nodes are stored together: in the segment store if you use TarMK, or in the document store if you use MongoMK. AEM 6.3 introduced enhancements that separate the data store from the segment store by default with TarMK.

A separate data store can be used as a common data store across author and publish farms. This reduces storage, because each binary is kept only once, and gives you better performance. It also reduces replication load, because the entire binary no longer has to be replicated to the publish instances. You only need to configure the replication agents for binary-less replication, which replicates just the meta information of the asset to the publishers. Because the publish instances are configured against the common data store, they fetch the binary when it is needed.
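The idea behind binary-less replication can be illustrated with a small Python sketch. This is only a toy model, not AEM or Oak code: the class names `SharedDataStore` and `Instance`, the `blobId` field, and the use of a SHA-256 content hash as the binary identifier are all assumptions made for illustration.

```python
import hashlib


class SharedDataStore:
    """Toy content-addressed binary store shared by author and publish (hypothetical)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Assumption: the binary is identified by a hash of its content.
        blob_id = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(blob_id, data)  # identical binaries are stored once
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]

    def blob_count(self) -> int:
        return len(self._blobs)


class Instance:
    """Toy author or publish instance holding only node metadata (hypothetical)."""

    def __init__(self, name: str, data_store: SharedDataStore):
        self.name = name
        self.data_store = data_store
        self.nodes = {}  # path -> metadata dict

    def upload(self, path: str, data: bytes, mime_type: str) -> None:
        blob_id = self.data_store.put(data)
        self.nodes[path] = {"mimeType": mime_type, "size": len(data), "blobId": blob_id}

    def replicate_binaryless(self, path: str, target: "Instance") -> dict:
        # Only the metadata (including the binary reference) is sent to the target.
        payload = dict(self.nodes[path])
        target.nodes[path] = payload
        return payload

    def read_binary(self, path: str) -> bytes:
        # The binary itself is fetched from the shared store on demand.
        return self.data_store.get(self.nodes[path]["blobId"])


store = SharedDataStore()
author = Instance("author", store)
publish = Instance("publish", store)

image = b"fake image bytes " * 1000
author.upload("/content/dam/logo.png", image, "image/png")
payload = author.replicate_binaryless("/content/dam/logo.png", publish)
```

In this sketch the replication payload carries only a few small fields, while the publish instance can still serve the full binary through `read_binary`, because both instances point at the same store.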

If your requirements call for a separate data store, you have two options:

  • Amazon S3 Data Store
  • File Data Store (SAN/NAS)

Amazon S3 Data Store:

It provides a way to store binary data in Amazon’s Simple Storage Service (S3). This is cloud-based storage, so you need to buy storage to use this feature. AEM provides an S3 connector to set up the connection.

With this service you can scale your application's data storage virtually without limit. S3 internally keeps copies of your data to support disaster recovery. It also supports other features with AEM:

  • Local cache: You can cache binaries up to a configured size on your local file system for faster access. When you upload assets while the S3 connection is not established, the data is kept on your local file system; once the S3 connection is available, it is pushed to S3 storage.
  • It supports multi-threaded content migration from the file system to S3.
  • You can configure asynchronous upload to S3.
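The local cache and asynchronous upload behaviour described above can be sketched roughly as follows. This is an illustrative model only, not the S3 connector's implementation: `CachingStore`, its methods, and the plain dict standing in for the S3 bucket are hypothetical, and a least-recently-used eviction policy is assumed.

```python
import queue
import threading
from collections import OrderedDict


class CachingStore:
    """Toy size-limited local cache in front of a remote store (hypothetical)."""

    def __init__(self, remote: dict, cache_limit_bytes: int):
        self.remote = remote              # stand-in for the S3 bucket
        self.cache_limit = cache_limit_bytes
        self.cache = OrderedDict()        # key -> bytes, least recently used first
        self.cache_bytes = 0
        self.staging = {}                 # written locally, not yet uploaded
        self.pending = queue.Queue()
        self._lock = threading.Lock()
        threading.Thread(target=self._upload_loop, daemon=True).start()

    def write(self, key: str, data: bytes) -> None:
        # The write returns as soon as the data is held locally.
        with self._lock:
            self.staging[key] = data
            self._cache_put(key, data)
        self.pending.put(key)             # the upload happens in the background

    def read(self, key: str) -> bytes:
        with self._lock:
            if key in self.cache:
                self.cache.move_to_end(key)   # mark as recently used
                return self.cache[key]
            if key in self.staging:
                return self.staging[key]
        data = self.remote[key]               # cache miss: fetch from the remote store
        with self._lock:
            self._cache_put(key, data)
        return data

    def flush(self) -> None:
        """Block until every queued upload has reached the remote store."""
        self.pending.join()

    def _cache_put(self, key: str, data: bytes) -> None:
        if key in self.cache:
            self.cache_bytes -= len(self.cache.pop(key))
        self.cache[key] = data
        self.cache_bytes += len(data)
        # Evict least recently used entries, always keeping the newest one.
        while self.cache_bytes > self.cache_limit and len(self.cache) > 1:
            _, evicted = self.cache.popitem(last=False)
            self.cache_bytes -= len(evicted)

    def _upload_loop(self) -> None:
        while True:
            key = self.pending.get()
            with self._lock:
                data = self.staging.get(key)
            if data is not None:
                self.remote[key] = data       # the "upload"
                with self._lock:
                    self.staging.pop(key, None)
            self.pending.task_done()


bucket = {}
store = CachingStore(bucket, cache_limit_bytes=10)
store.write("a", b"12345678")
store.write("b", b"abcdefgh")   # exceeds the 10-byte limit, so "a" leaves the cache
store.flush()                   # wait for the background uploads to finish
```

Data that has been written but not yet uploaded stays in the `staging` map, which mirrors the way assets remain on the local file system until the remote store is reachable. A later `store.read("a")` in this sketch falls through to the bucket and puts the binary back into the local cache.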

File Data Store (SAN/NAS):

This is the default option, which stores your binary content on the file system. You can use NAS (Network Attached Storage) or SAN (Storage Area Network) to hold this data. It requires high-performance devices in order to avoid performance issues.

If you go with the File Data Store, you need to plan for a disaster recovery setup. You should sync your content from the active data center to the passive data center. If the active data center goes down, you need to bring up the passive node manually to support the application, which requires some maintenance and downtime. You also need to plan the sync between the two data centers so that no data is lost.

Scaling the application at a later point also requires planning and design. The File Data Store does not support the local cache feature that S3 provides.

S3 gives better performance with less maintenance and fewer resources. If you have enough resources and infrastructure to support the application, SAN or NAS can be the better choice. NAS is priced higher than SAN storage.

A common data store works with both TarMK and MongoMK implementations. TarMK does not support clustering, so if you want a disaster recovery instance you can set up a standby instance, which requires additional configuration. In this case a common data store is recommended between the primary and standby instances, because larger uploaded assets may otherwise not sync to the standby instance properly.



